#9
by penpen » 02 Nov 2013 19:07
I'm not sure, but I think the problem with the line (cmd /d /u /c type hi.txt | find /v "") is caused by the use of 0x1A (SUB) as an EOF character under (old) DOS and Windows; it seems to still be honored by type and find.
In the good old MS-DOS 6.22 days a file was read completely (buffered or not) up to the end of file. Once the file had been read fully, all further read attempts returned the MS-DOS end-of-file (EOF) character 0x1A, which is also the SUB (SUBSTITUTE) control character.
The file hi.txt (stored in ANSI, I think) has the following content in hex: 68 69 1A 74 68 65 72 65 0D 0A.
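To reproduce the test file byte for byte, something like the following Python sketch should do. The helper name write_hi and the constant HI_BYTES are my own picks for illustration; only the byte values come from the post.

```python
from pathlib import Path

# Bytes quoted in the post: "hi", SUB (0x1A), "there", CR LF.
HI_BYTES = bytes.fromhex("68 69 1A 74 68 65 72 65 0D 0A")


def write_hi(path="hi.txt"):
    """Write the test file in binary mode so 0x1A is stored untouched.

    Returns the bytes read back from disk for a quick check.
    (write_hi is a hypothetical helper, not something from the thread.)
    """
    target = Path(path)
    target.write_bytes(HI_BYTES)
    return target.read_bytes()


if __name__ == "__main__":
    print(write_hi().hex(" ").upper())  # 68 69 1A 74 68 65 72 65 0D 0A
```

Writing in binary mode matters here, because a text-mode writer could translate the line ending and change the byte count.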
cmd /d /u /c type hi.txt connects a Unicode file input stream, so the read results in: 6800 6900 1A00 7400 6800 6500 7200 6500 (16-bit Unicode, i.e. UTF-16 with the low byte first).
This Unicode output is piped on and interpreted as ANSI, byte by byte: 68 00 69 00 1A 00 74 00 68 00 65 00 72 00 65 00.
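The widening step can be checked with a few lines of Python. This is only a sketch of the byte arithmetic, not of what cmd does internally; the function name widen is hypothetical and cp1252 stands in for "ANSI" as an assumption (for these ASCII bytes any ANSI code page gives the same result).

```python
# "hi", SUB, "there" as single bytes (CR LF left out, as in the hex dump above).
ANSI_BYTES = bytes.fromhex("68 69 1A 74 68 65 72 65")


def widen(data: bytes) -> bytes:
    """Re-encode single-byte text as UTF-16 little-endian.

    Assumption: cp1252 is used as the "ANSI" code page.
    """
    return data.decode("cp1252").encode("utf-16-le")


wide = widen(ANSI_BYTES)
print(wide.hex(" ").upper())
# 68 00 69 00 1A 00 74 00 68 00 65 00 72 00 65 00
```

Every original byte gets a 0x00 partner, which is where the stray NUL bytes seen by the next program in the pipe come from.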
The program find then interprets the 0x00 bytes as line endings, so it treats 1A 00 as a complete line that contains nothing but the EOF character => EOF reached, finished. So it outputs only the "hi" part.
The other command (type hi.txt | find /v "") does all operations in ANSI, so no single read yields only the character 0x1A, and it writes everything.
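The difference between the two cases can be played through with a toy model in Python. To be clear, this models only the guess made in this post (0x00 or LF ends a line, and a line holding nothing but 0x1A counts as EOF); it is not how find.exe is known to work, and model_find is a made-up name.

```python
SUB = 0x1A

ANSI_STREAM = bytes.fromhex("68 69 1A 74 68 65 72 65 0D 0A")
UNICODE_STREAM = bytes.fromhex("68 00 69 00 1A 00 74 00 68 00 65 00 72 00 65 00")


def model_find(stream: bytes) -> bytes:
    """Toy model of the hypothesis in the post, NOT of the real find.exe.

    Assumptions: 0x00 and 0x0A both terminate a line, and a line that
    consists of the single byte 0x1A is taken as end of file.
    """
    out = bytearray()
    line = bytearray()
    for b in stream:
        if b in (0x00, 0x0A):
            if bytes(line) == bytes([SUB]):
                break  # lone SUB line: treated as EOF, stop reading
            out += line + bytes([b])
            line.clear()
        else:
            line.append(b)
    else:
        out += line  # no EOF hit: flush a trailing unterminated line
    return bytes(out)


print(model_find(UNICODE_STREAM))  # b'h\x00i\x00'
print(model_find(ANSI_STREAM) == ANSI_STREAM)  # True
```

In the Unicode stream the SUB byte ends up alone between two NUL bytes, so the model stops after "hi"; in the ANSI stream SUB sits in the middle of a longer line and is passed through.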
I'm not sure whether the input/output streams (file, pipe) are set up to pass data through, so that all characters after the 0x1A character could still be read by the find command, or whether find only reads the content that is currently in the file read buffer.
So it may result in errors if the rest of the file is too big to fit into the file input stream buffer and find reads only the buffered content: then everything that is not in the buffer is cut off.
The command type itself handles the SUB char as an EOF char, too; just try: type hi.txt.
The same goes for the command (type hi.txt >con).
penpen
Edit 1/2: You may store the file hi.txt using Unicode, and if the streams are configured to pass data through and are all connected together, then your code may work as you expect, as the input streams are initialized with FF FE (the UTF-16 byte order mark) and know that they have to autocast from Unicode to ANSI.
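If you want to try the Unicode variant of hi.txt, the bytes could be built like this in Python. It is just a sketch: to_utf16_with_bom is a hypothetical helper name, and little-endian order with a leading byte order mark is assumed because that is what Windows tools usually write.

```python
import codecs


def to_utf16_with_bom(text: str) -> bytes:
    """Encode text as UTF-16 little-endian, prefixed with the BOM (FF FE).

    Assumption: little-endian with BOM is the layout wanted for the test.
    """
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


data = to_utf16_with_bom("hi\x1athere\r\n")
print(data[:2].hex(" ").upper())  # FF FE
print(len(data))  # 22
```

The result can be written with open("hi.txt", "wb") so that nothing gets translated on the way to disk.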
Last edited by
penpen on 03 Nov 2013 02:30, edited 2 times in total.