WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 01 Feb 2013 19:49
by dbenham
Somewhere on DosTips is at least one thread that talks about problems processing WMIC output with FOR /F. The problem stems from the fact that WMIC produces unicode output. Normally, FOR /F cannot handle unicode output. But somehow FOR /F processes the output of WMIC and converts it to ASCII. But there is an odd side effect that causes each parsed line to have an unwanted trailing <CR> character. Normally FOR /F does not preserve empty lines. But now the empty lines are no longer empty since they have a trailing <CR>.

The unwanted trailing <CR> can create complications when trying to parse the output.

I believe there are other commands like WMIC that have the same problem. For example, PING on XP.

Here is a demonstration of the <CR> problem:

Code:

C:\test>for /f "delims=" %A in ('wmic os get localDateTime') do @echo [%A]
]LocalDateTime
]20130201203753.938000-300
]

The trailing <CR> causes the cursor to reset to the beginning of the line and then the [ is overwritten by the ].

Each line actually ended with <CR><CR><LF>. FOR /F breaks at <LF> and strips off the last remaining character if it happens to be a <CR>. But it only strips a single <CR>, hence the problem with the unwanted trailing <CR> on each line.

Just today it dawned on me that there is a really simple solution - Simply pass the line through another FOR /F :idea:

Code:

C:\test>for /f "delims=" %A in ('wmic os get localDateTime') do @for /f "delims=" %B in ("%A") do @echo [%B]
[LocalDateTime              ]
[20130201204419.057000-300  ]

Much better :)
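
The rule described above (break at <LF>, drop one trailing <CR>, skip lines that end up empty) can also be tried out away from cmd. The Python below is only a rough model of that description, not of the real FOR /F parser; the function name for_f and the sample string are made up for illustration.

```python
def for_f(text):
    """Rough model of FOR /F "delims=" as described in this thread:
    split at LF, drop ONE trailing CR, skip lines that are then empty."""
    result = []
    for line in text.split("\n"):
        if line.endswith("\r"):
            line = line[:-1]      # only a single CR is removed
        if line:                  # empty lines are not passed to the loop body
            result.append(line)
    return result


# Assumed sample: every line ends with <CR><CR><LF>, including a blank one.
raw = "LocalDateTime  \r\r\n20130201203753.938000-300  \r\r\n\r\r\n"

first_pass = for_f(raw)
# Feeding each result through the model again mimics the nested FOR /F.
second_pass = [token for line in first_pass for token in for_f(line)]

print([repr(s) for s in first_pass])
print([repr(s) for s in second_pass])
```

In this model the first pass leaves one <CR> on every line, including the otherwise blank one. The second pass removes it, and the blank line disappears because it is now really empty.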


Dave Benham

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 01 Feb 2013 23:06
by foxidrive
dbenham wrote:Each line actually ended with <CR><CR><LF>. FOR /F breaks at <LF> and strips off the last remaining character if it happens to be a <CR>. But it only strips a single <CR>, hence the problem with the unwanted trailing <CR> on each line.

Just today it dawned on me that there is a really simple solution - Simply pass the line through another FOR /F :idea:

Code:

C:\test>for /f "delims=" %A in ('wmic os get localDateTime') do @for /f "delims=" %B in ("%A") do @echo [%B]
[LocalDateTime              ]
[20130201204419.057000-300  ]

Much better :)


This will work too Dave, and is how I solved the issue in the past. It also preserves the blank lines in the IPconfig output.

for /f "delims=" %A in ('wmic os get localDateTime') do @cmd /c echo.%A

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 01 Feb 2013 23:35
by dbenham
That certainly is less code, and will work in most cases. But it is less efficient, and also will not handle text with special characters properly.

The extra FOR /F solution is very fast, and can handle any combination of characters.


Dave Benham

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 01 Feb 2013 23:54
by foxidrive
dbenham wrote:That certainly is less code, and will work in most cases. But it is less efficient, and also will not handle text with special characters properly.

The extra FOR /F solution is very fast, and can handle any combination of characters.


I guess you can use mine if you want to preserve the blank lines. :)

I'm curious though Dave, what special characters will yours handle but mine won't?

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 02 Feb 2013 00:04
by Liviu
dbenham wrote:Just today it dawned on me that there is a really simple solution - Simply pass the line through another FOR /F :idea:

FWIW below is sample output under XP.

Code:

C:\tmp>for /f "delims=" %A in ('wmic os get localDateTime') do @echo [%A]
]LocalDateTime
]20130201235820.779000-360

C:\tmp>for /f "delims=" %A in ('wmic os get localDateTime') do @(echo [%A] | more)
[LocalDateTime              ]

[20130201235824.342000-360  ]


C:\tmp>
It's not a typo, the second run has that additional blank line. But otherwise it's as fast as an extra for/f wrapper loop.

Liviu

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 02 Feb 2013 00:38
by dbenham
@Liviu - yes, but typically I want to save the result in an environment variable, not just echo the result :wink: Also, the MORE command converts tabs into spaces.

foxidrive wrote:I'm curious though Dave, what special characters will yours handle but mine won't?
Any of the usual culprits: & | < >
They would have to be escaped or quoted.


Dave Benham

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 02 Feb 2013 03:09
by foxidrive
dbenham wrote:
foxidrive wrote:I'm curious though Dave, what special characters will yours handle but mine won't?
Any of the usual culprits: & | < >
They would have to be escaped or quoted.


Yes, fair enough.

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 02 Feb 2013 17:05
by Liviu
dbenham wrote:@Liviu - yes, but typically I want to save the result in an environment variable, not just echo the result :wink:

Point taken. But then you could get it into a variable directly, then strip off the trailing <CR>.

Code:

C:\tmp>for /f "delims=" %A in ('wmic os get localDateTime') do @set "z=%A" & echo [!z:~0,-1!]
[LocalDateTime              ]
[20130202165407.233000-360  ]
The above works here at a cmd/v prompt in XP. One obvious limitation is that it strips the last character unconditionally, while your secondary for loop works whether there is a trailing <CR> or not. However, the nested loops also presume some knowledge of the wmic output format, for example that there is at most one trailing <CR>.

Code:

C:\tmp>for /f "delims=" %A in ('wmic os get localDateTime') do @(set "z=%A" & for /f "delims=" %B in ("!z!!z:~-1!") do @echo [%B])
]LocalDateTime
]20130202165438.295000-360
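
The trade-off between the two approaches can be sketched in Python. This assumes that the inner FOR /F drops at most one trailing <CR>, as described earlier in the thread. The helper strip_one_cr is hypothetical and only stands in for that behaviour; the slice [:-1] stands in for !z:~0,-1!.

```python
def strip_one_cr(s):
    """Stand-in for the nested FOR /F: remove one trailing CR if present."""
    return s[:-1] if s.endswith("\r") else s


with_cr = "LocalDateTime  \r"   # what the outer loop hands over
without_cr = "LocalDateTime"    # a value that never had a trailing CR

# Unconditional substring, like !z:~0,-1!: fine with a CR, lossy without one.
unconditional_ok = with_cr[:-1]
unconditional_bad = without_cr[:-1]

# Conditional strip: safe in both cases ...
conditional_ok = strip_one_cr(without_cr)

# ... but with two trailing CRs (like "!z!!z:~-1!") one of them survives.
doubled = with_cr + with_cr[-1]
still_dirty = strip_one_cr(doubled)

for value in (unconditional_ok, unconditional_bad, conditional_ok, still_dirty):
    print(repr(value))
```

Neither variant copes with every input in this model: the substring eats a real character when no <CR> is present, and the single strip leaves a <CR> behind when there is more than one.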

Liviu

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 14:20
by aGerman
Because the issue comes up again and again I did some tests that should show what actually happens.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The output of

Code:

WMIC TimeZone get Caption /value

shows


Caption=(UTC+01:00) Amsterdam, Berlin, Bern, Rom, Stockholm, Wien




So far it seems to be OK. But to see what was actually outputted by WMIC we have to redirect it into a file and open it with a HEX editor.

Code:

>test1.txt WMIC TimeZone get Caption /value

test1.txt
FFFE 0D00 0A00 0D00 0A00 4300 6100 7000
7400 6900 6F00 6E00 3D00 2800 5500 5400
4300 2B00 3000 3100 3A00 3000 3000 2900
2000 4100 6D00 7300 7400 6500 7200 6400
6100 6D00 2C00 2000 4200 6500 7200 6C00
6900 6E00 2C00 2000 4200 6500 7200 6E00
2C00 2000 5200 6F00 6D00 2C00 2000 5300
7400 6F00 6300 6B00 6800 6F00 6C00 6D00
2C00 2000 5700 6900 6500 6E00
0D00 0A00
0D00 0A00 0D00 0A00


As you can see, it comes as a unicode stream (UTF-16 Little Endian, to be more precise). A Byte Order Mark (FF FE) was also prepended, which specifies the encoding as UTF-16 LE.
Every character here has a width of 16 bits. Due to the little-endianness, the zero bytes are not prepended but appended to the 8-bit ASCII characters. UTF-16 LE is also the reason why any Windows line break (Carriage Return plus Line Feed = 0D 0A) shows up as 0D 00 0A 00.
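
The dump can also be checked without a HEX editor by decoding the bytes as UTF-16 LE. The following is a small Python sketch that uses the bytes exactly as posted above; the variable names are arbitrary.

```python
import codecs

# Bytes of test1.txt, copied from the dump above.
dump = """
FFFE 0D00 0A00 0D00 0A00 4300 6100 7000
7400 6900 6F00 6E00 3D00 2800 5500 5400
4300 2B00 3000 3100 3A00 3000 3000 2900
2000 4100 6D00 7300 7400 6500 7200 6400
6100 6D00 2C00 2000 4200 6500 7200 6C00
6900 6E00 2C00 2000 4200 6500 7200 6E00
2C00 2000 5200 6F00 6D00 2C00 2000 5300
7400 6F00 6300 6B00 6800 6F00 6C00 6D00
2C00 2000 5700 6900 6500 6E00
0D00 0A00
0D00 0A00 0D00 0A00
"""

data = bytes.fromhex("".join(dump.split()))

# The first two bytes are the UTF-16 LE byte order mark.
has_bom = data[:2] == codecs.BOM_UTF16_LE

# Decode everything after the BOM; each character occupies two bytes here.
text = data[2:].decode("utf-16-le")

print("BOM present:", has_bom)
print(repr(text))
```

Decoded this way, the file contains ordinary <CR><LF> line breaks only. There is no doubled <CR> in the redirected file itself.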

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What happens if we try to process the output inside of a FOR /F loop?
I prepended an @ (0x40) to mark the beginning of every line that FOR /F did not treat as empty. Lines that FOR /F considers empty are dropped automatically and therefore get no marker.

Code:

>test2.txt (
  for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do echo @%%i
)

test2.txt
400D 0D0A 400D 0D0A 4043 6170 7469 6F6E
3D28 5554 432B 3031 3A30 3029 2041 6D73
7465 7264 616D 2C20 4265 726C 696E 2C20
4265 726E 2C20 526F 6D2C 2053 746F 636B
686F 6C6D 2C20 5769 656E
0D0D 0A40 0D0D
0A
40 0D0D 0A40 0D0D 0A

While the normal linebreak was 0D 0A we now see the strange 0D 0D 0A. But everything was redirected using ECHO. As we know ECHO appends a linebreak to every string. To see the actual contents of %%i we should use SET /P instead.

Code:

>test2a.txt (
  for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do <nul set /p "=@%%i"
)

test2a.txt
400D 400D 4043 6170 7469 6F6E 3D28 5554
432B 3031 3A30 3029 2041 6D73 7465 7264
616D 2C20 4265 726C 696E 2C20 4265 726E
2C20 526F 6D2C 2053 746F 636B 686F 6C6D
2C20 5769 656E
0D40 0D40 0D40 0D

Now things are getting clearer. While the NUL bytes of the unicode characters were simply removed, something strange must have happened to the 0A 00. Probably because of the NUL byte, only 0A was recognized as the line break. In any case, a 0D is now left at the end of each line.
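
The same observation can be read off the test2a.txt dump programmatically. The Python below is a sketch that takes the bytes as posted and splits them at the @ marker (0x40). It assumes, as is true for this sample, that the text itself contains no @.

```python
# Bytes of test2a.txt, copied from the dump above.
dump = """
400D 400D 4043 6170 7469 6F6E 3D28 5554
432B 3031 3A30 3029 2041 6D73 7465 7264
616D 2C20 4265 726C 696E 2C20 4265 726E
2C20 526F 6D2C 2053 746F 636B 686F 6C6D
2C20 5769 656E
0D40 0D40 0D40 0D
"""

data = bytes.fromhex("".join(dump.split()))

# Every loop iteration wrote "@" followed by the content of %%i,
# so splitting at "@" recovers what FOR /F handed to the loop body.
records = data.split(b"@")[1:]

for number, record in enumerate(records, start=1):
    print(number, record)

print("NUL bytes left:", data.count(b"\x00"))
print("LF bytes left:", data.count(b"\n"))
```

Each recovered record ends with exactly one 0D, and neither NUL nor 0A bytes remain in the file.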

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A nested FOR /F loop helps as Dave already mentioned in his initial post

Code:

>test3.txt (
  for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do for /f "delims=" %%j in ("%%i") do echo @%%j
)

test3.txt
4043 6170 7469 6F6E 3D28 5554 432B 3031
3A30 3029 2041 6D73 7465 7264 616D 2C20
4265 726C 696E 2C20 4265 726E 2C20 526F
6D2C 2053 746F 636B 686F 6C6D 2C20 5769
656E
0D0A

Now everything is "normalized" to ASCII. The final 0D 0A comes from the ECHO command.

Some issues remain open:
- Why does CMD remove the BOM? (Or is it only prepended if you redirect to a file?)
- Why doesn't CMD recognize NUL Bytes as string terminators?
- If it doesn't recognize NUL Bytes as string terminators why doesn't it work properly for 0D 00 0A 00?
- Why was the Carriage Return (0D) recognized to be a line break in the inner loop even if it was already ignored in the outer loop?
- But the most important question ever is: Why the heck did M$ develop a console tool with unicode output even if they knew very well that CMD (along with a lot of other console tools) is not able to process its output accordingly?

Regards
aGerman

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 15:54
by Liviu
aGerman wrote:Because the issue comes up again and again I did some tests that should show what actually happens.
Nice detective work there ;-) Now, some guesswork below...

aGerman wrote:Some issues remain open:
- Why does CMD remove the BOM? (Or is it only prepended if you redirect to a file?)
Think it's the latter. CMD itself doesn't insert a BOM, not even when run with /U and redirected to a file. On the other hand, a program such as WMIC can detect whether its output stream goes to the console or is redirected to a file, and can choose to act differently in the two cases - for example, add a BOM in the latter case.

Somewhat related, a program can also decide on its own the output mode - UTF16 vs extended ASCII - regardless of the host CMD running with /U or /A. This explains why WMIC can output UTF16 even at a regular non-/U prompt. As it happens, I only noticed recently that the same can be done from JScript by explicitly using the GetStandardStream(1, true) instead of Echo - see my "This allowed redirection to work, but (surprisingly) the output is _always_ Unicode, even at a 'cmd /a' prompt" in the P.S. at http://www.dostips.com/forum/viewtopic.php?p=33473#p33473.

aGerman wrote:- Why doesn't CMD recognize NUL Bytes as string terminators?
- If it doesn't recognize NUL Bytes as string terminators why doesn't it work properly for 0D 00 0A 00?
Think it's a matter of (screwed up) parsing. Looks to me like the line breaks are detected before the character set translation phase, and at that point LF is recognized as a newline, but since there is no CR _immediately_ preceding, the CR+00 bytes are left in the stream. Then, when the line is converted to the active 8-bit codepage, the characters are "narrowed down", which for regular ASCII means simply discarding the nul 00 bytes.

aGerman wrote:- Why was the Carriage Return (0D) recognized to be a line break in the inner loop even if it was already ignored in the outer loop?
Don't think it's recognized as a line break. Rather, it looks like a side effect to the "for/f discarding one trailing CR" quirk, that was noted before, not too long ago at http://www.dostips.com/forum/viewtopic.php?p=32766#p32766.

aGerman wrote:- But the most important question ever is: Why the heck did M$ develop a console tool with unicode output even if they knew very well that CMD (along with a lot of other console tools) is not able to process its output accordingly?
Can you imagine how boring the world would be without this silly, crazy, tireless cmd ;-)

Liviu

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 16:39
by aGerman
Liviu wrote:Nice detective work there ;-)

It's just an illustration of the stuff that was already discussed here :)
Thanks for looking at the issues though. There is one thing I don't understand yet

Liviu wrote:
aGerman wrote:- Why was the Carriage Return (0D) recognized to be a line break in the inner loop even if it was already ignored in the outer loop?
Don't think it's recognized as a line break. Rather, it looks like a side effect to the "for/f discarding one trailing CR" quirk, that was noted before, not too long ago at http://www.dostips.com/forum/viewtopic.php?p=32766#p32766.

Does this mean that trailing CRs are stripped by FOR /F from each line? The stand-alone CRs become "empty strings" so to speak and can't be parsed by the FOR /F for that reason?

Liviu wrote:
aGerman wrote:- But the most important question ever is: Why the heck did M$ develop a console tool with unicode output even if they knew very well that CMD (along with a lot of other console tools) is not able to process its output accordingly?
Can you imagine how boring the world would be without this silly, crazy, tireless cmd ;-)

I feel very much inclined to agree with you as long as Batch coding is just for fun. Woe to those who need a well-functioning script -- stumbling blocks everywhere :lol:

Regards
aGerman

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 17:04
by Ed Dyreen
aGerman wrote:
Liviu wrote:Can you imagine how boring the world would be without this silly, crazy, tireless cmd ;-)
I feel very much inclined to agree with you as long as Batch coding is just for fun. Woe to those who need a well-functioning script -- stumbling blocks everywhere :lol:
Stop bashing batch !
I used to be playing with java but the inheritance was really annoying, I switched to vbscript but the objects were so annoying. Now I use batch and happily simulate everything very inefficiently. I need a well-functioning script :)

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 17:08
by penpen
I've tricked myself, so this post is wrong; see dbenham's next post.

Under Win XP the \r characters were all ignored/removed when using with "for/F (string)":
test.bat in hex wrote:40 65 63 68 6F 20 6F 66 66 0D 0A 66 6F 72 20 2F 46 20 22 64 65 6C 69 6D 73 3D 22 20 25 25 61 20 69 6E 20 28 22 0D 30 0D 31 0D 32 0D 33 0D 22 29 20 64 6F 20 65 63 68 6F 20 40 25 25 61 80 0D 0A

test.bat using c style escape characters wrote:@echo off
setlocal enableDelayedExpansion
for /F "delims=" %%a in ("\r0\r1\r2\r3\r") do echo @%%a€

Code:

Z:\>test
@0123Ç

penpen

Edit: Marked this post as wrong.

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 17:33
by penpen
I could imagine that wmic changes the encoding of the standard output stream to unicode.
If this is true, then when reading from the redirected "inner for /F input" the characters are treated as unicode code points, so the cmd session may never see any zero characters, and the BOM may be removed automatically in such a case.
In addition, the extra "\r" would be appended by the for loop processing; maybe for uses ANSI when recognizing lines, because cmd uses ANSI or the output is ANSI (or whatever), and removes "000A" because it expects "0D0A" (screwed-up parsing, as Liviu assumes).

If wmic doesn't switch the encoding, then its output should look similar to this:
OutTest.bat:

Code:

@echo off
for /F "delims=" %%a in ('OutTest.exe') do echo @%%a;
OutTest.cs

Code:

using System;
using System.Text;
using System.IO;

public class GotoXY {
   public static void Main (string [] args) {
      Console.Out.Write ("1\02\03\0\r\0\n\0");
   }
}

Result:

Code:

Z:\>OutTest.bat
@1;

penpen

Edit: Added the example.

Re: WMIC and FOR /F : A fix for the trailing <CR> problem

Posted: 13 Apr 2014 17:55
by aGerman
@Ed
Everything has pros and cons. Even if this forum is meant to provide Batch knowledge, it doesn't make any sense not to face the facts. Don't get me wrong, Batch has its raison d'être, but it also has its limitations and pitfalls that we shouldn't keep in the dark. This thread is a good example of how we all try to improve things :wink: I don't know why you thought it's an attack against Batch.

@penpen
Interesting read. I'm pretty sure that WMIC alters the output depending on where the stream is sent. The BOM is a good indicator. But I don't understand why WMIC should change the encoding, although I know that there are WMIC versions that don't output in unicode.

Regards
aGerman