Because this issue comes up again and again, I ran some tests that should show what actually happens.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The output of
WMIC TimeZone get Caption /value
shows
Caption=(UTC+01:00) Amsterdam, Berlin, Bern, Rom, Stockholm, Wien
So far it seems to be OK. But to see what WMIC actually wrote, we have to redirect its output into a file and open that file with a hex editor.
>test1.txt WMIC TimeZone get Caption /value
test1.txt
FFFE 0D00 0A00 0D00 0A00 4300 6100 7000
7400 6900 6F00 6E00 3D00 2800 5500 5400
4300 2B00 3000 3100 3A00 3000 3000 2900
2000 4100 6D00 7300 7400 6500 7200 6400
6100 6D00 2C00 2000 4200 6500 7200 6C00
6900 6E00 2C00 2000 4200 6500 7200 6E00
2C00 2000 5200 6F00 6D00 2C00 2000 5300
7400 6F00 6300 6B00 6800 6F00 6C00 6D00
2C00 2000 5700 6900 6500 6E00 0D00 0A00
0D00 0A00 0D00 0A00
As you can see, the output is a Unicode stream (UTF-16 Little Endian, to be precise). A Byte Order Mark (FF FE) was also prepended, which identifies the encoding as UTF-16 LE.
Every character in this output is 16 bits wide. Because of the little-endian byte order, the zero high byte does not precede the 8-bit ASCII value but follows it. UTF-16 LE is also the reason why every Windows line break (Carriage Return plus Line Feed = 0D 0A) shows up as 0D 00 0A 00.
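As a cross-check, the dump of test1.txt can be decoded with a few lines of Python. This is only a sketch that works on the hex listing as posted; the names DUMP, raw, and text are made up for illustration, and nothing here calls WMIC.

```python
# Hex dump of test1.txt, copied from the listing above.
DUMP = """
FFFE 0D00 0A00 0D00 0A00 4300 6100 7000
7400 6900 6F00 6E00 3D00 2800 5500 5400
4300 2B00 3000 3100 3A00 3000 3000 2900
2000 4100 6D00 7300 7400 6500 7200 6400
6100 6D00 2C00 2000 4200 6500 7200 6C00
6900 6E00 2C00 2000 4200 6500 7200 6E00
2C00 2000 5200 6F00 6D00 2C00 2000 5300
7400 6F00 6300 6B00 6800 6F00 6C00 6D00
2C00 2000 5700 6900 6500 6E00 0D00 0A00
0D00 0A00 0D00 0A00
"""

# Drop all whitespace from the listing and turn the hex digits into bytes.
raw = bytes.fromhex("".join(DUMP.split()))

# The first two bytes are the UTF-16 LE Byte Order Mark.
has_bom = raw[:2] == b"\xff\xfe"

# After the BOM, every second byte is the high byte of a 16-bit unit.
# For plain ASCII text all of them are zero.
high_bytes_zero = all(b == 0 for b in raw[3::2])

# The generic "utf-16" codec reads the BOM, picks little endian, and drops it.
text = raw.decode("utf-16")

print(has_bom, high_bytes_zero)
print(repr(text))
```

The decoded string contains ordinary CR LF pairs again, which shows that the file itself is well formed and that the trouble only starts once CMD reads the stream.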
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
What happens if we try to process the output inside of a FOR /F loop?
I prepended an @ (40) to mark the beginning of every line that FOR /F did not recognize as empty. Lines that FOR /F considers empty are skipped automatically, so they would not show up at all.
>test2.txt (
for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do echo @%%i
)
test2.txt
400D 0D0A 400D 0D0A 4043 6170 7469 6F6E
3D28 5554 432B 3031 3A30 3029 2041 6D73
7465 7264 616D 2C20 4265 726C 696E 2C20
4265 726E 2C20 526F 6D2C 2053 746F 636B
686F 6C6D 2C20 5769 656E 0D0D 0A40 0D0D
0A40 0D0D 0A40 0D0D 0A
The normal line break is 0D 0A, but now we see the strange 0D 0D 0A. Keep in mind, though, that everything was written using ECHO, and ECHO appends a line break to every string. To see the actual contents of %%i we should use SET /P instead.
>test2a.txt (
for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do <nul set /p "=@%%i"
)
test2a.txt
400D 400D 4043 6170 7469 6F6E 3D28 5554
432B 3031 3A30 3029 2041 6D73 7465 7264
616D 2C20 4265 726C 696E 2C20 4265 726E
2C20 526F 6D2C 2053 746F 636B 686F 6C6D
2C20 5769 656E 0D40 0D40 0D40 0D
Now things are getting clearer. While the NUL bytes of the Unicode characters were simply removed, something strange must have happened to the 0A 00. Probably because of the NUL byte, only 0A was recognized as the line break. The 0D, however, is now left at the end of each line.
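The two dumps can be checked against each other: if every value of %%i really ends in a lone 0D, then appending ECHO's own 0D 0A to each value should reproduce test2.txt byte for byte. The following Python sketch does exactly that with the posted listings; from_dump and the other names are hypothetical helpers, not anything CMD provides.

```python
def from_dump(dump):
    """Turn a whitespace-separated hex listing into bytes."""
    return bytes.fromhex("".join(dump.split()))

# test2.txt: written with ECHO, which appends CR LF to every string.
test2 = from_dump("""
400D 0D0A 400D 0D0A 4043 6170 7469 6F6E
3D28 5554 432B 3031 3A30 3029 2041 6D73
7465 7264 616D 2C20 4265 726C 696E 2C20
4265 726E 2C20 526F 6D2C 2053 746F 636B
686F 6C6D 2C20 5769 656E 0D0D 0A40 0D0D
0A40 0D0D 0A40 0D0D 0A
""")

# test2a.txt: written with SET /P, which appends nothing.
test2a = from_dump("""
400D 400D 4043 6170 7469 6F6E 3D28 5554
432B 3031 3A30 3029 2041 6D73 7465 7264
616D 2C20 4265 726C 696E 2C20 4265 726E
2C20 526F 6D2C 2053 746F 636B 686F 6C6D
2C20 5769 656E 0D40 0D40 0D40 0D
""")

# Each '@' marks the start of one %%i value; the text itself contains no '@'.
values = test2a.split(b"@")[1:]

# Rebuild what ECHO would have written: marker, value, then CR LF.
rebuilt = b"".join(b"@" + value + b"\r\n" for value in values)

print(len(values))
print(all(value.endswith(b"\r") for value in values))
print(rebuilt == test2)
```

So the 0D 0D 0A in test2.txt is simply the stray 0D from %%i followed by the regular line break that ECHO adds.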
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A nested FOR /F loop helps, as Dave already mentioned in his initial post:
>test3.txt (
for /f "delims=" %%i in ('WMIC TimeZone get Caption /value') do for /f "delims=" %%j in ("%%i") do echo @%%j
)
test3.txt
4043 6170 7469 6F6E 3D28 5554 432B 3031
3A30 3029 2041 6D73 7465 7264 616D 2C20
4265 726C 696E 2C20 4265 726E 2C20 526F
6D2C 2053 746F 636B 686F 6C6D 2C20 5769
656E 0D0A
Now everything is "normalized" to ASCII. The final
0D 0A comes from the ECHO command.
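The visible effect of the inner loop can be modelled in Python: drop the trailing 0D from each value, skip whatever is empty afterwards, and let ECHO add its line break. This is only a model of the observed result, under the assumption that the inner FOR /F behaves this way; it says nothing about how CMD does it internally. The list values holds the %%i contents seen in test2a.txt.

```python
# Contents of %%i as observed in test2a.txt (each value ends in a lone CR).
values = [
    b"\r",
    b"\r",
    b"Caption=(UTC+01:00) Amsterdam, Berlin, Bern, Rom, Stockholm, Wien\r",
    b"\r",
    b"\r",
    b"\r",
]

# Hex dump of test3.txt, copied from the listing above.
TEST3 = """
4043 6170 7469 6F6E 3D28 5554 432B 3031
3A30 3029 2041 6D73 7465 7264 616D 2C20
4265 726C 696E 2C20 4265 726E 2C20 526F
6D2C 2053 746F 636B 686F 6C6D 2C20 5769
656E 0D0A
"""
expected = bytes.fromhex("".join(TEST3.split()))

# Assumed model of the inner loop: strip the trailing CR, skip empty results.
inner = [v.rstrip(b"\r") for v in values]
kept = [v for v in inner if v]

# ECHO writes the marker, the value, and a CR LF line break.
output = b"".join(b"@" + v + b"\r\n" for v in kept)

print(len(kept))
print(output == expected)
```

Only the Caption line survives in this model, and the resulting bytes match test3.txt exactly.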
Some issues remain open:
- Why does CMD remove the BOM? (Or is it only prepended if you redirect to a file?)
- Why doesn't CMD recognize NUL Bytes as string terminators?
- If it doesn't recognize NUL Bytes as string terminators, why doesn't it work properly for 0D 00 0A 00?
- Why was the Carriage Return (0D) recognized as a line break in the inner loop even though it was ignored in the outer loop?
- But the most important question of all is: Why the heck did M$ develop a console tool with Unicode output even though they knew very well that CMD (along with a lot of other console tools) is not able to process that output properly?
Regards
aGerman