Japanese cmd
Posted: 08 Apr 2014 03:19
Hello, I was curious about how cmd handles Unicode characters in variables.
So I got a Japanese Windows.
My first impression was that the path separator is different. It looks like a Y with a hyphen (the yen sign, ¥), and it looks the same in Notepad too.
But after redirecting the output of cd. to a file and opening that file on an English Windows and in a hexadecimal editor, I found that it is the same \ character. On the Japanese Windows it is displayed as a Y with a hyphen, but internally it is the same \ as always.
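The separator observation can also be checked outside cmd. This is only a minimal Python sketch, assuming Python's built-in cp932 codec matches the code page of the Japanese Windows; the path is a made-up example, not the one from my test.

```python
# Hypothetical example path; any path containing a backslash would do.
path = "C:\\Users\\test"

# Encode with code page 932, roughly what a redirected cd would write.
raw = path.encode("cp932")

# Collect the byte value of every separator in the encoded output.
# The path is pure ASCII, so bytes and characters line up one to one.
separator_bytes = {b for b, ch in zip(raw, path) if ch == "\\"}

print(raw.hex(" "))
print(separator_bytes == {0x5C})
```

The separator is stored as the ordinary byte 0x5C, so the Y with a hyphen is only a matter of how it is drawn.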
The code page that it uses is 932.
I was able to store Unicode text in a variable correctly. I also created a script, and it works when I save it as ANSI from Notepad:
I tested saving it encoded as Unicode, Unicode Big Endian and UTF-8, and it only works in cmd when saved as ANSI.
Then I asked myself how it represents the Unicode characters internally using 8-bit bytes. My answer is that cmd translates the sequence of 8-bit characters to Unicode, which would fit with the \ being shown as the Y with a hyphen.
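To illustrate why only the ANSI file works, here is a small Python sketch. It assumes that Notepad's "ANSI" on a Japanese Windows means cp932, that "Unicode" and "Unicode Big Endian" mean UTF-16 with a byte order mark, and that "UTF-8" is written with a BOM as well. The helper read_as_cp932 is a hypothetical stand-in for cmd reading the file in code page 932, not cmd's real code.

```python
import codecs

word = "倉庫番"  # "sokoban" written in kanji

# The same text as each Notepad save option would presumably write it.
saved = {
    "ANSI": word.encode("cp932"),
    "Unicode": codecs.BOM_UTF16_LE + word.encode("utf-16-le"),
    "Unicode Big Endian": codecs.BOM_UTF16_BE + word.encode("utf-16-be"),
    "UTF-8": codecs.BOM_UTF8 + word.encode("utf-8"),
}


def read_as_cp932(data: bytes) -> str:
    """Interpret raw file bytes as code page 932 text (assumed cmd behaviour)."""
    return data.decode("cp932", errors="replace")


for name, data in saved.items():
    survives = read_as_cp932(data) == word
    print(f"{name:20} {data.hex(' '):36} survives: {survives}")
```

Only the ANSI bytes decode back to the original word. In that file each kanji takes two 8-bit bytes, which are turned into one Unicode character when read.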
This is an image of the hexadecimal dump of the working ANSI batch script that stores Unicode text in a variable.
I used the only Japanese word that I know: sokoban.
It is also interesting that it uses the raster font (or "Terminal" for programmers), yet it has the Japanese characters.