[Updated] Patch for cmd.exe for windows xp for cp 65001

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

[Updated] Patch for cmd.exe for windows xp for cp 65001

#1 Post by carlos » 11 May 2014 02:07

Hello.
Because cmd.exe on Windows XP cannot run a batch script when code page 65001 (UTF-8) is active, I investigated the cause and found it.
I also created a patch for it.

I correctly ran a batch script encoded as UTF-8 without BOM under code page 65001 from cmd2.exe:
Image
65001.bat is a UTF-8 encoded batch file (without BOM) that sets a variable called sokoban. As you can see in the image, the batch script does not break as it does in a normal cmd.exe, because cmd2.exe has the patch.

Edit: Originally I wrote a patch for this, but Jason Hood provided a better one. The next posts are comments about my version of the patch.
A summary of the patch solution is at this link: http://consolesoft.com/p/cmd-xp-65001-fix/index.html


The solution is also posted here: http://www.dostips.com/forum/viewtopic.php?p=34428#p34428
Updated 19 May 2014.

Thanks to Jason Hood.
Last edited by carlos on 27 May 2014 02:09, edited 13 times in total.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Patch for cmd.exe for windows xp for cp 65001

#2 Post by penpen » 11 May 2014 11:57

I'm just curious: What is the cause of the bug?
And what are the data values standing for, that your patch changes?

penpen

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#3 Post by carlos » 11 May 2014 13:29

penpen, there is an explanation in the comments of the source.
Last edited by carlos on 11 May 2014 19:32, edited 1 time in total.

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#4 Post by Aacini » 11 May 2014 13:55

@carlos: I think the really interesting question is: How do you know that these sequences of bytes: { 0x75, 0xF8, 0x57, 0x6A, 0x01, 0xFF, 0x35 } and/or { 0x03, 0xF3, 0x56, 0x6A, 0x01, 0xFF, 0x35 } correspond to the invocation of the MultiByteToWideChar function?

Many years ago I patched command.com in order to have ECHO OFF by default. I used debug.com to load command.com, run it, execute ECHO ON and EXIT, and save the file on disk; then I repeated the procedure with ECHO OFF. After that, a comparison showed that the two saved files differed in just one byte: the one holding the ECHO status. So I knew where I needed to modify the original command.com in order to set ECHO OFF by default.
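The compare-two-dumps idea described above can be sketched in a few lines. This is only an illustration in Python, not the debug.com procedure itself; `diff_offsets` and the sample byte strings are made-up names and data.

```python
def diff_offsets(a: bytes, b: bytes) -> list:
    """Return the offsets at which two equally sized dumps differ."""
    if len(a) != len(b):
        raise ValueError("dumps must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Hypothetical data: two dumps that differ only in one status byte.
dump_echo_on = bytes([0x90, 0x01, 0xC3])
dump_echo_off = bytes([0x90, 0x00, 0xC3])
print(diff_offsets(dump_echo_on, dump_echo_off))
```

If the result is a single offset, that byte is the obvious candidate to patch in the original file.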

How do you know that cmd.exe uses the MultiByteToWideChar function? How do you know the sequence of bytes that represents the parameters of that function in the cmd.exe code?

Antonio

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Patch for cmd.exe for windows xp for cp 65001

#5 Post by penpen » 11 May 2014 17:22

carlos wrote: penpen, I copy the comment from the source that explains:
Sorry, I asked in the wrong way and missed how easy it is to misunderstand my questions above ...
(I wanted to ask short questions, but forgot that you are not able to read my mind; I'm lacking concentration because of a bad headache.)
I should have asked in another way.

Well, I've read your documentation and used a DLL import viewer to find the code location of MultiByteToWideChar (4AD01158).
Then I used a hex editor to search for 5811D04A (possible references to that function), and I saw that all of them are preceded by "FF15", so all seem to be calls to that function (I cannot read all opcodes, but some look familiar to me):
FF15 5811D04A == call ds:MultiByteToWideChar.
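The hex-editor search described above can also be scripted. The following is a minimal Python sketch under stated assumptions: `find_all` is a made-up helper, and `sample` is a synthetic buffer rather than the real cmd.exe; only the byte pattern FF15 5811D04A comes from this post.

```python
# "call dword ptr [4AD01158]": opcode FF 15 followed by the
# little-endian address of the import table entry.
CALL_IAT = bytes.fromhex("FF15") + (0x4AD01158).to_bytes(4, "little")


def find_all(data: bytes, pattern: bytes) -> list:
    """Return every offset at which pattern occurs in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


# Synthetic buffer for illustration only (not real cmd.exe contents).
sample = b"\x90" * 3 + CALL_IAT + b"\x6A\x01" + CALL_IAT
print([hex(o) for o in find_all(sample, CALL_IAT)])
```

On a real image one would read the file with `open(path, "rb").read()` and pass the result as `data`.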

So you have changed the "dwFlags" argument of the first two calls to MultiByteToWideChar, but you haven't changed the other 4 calls to that function:

Code: Select all

file offset | code offset | dwFlags set to
------------+-------------+----------------
       5E57 |    4AD06A57 |              1
       A477 |    4AD0B077 |              1
      105CE |    4AD111CE |       ebx ?= 0
      162E6 |    4AD16EE6 |              1
      1A3A5 |    4AD1AFA5 |       ebx == 1
      1C2DD |    4AD1CEDD |              1
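The two offset columns in the table above differ by a constant, which a short Python check makes visible. This is a sketch only: `file_to_va` is a made-up name, and a single constant delta is assumed to hold because all six calls appear to sit in the same section of the image.

```python
# (file offset, code offset) pairs copied from the table above.
ROWS = [
    (0x5E57, 0x4AD06A57),
    (0xA477, 0x4AD0B077),
    (0x105CE, 0x4AD111CE),
    (0x162E6, 0x4AD16EE6),
    (0x1A3A5, 0x4AD1AFA5),
    (0x1C2DD, 0x4AD1CEDD),
]

# Assumption: one delta is valid for every row (same section).
DELTA = ROWS[0][1] - ROWS[0][0]


def file_to_va(file_offset: int) -> int:
    """Map a file offset to its code offset using the constant delta."""
    return file_offset + DELTA


for file_offset, code_offset in ROWS:
    assert file_to_va(file_offset) == code_offset

print(hex(DELTA))
```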
So what I wanted to know with my questions above:
'What is the cause of the bug?' ==
'Because you didn't change the "dwFlags" argument of the other calls to MultiByteToWideChar, and
because you have written "dwFlags should be 0",
I assumed that what you described in your documentation is just a side effect, and I wanted to know the real cause.'

'And what are the data values standing for, that your patch changes?' ==
'What does MB_PRECOMPOSED mean?'
(Well, the second question was just because I was too lazy to use Google, and I think you should know its meaning.)

@Aacini:
Once you've found out that the second parameter has to be changed (I only assumed it is the second; I haven't looked it up),
you just have to find out what's above the call (maybe using the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2, Instruction Set Reference, A-Z):

Code: Select all

FF75 F8       == push [ebp-8]
57            == push edi
6A 01         == push 1
FF35 ECB9D24A == push dword ptr [4A...]
I think carlos just interpreted as many bytes as needed to make these hex sequences unique.

The Intel 64 and IA-32 Architectures Software Developer Manuals can be downloaded from this location:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html?iid=tech_vt_tech+64-32_manuals

penpen

Edit1: Added the "Intel 64 and IA-32 Architectures Software Developer Manuals" link.
Edit2: I've changed the file location of the third call from (105C0, 4AD111C0) to the real value (E at the end instead of the 0).
Last edited by penpen on 12 May 2014 01:33, edited 1 time in total.

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#6 Post by carlos » 11 May 2014 17:51

@aacini, @penpen: Edited: I removed this info because reverse engineering laws differ between countries. But in Chile, law 20435 allows me to reverse engineer software in order to correct its functioning.
Last edited by carlos on 11 May 2014 19:43, edited 4 times in total.

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: Patch for cmd.exe for windows xp for cp 65001

#7 Post by Liviu » 11 May 2014 19:31

carlos wrote: Because cmd.exe on Windows XP cannot run a batch script when code page 65001 (UTF-8) is active, I investigated the cause and found it.
I also created a patch for it.
That's quite a feat. Cmd2 seems to run batch tasks under codepage 65001, which xp couldn't. Don't know how applicable it's going to be at large, since it means running a customized CLI (command line interpreter) and maybe overriding ComSpec for external calls, too. But, technically speaking, that's some very nice detective work there.

carlos wrote: I compared the calls of cmd for Windows XP and Windows 7. Windows 7 calls the internal function IsMBTWCConversionTypeFlagsSupported to determine the correct flag; this function returns 0 or 1, and for 65001 it returns 0. As you see, Windows XP hardcodes 1 as dwFlags.
This part is a bit worrisome. It could mean that some other obscure behavior might be broken by setting the flag to 0 unconditionally.

carlos wrote: I don't know how to write a codecave in assembly. A 100% patch would introduce the code of IsMBTWCConversionTypeFlagsSupported and the call used in Windows 7.
Guess the by-the-book approach would be to write a cmd launcher and use something like Microsoft's "detours" library to intercept the actual MultiByteToWideChar calls, then insert whatever custom code in the middle. However, I don't have first hand experience with that, and frankly I am not too interested in low level cmd hacking.

Liviu

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#8 Post by carlos » 11 May 2014 19:48

Liviu: the documentation says:

For UTF-8 ..., dwFlags must be set to either 0 ....
Otherwise, the function fails with ERROR_INVALID_FLAGS.

This is simple.
On XP: if you need to launch a batch script under code page 65001, use cmd2.exe; otherwise use the normal cmd.exe.
Last edited by carlos on 11 May 2014 21:07, edited 1 time in total.

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: Patch for cmd.exe for windows xp for cp 65001

#9 Post by Liviu » 11 May 2014 20:43

carlos wrote:For UTF-8 ..., dwFlags must be set to either 0 ....
Otherwise, the function fails with ERROR_INVALID_FLAGS.
Right, and also for a few other codepages listed at http://msdn.microsoft.com/en-us/library/windows/desktop/dd319072(v=vs.85).aspx. That's what makes your patch work for codepage 65001. However, the rest of the codepages not in that list used MB_PRECOMPOSED for conversion, and after your patch will use 0, instead. To make it even more confusing, the same docs say that MB_PRECOMPOSED is the default in case no flags are specified (i.e. dwFlags = 0). Which raises the question in my mind why Win7 decided to fix the issue by replacing "1" with a conditional value returned by another function, instead of simply hardcoding it to "0" (as your patch did) - which by the letter of the documentation should work for all codepages and default to "1" for those which support it. My guess is that there could be a difference between the two, and that could result in different behaviors in who-knows-what special cases.

carlos wrote: On XP: if you need to launch a batch script under code page 65001, use cmd2.exe; otherwise use the normal cmd.exe.
Right again, but if the batch calls in turn another "%comspec% /c another.bat" then that call would use the original cmd, unless ComSpec itself was overridden, first.

Liviu

P.S. I see that you redacted out one of the previous posts. IANAL but I'd be surprised if anyone anywhere saw a problem with a disassembly listing. You can get as much with MS' own WinDbg or Visual C++. Also, API hooking is well known, documented, and supported by MS' own "detours", which is vastly more powerful and far reaching than hex-editing a "1" to a "0".

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#10 Post by carlos » 11 May 2014 21:01

Thanks Liviu. You are right.
cmd.exe for Windows 7 uses 1 or 0 in dwFlags (not always 0, as my current patch does).
I have many things to learn. I have the patch idea very clear in my mind, but I need to finish learning assembly; maybe Aacini can help me with it.
I hope to post a new version of the patcher soon.
I will also look at WinDbg.

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#11 Post by carlos » 12 May 2014 05:49

I'm currently writing a codecave to do a full patch of this. Please be patient.

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#12 Post by carlos » 13 May 2014 13:31

I investigated more and found some interesting things.

The documentation says about MultiByteToWideChar:

MB_PRECOMPOSED Default;

So if you specify 0 as dwFlags, MB_PRECOMPOSED may be used anyway.

Edit: this is the same thing Liviu said. I did not read it carefully. :oops:
Last edited by carlos on 14 May 2014 13:21, edited 2 times in total.

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: Patch for cmd.exe for windows xp for cp 65001

#13 Post by carlos » 14 May 2014 03:38

Hello. Jason Hood, the author of ansicon, wrote the codecave, the full patch. After this patch, cmd.exe for Windows XP handles MultiByteToWideChar the same way as cmd.exe for Windows 7. This means that code page 65001 is fully supported.

This is the patch:

Edit: Patch updated 19 may 2014.

Note: This only works on version 5.1.2600.5512 (the version that comes with service pack 3 for windows xp).

cmd-utf8-new.txt

Code: Select all

# Patch XP's CMD.EXE (5.1.2600.5512) to work with UTF-8 batch files.
# Updated 19 may 2014

File: cmd.exe
005E57: E8C49B010090   [ FF155811D04A ]
00A477: E8A455010090   [ FF155811D04A ]
0162E6: E83597000090   [ FF155811D04A ]
01A3A5: E87656000090   [ FF155811D04A ]
01C2DD: E83E37000090   [ FF155811D04A ]
01FA20: 8B44E40485C07519687C49D04AFF153C  [ 00000000000000000000000000000000 ]
        11D04A68AC06D24A50FF153811D04AFF  [ 00000000000000000000000000000000 ]
        D03D35C4000074570F8727            [ 0000000000000000000000 ]
01FA4E: 83F82A744C3D2CC4000072153D2EC400  [ 00000000000000000000000000000000 ]
        00763E3D31C4000074373D33C4000074  [ 00000000000000000000000000000000 ]
        30FF255811D04A3DC8CE000074233D98  [ 00000000000000000000000000000000 ]
        D60000741C3DAADE000072E53DB3DE00  [ 00000000000000000000000000000000 ]
        00760E3DE8FD000072D73DE9FD000077  [ 00000000000000000000000000000000 ]
        D0C644E40800FF255811D04A00004765  [ 00000000000000000000000000000000 ]
        74414350                          [ 00000000 ]
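As a sanity check on the listing above, the five E8 replacements can be decoded to confirm that they all jump to the codecave at file offset 01FA20. The Python sketch below is an illustration only: `call_target` is a made-up helper, and `DELTA` (file offset to code address) is an assumption derived from the offsets quoted earlier in this thread (005E57 corresponds to 4AD06A57).

```python
# Assumption: constant file-offset-to-address delta for the code section.
DELTA = 0x4AD00C00

# New bytes for each patched call site, copied from the patch file.
PATCHES = {
    0x005E57: "E8C49B010090",
    0x00A477: "E8A455010090",
    0x0162E6: "E83597000090",
    0x01A3A5: "E87656000090",
    0x01C2DD: "E83E37000090",
}


def call_target(file_offset: int, code_hex: str) -> int:
    """Decode 'E8 rel32' and return the address the call transfers to."""
    code = bytes.fromhex(code_hex)
    if code[0] != 0xE8:
        raise ValueError("not a near relative call")
    rel32 = int.from_bytes(code[1:5], "little", signed=True)
    address = file_offset + DELTA
    # rel32 is relative to the end of the 5-byte call instruction.
    return (address + 5 + rel32) & 0xFFFFFFFF


targets = {hex(call_target(off, code)) for off, code in PATCHES.items()}
print(targets)
print(hex(0x01FA20 + DELTA))
```

The trailing 90 in each replacement is a NOP that pads the 5-byte relative call to the 6 bytes of the original FF15 call.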



To apply it, copy cmd.exe from windows\system32 to a folder.
Then put the bwpatch.exe utility (also from him) in that folder. You can get it from here:
direct url:

Code: Select all

http://adoxa.altervista.org/misc/dl.php?f=bwpatch-w

url with other options:

Code: Select all

http://adoxa.altervista.org/misc/#bwpatch


then in that folder run this command:

Code: Select all

bwpatchw.exe cmd.exe -f cmd-utf8-new.txt
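For readers who want to see what a verify-then-write patch step looks like, here is a minimal Python sketch. It is not how bwpatch is implemented (that is not shown in this thread); `apply_patch` is a made-up function, and reading the patch lines as "offset: new bytes [ old bytes ]" is an interpretation of the listing above. The demo buffer is synthetic.

```python
def apply_patch(image: bytearray, offset: int, new_hex: str, old_hex: str) -> None:
    """Write new bytes at offset, but only if the old bytes are found there."""
    new = bytes.fromhex(new_hex)
    old = bytes.fromhex(old_hex)
    if len(new) != len(old):
        raise ValueError("old and new byte runs must have the same length")
    current = bytes(image[offset:offset + len(old)])
    if current == new:
        return  # already patched, nothing to do
    if current != old:
        raise ValueError("unexpected bytes at offset %06X" % offset)
    image[offset:offset + len(new)] = new


# Synthetic 16-byte buffer standing in for a file (not real cmd.exe).
image = bytearray(b"\x00" * 4 + bytes.fromhex("FF155811D04A") + b"\x00" * 6)
apply_patch(image, 4, "E8C49B010090", "FF155811D04A")
print(image.hex())
```

Checking the old bytes first is what makes a patch like this safe to refuse on any cmd.exe version other than the one it was written for.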


Now the cmd.exe in the current folder is fully patched. To use it as the default, run these commands:

Code: Select all

Copy /Y cmd.exe "%SystemRoot%\system32\cmdutf8.exe"
Set "key=HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\cmd.exe"
Reg.exe add "%key%" /v "Debugger" /d "%SystemRoot%\system32\cmdutf8.exe" /f
exit


This is the source code of the patch. It uses the syntax of Multimate Assembler:

Code: Select all

http://rammichael.com/multimate-assembler

An anonymous label is defined as: @@:
The nearest following anonymous label is referred to as: @f


cmd-xp-new.asm

Code: Select all

; Patch XP's CMD.EXE (5.1.2600.5512) to work with UTF-8 batch files.
; Method discovered by Carlos, patch by adoxa.
; Created 14 may 2014
; Fixed 19 may 2014 by Carlos

<4ad06a57>
call 4ad20620
nop
<4ad0b077>
call 4ad20620
nop
<4ad16ee6>
call 4ad20620
nop
<4ad1afa5>
call 4ad20620
nop
<4ad1cedd>
call 4ad20620
nop

<4ad20620>
mov eax,[esp+4] ;; code page
test eax,eax
jnz short @f
push 4ad0497c ;; push lpModuleName =  L"kernel32.dll"
call dword[4ad0113c] ;; hModule = call GetModuleHandleW
push @GetACP ;; push lpProcName = "GetACP"
push eax ;; push hModule
call dword[4ad01138] ;; *func = GetProcAddress
call eax ;; call func()
@@:
cmp eax,50229.
je short @f
ja @bigger
cmp eax,42.
je short @f
cmp eax,50220.
jb short @ok
cmp eax,50222.
jbe short @f
cmp eax,50225.
je short @f
cmp eax,50227.
je short @f
@ok:
jmp dword[4ad01158] ;; MultiByteToWideChar
@bigger:
cmp eax,52936.
je short @f
cmp eax,54936.
je short @f
cmp eax,57002.
jb short @ok
cmp eax,57011.
jbe short @f
cmp eax,65000.
jb short @ok
cmp eax,65001.
ja short @ok
@@:
mov byte[esp+8],0 ;; flags
jmp dword[4ad01158] ;; MultiByteToWideChar

@GetACP@4: "GetACP\0"
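For readers who don't read assembly, the decision logic of the codecave above can be restated roughly as follows. This is a Python paraphrase for illustration only, not part of the patch; `effective_flags`, `ZERO_FLAG_CODE_PAGES` and the `acp` parameter (standing in for the GetACP() call) are names made up here.

```python
# Code pages for which the codecave clears the flags before calling
# MultiByteToWideChar, as read from the cmp/je/jbe chain above.
ZERO_FLAG_CODE_PAGES = (
    {42, 50225, 50227, 50229, 52936, 54936, 65000, 65001}
    | set(range(50220, 50223))   # 50220..50222
    | set(range(57002, 57012))   # 57002..57011
)


def effective_flags(code_page: int, flags: int, acp: int = 1252) -> int:
    """Return the dwFlags value the patched call would pass on.

    acp is a hypothetical stand-in for GetACP(); the assembly calls
    the real function when the code page argument is 0 (CP_ACP).
    """
    if code_page == 0:
        code_page = acp
    if code_page in ZERO_FLAG_CODE_PAGES:
        # Mirrors "mov byte[esp+8],0": only the low byte is cleared.
        return flags & ~0xFF
    return flags


print(effective_flags(65001, 1), effective_flags(437, 1))
```

Since the patched call sites push 1 as dwFlags, clearing the low byte gives 0 for the listed code pages and leaves 1 for all others.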

Last edited by carlos on 27 May 2014 02:11, edited 5 times in total.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: [Done] Patch for cmd.exe for windows xp for cp 65001

#14 Post by penpen » 14 May 2014 05:59

The MultiByteToWideChar call at file location 105CE (code location 4AD111CE) is not patched; what is the reason?

penpen

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile
Contact:

Re: [Done] Patch for cmd.exe for windows xp for cp 65001

#15 Post by carlos » 14 May 2014 12:48

penpen, it was left as is, using dwFlags 0; it corresponds to the function related to the type command. The reason is that cmd.exe for Windows XP (bugged) uses dwFlags 1 for all calls except that one, and Windows 8 also uses dwFlags 0 in the same function.
So the patch created by Jason is perfect. It makes the calls as listed next, using cmd.exe for Windows 7 and Windows 8 as reference.


Calls to MultiByteToWideChar after the patch:
In-Function               ; dwFlags ; CodePage
-----------------------------------------------
FindMsg                   ; eval    ; GetACP()
ReadBufFromFile           ; eval    ; CurrentCP
FParseWork                ; eval    ; CurrentCP
TyWork                    ; 0       ; CurrentCP
RestoreCurrentDirectories ; eval    ; CurrentCP
WriteMsgString            ; eval    ; GetACP()
