For background, codepage 65001 has long been known as the UTF-8 codepage, but it was officially unsupported and mostly useless due to critical limitations (apart from one-off tricks such as converting text files to UTF-8 encoding, for example viewtopic.php?p=16399#p16399). Two of those critical limitations, broken parsing and broken for loops, appear to have been lifted in Win7. This post discusses the for loop part.
Previously, under XP (and probably Vista too, though I have not verified that), for loops simply did not work while codepage 65001 was active, neither in batch files nor even at the cmd prompt. They now seem to work correctly in Win7, including the necessary conversions between Windows' native UTF-16 and the active UTF-8 codepage. As an example, start a cmd prompt (using Lucida Console, i.e. a non-raster font) in an initially empty C:\tmp directory, then create the following files and set some of them to +s (system) and/or +h (hidden).
Code:
C:\tmp>(copy nul ‹αß©∂€›
More? copy nul ‹αß©∂€›.h
More? copy nul ‹αß©∂€›.s
More? copy nul ‹αß©∂€›.sh
More? attrib +h ‹αß©∂€›.h
More? attrib +s ‹αß©∂€›.s
More? attrib +s +h ‹αß©∂€›.sh)
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
1 file(s) copied.
C:\tmp>attrib *
A C:\tmp\‹αß©∂€›
A H C:\tmp\‹αß©∂€›.h
A S C:\tmp\‹αß©∂€›.s
A SH C:\tmp\‹αß©∂€›.sh
C:\tmp>
In XP (sp3) the following commands return...
Code:
C:\tmp>ver
Microsoft Windows XP [Version 5.1.2600]
C:\tmp>for %d in (*) do @echo %~ad %d
--a------ ‹αß©∂€›
--a-s---- ‹αß©∂€›.s
C:\tmp>chcp 437
Active code page: 437
C:\tmp>for /f "delims=" %d in ('dir /a /b') do @echo %~ad %d
<αßc??>
<αßc??>.h
<αßc??>.s
<αßc??>.sh
C:\tmp>chcp 65001
Active code page: 65001
C:\tmp>for /f "delims=" %d in ('dir /a /b') do @echo %~ad %d
C:\tmp>
Now, in Win7 (x64.sp1) the same commands return...
Code:
C:\tmp>ver
Microsoft Windows [Version 6.1.7601]
C:\tmp>for %d in (*) do @echo %~ad %d
--a------ ‹αß©∂€›
--a-s---- ‹αß©∂€›.s
C:\tmp>chcp 437
Active code page: 437
C:\tmp>for /f "delims=" %d in ('dir /a /b') do @echo %~ad %d
<αßc??>
<αßc??>.h
<αßc??>.s
<αßc??>.sh
C:\tmp>chcp 65001
Active code page: 65001
C:\tmp>for /f "delims=" %d in ('dir /a /b') do @echo %~ad %d
--a------ ‹αß©∂€›
--ah----- ‹αß©∂€›.h
--a-s---- ‹αß©∂€›.s
--ahs---- ‹αß©∂€›.sh
C:\tmp>
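For use in a batch file rather than at the prompt, the last loop might be wrapped roughly as below. This is only a sketch under assumptions: it presumes Win7 or later (where batch parsing under 65001 appears to work), it assumes the English "Active code page: NNN" output format of chcp (other locales may word it differently or append a trailing period), and the variable name oldcp is mine, chosen for illustration.

```bat
@echo off
setlocal

rem Remember the codepage active on entry by parsing the output of chcp.
rem Assumes the English format "Active code page: NNN".
for /f "tokens=2 delims=:" %%c in ('chcp') do set "oldcp=%%c"
rem Strip the leading space left over after the colon.
set "oldcp=%oldcp: =%"

rem Switch to UTF-8 so the names returned by dir survive the round trip.
chcp 65001 >nul

rem Same loop as at the prompt, with the percent signs doubled for batch.
for /f "delims=" %%d in ('dir /a /b') do echo %%~ad %%d

rem Restore the original codepage before exiting.
chcp %oldcp% >nul
endlocal
```

Restoring the original codepage at the end keeps the 65001 setting from leaking into whatever runs next in the same console.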
As noted, support is still far from complete. For example, Win7 still fails if the command inside the last for loop contains a pipe while codepage 65001 is active...
Code:
C:\tmp>for /f "delims=" %d in ('dir /a /b ^| more') do @echo %~ad %d
Not enough memory.
C:\tmp>
Liviu