Fascinating topic
I'm a bit late to join in. But hopefully better late than never.
Long ago I had verified that values 0x80-0xFE are all available to FOR /F on my machine, which happens to use code page 437. But I only confirmed individual characters, one at a time. I did not bother to look at the sequencing, or investigate the effect of changing the code page.
Here is a summary of my understanding of the discussion so far - nothing new here
Based on what has been written, plus some experimentation of my own, it looks like Windows interprets characters according to the active code page and stores each value internally as a Unicode code point (UTF-16 or UTF-32; I don't know which). When parsing FOR tokens, the tokens are stored in a (presumably 0-based) array, and the base character establishes the code point offset. That offset is subtracted from the code point of each FOR /F variable character to determine the corresponding index into the array of tokens.
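The theory can be modeled in a few lines of Python. This is only a sketch of the arithmetic, not a description of how cmd.exe is actually implemented, and it assumes that Python's `cp437` and `cp850` codecs agree with the Windows code page tables. The function names are made up for illustration.

```python
def code_point(byte_value: int, codepage: str) -> int:
    """Code point that a single byte decodes to under the given code page."""
    return ord(bytes([byte_value]).decode(codepage))


def token_number(base: int, char: int, codepage: str) -> int:
    """1-based token number predicted for `char` when `base` names token 1.

    Assumption (the theory under discussion): the 0-based array index is
    the code point of `char` minus the code point of `base`.
    """
    return code_point(char, codepage) - code_point(base, codepage) + 1


# Base 0x80 under code page 437, checked against three probed characters.
for char in (0x80, 0x90, 0xA5):
    print(f"\\x{char:02X} -> token {token_number(0x80, char, 'cp437')}")
```

A result below 1 or above the number of requested tokens simply means the character is out of reach of that base.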
Most code pages used on this forum interpret 0x00-0x7F as ASCII, and the corresponding code points have the same values. So all characters in the range 0x01-0x7F are available to FOR /F (except for 0x0D because that can only be accessed via delayed expansion, which does no good for FOR /F).
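The ASCII claim is easy to spot check outside of cmd.exe. The sketch below assumes Python's codecs match the Windows code page tables; `ascii_range_matches` is a hypothetical helper name.

```python
def ascii_range_matches(codepage: str) -> bool:
    """True if bytes 0x00-0x7F decode to code points U+0000-U+007F."""
    raw = bytes(range(0x80))
    return [ord(c) for c in raw.decode(codepage)] == list(range(0x80))


# 0x01-0x7F minus 0x0D (<CR>) leaves the low-order characters usable by FOR /F.
usable_low = [b for b in range(0x01, 0x80) if b != 0x0D]

for name in ("cp437", "cp850", "latin_1"):
    print(name, ascii_range_matches(name))
print("usable low-order characters:", len(usable_low))
```

A code page that is not ASCII based, such as EBCDIC `cp037`, fails this check.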
But the mapping of high-order byte characters varies tremendously. Although nearly all those characters can be used individually, there are often gaps in the mapping, which limits their effectiveness.
Aacini developed a utility to discover contiguous ranges of high-order byte characters. However, it is a bit slow and tedious to use, and it does not show the relationship of threads to each other. But it was a critical contribution that helped lead to the theory as it stands now.
aGerman and penpen developed a fast utility to convert a given code page into a list of byte codes with the corresponding UTF-16 code point, sorted by code point. This could be very useful in establishing contiguous ranges, assuming the theory is correct. But the results still need to be verified against actual FOR /F behavior. I find the final line, which lists the characters in code point order, to be of little use because it is impossible to see where the gaps are.
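One way to make the gaps visible is to print a marker wherever code points are skipped. The following Python sketch is not the aGerman/penpen utility. It only illustrates the idea, and it assumes Python's `cp850` codec matches the Windows table.

```python
def sorted_by_code_point(codepage: str, first: int = 0x80, last: int = 0xFF):
    """Return (code_point, byte) pairs for the byte range, sorted by code point."""
    pairs = []
    for b in range(first, last + 1):
        pairs.append((ord(bytes([b]).decode(codepage)), b))
    return sorted(pairs)


def listing_with_gaps(codepage: str):
    """Yield 'BYTE-CODEPOINT' strings plus a marker line for each gap."""
    previous = None
    for cp, b in sorted_by_code_point(codepage):
        if previous is not None and cp - previous > 1:
            yield f"  ... gap of {cp - previous - 1}"
        yield f"{b:02X}-{cp:04X}"
        previous = cp


for line in listing_with_gaps("cp850"):
    print(line)
```

Each entry uses the byte first and the code point second, so a gap marker shows exactly how many token positions would be wasted between two neighbors.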
Here begins my contribution
First off, here is my computer info:
Code: Select all
--------------------------------------------------------------------------------
Windows version : Microsoft Windows [Version 10.0.14393]
Product name : Windows 10 Pro, 64 bit
Performance indicators : Processor Cores: 4 Visible RAM: 4192432 kilobytes
Date/Time format : (mm/dd/yy) Sun 03/12/2017 17:37:39.09
__APPDIR__ : C:\WINDOWS\system32\
ComSpec : C:\WINDOWS\system32\cmd.exe
PathExt : .COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC
Extensions : system: Enabled user: Enabled
Delayed expansion : system: Disabled user: Disabled
Locale name : en-US Code Pages: OEM 437 ANSI 1252
DIR format : 03/08/2017 12:46 AM 7,812,395,008 pagefile.sys
Permissions : Elevated Admin=No, Admin group=Yes
Missing from the tool collection: debug
I decided to write my own utilities to probe FOR /F behavior.
The first utility, probeFOR.bat, simply uses %1 to establish a base variable, represented as a hex value, and requests tokens 1-31*. It then attempts to echo every character from 0x01 to 0xFF (except 0x0D) as a FOR variable, in order to discover which characters map to which tokens. I use FINDSTR and SORT to get a sorted list of the characters that map. Note that token [32] actually represents the remainder of the line after token 31.
It is most useful when there are characters that map to all 32 tokens. But even when there are gaps, it is still useful in building non-contiguous threads that will be useful in testing the UTF code point theory.
This utility will only work with single byte code pages that interpret 0x00-0x7F as ASCII.
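For comparison with real probeFOR output, here is a Python sketch that predicts the same style of listing from the code point theory alone. It is a model, not the batch utility: it assumes Python's codecs match the Windows tables, it does not know which characters are illegal as a base (such as space), and it does not know about characters that FOR /F refuses for other reasons.

```python
def probe(base: int, codepage: str, tokens: int = 32):
    """Predict probeFOR-style lines for one base byte."""
    base_cp = ord(bytes([base]).decode(codepage))
    hits = []
    for b in range(0x01, 0x100):
        if b == 0x0D:  # <CR> cannot be used with FOR /F
            continue
        offset = ord(bytes([b]).decode(codepage)) - base_cp
        if 0 <= offset < tokens:  # token 32 stands for the "*" remainder
            hits.append((offset + 1, b))
    return [f'\\x{b:02X} = "[{n:02d}]"' for n, b in sorted(hits)]


for line in probe(0x80, "cp437"):
    print(line)
```

Any difference between this prediction and what probeFOR.bat actually reports would be evidence against the theory.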
The utility has lots of bytes that do not post well on the forum, so it is pointless to post the code. I've posted the zipped file instead:
See http://stackoverflow.com/a/8520993/1012053 for a table that shows which characters can be used as a base point.
Example usage (code page 437):
Code: Select all
C:\test>probeFOR 01
\x01 = "[01]"
\x02 = "[02]"
\x03 = "[03]"
\x04 = "[04]"
\x05 = "[05]"
\x06 = "[06]"
\x07 = "[07]"
\x08 = "[08]"
\x09 = "[09]"
\x0A = "[10]"
\x0B = "[11]"
\x0C = "[12]"
\x0E = "[14]"
\x0F = "[15]"
\x10 = "[16]"
\x11 = "[17]"
\x12 = "[18]"
\x13 = "[19]"
\x14 = "[20]"
\x15 = "[21]"
\x16 = "[22]"
\x17 = "[23]"
\x18 = "[24]"
\x19 = "[25]"
\x1A = "[26]"
\x1B = "[27]"
\x1C = "[28]"
\x1D = "[29]"
\x1E = "[30]"
\x1F = "[31]"
\x20 = "[32]"
Note that 0x0D <CR> (token 13) is missing, as expected.
Some characters, like space, cannot be used as a starting point:
Code: Select all
C:\test>probeFOR 20
% was unexpected at this time.
Other characters like < must be escaped as ^<, so both characters must be passed as hex:
Code: Select all
C:\test>probeFOR 5E3C
\x3C = "[01]"
\x3D = "[02]"
\x3E = "[03]"
\x3F = "[04]"
\x40 = "[05]"
\x41 = "[06]"
\x42 = "[07]"
\x43 = "[08]"
\x44 = "[09]"
\x45 = "[10]"
\x46 = "[11]"
\x47 = "[12]"
\x48 = "[13]"
\x49 = "[14]"
\x4A = "[15]"
\x4B = "[16]"
\x4C = "[17]"
\x4D = "[18]"
\x4E = "[19]"
\x4F = "[20]"
\x50 = "[21]"
\x51 = "[22]"
\x52 = "[23]"
\x53 = "[24]"
\x54 = "[25]"
\x55 = "[26]"
\x56 = "[27]"
\x57 = "[28]"
\x58 = "[29]"
\x59 = "[30]"
\x5A = "[31]"
\x5B = "[32]"
I have confirmed that all characters in the range 0x01-0x7F map contiguously (except, of course, for 0x0D).
But beginning with 0x80, there may be gaps. Here are the results for code pages 437 and 850:
Code: Select all
C:\test>chcp
Active code page: 437
C:\test>probeFOR 80
\x80 = "[01]"
\x90 = "[03]"
\xA5 = "[11]"
\x99 = "[16]"
\x9A = "[22]"
\xE1 = "[25]"
\x85 = "[26]"
\xA0 = "[27]"
\x83 = "[28]"
\x84 = "[30]"
\x86 = "[31]"
\x91 = "[32]"
C:\test>chcp 850
Active code page: 850
C:\test>probeFOR 80
\x80 = "[01]"
\xD4 = "[02]"
\x90 = "[03]"
\xD2 = "[04]"
\xD3 = "[05]"
\xDE = "[06]"
\xD6 = "[07]"
\xD7 = "[08]"
\xD8 = "[09]"
\xD1 = "[10]"
\xA5 = "[11]"
\xE3 = "[12]"
\xE0 = "[13]"
\xE2 = "[14]"
\xE5 = "[15]"
\x99 = "[16]"
\x9E = "[17]"
\x9D = "[18]"
\xEB = "[19]"
\xE9 = "[20]"
\xEA = "[21]"
\x9A = "[22]"
\xED = "[23]"
\xE8 = "[24]"
\xE1 = "[25]"
\x85 = "[26]"
\xA0 = "[27]"
\x83 = "[28]"
\xC6 = "[29]"
\x84 = "[30]"
\x86 = "[31]"
\x91 = "[32]"
Using nothing but probeFOR.bat, I was able to tediously build a complete map for code page 850. I also added the UTF-16 code points to show that the results are consistent with the theory.
Code: Select all
CHCP 850 FF-00A0 (non-breaking space) is inaccessible
Rel | T H R E A D S |
Pos | 1 | 2 | 3 | 4 | 5 | 6 |
----+---------+---------+---------+---------+---------+---------+
1 | 01-0001 | AD-00A1 | C4-2500 | 9F-0192 | D5-0131 | F2-2017 |
2 | 02-0002 | BD-00A2 | 2501 | * | * | * |
3 | 03-0003 | 9C-00A3 | B3-2502 | | | |
4 | 04-0004 | CF-00A4 | 2503 | | | |
5 | 05-0005 | BE-00A5 | 2504 | | | |
6 | 06-0006 | DD-00A6 | 2505 | | | |
7 | 07-0007 | F5-00A7 | 2506 | | | |
8 | 08-0008 | F9-00A8 | 2507 | | | |
9 | 09-0009 | B8-00A9 | 2508 | | | |
10 | 0A-000A | A6-00AA | 2509 | | | |
11 | 0B-000B | AE-00AB | 250A | | | |
12 | 0C-000C | AA-00AC | 250B | | | |
13 | 000D | F0-00AD | DA-250C | | | |
14 | 0E-000E | A9-00AE | 250D | | | |
15 | 0F-000F | EE-00AF | 250E | | | |
16 | 10-0010 | F8-00B0 | 250F | | | |
17 | 11-0011 | F1-00B1 | BF-2510 | | | |
18 | 12-0012 | FD-00B2 | 2511 | | | |
19 | 13-0013 | FC-00B3 | 2512 | | | |
20 | 14-0014 | EF-00B4 | 2513 | | | |
21 | 15-0015 | E6-00B5 | C0-2514 | | | |
22 | 16-0016 | F4-00B6 | 2515 | | | |
23 | 17-0017 | FA-00B7 | 2516 | | | |
24 | 18-0018 | F7-00B8 | 2517 | | | |
25 | 19-0019 | FB-00B9 | D9-2518 | | | |
26 | 1A-001A | A7-00BA | 2519 | | | |
27 | 1B-001B | AF-00BB | 251A | | | |
28 | 1C-001C | AC-00BC | 251B | | | |
29 | 1D-001D | AB-00BD | C3-251C | | | |
30 | 1E-001E | F3-00BE | 251D | | | |
31 | 1F-001F | A8-00BF | 251E | | | |
32 | 20-0020 | B7-00C0 | 251F | | | |
33 | 21-0021 | B5-00C1 | 2520 | | | |
34 | 22-0022 | B6-00C2 | 2521 | | | |
35 | 23-0023 | C7-00C3 | 2522 | | | |
36 | 24-0024 | 8E-00C4 | 2523 | | | |
37 | 25-0025 | 8F-00C5 | B4-2524 | | | |
38 | 26-0026 | 92-00C6 | 2525 | | | |
39 | 27-0027 | 80-00C7 | 2526 | | | |
40 | 28-0028 | D4-00C8 | 2527 | | | |
41 | 29-0029 | 90-00C9 | 2528 | | | |
42 | 2A-002A | D2-00CA | 2529 | | | |
43 | 2B-002B | D3-00CB | 252A | | | |
44 | 2C-002C | DE-00CC | 252B | | | |
45 | 2D-002D | D6-00CD | C2-252C | | | |
46 | 2E-002E | D7-00CE | 252D | | | |
47 | 2F-002F | D8-00CF | 252E | | | |
48 | 30-0030 | D1-00D0 | 252F | | | |
49 | 31-0031 | A5-00D1 | 2530 | | | |
50 | 32-0032 | E3-00D2 | 2531 | | | |
51 | 33-0033 | E0-00D3 | 2532 | | | |
52 | 34-0034 | E2-00D4 | 2533 | | | |
53 | 35-0035 | E5-00D5 | C1-2534 | | | |
54 | 36-0036 | 99-00D6 | 2535 | | | |
55 | 37-0037 | 9E-00D7 | 2536 | | | |
56 | 38-0038 | 9D-00D8 | 2537 | | | |
57 | 39-0039 | EB-00D9 | 2538 | | | |
58 | 3A-003A | E9-00DA | 2539 | | | |
59 | 3B-003B | EA-00DB | 253A | | | |
60 | 3C-003C | 9A-00DC | 253B | | | |
61 | 3D-003D | ED-00DD | C5-253C | | | |
62 | 3E-003E | E8-00DE | 253D | | | |
63 | 3F-003F | E1-00DF | 253E | | | |
64 | 40-0040 | 85-00E0 | 253F | | | |
65 | 41-0041 | A0-00E1 | 2540 | | | |
66 | 42-0042 | 83-00E2 | 2541 | | | |
67 | 43-0043 | C6-00E3 | 2542 | | | |
68 | 44-0044 | 84-00E4 | 2543 | | | |
69 | 45-0045 | 86-00E5 | 2544 | | | |
70 | 46-0046 | 91-00E6 | 2545 | | | |
71 | 47-0047 | 87-00E7 | 2546 | | | |
72 | 48-0048 | 8A-00E8 | 2547 | | | |
73 | 49-0049 | 82-00E9 | 2548 | | | |
74 | 4A-004A | 88-00EA | 2549 | | | |
75 | 4B-004B | 89-00EB | 254A | | | |
76 | 4C-004C | 8D-00EC | 254B | | | |
77 | 4D-004D | A1-00ED | 254C | | | |
78 | 4E-004E | 8C-00EE | 254D | | | |
79 | 4F-004F | 8B-00EF | 254E | | | |
80 | 50-0050 | D0-00F0 | 254F | | | |
81 | 51-0051 | A4-00F1 | CD-2550 | | | |
82 | 52-0052 | 95-00F2 | BA-2551 | | | |
83 | 53-0053 | A2-00F3 | 2552 | | | |
84 | 54-0054 | 93-00F4 | 2553 | | | |
85 | 55-0055 | E4-00F5 | C9-2554 | | | |
86 | 56-0056 | 94-00F6 | 2555 | | | |
87 | 57-0057 | F6-00F7 | 2556 | | | |
88 | 58-0058 | 9B-00F8 | BB-2557 | | | |
89 | 59-0059 | 97-00F9 | 2558 | | | |
90 | 5A-005A | A3-00FA | 2559 | | | |
91 | 5B-005B | 96-00FB | C8-255A | | | |
92 | 5C-005C | 81-00FC | 255B | | | |
93 | 5D-005D | EC-00FD | 255C | | | |
94 | 5E-005E | E7-00FE | BC-255D | | | |
95 | 5F-005F | 98-00FF | 255E | | | |
96 | 60-0060 | * | 255F | | | |
97 | 61-0061 | | CC-2560 | | | |
98 | 62-0062 | | 2561 | | | |
99 | 63-0063 | | 2562 | | | |
100 | 64-0064 | | B9-2563 | | | |
101 | 65-0065 | | 2564 | | | |
102 | 66-0066 | | 2565 | | | |
103 | 67-0067 | | CB-2566 | | | |
104 | 68-0068 | | 2567 | | | |
105 | 69-0069 | | 2568 | | | |
106 | 6A-006A | | CA-2569 | | | |
107 | 6B-006B | | 256A | | | |
108 | 6C-006C | | 256B | | | |
109 | 6D-006D | | CE-256C | | | |
110 | 6E-006E | | 256D | | | |
111 | 6F-006F | | 256E | | | |
112 | 70-0070 | | 256F | | | |
113 | 71-0071 | | 2570 | | | |
114 | 72-0072 | | 2571 | | | |
115 | 73-0073 | | 2572 | | | |
116 | 74-0074 | | 2573 | | | |
117 | 75-0075 | | 2574 | | | |
118 | 76-0076 | | 2575 | | | |
119 | 77-0077 | | 2576 | | | |
120 | 78-0078 | | 2577 | | | |
121 | 79-0079 | | 2578 | | | |
122 | 7A-007A | | 2579 | | | |
123 | 7B-007B | | 257A | | | |
124 | 7C-007C | | 257B | | | |
125 | 7D-007D | | 257C | | | |
126 | 7E-007E | | 257D | | | |
127 | 7F-007F | | 257E | | | |
128 | * | | 257F | | | |
129 | | | DF-2580 | | | |
130 | | | 2581 | | | |
131 | | | 2582 | | | |
132 | | | 2583 | | | |
133 | | | DC-2584 | | | |
134 | | | 2585 | | | |
135 | | | 2586 | | | |
136 | | | 2587 | | | |
137 | | | DB-2588 | | | |
138 | | | 2589 | | | |
139 | | | 258A | | | |
140 | | | 258B | | | |
141 | | | 258C | | | |
142 | | | 258D | | | |
143 | | | 258E | | | |
144 | | | 258F | | | |
145 | | | 2590 | | | |
146 | | | B0-2591 | | | |
147 | | | B1-2592 | | | |
148 | | | B2-2593 | | | |
149 | | | 2594 | | | |
150 | | | 2595 | | | |
151 | | | 2596 | | | |
152 | | | 2597 | | | |
153 | | | 2598 | | | |
154 | | | 2599 | | | |
155 | | | 259A | | | |
156 | | | 259B | | | |
157 | | | 259C | | | |
158 | | | 259D | | | |
159 | | | 259E | | | |
160 | | | 259F | | | |
161 | | | FE-25A0 | | | |
| | | * | | | |
The next step was to write an efficient utility to map all of the high-order bytes in one step. compileFOR.bat is dependent on probeFOR.bat. It writes diagnostic lines to stderr to show what steps are taken to compile the list. It then writes the final map to stdout, where it can be conveniently captured by redirection, if so desired.
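If the code point theory holds, the threads can also be predicted without running FOR /F at all. The Python sketch below is not a port of compileFOR.bat. It sorts the high-order bytes by code point and starts a new thread whenever the next code point is more than 31 positions away, on the assumption that one probe with tokens=1-31* can reach offsets 0 through 31. Treating 0xFF as inaccessible is taken from the observed 437 and 850 results, not derived from the theory. Threads come out in code point order, which differs from the discovery order used by the batch utility.

```python
def build_threads(codepage: str, inaccessible=(0xFF,), max_reach: int = 31):
    """Group high-order bytes into threads of mutually reachable code points."""
    pairs = sorted(
        (ord(bytes([b]).decode(codepage)), b)
        for b in range(0x80, 0x100)
        if b not in inaccessible
    )
    threads = []
    for cp, b in pairs:
        # Extend the current thread if the previous code point can reach this one.
        if threads and cp - threads[-1][-1][0] <= max_reach:
            threads[-1].append((cp, b))
        else:
            threads.append([(cp, b)])
    return threads


for thread in build_threads("cp437"):
    first_cp, first_byte = thread[0]
    last_cp, last_byte = thread[-1]
    print(f"{first_byte:02X}-{first_cp:04X} .. {last_byte:02X}-{last_cp:04X}"
          f"  ({last_cp - first_cp + 1} positions)")
```

The position counts give a quick way to compare the prediction with the length of each thread in the compiled map.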
compileFOR.bat
Code: Select all
@echo off
setlocal enableDelayedExpansion
:: Clear $ variables
for /f "delims==" %%A in ('set $ 2^>nul') do set "%%A="
:: Build list of high order bytes x80 - xFF
for %%A in (8 9 A B C D E F) do for %%B in (0 1 2 3 4 5 6 7 8 9 A B C D E F) do set "$x%%A%%B=1"
set /a "minThread=curThread=101, maxThread=100"
:top
set "skip="
set "prev=1000"
for /f "delims=x= tokens=2" %%A in ('set $x 2^>nul') do (
>&2 echo START %%A
set /a "$T%curThread%.max=1000"
set "char=%%A"
call :buildThread && goto :top
)
:: Print Results
echo(
chcp
set "thread="
for /f "delims=$T.= tokens=1-3" %%A in ('set $T ^| findstr /lv ".max"') do (
if "%%A" neq "!thread!" (
set "thread=%%A"
echo(
echo Thread !thread:~-2!:
)
set "token=%%B"
echo !token:~-3!=%%C
)
for /f "delims=$x=" %%A in ('set $x 2^>nul') do (
if defined thread (
echo(
echo Inaccessible:
set "thread="
)
echo %%A
)
exit /b
:buildThread
>&2 echo :buildThread thread=%curThread% char=%char%
(
for /f "%skip% tokens=1,3 delims=\x=[] " %%A in ('probeFOR %char% 2^>nul') do (
set /a "beg=$T%curThread%.max+1, end=beg+10%%B-prev-2, $T%curThread%.max=end+1, prev=10%%B"
for /l %%N in (!beg! 1 !end!) do set "$T%curThread%.%%N= "
set "$T%curThread%.!$T%curThread%.max!=%%A"
set "$x%%A="
set "char=%%A"
for /l %%N in (!minThread! 1 !maxThread!) do if !$T%%N.1001! == %%A (
set /a "merge=%%N"
goto :mergeThread
)
)
)
if %prev% gtr 1001 (
set "skip=skip=1"
set /a prev=1001
goto :buildThread
)
if not defined $T%curThread%.1001 exit /b 1
set /a "maxThread=curThread, curThread=maxThread+1"
exit /b 0
:mergeThread
>&2 echo :mergeThread %curThread% %merge%
set /a "oldMax=$T%merge%.max, n=$T%merge%.max+=($T%curThread%.max-1001)"
for /l %%N in (!oldMax! -1 1001) do (
set "$T%merge%.!n!=!$T%merge%.%%N!"
set /a n-=1
)
for /l %%N in (1001 1 !$T%curThread%.max!) do set "$T%merge%.%%N=!$T%curThread%.%%N!"
for /f "delims==" %%A in ('set $T%curThread%.') do set "%%A="
exit /b 0
pause
And here is the result for code page 437:
Code: Select all
C:\test>compileFOR
START 80
:buildThread thread=101 char=80
:buildThread thread=101 char=91
:buildThread thread=101 char=98
START 8E
:buildThread thread=102 char=8E
:mergeThread 102 101
START 9B
:buildThread thread=102 char=9B
:buildThread thread=102 char=A8
:mergeThread 102 101
START 9E
:buildThread thread=102 char=9E
START 9F
:buildThread thread=103 char=9F
START A9
:buildThread thread=104 char=A9
:buildThread thread=104 char=F5
START AD
:buildThread thread=105 char=AD
:mergeThread 105 101
START B0
:buildThread thread=105 char=B0
:buildThread thread=105 char=FE
START B3
:buildThread thread=106 char=B3
:buildThread thread=106 char=C3
:buildThread thread=106 char=C1
:buildThread thread=106 char=D6
:buildThread thread=106 char=CE
:buildThread thread=106 char=DB
:mergeThread 106 105
START C4
:buildThread thread=106 char=C4
:mergeThread 106 105
START E0
:buildThread thread=106 char=E0
:buildThread thread=106 char=ED
START E2
:buildThread thread=107 char=E2
:mergeThread 107 106
START EC
:buildThread thread=107 char=EC
:buildThread thread=107 char=EF
:buildThread thread=107 char=F7
:buildThread thread=107 char=F2
START F9
:buildThread thread=108 char=F9
:mergeThread 108 107
START FC
:buildThread thread=108 char=FC
START FF
:buildThread thread=109 char=FF
Active code page: 437
Thread 01:
001=AD
002=9B
003=9C
004=
005=9D
006=
007=
008=
009=
010=A6
011=AE
012=AA
013=
014=
015=
016=F8
017=F1
018=FD
019=
020=
021=E6
022=
023=FA
024=
025=
026=A7
027=AF
028=AC
029=AB
030=
031=A8
032=
033=
034=
035=
036=8E
037=8F
038=92
039=80
040=
041=90
042=
043=
044=
045=
046=
047=
048=
049=A5
050=
051=
052=
053=
054=99
055=
056=
057=
058=
059=
060=9A
061=
062=
063=E1
064=85
065=A0
066=83
067=
068=84
069=86
070=91
071=87
072=8A
073=82
074=88
075=89
076=8D
077=A1
078=8C
079=8B
080=
081=A4
082=95
083=A2
084=93
085=
086=94
087=F6
088=
089=97
090=A3
091=96
092=81
093=
094=
095=98
Thread 02:
001=9E
Thread 03:
001=9F
Thread 04:
001=A9
002=
003=
004=
005=
006=
007=
008=
009=
010=
011=
012=
013=
014=
015=
016=
017=F4
018=F5
Thread 05:
001=C4
002=
003=B3
004=
005=
006=
007=
008=
009=
010=
011=
012=
013=DA
014=
015=
016=
017=BF
018=
019=
020=
021=C0
022=
023=
024=
025=D9
026=
027=
028=
029=C3
030=
031=
032=
033=
034=
035=
036=
037=B4
038=
039=
040=
041=
042=
043=
044=
045=C2
046=
047=
048=
049=
050=
051=
052=
053=C1
054=
055=
056=
057=
058=
059=
060=
061=C5
062=
063=
064=
065=
066=
067=
068=
069=
070=
071=
072=
073=
074=
075=
076=
077=
078=
079=
080=
081=CD
082=BA
083=D5
084=D6
085=C9
086=B8
087=B7
088=BB
089=D4
090=D3
091=C8
092=BE
093=BD
094=BC
095=C6
096=C7
097=CC
098=B5
099=B6
100=B9
101=D1
102=D2
103=CB
104=CF
105=D0
106=CA
107=D8
108=D7
109=CE
110=
111=
112=
113=
114=
115=
116=
117=
118=
119=
120=
121=
122=
123=
124=
125=
126=
127=
128=
129=DF
130=
131=
132=
133=DC
134=
135=
136=
137=DB
138=
139=
140=
141=DD
142=
143=
144=
145=DE
146=B0
147=B1
148=B2
149=
150=
151=
152=
153=
154=
155=
156=
157=
158=
159=
160=
161=FE
Thread 06:
001=E2
002=
003=
004=
005=
006=E9
007=
008=
009=
010=
011=
012=
013=
014=
015=
016=
017=E4
018=
019=
020=E8
021=
022=
023=EA
024=
025=
026=
027=
028=
029=
030=
031=E0
032=
033=
034=EB
035=EE
036=
037=
038=
039=
040=
041=
042=
043=
044=
045=
046=E3
047=
048=
049=E5
050=E7
051=
052=ED
Thread 07:
001=F9
002=FB
003=
004=
005=
006=EC
007=
008=
009=
010=
011=
012=
013=
014=
015=
016=
017=EF
018=
019=
020=
021=
022=
023=
024=
025=
026=
027=
028=
029=
030=
031=
032=
033=
034=
035=
036=
037=
038=
039=
040=
041=
042=
043=
044=
045=
046=
047=
048=F7
049=
050=
051=
052=
053=
054=
055=
056=
057=
058=
059=
060=
061=
062=
063=
064=
065=
066=
067=
068=
069=
070=
071=
072=
073=F0
074=
075=
076=F3
077=F2
Thread 08:
001=FC
Inaccessible:
FF
I've done some minimal spot checking, but I have not fully added the UTF-16 code points to help verify the theory.
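Part of that verification can be automated. The Python sketch below takes the non-blank entries of threads 04 and 06 from the code page 437 listing above and checks each position against the code point offset from the first entry of the thread. It assumes Python's `cp437` codec matches the Windows table, and `mismatches` is a hypothetical helper name.

```python
# Non-blank entries copied from the code page 437 listing (position: byte).
THREAD_04 = {1: 0xA9, 17: 0xF4, 18: 0xF5}
THREAD_06 = {1: 0xE2, 6: 0xE9, 17: 0xE4, 20: 0xE8, 23: 0xEA, 31: 0xE0,
             34: 0xEB, 35: 0xEE, 46: 0xE3, 49: 0xE5, 50: 0xE7, 52: 0xED}


def mismatches(thread: dict, codepage: str = "cp437"):
    """Return the positions whose code point offset disagrees with the listing."""
    def code_point(b: int) -> int:
        return ord(bytes([b]).decode(codepage))

    base = code_point(thread[1])
    return [pos for pos, b in thread.items() if code_point(b) - base + 1 != pos]


for name, thread in (("04", THREAD_04), ("06", THREAD_06)):
    print(f"Thread {name}: mismatched positions = {mismatches(thread)}")
```

An empty list means every listed position agrees with the theory for that thread.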
One last interesting tidbit
Code page 28591 (ISO/IEC 8859-1) could be really useful with FOR /F. It has all the characters for many western European languages, and is nearly complete for many more. But more importantly, all characters from 0x01-0xFF are mapped contiguously by FOR /F. Even 0xFF is accessible. Only 0x0D cannot be used. So with careful construction, it should be possible to access up to 254 tokens simultaneously when using code page 28591.
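The contiguity is what the code point theory predicts, because ISO/IEC 8859-1 maps every byte to the code point with the same value. The Python sketch below uses the `latin_1` codec as a stand-in for Windows code page 28591 (an assumption) and only checks the mapping. It says nothing about how FOR /F treats any individual character.

```python
def is_contiguous(codepage: str, first: int = 0x01, last: int = 0xFF) -> bool:
    """True if consecutive bytes decode to consecutive code points."""
    raw = bytes(range(first, last + 1))
    cps = [ord(c) for c in raw.decode(codepage)]
    return all(b - a == 1 for a, b in zip(cps, cps[1:]))


# 0x01-0xFF is 255 characters; removing 0x0D (<CR>) leaves the usable ones.
usable = [b for b in range(0x01, 0x100) if b != 0x0D]

print("latin_1 contiguous:", is_contiguous("latin_1"))
print("cp437 contiguous:  ", is_contiguous("cp437"))
print("usable characters: ", len(usable))
```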
Dave Benham