I've been working on a piece of software too, Ed, for almost two years on and off.
It's only a little over 1000 lines of code, much of it whitespace and comments, but it's highly complex, optimized and modular.
You do need a regional copy of Windows to see how the output of DOS tools differs between languages; changing the locale won't help. However, you can retrieve most of the information you need using VBScript and WMI together. The active codepage reported by chcp is one exception, but even there you can still extract the necessary data by filtering on the structure of the output. For example:
chcp on Japanese Windows returns a string with this type of format (romanized):
Whereas English Windows returns:
We can't be sure that every language separates using the colon character :, but the space character before the codepage number is likely hard-coded, so with for /f it becomes a matter of filtering for the last token (whatever its position) using spaces as the delimiter.
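A minimal sketch of that last-token filter, assuming the codepage number really is the final space-separated word of the line. The variable name cp is just for illustration, and the trailing-period cleanup is an assumption on my part, since some localized builds are reported to end the line with a period:

```bat
@echo off
setlocal
rem Sketch only: grab the last space-separated word of the chcp line.
set "cp="
rem The outer loop captures the whole line; the inner loop walks its words,
rem so the final assignment leaves cp holding the last token.
for /f "delims=" %%A in ('chcp') do for %%B in (%%A) do set "cp=%%B"
rem Assumption: drop a trailing period if the localized line ends with one.
if defined cp if "%cp:~-1%"=="." set "cp=%cp:~0,-1%"
echo Active codepage: %cp%
```

The nested plain for is used because for /f has no "last token" option; you would otherwise have to know how many words the localized message contains.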
PS: I started writing this well before Liviu posted:
Liviu wrote: But "natively localized" installations may have different directory structures and other subtle deviations, too.
Yes, so try to retrieve locale-dependent information from a reliable source such as WMI or the registry.
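As a rough example of the registry route, this reads the system's default OEM codepage instead of parsing localized text. Treat it as a sketch: the variable name oemcp is mine, and the value is the system default, which is not necessarily what chcp reports if the codepage was changed in the current console:

```bat
@echo off
setlocal
rem Sketch only: read the system default OEM codepage from the registry.
rem This is the configured default, not whatever chcp was last set to.
set "oemcp="
rem The value line looks like "OEMCP  REG_SZ  <number>", so token 3 is the number.
for /f "tokens=3" %%A in ('reg query "HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage" /v OEMCP 2^>nul ^| find "REG_SZ"') do set "oemcp=%%A"
if defined oemcp (echo Default OEM codepage: %oemcp%) else (echo OEMCP value not found)
```

The value name and type (OEMCP, REG_SZ) are not translated, so the filter does not depend on the display language.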
As for DOS's Unicode support, it is partial and comes from Windows; the details are technical and I think I've explained them elsewhere. Suffice to say, Unicode can be passed in from Windows (the shell, for example), and different codepages will not change the operation of your batch script*. Batch can write UTF-16LE with cmd /u, but it's not possible to properly read Unicode from files (or at least, to process the output using for /f to store data into a variable).
*A codepage is just the mapping from byte values 0 to 255 to the characters displayed for them (only 0 to 127 are ASCII proper). Under another codepage your code could look like gobbledygook, and the computer would still read it the same.
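To see the cmd /u behaviour for yourself, here is a small sketch that writes the same text once with /a and once with /u and compares the file sizes. The file names are made up for the demo and are created in the current directory:

```bat
@echo off
setlocal
rem Sketch only: write the same text as single-byte output and as UTF-16LE.
cmd /a /c "echo Hello>demo_ansi.txt"
cmd /u /c "echo Hello>demo_utf16.txt"
rem The ~z modifier expands to the file size in bytes.
for %%F in (demo_ansi.txt demo_utf16.txt) do echo %%~nxF: %%~zF bytes
del demo_ansi.txt demo_utf16.txt
```

The /u file should come out at twice the size of the /a one, since every character, including the line ending, takes two bytes.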
I'd like to think of myself as a leading expert in this area, so ask away.