By a lucky coincidence, I’ve been working for the past couple of months on several tools that are very relevant to this discussion!
1) I've updated my Microsoft C library eXtensions Library (MsvcLibX),
so that all text written to stdout or stderr, and that goes to the console, is transparently written as UTF-16 Unicode.
This ensures that the expected characters are displayed correctly, whatever the current code page is,
and even if they're not part of that code page.
If stdout or stderr is redirected to a pipe or a file, then the output is converted to the current code page.
This is consistent with what cmd.exe itself does. For example:
displays Unicode file names, even with characters not in the current code page.
converts Unicode file names into the current code page, changing missing characters to '?'.
Give it a try after extracting this Non-ASCII.zip file.
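
To illustrate the rule described above (UTF-16 to the console, current code page with '?' substitution when redirected), here is a minimal Python sketch. It is not the MsvcLibX code, which is written in C on top of the Windows API; the function name encode_for_output and the choice of cp437 as the "current" code page are assumptions made for the example.

```python
def encode_for_output(text: str, to_console: bool, code_page: str = "cp437") -> bytes:
    """Pick the byte encoding for text depending on where the output goes.

    Hypothetical helper for illustration only; code_page stands in for
    whatever the current console code page happens to be.
    """
    if to_console:
        # Console: keep every character by emitting UTF-16 (little-endian, no BOM).
        return text.encode("utf-16-le")
    # Pipe or file: convert to the code page; characters missing from it become '?'.
    return text.encode(code_page, errors="replace")


name = "Привет.txt"
print(encode_for_output(name, to_console=False))  # b'??????.txt'
print(encode_for_output("é", to_console=False))   # b'\x82' (é exists in cp437)
```

The Cyrillic letters have no equivalent in code page 437, so they are the ones replaced by '?', while the ASCII part of the name survives unchanged.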
2) I've written a new tool, called codepage.exe, that gives information
about the available code pages in your console.
- Without argument, it lists the current console and system code pages.
- Option -i lists all installed code pages that you can use. (Similar to the list that aGerman posted above)
- Option -s lists all supported code pages that you can install. (Same results as -i on Windows 7/8/10, but different in XP)
- Giving it a code page number displays a table of characters for that code page.
(And yes, thanks to the MsvcLibX update, they will be visible correctly whatever your current code page is.)
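
For readers who want to see what such a character table contains without the tool at hand, the following Python sketch builds a comparable table for the upper half (0x80 to 0xFF) of a single-byte code page. It is only an approximation of the idea, not how codepage.exe works: the function name code_page_table is hypothetical, and it assumes Python ships a codec named "cp" plus the code page number, which holds for common pages such as 437, 850 or 1252 but not for every Windows code page.

```python
def code_page_table(code_page: int) -> list:
    """Return one text row per 16 characters for bytes 0x80..0xFF.

    Assumes a single-byte code page for which Python has a 'cpNNN' codec.
    """
    codec = f"cp{code_page}"
    rows = []
    for high in range(0x8, 0x10):
        cells = []
        for low in range(16):
            byte = bytes([high * 16 + low])
            # Undefined bytes decode to U+FFFD instead of raising an error.
            char = byte.decode(codec, errors="replace")
            # Blank out control and non-printable characters to keep columns aligned.
            cells.append(char if char.isprintable() else " ")
        rows.append(f"{high:X}0  " + " ".join(cells))
    return rows


for row in code_page_table(437):
    print(row)
```

Whether the printed rows display correctly still depends on the terminal, which is exactly the problem the MsvcLibX console output change addresses on Windows.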
3) I've updated my conv.exe code page conversion tool with the following features:
- It now checks the input file BOM, and automatically selects the right input encoding (ANSI/UTF7/UTF8/UTF16).
- Again, thanks to the MsvcLibX update, by default it outputs Unicode to the console.
This allows typing a non-ASCII text file to the console without having to know its encoding: just run conv.exe on it.
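
The BOM check mentioned in the first point can be sketched in a few lines of Python. This is only an illustration of the general technique, not the conv.exe source: the names sniff_encoding and decode_text are hypothetical, and cp1252 is an assumed stand-in for the "ANSI" fallback, which on a real Windows system depends on the system locale.

```python
import codecs

# Known byte order marks and the encoding each one announces.
# The UTF-7 BOM is "+/v" followed by one of several characters,
# so only its fixed three-byte prefix is tested here.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (b"+/v", "utf-7"),
]


def sniff_encoding(data: bytes, default: str = "cp1252") -> str:
    """Return the encoding announced by the BOM, or the assumed ANSI default."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding
    return default


def decode_text(data: bytes, default: str = "cp1252") -> str:
    """Decode data using the sniffed encoding and drop the BOM character."""
    text = data.decode(sniff_encoding(data, default), errors="replace")
    return text[1:] if text.startswith("\ufeff") else text


sample = codecs.BOM_UTF16_LE + "Привет".encode("utf-16-le")
print(sniff_encoding(sample))  # utf-16-le
print(decode_text(sample))     # Привет
```

A file without any BOM falls through to the default, so this approach cannot distinguish BOM-less UTF-8 from ANSI text.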
4) I've rebuilt my whole toolbox with the new library.
So for example the dirc.exe directory comparison tool
will display Russian or Hebrew file names correctly, even when the console uses the US OEM code page.
5) I've used MsvcLibX to build a Windows version of The Silver Searcher (ag.exe)
with the same multilingual capability.
This tool is a fast and powerful text search tool, originally built for Linux.
For information about this tool, see the ag home page or that old post.
The sources of my Windows port are there.
If you find bugs (which is likely, as all this is very new!), please report them, preferably through the GitHub page of the project concerned.
Any feedback is welcome.
Enjoy!
Jean-François