UTF-8 to ANSI /file

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

itsarnie
Posts: 1
Joined: 29 Oct 2012 09:56

UTF-8 to ANSI /file

#1 Post by itsarnie » 29 Oct 2012 10:04

Hi,

I have around 2000 files in UTF-8 mode. I have to save all of them in ANSI mode.

Please let me know if I can do it in a batch, at one go.

Regards,
Arnie.

Boombox
Posts: 80
Joined: 18 Oct 2012 05:51

Re: UTF-8 to ANSI /file

#2 Post by Boombox » 29 Oct 2012 10:44

I think you can just use:

Code:

type inputfilename > outputfilename.txt


This would do one of them anyhow!


We would need a FOR loop to do them all...

Can we see the contents of one of these UTF-8 files please? Or a snippet at least?
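The FOR loop mentioned above could look roughly like the sketch below. It only wraps the single-file TYPE command in a loop and does not by itself settle whether TYPE changes the encoding. The `*.txt` mask and the `out` folder name are assumptions, not something the original poster specified.

```batch
@echo off
rem Sketch only: runs the single-file TYPE command over every .txt file
rem in the current folder. The *.txt mask and the "out" folder are assumptions.
if not exist out md out
rem %%~nxF expands to the file name plus extension of the current file.
for %%F in (*.txt) do type "%%F" > "out\%%~nxF"
```

Writing the results into a separate folder keeps the loop from picking up its own output files.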

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: UTF-8 to ANSI /file

#3 Post by foxidrive » 29 Oct 2012 10:49

Some interesting comments here. If you apply them to batch then the answer would seem to be: you need a dedicated command line tool to convert them.

http://stackoverflow.com/questions/8298 ... -ansi-in-c

Boombox
Posts: 80
Joined: 18 Oct 2012 05:51

Re: UTF-8 to ANSI /file

#4 Post by Boombox » 29 Oct 2012 12:00

Code:

cmd.exe /a /c TYPE c:\Utf8.txt > c:\Ansi.txt


The /a switch forces ANSI... Doesn't it?


But if Foxi says otherwise....

Liviu
Expert
Posts: 470
Joined: 13 Jan 2012 21:24

Re: UTF-8 to ANSI /file

#5 Post by Liviu » 29 Oct 2012 12:53

itsarnie wrote:I have around 2000 files in UTF-8 mode. I have to save all of them in ANSI mode.

Converting UTF-8 encoded text to any one codepage such as ANSI or OEM is "lossy" - characters not present in the target codepage will be either remapped (sometimes in surprising ways), or lost for good.

You can (losslessly) convert UTF-8 to UTF-16, see for example http://www.dostips.com/forum/viewtopic.php?p=16399#p16399.

You can also convert UTF-8 to the default ANSI codepage, as long as you don't mind the loss of information, using a variation of http://www.dostips.com/forum/viewtopic.php?p=16399#p16399. EDIT - see P.S. below for the corrected code.

Code:

chcp 65001 >nul & cmd /a /c type utf8.txt >ansi.txt

Liviu

EDIT - P.S. On a second look, I don't think it can be done (reliably) with just a one-liner. The following, however, should always work. Note that it assumes the ANSI codepage is 1252; change that as necessary. The code uses "utf8to16.cmd", which would be the snippet in my previously linked post.

Code:

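rem Convert UTF-8 to UTF-16 with the helper from the linked post
call utf8to16 utf8.txt utf16.txt
rem Switch to the target ANSI codepage (1252 assumed; change as necessary)
chcp 1252 >nul
rem Write the text out in the active codepage, then remove the temporary file
type utf16.txt >ansi.txt
del utf16.txt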
Last edited by Liviu on 29 Oct 2012 21:57, edited 1 time in total.
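For the roughly 2000 files in the original question, the steps from the P.S. could be wrapped in a loop along these lines. This is a sketch, not tested against the actual files. It assumes utf8to16.cmd (the helper from the linked post) is reachable from the current folder or the PATH and accepts quoted file names. The `*.txt` mask, the `ansi` output folder, the temporary file name and codepage 1252 are placeholders.

```batch
@echo off
setlocal
rem Sketch only. Assumptions: utf8to16.cmd is the helper from the linked post
rem and accepts quoted file names; the *.txt mask, the "ansi" folder and
rem codepage 1252 are placeholders to adjust.
if not exist ansi md ansi
for %%F in (*.txt) do (
  call utf8to16 "%%F" "%temp%\utf16.tmp"
  chcp 1252 >nul
  type "%temp%\utf16.tmp" > "ansi\%%~nxF"
)
rem Clean up the temporary UTF-16 file if the loop created one.
if exist "%temp%\utf16.tmp" del "%temp%\utf16.tmp"
```

The order inside the loop follows the P.S. (convert first, then switch codepage, then TYPE). The console is left on codepage 1252 afterwards, and the conversion stays lossy for characters that 1252 cannot represent.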

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: UTF-8 to ANSI /file

#6 Post by foxidrive » 29 Oct 2012 15:30

Boombox wrote:But if Foxi says otherwise....


I know very little about codepages and Unicode. I assumed there might be some tool that is more thorough.

Liviu's post agrees in essence with the Stack Overflow discussion: to paraphrase, the conversion is not possible without loss of information or surprising results.
