UTF-8 encoding while replacing string [SOLVED]

AlphaInc.
Posts: 21
Joined: 15 Apr 2021 08:15

UTF-8 encoding while replacing string [SOLVED]

#1 Post by AlphaInc. » 05 Aug 2021 12:10

Hello everybody,

I recently set up a MediaInfo script that outputs information about the media files inside a specific folder.
After every run I need to replace some strings in the generated output. For that I use the following command:

powershell -Command "(gc myFile.txt) -replace 'foo', 'bar' | Out-File -encoding ASCII myFile.txt"
The problem is that the file contains an umlaut that gets mangled in the process (ö turns into ??). I also tried replacing the ASCII encoding with utf8, with no success. Is there any way to keep the umlaut intact in the written text file, OR is there another string replacement method I could use?
Last edited by AlphaInc. on 06 Aug 2021 02:29, edited 1 time in total.

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: UTF-8 encoding while replacing string

#2 Post by aGerman » 05 Aug 2021 13:51

The important question is: what is the encoding of the input? If you know it, you have to specify it along with gc (Get-Content), using its -Encoding parameter.
FWIW, ASCII obviously makes things worse, because only 7-bit ASCII characters are supported in that case.
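
To illustrate how a single ö can end up as two question marks, here is a minimal Python sketch. It assumes the input file is UTF-8 without a BOM and that it gets read using a Windows ANSI code page. cp1252 is only an assumed example here; the actual code page depends on the system.

```python
# Assumption: the file on disk is UTF-8 and contains an umlaut.
text = "schön"
utf8_bytes = text.encode("utf-8")   # ö is stored as two bytes: 0xC3 0xB6

# Assumption: the reader decodes those bytes with an ANSI code page
# (cp1252 here) instead of UTF-8, so each byte becomes one character.
misread = utf8_bytes.decode("cp1252")

# Writing as ASCII replaces every non-ASCII character with '?'.
ascii_out = misread.encode("ascii", errors="replace")

print(len(misread) - len(text))  # one extra character after misreading
print(ascii_out)                 # b'sch??n'
```

Under these assumptions the damage already happens while reading, which would also explain why switching only the output encoding to utf8 does not help.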

Steffen

Re: UTF-8 encoding while replacing string

#3 Post by AlphaInc. » 05 Aug 2021 14:14

I don't know. I created the text file by running MediaInfo with a template and redirecting its output to a text file (Mediainfo -Inform=file://template.txt Video.mkv >> Output.txt).

Re: UTF-8 encoding while replacing string

#4 Post by aGerman » 05 Aug 2021 14:45

I don't know anything about MediaInfo.
Try to run the tool I uploaded in Jean-François' thread. Maybe it's able to tell you the encoding.
viewtopic.php?p=64494#p64494
If you don't succeed, put the text file in a zip archive and upload it here. I'll probably find it out for you in no time.
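
As a rough stand-in for a dedicated detection tool (this is not the tool linked above), a quick check for "is this file valid UTF-8?" could be sketched in Python as follows. The function name guess_encoding and its return labels are hypothetical, and the result is only a heuristic: bytes that are not valid UTF-8 may belong to any ANSI code page.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Heuristic guess based on BOMs and a strict UTF-8 decode."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("utf-8")  # strict: raises on invalid byte sequences
    except UnicodeDecodeError:
        return "unknown"      # not UTF-8, possibly an ANSI code page
    # Pure 7-bit content is valid in ASCII, UTF-8 and most ANSI code pages.
    return "ascii" if all(b < 0x80 for b in data) else "utf-8"


print(guess_encoding("schön".encode("utf-8")))
print(guess_encoding("schön".encode("cp1252")))
```

To apply it to a real file, read the file in binary mode (open(path, "rb")) and pass the bytes to the function.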

Steffen

Re: UTF-8 encoding while replacing string [SOLVED]

#5 Post by AlphaInc. » 06 Aug 2021 02:28

I'll try it, thank you.
In the meantime (for anyone who runs into something similar), I found a workaround: I replaced the PowerShell command with a Python script that replaces the string without my having to deal with the encoding of the file. Maybe not ideal, but it's good enough for my use case.
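
The script itself was not posted, so the following is only a sketch of what such a replacement could look like. The function name replace_in_file is hypothetical, and the file is assumed to be UTF-8. Unlike the workaround described above, the sketch names the encoding explicitly so the result does not depend on the system default.

```python
def replace_in_file(path, old, new, encoding="utf-8"):
    """Replace every occurrence of old with new and return the match count.

    Assumption: the file is encoded as given by `encoding` (UTF-8 here).
    newline="" keeps the existing line endings (e.g. CRLF) untouched.
    """
    with open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()
    count = text.count(old)
    if count:
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(text.replace(old, new))
    return count


# Example call (file name taken from the first post):
# replace_in_file("myFile.txt", "foo", "bar")
```

If the file turns out to use a different encoding, only the encoding argument needs to change.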

Re: UTF-8 encoding while replacing string [SOLVED]

#6 Post by aGerman » 06 Aug 2021 03:52

AlphaInc. wrote (06 Aug 2021 02:28): "python script"
If Python works out of the box then it's a strong indication that the text is UTF-8-encoded.

AlphaInc. wrote (06 Aug 2021 02:28): "without having to deal with the encoding of the file"
That's just luck :lol:
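
One possible reading of "just luck", sketched in Python under stated assumptions: if the script opens the file using a single-byte code page for both reading and writing (cp1252 is assumed here as an example), the UTF-8 bytes of ö survive the round trip unchanged, as long as the search and replacement strings are plain ASCII.

```python
# Assumption: the file is UTF-8, but is read and written as cp1252.
original = "foo schön".encode("utf-8")

as_read = original.decode("cp1252")       # ö shows up as two characters
replaced = as_read.replace("foo", "bar")  # ASCII-only replacement
written = replaced.encode("cp1252")       # same bytes come back out

# The file is still valid UTF-8 with the umlaut intact.
round_trip_ok = written == "bar schön".encode("utf-8")
print(round_trip_ok)

# The luck runs out once the search string itself contains an umlaut.
umlaut_found = "schön" in as_read
print(umlaut_found)
```

Whether this is what actually happened depends on how the script opened the file, which the thread does not show.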

Steffen
