Edit subtitle with batch?
Moderator: DosItHelp
Edit subtitle with batch?
Hello, I have a question and I hope someone can help me.
Below is an excerpt from an SRT file.
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>

2
00:01:03,200 --> 00:01:05,521
<i>stories of how the world once was.</i>

3
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>

4
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>
Is it possible for a batch script to look for the word "world" and then completely remove that subtitle entry (the text plus the number and timing lines above it)?
And renumber the remaining entries accordingly?
So that it becomes like this:
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>

2
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>

3
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>
Thanks!
Re: Edit subtitle with batch?
You have not given an adequate specification as to how the source file might be formatted. For example, is the text always enclosed within <i>...</i>?
Regardless, this is not a problem I would want to tackle using pure batch. It could be done, but, yuck.
This is much better suited to something like PowerShell, JScript, or VBS.
I like to use my hybrid JScript/batch utility called JREPL.BAT.
Assuming no section contains more than one </i>, and every section ends with </i>, then the following JREPL.BAT solution works just fine:
Code: Select all
call jrepl "^(\d+)(\s*\n[\s\S]+?</i>\s*\n?\r?\n?)" "$2.match(/world/i)?'':(n+=1)+$2" /m /i /j /jbeg "var n=0" /f "test.txt" /o "output.txt"
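The JREPL call above can be read as: match each numbered section up to its closing </i>, drop it when the text contains the word, and otherwise renumber it. A minimal sketch of that idea in plain JavaScript (runnable under Node.js rather than JREPL or WSH) follows. The function name `dropSections` is hypothetical, and like the JREPL version it assumes every section ends with exactly one </i>.

```javascript
// Sketch only: mirrors the idea of the JREPL one-liner, not its exact internals.
// Assumes each section starts with a number line and ends with a single </i>.
function dropSections(srt, word) {
  var n = 0; // running counter used to renumber the sections that are kept
  var section = /^(\d+)(\s*\n[\s\S]+?<\/i>\s*\n?\r?\n?)/gmi;
  return srt.replace(section, function (all, num, rest) {
    // Drop the whole section when the word occurs; otherwise emit a new number.
    return word.test(rest) ? "" : (++n) + rest;
  });
}

// Example call; pass a non-global regular expression such as /world/i.
var demoA = dropSections(
  "1\n00:00:01,000 --> 00:00:02,000\n<i>a world</i>\n\n" +
  "2\n00:00:03,000 --> 00:00:04,000\n<i>b</i>\n",
  /world/i
);
```

The word is passed as a non-global regular expression so that repeated `test` calls do not carry state from one section to the next.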
If you really want a pure batch solution, then this works as long as all lines are <= 1021 bytes long, the total length of each section is < ~8191 bytes, and each section is separated by one or more empty lines. It also uses \n (newline) instead of \r\n (carriage return and newline) at the end of each line of output. The line terminator can be fixed with a bit of additional code if needed.
Code: Select all
@echo off
setlocal enableDelayedExpansion
set "inFile=test.txt"
set "outFile=output.txt"
set ^"LF=^

^" The empty line above is critical - DO NOT REMOVE
for /f %%N in ('find /c /v "" ^<"!inFile!"') do set "cnt=%%N"
set /a n=1
set "str="
<"!inFile!" >"!outFile!" (
  for /l %%N in (1 1 !cnt!) do (
    set "ln="
    set /p "ln="
    if defined ln (
      if not defined str (
        set "str=!n!!LF!"
      ) else (
        set "str=!str!!ln!!LF!"
      )
    ) else (
      if defined str if "!str:world=!" equ "!str!" (
        echo(!str!!LF!
        set /a n+=1
      )
      set "str="
    )
  )
  if defined str if "!str:world=!" equ "!str!" echo(!str!!LF!
)
Dave Benham
Edit additions are in blue
Last edited by dbenham on 28 Sep 2015 05:04, edited 2 times in total.
Re: Edit subtitle with batch?
Try this:
Code: Select all
@echo off
setlocal EnableDelayedExpansion
for %%v in (num times wordFound i n) do set "%%v="
(for /F "delims=" %%a in (input.txt) do (
   if not defined num (
      set "num=%%a"
   ) else if not defined times (
      set "times=%%a"
   ) else (
      set /A i+=1
      set "line[!i!]=%%a"
      set "line=%%a"
      if "!line:world=!" neq "!line!" set wordFound=true
      if "!line:~-4!" equ "</i>" (
         if not defined wordFound (
            set /A n+=1
            echo !n!
            echo !times!
            for /L %%i in (1,1,!i!) do echo !line[%%i]!
            echo/
         )
         for %%v in (num times wordFound i) do set "%%v="
      )
   )
)) > output.txt
Output:
Code: Select all
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>

2
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>

3
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>
This solution removes exclamation marks; this can be fixed if needed.
Antonio
Re: Edit subtitle with batch?
Thank you for helping me Dave & Antonio!
I have tried both scripts, and Dave's second script gave me the following output.
When I open the output.txt file I get this:
00:01:01,520 --> 00:01:03,160<i>Before they died,my parents told me</i>
00:01:06,520 --> 00:01:08,807<i>What it was like long before I was born.</i>
00:01:09,720 --> 00:01:11,882<i>Before the war with the machines.</i>
But when I copy and paste that, I get it correct!
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>
2
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>
3
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>
Subtitle files have a lot of lines, but I tried the same script on the original subtitle file, which is 1413 lines long,
and it still works, but only when you copy and paste it.
Is there any way to solve this?
And I've tried your script, Antonio, and that gives me this output:
35
00:05:33,840 --> 00:05:35,365
<i>But John is more.</i>
36
00:05:36,000 --> 00:05:37,764
<i>We're here because tonight,</i>
37
00:05:38,160 --> 00:05:40,447
<i>he's going to lead us to crush Skynet.</i>
38
00:05:41,080 --> 00:05:42,411
<i>For good.</i>
39
00:05:42,920 --> 00:05:45,287
Sir? Request to join
the Colorado offensive.
47
00:05:45,920 --> 00:05:47,410
I need you with me, Reese.
48
00:05:47,680 --> 00:05:50,040
We're talking about the
complete destruction of Skynet, sir.
49
00:05:50,160 --> 00:05:52,288
The Colorado unit will succeed.
50
00:05:52,360 --> 00:05:54,567
The machines will fall tonight.
All the entries with <i> are fine, but entries without it don't have blank lines between them.
Is this fixable?
I must say it's amazing that you guys know all this, and I'm very thankful that you're willing to help!
I've been trying to do this for quite some time now and you guys did more in a day than I did in 2 weeks!
Again, thank you.
Re: Edit subtitle with batch?
You need to copy and paste because of the line feed vs. carriage return/line feed issue I talked about. Below is code that fixes that problem. I also fixed a minor bug that was introducing an extra line feed between each section.
Code: Select all
@echo off
setlocal enableDelayedExpansion
set "inFile=test.txt"
set "outFile=output.txt"

:: Define LF to contain a line feed character
set ^"LF=^

^" The empty line above is critical - DO NOT REMOVE

:: Define CR to contain a carriage return character
for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"

:: Count the number of lines within the input file
for /f %%N in ('find /c /v "" ^<"!inFile!"') do set "cnt=%%N"

set /a n=1
set "str="
<"!inFile!" >"!outFile!" (
  for /l %%N in (1 1 !cnt!) do (
    set "ln="
    set /p "ln="
    if defined ln (
      if not defined str (
        set "str=!n!!CR!!LF!"
      ) else (
        set "str=!str!!ln!!CR!!LF!"
      )
    ) else (
      if defined str if "!str:world=!" equ "!str!" (
        echo(!str!
        set /a n+=1
      )
      set "str="
    )
  )
  if defined str if "!str:world=!" equ "!str!" echo(!str!
)
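The line-terminator issue comes down to the first script writing bare \n where \r\n was expected, which some Windows editors display as one long line. Purely as an illustration (plain JavaScript for Node.js, not part of the batch fix; the name `toCrlf` is hypothetical), normalizing a string's line endings could look like this:

```javascript
// Sketch only: convert every line ending in a string to CR+LF.
function toCrlf(text) {
  // Match an optional CR followed by LF so existing CR+LF pairs are not doubled.
  return text.replace(/\r?\n/g, "\r\n");
}

// Example call with mixed line endings.
var demoCrlf = toCrlf("1\n00:01:01,520 --> 00:01:03,160\r\n<i>text</i>\n");
```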
If you think you may have many text editing tasks, then I strongly recommend you learn regular expressions and try out JREPL.BAT. There are a great many pitfalls with trying to edit text with pure batch, and a robust batch solution can be unacceptably slow with large files. JREPL is much simpler (once you learn regular expressions, and perhaps a bit of JScript), and it is much faster and more robust.
Here is an improved JREPL solution that does not rely on <i>...</i>, and instead assumes one or more empty lines are used to delimit sections.
Code: Select all
@call jrepl "^(\d+)([\s\S]+?(\n(?:\r?\n)+|(?![\s\S])))" "$2.match(/world/i)?'':(n+=1)+$2" /m /i /j /jbeg "var n=0" /f "test.txt" /o "output.txt"
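The same blank-line-delimited approach can also be sketched without one large regular expression: split the text on empty lines, test each section, and renumber the ones that are kept. This is an illustration in plain JavaScript for Node.js, not the JREPL implementation; the name `filterSections` is hypothetical, and the sketch assumes each section starts with a number line and always writes \r\n line endings.

```javascript
// Sketch only: sections are assumed to be separated by one or more empty lines.
function filterSections(srt, word) {
  var kept = [];
  var trimmed = srt.replace(/^\s+|\s+$/g, ""); // ignore leading/trailing blank lines
  trimmed.split(/\r?\n(?:\r?\n)+/).forEach(function (section) {
    var lines = section.split(/\r?\n/);
    // Skip anything that does not look like "number line + at least one more line".
    if (lines.length < 2 || !/^\d+$/.test(lines[0])) return;
    var body = lines.slice(1).join("\r\n"); // timing line plus text lines
    if (!word.test(body)) {
      kept.push((kept.length + 1) + "\r\n" + body); // renumber from 1
    }
  });
  return kept.length ? kept.join("\r\n\r\n") + "\r\n" : "";
}

// Example call; the second section has no <i>...</i> tags at all.
var demoB = filterSections(
  "1\n00:00:01,000 --> 00:00:02,000\nHello world\n\n" +
  "2\n00:00:03,000 --> 00:00:04,000\nSecond\nline\n",
  /world/i
);
```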
Dave Benham
Re: Edit subtitle with batch?
Amazing Dave, what a great script!
Just what I was looking for, but is there also a way I can add more than just one word to find?
Like world, hello, beyond etc.?
And how can I replace a character?
Like:
This is a line
- This is line 2
Replace "- " with "-" so it becomes:
This is a line
-This is line 2
Again, thank you for your help!
Re: Edit subtitle with batch?
Wow! So this is another "your solution doesn't work" and "OK, add this feature now" topic? Please, carefully read this post; then realize that you didn't specify the format of the file! User dbenham clearly asked you:
dbenham wrote: You have not given an adequate specification as to how the source file might be formatted. For example, is the text always enclosed within <i>...</i>?
But you just ignored that question! You showed the wrong output from my script, but you didn't show the input! How do you think I can modify my code if I don't have the data to test it? That said...
The new code:
Code: Select all
@echo off
setlocal EnableDelayedExpansion
set i=0
for %%r in (world hello beyond Colorado) do (
   set /A i+=1
   set "remove[!i!]=%%r"
)
for %%v in (num times wordFound i n) do set "%%v="
(
for /F "tokens=1* delims=:" %%a in ('findstr /N "^" input.txt') do (
   set "line=%%b"
   if not defined line (
      if not defined wordFound (
         set /A n+=1
         echo !n!
         echo !times!
         for /L %%i in (1,1,!i!) do echo !line[%%i]!
         echo/
      )
      for %%v in (num times wordFound i) do set "%%v="
   ) else if not defined num (
      set "num=!line!"
   ) else if not defined times (
      set "times=!line!"
   ) else (
      set /A i+=1
      set "line[!i!]=!line!"
      for /F "tokens=2 delims==" %%r in ('set remove') do (
         if "!line:%%r=!" neq "!line!" set wordFound=true
      )
   )
)
if defined i if not defined wordFound (
   set /A n+=1
   echo !n!
   echo !times!
   for /L %%i in (1,1,!i!) do echo !line[%%i]!
   echo/
)
) > output.txt
The input:
Code: Select all
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>

2
00:01:03,200 --> 00:01:05,521
<i>stories of how the world once was.</i>

3
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>

4
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>

5
00:05:38,160 --> 00:05:40,447
<i>he's going to lead us to crush Skynet.</i>

6
00:05:41,080 --> 00:05:42,411
<i>For good.</i>

7
00:05:42,920 --> 00:05:45,287
Sir? Request to join
the Colorado offensive.

8
00:05:45,920 --> 00:05:47,410
I need you with me, Reese.

9
00:05:47,680 --> 00:05:50,040
We're talking about the
complete destruction of Skynet, sir.

10
00:05:50,160 --> 00:05:52,288
The Colorado unit will succeed.

11
00:05:52,360 --> 00:05:54,567
The machines will fall tonight.
The output:
Code: Select all
1
00:01:01,520 --> 00:01:03,160
<i>Before they died,
my parents told me</i>

2
00:01:06,520 --> 00:01:08,807
<i>What it was like long before I was born.</i>

3
00:01:09,720 --> 00:01:11,882
<i>Before the war with the machines.</i>

4
00:05:38,160 --> 00:05:40,447
<i>he's going to lead us to crush Skynet.</i>

5
00:05:41,080 --> 00:05:42,411
<i>For good.</i>

6
00:05:45,920 --> 00:05:47,410
I need you with me, Reese.

7
00:05:47,680 --> 00:05:50,040
We're talking about the
complete destruction of Skynet, sir.

8
00:05:52,360 --> 00:05:54,567
The machines will fall tonight.
Antonio
PS - If I had known the exact specifications of this problem beforehand, I would not have posted a pure Batch file solution! It is very slow...
Re: Edit subtitle with batch?
I'm very sorry, Antonio!
I didn't read the question right.
And I am very sorry that I didn't read the rules.
I also gave no information about the full input; that is completely my mistake, and I am very thankful
that you're willing to help!
I only posted a couple of lines of the input, and I should have said that there are lines without <i> and </i>.
Again, thank you for helping me.
I've tried your new code and it works, but like you said, it's pretty slow.
Re: Edit subtitle with batch?
Can someone please help me?
This is currently the script:
Code: Select all
@echo off
setlocal enableDelayedExpansion
set inFile= %1
set outFile= %~n1_no_world.srt
>nul findstr "^[0-9].*-->" %1 && (
goto process_srt
) || (
goto:eof
)
:process_srt
:: Define LF to contain a line feed character
set ^"LF=^

^" The empty line above is critical - DO NOT REMOVE
:: Define CR to contain a carriage return character
for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"
:: Count the number of lines within the input file
for /f %%N in ('find /c /v "" ^<"!inFile!"') do set "cnt=%%N"
set /a n=1
set "str="
<"!inFile!" >"!outFile!" (
for /l %%N in (1 1 !cnt!) do (
set "ln="
set /p "ln="
if defined ln (
if not defined str (
set "str=!n!!CR!!LF!"
) else (
set "str=!str!!ln!!CR!!LF!"
)
) else (
if defined str if "!str:world=!" equ "!str!" (
echo(!str!
set /a n+=1
)
set "str="
)
)
if defined str if "!str:world=!" equ "!str!" echo(!str!
)
I'm looking for a way to have the batch file process an SRT file when I drag it onto the batch file.
But currently it says that the path can't be found.
Also, how can I extend the script so it searches for multiple words?
Right now it only finds the word "world", but how can I make it search for world, hello, beyond etc.?
I've used Antonio's script above, but it's really slow.
Re: Edit subtitle with batch?
Can someone please tell me why this doesn't work?
When I only use the word "world" it works, but when I add "hope" it doesn't.
Code: Select all
@echo off
setlocal enableDelayedExpansion
set "inFile=%~1"
set "outFile=%~n1_no_world.txt"
:: Define LF to contain a line feed character
set ^"LF=^

^" The empty line above is critical - DO NOT REMOVE
:: Define CR to contain a carriage return character
for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"
:: Count the number of lines within the input file
for /f %%N in ('find /c /v "" ^<"!inFile!"') do set "cnt=%%N"
set /a n=1
set "str="
<"!inFile!" >"!outFile!" (
for /l %%N in (1 1 !cnt!) do (
set "ln="
set /p "ln="
set "TRUE="
if "!str:world=!" equ "!str!" set TRUE=1
if "!str:hope=!" equ "!str!" set TRUE=1
if defined ln (
if not defined str (
set "str=!n!!CR!!LF!"
) else (
set "str=!str!!ln!!CR!!LF!"
)
) else (
if defined TRUE (echo(!str!
set /a n+=1
)
set "str="
)
)
if defined TRUE echo(!str!
)
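One likely reason the script above misbehaves with two words: TRUE is set when either word is absent, so a section containing "world" but not "hope" still gets flagged as a keeper. With a single word the two conditions coincide, which would explain why that case works. The sketch below contrasts the two conditions in plain JavaScript for Node.js rather than batch; the function names and the `banned` list are hypothetical, and it is only a reading of the logic, not a tested fix for the batch file.

```javascript
// Sketch only: compares "any word absent" with "every word absent".
var banned = ["world", "hope"];

// Mirrors the two independent IF tests in the batch script:
// the section is kept as soon as ANY banned word is missing.
function keepIfAnyAbsent(text) {
  var lower = text.toLowerCase(); // batch substring substitution ignores case
  var keep = false;
  for (var i = 0; i < banned.length; i++) {
    if (lower.indexOf(banned[i]) === -1) keep = true;
  }
  return keep;
}

// Intended behaviour: keep the section only when EVERY banned word is missing.
function keepIfAllAbsent(text) {
  var lower = text.toLowerCase();
  for (var i = 0; i < banned.length; i++) {
    if (lower.indexOf(banned[i]) !== -1) return false;
  }
  return true;
}
```

In batch terms, the second version corresponds to starting with TRUE defined and clearing it whenever a word is found.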
Re: Edit subtitle with batch?
Code: Select all
@set @a=0 /*
@CScript //nologo //E:JScript "%~F0" < "%~F1" > "%~DPN1_no_world.txt"
@goto :EOF */
var fileContents = WScript.StdIn.ReadAll(),
    search = /(\d+\r\n)(.+\r\n((.+\r\n)+)(\r\n)?)/g,
    ignoreWord = /world|hello|beyond/, match, n=0;
while ( match = search.exec(fileContents) ) {
   if ( ! ignoreWord.test(match[3]) ) {
      WScript.Stdout.Write(++n+"\r\n"+match[2]);
   }
}
Code: Select all
Input data                       search = /regexp/

1                                (\d+\r\n)      \d=a digit, +=one or more times, CR+LF -> match[1]
00:01:01,520 --> 00:01:03,160    (.+\r\n        (.=any char, +=one or more times, CR+LF
<i>Before they died,               ((.+\r\n)    (.=any char, +=one or more times, CR+LF)
my parents told me</i>             +)           +=one or more times -> match[3]
                                   (\r\n)?      empty line, ?=zero or one time (zero times in last line)
                                 )              ) -> match[2]
Full regexp details here.
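To see which part of a section lands in which capture group, the same regular expression can be run against a single section. This is a small sketch in plain JavaScript for Node.js (no WScript objects); the function name `describeSection` and its property names are hypothetical, and the input is assumed to use CR+LF line endings as in the table above.

```javascript
// Sketch only: expose the capture groups of the search regexp for one section.
function describeSection(text) {
  var search = /(\d+\r\n)(.+\r\n((.+\r\n)+)(\r\n)?)/g;
  var match = search.exec(text);
  if (!match) return null; // no complete section found
  return {
    number: match[1],        // the number line, including its CR+LF
    timingAndText: match[2], // timing line, text lines and optional empty line
    textLines: match[3]      // only the text lines, which ignoreWord is tested against
  };
}

// Example call with the first section of the sample file.
var demoD = describeSection(
  "1\r\n00:01:01,520 --> 00:01:03,160\r\n" +
  "<i>Before they died,\r\nmy parents told me</i>\r\n\r\n"
);
```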
Antonio
Re: Edit subtitle with batch?
Thank you Antonio!
What a great script and it's lightning fast!
The final thing to complete the script is to add a replace function.
I'm trying to replace - with +.
This is what I've got, but it doesn't work.
Code: Select all
@set @a=0 /*
@echo off
if NOT %~x1 == .txt goto :EOF
@CScript //nologo //E:JScript "%~F0" < "%~F1" > "%~DPN1_no_world.txt"
@goto :EOF */
var fileContents = WScript.StdIn.ReadAll(),
search = /(\d+\r\n)(.+\r\n((.+\r\n)+)(\r\n)?)/g,
ignoreWord = /world|hello|beyond|[)]/, match, n=0;
replaceWord = /-/, match, n=0;
while ( match = search.exec(fileContents) ) {
if ( ! ignoreWord.test(match[3]) ) {
WScript.Stdout.Write(++n+"\r\n"+match[2]);
}
if ( ! replaceWord.test(match[3]) ) {
WScript.Stdout.replace(search, replaceWord "+")
}