JREPL.BAT v8.6 - regex text processor with support for text highlighting and alternate character sets
Moderator: DosItHelp
Re: JREPL.BAT v8.3 - regex text processor with support for text highlighting and alternate character sets
Thanks for the continued support. How can I handle the percent (%) character? The following does not seem to work as expected...
::SyntaxDescr MediaWiki pmWiki
::NumberAlpha ## "## %alpha%"
call jrepl_v8_3.bat "##" "## %alpha%" /f C:\Temp\jrepl\output.rtf /o - /l
Re: JREPL.BAT v8.3 - regex text processor with support for text highlighting and alternate character sets
I have a few thousand files from MediaWiki that are part of a translation to pmWiki that I am working on. Many of these files have underscores in their filenames, but links in MediaWiki use spaces, as follows...
Filenames have underscores:
ACI_318_08.pdf
MediaWiki links have spaces:
[[Media:ACI 318 08.pdf|ACI 318-08 - Imperial]]
Right now this does NOT work in pmWiki:
[[Attach:ACI 318 08.pdf|ACI 318-08 - Imperial]]
Right now this DOES work in pmWiki:
[[Attach:ACI_318_08.pdf|ACI 318-08 - Imperial]]
Please advise how to replace each space with an underscore between '[[Attach:' and '.pdf|'.
I have many thousands of broken links of this nature to attend to.
Much appreciated.
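One possible approach, sketched here in plain JavaScript rather than as a tested JREPL command: match each link prefix from '[[Attach:' through '.pdf|' and replace the spaces only inside that match, so the link label keeps its spaces. The function name fixAttachLinks is hypothetical, and the pattern assumes file names never contain '|' or ']'.

```javascript
// Hypothetical helper: replace spaces with underscores only inside the
// file-name part of [[Attach:...pdf|...]] links; link labels are untouched.
function fixAttachLinks(text) {
  // Match from "[[Attach:" up to the first ".pdf|" without crossing "|" or "]".
  return text.replace(/\[\[Attach:[^|\]]*\.pdf\|/g, function (link) {
    return link.replace(/ /g, '_');
  });
}

console.log(fixAttachLinks('[[Attach:ACI 318 08.pdf|ACI 318-08 - Imperial]]'));
// prints: [[Attach:ACI_318_08.pdf|ACI 318-08 - Imperial]]
```

Since JREPL replacement code is JScript, the same regex and inner replace could probably be carried over to a JScript replacement expression, but that is untested here.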
Re: JREPL.BAT v8.3 - regex text processor with support for text highlighting and alternate character sets
I apologize for my bad English (Google Translate).
I want to thank everyone, and Dave Benham in particular, for this fantastic tool. I am still crawling in the JREPL world; I have studied the examples and adapted them to my needs, and I'm happy with the results. I would like to share two examples.
Search for lines of text containing the word Joaçaba:
jrepl "^(.+?)Joaçaba.*$" "if ($1!=prev) {$1;$0} " /jmatch /jbeg "prev=''" /f input.txt >output.txt
Cleaning HTML:
type input.html | jrepl "=?\r?\n" "" /m | jrepl "<tr>(.*?)</tr>" "$1" /jmatch /m >output.html
Maybe someone can improve them.
Re: JREPL.BAT v8.3 - regex text processor with support for text highlighting and alternate character sets
Hi there! Thought I'd register for the first time a year and a half after discovering JREPL for the first time.
It's always come in handy when I need to adjust files on-the-fly within batch scripts, and one of the things I liked was how you could choose a specific line, or range of lines, to remove from a text file altogether. For example, this always worked for me:
Code: Select all
jrepl "^" "" /k 0 /exc 30:31 /u /f main.c /o main2.c
... that is, until v8.0 (and beyond). For some reason when the new file is written, it remains identical to the original, and no lines are removed. It's as if something got changed along the way and this method doesn't work anymore. I did consult the v8.0 changelog to see what had changed and can only conclude that it must be something to do with the /K parameter. But unfortunately I'm not smart enough to determine whether I'm missing another setting now, or if there is in fact a bug in the current JREPL build. Which may be the case, as after further tinkering I discovered that I could remove line 1..... and only line 1. Referencing any line other than the first - whether stand-alone or a range - results in no changes being made.
I sort-of found a way around it by using hex-code references and the /X and /M parameters but this seems very long-winded to remove just two lines. So I thought I'd mention the issue just in case I have indeed discovered a bug in the code.
Feel free to let me know if I'm missing anything important here. Thanks!
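For reference, the intended effect of excluding lines 30 to 31 can be stated as a small JavaScript sketch. This only illustrates the expected result, not how JREPL implements /EXC; removeLines is a hypothetical name and line numbers are assumed to be 1-based and inclusive.

```javascript
// Hypothetical helper: drop lines first..last (1-based, inclusive) from text.
function removeLines(text, first, last) {
  return text
    .split('\n')
    .filter(function (_line, index) {
      var lineNumber = index + 1;
      return lineNumber < first || lineNumber > last;
    })
    .join('\n');
}

console.log(removeLines('a\nb\nc\nd', 2, 3)); // keeps only lines 1 and 4
```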
Re: JREPL.BAT v8.3 - regex text processor with support for text highlighting and alternate character sets
That is most definitely a serious bug
Not sure how long it will take, but I will definitely fix that.
Thanks for reporting - please don't hesitate to report any suspect behavior in the future.
Dave Benham
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
Well that was an easy fix. I replaced version 8.3 with version 8.4
I also updated the main release to version 8.4 at the original post in this thread.
Thanks again MarzSyndrome - you were a big help.
Dave Benham
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
Hi, this script is very useful and functional. Congratulations.
I want to understand how to search and replace characters on a file ... for example:
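& = &amp;
< = &lt;
> = &gt;
© = &copy;
® = &reg;
´ = &acute;
« = &laquo;
» = &raquo;
¡ = &iexcl;
¿ = &iquest;
À = &Agrave;
à = &agrave;
Á = &Aacute;
á = &aacute;
 = &Acirc;
â = &acirc;
à = &Atilde;
ã = &atilde;
Ä = &Auml;
ä = &auml;
Å = &Aring;
å = &aring;
Æ = &AElig;
æ = &aelig;
Ç = &Ccedil;
ç = &ccedil;
Ð = &ETH;
ð = &eth;
È = &Egrave;
è = &egrave;
É = &Eacute;
é = &eacute;
Ê = &Ecirc;
ê = &ecirc;
Ë = &Euml;
ë = &euml;
Ì = &Igrave;
ì = &igrave;
Í = &Iacute;
í = &iacute;
Î = &Icirc;
î = &icirc;
Ï = &Iuml;
ï = &iuml;
Ñ = &Ntilde;
ñ = &ntilde;
Ò = &Ograve;
ò = &ograve;
Ó = &Oacute;
ó = &oacute;
Ô = &Ocirc;
ô = &ocirc;
Õ = &Otilde;
õ = &otilde;
Ö = &Ouml;
ö = &ouml;
Ø = &Oslash;
ø = &oslash;
Ù = &Ugrave;
ù = &ugrave;
Ú = &Uacute;
ú = &uacute;
Û = &Ucirc;
û = &ucirc;
Ü = &Uuml;
ü = &uuml;
Ý = &Yacute;
ý = &yacute;
ÿ = &yuml;
Þ = &THORN;
þ = &thorn;
ß = &szlig;
§ = &sect;
¶ = &para;
µ = &micro;
¦ = &brvbar;
± = &plusmn;
· = &middot;
¨ = &uml;
¸ = &cedil;
ª = &ordf;
º = &ordm;
¬ = &not;
_ = &shy;
¯ = &macr;
° = &deg;
¹ = &sup1;
² = &sup2;
³ = &sup3;
¼ = &frac14;
½ = &frac12;
¾ = &frac34;
× = &times;
÷ = &divide;
¢ = &cent;
£ = &pound;
¤ = &curren;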
I have a txt file in which I would like to replace these characters, because the XML reader does not recognize the accented characters.
Many thanks in advance
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
That is pretty simple with the /T FILE option, and ADO.
First modify your translations into two txt files representing your find and replace terms, one per line.
find.txt (Must be encoded as Unicode - probably UTF-16LE or UTF-8)
Code: Select all
&
<
>
©
®
´
«
»
¡
¿
À
à
Á
á
Â
â
Ã
ã
Ä
ä
Å
å
Æ
æ
Ç
ç
Ð
ð
È
è
É
é
Ê
ê
Ë
ë
Ì
ì
Í
í
Î
î
Ï
ï
Ñ
ñ
Ò
ò
Ó
ó
Ô
ô
Õ
õ
Ö
ö
Ø
ø
Ù
ù
Ú
ú
Û
û
Ü
ü
Ý
ý
ÿ
Þ
þ
ß
§
¶
µ
¦
±
·
¨
¸
ª
º
¬
_
¯
°
¹
²
³
¼
½
¾
×
÷
¢
£
¤
repl.txt (ASCII encoding is fine)
Code: Select all
&amp;
&lt;
&gt;
&copy;
&reg;
&acute;
&laquo;
&raquo;
&iexcl;
&iquest;
&Agrave;
&agrave;
&Aacute;
&aacute;
&Acirc;
&acirc;
&Atilde;
&atilde;
&Auml;
&auml;
&Aring;
&aring;
&AElig;
&aelig;
&Ccedil;
&ccedil;
&ETH;
&eth;
&Egrave;
&egrave;
&Eacute;
&eacute;
&Ecirc;
&ecirc;
&Euml;
&euml;
&Igrave;
&igrave;
&Iacute;
&iacute;
&Icirc;
&icirc;
&Iuml;
&iuml;
&Ntilde;
&ntilde;
&Ograve;
&ograve;
&Oacute;
&oacute;
&Ocirc;
&ocirc;
&Otilde;
&otilde;
&Ouml;
&ouml;
&Oslash;
&oslash;
&Ugrave;
&ugrave;
&Uacute;
&uacute;
&Ucirc;
&ucirc;
&Uuml;
&uuml;
&Yacute;
&yacute;
&yuml;
&THORN;
&thorn;
&szlig;
&sect;
&para;
&micro;
&brvbar;
&plusmn;
&middot;
&uml;
&cedil;
&ordf;
&ordm;
&not;
&shy;
&macr;
&deg;
&sup1;
&sup2;
&sup3;
&frac14;
&frac12;
&frac34;
&times;
&divide;
&cent;
&pound;
&curren;
Let's assume both your find.txt and your source.txt are encoded as UTF-8. Then the following will convert it into ASCII (assuming all needed translations are accounted for)
Code: Select all
jrepl "find.txt|utf-8" repl.txt /t file /f "source.txt|utf-8" /o output.txt
If the files are UTF-16LE, then
Code: Select all
jrepl "find.txt|unicode" repl.txt /t file /f "source.txt|unicode" /o output.txt
To learn more about using ADO with JREPL, use JREPL /?/I and JREPL /?/O
All the character sets supported by ADO can be listed by using JREPL /?CHARSET/
To learn more about the /T option, use JREPL /??/T
Dave Benham
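The idea of a parallel find/replace table can be illustrated with a short JavaScript sketch. It is not JREPL's implementation of /T; the names escapeRegExp and translate are hypothetical, and the sketch assumes every find term is a literal string. Entry i of the find list is replaced by entry i of the replace list in a single pass.

```javascript
// Escape regex metacharacters so each find term is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: findTerms[i] is replaced by replTerms[i].
function translate(text, findTerms, replTerms) {
  if (findTerms.length !== replTerms.length) {
    throw new Error('find and replace lists must have the same length');
  }
  var table = {};
  findTerms.forEach(function (term, i) { table[term] = replTerms[i]; });
  // One alternation, one pass: text produced by a replacement is never rescanned.
  var pattern = new RegExp(findTerms.map(escapeRegExp).join('|'), 'g');
  return text.replace(pattern, function (match) { return table[match]; });
}

console.log(translate('Joaçaba & São', ['&', 'ç', 'ã'], ['&amp;', '&ccedil;', '&atilde;']));
// prints: Joa&ccedil;aba &amp; S&atilde;o
```

Doing everything in one pass matters here: the "&" introduced by "&ccedil;" is not itself translated to "&amp;".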
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
An alternative to using named character references is to use numeric character references in the form of &#nnnn; Then you don't need any special table of translations - you can use very simple JScript logic with the /JQ option.
The following will translate any UTF-8 document into pure ASCII by transforming all unicode character points >= 128, as well as & < > and _
Code: Select all
jrepl "[\x80-\u{FFFFFF}&<>_]" "$txt='&#'+$0.charCodeAt(0)+';'" /xseq /jq /f "source.txt|utf-8" /o "output.txt"
Dave Benham
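The replacement logic in that command can be tried outside JREPL with a plain JavaScript sketch. The function name toNumericRefs is hypothetical, and the character class is limited to \uFFFF here because JavaScript strings are UTF-16; a character outside the Basic Multilingual Plane would be emitted as two separate surrogate references by this sketch.

```javascript
// Hypothetical helper: turn & < > _ and every character >= 128 into &#nnnn;
function toNumericRefs(text) {
  return text.replace(/[\u0080-\uFFFF&<>_]/g, function (ch) {
    return '&#' + ch.charCodeAt(0) + ';';
  });
}

console.log(toNumericRefs('São <b>'));
// prints: S&#227;o &#60;b&#62;
```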
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
Hi!
I would like to extract all strings between curly brackets after FUNCTION.
input.txt
Code: Select all
FUNCTION = {}
FUNCTION = { value1
value2
value3
}
FUNCTION = {
value4
value5
}
WRONG1 = {wrong1}
FUNCTION = { value6 value7 value8} WRONG2 = {wrong2}
FUNCTION ={value9
value10
value11} FUNCTION ={value12
value13
}
What I need in this case:
Code: Select all
value1
value2
value3
value4
value5
value6
value7
value8
value9
value10
value11
value12
value13
My latest script:
call JREPL "\w+" "$txt=$0" /jmatchq /INC "/\\bFUNCTION\\s*=\\s*\\{/:/\\}/" /f "input.txt" > "output.txt"
output.txt
Code: Select all
FUNCTION
FUNCTION
value1
value2
value3
FUNCTION
value6
value7
value8
WRONG2
wrong2
FUNCTION
value9
value10
value11
FUNCTION
value12
I tried hard to understand how the /INC parameter works, but to no avail. Maybe it shouldn't be used for what I need.
Thanks for any help.
Re: JREPL.BAT v8.4 - regex text processor with support for text highlighting and alternate character sets
The /INC option does not help because it is line based - all text within the included lines is searched. But you want to search only the text on the line(s) that is between the braces.
You need the /P and /PREPL options to specify the regions of the file to search. You need /M because the braced text may span multiple lines.
Note that /XSEQ encodings are implicitly available for /P regexes. You need \c for ^ in case you use CALL because CALL will double all quoted ^
Code: Select all
call jrepl "\w+" "" /match /m /p "FUNCTION\s*=\s*\{([\c}]+)}" /prepl "{$1}" /f "input.txt" /o "output.txt"
Dave Benham
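The two-stage idea (first isolate the braced regions that follow FUNCTION, then match words only inside them) can be sketched in plain JavaScript. This is an illustration of the technique, not of JREPL internals; extractFunctionValues is a hypothetical name, and the sketch assumes braces are never nested.

```javascript
// Hypothetical helper: collect every word found inside FUNCTION = { ... }.
function extractFunctionValues(text) {
  var values = [];
  // Step 1: find each brace region after FUNCTION; [^}]* may span lines.
  var region = /FUNCTION\s*=\s*\{([^}]*)\}/g;
  var regionMatch;
  while ((regionMatch = region.exec(text)) !== null) {
    // Step 2: search for words inside that region only.
    var words = regionMatch[1].match(/\w+/g);
    if (words) {
      values = values.concat(words);
    }
  }
  return values;
}

console.log(extractFunctionValues('FUNCTION = { a\nb } WRONG = {c} FUNCTION ={d}'));
```

A line-based filter cannot express this, because text such as WRONG2 = {wrong2} sits on the same line as a wanted region.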
Re: JREPL.BAT v8.5 - regex text processor with support for text highlighting and alternate character sets
Here is JREPL version 8.5
Summary of Changes
JREPL /?HISTORY wrote:
2020-02-29 v8.5: Added /EOL option to set the end of line terminator.
Added the eol global jscript variable.
Doc fix - No EOL if /RTN option specifies a :LineNumber.
. . .
New /EOL option
JREPL /?/EOL wrote:
/EOL EndOfLineString
Write lines using EndOfLineString as the line terminator.
Standard JScript escape sequences may be used.
The default is "\r\n" (CarriageReturn LineFeed).
The value may be set to an empty string to eliminate linefeeds
from the output.
/EOL has no effect if the /M option is used unless /MATCH,
/JMATCH, or /JMATCHQ is also used.
Note that /EOL does not affect input.ReadLine or output.WriteLine
methods in user supplied JScript. ReadLine always accepts both
\r\n and \n as line terminators. And WriteLine always terminates
lines with \r\n.
The /U option remains unchanged:
JREPL /?/U wrote:
/U - Write lines using a Unix line terminator \n instead of Windows
terminator of \r\n. This is the same as using /EOL "\n".
See /EOL help for more info.
New eol JScript variable for user supplied JScript
JREPL /?JSCRIPT wrote:
. . .
eol - The line terminator used when writing output lines. This is the
same value set by the /EOL option.
. . .
The most obvious use for this variable is to remove line feeds without using the /M option. For example, the following will remove the newline after any line that ends with a dash:
Code: Select all
jrepl "-$" "$txt=$0;eol=''" /jq /jbegln "eol='\r\n'" /f input.txt /o -
The reason this is important is that it allows manipulating end of line while reading one line at a time, so there are no file size restrictions. The only other way to manipulate line terminators on a line by line basis is through the /M option, but that requires the entire file to fit in memory, which limits the size of the file that can be edited.
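The per-line eol idea in the dash example can be illustrated outside JREPL with a small JavaScript sketch. It only models the behavior described (terminator chosen separately for each line); joinDashLines is a hypothetical name, and the input is assumed to be already split into lines.

```javascript
// Hypothetical helper: write each line followed by a terminator chosen per
// line. Lines ending with a dash get no terminator; others get defaultEol.
function joinDashLines(lines, defaultEol) {
  var out = '';
  lines.forEach(function (line) {
    var eol = /-$/.test(line) ? '' : defaultEol; // decided line by line
    out += line + eol;
  });
  return out;
}

console.log(JSON.stringify(joinDashLines(['multi-', 'line', 'done'], '\r\n')));
// prints: "multi-line\r\ndone\r\n"
```

Because each decision needs only the current line, nothing beyond one line has to be held in memory at a time.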
Updated /RTN documentation
JREPL /?/RTN wrote:
/RTN ReturnVar[:[-]LineNumber]
Write the result to variable ReturnVar.
If the optional LineNumber is present, then only that specified
line within the result set is returned. A LineNumber of 1 is the
first line. A negative LineNumber is measured from the end of the
result set, so -1 is the last line. /RTN always breaks lines at
\r\n and \n - the /EOL value is ignored.
All byte codes except NULL (0x00) are preserved, regardless
whether delayed expansion is enabled or not. An error is thrown
and no value stored if the result contains NULL.
An error is thrown and no value stored if the value does not fit
within a variable. The maximum returned length varies depending
on the variable name and result content. The longest possible
returned length is 8179 bytes.
The line terminator of the last match is suppressed if /MATCH,
/JMATCH, or /JMATCHQ is used. There is also no line terminator
if LineNumber is specified.
/RTN uses a temporary output file to transfer the result to the
environment variable. By default the temporary file is written
as UTF-8. But the file is written using the CSCRIPT default code
page if the /XFILE option is used - the action may fail if the
result contains a character that cannot be mapped to the CSCRIPT
default code page.
Re: JREPL.BAT v8.5 - regex text processor with support for text highlighting and alternate character sets
Hi, I'm using JREPL inside Webpack with the Webpack ShellPlugin. My call works fine when run from cmd, but the same script gives weird results when run inside Webpack.
My command is
Code: Select all
'call "./framework/config/JREPL.BAT" "(Error)\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)" "Error(\q\q)" /xseq /f ./dist/index.html /o ./dist/indexFinal.html'
when used on this string
Code: Select all
Error( skjdksjdskd() + "" + )
I get the results
Code: Select all
Error()( skjdksjdskd() + "" + )
But when using it in cmd I get
Code: Select all
Error("")
which is the expected result. Any ideas why this behaves differently?
Re: JREPL.BAT v8.5 - regex text processor with support for text highlighting and alternate character sets
The above post is a follow-up question to my StackOverflow answer to this question.
Based on the OP's last comment on 2020-03-11, the solution is to escape the quotes and backslashes as \" and \\ when using Webpack ShellPlugin:
Code: Select all
call "./framework/config/JREPL.BAT" \"(Error)\\(([\\c()]*|\\(([\\c()]*|\\([\\c()]*\\))*\\))*\\)\" \"Error(\\q\\q)\" /xseq /f ./dist/index.html /o ./dist/indexFinal.html
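One likely reason the doubling is needed, assuming the command is written as a JavaScript string literal in the Webpack config (the thread does not show the config itself): in a string literal, a backslash before an ordinary character such as "(" or "q" is dropped when the string is evaluated. The variable names below are only for illustration.

```javascript
// A backslash before "(" or "q" is not a recognised escape, so it vanishes.
var single = 'Error(\q\q) \(';     // evaluates to: Error(qq) (
// Doubling each backslash keeps one backslash in the resulting string.
var doubled = 'Error(\\q\\q) \\('; // evaluates to: Error(\q\q) \(

console.log(single);
console.log(doubled);
```

Whether the plugin applies any further processing to the string before handing it to the shell is not something this sketch can show.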
Re: JREPL.BAT v8.5 - regex text processor with support for text highlighting and alternate character sets
Thanks for the excellent tool!
I need to parse an mxf file within a batch script; the mxf contains several special characters that pose a problem for batch: "!", " " " (double quote), ":",...
I want to replace them with harmless characters for subsequent parsing.
As a first step, I tried to replace "!" with "_" from a batch file using this line:
call jrepl.bat "\b^!\b" "_" /f !Lineup! /M /o !dr!___lineup.mx
'Lineup' is the path to the source file; '!dr!' is the path to the target file "___lineup.mx".
What I get is this (example line):
original:
<Lineup uid="!MCLineup!DVB-S-DE" primaryProvider="!MCLineup!MainLineup" guid="d78688cca4aa41e788f6092c0beadcb0" />
output:
<Lineup uid="!MCLineup_DVB-S-DE" primaryProvider="!MCLineup_MainLineup" guid="d78688cca4aa41e788f6092c0beadcb0" />
So the "!" after the "=" is not replaced...
Do I need to specify something more on the command line?
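The observed output is consistent with how \b works in JavaScript-style regular expressions: \b!\b only matches a "!" that has a word character on both sides, and the "!" right after the quote has a non-word character (") before it. A short sketch with a shortened version of the example line shows the difference; the variable names are only for illustration, and batch escaping of "!" is a separate matter not covered here.

```javascript
var line = '<Lineup uid="!MCLineup!DVB-S-DE" />';

// \b!\b requires a word character immediately before and after the "!".
var boundedOnly = line.replace(/\b!\b/g, '_');
// Without the \b assertions every "!" is replaced.
var everyBang = line.replace(/!/g, '_');

console.log(boundedOnly); // <Lineup uid="!MCLineup_DVB-S-DE" />
console.log(everyBang);   // <Lineup uid="_MCLineup_DVB-S-DE" />
```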