JREPL.BAT v8.6 - regex text processor with support for text highlighting and alternate character sets

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#286 Post by Aacini » 12 Apr 2017 13:58

You may directly do that with a Batch file:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

for /F "delims=" %%a in (test.txt) do set "string=%%a"

echo !string!
echo/

for /F "delims=" %%a in (^"!string:^<br^>^=^<br^>^
% Do NOT remove this line %
!^") do (
  echo %%a
)

Antonio

kyouniis
Posts: 2
Joined: 12 Apr 2017 15:18

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#287 Post by kyouniis » 12 Apr 2017 15:40

I'm trying to replace some bytes with jrepl.bat using this syntax:

Code: Select all

@echo off
jrepl "\x11\x3C\xC9\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A" "\x11\x3C\xCE\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A" /m /x /f dumpin.bin /o dumpout.bin
pause

But the new file is 30KB bigger than before. What's causing it? I can upload the file if you want to check, it's 1.5MB compressed.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#288 Post by dbenham » 13 Apr 2017 07:58

Argh. That should not be :?

Your search and replace strings are the same length, and you have properly used the /M option, so the size should not change.
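
A quick way to check that reasoning outside of JREPL is the plain JavaScript sketch below. It is only an illustration: it assumes the binary image is held as a string with one character per byte, and the names search, replacement and patchBytes are made up for the example.

```javascript
// Sketch only: models a binary image as a string with one character per byte.
// The 20-byte search pattern and replacement are the ones from the post above;
// only the third byte differs (0xC9 -> 0xCE).
const search = /\x11\x3C\xC9\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A/g;
const replacement = "\x11\x3C\xCE\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A";

// Hypothetical helper: replace every occurrence of the pattern.
function patchBytes(image) {
  return image.replace(search, replacement);
}

// Build a small fake "file": 4 bytes of padding, the pattern, 2 more bytes.
const original = "\x00\x01\x02\x03" +
  "\x11\x3C\xC9\x31\x01\x0C\x60\x7C\x04\x8E\xD4\x31\x01\x0C\x60\x7C\x04\x8E\xCD\x5A" +
  "\xFF\xFE";
const patched = patchBytes(original);

console.log(original.length, patched.length); // 26 26
```

Because the replacement has the same number of characters as the match, the length cannot change here, so a size difference would have to come from somewhere other than the replacement itself.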

I definitely would like to have access to the file so I can test. But I am pretty sure you cannot add an attachment that large to this site. So you will have to use some external service and provide a link in your post here. I'm familiar with dropbox, but there are lots of other options. If you prefer, you can send me a private message with the link to the file.


Dave Benham

zimxavier
Posts: 53
Joined: 17 Jan 2016 10:09
Location: France

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#289 Post by zimxavier » 16 Apr 2017 12:37

Hi!

How can you search for a line break with the /INC option? I read that /M is incompatible with /INC.
This code does nothing:

Code: Select all

@echo off
for /f "delims=" %%a in ('dir /b /a-d "GAME\*.txt" ') do (
call JREPL "\n" "|" /inc "/^BEGIN$/+1:/^END$/-1" /x /f "GAME\%%~a" /o -
)

\n works fine when in Replace though.

Thanks!

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#290 Post by dbenham » 18 Apr 2017 20:39

Normally (without /M) JREPL reads and processes one line at a time, and the terminating \r\n is not included in the string. So of course searching for \n is pointless. But \r\n automatically gets restored when the resultant line is written.

As stated in the documentation, the only way to productively search for \n is to use the /M option, which puts the entire binary image of the file into memory. But, as you say, the /M option is incompatible with /INC.
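
The difference between the two modes can be sketched in plain JavaScript. This only imitates the behavior described above and is not JREPL's actual code; processByLine and processWhole are hypothetical names.

```javascript
// Sketch of the two processing styles described above (not JREPL's real code).

// Line mode: split off the \r\n terminators, run the replace on each bare
// line, then put \r\n back when writing. The regex never sees a newline.
function processByLine(text, regex, repl) {
  return text.split("\r\n").map(line => line.replace(regex, repl)).join("\r\n");
}

// Whole-file mode (the idea behind /M): the replace sees the entire text,
// terminators included, so \n can be matched.
function processWhole(text, regex, repl) {
  return text.replace(regex, repl);
}

const sample = "A\r\nB\r\nC";
console.log(JSON.stringify(processByLine(sample, /\r?\n/g, "|"))); // "A\r\nB\r\nC"
console.log(JSON.stringify(processWhole(sample, /\r?\n/g, "|")));  // "A|B|C"
```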

So you need a different approach.

I'm not 100% sure of your end goal.

Given the following input

Code: Select all

Preserve 1
Preserve 2
BEGIN
A
B
C
END
Preserve 3
Preserve 4

Then I interpret your desired output to be

Code: Select all

Preserve 1
Preserve 2
BEGIN
A|B|C
END
Preserve 3
Preserve 4

The following should give a result like the one above. I use the /M option to capture everything between BEGIN (inclusive) and END (exclusive). I use the /JQ option to apply a second find/replace to the lines after BEGIN, substituting | for each \r\n.

The code is simpler with a plain FOR instead of FOR /F:

Code: Select all

@echo off
for %%F in ("GAME\*.txt") do (
  call jrepl "(^BEGIN\r?\n)([\s\S]*?)(?=\nEND$)" "$txt=$1+$2.replace(/\r?\n/gm,'|')" /jq /m /f "%%F" /o -
)
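
The regex logic of that command can be tried outside of JREPL with the plain JavaScript sketch below. It uses the same search regex with the global and multiline flags and does the inner replace in a callback; joinBlockLines and blockRegex are made-up names, and the sample text assumes \r\n line endings.

```javascript
// Sketch (plain JavaScript, not JREPL): same regex as the command above,
// applied to the whole text at once with the multiline flag.
const blockRegex = /(^BEGIN\r?\n)([\s\S]*?)(?=\nEND$)/gm;

// Hypothetical helper. Group 1 is the BEGIN line with its terminator,
// group 2 is everything up to (not including) the newline before END.
function joinBlockLines(text) {
  return text.replace(blockRegex, (match, beginLine, body) =>
    beginLine + body.replace(/\r?\n/g, "|"));
}

const input = [
  "Preserve 1", "Preserve 2", "BEGIN", "A", "B", "C", "END", "Preserve 3", "Preserve 4"
].join("\r\n") + "\r\n";

console.log(joinBlockLines(input));
```

The lookahead stops just before the newline that precedes END, so a trailing \r stays in place and the END line keeps its own line terminator.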


Dave Benham

Arc
Posts: 3
Joined: 21 Apr 2017 07:55

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#291 Post by Arc » 21 Apr 2017 09:24

Hi everyone!

I've signed up to thank Dave Benham. I also watched his videos on YouTube. He's not really an ordinary person :)

I'm not a professional user. I have used the GUI and macros of numerous text editors, just for search and replace. I tried to learn regular expressions, but it's really boring to be tied to one application. JREPL.BAT was a great savior. Thank you for closing this gap.

I think every user should hear about this simple but effective tool. Please create a manual with more examples. I have a hard time reading the manuals. Real-life examples teach a beginner a lot; without them it all looks impossible. So please spread the word to all users.

Now my question: I have difficulty using the /T option. For example:

Code: Select all

<ab>.......................<td>
<\ab>......................<\td>
<efg>......................<tr>
<\efg>.....................<\tr>
<hij lang="en-us"><td>.....<td><hij lang="en-us">
<hij lang="xx"><td>........<td><hij lang="xx">
xml:lang=".................lang="


To change all of them, I know these regexes:

Code: Select all

(<\?)ab>........................\1td>
(<\?)efg>.......................\1tr>
(<hij lang=".+?">)(<td>)........\2\1
xml:lang="......................lang="


Now I've created a try.bat:

Code: Select all

@echo off
  call jrepl "(<\?)ab>|(<\?)efg>|(<hij lang=".+?">)(<td>)|xml:lang=" ^
             "\1td>|\1tr>|\2\1|lang=" /x /t "|" /f "input.txt" /o "output.txt"


I only know \1 should be $1. But I don't know what else to do anymore.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#292 Post by dbenham » 21 Apr 2017 19:37

Thanks for the compliments :D

First off, you cannot include a double quote literal in your command line find or replace strings. You need to use either \q (with /X option), or \x22.

Arc wrote: I only know \1 should be $1. But I don't know what else to do anymore.
Nope :!: :twisted:

The /T option is probably the trickiest option to use. The critical piece of information that you have not grasped is from this paragraph from the built-in documentation (jrepl /?/t)

Code: Select all

            The search expressions may be regular expressions, possibly with
            captured groups. Note that each expression is itself converted into
            a captured group behind the scene, and the operation is performed
            as a single search/replace upon execution. So backreferences within
            each regex, and $n references within each replacement expression,
            must be adjusted accordingly. The total number of expressions plus
            captured groups must not exceed 99.

The concept is hard to put into words, but there is an example within the docs that effectively demonstrates the concept:

Code: Select all

          Pig Latin - This example shows how /T can be used with regular
          expressions, and it demonstrates how the numbering of captured
          groups must be adjusted. The /T delimiter is set to a space.
 
          The first regex is captured as $1, and it matches words that begin
          with a consonant. The first captured group ($2) contains the initial
          sequence of consonants, and the second captured group ($3) contains
          the balance of the word. The corresponding replacement string moves
          $2 after $3, with a "-" in between, and appends "ay".
 
          The second regex matches any word, and it is captured as $4 because
          the prior regex ended with group $3. Because the first regex matched
          all words that begin with consonants, the only thing the second
          regex can match is a word that begins with a vowel. The replacement
          string simply adds "-yay" to the end of $4. Note that $0 could have
          been used instead of $4, and it would yield the same result.
 
            echo Can you speak Pig Latin? | jrepl^
             "\b((?:qu(?=[aeiou])|[bcdfghj-np-twxz])+)([a-z']+)\b \b[a-z']+\b"^
             "$3-$2ay $4-yay" /t " " /i
 
            -- OUTPUT --
 
            an-Cay you-yay eak-spay ig-Pay atin-Lay?
Don't forget that parenthesized groups that begin with ?: or ?= or ?! are not captured.
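
One way to picture "each expression is itself converted into a captured group" is to build the combined regex by hand. The plain JavaScript sketch below only imitates the idea; how JREPL assembles and dispatches the alternatives internally is not shown in this thread, so combined and pigLatin are illustrative names, not JREPL internals.

```javascript
// Sketch: the two Pig Latin search expressions, each wrapped in its own
// capture group and joined with | into one regex (flags: global, ignore case).
//   $1 = whole first expression    $2 = leading consonants   $3 = rest of word
//   $4 = whole second expression
const combined = /(\b((?:qu(?=[aeiou])|[bcdfghj-np-twxz])+)([a-z']+)\b)|(\b[a-z']+\b)/gi;

function pigLatin(text) {
  return text.replace(combined, (match, g1, g2, g3, g4) =>
    // Whichever outer group participated tells us which replacement applies.
    g1 !== undefined ? `${g3}-${g2}ay` : `${g4}-yay`);
}

console.log(pigLatin("Can you speak Pig Latin?"));
// an-Cay you-yay eak-spay ig-Pay atin-Lay?
```

The (?:...) and (?=...) groups inside the first expression do not get a number, which is why the second expression lands on $4 and not on a higher number.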

So now, looking at your expressions, I have the main expression number to the left, and each captured group number above:

Code: Select all

      $2
$1 = "(<\?)ab>"    -->   "$2td>"

      $4
$3 = "(<\?)efg>"  -->  "$4tr>"

      $6                  $7
$5 = "(<hij lang=\q.+?\q>)(<td>)"  -->  "$7$6"

$8 = "xml:lang=\q"  -->  "lang=\q"

I would simplify the last expression by using a look ahead expression:

Code: Select all

$8 = "xml:(?=lang=\q)"  -->  ""

So the complete command becomes

Code: Select all

call jrepl "(<\?)ab>|(<\?)efg>|(<hij lang=\q.+?\q>)(<td>)|xml:(?=lang=\q)" ^
           "$2td>|$4tr>|$7$6|" /x /t "|" /f "input.txt" /o "output.txt"

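The group numbering can be checked outside of JREPL with a small JavaScript sketch that builds the same combined regex by hand. Two assumptions are baked in. First, the sample tags use a literal backslash (as in <\ab>), so the optional backslash is written \\? here; a regex written as \? matches a literal question mark instead. Second, the double quotes that need \q on the command line are ordinary " characters inside a JavaScript regex literal. The names combinedTags and retag are made up for the example.

```javascript
// Sketch only (plain JavaScript, not JREPL). Each of the four search
// expressions is wrapped in its own group, which shifts the inner groups:
//   $1 = (<\\?)ab>            with inner $2
//   $3 = (<\\?)efg>           with inner $4
//   $5 = (<hij ...>)(<td>)    with inner $6 and $7
//   $8 = xml:(?=lang=")       (the lookahead is not captured)
const combinedTags =
  /((<\\?)ab>)|((<\\?)efg>)|((<hij lang=".+?">)(<td>))|(xml:(?=lang="))/g;

function retag(text) {
  return text.replace(combinedTags, (m, g1, g2, g3, g4, g5, g6, g7, g8) => {
    if (g1 !== undefined) return g2 + "td>";   // "$2td>"
    if (g3 !== undefined) return g4 + "tr>";   // "$4tr>"
    if (g5 !== undefined) return g7 + g6;      // "$7$6"
    return "";                                 // $8: drop "xml:"
  });
}

console.log(retag('<ab> <\\efg> <hij lang="xx"><td> xml:lang="en"'));
// <td> <\tr> <td><hij lang="xx"> lang="en"
```
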

Dave Benham

kyouniis
Posts: 2
Joined: 12 Apr 2017 15:18

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#293 Post by kyouniis » 21 Apr 2017 20:34

dbenham wrote: Argh. That should not be :?

Your search and replace strings are the same length, and you have properly used the /M option, so the size should not change.

I definitely would like to have access to the file so I can test. But I am pretty sure you cannot add an attachment that large to this site. So you will have to use some external service and provide a link in your post here. I'm familiar with dropbox, but there are lots of other options. If you prefer, you can send me a private message with the link to the file.


Dave Benham

Hi Dave, I posted a link to the file but for some reason my post didn't show up, I'll try to upload it again and send you the link through PM.

catalinnc
Posts: 39
Joined: 12 Jan 2015 11:56

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#294 Post by catalinnc » 22 Apr 2017 12:46

dbenham wrote: Argh. That should not be :?

Your search and replace strings are the same length, and you have properly used the /M option, so the size should not change.

I definitely would like to have access to the file so I can test. But I am pretty sure you cannot add an attachment that large to this site. So you will have to use some external service and provide a link in your post here. I'm familiar with dropbox, but there are lots of other options. If you prefer, you can send me a private message with the link to the file.


Dave Benham


What is the ETA for fixing this bug?

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#295 Post by dbenham » 22 Apr 2017 13:34

Through a PM, I was able to get a copy of kyouniis' source file, and I could not reproduce his problem. The script successfully modified a single byte in the binary file without changing the total length, exactly as the script was designed to do.

So as far as I know, there is no bug.

I am still working with kyouniis to try to diagnose why he is getting (or thinks he is getting) a different result.

My best guess at the moment is that either 1) some character code pages do not work properly with JREPL (and corrupt the output), or 2) the output he is looking at is not actually coming from JREPL, but from some other source. But those are truly just guesses.

I will post the final result, when (if) we figure out what is actually going on on his machine.


Dave Benham

LM459
Posts: 1
Joined: 01 May 2017 10:24

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#296 Post by LM459 » 01 May 2017 10:56

I just found out about JREPL and have downloaded it. I am having trouble trying to do the following.
I have a number of text files that contain the "|" character, where a line break should be present.
I would like to replace the "|" character with the two bytes 0D 0A (carriage return/line feed), but I am not having any luck with the formatting of the instruction.
I am attempting type oldfile.txt jrepl "\|" "\u0D0A" /X >> newfile.txt, as a test, but the result is not what I'm expecting.

Can someone please provide the proper syntax to accomplish this task.

Thanks!

Arc
Posts: 3
Joined: 21 Apr 2017 07:55

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#297 Post by Arc » 01 May 2017 13:56

LM459 wrote: I just found out about JREPL and have downloaded it. I am having trouble trying to do the following.
I have a number of text files that contain the "|" character, where a line break should be present.
I would like to replace the "|" character with the 2-character HEX value of 0D 0A (Carriage return/line break), but I am not having any luck with the formatting of the instruction.
I am attempting type oldfile.txt jrepl "\|" "\u0D0A" /X >> newfile.txt, as a test, but the result is not what I'm expecting.

Can someone please provide the proper syntax to accomplish this task.

Thanks!


Welcome to DosTips.com. I'm also a newbie :) but I can help. In such cases I would rather use the Unicode value. You can get this value from charmap.exe or on the net.

For only one file:

Code: Select all

jrepl.bat "\u007C" "\r\n" /x /m /f oldtextfile.txt /o newtextfile.txt
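
For reference, this is what those escapes mean in JavaScript string and regex syntax (a plain JavaScript sketch; pipesToLines is a made-up helper name). A \u escape takes exactly four hex digits and stands for one character, so \u0D0A is the single character U+0D0A and not the byte pair 0D 0A.

```javascript
// Sketch in plain JavaScript showing what the escapes mean.
const pipe = "\u007C";        // U+007C is the "|" character
const oneChar = "\u0D0A";     // a single character, U+0D0A
const crlf = "\r\n";          // two characters: U+000D followed by U+000A

console.log(pipe === "|");        // true
console.log(oneChar.length);      // 1
console.log(crlf.length);         // 2

// Hypothetical helper: turn every "|" into a CR LF pair.
function pipesToLines(text) {
  return text.replace(/\u007C/g, "\r\n");
}
console.log(JSON.stringify(pipesToLines("one|two|three"))); // "one\r\ntwo\r\nthree"
```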


For batch mode, save as batch.bat and run:

Code: Select all

@chcp 65001>nul
@echo off
echo.
echo Drag your txt folder!
echo.
set /p fullpath=
for /f "delims=" %%? in ('dir /b /s "%fullpath:"=%\*.txt"') do (
call jrepl.bat "\u007C" "\r\n" /x /m /f "%%?" /o "%%~dp?%%~n?_newfile.txt"
)
Last edited by Arc on 04 May 2017 11:33, edited 1 time in total.

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#298 Post by Aacini » 01 May 2017 14:54

@LM459,

You do not need Unicode characters nor hundreds of lines of code to perform a replacement as simple as this one. The two-line Batch file below (save it with a .BAT extension) does what you want using the same method as JREPL.BAT...

Code: Select all

@set @a=0 // & cscript //nologo //E:JScript "%~F0" < oldfile.txt > newfile.txt & goto :EOF

WScript.Stdout.Write(WScript.Stdin.ReadAll().replace(/\|/g,"\r\n"));

Antonio

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#299 Post by brinda » 02 May 2017 22:15

Dave,

I need an add-on to the help you gave previously in the posts linked below:

viewtopic.php?f=3&t=6044&start=90#p42892
viewtopic.php?f=3&t=6044&start=105#p42894

Actual paragraph
I couldn't believe that I could actually understand what I was reading. Using the incredible power of the human brain, according to research at Cambridge University, it doesn't matter in what order the letters in a word are, the only important thing is that the first and last letter be in the right place. The rest can be a total mess and you can read it without a problem. This is because the human mind does not read every letter by itself, but the word as a whole. Amazing, huh? Yeah and I always thought spelling was important! See if your friends can read this too!

Jumble0
I cnduo't bvleiee taht I culod aulaclty uesdtannrd waht I was rdnaieg. Unisg the icndeblire pweor of the hmuan mnid, aocdcrnig to rseecrah at Cmabrigde Uinervtisy, it dseno't mttaer in waht oderr the lterets in a wrod are, the olny irpoamtnt tihng is taht the frsit and lsat ltteer be in the rhgit pclae. The rset can be a taotl mses and you can sitll raed it whoutit a pboerlm. Tihs is bucseae the huamn mnid deos not raed ervey ltteer by istlef, but the wrod as a wlohe. Aaznmig, huh? Yaeh and I awlyas tghhuot slelinpg was ipmorantt! See if yuor fdreins can raed tihs too.

link
https://www.ecenglish.com/learnenglish/lessons/can-you-read

Jumble0 [new add on request]
Jumble0 mix criteria:
a) Only the first and last letter in a word should remain in their original positions.
b) Maintain letter capitalization if available.
c) The original position of each space should remain.

Jumble1 mix criteria:
a) Only the last letter in a word should remain in its original position.
b) Maintain letter capitalization.
c) The original position of each space should remain.

Jumble2 mirror criteria:
a) Reverse both the letter positions and the word positions, left to right. E.g. "Sri Advaita" becomes "atiavdA irS".
b) Maintain letter capitalization.
c) The original position of each space should remain.

Jumble3 reverse criteria:
a) Reverse the letter positions, left to right. The word positions remain. E.g. "Sri Advaita" becomes "irS atiavdA".
b) Maintain letter capitalization.
c) The original position of each space should remain.



A normal word list in the input text file looks like this:

Code: Select all

Sri Advaita
Bhagavad Gita
Saptaham
Maha Bali Puram




Processed list text file (normal, jumble0, jumble, mirror, reverse):
Code: Select all

Normal,jumble0,jumble,mirror,reverse

Code: Select all

Sri Advaita,Sri Avaidta,rSi vAdiata,atiavdA irS,irS atiavdA 
Bhagavad Gita,Bhvaagad Gtia,avBgahd tiGa,atiG davagahB,davagahB atiG
Saptaham,Sahaptam,tpahaaSm,mahatpaS,mahatpaS
Maha Bali Puram,Mhaa Blai Parum,ahMa lBai arPum,maruP ilaB ahaM,ahaM ilaB maruP 
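
The four transforms can be sketched in plain JavaScript as follows. This is a rough illustration that follows the sample rows above, not a finished JREPL solution: all function names are hypothetical, the shuffle uses Math.random so the jumble0 and jumble1 results differ from run to run, and since the sample rows show capitals moving together with their letters, no case adjustment is made.

```javascript
// Sketch of the four transforms, following the sample rows above.

// Fisher-Yates shuffle of an array of characters (returns a new array).
function shuffle(chars) {
  const a = chars.slice();
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

// Apply a per-word function while leaving the spaces where they are.
function mapWords(line, fn) {
  return line.split(" ").map(fn).join(" ");
}

// Jumble0: first and last letter stay, the interior is shuffled.
function jumble0(line) {
  return mapWords(line, w => w.length < 4 ? w :
    w[0] + shuffle([...w.slice(1, -1)]).join("") + w[w.length - 1]);
}

// Jumble1: only the last letter stays, everything before it is shuffled.
function jumble1(line) {
  return mapWords(line, w => w.length < 3 ? w :
    shuffle([...w.slice(0, -1)]).join("") + w[w.length - 1]);
}

// Mirror: reverse the whole line (letters and word order).
function mirror(line) {
  return [...line].reverse().join("");
}

// Reverse: reverse the letters of each word, keep the word order.
function reverseWords(line) {
  return mapWords(line, w => [...w].reverse().join(""));
}

console.log(mirror("Sri Advaita"));        // atiavdA irS
console.log(reverseWords("Sri Advaita"));  // irS atiavdA
```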

Arc
Posts: 3
Joined: 21 Apr 2017 07:55

Re: JREPL.BAT - regex text processor - successor to REPL.BAT

#300 Post by Arc » 04 May 2017 11:33

Sample.txt

Code: Select all

"

{Delete " and the two lines with above.}
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Maecenas egestas efficitur lobortis.
"

{Don't delete the two lines above.}
Vestibulum id dui nec nisi mattis tristique.
Donec pretium felis eu odio iaculis maximus.

Ok let's try...

Code: Select all

JREPL.BAT "\q\r\n\r\n" "" /INC "1:3" /m /x /f "input.txt" /o output.txt

Not working, why?
"The /INC option is incompatible with /M and /S."

/INC and /EXC are great but they cannot be used with /M. Is there any alternative?
