JREPL.BAT v8.6 - regex text processor with support for text highlighting and alternate character sets
Re: JREPL.BAT v7.1 - regex text processor now with Unicode and XRegExp support
Glad it was helpful
Steffen
Re: JREPL.BAT v7.2 - regex text processor now with Unicode and XRegExp support
Sorry guys. I discovered a stupid bug with the /T FILE ADO support - it was completely broken.
I had tested the feature, but then I made one additional small change before release of 7.1, and I forgot to apply the change to the /T FILE feature.
The /T FILE option now properly supports ADO as was originally intended with v7.0
I also improved the documentation of the new v7 features.
I've updated the prior release post to v7.2
The /X documentation describes how the \xnn escape sequence only works properly if your machine defaults to the Windows-1252 character set, or if you explicitly use ADO to read and write using the Windows-1252 character set.
I am working on a new version 7.4 that should enable \xnn to support any single byte character set. Hopefully it will not be long.
Dave Benham
Re: JREPL.BAT v7.4 - regex text processor now with Unicode and XRegExp support
Here is version 7.4 with new behavior for the /X \xnn extended ASCII escape sequence.
Prior to 7.4, \xnn was always treated as a Windows-1252 byte code. This worked great for most people in Western Europe and North and South America. But there are many others where this behavior is worthless.
Starting with v7.4, the /X \xnn sequence uses the correct local character set whenever possible. This is accomplished by creating a binary file of length 256 that contains all possible byte codes, and then letting JREPL read the file. Assuming the file is interpreted as a single byte character set, each byte is converted into a specific Unicode code point according to the rules of the character set. JREPL can then read the character at a particular offset to determine the correct mapping.
The other v7.0 features allow the character set to be selected independently for both input and output, using ADO. Version 7.4 automatically interprets the /X \xnn using the correct character set, depending on whether the sequence is in a search string (input) or a replacement string (output).
If the character set is not a fixed, single byte character set, then /X \xnn is treated as a Unicode code point.
If the binary file cannot be created for any reason, then JREPL falls back to v7.3 behavior, where /X \xnn is always treated as a Windows-1252 byte code.
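As a rough illustration of the mapping idea (not JREPL's actual JScript code), the Python sketch below decodes each of the 256 byte codes with a chosen character set and looks up the result by offset. The function name build_byte_map and the codec names cp1252 and cp437 are choices made for this example. It works in memory rather than through an XBYTES.DAT style file, and it falls back to the plain code point whenever a byte cannot be decoded on its own.

```python
def build_byte_map(charset):
    """Return 256 characters; entry n is the character that byte code n maps to."""
    mapping = []
    for code in range(256):
        try:
            # Decode one byte at a time, as a single byte character set would.
            mapping.append(bytes([code]).decode(charset))
        except (LookupError, UnicodeDecodeError):
            # Unknown charset, or a byte that is not a character on its own:
            # fall back to treating the value as a Unicode code point.
            mapping.append(chr(code))
    return mapping


cp1252 = build_byte_map("cp1252")
cp437 = build_byte_map("cp437")
print(hex(ord(cp1252[0x80])))  # 0x20ac (Euro sign in Windows-1252)
print(hex(ord(cp437[0x80])))   # 0xc7 (C with cedilla in code page 437)
```

The same byte code 0x80 lands on two different Unicode code points, which is why a fixed Windows-1252 interpretation cannot suit every machine.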
Below are the relevant documentation changes for the new v7.4 behavior. Be sure to look at the other v7.0 features, as they work well with the v7.4 changes.
Code: Select all
>jrepl /?history
2017-09-25 v7.4: Modified /X \xnn extended ASCII escape sequence to support
any single byte character set.
Added /X \x{nn,Charset} escape sequence.
Added /XBYTES and /XBYTESOFF options.
Modified decode() to support the new /X \xnn behavior.
<...truncated>
>jrepl /?/X & jrepl /?/XBYTES & jrepl /?/XBYTESOFF
/X - Preserves extended ASCII characters that may appear within
command line arguments and/or variables by first writing the
values to temporary files within the %TEMP% directory. Extended
ASCII values are byte codes >= 128 (0x80). Extended ASCII within
files, stdin, and stdout are preserved regardless.
Also enables extended escape sequences for both Search strings and
Replacement strings, with support for the following sequences:
\\ - Backslash
\b - Backspace
\c - Caret (^)
\f - Formfeed
\n - Newline
\q - Quote (")
\r - Carriage Return
\t - Horizontal Tab
\v - Vertical Tab
\xnn - Extended ASCII byte code expressed as 2 hex digits nn.
If used within a Find string, then the input character
set is used. If within a Replacement string, then the
output character set is used. If the selected character
set is invalid or not a single byte character set, then
\xnn is treated as a Unicode code point.
\x{nn,CharSet} - Same as \xnn, except explicitly uses CharSet
character set mapping.
\unnnn - Unicode code point expressed as 4 hex digits nnnn.
\u{N} - Any Unicode code point where N is 1 to 6 hex digits
JREPL automatically creates an XBYTES.DAT file containing all 256
possible byte codes. The XBYTES.DAT file is preferentially created
in "%ALLUSERSPROFILE\JREPL\" if at all possible. Otherwise the
file is created in "%TEMP%\JREPL\" instead. JREPL uses the file
to establish the correct \xnn byte code mapping for each character
set. Once created, successive runs reuse the same XBYTES.DAT file.
If the file gets corrupted, then use the /XBYTES option to force
creation of a new XBYTES.DAT file. If JREPL cannot create the file
for any reason, then JREPL defaults to using pre v7.4 behavior
where /X \xnn is interpreted as Windows-1252.
Without the /X option, only standard JSCRIPT escape sequences
\\, \b, \f, \n, \r, \t, \v, \xnn, \unnnn are available for the
search strings. And the \xnn sequence represents a unicode
code point, not extended ASCII.
Extended escape sequences are supported even when the /L option
is used. Both Search and Replace support all of the extended
escape sequences if both the /X and /L opions are combined.
Extended escape sequences are not applied to JScript code when
using any of the /Jxxx options. Use the decode() function if
extended escape sequences are needed within the code.
/XBYTES - Force creation of a new XBYTES.DAT file for use by the /X
option when decoding \xnn sequences.
/XBYTESOFF - Force JREPL to use pre v7.4 behavior where /X \xnn is
always interpreted as Windows-1252.
>jrepl /?jscript
<...truncated>
decode( String [,CharSet] )
Decodes extended escape sequences within String as defined by
the /X option, and returns the result. CharSet specifies the
single byte character set to use for \xnn escape sequences.
If CharSet is 'input', then the character set of the input is
used. If CharSet is 'output', then the character set of the
output is used. If CharSet is 'default' or undefined, then the
default character set for the machine is used. Otherwise,
CharSet should be a valid internet character set name understood
by the machine. If the selected character set is invalid or not
a single byte character set, then \xnn is treated as a Unicode
code point.
All backslashes within String must be escaped an extra time to
use this function in your code.
Examples:
quote literal: decode('\\q','output')
extended ASCII(128): decode('\\x80','output')
backslash literal: decode('\\\\','output')
This function is only needed if you use any \q, \c, or \u{N}
escape sequences, or \xnn escape sequence for extended ASCII.
<...truncated>
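To make the escape sequence table easier to picture, here is a small Python sketch that expands the sequences listed in the help text. It is only an approximation written for this thread, not the decode() implementation inside JREPL: the names SIMPLE, ESCAPE and byte_to_char are invented, the default charset of cp1252 is an assumption, and the 'input', 'output' and 'default' keywords are not modelled.

```python
import re

# Simple one-letter escapes listed in the /X help text
SIMPLE = {"\\": "\\", "b": "\b", "c": "^", "f": "\f", "n": "\n",
          "q": '"', "r": "\r", "t": "\t", "v": "\v"}

ESCAPE = re.compile(
    r"\\(?:x\{([0-9A-Fa-f]{2}),([^}]+)\}"   # \x{nn,CharSet}
    r"|x([0-9A-Fa-f]{2})"                   # \xnn
    r"|u\{([0-9A-Fa-f]{1,6})\}"             # \u{N}
    r"|u([0-9A-Fa-f]{4})"                   # \unnnn
    r"|([\\bcfnqrtv]))"                     # \\ \b \c \f \n \q \r \t \v
)


def byte_to_char(code, charset):
    """Map one byte code through charset, else treat it as a code point."""
    try:
        return bytes([code]).decode(charset)
    except (LookupError, UnicodeDecodeError):
        return chr(code)


def decode(text, charset="cp1252"):
    """Expand the extended escape sequences found in text."""
    def expand(match):
        x_cs, cs, x, u_brace, u4, simple = match.groups()
        if x_cs is not None:
            return byte_to_char(int(x_cs, 16), cs)
        if x is not None:
            return byte_to_char(int(x, 16), charset)
        if u_brace is not None:
            return chr(int(u_brace, 16))  # raises ValueError above 10FFFF
        if u4 is not None:
            return chr(int(u4, 16))
        return SIMPLE[simple]
    return ESCAPE.sub(expand, text)


print(ascii(decode(r"\q\x80\q")))      # '"\u20ac"'
print(ascii(decode(r"\x{80,cp437}")))  # '\xc7'
```

Because the alternatives are tried left to right at each position, an escaped backslash is consumed before the letter that follows it, so `\\n` stays a backslash followed by n.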
I have no plans to introduce any new features to JREPL, so barring any bugs, this should be the last version of JREPL for quite some time.
Dave Benham
Re: JREPL.BAT v7.6 - regex text processor now with Unicode and XRegExp support
dbenham wrote: (Re: v7.4) I have no plans to introduce any new features to JREPL, so barring any bugs, this should be the last version of JREPL for quite some time.
Well I guess I lied, plus I fixed a minor bug
Here is version 7.6 with new help options and a bug fix
Code: Select all
>jrepl /?history
2017-10-08 v7.6: Fixed /?Intro syntax help for /?Charset/[Query]
2017-10-08 v7.5: Added /?CHARSET and /?XREGEXP web page help options
Added /?CHARSET/[query] List character sets help option
Fixed ADO output.WriteLine() to use \r\n instead of \n
Improved documentation: /EXC, /OFF, /U, /?HELP, decode()
<truncated...>
1) Bugfix
The output.WriteLine() method used in user supplied JScript has been fixed to always use \r\n line terminators. The prior bugged version was using \n with ADO output.
2) New /?[?]CHARSET/[Query] help option
Code: Select all
>jrepl /?help
<truncated...>
/?CHARSET/[Query] - List all character set names for use with ADO I/O
that are installed on this computer. Optionally restrict
the list to names that contain Query. Wildcards * and ? may
be used within Query. The default Query is an empty string,
meaning list all available character sets. The list is
generated via reg.exe.
Examples:
jrepl /??charset/ - Paged list of all available names
jrepl /?charset/utf - List of names containing "utf"
3) New /?CHARSET web page help
Opens up a Microsoft documentation page listing code pages and their corresponding character set names.
4) New /?XREGEXP web page help
Opens up the home page for the XRegExp augmented regular expression JavaScript module.
Dave Benham
Re: JREPL.BAT v7.7 - regex text processor now with Unicode and XRegExp support
Here is JREPL.BAT version 7.7
Code: Select all
>jrepl /?history
2017-10-24 v7.7: Fixed broken Microsoft documentation links
Allow /O "-|CharSet"
Fixed decode(Str[,CharSet]) bug when CharSet is undefined
<truncated...>
1) Fixed broken documentation links
Microsoft documentation links within JREPL were recently broken and had to be fixed.
2) Allow /O "-|CharSet"
The prior version forced the output character set to match the input when the /O - option was used (no |CharSet specification allowed). Version 7.7 allows the output character set to be different when using /O -.
Code: Select all
C:\test>jrepl /?/o
/O OutFile[|CharSet]
<truncated...>
If /F InFile is also used, then an OutFile value of "-" overwrites
the original InFile with the output. A value of "-" preserves the
original character set. A value of "-|" explicitly transforms the
file into the machine default character set. A "-|CharSet" value
explicitly transforms the file into the specified character set.
The output is first written to a temporary file with the same path
and name, with .new appended. Upon completion, the temp file is
moved to replace the InFile.
<truncated...>
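The in-place overwrite described in the /O help (write to a temporary file with .new appended, then move it over the original) can be sketched in Python as follows. This is an illustration under assumptions, not JREPL code: rewrite_in_place and its parameters are invented names, and the "-|" form that selects the machine default character set is not modelled.

```python
import os


def rewrite_in_place(path, transform, in_charset, out_charset=None):
    """Rewrite a text file in place, optionally changing its character set."""
    out_charset = out_charset or in_charset   # like "-": keep the original charset
    tmp_path = path + ".new"                  # temp file next to the original
    with open(path, "r", encoding=in_charset, newline="") as src, \
         open(tmp_path, "w", encoding=out_charset, newline="") as dst:
        for line in src:
            dst.write(transform(line))
    os.replace(tmp_path, path)                # move the temp file over the original
```

Writing to a sibling file first means the original is only replaced once the whole output has been written.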
3) Fix decode(Str[,CharSet]) bug
Version 7.4 extended the decode() function to allow specification of the character set to be used with \x escape sequences. In the interest of remaining backward compatible, the CharSet argument was supposed to be optional, but versions 7.4 through 7.6 were bugged.
Version 7.7 truly makes the CharSet argument optional, as was originally intended.
Dave Benham
Re: JREPL.BAT v7.7 - regex text processor now with Unicode and XRegExp support
Hi
1. Is there a way to avoid the buggy findstr with /g parameter using JREPL? (I use /r parameter for avoiding a nasty bug even if my strings are literal, and it is very slow) Maybe jmatchq?
And \v parameter?
2. I would like to extract some strings from inside brackets and after "function" in one move:
Example.txt
Code: Select all
function = { string1 string2 string3 }
function = {
string4
}
My code is currently in two steps:
batch.bat
Code: Select all
for %%F in ("D:\folder\*.txt") do (
call JREPL "(\bfunction\s*=\s*{)([\s\S]*?)}" "$txt=$2" /jmatchq /m /x /f "%%F" >> "newfile.txt"
)
call JREPL "([A-Za-z0-9_-]+)" "$txt=$1" /jmatchq /f "newfile.txt" /o -
step1: newfile.txt
Code: Select all
string1 string2 string3
string4
step2: newfile.txt
Code: Select all
string1
string2
string3
string4
Thanks for any help!
Re: JREPL.BAT v7.7 - regex text processor now with Unicode and XRegExp support
1) There is no simple JREPL emulation of FINDSTR /G at the moment. But it is something that I have thought about in the past. I'm already working on a new JREPL release. Now that you have requested a solution, I think I will extend the /K and /R options to allow reading a set of search strings from a file. It shouldn't take long to whip up and release this new functionality (probably within 1 week).
2) Oh yes
This begins to tap into the true power of JREPL
Read up on the /T (translate) option that allows you to specify multiple independent find/replace pairs. Couple that with /JMATCHQ and /JBEG to provide a little JSCRIPT logic, and the solution is relatively simple and elegant.
Code: Select all
@echo off
>newfile.txt (
for %%F in ("D:\folder\*.txt") do (
call jrepl "\bfunction\s*=\s*{ } [A-Za-z0-9_-]+" "$txt=!(go=true) $txt=go=false $txt=go?$0:go" /t " " /jmatchq /jbeg "var go=false" /f "%%F"
)
)
I wish JSCRIPT regex supported look behind, because then the solution would be oh so simple with the /P (pre-filter) option:
Code: Select all
call jrepl "[A-Za-z0-9_-]+" "$txt=$0" /p "(?<=\bfunction\s*=\s*{)[\c}]+(?=})" /m /x /jmatchq /f "%%F"
But alas ... No look behinds, so the above does not work.
It is possible to solve this with the /P option, without /T, but then you must search for "function = {" twice, which I do not like:
Code: Select all
call jrepl "\bfunction\s*=\s*{|([A-Za-z0-9_-]+)" "$txt=$1?$1:false" /p "\bfunction\s*=\s*{[\c}]+}" /jmatchq /m /x /f "%%F"
2017-11-23 Update - New version 7.9 adds a /PREPL option that circumvents the lack of look behind support.
The solution is now as simple as:
Code: Select all
call jrepl "[A-Za-z0-9_-]+" "" /match /p "\bfunction\s*=\s*\{([\c}]+)}" /prepl "{$1}" /m /f "%%F"
Dave Benham
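For readers who want to see the pre-filter idea from the reply above outside of JREPL, here is a hedged Python sketch of the same two-stage approach: first isolate each "function = { ... }" block and keep only the captured part between the braces, then pull the individual words out of that captured text. The names SAMPLE and extract_words are invented for the illustration, and the sample text is the Example.txt content from the question.

```python
import re

SAMPLE = """function = { string1 string2 string3 }
function = {
string4
}
"""


def extract_words(text):
    """Return every word found between the braces of a 'function = {...}' block."""
    words = []
    # Stage 1: the pre-filter keeps only the captured group inside the braces.
    for block in re.finditer(r"\bfunction\s*=\s*\{([^}]+)\}", text):
        # Stage 2: the real search runs only on the captured text.
        words.extend(re.findall(r"[A-Za-z0-9_-]+", block.group(1)))
    return words


print(extract_words(SAMPLE))  # ['string1', 'string2', 'string3', 'string4']
```

Capturing the inner text in a group does the job a look behind would otherwise do, so the word "function" itself never reaches the second search.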
Re: JREPL.BAT v7.8 - regex text processor now with Unicode and XRegExp support
Here is JREPL.BAT version 7.8
Code: Select all
Prompt>jrepl /?history
2017-11-13 v7.8: Added \x{nn-mm} and \x{nn-mm,CharSet} escape sequences
Split /X into /XFILE and /XSEQ - /X implies both
Add :FILE syntax for /K and /R to load searches from file
Fixed /XSEQ escaped backslash bug with /INC, /EXC, AND /P
<truncated...>
1) Add :FILE syntax for /K and /R to load searches from file
This feature satisfies a request from zimxavier to emulate the FINDSTR /G option, but without the nasty FINDSTR bugs.
Code: Select all
Prompt>jrepl /?/k & jrepl /?/r
/K PreContext:PostContext[:FILE]
/K Context[:FILE]
Keep matches - Search and write out lines that contain at least
one match, without doing any replacement. The Replace argument is
still required, but is ignored.
The integers PreContext and PostContext specify how many non-
matching lines to write before the match, and after the match,
respectively. If a single Context integer is given, then the same
number of non-matching lines are written before and after.
A Context of 0 writes only matching lines.
If :FILE is appended to the context, then the Search parameter
specifies a file containing one or more search terms, one term
per line. A line matches if any of the search terms are found
witin the line. The file can be opened via ADO if |CharSet
(internet character set name) is appended to the file name.
Note: the /V option does not apply to Search if /K :FILE is used.
/K is incompatible with /A, /J, /JQ, /JMATCH, /JMATCHQ, /M,
/MATCH, /R, /S, and /T.
/R PreContext:PostContext[:FILE]
/R Context[:FILE]
Reject matches - Search and write out lines that do not contain
any matches, without doing any replacement. The Replace argument
is still required, but is ignored.
The integers PreContext and PostContext specify how many matching
lines to write before the non-match, and after the non-match,
respectively. If a single Context integer is given, then the same
number of matching lines are written before and after.
A Context of 0 writes only non-matching lines.
If :FILE is appended to the context, then the Search parameter
specifies a file containing one or more search terms, one term
per line. A line is rejected if any of the search terms are found
witin the line. The file can be opened via ADO if |CharSet
(internet character set name) is appended to the file name.
Note: the /V option does not apply to Search if /K :FILE is used.
/R is incomptaible with /A, /J, /JQ, /JMATCH, /JMATCHQ, /K, /M,
/MATCH, /S, and /T.
Example - List all lines from input.txt that match at least one string found in file search.txt.
You might try the following with FINDSTR, but it could give the wrong results due to this bug
Code: Select all
findstr /L /G:search.txt input.txt
You can get the correct result using JREPL as follows:
Code: Select all
call jrepl search.txt "" /L /K 0:file /F input.txt
2) Added \x{nn-mm} and \x{nn-mm,CharSet} escape sequences
3) Split /X into /XFILE and /XSEQ - /X implies both
Code: Select all
D:\test>jrepl /?/x & jrepl /?/xfile & jrepl /?/xseq
/X - Shorthand for combined /XFILE and /XSEQ.
/XFILE - Preserves extended ASCII characters that may appear within
command line arguments and/or variables by first writing the
values to temporary files within the %TEMP% directory. Extended
ASCII values are byte codes >= 128 (0x80). This option is ignored
(no temporary files written) if /UTF is also used.
Temporary files may be needed when the cmd.exe active code page
does not match the default character set used by the CSCRIPT
JSCRIPT engine.
/XSEQ - Enables extended escape sequences for both Search strings and
Replacement strings, with support for the following sequences:
\\ - Backslash
\b - Backspace
\c - Caret (^)
\f - Formfeed
\n - Newline
\q - Quote (")
\r - Carriage Return
\t - Horizontal Tab
\v - Vertical Tab
\xnn - Extended ASCII byte code expressed as 2 hex digits nn.
The code is mapped to the correct Unicode code point,
depending on the chosen character set. If used within
a Find string, then the input character set is used. If
within a Replacement string, then the output character
set is used. If the selected character set is invalid or
not a single byte character set, then \xnn is treated as
a Unicode code point. Note that extended ASCII character
class ranges like [\xnn-\xnn] should not be used because
the intended range likely does not map to a contiguous
set of Unicode code points - use [\x{nn-mm}] instead.
\x{nn-mm} - A range of extended ASCII byte codes for use within
a regular expression character class expression. The
The min value nn and max value mm are expressed as hex
digits. The range is automatically expanded into the
full set of mapped Unicode code points. The character
set mapping rules are the same as for \xnn.
\x{nn,CharSet} - Same as \xnn, except explicitly uses CharSet
character set mapping.
\x{nn-mm,CharSet} - Same as \x{nn-mm}, except explicitly uses
CharSet character set mapping.
\unnnn - Unicode code point expressed as 4 hex digits nnnn.
\u{N} - Any Unicode code point where N is 1 to 6 hex digits
JREPL automatically creates an XBYTES.DAT file containing all 256
possible byte codes. The XBYTES.DAT file is preferentially created
in "%ALLUSERSPROFILE\JREPL\" if at all possible. Otherwise the
file is created in "%TEMP%\JREPL\" instead. JREPL uses the file
to establish the correct \xnn byte code mapping for each character
set. Once created, successive runs reuse the same XBYTES.DAT file.
If the file gets corrupted, then use the /XBYTES option to force
creation of a new XBYTES.DAT file. If JREPL cannot create the file
for any reason, then JREPL defaults to using pre v7.4 behavior
where /XSEQ \xnn is interpreted as Windows-1252.
Without the /XSEQ option, only standard JSCRIPT escape sequences
\\, \b, \f, \n, \r, \t, \v, \xnn, \unnnn are available for the
search strings. And the \xnn sequence represents a unicode
code point, not extended ASCII.
Extended escape sequences are supported even when the /L option
is used. Both Search and Replace support all of the extended
escape sequences if both the /XSEQ and /L opions are combined.
Extended escape sequences are not applied to JScript code when
using any of the /Jxxx options. Use the decode() function if
extended escape sequences are needed within the code.
4) Fixed /XSEQ escaped backslash bug with /INC, /EXC, AND /P
Prior to v7.8, the /INC, /EXC, and /P options could give the wrong result if the regular expression contained a backslash literal and the /X option was used.
For example, the following command:
Code: Select all
jrepl "some search" "some replace" /x /inc "/\\n/" /f input.xt
is supposed to include lines that contain the following literal string "\n".
But the prior bugged versions would mistakenly treat the resultant "\n" as an escape sequence, and would attempt to include lines that contain a newline instead.
Version 7.8 fixes the bug and gives the correct behavior.
Dave Benham
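The warning in the /XSEQ help about character class ranges can be checked with a short Python sketch. It assumes code page 437 (Python codec name cp437) purely as an example of a single byte character set, and the helper name expand_byte_range is invented here. It is not how JREPL implements \x{nn-mm}, only a demonstration of why a contiguous byte range does not have to be a contiguous range of code points.

```python
import re


def expand_byte_range(lo, hi, charset):
    """Map every byte code in lo..hi to its character in the given charset."""
    return bytes(range(lo, hi + 1)).decode(charset)


chars = expand_byte_range(0x80, 0x83, "cp437")
print([hex(ord(c)) for c in chars])  # ['0xc7', '0xfc', '0xe9', '0xe2']

# Character class built from the expanded set, like [\x{80-83}]
char_class = re.compile("[" + re.escape(chars) + "]+")
# Naive class spanning the code points of the first and last byte only
naive = re.compile("[\u00c7-\u00e2]+")

print(bool(char_class.fullmatch("\u00fc\u00e9")))  # True
print(bool(naive.fullmatch("\u00fc")))             # False
```

Byte 0x81 maps to U+00FC, which lies outside the naive U+00C7 to U+00E2 range, so only the expanded set matches all four characters.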
Re: JREPL.BAT v7.8 - regex text processor now with Unicode and XRegExp support
dbenham wrote: 1) There is no simple JREPL emulation of FINDSTR /G at the moment. But it is something that I have thought about in the past. I'm already working on a new JREPL release. Now that you have requested a solution, I think I will extend the /K and /R options to allow reading a set of search strings from a file. It shouldn't take long to whip up and release this new functionality (probably within 1 week).
Read up on the /T (translate) option that allows you to specify multiple independent find/replace pairs. Couple that with /JMATCHQ and /JBEG to provide a little JSCRIPT logic, and the solution is relatively simple and elegant.
Code: Select all
@echo off
>newfile.txt (
for %%F in ("D:\folder\*.txt") do (
call jrepl "\bfunction\s*=\s*{ } [A-Za-z0-9_-]+" "$txt=!(go=true) $txt=go=false $txt=go?$0:go" /t " " /jmatchq /jbeg "var go=false" /f "%%F"
)
)
Dave Benham
Wow! It works fine. I need to study the relatively simple solution though.
-----------------------------------------------------------------------------------------------------
About JREPL.BAT version 7.8... Wow x2
No issue with the result, it works as expected. I currently have a huge performance loss though.
Code: Select all
call jrepl search.txt "" /L /K 0:file /F input.txt > output.txt
is a lot slower than
Code: Select all
findstr /b /L /g:search.txt input.txt > output.txt
(with or without B option)
Time - findstr: 52 files in 40 seconds
Time - jrepl: Not even 3 files in 5 minutes (I cancelled)
I hope I miss nothing. Maybe related to the size of my search.txt (67 000 lines)?
Can I emulate B option? And X?
Re: JREPL.BAT v7.8 - regex text processor now with Unicode and XRegExp support
zimxavier wrote: About JREPL.BAT version 7.8... Wow x2
No issue with the result, it works as expected. I currently have a huge performance loss though.
Code: Select all
call jrepl search.txt "" /L /K 0:file /F input.txt > output.txt
is a lot slower than
Code: Select all
findstr /b /L /g:search.txt input.txt > output.txt
(with or without B option)
Time - findstr: 52 files in 40 seconds
Time - jrepl: Not even 3 files in 5 minutes (I cancelled)
I hope I miss nothing. Maybe related to the size of my search.txt (67 000 lines)?
Certainly 67,000 lines in your search file is going to take time - I'm a bit surprised that JREPL works at all with that many search terms.
Since FINDSTR /G has a bug that can lead to missed matches, the FINDSTR timing is a bit pointless. Perhaps if it gave the correct answer, it would be slower.
That being said, your FINDSTR command uses the /B option, which you did not use with JREPL. Looking anywhere within a line for a string is very computationally expensive compared to restricting matches to the beginning of a line.
zimxavier wrote: Can I emulate B option? And X?
Have you tried to use the built in help?
You can use the following to get a description of all available help options:
Code: Select all
Prompt>jrepl /?help
Help is available by supplying a single argument beginning with /? or /??:
/? - Writes all available help to stdout.
/?? - Same as /? except uses MORE for pagination.
/?Topic - Writes help about the specified topic to stdout.
Valid topics are:
INTRO - Basic syntax and default behavior
OPTIONS - Brief summary of all options
JSCRIPT - JREPL objects available to user JScript
RETURN - All possible return codes
VERSION - Display the version of JREPL.BAT
HISTORY - A summary of all releases
HELP - Lists all methods of getting help
Example: List a summary of all available options
jrepl /?options
/?WebTopic - Opens up a web page within your browser about a topic.
Valid web topics are:
REGEX - Microsoft regular expression documentation
REPLACE - Microsoft Replace method documentation
UPDATE - DosTips release page for JREPL.BAT
CHARSET - List of possible character set names for ADO I/O
Some character sets may not be installed
XREGEXP - xRegExp.com home page (extended regex docs)
/?/Option - Writes detailed help about the specified /Option to stdout.
Example: Display paged help about the /T option
jrepl /??/t
/?CHARSET/[Query] - List all character set names for use with ADO I/O
that are installed on this computer. Optionally restrict
the list to names that contain Query. Wildcards * and ? may
be used within Query. The default Query is an empty string,
meaning list all available character sets. The list is
generated via reg.exe.
Examples:
jrepl /??charset/ - Paged list of all available names
jrepl /?charset/utf - List of names containing "utf"
Code: Select all
Prompt>jrepl /?options
Options: Behavior may be altered by appending one or more options.
The option names are case insensitive, and may appear in any order
after the Replace argument.
/A - write Altered lines only
/APP - Append results to the output file
/B - match Beginning of line
/C - Count number of source lines
/D - Delimiter for /N and /OFF
/E - match End of line
/EXC BlockList - EXClude lines from selected blocks
/F InFile[|CharSet] - read input from a File
/I - Ignore case
/INC BlockList - INClude lines from selected blocks
/J - JScript replace expressions
/JBEG InitCode - initialization JScript code
/JBEGLN NewLineCode - line initialization JScript code
/JEND FinalCode - finalization JScript code
/JENDLN EndLineCode - line finalization JScript code
/JLIB FileList - load file(s) of initialization code
/JMATCH - write matching JScript replacements only
/JMATCHQ - new Quick form of /JMATCH
/JQ - new Quick form of /J
/K Context or Pre:Post - search and Keep lines that match
/L - Literal search
/M - Multi-line mode
/MATCH - Search and print each match, one per line
/N MinWidth - prefix output with liNe numbers
/O OutFile[|CharSet] - write Output to a file
/OFF MinWidth - add char OFFsets to /K, /JMATCHQ, /MATCH output
/P Regex - only search/replace strings that match a Regex
/PFLAG Flags - set the /P regex Flags to "g", "gi", "", or "i"
/R Context or Pre:Post - search and Reject lines that match
/RTN ReturnVar[:Line#] - Return result in a variable
/S VarName - Source is read from a variable
/T DelimChar or FILE - Translate multiple search/replace pairs
/TFLAG Flags - Specify XRegExp flags for use with /T
/U - Unix line terminators (\n instead of \r\n)
/UTF - All input and output as UTF-16LE (BOM optional)
/V - use Variables for Search/Replace and code
/X - enable eXtended ASCII and escape sequences
/XBYTES - force creation of new XBYTES.DAT
/XBYTESOFF - force all \xnn to be treated as Windows-1252
/XREG FileList - adds XRegExp support to JREPL
Code: Select all
Prompt> jrepl /?/b
/B - The Search must match the Beginning of a line.
Mostly used with literal searches.
If you add /B to your JREPL /K 0:FILE search, then performance may be significantly better.
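The difference between searching anywhere in a line and anchoring at the beginning can be pictured with a small Python sketch. It only mirrors the idea of keeping lines that match any literal term from a list and makes no claim about how JREPL or FINDSTR do it internally; keep_matching and the sample data are invented for the example.

```python
def keep_matching(lines, terms, beginning=False):
    """Keep lines containing any literal term; optionally only at line start."""
    terms = tuple(t for t in terms if t)  # ignore empty search terms
    if beginning:
        # Anchored: each term is compared at one position only.
        return [line for line in lines if line.startswith(terms)]
    # Unanchored: each term may have to be tried at every offset in the line.
    return [line for line in lines if any(t in line for t in terms)]


lines = ["alpha one", "beta two", "two alpha", "gamma"]
terms = ["alpha", "gamma"]
print(keep_matching(lines, terms))                  # ['alpha one', 'two alpha', 'gamma']
print(keep_matching(lines, terms, beginning=True))  # ['alpha one', 'gamma']
```

The two modes also return different results, so a timing comparison is only fair when both tools are given the same anchoring.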
Dave Benham
Re: JREPL.BAT v7.9 - regex text processor now with Unicode and XRegExp support
Here is JREPL.BAT v7.9
Summary of changes:
1) Allow escape sequences with /T "" coupled with /XSEQ
Example - Use /T "" and /X coupled with \x{nn-mm} escape sequences ( introduced in v7.8 ) to perform ROT13 obfuscation (rotation cipher)
The escape sequences are interpreted and expanded into the following strings before the /T "" option splits the strings:
After the split, we get the expected find/replace pairs: A -> N, B -> O, C -> P, etc.
2) Added /PREPL option to augment /P behavior
Prior to v7.9, the /P option had limited use because JSCRIPT regular expressions do not support look behind expressions.
The new /PREPL option circumvents that limitation by passing only a captured group within a matched /P filter.
Here is modified documentation for /P, as well as documentation for the new /PREPL option - Be sure to look at the two examples embedded within the docs:
3) Bug fix - Force /L when /T "" used, as per documentation
Using /T "" is supposed to implicitly set the /L option, but this feature was broken in some prior release.
Version 7.9 restores this behavior.
4) Bug fix - Allow /?charset/search to include non alpha
The ability to search and display installed character sets via /?charset/search was introduced in v7.5,
but there was a bug that only allowed alphabetic characters in the search.
Searching by a number like jrepl /?charset/1252 would result in an erroneous Invalid /? option error message.
Version 7.9 fixes the bug such that all characters can now be included in the search.
Dave Benham
Code: Select all
prompt>jrepl /??history
2017-11-23 v7.9: Allow escape sequences with /T "" coupled with /XSEQ
Added /PREPL option to augment /P behavior
Bug fix - Force /L when /T "" used, as per documentation
Bug fix - Allow /?charset/search to include non alpha
<truncated...>
Code: Select all
prompt>jrepl /?/t
/T DelimiterChar
/T FILE
The /T option is very similar to the Oracle Translate() function,
or the unix tr command, or the sed y command.
The Search represents a set of search expressions, and Replace
is a like sized set of replacement expressions. Expressions are
delimited by DelimiterChar (a single character). If DelimiterChar
is an empty string, then each character is treated as its own
expression. The /L option is implicitly set if DelimiterChar is
empty. Normally escape sequences are interpreted after the search
and replace strings are split into expressions. But if the
DelimiterChar is empty and /XSEQ is used, then escape sequences
are interpreted prior to the split at every character.
<truncated...>
Code: Select all
prompt>echo Goodbye Cruel World! | jrepl "\x{41-5a}\x{61-7a}" "\x{4e-5a}\x{41-4d}\x{6e-7a}\x{61-6d}" /t "" /x
Tbbqolr Pehry Jbeyq!
Code: Select all
find=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
repl=NOPQRSTUVWXYZABCDEFGHIJKLMnopqrstuvwxyzabcdefghijklm
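As a rough illustration of the expand-then-split order described above, here is a minimal JavaScript sketch. It is not JREPL's actual code: the helper names expandRanges and translate are hypothetical, and it assumes each \x{nn-mm} sequence denotes the inclusive range of character codes nn through mm, which is what the find/repl strings above suggest.

```javascript
// Hypothetical helper: expand each \x{nn-mm} sequence into the characters
// with codes nn through mm (inclusive, hexadecimal).
function expandRanges(str) {
  return str.replace(/\\x\{([0-9a-f]{2})-([0-9a-f]{2})\}/gi, function (m, lo, hi) {
    var out = "";
    for (var code = parseInt(lo, 16); code <= parseInt(hi, 16); code++) {
      out += String.fromCharCode(code);
    }
    return out;
  });
}

// Hypothetical helper: split both strings at every character (the empty
// delimiter case) and replace each find character with its counterpart.
function translate(input, find, repl) {
  var map = {};
  for (var i = 0; i < find.length; i++) {
    map[find.charAt(i)] = repl.charAt(i);
  }
  return input.split("").map(function (ch) {
    return Object.prototype.hasOwnProperty.call(map, ch) ? map[ch] : ch;
  }).join("");
}

// Expansion happens first, so the split sees plain alphabet strings.
var find = expandRanges("\\x{41-5a}\\x{61-7a}");
var repl = expandRanges("\\x{4e-5a}\\x{41-4d}\\x{6e-7a}\\x{61-6d}");
console.log(translate("Goodbye Cruel World!", find, repl)); // Tbbqolr Pehry Jbeyq!
```

If the split happened before the expansion, each character of the escape sequence itself (the backslash, x, braces, and digits) would become its own find expression, which is why the order matters here.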
Code: Select all
prompt>jrepl /?/p & jrepl /?/prepl
/P FilterRegex
Only Search/Replace strings that match the Pre-filter regular
expression FilterRegex. All escape sequences defined by /XSEQ are
available to FilterRegex, even if /XSEQ has not been set.
FilterRegex is a global, case sensitive search by default.
The behavior may be changed via the /PFLAG option.
By default, /P passes the entire matched filter string to the
main Search/Replace routine. If your FilterRegex includes captured
groups, then you can add the /PREPL option to selectively pass one
or more captured groups instead.
The /P option ignores /I, but honors /M.
The /P option may be combined with /INC and/or /EXC, in which case
/P is applied after lines have been included and/or excluded.
From the standpoint of the main "Search" argument, ^ matches the
beginning of the matched filter, and $ matches the end of the
matched filter.
Example - Substitute X for each character within curly braces,
including the braces.
echo abc{xyz}def|jrepl . X /p "{.*?}"
result:
abcXXXXXdef
See /PREPL for an example showing how to preserve the enclosing
braces.
/PREPL FilterReplaceCode
Specify a JScript expression FilterReplaceCode that controls
what portion of the /P Pre-filter match is passed on to the main
Search/Replace routine, and what portion is preserved as-is.
The expression is mostly standard JScript, and should evaluate to
a string value. $0 is the entire Pre-filter match, and $1 through
$N are the captured groups. The only non-standard syntax is the
use of curly braces to indicate what string expression gets passed
on to the main Search/Replace. Prior to executing the /P filter,
each brace expression within /PREPL is transformed as follows:
{Expression} --> (Expression).replace(Search,Replace)
Any JScript is allowed within /PREPL, except string literals
should not contain $, {, or }.
Using /P without /PREPL is the same as using /P with /PREPL "{$0}"
/PREPL cannot be used with /K or /R.
Note that neither /V nor /XFILE apply to /PREPL.
Example - Substitute X for each character within curly braces,
excluding the braces.
echo abc{xyz}def|jrepl . X /p "({)(.*?)(})" /prepl "$1+{$2}+$3"
result:
abc{XXX}def
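The interaction between /P and /PREPL can be approximated in plain JavaScript. This is only a sketch of the documented behavior, not JREPL's implementation: filterReplace is a hypothetical function, and the prepl callback stands in for the brace syntax, with sub(x) playing the role of {x}.

```javascript
// Hypothetical model of /P with an optional /PREPL.
//   filter  - the /P FilterRegex (global)
//   search  - the main Search regex (global)
//   replace - the main Replace string
//   prepl   - optional callback(groups, sub): groups[0] is $0, groups[1] is $1, ...
//             sub(x) corresponds to {x}, i.e. (x).replace(search, replace)
function filterReplace(input, filter, search, replace, prepl) {
  var sub = function (s) { return s.replace(search, replace); };
  return input.replace(filter, function () {
    // Callback arguments are: match, captured groups..., offset, whole string.
    var groups = Array.prototype.slice.call(arguments, 0, -2);
    // Without prepl, behave like /PREPL "{$0}": pass the whole match on.
    return prepl ? prepl(groups, sub) : sub(groups[0]);
  });
}

// Models: jrepl . X /p "{.*?}"
console.log(filterReplace("abc{xyz}def", /\{.*?\}/g, /./g, "X"));
// abcXXXXXdef

// Models: jrepl . X /p "({)(.*?)(})" /prepl "$1+{$2}+$3"
console.log(filterReplace("abc{xyz}def", /(\{)(.*?)(\})/g, /./g, "X",
  function (g, sub) { return g[1] + sub(g[2]) + g[3]; }));
// abc{XXX}def
```

Text outside the filter match is never touched, and within the match only the parts routed through sub are rewritten. That is how the second call preserves the braces.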
Re: JREPL.BAT v7.9 - regex text processor now with Unicode and XRegExp support
Dave, could you please help me write a script using JREPL for the following change?
In the following property file, I need to change the value of Environment to UAT1-v6.19.15.0
I do not know the current value of Environment.
Code: Select all
URL=http://UKGSWTOWB12:
WEBSERVER=UKGSWTOWB12
WPORT=8003
PORT=8003
DEBUG=true
SS=false
UTC=false
InitServlet=/war_Servlet
Package=com.tcs.bancs
AccessVerifierRequired=true
MCAppDataBaseType=Oracle
NCSContext=NCSWeb
NoOfTabs=10
longDateFormat=false
#MasterCraftVector.SEC.AppendByValue=false
ejb.AM.Local=Y
ejb.FA.Local=Y
ejb.CR.Local=Y
ejb.CM.Local=Y
ejb.AN.Local=Y
ejb.IF.Local=Y
ejb.es.Local=Y
MCAppServer=weblogic
Environment=Training-v6.19.13.0
DisplayedPageTitle = TCS BαNCS
MasterCraftDateTime.corearch.BaseTZ=GMT
#################bancs.system.otherContextURL=http://ArchivalIntranetHost:ArchivalIntranetPort/Bancs
bancs.system.isarchival=NO
#entry to implement CA parser
WorkItemParserImplClass=com.tcs.bfsarch.workitem.BancsWorkItemParserImpl
##################bancs.system.otherContextURL=http://172.19.102.115:7001/Bancs
#This property has to be set "no" to stop archive of report when generated by batch
BatchReportArchive=yes
# These properties have to be specified for sending mails.
BaNCSAccLocked=<<correcsponding template id to be given>>
BaNCSPassReset=<<correcsponding template id to be given>>
BaNCSPassGen=<<correcsponding template id to be given>>
# Keberos Properties
# This is the separator in Active Directory.
KeyValueSeprator=:
# LDAP_PARTS should be 3 for IBM JRE and 2 for SUN JRE
LDAP_PARTS=3
# Date format of whenChanged date from Active Directory , specify in lowercase.
# year in yyyy , month in mm , date in dd , hours in hh , minutes in mm , seconds in ss
noOfRecToExport=100
#This property is used to show/hide keyopad for entering password
VirtualKeypad_Password_Req = no
#Property to enable table header sorting
SortingRequired = yes
#Property to display confirmation box or alert box to user while doing data export
isConfirmationReq=yes
#Property to enable Non admin Users to view all generated reports
ReportViewAllUser=yes
ReflectionUtilClass=com.tcs.mastercraft.mctype.CachedReflectionUtil
#This property is used to show ammount and currency seperately in ADSL and GL windows
ExportToExcelAmtCurSep=yes
#This property is to be set 'yes' for the Product Logo to appear in the center
#Default value is yes
CenterLogo_Req = yes
###########Security Log Configuration start#############
# This property is used enable/disable (YES/NO) the HeartBeat message .
HeartBeatMsgReq=YES
#Maximum length for the security log
SecLogMaxLength=1024
#Incase the length of security log is more than SecLogMaxLength property then Message body ( $MessageText will be truncated ).So we can user MessageAppender to mention that message is truncated and the appender will be added at the end of MessageText.
MessageAppender=...(cont...)
#This is used to seperate the multiple error messages in the MessageText.
MessageSeparator=,
#The Bellow properties used to escape some char from the MessageText.
#To enable escape char feature in MessageText. This replaces the regular exp 'MessageEscapeRegex' with 'MessageEscapeReplacement' property value(regex,replacement). EscapecChar need to be taken care.
MessageEscapeReq=YES
MessageEscapeRegex==
MessageEscapeReplacement=\\\\=
#This property is used to specify the missing properties from the security log. If any properties are missing the value set for this 'UnkownPropertyValue' will appear in the message, which indicate property is missing.
#If this feature is not needed, please delete the value part <UnkownProp>.
UnkownPropertyValue=<UnkownProp>
#We can define these two properties for what message needs to be printed for success/failure state of a event.
ServiceState_Success=Success
ServiceState_Failure=Failure
#To specify the default severity.
DefaultSeverity=5
#To specify the default EffectedUserID.
DefaultEffectedUserID=
#These are the property to specify the signature id , signature name , service description and severity for each Actions.
#HeartBeat Message
SignatureID_HeartBeat=HB01
SignatureName_HeartBeat=HeartBeat
Severity_HeartBeat=0
#Login/Logoff/ChangePassword/UserAccountLocked events
SignatureName_LoginSuccess=Successful User Login
SignatureName_LoginFailure=User Login Failure
ServiceDesc_Login=LoginAction.Login
SignatureID_LoginSuccess=LL01
SignatureID_LoginFailure=LL03
Severity_Login=0
#These are the property to specify the signature id , signature name and severity for each service. User can add there own services for that in FUNCTION TABLE set diagonestic level 4 and add respective properties
#Format for SignatureName : SignatureName_<ServiceID/FuncID>
#Format for SignatureID : SignatureID_<ServiceID/FuncID>_<0/4>. 0 for success and 4 for failure
#Format for Severity : Severity_<ServiceID/FuncID>
SignatureName_226=CreateUser
SignatureID_226_0=SA0101
SignatureID_226_4=SA0102
Severity_226=0
SignatureName_229=UnlockUser
SignatureID_229_0=SA1901
SignatureID_229_4=SA1902
Severity_229=0
#Set this to as provide to use third party encryption algo apart from JCE . Eg - org.bouncycastle.jce.provider.BouncyCastleProvider is used for bouncy castle.
provider=
#Environment variable containing the key
BANCS_ENC_LOC=
#This property has to set as "YES" to make access flag condition visible in Create User Screen otherwise set as "NO"
ACCESS_FLG_REQD=NO
#The below property is used to show or hide the splash screen
#yes: Splash screen appears
#no : Splash screen does not appear
SplashScreen_Req=yes
Re: JREPL.BAT v7.9 - regex text processor now with Unicode and XRegExp support
@naraen87
I must say it is disappointing that you haven't demonstrated any effort at understanding regular expressions, or at studying the countless existing examples of JREPL usage. Your situation is about as simple a use case as there is.
Imagine you are the program that must change the value - how would you do it? What would you look for?
So you don't know what the current environment value is... What do you know about the line?
You should be able to quickly figure this out on your own if you expend just a little effort.
If you still cannot figure this out, but you demonstrate you have made an honest effort to solve this, then I can give you the simple answer.
Dave Benham
Re: JREPL.BAT v7.9 - regex text processor now with Unicode and XRegExp support
You would not even need to use JREPL to accomplish this task. A FOR /F command with the tokens and delims options would do the trick. Just my two cents.
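The split-at-the-delimiter idea can be sketched as follows. The sketch is in JavaScript rather than batch, to stay close to the JScript that JREPL itself runs on. setProperty is a hypothetical helper, and it assumes that the first = on a line separates the key from the value.

```javascript
// Hypothetical helper: rewrite the value of one key in "key=value" text.
// Each line is split at its first "="; only the line whose key equals
// name is changed, and line terminators are left untouched.
function setProperty(text, name, value) {
  return text.replace(/^([^=\r\n]*)=(.*)$/gm, function (line, key) {
    var trimmedKey = key.replace(/^\s+|\s+$/g, "");
    return trimmedKey === name ? key + "=" + value : line;
  });
}

var sample = "MCAppServer=weblogic\r\n" +
             "Environment=Training-v6.19.13.0\r\n" +
             "#entry to implement CA parser\r\n";
console.log(setProperty(sample, "Environment", "UAT1-v6.19.15.0"));
```

The current value never has to be known, because the line is identified by its key alone. Comment lines are left alone as long as the text before their first = differs from the key being set.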