Here is version 7.3 - A major new release with Unicode and XRegExp support
- JREPL7.3.zip
- Version 7.3 was downloaded 23 times from the main release page over 2 days while it was the current version.
- (22.46 KiB) Downloaded 816 times
The bugged v7.1 was downloaded 172 times in 15 days.
Rather than write a new summary of the changes, I will post the relevant built-in help text to catalog the enhancements.
Summary of changes
>jrepl /?history
2017-09-23 v7.3: Fixed /O - support for ADO input.
2017-09-23 v7.2: Improved documentation of new 7.0 features.
Bug fix - /T FILE ADO support was broken
2017-09-08 v7.1: Bug fix - v7.0 failed if Find or Replace contained )
2017-09-08 v7.0: Added /XREG and /TFLAG for XRegExp regex support.
Added /UTF for UTF-16LE support.
Added /X support for the \u{N} unicode escape sequence.
Added |CharSet syntax for file names to allow reading
and writing via ADO with a specified character set.
Exposed the fso FileSystemObject to user JScript.
Augmented openOutput for Unicode and ADO support.
... <truncated>
Native Unicode UTF-16 Little Endian (UTF-16LE) support for input and output
>jrepl /?/utf
/UTF - All input and output encodings are Unicode UTF-16 Little
Endian (UTF-16LE). This includes stdin and stdout. The only
exceptions are /JLIB and /XREG files, which are still read
as ASCII.
The \xFF\xFE BOM is optional for input.
Output files will automatically have the \xFF\xFE BOM inserted.
But stdout will not have the BOM.
Extended ASCII escape sequences (\x80 - \xFF) should not be used
with /UTF combined with /X.
Regular expression support of Unicode can be improved by using
the /XREG option.
Variable values are no longer written to temporary files when
/X is used if /UTF is also used.
Unfortunately, /UTF is incompatible with /RTN.
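To make the BOM and byte order described above concrete, here is a minimal sketch of UTF-16LE encoding and decoding in plain JavaScript. The function names `encodeUtf16le` and `decodeUtf16le` are hypothetical and only illustrate the encoding itself; this is not how JREPL reads or writes files.

```javascript
// Hypothetical helper (not part of JREPL): encode a string as UTF-16LE bytes.
// When withBom is truthy, the \xFF\xFE byte order mark is written first.
function encodeUtf16le(text, withBom) {
  var bytes = withBom ? [0xFF, 0xFE] : [];
  for (var i = 0; i < text.length; i++) {
    var unit = text.charCodeAt(i); // one 16-bit code unit
    bytes.push(unit & 0xFF);       // low byte first (little endian)
    bytes.push(unit >> 8);         // high byte second
  }
  return bytes;
}

// Hypothetical helper: decode UTF-16LE bytes, treating the BOM as optional.
function decodeUtf16le(bytes) {
  var hasBom = bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE;
  var text = "";
  for (var i = hasBom ? 2 : 0; i + 1 < bytes.length; i += 2) {
    text += String.fromCharCode(bytes[i] | (bytes[i + 1] << 8));
  }
  return text;
}

console.log(encodeUtf16le("A", true)); // [ 255, 254, 65, 0 ]
```

The decoder accepts input with or without the BOM, mirroring the "BOM is optional for input" rule in the help text.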
Read and write files using virtually any character set (including UTF-8) via ADO
A list of valid character set names and their corresponding code page can be found at
https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx
>jrepl /?/f & jrepl /?/o & jrepl /?/t
/F InFile[|CharSet]
Input is read from file InFile instead of stdin.
-> If |CharSet (internet character set name) is appended to InFile,
-> then the file is opened via ADO using the specified CharSet value.
-> JREPL still recognizes both \n and \r\n as input line terminators
-> when using ADO. Both ADO and the CharSet must be available on the
-> local system.
/O OutFile[|CharSet]
Output is written to file OutFile instead of stdout. Any existing
OutFile is overwritten unless the /APP option is also used.
-> If |CharSet (internet character set name) is appended to OutFile,
-> then the file is opened via ADO using the specified CharSet value.
-> The output line terminator still defaults to \r\n when using ADO,
-> and may be changed to \n with the /U option. Both ADO and the
-> CharSet must be available on the local system.
If /F InFile is also used, then an OutFile value of "-" overwrites
the original InFile with the output, preserving the character set.
The output is first written to a temporary file with the same path
and name, with .new appended. Upon completion, the temp file is
moved to replace the InFile. It is not valid to use "-|CharSet".
/T DelimiterChar
/T FILE
The /T option is very similar to the Oracle Translate() function,
or the unix tr command, or the sed y command.
The Search represents a set of search expressions, and Replace
is a like sized set of replacement expressions. Expressions are
delimited by DelimiterChar (a single character). If DelimiterChar
is an empty string, then each character is treated as its own
expression. The /L option is implicitly set if DelimiterChar is
empty. Escape sequences are interpreted after the search and
replace strings are split into expressions, so escape sequences
cannot be used without a delimiter.
An alternate syntax is to specify the word FILE instead of a
DelimiterChar, in which case the Search and Replace parameters
specify files that contain the search and replace expressions,
-> one expression per line. Each file can be opened via ADO if
-> |CharSet (internet character set name) is appended to the file
name. Note that the /V option does not apply to Search and Replace
if /T FILE is used.
... <truncated>
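For readers who have not used Translate() or tr, the following sketch shows the general idea behind /T with an empty DelimiterChar, where each character is its own literal expression. The `translate` function is hypothetical and greatly simplified: it ignores regular expressions, delimiters, and escape sequences, and the "first mapping wins" rule for duplicate search characters is an assumption of this sketch, not documented JREPL behavior.

```javascript
// Hypothetical, simplified model of a tr-style translation (not JREPL code).
// Each character in `search` maps to the character at the same position in
// `replace`. All substitutions happen in a single pass over the input.
function translate(text, search, replace) {
  if (search.length !== replace.length) {
    throw new Error("Search and Replace must be the same length");
  }
  var map = {};
  for (var i = 0; i < search.length; i++) {
    var from = search.charAt(i);
    // Assumption: if a search character repeats, the first mapping wins.
    if (!Object.prototype.hasOwnProperty.call(map, from)) {
      map[from] = replace.charAt(i);
    }
  }
  var out = "";
  for (var j = 0; j < text.length; j++) {
    var ch = text.charAt(j);
    out += Object.prototype.hasOwnProperty.call(map, ch) ? map[ch] : ch;
  }
  return out;
}

console.log(translate("hello", "el", "ip")); // hippo
console.log(translate("ab", "ab", "ba"));    // ba (swap, because it is one pass)
```

Because the input is scanned once, swapping two characters works as expected, which is something a chain of separate replace operations cannot do.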
Add Unicode support to objects available to user supplied JScript
>jrepl /?jscript
The following global JScript variables/objects/functions are available for
use in JScript code associated with the /Jxxx options.
... <truncated>
input - The TextStream object from which input is read.
This may be stdin or a file.
If the file was opened by ADO with |CharSet, then input is
an object that partially emulates a TextStream object, with
a private ADO Stream doing the actual work. The following
public members are available to the ADO object:
Property       Method
-------------  -----------------------------------
AtEndOfStream  Read
               ReadLine
               SkipLine
               Write
               WriteLine
               Close
output - The TextStream object to which the output is written.
This may be stdout or a file. ... <truncated>
If the file was opened by ADO with |CharSet, then output is
an object that partially emulates a TextStream object (see the
input object).
openOutput( fileName[|CharSet] [,appendBoolean [,utfBoolean]] )
Open a new TextStream object for writing and assign it to the
output variable. If appendBoolean is truthy, then open the file
for appending.
If |CharSet is appended to the fileName, then open the file
using ADO and the specified internet character set name. The
output variable will be set to an object that partially
emulates a TextStream object (see the input object).
If utfBoolean is truthy, then output is encoded as unicode
(UTF-16LE). The unicode file will automatically have the BOM
unless opened for appending. The utfBoolean argument is ignored
if |CharSet is also specified.
If fileName is falsey, then output is written to stdout.
All subsequent output will be written to the new destination.
Any prior output file is automatically closed.
... <truncated>
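The point of the partial TextStream emulation is that user code can read lines the same way whether or not ADO is involved. The sketch below uses a hypothetical in-memory stand-in, `makeFakeInput`, that exposes only AtEndOfStream, ReadLine, and SkipLine. It is an illustration of the calling pattern under that assumption, not the ADO-backed object that JREPL creates.

```javascript
// Hypothetical in-memory stand-in for the TextStream-like input object.
// It accepts both \n and \r\n as line terminators, as the /F help describes.
function makeFakeInput(text) {
  var pos = 0;
  var stream = {
    AtEndOfStream: text.length === 0,
    ReadLine: function () {
      var rest = text.slice(pos);
      var match = /\r?\n/.exec(rest);
      var line;
      if (match) {
        line = rest.slice(0, match.index);
        pos += match.index + match[0].length; // step past the terminator
      } else {
        line = rest;                          // last line, no terminator
        pos = text.length;
      }
      stream.AtEndOfStream = pos >= text.length;
      return line;
    },
    SkipLine: function () {
      stream.ReadLine();                      // read and discard one line
    }
  };
  return stream;
}

// The same loop works for a real TextStream or an emulating object.
function readAllLines(input) {
  var lines = [];
  while (!input.AtEndOfStream) {
    lines.push(input.ReadLine());
  }
  return lines;
}

console.log(readAllLines(makeFakeInput("a\r\nb\nc"))); // [ 'a', 'b', 'c' ]
```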
New escape sequence \u{N} for access to any Unicode code point, including "Astral" (supplemental) planes
>jrepl /?/x
/X - ... <truncated>
Also enables extended substitution pattern syntax with support
for the following escape sequences within the Replace string:
... <truncated>
\u{N} - Any Unicode code point where N is 1 to 6 hex digits
Also enables the \q, \c, and \u{N} escape sequences for the Search
string. The other escape sequences are already standard for a
regular expression Search string.
... <truncated>
When using \xnn with /X, JREPL assumes your machine defaults to
Windows-1252, which is generally true for Western Europe and North
and South America. If your machine doesn't use Windows-1252, then
you should not use \xnn with values above 7F unless you force
input and output to use Windows-1252 via /F "inFile|Windows-1252"
and /O "outFile|Windows-1252" (or /O -).
Note that without the /X option, \xnn within a regex search string
maps to unicode code points.
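Since JScript strings are made of 16-bit code units, a code point above FFFF has to be stored as a surrogate pair. The sketch below shows one way a \u{N} escape could be expanded by hand. The names `codePointToString` and `expandUnicodeEscapes` are hypothetical, and this is only an illustration of the arithmetic, not the code JREPL uses.

```javascript
// Hypothetical helper: convert a Unicode code point to a JavaScript string,
// building the UTF-16 surrogate pair manually for astral code points.
function codePointToString(codePoint) {
  if (codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError("Invalid code point: " + codePoint);
  }
  if (codePoint <= 0xFFFF) {
    return String.fromCharCode(codePoint);
  }
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10);   // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return String.fromCharCode(high, low);
}

// Hypothetical helper: replace every \u{N} (1 to 6 hex digits) in the text.
function expandUnicodeEscapes(text) {
  return text.replace(/\\u\{([0-9A-Fa-f]{1,6})\}/g, function (whole, hex) {
    return codePointToString(parseInt(hex, 16));
  });
}

// \u{1F600} becomes the surrogate pair D83D DE00
console.log(expandUnicodeEscapes("\\u{1F600}") === "\uD83D\uDE00"); // true
```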
Enhanced regular expression syntax via XRegExp (xregexp.com)
>jrepl /?/xreg & jrepl /?/tflag
/XREG FileList
Adds support for XRegExp by loading the xregexp files specified
in FileList before any /JLIB code is loaded. Multiple files are
delimited by forward slashes (/). If FileList is simply a dot,
then substitute the value of environment variable XREGEXP for
the FileList.
The simplest option is to load "xregexp-all.js", but this
includes all available XRegExp options and addons, some of which
are unlikely to be useful to JREPL. Alternatively you can load
only the specific modules you need, but they must be loaded in the
correct order.
Once the XRegExp module(s) are loaded, all user supplied regular
expressions are created using the XRegExp constructor rather than
the standard RegExp constructor. Also, XRegExp.install('natives')
is executed so that many standard regular expression methods are
overridden by XRegExp methods.
/XREG requires XRegExp version 2.0.0 or 3.x.x. JREPL will not
support version 4.x.x (when it is released) because v4.x.x
is scheduled to drop support for XRegExp.install('natives').
One of the key features of XRegExp is that it extends the JScript
regular expression syntax to support named capture groups, as in
(?<name>anyCapturedExpression). Named groups can be referenced
in Replace strings as ${name}, and in Replace JScript code as
$0.name
The /T option is no longer limited to 99 capture groups when
/XREG is used. However, /T replace expressions must reference a
captured group by name if the capture index is 100 or above.
Every /T search expression is automatically given a capture group
name of Tn, where n is the 0 based index of the /T expression.
XRegExp also adds support for non-standard mode flags:
n - Explicit capture
s - Dot matches all
x - Free spacing and line comments
A - Astral
These flags can generally be applied by using (?flags) syntax
at the beginning of any regex. This is true for /P, /INC, /EXC,
and most Find regular expressions. The one exception is /T doesn't
support (?flags) at the beginning of the Find string. The /TFLAG
option should be used to specify XRegExp flags for use with /T.
XRegExp also improves regular expression support for Unicode via
\p{Category}, \p{Script}, \p{InBlock}, \p{Property} escape
sequences, as well as the negated forms \P{...} and \p{^...}.
Note that example usage on xregexp.com shows use of doubled back
slashes like \\p{...}. But JREPL automatically does the doubling
for you, so you should use \p{...} instead.
See xregexp.com for more information about the capabilities of
XRegExp, and for links to download XRegExp.
/TFLAG Flags
Used to specify XRegExp non-standard mode flags for use with /T.
/TFLAG is ignored unless both /T and /XREG are used.
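To give a feel for what named capture groups and ${name} references buy you, here is a rough sketch that uses the native named groups of a modern JavaScript engine (ES2018 or later) instead of XRegExp. It will not run under JScript, and the `replaceNamed` helper that expands ${name} tokens is hypothetical. It only mimics the idea; XRegExp and JREPL handle this internally in their own way.

```javascript
// Hypothetical helper: expand ${name} tokens in a template using the named
// groups captured by a native RegExp match (requires an ES2018+ engine).
function replaceNamed(text, regex, template) {
  return text.replace(regex, function () {
    // When the regex has named groups, the last argument is the groups object.
    var groups = arguments[arguments.length - 1];
    return template.replace(/\$\{(\w+)\}/g, function (whole, name) {
      var known = Object.prototype.hasOwnProperty.call(groups, name);
      return known && groups[name] !== undefined ? groups[name] : "";
    });
  });
}

// Reorder an ISO date by referring to the captured parts by name.
var isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;
var result = replaceNamed("Released 2017-09-23", isoDate, "${day}/${month}/${year}");
console.log(result); // Released 23/09/2017
```

Referring to groups by name keeps the Replace string readable and stable even if more capture groups are added to the regex later.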
The enhancements were implemented in a fairly surgical manner, but they have a profound impact on the entire functioning of the utility. I have only done limited testing, so I won't be surprised if I have introduced some bugs.
I encourage everyone to try out the new features, and please report any problems that you find.
Dave Benham