
Variable vs arguments containing special chars

Posted: 12 Mar 2022 12:30
by paolo2504
Hi all. I'm just a beginner with MS JScript (JScript under Windows).
I recently ran into an issue that I can't understand: I found that variables containing special characters are treated differently when they are passed as arguments from a batch file than when they are set directly inside the JScript code.
I wrote a brief test script below to show my concern. I would like to understand:
a) why arguments and directly set vars behave differently (and how to set vars inside the JScript code so that all special chars are fully preserved)
b) why the eval instruction returns /ig instead of /gi
I'd appreciate any hint. Thanks in advance.

PS NOTE: I've just found that if I double each backslash in var FindstrDirectlySet, everything works fine. I wonder if the argument passed from the batch code
is considered by JScript as an object (not a string) and is therefore not changed/interpreted...

Code: Select all

@if (@X)==(@Y) @end /*
@echo off
set findchr="\.\\|\.\/|\\\\|=\\|=\/|image=\\"
cscript //nologo //E:JScript "%~F0" %findchr% 2>d:\jslog.txt
goto :EOF

************* JScript portion **********/
var args = WScript.Arguments;
var FindstrFromArgs=args.Item(0)
var FindstrDirectlySet = "\.\\|\.\/|\\\\|=\\|=\/|image=\\";
var regexFile = eval("/" + FindstrFromArgs + "/gi");

WScript.Stderr.WriteLine("FindstrFromArgs = " + FindstrFromArgs);
WScript.Stderr.WriteLine("FindstrDirectlySet = " + FindstrDirectlySet);
WScript.Stderr.WriteLine("regexFile = " + regexFile);
Result:

Code: Select all

FindstrFromArgs = \.\\|\.\/|\\\\|=\\|=\/|image=\\
FindstrDirectlySet = .\|./|\\|=\|=/|image=\
regexFile = /\.\\|\.\/|\\\\|=\\|=\/|image=\\/ig

Re: Variable vs arguments containing special chars

Posted: 13 Mar 2022 05:28
by aGerman
There are several things you have to keep in mind. It definitely doesn't get easier if different languages are involved. Consider choosing a single language that is capable of handling your task entirely.
However, when messing with Batch, JScript and regex:
1) The backslash is an escape character in string literals of JScript code. (A string literal is a hard-coded piece of text in the script.)
2) The backslash is an escape character in regular expressions. (Thus, if you want to match a \ you have to write \\ in the pattern, and if the pattern is a string literal you have to double once more to \\\\)
3) The string type in JScript is an object. Not to be confused with a string literal which is processed by a parser before it gets used.
4) While the string argument passed to the JScript is handled by the calling shell, the parameter containing the argument in JScript is already a string object.
5) Respect the rules of the calling shell for escaping characters in the argument. (E.g. you have to double a percent sign to represent a single percent sign in a string.)
6) Respect the rules of the receiving script engine. (E.g. keep in mind that the JScript engine removes all quotation marks in a parameter value(*). Arguments received by JScript are not handled like string literals.)
(*) Consider reading values out of the inherited process environment if you need to preserve quotation marks.
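
Points 1) and 2) can be checked with a few lines. This is only a minimal sketch: the print helper and the variable names are made up for illustration, and the code is written so that it should run under cscript as well as under a modern JavaScript engine.

```javascript
// Helper for this sketch only: WScript.Echo under cscript, console.log elsewhere.
var print = (typeof WScript !== "undefined")
    ? function (s) { WScript.Echo(s); }
    : function (s) { console.log(s); };

// Rule 1: inside a string literal, "\\" is ONE backslash character.
var oneBackslash = "\\";
print(oneBackslash.length);                 // 1

// Rule 2: a regex needs \\ to match one backslash. If that pattern is
// written as a string literal, each of the two backslashes is doubled again.
var patternSource = "\\\\";                 // the two characters \\
var fromString = new RegExp(patternSource); // pattern built from a string
var fromLiteral = /\\/;                     // the same pattern as a regex literal

print(patternSource.length);                // 2
print(fromString.test("a\\b"));             // true ("a\\b" is the text a\b)
print(fromLiteral.test("a\\b"));            // true
print(fromString.source === fromLiteral.source);
```

The last line compares the pattern text of both regex objects, which shows that the four backslashes in the string literal and the two backslashes in the regex literal describe the same pattern.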

Steffen

Code: Select all

@if (0)==(0) echo off

:: This is how you have to escape the literal to get ""%a%"&"\b\"" in variable foo. (Parsing rules of the cmd shell.)
set foo=^""%%a%%"^&"\b\"^"
echo BATCH
echo variable foo in the shell: %foo%
echo(

:: Passing %foo% only works because cmd.exe parses quoted strings in the order of their occurrence. Thus, the & is considered to be quoted. (Parsing rules of the shell.)
cscript //nologo /e:jscript "%~fs0" %foo%

pause
goto :eof @end

WScript.Echo('JSCRIPT');
var oWshSh = new ActiveXObject('WScript.Shell');

// Backslashes are to be doubled. Double quotes don't have to be escaped here due to the surrounding single quotes. (Parsing rules of the script engine for string literals.)
WScript.Echo('string literal in JScript: ""%a%"&"\\b\\""');

// The script engine removes all quotation marks from the arguments, and there is nothing you can do against it. (Parsing rules of the script engine for arguments.)
WScript.Echo('argument received:         ' + WScript.Arguments(0));

// Fortunately, cscript.exe inherits a copy of the environment from the parent process (cmd.exe) in which the variable foo was defined before cscript was called. (No further parsing rules of the script engine here.)
WScript.Echo('environment variable read: ' + oWshSh.Environment('PROCESS')('foo'));
// (Note that this is a one-way copy. Updates to the received copy in the child process never affect the original parent environment.)
output:

Code: Select all

BATCH
variable foo in the shell: ""%a%"&"\b\""

JSCRIPT
string literal in JScript: ""%a%"&"\b\""
argument received:         %a%&\b\
environment variable read: ""%a%"&"\b\""
Press any key to continue . . .

Re: Variable vs arguments containing special chars

Posted: 13 Mar 2022 05:32
by paolo2504
Appreciate. Thanks

Re: Variable vs arguments containing special chars

Posted: 14 Mar 2022 07:11
by Aacini
paolo2504 wrote:
12 Mar 2022 12:30
Hi all. I'm just a beginner with MS JScript (JScript under Windows).
I recently ran into an issue that I can't understand: I found that variables containing special characters are treated differently when they are passed as arguments from a batch file than when they are set directly inside the JScript code.
I wrote a brief test script below to show my concern. I would like to understand:
a) why arguments and directly set vars behave differently (and how to set vars inside the JScript code so that all special chars are fully preserved)
b) why the eval instruction returns /ig instead of /gi
I'd appreciate any hint. Thanks in advance.

PS NOTE: I've just found that if I double each backslash in var FindstrDirectlySet, everything works fine...
I told you that on Feb 28:
Aacini wrote:
28 Feb 2022 16:33
An additional problem in your case is that the backslash must be preceded by an additional backslash always, and the point must be preceded by a backslash when it is used in a regex.

. . .

Antonio
Perhaps I wasn't clear enough?

Consider these lines from the same post:

Code: Select all

// First pass: replace substrings in *all* file contents
var repl = new Array();
repl[".\\"] = "";
repl["\\\\"] = "";
repl["\\image"] = "image";
Contents = Contents.replace(/\.\\|\\\\|\\image/g, function (A) {return repl[A]});
The first string you were looking for was ".\" "point-backslash". However, because "the backslash must be preceded by an additional backslash always", in this statement:

Code: Select all

repl[".\\"] = "";
... the backslash is written as double-backslash, but in this line:

Code: Select all

Contents = Contents.replace(/\.\\|\\\\|\\image/g, function (A) {return repl[A]});
... the point is also preceded by backslash because "the point must be preceded by a backslash when it is used in a regex".
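
To see both rules working together, here is a small self-contained sketch of the same replace call. The sample text in contents is made up for illustration, a plain object is used for the lookup table instead of an Array, and the print helper exists only so the snippet can run outside cscript as well.

```javascript
// Helper for this sketch only: WScript.Echo under cscript, console.log elsewhere.
var print = (typeof WScript !== "undefined")
    ? function (s) { WScript.Echo(s); }
    : function (s) { console.log(s); };

// Lookup table: the keys are string literals, so every backslash is doubled.
var repl = {};
repl[".\\"] = "";            // key is the 2-character text .\
repl["\\\\"] = "";           // key is the 2-character text \\
repl["\\image"] = "image";   // key is the 6-character text \image

// Hypothetical file contents, i.e. the text: path=.\pics\\old\image1
var contents = "path=.\\pics\\\\old\\image1";

// Regex literal: the dot is escaped as \. and each backslash as \\
var result = contents.replace(/\.\\|\\\\|\\image/g, function (A) { return repl[A]; });

print(result);               // path=picsoldimage1
```

Each match found by the regex is passed to the function as A, and that matched text is exactly one of the keys of repl.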



I assumed you would understand that these rules apply just in JScript code and that you knew how to write special characters in a Batch file. If you want to write a percent-sign in a Batch file you must write %% always, but not in JScript. If you want to write a backslash in JScript you must write \\ always, but not in Batch.

How must you write characters in Batch that will be used in JScript? It depends on the way they will be used in JScript.

If you want to literally expand a string to be parsed in JScript, you must write it in Batch in the same way as you would write it in JScript:

Code: Select all

rem In Batch:
set findchr="\.\\|\.\/|\\\\|image=\\"
cscript //nologo //E:JScript "%~F0" %findchr%

// In JScript:
var regexFile = eval("/"+args.Item(0)+"/gi");
If you prefer to build the regex with the RegExp constructor instead of eval, pass the same pattern. The argument arrives in JScript as a finished string, so it needs no string-literal doubling, but it is still used as a regex pattern, so the regex escapes (\. for a dot, \\ for a backslash) must stay:

Code: Select all

rem In Batch:
set findchr="\.\\|\.\/|\\\\|image=\\"
cscript //nologo //E:JScript "%~F0" %findchr%

// In JScript:
var regexFile = new RegExp(args.Item(0),"gi");
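
A quick way to check how the RegExp constructor treats text that arrives at run time is to compare it with eval and with a regex literal. This is only a sketch: the variable received stands in for args.Item(0), the sample text is made up, and the print helper is there so the snippet can run outside cscript. Inside the constructor's pattern string the backslash still acts as the regex escape character, so the run-time text needs the regex-level escapes, but not the extra doubling a string literal would need.

```javascript
// Helper for this sketch only: WScript.Echo under cscript, console.log elsewhere.
var print = (typeof WScript !== "undefined")
    ? function (s) { WScript.Echo(s); }
    : function (s) { console.log(s); };

// Stand-in for args.Item(0). It is written as a string literal here, so every
// backslash is doubled; the resulting text is: \.\\|\.\/|\\\\|image=\\
var received = "\\.\\\\|\\.\\/|\\\\\\\\|image=\\\\";

var viaConstructor = new RegExp(received, "gi");
var viaEval = eval("/" + received + "/gi");
var viaLiteral = /\.\\|\.\/|\\\\|image=\\/gi;

// Hypothetical sample, i.e. the text: dir=.\a image=\b \\srv ./x
var sample = "dir=.\\a image=\\b \\\\srv ./x";

var r1 = sample.replace(viaConstructor, "#");
var r2 = sample.replace(viaEval, "#");
var r3 = sample.replace(viaLiteral, "#");

print(r1);                        // dir=#a #b #srv #x
print(r1 === r2 && r2 === r3);    // true
```

All three regex objects are built from the same pattern text, which is why they replace the same four substrings.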
Of course, if you want to search for a percent-sign in JScript, you must write it as double-percent in Batch:

Code: Select all

set "findstr=The 25%% discount"
... and if you want to put a backslash in JScript, you must write it as double-backslash even if it is not used as a regexp:

Code: Select all

var myString = "Hello \\world\\";    // the string "Hello \world\"

Antonio


PS - I suggest you use this form when you assemble your Batch-JScript hybrid scripts.

Standard form:

Code: Select all

@if (@CodeSection == @Batch) @then

@echo off
rem The Batch code section goes here

. . .

goto :EOF

@end

// The JScript code section goes here

. . .
The purpose of the first line is to introduce a construct (command/statement) that is valid in both Batch and JScript and that makes JScript ignore the Batch code section. In Batch, it is a valid IF command whose condition is false (because the string "(@CodeSection" is different from the string "@Batch)"), so the @then "command" is not executed. In JScript, it is an @if conditional compilation statement that ignores the following text up to the @end delimiter if the value between the parentheses is zero (false). This means that @if (0) would give the same result, but when I proposed this method I thought that using the @CodeSection == @Batch expression and the @then "command" (delimiter) would be clearer for people who know nothing about JScript. The @ signs are required by JScript syntax.

NOTE: There are other methods to do the same thing that, IMHO, do not provide any advantage, rather the opposite. For example:

Code: Select all

@if (@X)==(@Y)
In this case the @X and @Y don't provide any additional information, and the syntax is tricky because the value (@X) alone is enough for JScript syntax, but not for Batch. Writing any string after the double equal sign would be enough to complete the Batch syntax, for example: @if (@X) == Y. Why also enclose the Y in parens and precede it with an @ sign? Just to make this trick more esoteric and incomprehensible, perhaps? I really don't know...

A particularly strange form is this one:

Code: Select all

@if (0)==(0) 
In this case the Batch IF statement is true, so a valid Batch command must follow it on the same line... :|


Another simpler way to write a Batch-JScript hybrid script is this:

Simplified form:

Code: Select all

@set @a=0  /*

@echo off
rem The Batch code section goes here

. . .

goto :EOF

*/

// The JScript code section goes here

. . .
In this case the purpose of the first line is to open a JScript multi-line comment /* that hides the Batch section from JScript up to the closing */ characters. You could put the /* characters alone on a line, but that would cause an error in the Batch code. The simplest way to avoid the Batch error is to use the @set command, which is valid in Batch and is the @set conditional compilation statement in JScript.

NOTE: Some authors also use this method:

Code: Select all

@if (@X)==(@Y) @end /*
In this case, the @if statement, which can skip a multi-line section by itself, is closed and a multi-line comment is immediately started. :shock: IMHO, this is a "waste" of the capabilities of the @if statement... (Why make things more complicated than necessary?) :cry:

Re: Variable vs arguments containing special chars

Posted: 15 Mar 2022 02:08
by paolo2504
One more time, I appreciate your great help.
Regarding the past issue and this present one,
there are two ways to declare vars:

a) var str = "\\.\\/|" +
             "\\.\\\\|" +
             "\\=\\\\";
   var re = eval("/" + str + "/gi");
b) var re = /\.\/|\.\\|\=\\/gi;

The .replace method works in both cases.
Regards.