For/if parsing under delayedExpansion vs. general rules

Posted: 02 Mar 2014 23:27
by Liviu
For/if commands seem to fail when executed via a delayed expansion, or a 'call'.

Code: Select all

@echo off & setlocal disableDelayedExpansion

echo(
set "echo.ok=echo 123"
set "if.fail.edx=if 1==1 (echo 1==1) else (echo ???)"
set "for.fail.edx=for %%X in (x y) do echo '%%X'"

@rem all work ok
%echo.ok%
%if.fail.edx%
%for.fail.edx%

setlocal enableDelayedExpansion

@rem only 'echo' works when delayed-expanded
echo(
!echo.ok!
!if.fail.edx!
!for.fail.edx!

echo(
for %%Z in (1) do !echo.ok!
for %%Z in (1) do !if.fail.edx!
for %%Z in (1) do !for.fail.edx!

echo(
call !echo.ok:%%=%%%%!
call !if.fail.edx:%%=%%%%!
call !for.fail.edx:%%=%%%%!
Output:

Code: Select all

123
1==1
'x'
'y'

123
'if' is not recognized as an internal or external command,
operable program or batch file.
'for' is not recognized as an internal or external command,
operable program or batch file.

123
'if' is not recognized as an internal or external command,
operable program or batch file.
'for' is not recognized as an internal or external command,
operable program or batch file.

123
'if' is not recognized as an internal or external command,
operable program or batch file.
'for' is not recognized as an internal or external command,
operable program or batch file.

It's not about internal commands, since 'echo' still works in all cases, but it appears to be a parsing anomaly with for/if specifically. And the 'call' failure in particular doesn't seem to follow from the parsing rules at http://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts. According to those, the 1st iteration would proceed to phase 6, then loop back and rerun phases 1 and 2, where the 'if' and 'for' tokens should be detected in phase 2 of the 2nd pass. Maybe I am missing or misreading something.

Liviu

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 01:03
by dbenham
Liviu wrote:For/if commands seem to fail when executed via a delayed expansion, or a 'call'.
...
It's not about internal commands, since 'echo' still works in all cases, but it appears to be a parsing anomaly with for/if specifically. And the 'call' failure in particular doesn't seem to follow from the parsing rules at http://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts. According to those, the 1st iteration would proceed to phase 6, then loop back and rerun phases 1 and 2, where the 'if' and 'for' tokens should be detected in phase 2 of the 2nd pass. Maybe I am missing or misreading something.

Most of the failures are indirectly explained by the following, late in phase 2:

- In this phase REM, IF and FOR are detected, for the special handling of them.

Since delayed expansion takes place after phase 2, there is never an opportunity to properly parse the IF and FOR commands. So it makes sense that they fail as a macro invoked with delayed expansion.
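
A minimal sketch of the practical consequence, assuming hypothetical variable names (left, right, list) that do not appear in the posts above: when the IF and FOR keywords are written literally in the source line, phase 2 still detects them, and delayed expansion only has to supply the operands.

```batch
@echo off
setlocal enableDelayedExpansion

rem Hypothetical variables, for illustration only.
set "left=1"
set "right=1"
set "list=x y"

rem The IF and FOR keywords are literal, so phase 2 can detect them;
rem only the operands arrive later through delayed expansion.
if !left!==!right! (echo equal) else (echo different)
for %%X in (!list!) do echo '%%X'
```

This is not a fix for the macro case, only an illustration of why the keyword has to be visible before delayed expansion happens.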

But I think that you are correct with regard to the description of CALL in phase 6). It claims that the 2nd round of processing stops after phase 2, but it must stop prior to the end of phase 2), before IF and FOR are processed. Or else the special FOR/IF parsing should be broken out into another step between 2) and 3).

I also think a clearer distinction should be made between special processing of REM vs. IF/FOR, given that REM does work as a macro invoked by delayed expansion.


Dave Benham

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 01:30
by jeb
dbenham wrote:I also think a clearer distinction should be made between special processing of REM vs. IF/FOR, given that REM does work as a macro invoked by delayed expansion.

Yes, REM is a bit special here. The full-featured version of REM can only be detected in phase 2, but REM has a fallback detection in phase 7 (the execution phase, with the normal detection for external commands). There, a late-expanded REM can also be detected, as can a REM with an appendix such as REM. REM+ and so on.
But in the late detection REM loses its strong behaviour of stopping the parser in phase 2.

Code: Select all

@setlocal enableDelayedExpansion
@set "myREM=REM"
REM This ^ you can see!
!myREM! But now, the caret ^ is gone, but the exclamation mark ! is still there


I suppose this is a bit of a dinosaur: REM was always detected in this phase in former DOS versions.

jeb

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 03:04
by penpen
dbenham wrote:Most of the failures are indirectly explained by the following, late in phase 2:

- In this phase REM, IF and FOR are detected, for the special handling of them.
A space expanded by a delayed expansion is not a separator but a normal character.
I always thought the parser expects a delimiter (a "special" space) after for and if (as opposed to other commands), and that this causes the behavior of for and if, with no extra detection (I'm not sure about rem: I never thought about it, as I almost never use it).
You can force this behavior without delayed expansion (batch file example):

Code: Select all

@setlocal disableDelayedExpansion
for %%a in (world) do @echo Hello %%a^!
for^ %%a in (world) do @echo Hello %%a^!
if a==a @echo ok
if^ a==a @echo ok
@endlocal
@goto :eof

penpen

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 06:38
by jeb
penpen wrote:A space expanded by a delayed expansion is not a separator but a normal character.


Code: Select all

@echo off
setlocal enableDelayedExpansion
echo @echo ########## %* ##########>"echo hello.bat"
set "cmd=echo hello.bat"
!cmd!

This does not start the batch file; it simply outputs
hello.bat


The following also shows that it's not the escaped space that causes the FOR/IF problems: it fails with an error.

Code: Select all

@echo off
set myIF=IF
setlocal enableDelayedExpansion
!myIF! 1==1 (echo true) ELSE (echo false)


jeb

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 07:14
by foxidrive
jeb wrote:

Code: Select all

@echo off
set myIF=IF
setlocal enableDelayedExpansion
!myIF! 1==1 (echo true) ELSE (echo false)


jeb



That's the beauty of batch - so many things work and don't work in weird and wonderful ways.

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 08:28
by penpen
jeb wrote:
penpen wrote:A space expanded by a delayed expansion is not a separator but a normal character.


Code: Select all

@echo off
setlocal enableDelayedExpansion
echo @echo ########## %* ##########>"echo hello.bat"
set "cmd=echo hello.bat"
!cmd!

This does not start the batch file, it simply outputs
hello.bat

That's true, but most commands ignore whether a space is a separator or a normal character.
If I remember right, you once constructed an echo invocation that makes it visible whether a delimiter is handled as a separator, which shows that the expanded space is no separator:

Code: Select all

@echo off
setlocal enableDelayedExpansion
set "cmd=echo "
set "cmd_=echo"
!cmd! !^^^^"^^" ^^^^"^^"
!cmd_!!^^^^"^^" ^^^^"^^"
%cmd_%!^^^^"^^" ^^^^"^^"

My (maybe non-standard) use of terms may be problematic here:
- A delimiter separates single words/tokens,
- a separator separates multiwords/"blocks" (not code blocks)/super tokens.
- In "is not a separator but a normal character", by "normal" I meant "no separator" (in the above sense, so it is handled as a delimiter, which is not a normal character in most people's usage, only in my lazy usage).

So I count echo among those commands that ignore whether a space is a separator,
but it does detect the space as a delimiter, so it recognizes the tokens [echo][hello.bat] and therefore echoes in the above example.
You can also prevent the space from being detected as a delimiter:
just escape it explicitly, so that in the above example the lexeme [echo hello.bat] is recognized and the batch file is called.
<end of self-justification>

Sorry if my lazy use of terms (especially "normal") caused this misunderstanding.

Nevertheless, your "myIF" example shows that I was wrong. Thanks for this example.

penpen

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 12:19
by Liviu
This doesn't look like a tokenizing issue. The error "'if' is not recognized as an internal or external command" shows that the primary token is in fact identified correctly, but at a point where it's no longer recognized as the internal "if" command.

As for the 'call' part, the problem seems to be with the 2nd pass of parsing, and it can in fact be demonstrated with direct commands without any expansions.

Code: Select all

@echo off

(echo echo %%~nx0:  %%*) >if.bat

echo( & echo [1] call if
call if

echo( & echo [2] call if ???
call if ???

echo( & echo [3] call if.bat ???
call if.bat ???

echo( & echo [4] call if 1==1 (echo 1==1) else (echo ???)
call if 1==1 (echo 1==1) else (echo ???)

echo( & echo [5] call if 1==1 else
call if 1==1 else

echo( & echo [6] call if 1 equ 1 else
call if 1 equ 1 else

echo( & echo [7] call if exist . else
call if exist . else

echo( & echo [8] call if defined cd else
call if defined cd else

echo( & echo [9] call if not defined cd else
call if not defined cd else
Output under Win7x64.sp1:

Code: Select all

[1] call if
 if was unexpected at this time.

[2] call if ???

[3] call if.bat ???
if.bat:  ???

[4] call if 1==1 (echo 1==1) else (echo ???)
if.bat:  9

[5] call if 1==1 else
if.bat:  9

[6] call if 1 equ 1 else
if.bat:  :

[7] call if exist . else
if.bat:  7

[8] call if defined cd else
if.bat:  6

[9] call if not defined cd else
if.bat:  8

The above makes a few interesting points:
- [1] shows that 'if' is recognized right away, completely ignoring the 'if.bat' which exists in the same directory (unlike for example echo or set);
- [2] seems to be doing nothing at all, and confirms that the parser is an entirely alien incomprehensible creature ;-)
- [4] ... [9] seem to indicate that once the 'if' passes some very basic syntax checks, 'call' lets it pass, but the 2nd pass no longer gives it special treatment, and invokes the 'if.bat' batch instead of the internal command.

It may also be interesting to note that the mysterious argument in [4]...[9] passed to 'if.bat' in the 2nd pass is consistent between runs, but different between Win7 ( 9 9 : 7 6 8 ) vs. XP ( 7 7 8 5 4 6 ). It almost looks like the 1st pass already decided what type of 'if' it expects, and passes some of that information on, so that the 2nd pass is already "keyed" to parse a certain 'if' syntax, for example 'if exist' or 'if defined'.

Liviu

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 18:33
by penpen
What leaves me completely speechless and confused:
what is causing the characters (on XP only numbers :D ) to be displayed :shock: ?

Code: Select all

@echo off
(echo echo %%~nx0:  %%*) >if.bat
:: XP values
::2
   call if CmdExtVersion 1 else
   call if CmdExtVersion 1 echo ok
::3
   call if ErrorLevel 1 else
   call if ErrorLevel 1 echo ok
::4
   call if defined if.bat else
   call if defined if.bat echo ok
::5
   call if exist test else
   call if exist test echo ok
::7
   call if /I a == a else
   call if /I a == a echo ok
::8
   call if /I a EQU a else
   call if a EQU a else

It is not because the syntax is wrong...
Only one thing seems to be clear: it is the type of the if processed...
But they shouldn't "spawn out of nowhere"...

penpen.

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 19:18
by Liviu
penpen wrote:what is causing the characters [...] Only one thing seems to be clear: it is the type of the if processed...
Think I can fill in the missing value in your sequence ;-)

Code: Select all

::6
call if not cmdextversion 1 else
call if not defined cd else
call if not errorlevel 1 else
call if not 1==1 else
That said, I don't know the rest of the answers. The value has to do with the type of the 'if', but it's only partial information - in the example above, all 'if not' variants raise the same value 6 (under XP) while their non-negated counterparts each have distinct values.

As to how it ends up on the command line of the external if.bat, I won't even try to guess. Fact is that we deliberately fooled the 'call' line into identifying its target as an 'if' statement in the 1st pass of parsing, but then cheated and hijacked it to a namesake batch file in the 2nd pass. It's entirely possible that while processing internal commands the parser could be using the command line buffer for its own purposes, only in this case it inadvertently broadcasted it to an external command. And, there could be additional information hidden behind a NUL or other binary data in that buffer, which would cause an external batch to only "see" the first character.

The interesting fact, however, is that this "accident" hints at a far stronger coupling between the 2 passes of parsing a 'call' line than previously thought. Given the relative consistency of those values, it's a fair guess that the 1st pass does at least a primitive parsing of the 'call' target, and passes this information to the 2nd pass.

penpen wrote:the characters (on XP only numbers :D )
The fact that they are numbers-only in XP looks like a mere coincidence. The values in Win7 are simply higher by 2 than XP's, and ':' happens to be the next character after '9' in ASCII sequence.
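
The offset is easy to check with a few lines of batch. This is only a sketch: the lookup string chars is a hypothetical helper, the XP values are the ones reported earlier in the thread for cases [4] to [9], and the "+2" rule is the assumption being tested, not a documented fact.

```batch
@echo off
setlocal enableDelayedExpansion

rem Hypothetical helper: ASCII characters 0x30 to 0x3B in order,
rem so the offset into the string equals the digit value.
set "chars=0123456789:;"

rem XP values reported for cases [4] to [9]; Win7 is assumed to be XP + 2.
for %%V in (7 7 8 5 4 6) do (
  set /a "idx=%%V+2"
  for %%I in (!idx!) do echo XP %%V  Win7 !chars:~%%I,1!
)
```

The second column can be compared directly against the Win7 sequence ( 9 9 : 7 6 8 ) quoted above.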

Of course, the other question is why did Win7 change those internal codes. It looks like it needed to make room for 2 new codes, though I am not aware of any new 'if' syntax in Win7 vs. XP. Could be a side-effect to some bug fix, or could be a new/undocumented 'if' feature - but I'll leave that to the professionals ;-)

Liviu

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 19:58
by dbenham
Yes, you rediscovered something jeb ran into back in 2011: viewtopic.php?p=8407#p8407

Look at the bottom of that post, and also look at a number of following posts. There are many examples of this weirdness. We never got very far with drawing any conclusions.


Dave Benham

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 03 Mar 2014 20:32
by Liviu
dbenham wrote:Yes, you rediscovered something jeb ran into back in 2011: viewtopic.php?p=8407#p8407

Thanks for the pointer. It's another proof of call's unpredictable clumsiness - equally fascinating, but still not quite the same thing. Those cases all revolve around parenthesized calls, and the odd effects of ||-&&-|-for-if therein. But, unless I missed it, there is no change in observed behavior if replacing the 'IF a==a echo aaa,' with 'IF not a==a echo aaa,' for example. Which is what I think is the interesting point of the case here - there is a distinct difference between 'if' flavors, as a strong indication that the early 1st pass 'call' parsing does its own "pre-determination" on what the 2nd pass should expect and needs to resolve, down to command specific details like the particular 'if' syntax, and not just the primary token or if-vs-for or internal-vs-external.

Liviu

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 05 Mar 2014 12:16
by penpen
I've thought more about that:
If lexer and parser are more interconnected than thought, it could be
that the two call problems ("if", "()") are caused by different things
but still show the same thing in both cases:
If the tokens are organized in a tree (default: parse tree), then it could be that the token type of the first child token is displayed as a character:
In the case of "()":
- token type "FOR" = 0x29 = ')'
- token type "IF" = 0x2A = '*'
- token type "&" = 0x2D = '-'
- token type "||" = 0x2E = '.'
- token type "&&" = 0x2F = '/'
- token type "|" = 0x30 = '0'
In case of if (only recognized as tokens within if):
- token type "CmdExtVersion" = 0x32 = '2'
- token type "ErrorLevel" = 0x33 = '3'
- token type "defined" = 0x34 = '4'
- token type "exist" = 0x35 = '5'
- token type "NOT" = 0x36 = '6'
- token type "==" = 0x37 = '7'
- token type "EQU", "NEQ", "LSS", "LEQ", "GTR", "GEQ" = 0x38 = '8'

This would also explain, why '6' is returned on all "if not" variations.

penpen

Edit1-2: I've corrected the ascii code for character '0' and the related position in the above list.

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 05 Mar 2014 17:23
by Liviu
penpen wrote:If lexer and parser are more interconnected than thought
Lexer and parser? My picture of the cmd source code is more like a bowl of spaghetti ;-)

penpen wrote:If the tokens are organized in a tree (default: parse tree), then it could be that the token type of the first child token is displayed as a character
That's a possibility, though I still believe that the display is completely accidental, and the only thing it shows is a stronger coupling between the 'call' re-parsing phases.

Anyway, it's not always one single character. Consider the following 'for' counterpart to the 'if' experiment.

Code: Select all

@echo off

(echo @echo %%~nx0:  %%*) >for.bat

echo( & echo [1] call for
call for

echo( & echo [2] call for %%%%?
call for %%%%?

echo( & echo [3] cmd /c call for %%%%? in ("") do ?
@cmd /c call for %%? in ("") do ?

echo( & echo [4] cmd /c call for /f %%%%? in ("") do ?
@cmd /c call for /f %%? in ("") do ?

echo( & echo [5] cmd /c call for /f %%%%? in ('') do ?
@cmd /c call for /f %%? in ('') do ?

echo( & echo [6] cmd /c call for /f "usebackq" %%%%? in ('') do ?
@cmd /c call for /f "usebackq" %%? in ('') do ?

echo( & echo [7] call for %%%%? in ("") do ?
call for %%%%? in ("") do ?

echo *** this line never reached ***

del for.bat
Output under Win7x64.sp1:

Code: Select all

[1] call for
 for was unexpected at this time.

[2] call for %%?

[3] cmd /c call for %%? in ("") do ?
for.bat:  %? in""

[4] cmd /c call for /f %%? in ("") do ?
for.bat:  %? in""

[5] cmd /c call for /f %%? in ('') do ?
for.bat:  %? in''

[6] cmd /c call for /f "usebackq" %%? in ('') do ?
for.bat:  %? in''

[7] call for %%? in ("") do ?
for.bat:  %? in""
What the above shows is that in the 'for' case (a) the command-line-buffer holds a multi-char string, (b) it is the same string for different 'for' flavors, and (c) the interpreter gets totally lost once calling the external .bat file, to the extent that the calling batch file ends unless those 'call for' are wrapped inside a 'cmd /c'.

Liviu

Re: For/if parsing under delayedExpansion vs. general rules

Posted: 21 Oct 2017 19:12
by penpen
The single characters indeed seem to be the token type of the first child of the actual parse tree node (same values):
http://www.dostips.com/forum/viewtopic.php?p=54448#p54448.

I just noticed that the computed parse tree seems to be a non-final version (which I think is very unusual and astounding):

Code: Select all

Z:\>enableDebug.bat
Z:\><nul set/p"="
Cmd: set/p"="  Type: 0 Redir:  0 <nul

Z:\><nul set /p "="
Cmd: set  Type: 0 Args: ` /p "="' Redir:  0 <nul

Z:\>


penpen

Edit: Added the last part.