Assembly language code "in-line" for Batch files!
Posted: 22 Feb 2015 01:22
After I read this post by penpen:
penpen wrote: I've once started to write an assembler using batch (actually i don't know if i will ever finish it)...
... I couldn't resist the temptation to write my own assembler in Batch! (I don't understand why this always happens to me.) However, in this case my objective was very specific: create a valid executable file in the simplest possible way from assembly source code placed inside the Batch file (a Batch-assembly hybrid!). To achieve this goal, I chose the simplest 16-bit instructions of the 80286 CPU and the straightforward MS-DOS .com file format, and set some limitations on the allowed assembly source code. Despite these restrictions, the assembly code is standard and can be assembled by any other assembler once all the required paraphernalia is added. The assembly code used in my Batch program is pretty simple. Here it is!
@echo off
rem BatchAsm.bat: Limited version of an x86 16-bit "in-line" assembler written in Batch
rem Antonio Perez Ayala
rem 2015/02/21 - First version
rem Example: Create example.com executable file
rem Defining the following variable activates the creation of the .lst listing file
setlocal
set .list=1
rem The name of the .com file, preceded by a colon, must appear after "goto" in
rem the "call :asm" line as shown below; the assembly source code starts at the next line
rem (TO DO: replace this method with a macro with one parameter ;)
call :asm example.com & goto :example.com
jmp start ;jumps over data area
CR EQU 13
LF EQU 10
EXCLAM EQU 33 ;Ascii code of "!"
text1 DB "Hello $"
text2 DB "World",EXCLAM,CR,LF
TEXT2_LEN EQU $-text2 ;length of previous string
PRINT_STRING EQU 9 ;DOS function
VIDEO_OUTPUT EQU 2 ;DOS function
TERMINATE_PROGRAM EQU 0 ;DOS function
start:
;Display a string terminated in "$" using DOS function 9
mov dx, OFFSET text1 ;DX -> text1
mov ah, PRINT_STRING ;AH = DOS function
int 21H ;show the DX->"string$"
;Display a string given its length via DOS function 2 and a loop
lea bx, text2 ;BX -> text2 (using LEA instead of OFFSET)
mov cx, TEXT2_LEN ;CX = number of chars
;
nextChar:
mov dl, [bx] ;DL = this char
inc bx ;advance BX to next char
mov ah, VIDEO_OUTPUT ;AH = DOS function
int 21H ;show the char
loop nextChar ;and repeat for CX chars
;Terminate program
mov al, 0 ;AL = errorlevel
mov ah, TERMINATE_PROGRAM ;AH = DOS function
int 21H ;terminate program
:example.com
rem The previous ":filename.com" line marks the end of the assembly source code
if errorlevel 1 echo Error in assembly & goto :EOF
echo Run example.com program:
example
goto :EOF
+===================================================+
| Assembler "in-line" (:asm subroutine) |
+===================================================+
:asm filename
setlocal EnableDelayedExpansion
set "_ascii= ^!"#$%%^&'()*+,-./0123456789:;^<=^>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^
^^^_`abcdefghijklmnopqrstuvwxyz{^|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ ¡¢£¤¥¦§¨©ª«¬^
®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ"
rem Define error messages
set "i=1"
for %%a in (
"_badAddr=Addressing mode not implemented or invalid: '%%_errVal%%'"
"_noMore=Code label can not include any additional element"
"_noLabel=Code label not found: '%%_errVal%%'"
"_farLabel=Code label '%%_errVal%%' too far %%_errVal2%% by %%_errVal3%% bytes"
"_badType=Data type in LABEL directive not implemented or invalid: '%%_errVal%%'"
"_notBothMem=Destination and source operands can not be both variables"
"_badSizes=Destination and source operands must have the same size"
"_notImmedD=Destination operand can not be a constant: '%%_errVal%%'"
"_noSize=Destination operand have no size: '%%_errVal%%'"
"_regD=Destination operand must be a register: '%%_errVal%%'"
"_notData=Destination label can not be a data variable: '%%_errVal%%'"
"_notYet=Instruction/directive not implemented: '%%_errVal%%'/'%%_errVal2%%'"
"_xchgOps=Instruction not implemented in this form; exchange the operands"
"_notByte=Operand can not be Byte size: '%%_errVal%%'"
"_notVar=Operand must be a data variable: '%%_errVal%%'"
"_notImmedS=Source operand can not be a constant: '%%_errVal%%'"
"_notRegS=Source operand can not be a register: '%%_errVal%%'"
"_syntax=Syntax error"
"_notDef=Undefined variable: '%%_errVal%%'"
) do (
for /F "tokens=1,2 delims==" %%b in (%%a) do (
set /A "i+=1, %%b=i"
set "errorMssg[!i!]=%%c"
)
)
rem Define op-codes for No operand (string) operations
for %%a in ( "cld=0xFC" "movsB=0xA4" "rep=0xF3"
"std=0xFD" "movsW=0xA5" "repNE=0xF2"
"cmpsB=0xA6" "repNZ=0xF2"
"cmpsW=0xA7" "repE=0xF3"
"stosB=0xAA" "repZ=0xF3"
"stosW=0xAB"
"lodsB=0xAC"
"lodsW=0xAD"
"scasB=0xAE"
"scasW=0xAF" ) do (
for /F "tokens=1,2 delims==" %%b in (%%a) do (
set /A "NoOperCode[%%b]=%%c"
)
)
rem Define op-codes for One operand operations: opCode+regPart
for %%a in ( "inc=0xFE+0" "not=0xF6+2" "pop=0x8F+0"
"dec=0xFE+1" "neg=0xF6+3" "push=0xFF+6"
"mul=0xF6+4"
"imul=0xF6+5"
"div=0xF6+6"
"idiv=0xF6+7" ) do (
for /F "tokens=1-3 delims==+" %%b in (%%a) do (
set /A "OneOperCode[%%b]=%%c, reg[%%b]=%%d"
)
)
rem Define op-codes for Two operand operations: reg/mem&reg/mem,reg/mem&immed+regPart
for %%a in ( "add=0x00,0x80+0" "test=0x84,0xF6+0" "xchg=0x86"
"or=0x08,0x80+1"
"and=0x20,0x80+4" "mov=0x88,0xC6+0" "lea=0x8D"
"sub=0x28,0x80+5"
"xor=0x30,0x80+6"
"cmp=0x38,0x80+7" ) do (
for /F "tokens=1-4 delims==,+" %%b in (%%a) do (
set /A "TwoOperCode[%%b]=%%c, TwoOperImmed[%%b]=%%d, dest_reg[%%b]=%%e" 2> NUL
)
)
rem Cancel errorlevel=1 from special cases
ver > NUL
rem Locate the start of the assembly code
set "_start="
for /F "delims=:" %%a in ('findstr /N ":%1" "%~F0"') do (
if not defined _start set "_start=%%a"
)
rem Assemble the code and generate auxiliary code-blocks
del %1 "%~N1.lst" 2> NUL
set /P "=Assembling." < NUL
set "pc=10000" // Program counter for processed input lines
set "$=256" // ORG 100H ;instruction pointer for object code
set "_errorCode="
for /F "usebackq skip=%_start% tokens=*" %%a in ("%~F0") do (
if /I "%%a" equ ":%1" goto :@F
set /A "pc+=1, pcMOD10=pc%%10"
if !pcMOD10! equ 0 set /P "=." < NUL
if defined .list set "[ !pc:~1! @ !$! ]= %%a"
rem Assemble some instructions individually: jmp, loop's, call, ret, int, aad, aam
call :%%a 2> NUL
rem Assemble the rest of instructions in groups: jCond's and by number of operands
if !errorlevel! equ 1 call :asmGroups %%a
if errorlevel 2 set "_errorCode=!errorlevel!" & set "_errorLine=%%a" & echo/ & goto asmEnd
)
:@F
echo/
rem Fix-up forward references
set /P "=Fixing labels." < NUL
set "_errorLine="
for /F "tokens=2,3 delims=[]=" %%a in ('set fixUpNear[ 2^>NUL') do (
set /P "=." < NUL
if not defined %%b set "_errorCode=%_noLabel%" & set "_errVal=%%b" & echo/ & goto asmEnd
set /A "_disp=%%b-![%%a]!"
set "[%%a]=!_disp!"
)
for /F "tokens=2,3 delims=[]=" %%a in ('set fixUpShort[ 2^>NUL') do (
set /P "=." < NUL
if not defined %%b set "_errorCode=%_noLabel%" & set "_errVal=%%b" & echo/ & goto asmEnd
for /F "tokens=1,2 delims=," %%c in ("![%%a]!") do (
set /A "_disp=%%b-%%d, _exceed=_disp-127"
if !_exceed! gtr 0 (
set "_errorCode=%_farLabel%"
set "_errVal=%%b"
set "_errVal2=ahead"
set "_errVal3=!_exceed!"
echo/
goto asmEnd
)
set "[%%a]=%%c,!_disp!"
)
)
echo/
rem Create the PutBytes.com auxiliary program, if it does not exist
if exist PutBytes.com goto :@F
setlocal DisableDelayedExpansion
set LF=^
%empty line 1/2%
%empty line 2/2%
< NUL (
set /P "=ë0¬<"tZ^<'tV^<0rG^<9wC,0Šàë,Í!€þ,tâë4ŠðŠÔŠç€úÿ"
setlocal EnableDelayedExpansion
set /P "=tì€ìüëç³q€ëd·f€ïd2ä°‚‹ðü뽬<0rØ<9wÔ,0Õ!LF!"
endlocal
set /P "=Šàëï3ÀÍ!¬<,t£ëõŠð¬:Ætò:ÃtêŠÐŠçÍ!ëï"
) > PutBytes.com
endlocal
:@F
rem Generate the executable code from auxiliary code-blocks
set /P "=Generating object code." < NUL
set "pc=0"
(for /F "tokens=2,3 delims=@]=" %%a in ('set [') do (
set /A "pc+=1, pcMOD20=pc%%20"
if !pcMOD20! equ 0 set /P "=." < NUL > CON
if "%%a" equ " Byte " (
PutBytes %%b
) else if "%%a" equ " Word " (
set "line="
for %%c in (%%b) do (
set /A "lowByte=(%%c&0xFF), highByte=(%%c&0xFF00)>>8"
set "line=!line!,!lowByte!,!highByte!"
)
PutBytes !line:~1!
)
)) > %1
echo/
echo File %1 created
:asmEnd
if not defined .list goto :@F
(
echo/
echo APA = %date% %time:~0,-3% = Assembly of %1 in "%~NX0"
echo/
echo/
echo [ line @ offset]= SOURCE LINE
echo [ line @ type ]=VALUES OF GIVEN TYPE
echo/
set [
call :checkError
echo/
set fixUp 2> NUL
echo/
set symbol 2> NUL
echo/
set sizeOf[ 2> NUL
echo/
) > "%~N1.lst"
:@F
:checkError
if defined _errorCode (
echo/
call echo ERROR: !errorMssg[%_errorCode%]!
if defined _errorLine echo at line %pc:~1%: "%_errorLine%"
exit /B 1
)
exit /B 0
======= Assemble a couple instructions individually ========
:aad
rem // Op-code of AAD
set /A "$+=2, code=0xD5, byte2=0x0A"
set "[ %pc:~1% @ Byte ]=%code%,%byte2%"
exit /B 0
:aam
rem // Op-code of AAM
set /A "$+=2, code=0xD4, byte2=0x0A"
set "[ %pc:~1% @ Byte ]=%code%,%byte2%"
exit /B 0
:aad16
rem // "Macro" equivalent to AAD with factor=16
set /A "$+=2, code=0xD5, byte2=0x10"
set "[ %pc:~1% @ Byte ]=%code%,%byte2%"
exit /B 0
:aam16
rem // "Macro" equivalent to AAM with divisor=16
set /A "$+=2, code=0xD4, byte2=0x10"
set "[ %pc:~1% @ Byte ]=%code%,%byte2%"
exit /B 0
======================================================
======= Assemble instructions grouped by type ========
======================================================
:asmGroups instruction
goto :noOper
The word after "@" in the code-blocks specifies the size: Byte or Word.
Each block contains a series of values of that size that will be used to generate
the object code. Only Byte size blocks may include strings. For example:
set block[ %pc% @ Byte ]=1,2,3,"String",13,10,0
set block[ %pc% @ Word ]=12345,6789,4321
======= Assemble no operand (string) instructions =======
cld, std, lodsB/W, stosB/W, movsB/W, cmpsB/W, scasB/W, rep, repE/Z, repNE/NZ
:noOper code
if not defined NoOperCode[%1] goto oneOper
set /A "$+=1"
set "[ %pc:~1% @ Byte ]=!NoOperCode[%1]!"
exit /B 0
======= Assemble one operand instructions =======
inc, dec, not, neg, mul, imul, div, idiv, push, pop
:oneOper code oper
if not defined OneOperCode[%1] goto twoOper
if "%~2" equ "" exit /B %_syntax%
rem Value required in :addressingMode for PUSH immed instruction
if /I %1 equ PUSH set "dest_w=1"
call :addressingMode %2 & if errorlevel 2 exit /B !errorlevel!
if defined immed goto pushImmed
for %%a in (PUSH POP) do if /I %1 equ %%a if %w% neq 1 (
set "_errVal=%2" & exit /B %_notByte%
)
set /A "$+=2, code=OneOperCode[%1]|w, byte2=(mod<<6) | (reg[%1]<<3) | r_m"
set "[ %pc:~1% @ Byte ]=%code%,%byte2%"
if defined disp set /A "$+=2" & set "[ %pc:~1% @ Word ]=%disp%"
goto :@F
:pushImmed
set /A "$+=3"
set "[ %pc:~1% @ Byte ]=0x68" // Op-code of PUSH immed16
set "[ %pc:~1% @ Word ]=%immed%"
:@F
exit /B 0
======= Assemble two operands instructions =======
mov, lea, xchg, add, sub, and, or, xor, cmp, test
:twoOper code dest,source
if not defined TwoOperCode[%1] goto jCond
if "%~3" equ "" exit /B %_syntax%
set "dest_w="
call :addressingMode %2 dest_ & if errorlevel 2 exit /B !errorlevel!
if /I "%~3" neq "OFFSET" (
call :addressingMode %3 source_ & if errorlevel 2 exit /B !errorlevel!
) else (
call :checkVar %4 & if errorlevel 2 exit /B !errorlevel!
set "source_immed=!%4!"
)
if defined dest_reg ( rem twoOper reg,...
if not defined source_immed ( rem twoOper reg,reg or reg,mem
set "d=1" // op1=dest = reg, op2=source = mod+r_m
rem Check special cases
if /I %1 equ LEA (
if defined source_reg set "_errVal=%~3" & exit /B %_notRegS%
if "%dest_w%" equ "0" set "_errVal=%2" & exit /B %_notByte%
set "d=0"
) else (
if defined source_w if "%dest_w%" neq "%source_w%" exit /B %_badSizes%
if /I %1 equ TEST set "d=0"
)
set /A "$+=2, code=TwoOperCode[%1] | (d<<1) | dest_w"
set /A "byte2=(source_mod<<6) | (dest_reg<<3) | source_r_m"
set "[ %pc:~1% @ Byte ]=!code!,!byte2!"
if defined source_disp ( rem twoOper reg,mem
if /I %1 equ XCHG exit /B %_xchgOps%
set /A "$+=2"
set "[ %pc:~1% @ Word ]=%source_disp%"
)
) else ( rem twoOper reg,immed
if not defined TwoOperImmed[%1] set "_errVal=%~3" & exit /B %_notImmedS%
set /A "$+=2, code=TwoOperImmed[%1] | dest_w"
set /A "byte2=(dest_mod<<6) | (dest_reg[%1]<<3) | dest_r_m"
set "[ %pc:~1% @ Byte ]=!code!,!byte2!"
if "%dest_w%" equ "0" ( rem Dest is Byte
set /A "$+=1"
set "[ %pc:~1% @ Byte ]=![ %pc:~1% @ Byte ]!,%source_immed%"
) else ( rem Dest is Word
set /A "$+=2"
set "[ %pc:~1% @ Word ]=%source_immed%"
)
)
) else ( rem twoOper mem,...
if not defined source_immed ( rem twoOper mem,reg or mem,mem
set "d=0" // op1=dest = mod+r_m, op2=source = reg
if defined source_reg ( rem twoOper mem,reg
if /I %1 equ LEA set "_errVal=%2" & exit /B %_regD%
if /I %1 equ TEST exit /B %_xchgOps%
if defined dest_w if "%dest_w%" neq "%source_w%" exit /B %_badSizes%
set /A "$+=4, code=TwoOperCode[%1] | dest_w"
set /A "byte2=(dest_mod<<6) | (source_reg<<3) | dest_r_m"
set "[ %pc:~1% @ Byte ]=!code!,!byte2!"
set "[ %pc:~1% @ Word ]=%dest_disp%"
) else ( rem twoOper mem,mem
exit /B %_notBothMem%
)
) else ( rem twoOper mem,immed
if not defined TwoOperImmed[%1] set "_errVal=%~3" & exit /B %_notImmedS%
if not defined dest_w set "_errVal=%2" & exit /B %_noSize%
set /A "$+=6, code=TwoOperImmed[%1] | dest_w"
set /A "byte2=(dest_mod<<6) | (dest_reg[%1]<<3) | dest_r_m"
set "[ %pc:~1% @ Byte ]=!code!,!byte2!"
if "%dest_w%" equ "0" (
rem Dest is Byte: pad the last Word with a NOP after the 8-bits constant
set /A "source_immed+=0x90<<8"
)
set "[ %pc:~1% @ Word ]=%dest_disp%,!source_immed!"
)
)
exit /B 0
======= Assemble transfer instructions =======
Jcond instructions by related group
jmp, jcxz, loop, call, ret and int individually
:jCond Jcond [SHORT] label
set "cond=%1"
if /I "%cond:~0,1%" neq "J" goto asmLabel
set "cond=0"
for %%a in ( JO JNO JB JNB JZ JNZ JBE JNBE JS JNS JP JNP JL JNL JLE JNLE
_ _ JNAE JAE JE JNE JNA JA _ _ JPE JPO JNGE JGE JNG JG
_ _ JC JNC ) do (
if /I "%1" equ "%%a" set /A "cond&=0x0F" & goto :@F
set /A cond+=1
)
goto asmLabel
:@F
shift
if /I "%1" neq "SHORT" (
rem // Op-code of Jcond disp16 = Near
set /A "$+=1, code=0x0F, byte2=0x80+cond"
set "[ %pc:~1% @ Byte ]=!code!,!byte2!"
goto jmpNearTail
) else (
rem // Op-code of Jcond disp8 = Short
set /A "code=0x70+cond"
shift
goto jmpShortTail
)
:jmp [SHORT] label
if /I "%1" neq "SHORT" (
rem // Op-code of JMP disp16 = Near Direct
set /A "code=0xE9"
set "[ %pc:~1% @ Byte ]=!code!"
goto jmpNearTail
) else (
rem // Op-code of JMP disp8 = Short Direct
set /A "code=0xEB"
shift
goto jmpShortTail
)
:call label
rem // Op-code of CALL disp16 = Near Direct
set /A "code=0xE8"
set "[ %pc:~1% @ Byte ]=%code%"
:jmpNearTail label
set /A "$+=3"
if defined %1 (
if defined sizeOf[%1] set "_errVal=%1" & exit /B %_notData%
set /A "disp=%1-$"
) else (
set /A "disp=$"
set "fixUpNear[ %pc:~1% @ Word ]=%1"
)
set "[ %pc:~1% @ Word ]=%disp%"
exit /B 0
:loop label
rem // Op-code of LOOP disp8 = Short
set /A "code=0xE2" & goto jmpShortTail
:loopE label
:loopZ label
rem // Op-code of LOOPE/Z disp8 = Short
set /A "code=0xE1" & goto jmpShortTail
:loopNE label
:loopNZ label
rem // Op-code of LOOPNE/NZ disp8 = Short
set /A "code=0xE0" & goto jmpShortTail
:jcxz label
rem // Op-code of JCXZ
set /A "code=0xE3"
:jmpShortTail label
set /A "$+=2"
if defined %1 (
if defined sizeOf[%1] set "_errVal=%1" & exit /B %_notData%
set /A "disp=%1-$, _exceed=-(128+disp), disp&=0xFF"
if !_exceed! gtr 0 (
set "_errVal=%1"
set "_errVal2=behind"
set "_errVal3=!_exceed!"
exit /B %_farLabel%
)
) else (
set /A "disp=$"
set "fixUpShort[ %pc:~1% @ Byte ]=%1"
)
set "[ %pc:~1% @ Byte ]=%code%,%disp%"
exit /B 0
:ret
rem // Op-code of RET = Near
set /A "$+=1, code=0xC3"
set "[ %pc:~1% @ Byte ]=%code%"
exit /B 0
:int intNum
set "intNum=%1"
if /I "%intNum:~-1%" equ "H" set /A "intNum=0x%intNum:~0,-1%"
rem // Op-code of INT
set /A "$+=2, code=0xCD"
set "[ %pc:~1% @ Byte ]=%code%,%intNum%"
exit /B 0
======= Assemble code labels and EQU, LABEL, DB and DW directives ========
:asmLabel codeLabel: | constLabel EQU value |
:: dataLabel LABEL {BYTE|WORD} | [dataLabel] {DB|DW} list,of,values
set "_label=%1"
if "%_label:~-1%" neq ":" goto checkEQU
:codeLabel
set "%_label:~0,-1%=%$%"
if defined .list set "[ %pc:~1% @ Label]=%$%" & set "symbol %_label:~0,-1% = %$%"
if "%~2" neq "" exit /B %_noMore%
exit /B 0
:checkEQU
if /I "%~2" neq "EQU" goto checkLABEL
if "%~3" equ %3 set _value="%~3"& goto if _value is char
set "_value=%3"
if "%_value:~0,1%%_value:~-1%" neq "''" goto else
:if _value is char
set "_char=!_value:~1,1!"
for /L %%i in (0,1,223) do if "!_char!" equ "!_ascii:~%%i,1!" set /A "_value=%%i+32"
goto endif
:else
if /I "%_value:~-1%" equ "H" set "_value=0x%_value:~0,-1%"
:endif
set /A "%_label%=%_value%"
if defined .list set "[ %pc:~1% @ Const]=!%_label%!" & set "symbol %_label% = !%_label%!"
exit /B 0
:checkLABEL
if /I "%~2" neq "LABEL" goto checkDW
set "%_label%=%$%"
if defined .list set "symbol %_label% = %$%"
if /I "%~3" equ "BYTE" set "sizeOf[%_label%]=1"
if /I "%~3" equ "WORD" set "sizeOf[%_label%]=2"
if not defined sizeOf[%_label%] set "_errVal=%3" & exit /B %_badType%
exit /B 0
:checkDW
for %%a in (DB DW) do if /I "%_label%" equ "%%a" set "_label="
if defined _label (
set "%_label%=%$%"
if defined .list set "symbol %_label% = %$%"
shift
)
set "_block="
if /I "%~1" neq "DW" goto checkDB
if defined _label set "sizeOf[%_label%]=2"
:nextW
shift
set "_value=%~1"
if not defined _value set "[ %pc:~1% @ Word ]=%_block:~1%" & exit /B 0
if /I "%_value:~-1%" equ "H" set "_value=0x%_value:~0,-1%"
set /A "_value=%_value%"
set "_block=%_block%,%_value%"
set /A $+=2
goto nextW
:checkDB
if /I "%~1" neq "DB" set "_errVal=%_label%" & set "_errVal2=%1" & exit /B %_notYet%
if defined _label set "sizeOf[%_label%]=1"
:nextB
shift
if "%~1" equ "" set "[ %pc:~1% @ Byte ]=!_block:~1!" & exit /B 0
if "%~1" equ %1 set _value="%~1"& goto if _value is string
set "_value=%1"
if "%_value:~0,1%%_value:~-1%" neq "''" goto else
:if _value is string
set _block=!_block!,"!_value:~1,-1!"
set "_len=0"
for /L %%i in (5,-1,0) do (
set /A "_newLen=_len+(1<<%%i)"
for %%j in (!_newLen!) do if "!_value:~%%j,1!" neq "" set "_len=!_newLen!"
)
set /A "$+=_len-1"
goto endif
:else
if /I "!_value:~-1!" equ "H" set "_value=0x!_value:~0,-1!"
set /A "_value=(%_value%)&0xFF"
set "_block=!_block!,%_value%"
set /A "$+=1"
:endif
goto nextB
======= Auxiliary subroutine that identifies the addressing mode of an operand
Parameters: operand returnPrefix
Returns values in variables with the given return prefix and these names:
"mod", "reg", "r_m" and "w"
also "disp" if the operand includes "var+const", "var" or "+const"
or "immed" if the operand is *just* "const"
:addressingMode operand returnPrefix=
setlocal EnableDelayedExpansion
set "reg="
set "r_m="
set "disp="
set "immed="
rem Identify if operand is a CPU register
set /A mod=3, w=0, i=0
for %%a in (AL CL DL BL AH CH DH BH) do (
if /I "%~1" equ "%%a" set /A "reg=r_m=i" & goto modeOK
set /A i+=1
)
set /A w=1, i=0
for %%a in (AX CX DX BX SP BP SI DI) do (
if /I "%~1" equ "%%a" set /A "reg=r_m=i" & goto modeOK
set /A i+=1
)
set "w="
rem Check if operand is enclosed in quotes
set "_char="
if "%~1" equ %1 set _operand=%1& set "_char=!_operand:~1,1!" & goto else_Operand_is_const
rem Check operand with this format: var[base+index]+const
set "_operand=%1"
if "%_operand:[=%" equ "%_operand%" goto checkVarConst
rem Operand has the [base+index] part
for /F "tokens=1-3 delims=[]" %%a in ("{%_operand%}") do (
set "_var=%%a" & set "base_index=[%%b]" & set "_const=%%c"
)
set "_var=%_var:~1%" & set "_const=%_const:~0,-1%"
rem Identify the [base+index] part
set i=0
for %%a in ([BX+SI] [BX+DI] [BP+SI] [BP+DI] [SI] [DI] [BP] [BX]) do (
if /I "%base_index%" equ "%%a" set "r_m=!i!" & goto :@F
set /A i+=1
)
endlocal & set "_errVal=%base_index%" & exit /B %_badAddr%
:@F
if "%_var%%_const%" neq "" goto :checkDisp
rem Operand is [base+index] with no disp
if /I "%base_index%" neq "[BP]" (
rem Standard cases
set "mod=0"
) else (
rem Special case: [BP] with no disp, insert a disp16=0
set /A "disp=0, mod=2"
)
goto modeOK
:checkDisp
rem Operand is [base+index] with disp (var+const)
if defined _var call :checkVar %_var%
if errorlevel 2 endlocal & set "_errVal=%_errVal%" & exit /B %errorlevel%
set /A "disp=%_var%%_const%, mod=2"
goto modeOK
:checkVarConst
rem Operand has no base_index part: it is var+const, var or const
if "!_operand:~0,1!" equ "-" goto else
for %%s in (+ -) do if "!_operand:%%s=!" neq "%_operand%" set "_sign=%%s" & goto if defined _sign
goto else
:if defined _sign
rem Operand is var+const
for /F "delims=%_sign%" %%a in ("%_operand%") do (
call :checkVar %%a
if errorlevel 2 for /F %%b in ("!_errVal!") do endlocal & set "_errVal=%%b" & exit /B !errorlevel!
)
set /A "disp=%_operand%, r_m=6, mod=0"
goto endif
:else
call :checkVar %_operand% > NUL & if errorlevel 2 goto else_Operand_is_const
:if not errorlevel 2
rem Operand is var
set /A "disp=%_operand%, r_m=6, mod=0"
goto endif
:else_Operand_is_const
rem If it is the first operand, it is wrong (except in PUSH immed)
if not defined dest_w endlocal & set "_errVal=%1" & exit /B %_notImmedD%
if "%_operand:~0,1%%_operand:~-1%" equ "''" set "_char=%_operand:~1,1%"
if defined _char (
for /L %%i in (0,1,223) do if "!_char!" equ "!_ascii:~%%i,1!" set /A "_operand=%%i+32"
) else (
if /I "!_operand:~-1!" equ "H" set "_operand=0x!_operand:~0,-1!"
set "_max=0xFF"
if "%dest_w%" equ "1" set "_max=0xFFFF"
set /A "_operand=(!_operand!)&_max"
)
set "immed=!_operand!"
:endif
:endif
:modeOk
(
endlocal
for %%a in ("mod=%mod%" "reg=%reg%" "r_m=%r_m%" "w=%w%" "disp=%disp%" "immed=%immed%") do set "%2%%~a"
)
exit /B 0
======= Auxiliary subroutine that checks whether a data variable exists
:checkVar var
if "%~1" equ "" exit /B %_syntax%
if defined %1 (
if defined sizeOf[%1] (
set /A "w=sizeOf[%1]-1"
) else (
set "_errVal=%1" & exit /B %_notVar%
)
) else (
set "_errVal=%1" & exit /B %_notDef%
)
exit /B 0
Of course, this program can only generate .com files, which will not run on 64-bit versions of Windows. However, the most important aspect of Batch assembler is that it puts assembly language topics within reach of Batch file programmers, in the same way that other Batch file "chimeras" did with other languages (JScript, VBS, PowerShell, mshta, jscript.net, etc.), so interested users can do further research on this point and even adopt some assembly practices in their Batch files (like I did with the ":@F" forward repeated label). Batch assembler can be used as an educational tool, designed for Batch file programmers, to learn assembly language basics (not just for you, Ed!).
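For readers new to instruction encoding, here is a minimal standalone sketch (not part of BatchAsm.bat) of how the listing appears to encode "mov ah, 9". The immediate form of MOV uses op-code 0xC6 OR'ed with the size bit w, followed by a second byte built from the mod, reg and r/m fields with the same (mod<<6) | (reg<<3) | r_m expression used in :twoOper. The variable names below are chosen just for this illustration.

```batch
@echo off
setlocal
rem Hypothetical standalone check; it does not call BatchAsm.bat.
rem Fields for "mov ah, 9" in the reg/mem,immed form:
rem   mod=3 (the operand is a register)
rem   reg=0 (the /0 extension of MOV immed)
rem   r_m=4 (AH is the fifth 8-bit register: AL CL DL BL AH ...)
rem   w=0   (Byte size)
set /A "mod=3, reg=0, r_m=4, w=0"
set /A "code=0xC6 | w, byte2=(mod<<6) | (reg<<3) | r_m"
rem Show op-code, ModR/M byte and the 8-bit constant, in decimal
echo %code%,%byte2%,9
```

This should display 198,196,9, that is, the bytes C6 C4 09 in hexadecimal. Other assemblers usually pick the shorter B4 09 form for the same instruction; both encodings are valid.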
I tested Batch assembler on Win XP and 32-bit Win 8. I assembled a few of my large old DOS programs and correctly generated .com files of a little less than 1 KB in size. However, I did not test all possible instruction/operand combinations, so certain specific forms may have errors. If you find a bug in Batch assembler, please report it (remember that the PTR operator is not yet implemented).
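When a jump or loop seems to land in the wrong place, the displacement can be checked by hand. The sketch below uses made-up offsets and mirrors what :jmpShortTail appears to do: the 8-bit displacement is the target offset minus the offset of the instruction that follows the 2-byte jump, it must fit in the range -128..127, and it is stored as a two's complement byte.

```batch
@echo off
setlocal
rem Hypothetical offsets, for illustration only: the label is at 0x0110 and
rem the 2-byte LOOP instruction starts at 0x0116, so "next" is 0x0118.
set /A "target=0x0110, next=0x0116+2"
rem Signed displacement, and the byte that would be stored after the op-code
set /A "disp=target-next, byte=disp & 0xFF"
if %disp% lss -128 echo Label too far behind & exit /B 1
if %disp% gtr 127 echo Label too far ahead & exit /B 1
echo disp=%disp% byte=%byte%
```

With these values the output should be "disp=-8 byte=248", so the LOOP would be emitted as E2 F8.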
As I usually do in "proof of concept" projects like this one, the first version of this program has very limited error checking. There are multiple situations that may crash the program, but if you write correct code you should obtain correct results (unless there is a bug!). Of course, more extensive error checking and more features can be added (making the program larger and slower), but I think that investing more effort in a program that can only generate .com files is just not worth it (unless new horizons open up).
I got a lot of pleasure out of writing Batch and assembly code in the same file! I hope you enjoy my Batch assembler program in the same way.
Antonio