I strongly considered hard-coding pre-computed constants for all the possible multiplications, but it turns out that would ahve required six lookup tables with 255 values each to cover multiplying by 2, 3, 9, 11, 13 and 14. Even I draw the line at hand typing over 1500 "set /a thing[n]=0xValue" entries, so I had no choice but to fnd an algorithm and code it.
It turns out the code ended up pretty small:
Code: Select all
::define 'GaloisMult' macro
::takes two multiplicands as inputs plus a variable name in which to store the product over the Rijndael Galois field
set macro.GaloisMult=do (%\n%
setlocal enableDelayedExpansion%\n%
set /a "multA=(%%~a)"%\n%
set /a "multB=(%%~b)"%\n%
set "product=0"%\n%
set "carry=0"%\n%
for /l %%n in (1,1,8) do (%\n%
set /a "lsbtest=!multB!^&0x01"%\n%
if !lsbtest!==1 set /a "product=!product!^^!multA!"%\n%
if !product! GTR 255 set /a "product=!product!-256"%\n%
set /a "carry=!multA!^&0x80"%\n%
set /a "multA=!multA!<<1"%\n%
if !multA! GTR 255 set /a "multA=!multA!-256"%\n%
if not !carry!==0 set /a "multA=!multA!^^0x1b"%\n%
set /a "multB=!multB!>>1"%\n%
)%\n%
for %%v in (!product!) do endlocal^&set "%%~c=%%v"%\n%
)
That macro does what it says, and I spot-checked it against a reasonable number of values from online pre-computed tables, so it should be good to go. It could be optimized by breaking out of the FOR loop early if either multiplicand reaches 0, but I couldn't figure out how to break a FOR loop that is part of a macro definition? Does anyone have any tips there?
I tried adding "if !multA!==0 for %%v in (!product!) do endlocal^&set "%%~c=%%v"%\n%" at the top, along with the same check against MultB, and it "seems" to work (the correct product comes back) but I get a bunch of "missing operand" errors as the loop continues to expand in the background instead of actually breaking. I tried to append "^& exit" to the above line, which closes my whole CMD window, and I tried making a label just above the product return at the bottom and doing a GOTO at the top of the loop when A or B are 0, but that doesn't break at all and instead I get "setlocal has reached it maximum level of recursion" (whoops! )
Optimizing the macro by breaking early isn't exactly life or death, I just figure if we have to run all 16 elements of the state through 8 loop iterations per round, over 10 rounds, thats ~1300 loop iterations. And that's per every 128-bits of data we want to encrypt, so the number of loops could add up quick... It would be nice if the bytes that only needed 1-2 loops before hitting 0 would break early instead of wasting time doing 6-7 more empty loops. If anyone has any ideas I'd love to hear them. Thanks!