@Magialisk:
Yes, all operations are simpler if pre-computed constants are used, and they allow some interesting tricks, like these:
Code:
Core part of MixColumns():
for %%m in ("0=G2 G3 + +" "1=+ G2 G3 +" "2=+ + G2 G3" "3=G3 + + G2") do (
Core part of InvMixColumns():
for %%m in ("0=Ge Gb Gd G9" "1=G9 Ge Gb Gd" "2=Gd G9 Ge Gb" "3=Gb Gd G9 Ge") do (
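For readers comparing these strings with FIPS-197: each quoted string appears to encode one row of the MixColumns (or InvMixColumns) matrix, with `+` standing for a coefficient of 01 (a plain XOR) and `G2`, `G3`, `Ge`, `Gb`, `Gd`, `G9` presumably naming pre-computed GF(2^8) multiplication tables. That reading is an assumption based on the strings alone; the matrices themselves are the standard ones:

```latex
\text{MixColumns: }
\begin{pmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{pmatrix}
\qquad
\text{InvMixColumns: }
\begin{pmatrix}
0e & 0b & 0d & 09 \\
09 & 0e & 0b & 0d \\
0d & 09 & 0e & 0b \\
0b & 0d & 09 & 0e
\end{pmatrix}
```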
Let's move on to the multi-process feature. The basic idea of this method is that each parallel process runs on a different CPU core, so there will be as many active processes as CPU cores. The controller process takes one CPU core, so there will be (number of cores)-1 parallel encryption processes; let's call this number N. In the description below, assume N=3 and numRecords=9 in the file.
Note that the "controller process" is the BinToHex subroutine itself, so this CPU core is not wasted on control alone: it performs both the hexadecimal conversion and the multi-process control.
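As a minimal sketch of how N could be obtained, assuming the standard Windows variable NUMBER_OF_PROCESSORS is an acceptable stand-in for the number of cores (it counts logical processors, which may differ from physical cores). The fallback to 1 on a single-core machine is my own assumption, not part of the method described above:

```batch
@echo off
setlocal
rem NUMBER_OF_PROCESSORS is set by Windows; it counts logical processors.
rem One of them is reserved for the controller (the BinToHex side).
set /A N=NUMBER_OF_PROCESSORS-1
rem Assumption: on a single-core machine still run one encryption process.
if %N% lss 1 set N=1
echo Parallel encryption processes: %N%
```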
My first idea was to split the hexadecimal file into N parts in the BinToHex subroutine and pass each part to a different encryption process as soon as it is ready. As its "multi-process controller", this method just needs a START command each time numRecords/N records have been converted.
Code:
file.ext -> BinToHex -> filePart#.hex -> Encrypt# -> filePart#.aes -> JoinParts -> file.aes
1111 -> \
2222 -> \
3333 -> ---> -> filePart1.hex -> Encrypt1 -> filePart1.aes -> ren filePart1.aes file.aes
4444 -> \
5555 -> \
6666 -> ---> -> filePart2.hex -> Encrypt2 -> filePart2.aes -> type filePart2.aes >> file.aes
7777 -> \
8888 -> \
9999 -> ---> -> filePart3.hex -> Encrypt3 -> filePart3.aes -> type filePart3.aes >> file.aes
However, in this method the last available CPU core is not used until the last hex part (that is, the whole file) has been generated, and when the first encryption process ends its CPU core is no longer used, so this method wastes a lot of the available CPU time.
A more efficient method consists of passing each converted hexadecimal record to the next encryption process in turn, so all processes are active as soon as N records have been converted. That is:
Code:
file.ext -> BinToHex -> filePart#.hex -> Encrypt# -> file.aes
1111 -> >> filePart1.hex -> Encrypt1 -> ae11
2222 -> >> filePart2.hex -> Encrypt2 -> ae22
3333 -> >> filePart3.hex -> Encrypt3 -> ae33
4444 -> >> filePart1.hex -> Encrypt1 -> ae44
5555 -> >> filePart2.hex -> Encrypt2 -> ae55
6666 -> >> filePart3.hex -> Encrypt3 -> ae66
7777 -> >> filePart1.hex -> Encrypt1 -> ae77
8888 -> >> filePart2.hex -> Encrypt2 -> ae88
9999 -> >> filePart3.hex -> Encrypt3 -> ae99
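The round-robin assignment in the diagram above comes down to a modulo operation. A small sketch, using the 1-based part numbers of the diagram (the variable name `part` is only illustrative):

```batch
@echo off
setlocal EnableDelayedExpansion
set N=3
rem Record i goes to part ((i-1) mod N) + 1, so parts repeat 1,2,3,1,2,3,...
for /L %%i in (1,1,9) do (
   set /A "part=(%%i-1) %% N + 1"
   echo Record %%i -^> filePart!part!.hex
)
```

With 0-based record and part numbers the formula reduces to i mod N.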
The simplest way to coordinate this scheme is via "pseudo-pipe" files: the BinToHex routine appends records to the next file part in turn, and each encryption routine just reads them via a redirected SET /P command. The key for this to work is that there is always a record available in each "pipe" file when its encryption process reads it (that is, the pipe buffer never empties); this is achieved if the time required to convert N records in the BinToHex subroutine is less than or equal to the time each encryption subroutine requires to encrypt one record. If a given computer has too many CPU cores, so that the first core requires more time to convert N records than each core requires to encrypt one record, then some CPU cores must necessarily be wasted: it is not possible to encrypt a file in less time than the time required to convert it! The maximum number of CPU cores that this method can use may be obtained via separate timing tests of the convert and encrypt routines, followed by the division timeEncrypt/timeConvert.
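The timeEncrypt/timeConvert estimate can be sketched as follows. The two timings are made-up placeholder values in centiseconds per record, not measurements; in practice they would come from the separate timing tests:

```batch
@echo off
setlocal
rem Hypothetical per-record timings in centiseconds (placeholders only).
set /A "timeConvert=100, timeEncrypt=300"
rem Integer division: the largest N for which converting N records
rem takes no longer than encrypting one record.
set /A maxN=timeEncrypt/timeConvert
echo At most %maxN% parallel encryption processes can be kept busy
```

Since SET /A only does integer arithmetic the result is rounded down, which is the safe direction here: rounding up would let the pipe files run empty.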
The Batch code below is an example of this method; it may be used as a general template for any application that follows this scheme: process a file via a fast part followed by a slow one that may be divided/repeated in several parallel processes using "multi pipe files". In the example below the time that the write part (convert) and the read part (encrypt) take to process one record is simulated via a ping command, so you can review how the behavior of the method changes as these times are adjusted. In this case the chosen ping values for the write and read parts, 2 and 4 respectively, produce a perfect synchronization (ping -n 2 waits about 1 second and ping -n 4 about 3 seconds, so with N=3 converting N records takes as long as encrypting one): each record is read (encrypted) as soon as it has been written (converted), so the encryption process ends just instants after the conversion process ends. Of course, in a real application you cannot adjust the time each routine takes; you can only change the number N of parallel processes.
EDIT: The following paragraph and the code modification were added after Magialisk's comment that "this part can not be optional".
The final file is generated by redirecting the output of each encrypted record to the same output file. This may cause problems if two parallel processes try to write to the output file at the same time. This is solved via a "semaphore file" called "outputTurn.#" that each parallel process renames to the next output turn.
Code:
@echo off
setlocal EnableDelayedExpansion
rem Multi-process dispatcher:
set "param=%~1"
if "!param:~0,1!" equ ":" (
shift
goto !param:~1!
)
set N=3
echo %time% - Write side running > output.txt
rem Write side (fast):
del pipefile?.txt 2>NUL
echo X >outputTurn.0
for /L %%i in (0,1,14) do (
echo !time! - Write: Record %%i >> output.txt
set /A nextTurn=%%i %% N
echo Record %%i >> pipefile!nextTurn!.txt
ping -n 2 localhost >NUL
rem Multi-process starter:
if %%i lss %N% start "" /B "%~F0" :ReadSide !nextTurn!
)
echo %time% - Write records end >> output.txt
rem Wait until every read side has finished and deleted its pipe file:
:waitForReadSides
if exist pipefile?.txt goto waitForReadSides
echo %time% - Write side end >> output.txt
del outputTurn.*
type output.txt
goto :EOF
rem Read side (slow):
:ReadSide turn
echo %time% - Read side turn %1 - started >> output.txt
call :nextRead %1 < pipefile%1.txt
echo %time% - Read side turn %1 - end >> output.txt
del pipefile%1.txt
exit
:nextRead turn
set "line="
set /P line=
if not defined line goto endRead
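rem Wait until the semaphore file shows that it is this turn to write:
if not exist outputTurn.%1 call :waitMyTurn %1
echo !time! - Read turn %1: %line% >> output.txt
rem Pass the output turn to the next read process:
set /A nextTurn=(%1+1) %% N
ren outputTurn.%1 outputTurn.%nextTurn%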
ping -n 4 localhost >NUL
goto nextRead
:endRead
exit /B
:waitMyTurn
if not exist outputTurn.%1 goto waitMyTurn
exit /B
I included this method in the last edit of my AES FIPS-197 encryption program above (except for the part about the "outputTurn.#" file mentioned in the edit); however, I did not fully test it because my computer has just 2 CPU cores.
Antonio