Processing text files that have lines of unlimited size!
Posted: 29 Sep 2013 00:48
In this SO topic the OP asked how to process text files of about 30-40 KB that contain no newlines, splitting their contents into lines at each "$" character. A file of that size cannot be processed with the FOR /F command (which can read lines of up to 8 KB) or with the SET /P command (which can read lines of up to 1 KB).
After some tests I discovered that when a SET /P command reaches its maximum size and returns the characters read, it does NOT move the file pointer of the input file to the next line; this means that the next SET /P command over the same file (redirected to a code block) continues reading at the following characters. This makes it possible to read the contents of a text file in blocks of 1 KB and to accumulate and process them in any way, as long as the result fits in a variable, that is, it is no longer than 8 KB.
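As a minimal illustration of that behavior (a sketch only, assuming an input.txt with no newlines; the count of 3 blocks is arbitrary), several SET /P commands that share one redirection each return the next block of the file:

```batch
@echo off
setlocal EnableDelayedExpansion

rem All SET /P commands inside the parentheses share the same redirected file,
rem so each one continues at the point where the previous one stopped.
(for /L %%i in (1,1,3) do (
   set "block="
   set /P "block="
   rem If the file is already exhausted, block stays undefined and nothing is shown
   if defined block echo Block %%i: !block!
)) < input.txt
```

If the file is shorter than three blocks, the remaining SET /P commands read nothing and their blocks are skipped.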
Code:
@echo off
setlocal EnableDelayedExpansion

set nextBlock=0
set "thisBlock="

:nextBlock
rem Read the next block of characters
rem The file is re-read from the start and only the last block read is kept
set /A nextBlock+=1
(for /L %%i in (1,1,%nextBlock%) do (
   set "block="
   set /P "block="
)) < input.txt
rem No more characters: end of file
if not defined block goto endFile

rem Append next block to last part of previous block
set "thisBlock=!thisBlock!!block!"

rem Process current block: separate lines at "$" character
:nextLine
for /F "tokens=1* delims=$" %%a in ("!thisBlock!") do (
   set "nextLine=%%a"
   set "thisBlock=%%b"
)
if defined thisBlock (
   echo !nextLine!
   goto nextLine
)
rem The last part may be incomplete: keep it for the next block
set "thisBlock=!nextLine!"
goto nextBlock

:endFile
rem Show the last line, if any
if defined thisBlock echo !thisBlock!
The previous program is not efficient because, every time it needs the next block, it reads again all the data already processed, but it is an example of how to do it. This method can be modified in order to read the file just once.
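One possible way to read the file just once is sketched below. It is not part of the original program and has not been tested here. It assumes that a redirection placed on a CALL command stays in effect for the whole subroutine, so that the GOTO loop inside it keeps reading from the same open file. The label name :splitFile is hypothetical, and the splitting logic is taken unchanged from the program above.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Open input.txt just once: the redirection covers the whole subroutine
call :splitFile < input.txt
goto :EOF

:splitFile
set "thisBlock="

:nextBlock
rem Read the next block; SET /P continues where the previous one stopped
set "block="
set /P "block="
if not defined block goto endFile

rem Append next block to last part of previous block
set "thisBlock=!thisBlock!!block!"

rem Process current block: separate lines at "$" character
:nextLine
for /F "tokens=1* delims=$" %%a in ("!thisBlock!") do (
   set "nextLine=%%a"
   set "thisBlock=%%b"
)
if defined thisBlock (
   echo !nextLine!
   goto nextLine
)
rem The last part may be incomplete: keep it for the next block
set "thisBlock=!nextLine!"
goto nextBlock

:endFile
rem Show the last line, if any
if defined thisBlock echo !thisBlock!
exit /B
```

The block counter and the FOR /L loop are no longer needed, because each SET /P reads directly the block that follows the previous one.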
Antonio