Oooh - very nice Antonio
That does not give the size of the EOL terminator, which can be important if you want to test whether the lines could truly be constant length.
The EOL characters are easy to determine with my FC technique. Your technique can be used to quickly determine how large the comparison file must be in my FC technique.
Dave Benham
Determining the number of lines in a file.
Re: Determining the number of lines in a file.
Antonio, I won't have time to look at your code until I get back to work on Monday. Just looking at it right now, I am not following how it works, but once I see the output the light bulb in my head usually turns on. Two thoughts do come to mind.
1) As Dave stated, I do need to know what the line terminators are. Whether it is CR/LF or just LF, I need to know that before I upload the file to our mainframe. Dave's code works perfectly for that.
2) I see a lot of SET /P statements in your code. Isn't SET /P limited to 1026 bytes? I deal with very large text files. Sometimes the lines can be 9000 bytes long.
Re: Determining the number of lines in a file.
The SET /P line length limit is not an issue: only the FINDSTR offset prefix is needed from the line.
Antonio's technique uses chained pipes. The last pipe target only processes a single line and then terminates, which causes the prior pipes to terminate with a non-existent pipe error. This is why it is so fast: FINDSTR never gets a chance to finish scanning the entire file. The pipe failure error messages are redirected to nul.
The technique really doesn't help you much if you are using my FC technique and all of your line lengths are less than some predetermined maximum length. But if the line length is open-ended, it can be used to quickly determine how large the comparison file needs to be for my FC technique.
Dave Benham
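To make the offset-prefix idea concrete, here is a minimal sketch that runs the same pipe chain against a tiny demo file. The file name offsetDemo.txt and its three 8-character records are made up for illustration; the pipe chain itself is the one already used in this thread.

```batch
@echo off
setlocal
rem Hypothetical demo file: three 8-character records written by ECHO (CR+LF endings)
set "demo=%temp%\offsetDemo.txt"
> "%demo%" (
  echo AAAAAAAA
  echo BBBBBBBB
  echo CCCCCCCC
)
rem FINDSTR /O prefixes every line with its byte offset, e.g. "10:BBBBBBBB".
rem The second FINDSTR drops the line at offset 0, and SET /P reads just one line,
rem so the chain can stop long before a big file has been scanned.
set "lineLen="
for /F "tokens=2 delims==:" %%a in (
  'findstr /O "^" "%demo%" 2^>NUL ^| findstr /V "^0:" 2^>NUL ^| (set /P @^=^& set @ ^)'
) do set "lineLen=%%a"
rem On the demo file this is 10: 8 characters plus CR+LF
echo First line length including EOL: %lineLen%
del "%demo%"
```

Only the digits before the first colon are kept, so it should not matter if SET /P truncates a very long line.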
Re: Determining the number of lines in a file.
No temp file
Code:
@echo off
rem "^" matches every line, even empty ones; "^0:" drops only the first line (offset 0)
for /F "tokens=2 delims=:=" %%a in ('findstr /O "^" "%~1" 2^>NUL ^| findstr /v "^0:" 2^>nul ^| ( set /p "1=" ^& set 1 ^)') do set line=%%a
echo %line%
pause
Re: Determining the number of lines in a file.
I borrowed Dave's method to get the EOL size and foxidrive's format to get the line length with no temp file, and completed my solution. Here it is:
Code:
@echo off
setlocal EnableDelayedExpansion
set fileSize=%~Z1
echo File size: %fileSize%
rem Get the length of the first line
for /F "tokens=2 delims==:" %%a in (
'findstr /O "^" "%~1" 2^>NUL ^| findstr /V "^0:" 2^>NUL ^| (set /P @^=^& set @ ^)'
) do set lineLen=%%a
echo Line length: %lineLen%
rem Get the EOL size (1=LF, 2=CR+LF)
set EOLSize=0
del tempFile.tmp 2>NUL
fsutil file createnew tempFile.tmp %lineLen% > NUL
for /F %%a in ('fc /B "%~1" tempFile.tmp ^| findstr /C:" 0D " /C:" 0A "') do set /A EOLSize+=1
del tempFile.tmp
echo EOL size: %EOLSize%
set /A recordLen=lineLen-EOLSize
echo -----------------------------
echo Record length: %recordLen%
rem Split the file size in groups of 4 digits
set N=0
:nextGroup
set group=%fileSize:~-4%
:checkLeftZero
if "%group:~0,1%" neq "0" goto noLeftZero
set group=%group:~1%
if defined group goto checkLeftZero
:noLeftZero
if not defined group set group=0
set /A N+=1
set group[%N%]=%group%
set fileSize=%fileSize:~0,-4%
if defined fileSize goto nextGroup
rem Divide the groups by the line length and assemble the quotient
set quotient=
set remainder=0
for /L %%i in (%N%,-1,1) do (
set /A group=remainder*10000+group[%%i], group[%%i]=group/lineLen, remainder=group%%lineLen
if not defined quotient (
if !group[%%i]! neq 0 set quotient=!group[%%i]!
) else (
set group=000!group[%%i]!
set quotient=!quotient!!group:~-4!
)
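)
rem A file shorter than one line leaves the quotient empty; report 0 instead
if not defined quotient set quotient=0
echo Number of records: %quotient%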
echo -----------------------------
echo Remainder: %remainder%
echo/
rem Analyze the remainder
if %remainder% equ 0 (
echo Format correct
if %EOLSize% equ 1 echo (perhaps the last line has EOF instead of the EOL = LF^)
) else if %remainder% equ 1 (
echo EOF inserted at end of file
) else (
set /A missing=lineLen-remainder
if !missing! equ %EOLSize% (
echo Last line has no EOL
) else (
echo FORMAT INCORRECT
)
)
echo/
Antonio
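The division is done in pieces because SET /A only handles 32-bit signed numbers, and a large file size will not fit. As a rough illustration of the same idea, here is a simpler digit-at-a-time variant (not the four-digit grouping used above), run on made-up values: a 12000000080-byte file with 80-byte lines.

```batch
@echo off
setlocal
rem Hypothetical inputs: a 12000000080-byte file made of 80-byte lines.
rem The size is kept as a string because it exceeds SET /A's 32-bit range.
set "size=12000000080"
set "lineLen=80"
set "rest=%size%"
set "quotient="
set "remainder=0"
:nextDigit
rem Bring down the next digit, then divide as in pencil-and-paper long division
set /A "cur=remainder*10+%rest:~0,1%, digit=cur/lineLen, remainder=cur%%lineLen"
rem Append the digit, skipping leading zeros of the quotient
if defined quotient (set "quotient=%quotient%%digit%") else if %digit% neq 0 set "quotient=%digit%"
set "rest=%rest:~1%"
if defined rest goto nextDigit
if not defined quotient set "quotient=0"
rem For these inputs: 150000001 records, remainder 0
echo Number of records: %quotient%
echo Remainder: %remainder%
```

One digit per step is slower than four-digit groups, but the intermediate value remainder*10+digit stays within SET /A's range for any line length up to roughly 200 million bytes.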
Re: Determining the number of lines in a file.
Thanks for all the feedback, guys. I will test out some of this new code when I get some free time at work.