Determining the number of lines in a file.

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: Determining the number of lines in a file.

#46 Post by dbenham » 09 Feb 2013 09:16

Oooh - very nice Antonio :D

That does not give the size of the EOL terminator, which matters if you want to test whether the lines could truly be of constant length.

The EOL characters are easy to determine with my FC technique. Your technique can be used to quickly determine how large the comparison file must be in my FC technique.


Dave Benham

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: Determining the number of lines in a file.

#47 Post by Squashman » 09 Feb 2013 10:33

Antonio, I won't have time to look at your code until I get back to work on Monday. Just looking at it right now, I am not following how it works, but once I see the output the light bulb in my head usually turns on. Two thoughts do come to mind.

1) As Dave stated, I do need to know what the line terminators are. Whether it is CR+LF or just LF, I need to know that before I upload the file to our mainframe. Dave's code works perfectly for that.

2) I see a lot of SET /P statements in your code. Isn't SET /P limited to 1026 bytes? I deal with very large text files. Sometimes the lines can be 9000 bytes long.

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: Determining the number of lines in a file.

#48 Post by dbenham » 09 Feb 2013 11:16

The SET /P line length limit is not an issue: only the FINDSTR offset prefix is needed from the line.

Antonio's technique uses chained pipes. The last pipe target only processes a single line and then terminates, which causes the prior pipes to terminate with a non-existent pipe error. This is why it is so fast: FINDSTR never gets a chance to finish scanning the entire file. The pipe failure error messages are redirected to nul.

The technique really doesn't help you much if you are using my FC technique and all of your line lengths are less than some predetermined maximum length. But if the line length is open ended, it can be used to quickly determine how large the comparison file needs to be for my FC technique.
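
For reference, this is what the offset prefix looks like. It is only a minimal sketch: the script name offsets.bat is hypothetical, and the test file is assumed to hold the two lines abc and def, each ending in CR+LF.

```batch
@echo off
rem Usage: offsets.bat somefile.txt   (offsets.bat is a hypothetical name)
rem FINDSTR /O prefixes every matching line with its byte offset in the file.
rem The regular expression "^" matches every line, including empty ones.
findstr /O "^" "%~1"
```

For that test file the output should be 0:abc followed by 5:def. The offset of the second line (5) is the length of the first line including its 2-byte EOL, which is the only value the SET /P in the chained pipe has to read.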


Dave Benham

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Determining the number of lines in a file.

#49 Post by foxidrive » 09 Feb 2013 13:03

No temp file

Code:

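@echo off
rem Print the length of the first line of %1, including its EOL bytes.
rem "^" also matches empty lines; "^0:" excludes only the line at offset 0.
for /F "tokens=2 delims=:=" %%a in ('findstr /O "^" "%~1" 2^>NUL ^| findstr /v "^0:" 2^>nul ^| ( set /p "1=" ^& set 1 ^)') do set line=%%a
echo %line%
pause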

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: Determining the number of lines in a file.

#50 Post by Aacini » 10 Feb 2013 00:15

I borrowed Dave's method to get the EOL size and foxidrive's format to get the line length with no temp file, and completed my solution. Here it is:

Code:

@echo off
setlocal EnableDelayedExpansion

set fileSize=%~Z1
echo           File size:  %fileSize%

rem Get the length of the first line
for /F "tokens=2 delims==:" %%a in (
       'findstr /O "^" "%~1" 2^>NUL ^| findstr /V "^0:" 2^>NUL ^| (set /P @^=^& set @ ^)'
                                   ) do set lineLen=%%a
echo         Line length:  %lineLen%

rem Get the EOL size (1=LF, 2=CR+LF)
set EOLSize=0
del tempFile.tmp 2>NUL
fsutil file createnew tempFile.tmp %lineLen% > NUL
for /F %%a in ('fc /B "%~1" tempFile.tmp ^| findstr /C:" 0D " /C:" 0A "') do set /A EOLSize+=1
del tempFile.tmp
echo            EOL size:  %EOLSize%

set /A recordLen=lineLen-EOLSize
echo -----------------------------
echo       Record length:  %recordLen%

rem Split the file size in groups of 4 digits
set N=0
:nextGroup
   set group=%fileSize:~-4%
   :checkLeftZero
      if "%group:~0,1%" neq "0" goto noLeftZero
      set group=%group:~1%
   if defined group goto checkLeftZero
   :noLeftZero
   if not defined group set group=0
   set /A N+=1
   set group[%N%]=%group%
   set fileSize=%fileSize:~0,-4%
if defined fileSize goto nextGroup

rem Divide the groups by the line length and assemble the quotient
set quotient=
set remainder=0
for /L %%i in (%N%,-1,1) do (
   set /A group=remainder*10000+group[%%i], group[%%i]=group/lineLen, remainder=group%%lineLen
   if not defined quotient (
      if !group[%%i]! neq 0 set quotient=!group[%%i]!
   ) else (
      set group=000!group[%%i]!
      set quotient=!quotient!!group:~-4!
   )
)

echo   Number of records:  %quotient%
echo -----------------------------
echo           Remainder:  %remainder%
echo/

rem Analyze the remainder
if %remainder% equ 0 (
   echo Format correct
   if %EOLSize% equ 1 echo (perhaps the last line has EOF instead of the EOL = LF^)
) else if %remainder% equ 1 (
   echo EOF inserted at end of file
) else (
   set /A missing=lineLen-remainder
   if !missing! equ %EOLSize% (
      echo Last line has no EOL
   ) else (
      echo FORMAT INCORRECT
   )
)
echo/
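
The group-by-group division used in the script above can be traced on fixed numbers. The following is only a sketch and not part of the script: the file size 12345678901 and the line length 82 are made-up values, the variables work and digit are names chosen for illustration, and the size is split by hand into 4-digit groups with leading zeros already removed.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Assumed example values: file size 12345678901 (too large for 32-bit
rem SET /A), split from the right into groups of 4 digits, leading zeros removed.
set "group[3]=123"
set "group[2]=4567"
set "group[1]=8901"
set "lineLen=82"

set "quotient="
set "remainder=0"
rem Long division in base 10000, from the most significant group down.
for /L %%i in (3,-1,1) do (
   set /A "work=remainder*10000+group[%%i], digit=work/lineLen, remainder=work%%lineLen"
   if defined quotient (
      rem Later groups are padded to 4 digits before being appended.
      set "digit=000!digit!"
      set "quotient=!quotient!!digit:~-4!"
   ) else if !digit! neq 0 set "quotient=!digit!"
)
echo Quotient=%quotient% Remainder=%remainder%
```

Worked by hand: 123 / 82 = 1 remainder 41; 414567 / 82 = 5055 remainder 57; 578901 / 82 = 7059 remainder 63. The groups 1, 5055 and 7059 join to 150557059, and 82 * 150557059 + 63 = 12345678901. Each intermediate value is remainder*10000 plus a group, so it stays within the signed 32-bit range of SET /A only for line lengths up to roughly 214000 bytes.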


Antonio

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: Determining the number of lines in a file.

#51 Post by Squashman » 11 Feb 2013 08:27

Thanks for all the feedback, guys. I will test out some of this new code when I get some free time at work.
