How to count semicolons (;) in a row?

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

miskox
Posts: 630
Joined: 28 Jun 2010 03:46

How to count semicolons (;) in a row?

#1 Post by miskox » 29 Sep 2017 07:52

Hi all!

I have a .txt file (a .csv delimited by ;).

How can I count semicolons in each row?

Code: Select all

field1;field2;field3;
field1;field2;field3;
field1;field2;field3;
field1;field2;
field1;field2;field3;


This way I would be able to tell whether a file contains a corrupt row, i.e. one with the wrong number of semicolons.

Let's say that there must be 3 semicolons in a row. If a row has a different number of semicolons (fewer than 3 or more than 3), I want that row to be displayed.

Any ideas? I don't know where to start.

Thanks.
Saso

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: How to count semicolons (;) in a row?

#2 Post by Squashman » 29 Sep 2017 08:34

I believe this function will help you.
viewtopic.php?f=3&t=6429#p41035

Code: Select all

@echo off

FOR /F "delims=" %%G IN (input.txt) DO (
   CALL :occur "%%~G"
)
pause
GOTO :EOF
:occur
setlocal EnableDelayedExpansion
set i=0
set "x=%~1"
set "x!i!=%x:;=" & set /A i+=1 & set "x!i!=%"
echo number of semicolons: %i%
endlocal

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to count semicolons (;) in a row?

#3 Post by aGerman » 29 Sep 2017 11:29

Actually you don't even need to create an associative array.

Code: Select all

@echo off &setlocal
set "file=test.csv"
set "n=3"

for /f usebackq^ delims^=^ eol^= %%i in ("%file%") do (
  setlocal
  set "x=%%i"
  call :count || echo %%i
  endlocal
)
pause
exit /b

:count
set "x=%x:;=" & set /a n-=1 & set "x=%"
exit /b %n%

Steffen
Last edited by aGerman on 29 Sep 2017 14:30, edited 1 time in total.
Reason: disabled eol to ensure lines with leading semi-colons will be processed

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: How to count semicolons (;) in a row?

#4 Post by Squashman » 29 Sep 2017 11:48

I like how you set the exitcode and used that with conditional execution. Very clever!

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to count semicolons (;) in a row?

#5 Post by dbenham » 29 Sep 2017 14:40

Yes, very clever solution.

But the n value must be reset each iteration.
And there is no need to set any string variables when counting.
And with a bit more work, the technique can safely deal with quotes and poison characters, though quoted ; will be mistakenly counted either way.
And no need to SETLOCAL/ENDLOCAL within loop, unless you really can't have an extra variable defined after the test.

Code: Select all

@echo off
setlocal
set "file=test.csv"
set "count=3"

REM eol is disabled so that lines with a leading semicolon are not skipped.
for /f usebackq^ delims^=^ eol^= %%i in ("%file%") do (
  set "x=%%i"
  call :count || echo %%i
)
pause
exit /b

:count
set /a n=count
set "x=%x:"=%"
break "%x:;="&set /a n-=1&break "%"
exit /b %n%

I'm not sure what is the fastest "null op" command. Other options are

Code: Select all

rem.
rem^ %=with a trailing space after the caret=%

The technique will run into problems if a line gets anywhere close to 8191 bytes long.

Another option is to use the fast strlen function to get the string length, remove all semicolons, and then compute the new length, and subtract to get the count.

Dave Benham
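
The length-difference idea mentioned above can be sketched as follows. This is only a minimal sketch under stated assumptions: it is not the strlen function referred to in the post (that function is not shown in this thread) but an inlined binary-search length computation, and the names line, stripped, s, len_line, len_stripped and semis are hypothetical, chosen for illustration. Delayed expansion is enabled, so a row containing exclamation marks would need extra care.

```batch
@echo off
setlocal EnableDelayedExpansion

REM Hypothetical sample row. In practice this would come from the FOR /F loop.
set "line=field1;field2;field3;"
REM Copy of the row with every semicolon removed.
set "stripped=!line:;=!"

REM Measure both strings. A sentinel # is appended so that s is never empty.
for %%V in (line stripped) do (
  set "s=!%%V!#"
  set "len_%%V=0"
  REM Binary search, skipping ahead by decreasing powers of two while characters remain.
  for %%P in (4096 2048 1024 512 256 128 64 32 16 8 4 2 1) do (
    if "!s:~%%P,1!" NEQ "" (
      set /a "len_%%V+=%%P"
      set "s=!s:~%%P!"
    )
  )
)

REM The difference in length is the number of semicolons.
set /a semis=len_line-len_stripped
echo number of semicolons: %semis%
```

Unlike the replacement trick, this variant never expands the row into a longer command line, which may help with long rows.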

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to count semicolons (;) in a row?

#6 Post by aGerman » 29 Sep 2017 14:55

You are absolutely right about the setlocal/endlocal. And to be honest, I never thought about using any command other than SET :shock:

Bookmarked :wink:

Steffen

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: How to count semicolons (;) in a row?

#7 Post by dbenham » 30 Sep 2017 10:13

I just realized that the n count variable only needs to be reset if you remove the SETLOCAL/ENDLOCAL.

I'm thinking that resetting one variable is more efficient than SETLOCAL/ENDLOCAL. But I don't think I've actually ever done any testing.


Dave Benham

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to count semicolons (;) in a row?

#8 Post by aGerman » 30 Sep 2017 12:20

Yes, executing setlocal/endlocal multiple times is slow. I don't know exactly how it is implemented, but since endlocal has to restore the original environment, I assume that setlocal saves a copy of it. Resetting only one environment variable is surely much more efficient.

Steffen

miskox
Posts: 630
Joined: 28 Jun 2010 03:46

Re: How to count semicolons (;) in a row?

#9 Post by miskox » 02 Oct 2017 07:02

Thank you all!

It really works - though I really don't understand the code.

Thanks again.

Saso

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to count semicolons (;) in a row?

#10 Post by aGerman » 02 Oct 2017 13:21

miskox wrote:I really don't understand the code.

I merged Dave's proposal with my FOR options and added some remarks.

Code: Select all

@echo off &setlocal
set "file=test.csv"
set "count=3"

REM Read the file line-wise and assign them to %%i.
REM The escape sequences are to set eol to nothing. Thus, lines with leading semi-colons are processed as well.
for /f usebackq^ delims^=^ eol^= %%i in ("%file%") do (
  REM Assign the content of %%i to x.
  set "x=%%i"
  REM Call subroutine :count. If an errorlevel other than 0 was returned then echo will be executed.
  call :count || echo %%i
)
pause
exit /b

:count
REM Reset the variable n.
set /a n=count
REM Remove all quotation marks in x.
set "x=%x:"=%"
REM Access the internal iterations that CMD does if a character will be replaced.
REM Thus, for each occurrence of a semi-colon n will be decreased by one.
REM Only if exactly 3 semi-colons were found n is 0 after this operation.
break "%x:;="&set /a n-=1&break "%"
REM Return n as errorlevel to the point where :count was called.
exit /b %n%

Steffen

miskox
Posts: 630
Joined: 28 Jun 2010 03:46

Re: How to count semicolons (;) in a row?

#11 Post by miskox » 03 Oct 2017 04:19

Steffen, thank you. The part I don't understand is:

Code: Select all

REM Access the internal iterations that CMD does if a character will be replaced.
REM Thus, for each occurrence of a semi-colon n will be decreased by one.
REM Only if exactly 3 semi-colons were found n is 0 after this operation.
break "%x:;="&set /a n-=1&break "%"


I ran the code with ECHO ON to see what happens. I see that BREAK (or CMD?) executes these commands as many times as there are fields, without there being a loop or anything similar. This part is a mystery to me.

Code: Select all

>break "field15"  & set /a n-=1  & break "field25"  & set /a n-=1  & break "field35"  & set /a n-=1  & break ""


Thanks.
Saso

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to count semicolons (;) in a row?

#12 Post by aGerman » 03 Oct 2017 05:18

That's the tricky part Saso :lol:
If you remove the SET /A command and the concatenated second BREAK then the following is left over:

Code: Select all

break "%x:;=%"

Forget about BREAK. It is a command that does nothing at all and is only there to keep the syntax valid. What we are after is the string manipulation in which the semi-colons in x are replaced with "nothing". To perform this replacement, CMD has to start at the first character and search for the first semi-colon, which is then replaced. After that, CMD continues from the next character and searches for the next semi-colon. This repeats until the end of the string is reached.
As your test with ECHO ON shows, we can access these internal iterations by putting command concatenations inside the replacement syntax. That's undocumented behavior, but it works :wink:

Steffen
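
The mechanism described above can be tried in isolation. What follows is a minimal sketch, assuming a hard-coded sample row in x and a counter n that counts upward from 0 rather than downward from the expected count (both choices are illustrative and not taken from the posts). The percent expansion rewrites the single line into one BREAK and one SET /A per semicolon before CMD parses it.

```batch
@echo off
setlocal

REM Hypothetical sample row with three semicolons.
set "x=field1;field2;field3;"
set /a n=0

REM Percent expansion replaces each semicolon in x with a closing quote,
REM an ampersand, the SET /A command, another ampersand and a new BREAK
REM with an opening quote. CMD then parses the expanded line as four
REM BREAK commands with three SET /A commands between them.
break "%x:;="&set /a n+=1&break "%"

echo number of semicolons: %n%
```

Running it with ECHO ON shows the expanded line, much like the output quoted in post #11.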

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: How to count semicolons (;) in a row?

#13 Post by Aacini » 03 Oct 2017 06:57

@miskox,

A detailed explanation of this method is given at this thread.

Antonio
