
Filter out a line that is too long from a txt file...

Posted: 17 Jun 2010 10:02
by dust
Hello,

How can I filter out a line that is too long from a text file?
The maximum length each line should be is 24 characters.
Every line longer than that should be deleted.

Here is an extract from the text file:
K# 331 20100616 06:23:47
E# 73 20100616 06:56:06
B# 123 20100616 07:09:04
D# 122 20100616 07:11:18
D”üÈ@@ÌÀG# 172 20100616 07:21:18
K# 331 20100616 07:26:11
J# 190 20100616 07:44:41
H# 98 20100616 08:01:10
E# 73 20100616 08:11:29

I want to filter out this line: D”üÈ@@ÌÀG# 172 20100616 07:21:18
and delete it from the text file.
The garbage in the "wrong line" is not constant; it contains different characters each time.
I think the only way to do this is to check the length of each line and delete every line that is longer than 24 characters.

Any help will be appreciated.

Re: Filter out a line that is too long from a txt file...

Posted: 17 Jun 2010 13:27
by aGerman
Counting the characters is not a good approach. But if needed, you could use this:

Code:

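@echo off

rem Start with an empty output file.
>"new.txt" type nul
rem findstr /o prefixes each line with its byte offset, so the offset of the
rem appended empty line reflects the length of the line before it.
for /f "usebackq delims=" %%a in ("old.txt") do (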
  for /f "skip=1 delims=:" %%b in ('^(echo.%%a^&echo.^)^|findstr /o $') do (
    if %%b leq 27 (
      >>"new.txt" echo %%a
    )
  )
)


But this is very slow and will not work if a line contains characters such as <, >, | or &. (By the way, 27 is the right threshold for 24 characters.)
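
For anyone curious where the 27 comes from, here is a small sketch (not from the original post) that prints the offsets findstr /o reports for one 24-character sample line. The variable name sample is an assumption for illustration. The number in front of the second, empty output line is the byte offset at which that line starts, that is, the size of the first line including its line ending and any extra bytes the pipe adds. Per the post above it should come out as 27.

```batch
@echo off
rem Sketch only. "sample" is a hypothetical variable holding a 24-character line.
set "sample=K# 331 20100616 06:23:47"
rem Same inner command as above, without the carets needed inside for /f.
(echo.%sample%&echo.)|findstr /o $
```

If your system reports a different number for this line, adjust the leq value in the script accordingly.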

I would prefer to use a regular expression to check whether a line has the right format. For your example, the pattern could look like this:

Code:

@echo off

>"new.txt" findstr /r /c:"^.# [0-9][0-9]* 20[0-9][0-9][0-1][0-9][0-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9]$" "old.txt"


Much faster and imo much better.

Regards
aGerman

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 01:52
by amel27
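dust wrote: The maximum length each line should be is 24 characters.
Every line longer than that should be deleted.
The simplest regular expression... :) The pattern is 25 dots, so it matches any line that starts with at least 25 characters (/b, /r), and /v keeps only the lines that do not match.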

Code:

>new.txt findstr /vbr "........................." old.txt

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 02:07
by dust
Hello,
Thank you, everyone, for your replies. This is much appreciated.
However, if I put these lines into my batch file, it deletes every single line
that is in the original text file...
I don't know much about batch files, so any help will be highly appreciated.

Thank you

dust

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 02:23
by dust
Hi Everyone,
My apologies, I was just being too stupid...
I had mixed up the old file and the new file.

I went with the code from amel27.

>new.txt findstr /vbr "........................." old.txt

It seems to be doing the job nicely - thank you amel.
This will definitely cull any line longer than 24 characters?

Thank you

Dust

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 02:41
by amel27
dust wrote: This will definitely cull any line longer than 24 characters?
Yes. Conversely, this one rejects any line shorter than 25 characters, so it keeps only the long lines:

Code:

>new.txt findstr /br "........................." old.txt
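
A hedged sketch, not part of the original reply: the two commands can be combined to split the file, so the rejected lines are kept for inspection instead of being lost. The output names short.txt and long.txt and the variable name pattern are hypothetical; old.txt is the file name used in this thread.

```batch
@echo off
rem Sketch only. short.txt and long.txt are hypothetical output names.
rem The pattern is 25 dots: it matches any line with at least 25 characters.
set "pattern=........................."
rem /v keeps the lines that do NOT start with 25 characters, 24 or fewer.
>short.txt findstr /vbr "%pattern%" old.txt
rem Without /v the same pattern keeps only the lines with 25 or more.
>long.txt findstr /br "%pattern%" old.txt
```

Every line of old.txt should end up in exactly one of the two files, which makes it easy to check what was thrown away.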

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 03:17
by dust
Thank you amel27 - this helped a lot and solved my problem right away.

Good Stuffffffff!!!!

regards,

dust

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 14:25
by avery_larry
*untested*

Code:

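setlocal enabledelayedexpansion
rem Start with an empty output file, because the loop below appends to it.
>new.txt type nul
for /f "usebackq delims=" %%a in (old.txt) do (
   set "tmp_var=%%a"
   rem Keep the line only if it has no 25th character.
   if "!tmp_var:~24,1!"=="" >>new.txt echo.%%a
)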


Note that this will also filter out blank lines, because for /f skips them. The code to include blank lines gets more complicated.
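
As a rough illustration of what "more complicated" can look like, here is a sketch that is not from the original post and is just as untested against your data. It relies on the common trick of letting findstr /n put a line number in front of every line, so that for /f no longer sees empty lines, and then stripping that prefix again. The variable name line is an assumption; old.txt and new.txt are the names used in this thread.

```batch
@echo off
setlocal disabledelayedexpansion
rem Sketch only. "line" is a hypothetical variable name.
>new.txt (
  rem findstr /n turns every line, blank ones included, into "N:text".
  for /f "delims=" %%a in ('findstr /n "^" old.txt') do (
    set "line=%%a"
    setlocal enabledelayedexpansion
    rem Remove everything up to and including the first colon.
    set "line=!line:*:=!"
    rem A blank line leaves the variable undefined, so write it back as is.
    if not defined line echo(
    rem Otherwise keep the line only if it has no 25th character.
    if defined line if "!line:~24,1!"=="" echo(!line!
    endlocal
  )
)
```

Delayed expansion is switched on only after the line has been stored, so that exclamation marks in the data are not swallowed. The findstr one-liner from earlier in the thread keeps blank lines without any of this, which is another argument for it.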

Re: Filter out a line that is too long from a txt file...

Posted: 18 Jun 2010 22:45
by amel27
A slightly modified script for a configurable line length:

Code:

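rem len is the maximum allowed line length plus one
set len=25
rem Clear b first so a leftover value cannot leak into the pattern
set "b="
for /l %%a in (1,1,%len%) do call set b=.%%b%%
>new.txt findstr /vbr "%b%" old.txt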