Delete rows from csv file when column value is not equal to

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Message
Author
batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Delete rows from csv file when column value is not equal to

#1 Post by batching1 » 22 Jul 2010 08:53

Hi,

I have a csv file in the below format
col1 col2 col3 col4
1 a 0 ree
2 b 3 ere
3 c 34 werwe
4 d 0 jkl

I want to delete rows where cols3 is not equal to 0(in this case row 2 and 3) and the header should remain intact as well.

output should be :

col1 col2 col3 col4
1 a 0 ree
4 d 0 jkl

Anyhelp here?

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#2 Post by aGerman » 22 Jul 2010 09:50

If CSV means in your case "comma seprated values" (BTW for my German settings the separator is a semicolon), then you could try this:

Code: Select all

@echo off &setlocal
set /p "header="<"your old.csv"
>"your new.csv" echo.%header%
for /f "usebackq skip=1 delims=, tokens=1-3*" %%a in ("your old.csv") do (
  if "%%c"=="0" (
    >>"your new.csv" echo.%%a,%%b,%%c,%%d
  )
)



Regards
aGerman

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#3 Post by batching1 » 22 Jul 2010 10:11

Hi,

Thank you, Yes by csv imean comman seperated values.
The code will not work for me....it shows a blank output file.
My csv filw format is similar to a excel format...columns in seperate coumns and not comma or space...just like excel you can say...

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#4 Post by aGerman » 22 Jul 2010 10:28

Open the file using a text editor (notepad or something similar) and tell me what separator you can find (comma, semicolon, tab,...).

Regards
aGerman

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#5 Post by batching1 » 22 Jul 2010 10:34

space/tab

Also instead of number i want to delete all rows where column3 in csv is equal the word forest.

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#6 Post by aGerman » 22 Jul 2010 10:48

OK, think it's a tab. Now we have a small problem, because this website can not display a tab character.
Open you batchfile with notepad. Copy/paste:

Code: Select all

@echo off &setlocal
set /p "header="<"your old.csv"
>"your new.csv" echo.%header%
for /f "usebackq skip=1 tokens=1-3* delims=   " %%a in ("your old.csv") do (
  if "%%c"=="forest" (
    >>"your new.csv" echo.%%a,%%b,%%c,%%d
  )
)

Now you have to exchange the 3 spaces behind delims= by one single tab.
Save it and try.

Regards
aGerman

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#7 Post by batching1 » 22 Jul 2010 11:01

Hi it did not work...i guess i am missing something here...
This is my sample file:
number(1 tab space)city(1tab space)country
1 abc forest
2 jk super
3 kj forest

now each column is seperated by 1 tab space in my csv file.
I want to check coumn 3 which is country coumn and it anyting is not equal to forest delete that row completly....

and the header of the file(number, city, country) should remain...so this testing should start from line 2.

This is was i am trying to do....your code creates a new file but has no data.

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#8 Post by batching1 » 22 Jul 2010 11:31

this time it is copying the header from old file to new file, but not processing the data.
this is what i have:

Code: Select all


@echo off &setlocal
set /p "header="<"C:\Documents and Settings\a4\Desktop\ert\z\z.csv"
>"C:\Documents and Settings\a4\Desktop\ert\z\new.csv" echo.%header%
for /f "usebackq skip=1 tokens=2* delims=   " %%a in ("C:\Documents and Settings\a4\Desktop\ert\z\z.csv") do (
  if "%%c"=="forest" (
    >>"C:\Documents and Settings\a4\Desktop\ert\z\new.csv" echo.%%a,%%b,%%c,%%d
  )
)



the z.csv look like this:

no   user   country
1   we   forest
2   u   ere
3   me   forest



it wont prcess the data and the newz file has the header only n not data...expected output is:

Code: Select all

no   user   country
1   we   forest
3   me   forest

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#9 Post by aGerman » 22 Jul 2010 11:58

OK, if we split the line by tab you will find in the third token " forest" instead of "forest". Change it:

Code: Select all

  if "%%c"==" forest" (


Regards
aGerman

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#10 Post by batching1 » 22 Jul 2010 12:55

Hi aGerman,

I was able get it working..thank you..another question...very similar...

below is the code:

Code: Select all

zA214746,USA,forest
zA221619,USA,forest
zA228274,USA,forest
zA229340,USA,forest
John Doe,USA,forest
zA237972,USA,forest
YA987989,USA,forest
ya989898,USA,forest
woy898989,USA,forest


i want to now delete all rows if in column 1 value does not start with "ZA******"(ZA followed by 6 digit), "za******", "YA******","ya******", "woy******"

So desired output shoul dbe without john doe row as it does not start with any of the above follwed by 6 digits...

how would i do that?

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#11 Post by aGerman » 22 Jul 2010 13:47

Long command line, but should work...

Code: Select all

@echo off &setlocal
for /f "usebackq tokens=1* delims=," %%a in ("abc.csv") do (
  echo.%%a|findstr "\<ZA[0-9][0-9][0-9][0-9][0-9][0-9]\> \<za[0-9][0-9][0-9][0-9][0-9][0-9]\> \<YA[0-9][0-9][0-9][0-9][0-9][0-9]\> \<ya[0-9][0-9][0-9][0-9][0-9][0-9]\> \<woy[0-9][0-9][0-9][0-9][0-9][0-9]\>" >nul &&(
    >>"xyz.csv" echo.%%a,%%b
  )
)


Returns for me:

Code: Select all

YA987989,USA,forest
ya989898,USA,forest
woy898989,USA,forest


Regards
aGerman

batching1
Posts: 7
Joined: 22 Jul 2010 08:49

Re: Delete rows from csv file when column value is not equal

#12 Post by batching1 » 22 Jul 2010 14:07

but why wont it pick up za****** and ZA******...it shud delete only john doe row in this example...

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Delete rows from csv file when column value is not equal

#13 Post by aGerman » 22 Jul 2010 14:38

There is no ZA and no za in your example, only zA.
If it should not be case sensitive, you could use this:

Code: Select all

@echo off &setlocal
for /f "usebackq tokens=1* delims=," %%a in ("abc.csv") do (
  echo.%%a|findstr /i "\<za[0-9][0-9][0-9][0-9][0-9][0-9]\> \<ya[0-9][0-9][0-9][0-9][0-9][0-9]\> \<woy[0-9][0-9][0-9][0-9][0-9][0-9]\>" >nul &&(
    >>"xyz.csv" echo.%%a,%%b
  )
)

I can't be better than your question/information/example.

Regards
aGerman

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: Delete rows from csv file when column value is not equal

#14 Post by ghostmachine4 » 22 Jul 2010 20:16

batching1 wrote:Hi aGerman,

I was able get it working..thank you..another question...very similar...

below is the code:

Code: Select all

zA214746,USA,forest
zA221619,USA,forest
zA228274,USA,forest
zA229340,USA,forest
John Doe,USA,forest
zA237972,USA,forest
YA987989,USA,forest
ya989898,USA,forest
woy898989,USA,forest


i want to now delete all rows if in column 1 value does not start with "ZA******"(ZA followed by 6 digit), "za******", "YA******","ya******", "woy******"

So desired output shoul dbe without john doe row as it does not start with any of the above follwed by 6 digits...

how would i do that?


use a good csv file processing tool, not batch... eg if you can download gawk for windows, you can use this one liner

Code: Select all

C:\test>gawk "BEGIN{IGNORECASE=1}!/^(za|ya|woy)/{next}1" file
zA214746,USA,forest
zA221619,USA,forest
zA228274,USA,forest
zA229340,USA,forest
zA237972,USA,forest
YA987989,USA,forest
ya989898,USA,forest
woy898989,USA,forest

john doe is gone.

Also, for your first question, you said to delete column 3 that is not 0,

Code: Select all

C:\test>more file
col1,col2,col3,col4
1,a,0,ree
2,b,3,ere
3,c,34,werwe
4,d,0,jkl

C:\test> gawk -F"," "NR>1 && $3==0" file
1,a,0,ree
4,d,0,jkl

santhosh
Posts: 41
Joined: 02 Aug 2010 05:10

Re: Delete rows from csv file when column value is not equal

#15 Post by santhosh » 02 Aug 2010 12:27

Hi friends,
plz help me out...
i tried to delete 8th column which doesnot have "ap" but i didnt succeed......kindly help me.
Note:- marked in bold letter is 8th column.
input data sat.csv

ap ch rum sun ka jan jim bin sin
kl ka kim rim ka jan jim ap bun
tn kk lun sun kl nem jim bin bun


needed output new.csv

kl ka kim rim ka jan jim ap bun

Post Reply