Compare two csv files

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Post Reply
Message
Author
SIMMS7400
Posts: 546
Joined: 07 Jan 2016 07:47

Compare two csv files

#1 Post by SIMMS7400 » 14 Apr 2018 05:42

Hi Folks -

I have a need to compare (2) columns of (2) csv files. If a match is found, spool the entire line from FileA to a file name based on the value on column (2) of FileB. FileB is my master mapping file.

For instance, this is the content of FileA:

Code: Select all

SCENARIO, Local Cost Center,HC Category,FISCAL YEAR, PERIOD,Number of Headcount
ACT,JP000000440600,1,2017,12,3
ACT,JP000000440700,1,2017,12,4
ACT,JP000003290000,13,2017,12,0
ACT,JP000004850000,3,2017,12,0
ACT,JP000014240000,1,2017,12,14
ACT,JP000014240000,97,2017,12,5
ACT,JP000014240000,3,2017,12,0
ACT,JP000001550000,3,2017,12,0
ACT,JP000017600000,1,2017,12,6
ACT,JP000017610000,1,2017,12,3
ACT,JP000018230000,1,2017,12,4
ACT,JP000018380000,1,2017,12,1
ACT,JP000018900000,1,2017,12,3
ACT,JP000002120000,1,2017,12,5
ACT,JP000002120000,97,2017,12,2
ACT,JP000002730000,1,2017,12,3
ACT,JP000002730000,97,2017,12,1
ACT,JP000002740000,1,2017,12,4
ACT,JP000003310000,1,2017,12,5
ACT,JP000003310000,3,2017,12,0
ACT,JP000003320000,1,2017,12,5
ACT,JP000003320000,97,2017,12,1
ACT,JP000003330000,1,2017,12,6
ACT,JP000003340000,1,2017,12,2
ACT,JP000003350000,1,2017,12,2
ACT,JP000003350000,3,2017,12,0
ACT,JP000003360000,1,2017,12,3
ACT,JP000003360000,3,2017,12,0
ACT,JP000003460000,1,2017,12,1
ACT,JP000000410100,1,2017,12,10
ACT,JP000000410100,97,2017,12,4
ACT,JP000000411300,1,2017,12,6
ACT,JP000000440100,1,2017,12,18
ACT,JP000000440100,97,2017,12,4
ACT,JP000000440200,1,2017,12,7
ACT,JP000000440200,97,2017,12,2
ACT,JP000000440300,1,2017,12,7
ACT,JP000000440300,97,2017,12,3
ACT,JP000000440400,1,2017,12,6
ACT,JP000000440400,97,2017,12,3
ACT,JP000000440500,1,2017,12,4
ACT,JP000000440500,97,2017,12,1
ACT,JP000000451100,1,2017,12,10
ACT,JP000000451100,97,2017,12,1
ACT,JP000000451200,1,2017,12,4
ACT,JP000000451300,1,2017,12,6
ACT,JP000000451400,1,2017,12,8
ACT,JP000000460200,1,2017,12,20
ACT,JP000000460200,97,2017,12,3
ACT,JP000004690000,1,2017,12,1
ACT,JP000004690000,13,2017,12,0
ACT,JP000004690000,3,2017,12,0
ACT,JP000000470600,1,2017,12,6
ACT,JP000000470600,97,2017,12,2
ACT,JP000000470700,1,2017,12,7
ACT,JP000000470700,97,2017,12,1
ACT,JP000000470800,1,2017,12,6
ACT,JP000000470800,97,2017,12,2
ACT,JP000000470800,3,2017,12,0
ACT,JP000000480002,1,2017,12,19
ACT,JP000000480002,97,2017,12,4
ACT,JP000000480003,1,2017,12,12
ACT,JP000000480003,97,2017,12,8
ACT,JP000000480004,1,2017,12,9
ACT,JP000000480004,97,2017,12,6
ACT,JP000000480006,1,2017,12,6
ACT,JP000000480006,97,2017,12,2
ACT,JP000051160000,1,2017,12,26
ACT,JP000051160000,97,2017,12,3
ACT,JP000051160000,3,2017,12,0
ACT,JP000051440000,1,2017,12,35
ACT,JP000051440000,3,2017,12,0
ACT,JP000052220000,1,2017,12,36
ACT,JP000052220000,97,2017,12,8
ACT,JP000052220000,3,2017,12,0
ACT,JP000052240000,1,2017,12,9
ACT,JP000052540000,97,2017,12,1
ACT,JP000052590000,1,2017,12,3
ACT,JP000052591000,1,2017,12,1
ACT,JP000052800000,1,2017,12,6
ACT,JP000052980000,1,2017,12,12
ACT,JP000052980000,3,2017,12,0
ACT,JP000053010000,1,2017,12,1
ACT,JP000053010000,97,2017,12,2
ACT,JP000053020000,1,2017,12,2
ACT,JP000053020000,97,2017,12,2
ACT,JP000053620000,1,2017,12,1
ACT,JP000060800000,1,2017,12,32
ACT,JP000060800000,3,2017,12,0
ACT,JP000076060000,1,2017,12,6
ACT,JP000076060000,97,2017,12,1
ACT,JP000076060000,3,2017,12,0
ACT,JP000076090000,1,2017,12,16
ACT,JP000076090000,97,2017,12,2
ACT,JP000076110000,1,2017,12,14
ACT,JP000076110000,97,2017,12,2
ACT,JP000076110000,3,2017,12,0
ACT,JP000076120000,1,2017,12,5
ACT,JP000076120000,97,2017,12,1
ACT,JP000076121000,3,2017,12,0
ACT,JP000076122000,1,2017,12,6
ACT,JP000076140000,1,2017,12,2
ACT,JP000076140000,97,2017,12,3
ACT,JP000076150000,1,2017,12,7
ACT,JP000076171000,1,2017,12,8
ACT,JP000076171000,97,2017,12,2
ACT,JP000076173000,1,2017,12,65
ACT,JP000076173000,97,2017,12,10
ACT,JP000076173000,3,2017,12,0
ACT,JP000076174000,1,2017,12,95
ACT,JP000076174000,97,2017,12,3
ACT,JP000076175000,1,2017,12,42
ACT,JP000076175000,97,2017,12,15
ACT,JP000076182000,13,2017,12,0
ACT,JP000076190000,1,2017,12,24
ACT,JP000076190000,97,2017,12,3
ACT,JP000076190000,3,2017,12,0
ACT,JP000076240000,1,2017,12,20
ACT,JP000076240000,97,2017,12,1
ACT,JP000076240000,3,2017,12,0
ACT,JP000076250000,1,2017,12,28
ACT,JP000076250000,97,2017,12,1
ACT,JP000076251000,1,2017,12,18
ACT,JP000076270000,1,2017,12,5
ACT,JP000076360000,1,2017,12,2
ACT,JP000076360000,97,2017,12,1
ACT,JP000076410005,1,2017,12,4
ACT,JP000076410006,1,2017,12,1
ACT,JP000076410008,1,2017,12,4
ACT,JP000076410009,1,2017,12,1
ACT,JP000076411001,1,2017,12,46
ACT,JP000076411001,97,2017,12,24
ACT,JP000076420000,1,2017,12,6
ACT,JP000076420000,97,2017,12,6
ACT,JP000076420000,3,2017,12,1
ACT,JP000076420001,1,2017,12,5
ACT,JP000076420001,97,2017,12,1
ACT,JP000076420001,3,2017,12,0
ACT,JP000076420003,1,2017,12,3
ACT,JP000076420004,1,2017,12,9
ACT,JP000076540000,1,2017,12,1
ACT,JP000076560000,1,2017,12,15
ACT,JP000076560000,97,2017,12,3
ACT,JP000076560000,3,2017,12,0
ACT,JP000076570000,3,2017,12,0
ACT,JP000076570001,3,2017,12,0
ACT,JP000076580000,1,2017,12,2
ACT,JP000076580000,97,2017,12,1
ACT,JP000076580003,1,2017,12,5
ACT,JP000076580007,1,2017,12,8
ACT,JP000076580007,97,2017,12,1
ACT,JP000076580008,1,2017,12,2
ACT,JP000076580009,1,2017,12,5
ACT,JP000076580011,1,2017,12,3
ACT,JP000076670000,1,2017,12,2
ACT,JP00007660000C,1,2017,12,22
ACT,JP00007660000C,97,2017,12,7
ACT,JP00007660100C,1,2017,12,33
ACT,JP00007660100C,97,2017,12,7
ACT,JP00007660200C,1,2017,12,46
ACT,JP00007660300C,1,2017,12,46.6
ACT,JP00007660300C,97,2017,12,5
ACT,JP00007660400C,1,2017,12,33
ACT,JP00007660400C,97,2017,12,26
ACT,JP00007660500C,1,2017,12,57
ACT,JP00007660500C,97,2017,12,8
ACT,JP00007660600C,1,2017,12,4
ACT,JP00007660600C,97,2017,12,1
ACT,JP000076281000,3,2017,12,1
ACT,JP000052810000,3,2017,12,1
This is the content of FileB:

Code: Select all

Cost Center ID #1,ACT No
JP000000440600,3
JP000000440700,3
JP000003290000,3
JP000004850000,3
JP000014240000,3
JP000017600000,3
JP000017610000,3
JP000018230000,3
JP000018380000,3
JP000018900000,3
JP000002120000,3
JP000002730000,3
JP000002740000,3
JP000003310000,3
JP000004690000,3
JP000003320000,3
JP000003330000,3
JP000003340000,3
JP000003350000,3
JP000003360000,3
JP000003460000,3
JP000000410100,3
JP000000411300,3
JP000000440100,3
JP000000440200,3
JP000000440300,3
JP000000440400,3
JP000000440500,3
JP000000451100,3
JP000000451200,3
JP000000451300,3
JP000000451400,3
JP000000460200,3
JP000000470600,3
JP000000470700,3
JP000000470800,3
JP000000480002,3
JP000000480003,3
JP000004830000,3
JP000000480006,3
JP000076570002,3
JP000001550000,3
JP000003300000,3
JP000004890000,3
JP000051160000,2
JP000051440000,2
JP000052220000,2
JP000052240000,2
JP000052540000,2
JP000052590000,2
JP000052591000,2
JP000052800000,2
JP000052980000,2
JP000052980000,2
JP000053010000,2
JP000053020000,2
JP000053620000,2
JP000060800000,2
JP000076560000,2
JP000076060000,1
JP000076090000,1
JP000076110000,1
JP000076120000,1
JP000076121000,1
JP000076122000,1
JP000076140000,1
JP000076150000,1
JP000076171000,1
JP000076173000,1
JP000076174000,1
JP000076175000,1
JP000076182000,1
JP000076190000,1
JP000076240000,1
JP000076250000,1
JP000076251000,1
JP000076270000,1
JP000076360000,1
JP000076410005,1
JP000076410006,1
JP000076410008,1
JP000076410009,1
JP000076411001,1
JP000076420000,1
JP000076420001,1
JP000076420003,1
JP000076420004,1
JP000076540000,1
JP000076560000,1
JP000076570000,1
JP000076570001,1
JP000076580000,1
JP000076580003,1
JP000076580004,1
JP000076580006,1
JP000076580007,1
JP000076580008,1
JP000076580009,1
JP000076580011,1
JP000076670000,1
JP00007660200C,1
JP00007660300C,1
JP00007660400C,1
JP00007660500C,1
JP00007660600C,1
JP00007660000C,1
JP00007660100C,1
JP000004690000,1
JP000076670001,1
JP000076671000,1
JP000076672000,1
JP000076673000,1
JP000076674000,1
JP000076760000,1
JP000019200000,1
JP000076420005,1
JP000076420007,1
So, if a value of column 2 of FileA is found in column 1 of FileB, spool the entire line of FileA into a file name HCT_ACT_0*.csv.
The asterix (number) is determine by column 2 of FileB (mapping file).

I'm having some trouble getting two forloops to work and I' sure there is a better option than what I'm trying to build. Thank you very much!

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: Compare two csv files

#2 Post by aGerman » 14 Apr 2018 08:10

You may use FINDSTR.

Code: Select all

>"HCT_ACT_0whatever.csv" (
  for /f "usebackq skip=1 tokens=2 delims=," %%i in ("FileA.csv") do findstr /bc:"%%i," "FileB.csv"
)
Steffen

//EDIT missed the rule for the asterisk

Code: Select all

for /f "usebackq skip=1 tokens=2 delims=," %%i in ("FileA.csv") do for /f "delims=" %%j in ('findstr /bc:"%%i," "FileB.csv"') do for /f "tokens=2 delims=," %%k in ("%%j") do >"HCT_ACT_0%%k.csv" echo %%j

SIMMS7400
Posts: 546
Joined: 07 Jan 2016 07:47

Re: Compare two csv files

#3 Post by SIMMS7400 » 14 Apr 2018 09:29

Hi Steffan -

Thanks for this!

The process is building the correct file names, but its spooling the contents of FileB instead of FileA. Essentailly, if the value if found in FileB, spool the entire line from FileA.

For instance, this is what file "HCT_ACT_03" looks like after I run the script:

Code: Select all

JP000000480006,3
JP000000440600,3
JP000000440700,3
JP000003290000,3
JP000004850000,3
JP000014240000,3
JP000014240000,3
JP000014240000,3

SIMMS7400
Posts: 546
Joined: 07 Jan 2016 07:47

Re: Compare two csv files

#4 Post by SIMMS7400 » 14 Apr 2018 12:05

I was thinking something like this:

Code: Select all

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION

cd /d %~dp0

DEL /F /Q "HCT_ACT_*.csv" >nul 2>&1
FOR /F "usebackq tokens=1-6 delims=," %%A IN ("FileA.csv") DO (
	FOR /F "usebackq tokens=1-2 delims=," %%a IN ("Mapping.csv") DO (
		IF "%%B"=="%%a" (
			IF "%%b"=="1" (
			ECHO %%A,%%B,%%C,%%D,%%E,%%F>>"HCT_ACT_01.csv"
			)
			IF "%%b"=="2" (
			ECHO %%A,%%B,%%C,%%D,%%E,%%F>>"HCT_ACT_02.csv"
			)
			IF "%%b"=="3" (
			ECHO %%A,%%B,%%C,%%D,%%E,%%F>>"HCT_ACT_03.csv"
			)
		)
	)
)

pause

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: Compare two csv files

#5 Post by Aacini » 14 Apr 2018 21:54

Code: Select all

@echo off
setlocal EnableDelayedExpansion

rem Load the master mapping table from FileB
for /F "skip=1 tokens=1,2 delims=," %%a in (FileB.txt) do set "FileB[%%a]=%%b"

rem Do the mapping
for /F "skip=1 tokens=1,2* delims=," %%a in (FileA.txt) do (
   rem So, if a value of column 2 of FileA is found in column 1 of FileB
   if defined FileB[%%b] (
      rem spool the entire line of FileA into a file name HCT_ACT_0#.csv
      rem The number is determine by column 2 of FileB
      >> HCT_ACT_0!FileB[%%b]!.csv echo %%a,%%b,%%c
   )
)
Antonio

Post Reply