First things first, you can dramatically improve performance of the original script if you update to the
newest version of JREPL.BAT, and use /JMATCHQ (which requires addition of "$txt=" - see the /JMATCHQ documentation), instead of /JMATCH .
Code: Select all
@echo off
setlocal
CALL C:\Hyperion_Batch\Scripts\NTScripts\_env.cmd
set "input=%LOCALEXPORTPATH%ASCEND_Portfolio_Attr_Input.csv"
set "output=%LOCALEXPORTPATH%ASCEND_Portfolio_Attr_Output.csv"
set "beg=var id,c,col=new Array()"
set "begln=c=0"
set "find=(?:^|,)(\q(\q\q|[^\q])*\q|[^,]*)"
set "repl=if(++c==1)id=$0;if(ln==1)col[c]=$0;$txt=ln>2&&c>1?id+col[c]+$0:false"
CALL "%UTILPATH%JREPL\JREPL.bat" find repl /x /v /jmatchq /jbeg beg /jbegln begln /f "%input%" /o "%output%"
I tested performance with a 4000 line input file, all with the most recent version. Using the old /JMATCH, it took 18.5 seconds, and with the new /JMATCHQ it only took 1 second
The performance gain might be even more impressive depending on which version of JREPL you are currently using.
Now your new requirement is only slightly changed. You no longer are skipping a line, and you want to preserve the first two columns (what I call the id) with each replication, not just one. I'm assuming there is still a possibility for quoted values, even though you do not show any in your sample. So I will preserve the complexity of allowing quoted values (possibly with embedded commas and or escaped quotes).
So why not create a single batch script that can handle both jobs (and many others)
You just need to pass in a few configuration parameters:
1 - Input File
2 - Output File
3 - How many extra columns you want to preserve as part of the "ID"
4 - How many lines to skip before you get actual data
To make construction of the FIND regex easier, I first define a FIELD variable that holds a regex that matches a single field value. The variable makes it easy to replicate that regex snippet as needed.
UNPIVOT.BATCode: Select all
:: UNPIVOT.BAT InputFile OutputFile ExtraIDColCount SkipCount
@echo off
setlocal
set "input=%~1"
set "output=%~2"
set "idCnt=%~3"
set "skip=%~4"
set "beg=var id,c,col=new Array()"
set "begln=c=0"
set "field=(?:\q(?:\q\q|[^\q])*\q|[^,]*)"
set "find=^%field%(?:,%field%){%idCnt%}|,%field%"
set "repl=if(++c==1)id=$0;if(ln==1)col[c]=$0;$txt=ln>%skip%&&c>1?id+col[c]+$0:false"
call jrepl find repl /x /v /jmatchq /jbeg beg /jbegln begln /f "%input%" /o "%output%"
So to process your original files:
Code: Select all
UNPIVOT "originalFormatInput" "originalFormatOutput" 0 2
To process your new format:
Code: Select all
UNPIVOT "newFormatInput" "newFormatOutput" 1 1
Dave Benham