Find files, concat, rename, convert, store in new folder
Posted: 17 Oct 2014 18:02
Hello all. I'm brand new to this forum. I've been around since DOS but abandoned it for Windows once they finally got that working (circa 2007), so I'm a real batch novice.
Several years ago I enlisted help from a forum like this one to write a batch file. The batch works like a charm for what it does. Now I need it to do something slightly different. What I start with is a CD containing TIF files that are named for the document, with the page number as the extension. These are scanned documents named, for example, 01424.001, 01424.002, etc., depending on how many pages there are. The batch file goes to the CD, collects all the TIF page files for one document, concatenates them into one TIF file inside a prepared destination folder, converts that file to PDF, and stores the new PDF file in the same destination folder. Then it moves on to the next document until it reaches the end of the files on the CD. When it has finished converting a document, it deletes the concatenated TIF file.
Now I have changed jobs. In my new situation I still need the original batch file for CD conversions; however, I need something slightly different, too. The people at the new job have been storing all these .001, .002... files in folders for about 10 years. They have a fairly good filing system by numbered volumes, but the volumes are buried inside other volumes. So to get to volume 645 I have to open volume 640-649 first. Then the files I need to convert are inside volume 645. There are almost 800,000 of the TIF files residing inside over 1,000 volume folders.
What I would like to have is a batch file (or something) that will convert this mass of files all at once: something that will peek into the nest of folders, find the *.001 TIF files inside each folder, run the concatenation and conversion, and then store the new PDF file inside a new folder, with the name of the original folder as part of the new PDF's name. So the new PDF files that came from the folder named 645 will be named something like "Volume 645 page 1424.pdf" and will be located inside a folder named "645 Converted". It would be incredible if the conversion routine would step from folder to folder on its own, running the conversions as it goes. This will be a one-time conversion, unless I change jobs again.
Here is the batch file I'm using now. The programs it calls, tiffcp and tiff2pdf, along with their support files, reside inside the c:\bin folder.
----------------------
@echo off & setlocal EnableExtensions EnableDelayedExpansion
set oldpath=%PATH%
set PATH=%PATH%;c:\bin;
:: Source location for the original files (SRC) on CD
set SRC=d:\
:: Destination location for the PDF files (DST)
set DST=c:\Destination\
:: Commands are only echoed until %DBG% is set to nothing
set DBG=ECHO/
::set "DBG="
pushd %SRC%
for /F "tokens=*" %%A in ('dir /B/A-D/ONE "*.001"') do (
set "PG="
for /F "tokens=*" %%B in (
'dir /B/A-D/ONE "%%~nA.0*"') do set PG=!PG! %%~nxB
rem Concatenate the files into the Destination folder with a .TIF extension
tiffcp -c lzw !PG! %DST%%%~nA.TIF
rem Convert the TIF file to a PDF
tiff2pdf -o %DST%%%~nA.PDF %DST%%%~nA.TIF
rem Delete the concatenated TIF file
DEL %DST%%%~nA.TIF
)
POPD
::Reset the path to the original path
set PATH=%oldpath%
------------------------
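To make the request concrete, here is a rough, untested sketch of how the script above might be adapted to the nested volume folders. It is only an illustration under several assumptions, not a working solution: the source root c:\Volumes, the destination root c:\Converted, and the :ConvertFolder label are made-up placeholders; file and folder names are assumed to contain no spaces inside the page names and no exclamation marks or percent signs; and DBG is left at ECHO so the commands are only printed. The tiffcp and tiff2pdf calls are the same ones the current script uses.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem tiffcp and tiff2pdf live in c:\bin, as in the current script.
set "PATH=%PATH%;c:\bin"
rem ASSUMPTION: placeholder roots; change these to the real locations.
set "SRC=c:\Volumes"
set "DST=c:\Converted"
rem Commands are only echoed while DBG is ECHO; use set "DBG=" to run them.
set "DBG=ECHO"

rem List every folder nested below SRC (SRC itself is not listed) and
rem process only the folders that directly contain *.001 files.
for /F "delims=" %%D in ('dir /B /S /AD "%SRC%"') do (
  if exist "%%D\*.001" call :ConvertFolder "%%D"
)
goto :eof

:ConvertFolder
rem Argument 1 is the full path of one volume folder.
rem Its last path component is the volume name, e.g. 645.
set "VOL=%~nx1"
set "OUT=%DST%\%VOL% Converted"
if not exist "%OUT%" %DBG% md "%OUT%"
pushd "%~1"
for /F "delims=" %%A in ('dir /B /A-D /ONE "*.001"') do (
  rem Gather all pages of this document in name order.
  set "PG="
  for /F "delims=" %%B in ('dir /B /A-D /ONE "%%~nA.0*"') do set "PG=!PG! %%~nxB"
  rem Strip leading zeros so 01424 becomes 1424 in the PDF name.
  set "NUM=%%~nA"
  for /F "tokens=* delims=0" %%N in ("%%~nA") do set "NUM=%%N"
  rem Concatenate the pages, convert to PDF, then delete the temporary TIF.
  %DBG% tiffcp -c lzw !PG! "%OUT%\%%~nA.TIF"
  %DBG% tiff2pdf -o "%OUT%\Volume %VOL% page !NUM!.pdf" "%OUT%\%%~nA.TIF"
  %DBG% del "%OUT%\%%~nA.TIF"
)
popd
goto :eof
```

The idea would be to run it once with DBG left at ECHO, check that the printed commands and names such as "Volume 645 page 1424.pdf" look right, and only then clear DBG to do the real conversion.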