The task you requested is complicated; however, you have not posted a single specification that could help us develop a solution. How many files can there be? Do all the files have the elements in the same order? Are there special characters in the files? What is the length of the longest line? In order to write a program that solves this request we have to assume a lot of things; problems will arise if the assumptions do not match the actual data...
This problem is interesting, so I wrote a possible solution for it. There are many different ways to solve it. I chose a method based on several concurrent (parallel) processes, each of which processes one file; the different sections of the output are synchronized via WAITFOR signals. IMHO this is the most efficient method to solve this problem.
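To illustrate the synchronization idea in isolation, here is a minimal sketch of one Batch file that relaunches itself as a background process and waits for it through a WAITFOR signal. The signal name DemoDone and the "worker" argument are made up for this example; the ping line only simulates work. As far as I know, WAITFOR /SI reports an error when no process is waiting for the signal yet, so the sender should fire only after the receiver is already waiting.

```bat
@echo off
rem Sketch only: the signal name DemoDone and the "worker" argument are hypothetical
if "%~1" equ "worker" goto worker

rem Master: relaunch this same file in the background, then block on the signal
start "" /B "%~f0" worker
waitfor DemoDone > NUL
echo Master: worker has finished
goto :EOF

:worker
rem Simulate about two seconds of work, so the master is already waiting
ping -n 3 127.0.0.1 > NUL
echo Worker: done, sending signal
waitfor /SI DemoDone > NUL
exit
```

The worker ends with exit because START launches a new cmd.exe for the Batch file, and that one would otherwise stay open.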
The structure of the *.xml files must be this one:
- <?xml ..... >
- <COLLADA ...>
- <asset>
- Variable number of lines
- </asset>
From the next line on, this tag structure repeats:
- <tagName>
- <name id="child element 1 id" ...>
- variable number of lines
- </name>
- <name id="child element 2 id" ...>
- variable number of lines
- </name>
- . . .
- </tagName>
And the file ends in:
- </COLLADA>
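The script further down takes the element name and the id out of such lines with two nested FOR /F commands. A minimal sketch of that parsing on its own, using a hypothetical sample line (the "image" element and the "tex01" id are invented for the example):

```bat
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample line, indented as it would be inside the file
set "line=    <image id="tex01" name="tex01">"

rem Token 2 with "<" and ">" as delimiters is the text between the brackets
for /F "tokens=2 delims=<>" %%b in ("!line!") do (
   rem Tokens 1 and 3 of that text are the element name and the quoted id
   for /F "tokens=1,3 delims=<= " %%c in ("%%b") do echo tag=%%c id=%%~d
)
```

This shows tag=image id=tex01. Note that the method assumes the id is the first attribute of the child element and that it contains no spaces.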
The *first* input file defines the shape of the output result, that is, it specifies the order of the output elements. If another input file has its elements in a different order, or does not have the same elements as the first file, the method will fail.
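A quick way to check this assumption before running the merge could look like the sketch below. It is only a rough test: it assumes the parent tags are written alone on their lines as <library_...> (the names that appear in my test data), it ignores any other parent tag, and the .tagorder.tmp extension is just a made-up name for the temporary files.

```bat
@echo off
setlocal EnableDelayedExpansion
rem Sketch: list the <library_...> opening lines of each file, in order,
rem and compare each list against the list of the first file
set "first="
for %%f in (*.xml) do (
   findstr /R /C:"^ *<library_[a-z_]*>" "%%f" > "%%~Nf.tagorder.tmp"
   if not defined first (
      set "first=%%~Nf.tagorder.tmp"
   ) else (
      rem /W makes FC ignore differences in the amount of white space
      fc /W "!first!" "%%~Nf.tagorder.tmp" > NUL && echo %%f: same tags and order || echo %%f: DIFFERENT tags or order
   )
)
rem Remove the temporary lists (hypothetical extension, see above)
del *.tagorder.tmp
```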
No line longer than 8192 characters will be read (nor copied). There is no easy way to circumvent this Batch limitation.
This is the first version of my solution:
@echo off
setlocal EnableDelayedExpansion
REM https://www.dostips.com/forum/viewtopic.php?f=3&t=10579
REM Antonio Perez Ayala
rem If this .bat file is asynchronously invoked as a coroutine: start it
if "%1" equ "" goto begin
if %1 equ 1 (goto StartFirst) else goto StartRest
rem Start the asynchronous process of each *.xml file:
rem First Part:
rem - Copy up to "</asset>" line of first file
rem - and omit up to "</asset>" line in rest of files
rem Second Part:
rem - Process each parent tag in first file; then
rem - process same parent tag in rest of files
:begin
set n=0
for %%f in (*.xml) do (
   set /A n+=1
   set "file[!n!]=%%~Nf"
)
ECHO Start Process @ %time:~0,-3%
del output.txt 2> NUL
for /L %%i in (1,1,%n%) do start "" /B "%~F0" %%i
WaitFor File1End > NUL
ECHO End Process @ %time:~0,-3%
goto :EOF
=================================================
:StartFirst Start the asynchronous coroutine to process the first file
ECHO - Process of file #%1 (!file[%1]!.xml) START
set "inTag="
for /F "usebackq delims=" %%a in ("!file[%1]!.xml") do (
   if not defined inTag (
      rem First part: Copy to output file up to "</asset>" input line
      >> output.txt echo %%a
      if "%%a" equ " </asset>" (
         set "inTag=1"
         set "tagName="
      )
   ) else (
      rem Second part: Copy each tag and its children
      for /F "tokens=2 delims=<>" %%b in ("%%a") do (
         if not defined tagName (
            rem Start of tagNameN
            set "tagName=%%b"
            set "childName="
            del childIds.txt 2> NUL
            set /A "add=0"
            SET /P "=- - File #%1 (!file[%1]!.xml) TAG %%b: " < NUL
            >> output.txt echo %%a
         ) else if "%%b" equ "/!tagName!" (
            rem End of tagNameN in *this* First File:
            ECHO !add! items added
            rem process same tagNameN in rest of files
            for /L %%i in (2,1,%n%) do (
               WaitFor /SI File%%iON > NUL
               WaitFor File%%iOFF > NUL
            )
            >> output.txt echo %%a
            set "tagName="
         ) else if not defined childName (
            for /F "tokens=1,3 delims=<= " %%c in ("%%b") do set "childName=%%c" & set "childId=%%~d"
            >> childIds.txt echo !childId!
            >> output.txt echo %%a
            set /A add+=1
         ) else if "%%b" equ "/!childName!" (
            set "childName="
            >> output.txt echo %%a
         ) else (
            >> output.txt echo %%a
         )
      )
   )
)
>> output.txt echo ^</COLLADA^>
del childIds.txt
ECHO - Process of file #%1 (!file[%1]!.xml) END
WaitFor /SI File1End > NUL
exit
=================================================
:StartRest Start the asynchronous coroutine to process each one of the rest of files
ECHO - Process of file #%1 (!file[%1]!.xml) START
set "inTag="
for /F "usebackq delims=" %%a in ("!file[%1]!.xml") do (
   if not defined inTag (
      rem First part: Omit up to "</asset>" input line
      if "%%a" equ " </asset>" (
         set "inTag=1"
         set "tagName="
      )
   ) else (
      rem Second part: Copy each tag and its children
      for /F "tokens=2 delims=<>" %%b in ("%%a") do (
         if not defined tagName (
            rem Start of tagNameN, wait for "master's" signal to proceed
            WaitFor File%1ON > NUL
            rem Load current childIds
            setlocal EnableDelayedExpansion
            for /F %%i in (childIds.txt) do set "child[%%i]=1"
            set "tagName=%%b"
            set "childName="
            set /A "add=0, omit=0"
            SET /P "=- - File #%1 (!file[%1]!.xml) TAG %%b: " < NUL
         ) else if "%%b" equ "/!tagName!" (
            rem End of tagNameN in this additional File:
            ECHO !add! items added, !omit! omitted
            rem release childIds and inform to "master"
            endlocal
            set "tagName="
            WaitFor /SI File%1OFF > NUL
         ) else if not defined childName (
            for /F "tokens=1,3 delims=<= " %%c in ("%%b") do set "childName=%%c" & set "childId=%%~d"
            if not defined child[!childId!] (
               set "child[!childId!]=1"
               >> childIds.txt echo !childId!
               >> output.txt echo %%a
               set /A "add+=1, inChild=1"
            ) else (
               set /A "omit+=1"
               set "inChild="
            )
         ) else if "%%b" equ "/!childName!" (
            set "childName="
            if defined inChild >> output.txt echo %%a
            set "inChild="
         ) else if defined inChild (
            >> output.txt echo %%a
         )
      )
   )
)
ECHO - Process of file #%1 (!file[%1]!.xml) END
exit
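The duplicate check in the StartRest part relies on one variable per id: an element is copied only if child[id] is not defined yet. A minimal standalone sketch of that idea, with a made-up list of ids in place of the ids read from the files:

```bat
@echo off
setlocal EnableDelayedExpansion
rem Sketch: mark each id as "seen" by defining a child[id] variable
rem The ids in this list are hypothetical sample data
set /A "add=0, omit=0"
for %%i in (img1 img2 img1 img3 img2) do (
   if not defined child[%%i] (
      set "child[%%i]=1"
      set /A add+=1
   ) else (
      set /A omit+=1
   )
)
echo %add% items added, %omit% omitted
```

This shows "3 items added, 2 omitted". The same restriction as in the full program applies: an id with spaces or special characters would not work as part of a variable name in this simple form.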
This is the output report when I run this program with the posted data:
Start Process @ 17:04:17
- Process of file #1 (001.xml) START
- Process of file #2 (002.xml) START
- Process of file #3 (003.xml) START
- - File #1 (001.xml) TAG library_images: 83 items added
- - File #2 (002.xml) TAG library_images: 79 items added, 58 omitted
- - File #3 (003.xml) TAG library_images: 26 items added, 39 omitted
- - File #1 (001.xml) TAG library_materials: 83 items added
- - File #2 (002.xml) TAG library_materials: 79 items added, 58 omitted
- - File #3 (003.xml) TAG library_materials: 26 items added, 39 omitted
- - File #1 (001.xml) TAG library_effects: 83 items added
- - File #2 (002.xml) TAG library_effects: 79 items added, 58 omitted
- - File #3 (003.xml) TAG library_effects: 26 items added, 39 omitted
- - File #1 (001.xml) TAG library_geometries: 27 items added
- - File #2 (002.xml) TAG library_geometries: 46 items added, 8 omitted
- - File #3 (003.xml) TAG library_geometries: 18 items added, 11 omitted
- - File #1 (001.xml) TAG library_visual_scenes: 1 items added
- - File #2 (002.xml) TAG library_visual_scenes: 1 items added, 0 omitted
- - File #3 (003.xml) TAG library_visual_scenes: 1 items added, 0 omitted
- - File #1 (001.xml) TAG scene: 1 items added
- - File #2 (002.xml) TAG scene: 1 items added, 0 omitted
- Process of file #2 (002.xml) END
- - File #3 (003.xml) TAG scene: 1 items added, 0 omitted
- Process of file #3 (003.xml) END
- Process of file #1 (001.xml) END
End Process @ 17:17:09
It seems that the output result is correct. However, it is also obvious that the result does not include tags that are not present in the first file... I must modify the method in order to fix this point, but first I need to know whether the order of the elements (tags) is the same in all files.
Antonio