Discussion forum for all Windows batch related topics.
Moderator: DosItHelp
-
plasma33
- Posts: 22
- Joined: 26 Jul 2017 21:18
#1
by plasma33 » 29 Jul 2017 21:38
Hello everyone,
I would like to pull fixed-size sets of lines out of a file of more than 500,000 lines and join them into a single set. A simple demonstration of what I would like to achieve is shown below:
Input:
Code: Select all
RGRGRKRGRHRGRGRGIGRMKHIGRMRRIGKMKMIGRHRLIGRIRNIGRL
|||||||||||||||||||||||||.|:||:||.|||.:.|||:::|||.
RGRGRKRGRHRGRGRGIGRMKHIGRGRKIGRMKHIGRLKHIGRMKHIGRH
RGIGRKRGIGRGRGIGRQRHIGKLKHIGRGRIIGRGRGIGRGRGIGRGRG
:.||:.:.|||.|.|||.|.||:.:.|||.|.|||||||||.:.|||.:.
KLIGKMKMIGRHRLIGRGRGIGRQRGIGRKRNIGRGRGIGRMKHIGRNKM
IGRRRRIGKKKKGDGARGRGRKRGRHRGRHRGIGRMKHIGRGRGIGKMKM
|||.:.||::::|||||||||||||||||||||||||||||.:.||||||
IGRMKHIGRRRQGDGARGRGRKRGRHRGRHRGIGRMKHIGRRKMIGKMKM
IGRHRLIGRIRMIGRLRGIGRKRGIGRGRGIGRGRRIGKMKLIGRGRRIG
|||||||||.|.|||.|||||||.|||||||||.:.||:.:.|||.:.||
IGRHRLIGRGRKIGRQRGIGRKRNIGRGRGIGRMKHIGRHRRIGRMKHIG
KKKLIGRGRRIGKMRHIGRMRQIGRNRNGDGARGRGRKRGRHRGRIRGIG
:.|.|||.:.||:.:.||:|:.|||:|.||||||||||||||||||||||
RIKHIGRMKHIGRRKMIGKMKMIGRHRLGDGARGRGRKRGRHRGRIRGIG
Output:
Code: Select all
RGRGRKRGRHRGRGRGIGRMKHIGRMRRIGKMKMIGRHRLIGRIRNIGRLRGIGRKRGIGRGRGIGRQRHIGKLKHIGRGRIIGRGRGIGRGRGIGRGRGIGRRRRIGKKKKGDGARGRGRKRGRHRGRHRGIGRMKHIGRGRGIGKMKMIGRHRLIGRIRMIGRLRGIGRKRGIGRGRGIGRGRRIGKMKLIGRGRRIGKKKLIGRGRRIGKMRHIGRMRQIGRNRNGDGARGRGRKRGRHRGRIRGIG
|||||||||||||||||||||||||.|:||:||.|||.:.|||:::|||.:.||:.:.|||.|.|||.|.||:.:.|||.|.|||||||||.:.|||.:.|||.:.||::::|||||||||||||||||||||||||||||.:.|||||||||||||||.|.|||.|||||||.|||||||||.:.||:.:.|||.:.||:.|.|||.:.||:.:.||:|:.|||:|.||||||||||||||||||||||
RGRGRKRGRHRGRGRGIGRMKHIGRGRKIGRMKHIGRLKHIGRMKHIGRHKLIGKMKMIGRHRLIGRGRGIGRQRGIGRKRNIGRGRGIGRMKHIGRNKMIGRMKHIGRRRQGDGARGRGRKRGRHRGRHRGIGRMKHIGRRKMIGKMKMIGRHRLIGRGRKIGRQRGIGRKRNIGRGRGIGRMKHIGRHRRIGRMKHIGRIKHIGRMKHIGRRKMIGKMKMIGRHRLGDGARGRGRKRGRHRGRIRGIG
Thanks.
Plasma33
-
ShadowThief
- Expert
- Posts: 1166
- Joined: 06 Sep 2013 21:28
- Location: Virginia, United States
#2
by ShadowThief » 29 Jul 2017 22:45
This is the third question (including the one StackOverflow question I saw) I've seen from you about data in this format. I'm really curious about what it could possibly be used for.
-
plasma33
- Posts: 22
- Joined: 26 Jul 2017 21:18
#3
by plasma33 » 29 Jul 2017 23:27
ShadowThief wrote: This is the third question (including the one StackOverflow question I saw) I've seen from you about data in this format. I'm really curious about what it could possibly be used for.
Hi there,
These are biological sequences and I am trying to extract the conserved regions (common substrings) from the aligned sequences for research purposes.
Thanks.
Plasma33
-
Aacini
- Expert
- Posts: 1914
- Joined: 06 Dec 2011 22:15
- Location: México City, México
#4
by Aacini » 30 Jul 2017 11:52
Code: Select all
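@echo off
setlocal EnableDelayedExpansion
echo Processing file, please wait...
rem Capture a carriage return in CR so the progress counter can overwrite itself
for /F %%a in ('copy /Z "%~F0" NUL') do set "CR=%%a"
rem Remove temporary files left over from a previous run
for /L %%i in (1,1,3) do del output%%i.txt 2> nul
set /A "out=0, lineNum=0"
rem Input is redirected from NUL so that SET /P prints its prompt and returns at once
< nul (for /F "delims=" %%a in (input.txt) do (
    set "line=%%a"
    rem Cycle out through 1, 2, 3 and count the lines read so far
    set /A "out=out%%3+1, lineNum+=1"
    rem Append the line, without a line break, to the temporary file for its row
    set /P "=!line!" >> output!out!.txt
    set /P "=Line: !lineNum!!CR!"
))
rem Join the three temporary files, one per output row, into output.txt
(for /L %%i in (1,1,3) do type output%%i.txt & del output%%i.txt & echo/) > output.txt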
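A note on the key step for anyone following along: the expression `out=out%%3+1` makes `out` cycle through 1, 2, 3, so the first line of each three-line set goes to output1.txt, the second to output2.txt and the third to output3.txt. Below is a minimal sketch of just that counter, with no file access. The loop bound of 6 is an arbitrary choice for illustration and is not part of the script above.

```bat
@echo off
setlocal EnableDelayedExpansion
rem Illustration only: report the target file for six consecutive input lines
set /A "out=0"
for /L %%n in (1,1,6) do (
    rem Same arithmetic as the script: remainder of out divided by 3, plus 1
    set /A "out=out%%3+1"
    echo Line %%n goes to output!out!.txt
)
```

Saved as a .bat file and run, this should report lines 1 and 4 going to output1.txt, lines 2 and 5 to output2.txt, and lines 3 and 6 to output3.txt.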
-
plasma33
- Posts: 22
- Joined: 26 Jul 2017 21:18
#5
by plasma33 » 30 Jul 2017 20:16
@Aacini, thanks for your code. It works just as I wanted. I love how it shows the number of lines it has processed, and how it splits the lines across separate text files. On top of that, your code is much faster than mine: it processed a 17 MB file in under 5 minutes. Hats off and thanks again.
Plasma33
-
Aacini
- Expert
- Posts: 1914
- Joined: 06 Dec 2011 22:15
- Location: México City, México
#6
by Aacini » 30 Jul 2017 20:56
Oops! I just realized that the program should run slightly faster if modified in this way:
Code: Select all
@echo off
setlocal EnableDelayedExpansion
echo Processing file, please wait...
for /F %%a in ('copy /Z "%~F0" NUL') do set "CR=%%a"
for /L %%i in (1,1,3) do del output%%i.txt 2> nul
set /A "out=0, lineNum=0"
< nul (for /F "delims=" %%a in (input.txt) do (
    set /A "out=out%%3+1, lineNum+=1"
    rem Write the FOR variable directly instead of copying it to a variable first
    set /P "=%%a" >> output!out!.txt
    set /P "=Line: !lineNum!!CR!"
))
(for /L %%i in (1,1,3) do type output%%i.txt & del output%%i.txt & echo/) > output.txt
Antonio
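Both versions rely on `set /P` with its input redirected from NUL to write text without a trailing line break, which is what lets the pieces of each row build up on a single line. Here is a small sketch of that idiom on its own. The file name demo.txt and the three sample fragments are hypothetical, chosen only for this illustration.

```bat
@echo off
rem demo.txt is a hypothetical scratch file used only for this illustration
del demo.txt 2> nul
for %%s in (RGRG IGRM KHIG) do (
    rem SET /P prints the text after the equals sign as a prompt, with no line break
    < nul set /P "=%%s" >> demo.txt
)
rem demo.txt now holds the three fragments joined on a single line
type demo.txt
del demo.txt
```

Using `echo %%s >> demo.txt` instead would put each fragment on its own line, so the rows would never join.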
-
plasma33
- Posts: 22
- Joined: 26 Jul 2017 21:18
#7
by plasma33 » 01 Aug 2017 20:46
Hello Aacini,
Yes, it does. Thanks for the modified code. You are a life saver!!
Plasma33