Extracting specific strings from sets of multiple lines

Posted: 30 Jul 2017 01:20
by plasma33
Hello again,

I would like to extract specific strings from sets of aligned strings. That is, from each set of aligned lines I want only the segments where the two strings match each other exactly, skipping the gaps (i.e. the hyphens). Please see below for a clearer picture:
Input:

Code: Select all

IGRNRKGDGARGRGRGRGRHRIKLRGIG
||||||||||||||||||||||||||||
------GDGARGRGRGRGRHRIKLRGIG

IGKKRHIG
||||||||
IG------

GDGARGRGRGRGRHRIKMRGIG
||||||||||||||||||||||
GDGARGRGRGRGRHRIKMRGIG

LIGRGRIIGKK
|||||||||||
LIGR------K

GDGARGRGRGRGRHRLRLRGIGKLKIIGRHR
|||||||||||||||||||||||||||||||
GDGARGRGRGRGRHRLRLRGIGK------HR

IGRHRK------GDGARGRGRGRGRHRLRPRGIGR
|||||||||||||||||||||||||||||||||||
IGRHRKIGRKRLGDGARGRGRGRGRHRLRPRGIGR

and so on
.
.
.


Output:

Code: Select all

GDGARGRGRGRGRHRIKLRGIG
IG
GDGARGRGRGRGRHRIKMRGIG
LIGR
K
GDGARGRGRGRGRHRLRLRGIGK
HR
IGRHRK
GDGARGRGRGRGRHRLRPRGIGR
and so on
.
.
.


Thanks.

Plasma33

Re: Extracting specific strings from sets of multiple lines

Posted: 30 Jul 2017 12:15
by Aacini

Code: Select all

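@echo off
setlocal EnableDelayedExpansion

rem Read input.txt in groups of three non-empty lines:
rem line1 = first string, line2 = row of pipes, line3 = second string.
rem FOR /F skips the blank lines between groups on its own.
set "i=0"
(for /F "delims=" %%a in (input.txt) do (
   set /A i=i%%3+1
   set "line!i!=%%a"
   if !i! equ 3 (
      rem If removing the hyphens changes line1, the gap is in line1
      if "!line1:-=!" neq "!line1!" (
         rem Split on hyphens and print up to two segments
         for /F "tokens=1,2 delims=-" %%b in ("!line1!") do (
            echo %%b
            if "%%c" neq "" echo %%c
         )
      ) else if "!line3:-=!" neq "!line3!" (
         rem Otherwise the gap is in line3 and is handled the same way
         for /F "tokens=1,2 delims=-" %%b in ("!line3!") do (
            echo %%b
            if "%%c" neq "" echo %%c
         )
      ) else (
         rem No gaps at all, so the whole line is the match
         echo !line1!
      )
   )
)) > output.txt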
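
A note on the script above: because it uses "tokens=1,2", it prints at most two segments per line, which covers every sample in the question (one run of hyphens per set). If a line could contain more than one gap, a variant along the following lines might help. This is only a minimal sketch and not part of the original answer. It assumes, like the script above, that only one of the two strings in a set contains hyphens and that the strings contain nothing but letters and hyphens (no spaces, wildcards, or exclamation marks). The variable name "gapped" is made up for illustration.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical variant: prints every segment, however many gaps there are.
set "i=0"
(for /F "delims=" %%a in (input.txt) do (
   set /A i=i%%3+1
   set "line!i!=%%a"
   if !i! equ 3 (
      rem Pick whichever string contains hyphens, defaulting to line1
      set "gapped=!line1!"
      if "!line3:-=!" neq "!line3!" set "gapped=!line3!"
      rem Turn each hyphen into a space, then print every remaining word
      for %%s in (!gapped:-= !) do echo %%s
   )
)) > output.txt
```

As with the original, run it from the folder that holds input.txt; the results go to output.txt. Replacing hyphens with spaces lets a plain FOR loop walk over all segments, so no token count has to be fixed in advance.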

Re: Extracting specific strings from sets of multiple lines

Posted: 30 Jul 2017 20:18
by plasma33
@Aacini, you are a legend! Your code works like a charm. Fast and accurate!! Thanks heaps!!

Plasma33