Extracting required columns from the file

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Extracting required columns from the file

#1 Post by bipul049 » 06 Jul 2013 15:01

I am trying to do this in a batch script using FOR /F tokens and findstr.
I can't get the hang of how to find two strings in a file and concatenate the matching pairs. Here is my requirement.

I have a file which consists of these rows:
************FILE START**********
Sending request to WINDOWS process ABCD...
Running behind by: 8 seconds.
Some line here

Sending request to WINDOWS process EFGH...
Running behind by: 2 seconds.
Some line here

Sending request to WINDOWS process IJKL...
Running behind by: 3 seconds.
Some line here

************FILE END*************

I want the output written to another file as:

WINDOWS process ABCD is slow by: 8
WINDOWS process EFGH is slow by: 2
WINDOWS process IJKL is slow by: 3

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Extracting required columns from the file

#2 Post by penpen » 06 Jul 2013 17:40

Something like this should work:

Code:

@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=file.txt"
set "OUTPUT_FILE=out.txt"
set STATE=0

(
  for /f "tokens=4,5*" %%a in ('findstr /R /C:"^Sending request to WINDOWS process .*" /C:"^Running behind by: .*" "%INPUT_FILE%"') do (
    if !STATE! == 0 (
      set "NAME=%%c"
      set "STATE=1"
    ) else (
      echo WINDOWS process !NAME:~0,-3! is slow by: %%a
      set "STATE=0"
    )
  )
)>%OUTPUT_FILE%
endlocal

penpen

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: Extracting required columns from the file

#3 Post by Aacini » 06 Jul 2013 21:53

You may also download my FindRepl.bat program and use it this way:

Code:

< file.txt FindRepl "^Sending request to (WINDOWS process [A-Z]+).*\r\nRunning behind by: (\d*)" "\r\n$1 is slow by: $2\r\n" | FindRepl "^WINDOWS process"


Antonio

probyn
Posts: 7
Joined: 23 May 2013 20:01

Re: Extracting required columns from the file

#4 Post by probyn » 07 Jul 2013 14:06

You have received some excellent replies. Here is another way to accomplish your stated result.
(Tested on WinXP under WinVPC on Win7 Pro host.)

Code:

@echo off & setlocal enabledelayedexpansion
for /f "tokens=2,4-6" %%K in (
 'findstr "process behind" x:\yourpath\yourfile.log'
) do (
  set T2=%%K&set T4=%%L&set T5=%%M&set T6=%%N
  if "!T4!"=="WINDOWS" set out=!T4! !T5! !T6:.=!
  if "!T2!"=="behind"  echo/!out! is slow by: !T4!
)
goto :EOF


Phil Robyn
u d e t o d y e l e k r e b t a n y b o r p

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#5 Post by bipul049 » 09 Jul 2013 08:38

Hello penpen

The previous reply of yours is working as expected. But i have one issue here.

@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=file.txt"
set "OUTPUT_FILE=out.txt"
set STATE=0
set lag_limit=3600
(
  for /f "tokens=4,5*" %%a in ('findstr /R /C:"^Sending request to WINDOWS process .*" /C:"^Running behind by: .*" "%INPUT_FILE%"') do (
    if !STATE! == 0 (
      set "NAME=%%c"
      set "STATE=1"
    ) else (
      IF %%a GTR %lag_limit% (
        echo WINDOWS process !NAME:~0,-3! is slow by: %%a
      )
      set "STATE=0"
    )
  )
)>%OUTPUT_FILE%
endlocal

I have modified your code (added the lag_limit variable and the IF ... GTR check). It's working fine.
But for some reason %%a sometimes gets the value "to". Even with this value, the IF condition is evaluated as true and the line is echoed. Why is that? I think it is taking the ASCII value or something, comparing it with lag_limit and entering the IF block.

What can I change here?

If %%a='to'
Output is: WINDOWS process ABCD is slow by: to

This is wrong. I don't want to echo it if %%a is less than lag_limit (which is 3600 here).

Thanks for your help.

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#6 Post by bipul049 » 09 Jul 2013 09:55

And one more thing I noticed: if the second string is missing for one process, the processes listed after it are skipped. For example:

****************FILE START*****************
Sending request to WINDOWS process ABCD...
Running behind by: 8 seconds.
Some line here

Sending request to WINDOWS process EFGH...
Nothing to fetch
Some line here

Sending request to WINDOWS process IJKL...
Running behind by: 3 seconds.
Some line here
***************FILE END*****************

It gives only this output:

WINDOWS process ABCD is slow by: 8

In this line:
for /f "tokens=4,5*" %%a in ('findstr /R /C:"^Sending request to WINDOWS process .*" /C:"^Running behind by: .*" "%INPUT_FILE%"') do (

If one of the conditions is not met, how can I detect that so that I can modify the code accordingly?

Thanks a lot for your suggestions.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Extracting required columns from the file

#7 Post by penpen » 09 Jul 2013 12:16

I assumed that lines with the following two prefixes always alternate:
1) "Sending request to WINDOWS process"
2) "Running behind by"

I also assumed that their suffixes are:
1) " ", process name, "..."
2) " ", a natural number, " seconds."

If any assumption is violated, this algorithm fails.
In this case prefix 2 is missing for one process; "Nothing to fetch" is there instead.
So in state 1, where the output line is echoed, the next line reported by findstr is again one with prefix 1.
From then on the algorithm acts as if the two prefixes had swapped roles and produces garbage.

You can fix this algorithm to meet the new requirements in several ways.
You could let findstr also search for the prefix /C:"^Nothing to fetch".
Then in state 1 either prefix 2 or this third prefix can occur. That must be handled so that the script does not output something like this: "WINDOWS process ABCD is slow by: ".

You may also remove all setting and testing of the state and
use 'if "%%a" == "WINDOWS"' instead of 'if !STATE! == 0', as the two given prefixes differ in this token.
I thought this could be confusing, so I defined the STATE variable.
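
The token-based variant described above could look roughly like the following. This is only a sketch under the assumptions stated in this post: file.txt and out.txt are the placeholder names from the first reply, and clearing NAME after each echo is an added guard (not part of the original script) so that a stray "Running behind by" line without a preceding process line prints nothing.

```batch
@echo off
setlocal enableDelayedExpansion

rem Assumption: placeholder file names taken from the first reply.
set "INPUT_FILE=file.txt"
set "OUTPUT_FILE=out.txt"
set "NAME="

(
  for /f "tokens=4,5*" %%a in ('findstr /R /C:"^Sending request to WINDOWS process .*" /C:"^Running behind by: .*" "%INPUT_FILE%"') do (
    if "%%a" == "WINDOWS" (
      rem Process line: remember the name, which still carries the trailing dots.
      set "NAME=%%c"
    ) else if defined NAME (
      rem Delay line: the fourth token is the number of seconds.
      echo WINDOWS process !NAME:~0,-3! is slow by: %%a
      set "NAME="
    )
  )
) > "%OUTPUT_FILE%"
endlocal
```

Because the branch is chosen from the line itself rather than from a toggling state, a block such as EFGH with "Nothing to fetch" is simply skipped and the following blocks stay in sync.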

penpen

Edited: Corrected 'if "%%a" == "to"' to 'if "%%a" == "WINDOWS"'

Why you are getting a 'to' is curious; it shouldn't happen. Maybe another unexpected line confuses the algorithm.
Edited2: But it shouldn't, as only lines starting with these prefixes are selected by findstr, and none of them has "to" as a fourth token.
Edited3: Sorry, too late... the second prefix has only three tokens, so a line with this prefix could confuse the algorithm: "Running behind by", whitespace, "to", whitespace
Last edited by penpen on 09 Jul 2013 15:27, edited 3 times in total.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Extracting required columns from the file

#8 Post by penpen » 09 Jul 2013 13:00

bipul049 wrote:But due to some reason %%a is getting value as: "to". Even if its getting this value, its getting compared in that IF LOOP and getting inside the loop and echoeing it. Why is it so? I think its taking the ascii value or something and comparing it with lag_limit and getting inside the loop.

Sorry, I nearly forgot to explain why "to" is greater than "3600":

The syntax is as follows: Operand1 comparator Operand2
If both operands are decimal numbers, they are converted to 32-bit signed integer values and the result follows their natural order.

If one operand is not a decimal number, the operands are compared as strings: String1 comparator String2
Let's assume the strings are an n-tuple and an m-tuple of characters:
String1 := c_1_1, c_1_2, c_1_3, ..., c_1_n
String2 := c_2_1, c_2_2, c_2_3, ..., c_2_m

The two strings are compared character by character from left to right by the integer values of their ASCII characters, using the natural order of those values in most cases:
First: c_1_1 comparator c_2_1,
if this leads to no result this is checked: c_1_2 comparator c_2_2,
and so on.
If one string is longer than the other then all further characters of the shorter string are assumed to have the integer ASCII value of 0.

But there are some characters that are not compared by their ASCII value:
'?' (63), '@' (64), '[' (91), '\' (92), ']' (93), ...
So "? < 3" is true although 63 > 51 (integer values of their ASCII).

So if you do not exactly know:
- what characters are in the strings to compare
- how their order is,
while using batch files:
Do not expect a specific order.
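
One way to guard against a non-numeric token such as "to" is to test the value before the GTR comparison. The following is only a sketch: the :check subroutine and the sample values 8, 4000 and to are made up for illustration, and lag_limit is taken from the script in post #5.

```batch
@echo off
setlocal

set "lag_limit=3600"

call :check 8
call :check 4000
call :check to
endlocal
goto :eof

:check
rem Use all digits as delimiters: if any token is left over, the value is not a plain natural number.
set "value=%~1"
for /f "delims=0123456789" %%x in ("%value%") do (
  echo %value%: not a number, skipped
  goto :eof
)
rem Only digits remain here, so IF compares the operands as numbers.
if %value% GTR %lag_limit% (echo %value%: above limit) else (echo %value%: within limit)
goto :eof
```

Run on its own, this should print "8: within limit", "4000: above limit" and "to: not a number, skipped". Without the guard, "to" is compared as a string and counts as greater than "3600".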

penpen

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Extracting required columns from the file

#9 Post by foxidrive » 10 Jul 2013 03:30

@bipul049

Please provide actual output from your log file and you will get code designed for it. A link to a download site would work well.

If people make too many assumptions it means you get code that doesn't work, or only works sometimes.

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#10 Post by bipul049 » 10 Jul 2013 04:51

Hello,

Sorry, I am very new to batch scripting, but I need to accomplish this requirement.
As suggested by foxidrive, here is my exact requirement. This is my input file (say Input.txt):

***************************Input.txt**************************

EXTRACT ABCD2EF Last Started 2013-05-19 05:31 Status RUNNING
Description Some info here
Checkpoint Lag 00:00:00 (updated 00:00:00 ago)
Log Read Checkpoint File ./path/ab003867
2013-07-10 04:10:19.086666

EXTRACT ABCDEX Last Started 2013-06-12 16:52 Status RUNNING
Description Some info here
Checkpoint Lag 00:30:00 (updated 00:00:07 ago)
VAM Read Checkpoint 2013-07-10 04:13:02.443333
FGH: 0x0037067e:0000eb9c:0009

EXTRACT ABCDEFG Last Started 2013-05-19 05:31 Status RUNNING
Description Some info here
Checkpoint Lag 02:05:00 (updated 00:00:05 ago)
Log Read Checkpoint File ./path/cd003867
2013-07-10 04:10:19.086666

****************************FILE END***************************


I want my output to be as shown below. This is my output file (say Output.txt):

****************************Output.txt**************************
Extract ABCD2EF have lag of: 00:00:00
Extract ABCDEX have lag of: 00:30:00
Extract ABCDEFG have lag of: 02:05:00
****************************FILE END****************************

Optional requirement: Suppose my lag time (02:05:00) ends up in variable %%c. In this case I want to convert this time into seconds. How can I do that?

Thanks all for your help.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Extracting required columns from the file

#11 Post by penpen » 10 Jul 2013 05:58

Although the input looks different from that in your opening post, it is a quite similar task and can be solved in a similar way.
Assumptions:
The process names don't contain any exclamation marks.
There is exactly one line in each text block (e) that starts with "EXTRACT ".
There is exactly one line in each text block (c) that starts with "Checkpoint Lag ".
In each text block, line c follows line e.

Then the following should help you:

Code:

@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=Input.txt"
set "OUTPUT_FILE=Output.txt"

(
   for /f "tokens=1-3" %%a in ('findstr /R /C:"^EXTRACT .*" /C:"^Checkpoint Lag .*" "%INPUT_FILE%"') do (
      if "%%a" == "EXTRACT" (
         set "NAME=%%b"
      ) else (
         for /f "tokens=1-3 delims=:" %%d in ("%%c") do (
            set /a "SECONDS=((1%%d-100)*3600)+((1%%e-100)*60)+(1%%f-100)"
         )
         echo Extract !NAME! have lag of: %%c ^(!SECONDS! seconds^).
      )
   )
) > %OUTPUT_FILE%

endlocal
goto:eof

Breaking the times into seconds is done in the inner for loop.
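
The "1%%d-100" form in the set /a line is there because set /a reads numbers with a leading zero as octal, so fields like 08 or 09 are rejected. A minimal, self-contained sketch of the trick follows; the sample value 02:08:09 is hypothetical, and two-digit fields are assumed.

```batch
@echo off
setlocal

rem Hypothetical sample value; any HH:MM:SS string with two-digit fields should work.
set "LAG=02:08:09"

rem Prefixing 1 turns 08 into 108, which is parsed as decimal; subtracting 100 restores 8.
for /f "tokens=1-3 delims=:" %%d in ("%LAG%") do (
  set /a "SECONDS=((1%%d-100)*3600)+((1%%e-100)*60)+(1%%f-100)"
)
echo %LAG% is %SECONDS% seconds
endlocal
```

This prints "02:08:09 is 7689 seconds". With the plain form (%%d*3600)+(%%e*60)+%%f, the fields 08 and 09 would be reported as invalid numbers.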

penpen

Edit: Fixed the calculation error in the original expression: 'set /a "SECONDS=(%%d*3600)+(%%e*60)+%%f"'.
Last edited by penpen on 29 Apr 2014 03:33, edited 1 time in total.

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#12 Post by bipul049 » 10 Jul 2013 08:04

Thanks a lot penpen. This is working exactly the way i wanted. And yes taking the assumptions into consideration is very imp. These assumptions very much comply to the requirements i had.

Thanks a lot for prompt and great response.

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#13 Post by bipul049 » 23 Feb 2014 18:31

Hello Penpen,

As I mentioned in my earlier post, your code works perfectly as expected.
But I have one new requirement. Whenever my process is stopped for maintenance, the lag also increases and I keep getting any number of mails. If the process status is STOPPED, no mail should be sent. Mail should only be sent if the condition is met while the process is RUNNING.

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Extracting required columns from the file

#14 Post by penpen » 24 Feb 2014 16:38

I assume that you have:
- replaced the echo command in the for loop with a kind of mail command
- the status STOPPED always is the 8th token in such a line: "EXTRACT ABCD2EF Last Started 2013-05-19 05:31 Status STOPPED"

If this is true, you only have to monitor the status, too, and only echo (=mail) when status is not STOPPED:

Code:

@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=Input.txt"
set "OUTPUT_FILE=Output.txt"
set "STATUS=unknown"

(
   for /f "tokens=1-3,8" %%a in ('findstr /R /C:"^EXTRACT .*" /C:"^Checkpoint Lag .*" "%INPUT_FILE%"') do (
      if "%%a" == "EXTRACT" (
         set "NAME=%%b"
         set "STATUS=%%d"
      ) else (
         for /f "tokens=1-3 delims=:" %%d in ("%%c") do (
            set /a "SECONDS=((1%%d-100)*3600)+((1%%e-100)*60)+(1%%f-100)"
         )
         if NOT "!STATUS!" == "STOPPED" echo Extract !NAME! have lag of: %%c ^(!SECONDS! seconds^).
      )
   )
) > %OUTPUT_FILE%

endlocal
goto:eof

I hope this works for you.

penpen

Edit: Fixed the calculation error in the original expression: 'set /a "SECONDS=(%%d*3600)+(%%e*60)+%%f"'.
Last edited by penpen on 29 Apr 2014 03:34, edited 1 time in total.

bipul049
Posts: 18
Joined: 04 Jul 2013 12:19

Re: Extracting required columns from the file

#15 Post by bipul049 » 28 Apr 2014 15:06

Hi Penpen,
I am back. Sorry to disturb you so much, but I NEED TO GET THIS DONE.

My requirement is something specified below.
1. My input file(input.txt) contains this:
***********************************************************

Code:

REPLICAT   ABCDEF  Last Started 2014-04-28 13:05   Status STOPPED
Description          My description here for the process
Checkpoint Lag       00:00:00 (updated 00:00:09 ago)
Log Read Checkpoint  File ./dirdat/abc002345
                     First Record  RBA 443058

REPLICAT   ABCGHI   Last Started 2014-04-28 13:05   Status RUNNING
Description          My description here for the process
Checkpoint Lag       00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint  File ./dirdat/abc002345
                     First Record  RBA 443058

REPLICAT   ABCJKL  Last Started 2014-04-28 13:05   Status STOPPED
Description          My description here for the process
Checkpoint Lag       00:00:07 (updated 00:00:16 ago)
Log Read Checkpoint  File ./dirdat/abc002345
                     2014-04-28 13:57:46.441666  RBA 160572376

***********************************************************
2. Below is the content of my batch file:
---------------------------------------------------------------------------

Code:

@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=input.txt"
set "OUTPUT_FILE=output.txt"
set lag_limit=21600
set checkpoint_limit=600
(
   for /f "tokens=1-5" %%a in ('findstr /R /C:"^REPLICAT .*" /C:"^Checkpoint Lag .*" "%INPUT_FILE%"') do (
      if "%%a" == "REPLICAT" (
         set "NAME=%%b"
      ) else (
         for /f "tokens=1-3 delims=:" %%f in ("%%c") do (
            set /a "SECONDS=(%%f*3600)+(%%g*60)+%%h"
         )
         for /f "tokens=1-3 delims=:" %%i in ("%%e") do (
            set /a "CHECKPOINTLAG=(%%i*3600)+(%%j*60)+%%k"
         )
         if !SECONDS! GTR !lag_limit! (
            echo REPLICAT !NAME! have lag of: %%c.
         )
         if !CHECKPOINTLAG! GTR !checkpoint_limit! (
            echo REPLICAT !NAME! have Checkpoint lag of: %%e.
         )
      )
   )
) > %OUTPUT_FILE%
endlocal
goto:eof

-----------------------------------------------------------------------------
If anything is written to the output file, I send a mail through a separate PowerShell script. Sending mail is not part of this discussion, so I am leaving that out. I don't want anything to be written while my processes are STOPPED.

Condition to be met:
--------------------
If my process is stopped, the batch file should not check anything and should not write anything to the output file.
If the process status is anything other than STOPPED (like RUNNING), it should check the lag and the checkpoint lag and write to the output file if a threshold is exceeded (for lag it is 6 hours here).

With your previous solution I was able to capture if lag is more than threshold but not status. Please let me know what change I will have to make to the batch file to accomplish that.

Thanks a lot for your time.
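
A possible way to combine the status check from post #14 with the script above is sketched below. It is untested against real data and rests on assumptions: the status is always the 8th token of a REPLICAT line, as in the sample input, and every time field has two digits. It also uses the "1%%x-100" form from the edited replies, since fields such as 08 or 09 would otherwise be read by set /a as invalid octal numbers.

```batch
@echo off
setlocal enableDelayedExpansion

set "INPUT_FILE=input.txt"
set "OUTPUT_FILE=output.txt"
set "lag_limit=21600"
set "checkpoint_limit=600"
set "STATUS=unknown"

(
   for /f "tokens=1-5,8" %%a in ('findstr /R /C:"^REPLICAT .*" /C:"^Checkpoint Lag .*" "%INPUT_FILE%"') do (
      if "%%a" == "REPLICAT" (
         rem Header line: token 2 is the process name, token 8 the status.
         set "NAME=%%b"
         set "STATUS=%%f"
      ) else if not "!STATUS!" == "STOPPED" (
         rem Checkpoint Lag line of a process that is not stopped.
         for /f "tokens=1-3 delims=:" %%g in ("%%c") do (
            set /a "SECONDS=((1%%g-100)*3600)+((1%%h-100)*60)+(1%%i-100)"
         )
         for /f "tokens=1-3 delims=:" %%j in ("%%e") do (
            set /a "CHECKPOINTLAG=((1%%j-100)*3600)+((1%%k-100)*60)+(1%%l-100)"
         )
         if !SECONDS! GTR !lag_limit! echo REPLICAT !NAME! have lag of: %%c.
         if !CHECKPOINTLAG! GTR !checkpoint_limit! echo REPLICAT !NAME! have Checkpoint lag of: %%e.
      )
   )
) > "%OUTPUT_FILE%"

endlocal
goto :eof
```

Note that the redirection still creates an empty output file when nothing is echoed, so a mail step triggered by the file would have to check its size rather than its existence.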
