Unable to understand how to use findstr...
Moderator: DosItHelp
I cannot work out how findstr should be used to extract part of a text file, more precisely all lines from line number n to line number m, into an output file.
Could someone here help me?
Please note that my question is limited to findstr, because such a solution could be used for binary files too (which are my real target).
Thank you in advance.
-
- Expert
- Posts: 1166
- Joined: 06 Sep 2013 21:28
- Location: Virginia, United States
Re: Unable to understand how to use findstr...
Binary files don't necessarily have newline characters, so extracting lines M through N may not always be possible. Also, what gave you the idea that findstr was going to be useful for this?
You can, however, use a for loop.
Code: Select all
@echo off
setlocal enabledelayedexpansion
:: Specify which line to start returning from
set get_line=7
:: Specify how many lines to return, minus one (the loop below counts from 0)
set return_count=5
:: In this example, lines 7 through 12 will be returned
:: The skip loop also counts from 0, and line get_line itself must not be skipped,
:: so subtract 2 to skip exactly get_line-1 lines.
set /a get_line-=2
(
    for /L %%A in (0,1,%get_line%) do set /p skip_line=
    for /L %%A in (0,1,%return_count%) do (
        set /p print_line=
        echo !print_line!
    )
) <file.txt
Re: Unable to understand how to use findstr...
ShadowThief, thank you very much for your support!
You address questions which are not directly relevant to my setting. All the binary files I intend to use have "lines" (i.e. newline characters), and these "lines" are not too long for findstr (there is a length limit). So findstr definitely would work with my binary data and would be useful for me. Why do you wonder so much about this idea?
Your script works fine with text files. (I corrected the variable name in the second for and added >> output.txt to echo.) One question: At the end of the last line a newline is added (in output.txt, a blank 6th line). Do you have an idea how this could be avoided?
But, of course, your script doesn't work with binary files (00!). That's why I ask how to do the same using findstr...
Re: Unable to understand how to use findstr...
If your binary files have newlines, you can probably stick the following line at the top of the script and see if that works.
Code: Select all
type binary_file.exe >file.txt
Re: Unable to understand how to use findstr...
I tested this, yes, and it doesn't work, as I said (the NUL character is ignored). This discussion only takes us further away from my original question.
Re: Unable to understand how to use findstr...
Does it absolutely *have* to be findstr? If so, then I can't help you, but if not, then this is precisely the kind of task I would use some quick-and-dirty C code for.
I attached a binary compiled with tcc. It reads and writes byte by byte, so it's no speed demon:
Code: Select all
#include <stdio.h>
#include <stdlib.h>

/* Copies lines startline..endline (1-based, inclusive) of a file, byte by byte.
   The output file name is the input name with its first character replaced by '#'. */
int main(int argc, char **argv) {
    FILE *fp, *ofp = NULL;
    long si, ei;
    int ok;
    if (argc <= 3) { puts("Usage: cpbinl file startline endline"); return 1; }
    si = atol(argv[2]); ei = atol(argv[3]);
    if (si < 1 || ei < 1 || ei < si) { puts("Invalid index"); return 1; }
    fp = fopen(argv[1], "rb");
    argv[1][0] = '#'; if (fp) ofp = fopen(argv[1], "wb");
    ok = fp && ofp;
    if (ok) {
        long line = 1;
        unsigned char ch;
        /* Stop at end of file, or as soon as the last wanted line has been copied */
        while (line <= ei && fread(&ch, 1, 1, fp) == 1) {
            if (line >= si) fwrite(&ch, 1, 1, ofp);
            if (ch == 0xa) line++;
        }
    } else puts("File error");
    if (fp) fclose(fp);
    if (ofp) fclose(ofp);
    return ok ? 0 : 1;
}
- Attachments
-
- cpbinl.zip
- (1.48 KiB) Downloaded 426 times
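If speed matters, the same idea can be done with block reads instead of one byte at a time. This is only a minimal sketch, not the attached cpbinl: the function name copy_line_range and the 64 KiB buffer size are my own choices for illustration, and only the byte 0x0A is treated as a line terminator, as in the program above. NUL bytes are copied like any other byte.

```c
#include <stdio.h>
#include <string.h>

/* Copy lines first..last (1-based, inclusive) from in to out, reading in
   blocks instead of one byte at a time. Only 0x0A ends a line.
   Returns the number of bytes written. */
static long copy_line_range(FILE *in, FILE *out, long first, long last)
{
    unsigned char buf[65536];
    long line = 1, written = 0;
    size_t got;

    while (line <= last && (got = fread(buf, 1, sizeof buf, in)) > 0) {
        size_t pos = 0;
        while (pos < got && line <= last) {
            /* Find the end of the current line within this block */
            unsigned char *nl = memchr(buf + pos, 0x0a, got - pos);
            size_t end = nl ? (size_t)(nl - buf) + 1 : got;
            if (line >= first)
                written += (long)fwrite(buf + pos, 1, end - pos, out);
            if (nl)
                line++;        /* a newline closes the current line */
            pos = end;
        }
    }
    return written;
}
```

Both streams should be opened in binary mode ("rb" / "wb"), otherwise the C runtime on Windows translates line endings and stops at Ctrl-Z.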
Re: Unable to understand how to use findstr...
That's just fantastic, misol101!
It does not solve my problem with using findstr (and I still hope someone will show me a solution), but:
Your answer allows me to think of a very different concept for my general project. I did not know that C and TCC make it possible to write such extremely small programs.
I am extremely excited and thank you very much for your answer.
May I send you a PM?
Re: Unable to understand how to use findstr...
Sure thing, though I'm not sure how much time I will have to look into it.
Re: Unable to understand how to use findstr...
If the lines do not have leading white space or a leading equals sign (=), `set /p "=!var!"<nul>outFile` can be used to write them without a trailing newline. On Vista and later, set /p strips leading white space from the prompt string.
An alternative, less efficient but working method is the prompt trick:
Code: Select all
setlocal
.
.
set "prompt=!var:$=$$!"
cmd /d /k <nul>outFile
.
.
endlocal
Code: Select all
for /L %%A in (0,1,%return_count%) do (
set /p print_line=
set "prompt=!print_line:$=$$!"
cmd /d /k <nul
if %%A LSS %return_count% echo,
)
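For comparison, if the extracted range is handled in C rather than in batch, dropping the newline after the last line is just a length adjustment. A small sketch under my own assumptions: length_without_final_newline is a hypothetical helper name, and the extracted range is assumed to be in memory already.

```c
#include <stddef.h>

/* Return the length of buf without one trailing line terminator
   ("\n" or "\r\n"), if there is one. The data itself is not modified. */
static size_t length_without_final_newline(const unsigned char *buf, size_t len)
{
    if (len > 0 && buf[len - 1] == '\n') {
        len--;
        if (len > 0 && buf[len - 1] == '\r')
            len--;
    }
    return len;
}
```

Writing only that many bytes with fwrite leaves the output without the extra line end.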
Re: Unable to understand how to use findstr...
That is not necessarily true. As far as I know, findstr on its own does not have the capability to filter by line number. So the assumption that any solution to your task involving findstr can automatically handle binary files is not true.
However, there is a method which can handle binary files based on the criteria you have specified. There is also a method which can only handle text files (or, more precisely, files without NUL bytes), but in a much more efficient way.
findstr solution to handle text files:
Code: Select all
@echo off
setlocal EnableExtensions DisableDelayedExpansion
:: Parameters
set "Input=input.txt"
set "Output=output.txt"
:: The minimum value for startLine is 1
set /a "startLine=5, endLine=9"
set /a "startLine-=1"
if %startLine% EQU 0 (set "skip=") else set "skip=skip=%startLine%"
(for /F "%skip% tokens=1* delims=:" %%K in ('findstr /N /R "^" "%Input%"') do (
if %%K LEQ %endLine% (echo(%%L)
))>"%Output%"
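A note on the parsing in the script above: the for /F options split each line of findstr /N output ("number:content") at colons. As far as I know, for /F treats consecutive delimiters as one, so content that itself starts with a colon loses its leading colons. Purely for illustration, here is a C sketch of splitting at the first colon only; split_numbered_line is a hypothetical name and not part of any tool mentioned in this thread.

```c
#include <ctype.h>
#include <stdlib.h>

/* Split one line of "number:content" at the first colon only.
   Returns the line number and points *content at the text after the colon,
   or returns -1 if the line does not have that shape. */
static long split_numbered_line(const char *line, const char **content)
{
    char *end;
    long number;

    if (!isdigit((unsigned char)line[0]))
        return -1;                 /* must start with a digit */
    number = strtol(line, &end, 10);
    if (*end != ':' || number < 1)
        return -1;                 /* no colon after the number */
    *content = end + 1;            /* keep any further colons in the content */
    return number;
}
```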
As the gap between endLine and the total number of lines grows (e.g. endLine=130, totalLines=800), the performance of this loop degrades, because a batch script has no means of breaking out of a FOR loop immediately: once the job is done there is no need to remain in the loop, yet it keeps running. The only command that can immediately break a FOR loop is `exit`, but that also terminates the host cmd instance. The solution is to execute the FOR loop in a child instance of cmd, together with the exit command.
So the FOR loop block can be replaced with this:
Code: Select all
>"%Output%" cmd /e:on /v:off /d /c for /F "%skip% tokens=1* delims=:" %%K in ^('findstr /N /R "^" "%Input%"'^) do @if %%K LEQ %endLine% ^(echo(%%L^) else exit
findstr solution to handle binary files:
Code: Select all
@echo off
setlocal EnableExtensions DisableDelayedExpansion
:: Parameters
set "Input=input.bin"
set "Output=output.bin"
set /a "startLine=5, endLine=9"
set /a "cureLine=startLine-1, Lines=endLine-startLine+1, MaxDigits=9"
(for /L %%. in (1,1,%Lines%) do (
set /a "cureLine+=1, Num=cureLine, LeadChars=1"
for /L %%. in (1,1,%MaxDigits%) do set /a "LeadChars+=!!Num, Num/=10"
findstr /N /R "^" "%Input%"|(findstr /R "^%%cureLine%%:")|((for /L %%. in (1,1,%%LeadChars%%) do @pause)>nul & findstr /R "^")
REM // This line takes into account the possibility for disabled command extensions. Use instead of above if that is a concern.
REM findstr /N /R "^" "%Input%"|(findstr /R "^%%cureLine%%:")|cmd /e:on /d /c ^(for /L %%^^^. in ^(1,1,%%LeadChars%%^) do @pause^)^>nul ^& findstr /R "^"
))>"%Output%"
Be aware that this method is extremely slow and inefficient in terms of performance but should do the job. That was the best I could come up with.
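As far as I can tell, the inner for /L loop in the script above works out how many characters the "number:" prefix added by findstr /N occupies, so that exactly that many pause commands can consume the prefix before the last findstr passes the rest of the line through. The same arithmetic as a C sketch (prefix_length is a hypothetical name; the limit of 9 digits mirrors MaxDigits in the script):

```c
/* Number of characters in the "number:" prefix for a given line number,
   computed the same way as the set /a loop: start at 1 for the colon,
   then add 1 for every decimal digit while dividing by 10. */
static int prefix_length(long line_number)
{
    int chars = 1;                 /* the ':' separator */
    long n = line_number;
    int i;

    for (i = 0; i < 9; i++) {      /* MaxDigits=9 in the batch script */
        chars += (n != 0);         /* same effect as !!Num in set /a */
        n /= 10;
    }
    return chars;
}
```

For line 5 the prefix "5:" is 2 characters, for line 12 the prefix "12:" is 3.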