How to output a range of lines from a text file using findstr
Posted: 09 Apr 2024 22:54
You might have seen a method using findstr where you list every line number you want to output from a text file.
That certainly works, but it gets unwieldy if you want to include thousands of lines.
Code: Select all
type "myfile.txt" | %SystemRoot%\System32\findstr /N /R /C:".*" | %SystemRoot%\System32\findstr /B /C:"5785:" /C:"5786:" /C:"5787:" /C:"5788:"
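To get a feel for how quickly the explicit list grows, here is a minimal sketch in Python (used only as a scratchpad, it is not part of the batch solution). The helper name explicit_findstr_args is hypothetical.

```python
def explicit_findstr_args(first: int, last: int) -> str:
    """Return one /C:"N:" argument per line number, as in the command above."""
    return " ".join(f'/C:"{n}:"' for n in range(first, last + 1))


# The four-line example from the command above.
print(explicit_findstr_args(5785, 5788))
# /C:"5785:" /C:"5786:" /C:"5787:" /C:"5788:"

# A range like 4533 to 6219 needs one argument per line.
big = explicit_findstr_args(4533, 6219)
print(len(big.split()), len(big))
# 1687 18556
```

That is 1687 arguments and roughly 18,500 characters, which is more than the 8191 characters cmd.exe accepts on one command line.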
So I asked ChatGPT if I could use regex to select a range of lines.
First it tried one pattern for every group of 10 lines
Code: Select all
type myfile.txt | %SystemRoot%\System32\findstr /N "^" | findstr /R "^453[3-9]: ^454[0-9]: ^455[0-9]: ^456[0-9]: ^457[0-9]: ^458[0-9]: ^459[0-9]: ^460[0-9]: ^461[0-9]: ^462[0-9]: ^463[0-9]: ^464[0-9]: ^465[0-9]: ^466[0-9]: ^467[0-9]: ^468[0-9]: ^469[0-9]: ^470[0-9]: ^471[0-9]: ^472[0-9]: ^473[0-9]: ^474[0-9]: ^475[0-9]: ^476[0-9]: ^477[0-9]: ^478[0-9]: ^479[0-9]: ^480[0-9]: ^481[0-9]: ^482[0-9]: ^483[0-9]: ^484[0-9]: ^485[0-9]: ^486[0-9]: ^487[0-9]: ^488[0-9]: ^489[0-9]: ^490[0-9]: ^491[0-9]: ^492[0-9]: ^493[0-9]: ^494[0-9]: ^495[0-9]: ^496[0-9]: ^497[0-9]: ^498[0-9]: ^499[0-9]: ^500[0-9]: ^501[0-9]: ^502[0-9]: ^503[0-9]: ^504[0-9]: ^505[0-9]: ^506[0-9]: ^507[0-9]: ^508[0-9]: ^509[0-9]: ^510[0-9]: ^511[0-9]: ^512[0-9]: ^513[0-9]: ^514[0-9]: ^515[0-9]: ^516[0-9]: ^517[0-9]: ^518[0-9]: ^519[0-9]: ^520[0-9]: ^521[0-9]: ^522[0-9]: ^523[0-9]: ^524[0-9]: ^525[0-9]: ^526[0-9]: ^527[0-9]: ^528[0-9]: ^529[0-9]: ^530[0-9]: ^531[0-9]: ^532[0-9]: ^533[0-9]: ^534[0-9]: ^535[0-9]: ^536[0-9]: ^537[0-9]: ^538[0-9]: ^539[0-9]: ^540[0-9]: ^541[0-9]: ^542[0-9]: ^543[0-9]: ^544[0-9]: ^545[0-9]: ^546[0-9]: ^547[0-9]: ^548[0-9]: ^549[0-9]: ^550[0-9]: ^551[0-9]: ^552[0-9]: ^553[0-9]: ^554[0-9]: ^555[0-9]: ^556[0-9]: ^557[0-9]: ^558[0-9]: ^559[0-9]: ^560[0-9]: ^561[0-9]: ^562[0-9]: ^563[0-9]: ^564[0-9]: ^565[0-9]: ^566[0-9]: ^567[0-9]: ^568[0-9]: ^569[0-9]: ^570[0-9]: ^571[0-9]: ^572[0-9]: ^573[0-9]: ^574[0-9]: ^575[0-9]: ^576[0-9]: ^577[0-9]: ^578[0-9]: ^579[0-9]: ^580[0-9]: ^581[0-9]: ^582[0-9]: ^583[0-9]: ^584[0-9]: ^585[0-9]: ^586[0-9]: ^587[0-9]: ^588[0-9]: ^589[0-9]: ^590[0-9]: ^591[0-9]: ^592[0-9]: ^593[0-9]: ^594[0-9]: ^595[0-9]: ^596[0-9]: ^597[0-9]: ^598[0-9]: ^599[0-9]: ^600[0-9]: ^601[0-9]: ^602[0-9]: ^603[0-9]: ^604[0-9]: ^605[0-9]: ^606[0-9]: ^607[0-9]: ^608[0-9]: ^609[0-9]: ^610[0-9]: ^611[0-9]: ^612[0-9]: ^613[0-9]: ^614[0-9]: ^615[0-9]: ^616[0-9]: ^617[0-9]: ^618[0-9]: ^619[0-9]:"
That almost works, except that I had asked for 4533 to 6219 and the list stops at 6199.
So I asked again and told it: hey, you can probably do 5000 to 5999 with just one pattern.
And it replied
Code: Select all
type myfile.txt | %SystemRoot%\System32\findstr /N "^" | findstr /R "^453[3-9]: ^45[4-9][0-9]: ^4[6-9][0-9][0-9]: ^5[0-9][0-9][0-9]: ^6[0-1][0-9][0-9]: ^620[0-9]: ^621[0-9]:"
And that does actually work
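One way to convince yourself is to brute-force it. The sketch below uses Python's re module as a stand-in for findstr, which should be safe here because the patterns only use ^, literal characters and [x-y] classes. The helper matched_numbers is hypothetical, and per_ten only rebuilds the shape of the first attempt rather than copying it.

```python
import re


def matched_numbers(patterns: str, upto: int = 9999) -> set:
    """Return every line number N whose "N:" prefix matches one of the patterns."""
    regexes = [re.compile(p) for p in patterns.split()]
    return {
        n
        for n in range(1, upto + 1)
        if any(r.match(f"{n}:some text") for r in regexes)
    }


wanted = set(range(4533, 6220))  # 4533 to 6219 inclusive

compact = (
    "^453[3-9]: ^45[4-9][0-9]: ^4[6-9][0-9][0-9]: ^5[0-9][0-9][0-9]: "
    "^6[0-1][0-9][0-9]: ^620[0-9]: ^621[0-9]:"
)
print(matched_numbers(compact) == wanted)
# True

# Same shape as the first attempt: one pattern per group of ten, ending at 619x.
per_ten = " ".join(["^453[3-9]:"] + [f"^{p}[0-9]:" for p in range(454, 620)])
missing = sorted(wanted - matched_numbers(per_ten))
print(missing[0], missing[-1])
# 6200 6219
```

So the compact list matches exactly 4533 to 6219, while the group-of-ten list leaves out 6200 to 6219.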
So now I want to create a function that builds these regex lists from a simple range of line numbers
Code: Select all
::Usage Call :GetRegexRange X-Y X1-Y1 X2-Y2 ... Xn-Yn
:: returns findstr compatible regex list describing the ranges: ^453[3-9]: ^45[4-9][0-9]: ^4[6-9][0-9][0-9]: ^5[0-9][0-9][0-9]: ^6[0-1][0-9][0-9]: ^620[0-9]: ^621[0-9]:
So what is a range? It's probably like the page selection when you print: a series of individual pages and ranges of pages,
for example
Code: Select all
4,6,12,22-38,52-55
You might even have page numbers that repeat, or ranges of pages that go backward
Code: Select all
55,56,1-5,31-21,17,5,5,5,5,30-1
Ranges might also go forward then backward, with more than two stops
Code: Select all
20-25-17-20,10,11,99-89,56,57,59,22-24-26,1-5
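These list formats could be taken apart along the following lines. This is only a Python sketch of the parsing idea, and parse_page_spec is a hypothetical name. It rests on one assumption: an item with more than two stops, such as 20-25-17-20, is read as consecutive legs 20-25, 25-17, 17-20.

```python
def parse_page_spec(spec: str) -> list:
    """Split a print-style spec into (start, end) pairs, keeping direction.

    Assumption: an item with more than two stops, such as 20-25-17-20,
    is read as consecutive legs 20-25, 25-17, 17-20.
    """
    legs = []
    for item in spec.split(","):
        stops = [int(s) for s in item.strip().split("-")]
        if len(stops) == 1:
            # A single page is a range that starts and ends on itself.
            legs.append((stops[0], stops[0]))
        else:
            # Pair each stop with the next one.
            legs.extend(zip(stops, stops[1:]))
    return legs


print(parse_page_spec("4,6,12,22-38,52-55"))
# [(4, 4), (6, 6), (12, 12), (22, 38), (52, 55)]
print(parse_page_spec("20-25-17-20,10"))
# [(20, 25), (25, 17), (17, 20), (10, 10)]
```

Repeats and backward ranges are kept exactly as written, so deciding what to do with them is left to a later step.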
But for now I want the simplest working function, so just two numbers
Code: Select all
235-11579
First thing, split that into two variables
Code: Select all
for /f "tokens=1,2 delims=- " %%a in ("%_MyRange%") do ( set /a _Range1=%%a & set /a _Range2=%%b )
Next, figure out which is the higher number
Code: Select all
if %_Range1% LSS %_Range2% ( set /a _RangeLow=%_Range1% & set /a _RangeHigh=%_Range2% ) else ( set /a _RangeLow=%_Range2% & set /a _RangeHigh=%_Range1% )
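For comparison, the same split-and-order step as a small Python sketch (split_range is a hypothetical name, and it assumes exactly two numbers separated by one dash):

```python
def split_range(text: str) -> tuple:
    """Split "X-Y" into two integers and return them as (low, high)."""
    first, second = (int(part) for part in text.split("-"))
    return min(first, second), max(first, second)


print(split_range("235-11579"))
# (235, 11579)
print(split_range("31-21"))
# (21, 31)
```

A backward range like 31-21 comes out with the lower number first, which is what the else branch has to do in the batch version as well.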
Now it gets harder.
In English, we have to figure out how many digits the higher number has.
I will use these examples
Code: Select all
5-9,5-55,15-555,27-47852,25227-45319,29-2555,40000-40008,39987-40022,45315-45319
Ok, in pseudocode I think it looks like this.
Code: Select all
5-9
Then get the len of each number
Code: Select all
call :len _RangeHigh _RangeHigh_len
call :len _RangeLow _RangeLow_len
In this case both lengths are 1
Now loop %_RangeHigh_len% number of times
In this case 1
First loop
get digit _RangeHigh[1] into _RangeHigh_CurrentDigit
get digit _RangeLow[1] into _RangeLow_CurrentDigit, using 0 if it is ""
so _RangeLow_CurrentDigit=5 and _RangeHigh_CurrentDigit=9
Increment by one _RangeLow_CurrentDigit, decrement by one _RangeHigh_CurrentDigit, except on the last digit position, where both ends are inclusive (so here they stay 5 and 9)
if _RangeHigh_CurrentDigit minus _RangeLow_CurrentDigit is greater than zero
create a regex: ^ for the beginning of line, then [%_RangeLow_CurrentDigit%-%_RangeHigh_CurrentDigit%], then [0-9] added _RangeHigh_len minus one times, and end the regex string with :
Result should be ^[5-9]:
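This simplest case, where only the last digit differs, might look like this as a Python sketch. The name last_digit_regex is hypothetical, and the function assumes low is not greater than high. It also covers examples such as 40000-40008 and 45315-45319 from the list above.

```python
def last_digit_regex(low: int, high: int) -> str:
    """Build one findstr pattern for a range where only the last digit differs."""
    low_text, high_text = str(low), str(high)
    if len(low_text) != len(high_text) or low_text[:-1] != high_text[:-1]:
        raise ValueError("only the last digit may differ in this simple case")
    # Shared leading digits stay literal, the last digit becomes a class.
    return f"^{low_text[:-1]}[{low_text[-1]}-{high_text[-1]}]:"


print(last_digit_regex(5, 9))
# ^[5-9]:
print(last_digit_regex(40000, 40008))
# ^4000[0-8]:
```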
-------------
Next example
Code: Select all
5-55
Code: Select all
call :len _RangeHigh _RangeHigh_len
call :len _RangeLow _RangeLow_len
get digit _RangeHigh[1] in _RangeHigh_CurrentDigit
get digit _RangeLow[1] into _RangeLow_CurrentDigit, using 0 if it is ""
And here is a problem: _RangeLow's first digit is 5, but that is its ones digit, while _RangeHigh's first digit is its tens digit
What needed to happen earlier was to leftpad _RangeLow with zeroes until it has as many digits as _RangeHigh
Ok new version
Code: Select all
call :len _RangeHigh _RangeHigh_len
call :len _RangeLow _RangeLow_len
call :leftpad _RangeLow 0 %_RangeHigh_len%
get digit _RangeHigh[1] in _RangeHigh_CurrentDigit
get digit _RangeLow[1] into _RangeLow_CurrentDigit, using 0 if it is ""
New state is
_RangeLow=05
_RangeHigh=55
_RangeHigh_len=2
_RangeLow[1]->_RangeLow_CurrentDigit=0
_RangeHigh[1]->_RangeHigh_CurrentDigit=5
Increment by one _RangeLow_CurrentDigit, decrement by one _RangeHigh_CurrentDigit
_RangeLow_CurrentDigit=1
_RangeHigh_CurrentDigit=4
_RangeHigh_CurrentDigit minus _RangeLow_CurrentDigit = 3 is greater than zero
Now we create the first regex
^[%_RangeLow_CurrentDigit%-%_RangeHigh_CurrentDigit%]
or ^[1-4]
Then pad with [0-9] _RangeHigh_len minus 1 times and end with :, so that's
Code: Select all
^[1-4][0-9]:
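The padding and first-digit steps up to this point can be sketched in Python as follows. The name middle_block is hypothetical. One deliberate difference from the walkthrough: the check here is "high digit not below low digit", so that equal digits still produce a pattern.

```python
def middle_block(low: int, high: int):
    """Pattern for the full blocks strictly between the two leading digits.

    Returns None when there is no whole block between them.
    """
    width = len(str(high))
    low_text = str(low).zfill(width)  # leftpad with zeroes, 5 becomes 05
    high_text = str(high)
    lo_digit = int(low_text[0]) + 1   # first digit above the low side
    hi_digit = int(high_text[0]) - 1  # first digit below the high side
    if hi_digit < lo_digit:
        return None
    return f"^[{lo_digit}-{hi_digit}]" + "[0-9]" * (width - 1) + ":"


print(middle_block(5, 55))
# ^[1-4][0-9]:
print(middle_block(15, 25))
# None
```

For 5-55 this covers 10 to 49 only. The numbers below 10 and above 49 still need their own patterns.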
Loop to the next index of _RangeHigh_len (this is _RangeHigh_len_index)
Right now we have 10 to 49 covered, we need two more regexes: ^[5-9]: and ^5[0-5]:
I think for the rest of the loop this means a low side and a high side regex need to be created
The low side regex should take _RangeLow_CurrentDigit and rightpad it with zeroes for all remaining positions of _RangeHigh_len, then subtract 1. This is the _Current_Regex_LowLimit.
Likewise, _RangeHigh_CurrentDigit needs to be rightpadded with 9s and then have one added. This makes the _Current_Regex_HighLimit.
so
call :rightpad _RangeLow_CurrentDigit 0 %_RangeHigh_len%-%_RangeHigh_len_index%
call :rightpad _RangeHigh_CurrentDigit 9 %_RangeHigh_len%-%_RangeHigh_len_index%
_RangeLow_CurrentDigit is now 10
_RangeHigh_CurrentDigit is now 49
decrement _RangeLow_CurrentDigit and increment _RangeHigh_CurrentDigit
_RangeLow_CurrentDigit is now 9
_RangeHigh_CurrentDigit is now 50
I have to quit at this point sorry, I will pick this up later.
_RangeLow_CurrentDigit, might need to be 09, I will see
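Until the batch version is finished, here is one way the remaining steps might fit together, sketched in Python so the expected output is pinned down before porting. It is not the method from this post: instead of leftpadding the low number with zeroes, it splits the range by digit count first and then builds a low side, a middle and a high side for each width. The names digit_class, same_width and range_to_findstr are hypothetical, and line numbers are assumed to start at 1.

```python
def digit_class(lo: int, hi: int) -> str:
    """One digit position: a literal digit, or a [lo-hi] class."""
    return str(lo) if lo == hi else f"[{lo}-{hi}]"


def same_width(a: str, b: str) -> list:
    """Patterns (without ^ and :) for every number from a to b.

    a and b are digit strings of equal length with a <= b.
    """
    if a == "":
        return [""]
    if a[0] == b[0]:
        # Shared leading digit: keep it literal and recurse on the rest.
        return [a[0] + rest for rest in same_width(a[1:], b[1:])]
    tail = len(a) - 1
    lo, hi = int(a[0]), int(b[0])
    low_side, high_side = [], []
    if a[1:] != "0" * tail:
        # Partial block at the bottom, e.g. 4533 to 4999.
        low_side = [a[0] + rest for rest in same_width(a[1:], "9" * tail)]
        lo += 1
    if b[1:] != "9" * tail:
        # Partial block at the top, e.g. 6000 to 6219.
        high_side = [b[0] + rest for rest in same_width("0" * tail, b[1:])]
        hi -= 1
    # Whole blocks in between, e.g. 5000 to 5999.
    middle = [digit_class(lo, hi) + "[0-9]" * tail] if lo <= hi else []
    return low_side + middle + high_side


def range_to_findstr(low: int, high: int) -> str:
    """Space-separated pattern list for findstr /N output lines low to high."""
    low, high = min(low, high), max(low, high)
    patterns = []
    for width in range(len(str(low)), len(str(high)) + 1):
        first = max(low, 10 ** (width - 1))  # smallest number with this width
        last = min(high, 10 ** width - 1)    # largest number with this width
        patterns += same_width(str(first), str(last))
    return " ".join(f"^{p}:" for p in patterns)


print(range_to_findstr(5, 55))
# ^[5-9]: ^[1-4][0-9]: ^5[0-5]:
print(range_to_findstr(4533, 6219))
# ^453[3-9]: ^45[4-9][0-9]: ^4[6-9][0-9][0-9]: ^5[0-9][0-9][0-9]: ^6[0-1][0-9][0-9]: ^62[0-1][0-9]:
```

For 5-55 this gives the same three patterns worked out above. The digits are handled as strings throughout, which matters for the 09 question: in batch, set /a reads a number with a leading zero as octal, so 09 would be rejected there.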