Undocumented FINDSTR features and limitations
Moderator: DosItHelp
Re: Undocumented FINDSTR features and limitations
Haha, no, I didn't want to discredit Dave either. He already mentioned it was untested. I corrected this for all the Google users who stumble upon this thread.
Regards
aGerman
Re: Undocumented FINDSTR features and limitations
I updated my SO post: http://stackoverflow.com/a/8844873/1012053
I used to think that the Windows pipe operator appended <CR><LF> to the input if the last character in the stream was not a <LF>. But I've since discovered that FINDSTR is actually doing the alteration of the input.
FINDSTR also appends <CR><LF> to redirected input on Vista (and XP?) if the last character of the redirected file is not <LF>.
I've discovered a nasty FINDSTR "feature" on Windows 7: it hangs indefinitely if you search redirected input and the redirected file does not end with <LF>.
Dave Benham
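The terminator rule described above can be modeled in a few lines. This is only a sketch of the reported behavior (append <CR><LF> when the last byte is not <LF>), written in Python for illustration; the function name with_terminator is hypothetical, and leaving empty input untouched is an arbitrary choice, not something stated in this thread.

```python
def with_terminator(data: bytes) -> bytes:
    """Model of the reported rule: add <CR><LF> unless the data ends with <LF>.

    ASSUMPTION: empty input is returned unchanged; the thread does not say
    what FINDSTR does in that case.
    """
    if not data or data.endswith(b"\n"):
        return data
    return data + b"\r\n"


if __name__ == "__main__":
    for sample in (b"abc", b"abc\r\n", b"abc\n"):
        print(repr(sample), "->", repr(with_terminator(sample)))
```

Applying the same rule to a file's bytes before redirecting it into FINDSTR might sidestep the Windows 7 hang, since the file would then always end with <LF>; that workaround is untested here.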
Re: Undocumented FINDSTR features and limitations
How could I not have seen this before? I have so many doubts about using FINDSTR.
Added this page to my favs. Gonna read it when I've got enough time to.
Thanks, Dave.
Re: Undocumented FINDSTR features and limitations
dbenham wrote: FINDSTR also appends <CR><LF> to redirected input on Vista (and XP?) if the last character of the redirected file is not <LF>.
I've discovered a nasty FINDSTR "feature" on Windows 7: it hangs indefinitely if you search redirected input and the redirected file does not end with <LF>.
XP also has the issue if the file does not end with appropriate line endings. It hangs.
Re: Undocumented FINDSTR features and limitations
I updated my SO FINDSTR post with two new sections:
1) Description of XP behavior displaying most control characters as dots
2) Bugged /S and /D options may fail to find files if short 8.3 names are encountered.
Dave Benham
Re: Undocumented FINDSTR features and limitations
Thanks.
I remember that the default is /R not /L.
Example:
Code: Select all
echo.#|Findstr "."
prints #
but
Code: Select all
echo.#|Findstr /L "."
doesn't print anything. The default is /R, not /L.
Re: Undocumented FINDSTR features and limitations
I think you did not read the post carefully. It is more complicated than that.
I stated that the default for the /C option is literal.
The default for all other methods (anything other than /C option) depends on the content of the 1st search string. If the 1st search string contains an un-escaped meta character and the string is a valid regex, then all searches will be treated as regex. If the first string does not contain an un-escaped meta character, or if it is not a valid regex, then all search strings will be treated as literals.
The following is a regex search that matches because the first string is a valid regex that contains a meta character.
Code: Select all
echo #|findstr ". a"
But this next example is a literal search that does not match, because the first search string does not contain a meta character.
Code: Select all
echo #|findstr "a ."
Dave Benham
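The rule above can be sketched as a small model. This is not FINDSTR's implementation: the meta character set, the function names, and the use of Python's re module as a stand-in for FINDSTR's regex engine are all assumptions, and the "must be a valid regex" condition is not checked. It is only meant to show why the order of the space-delimited search strings matters.

```python
import re

# ASSUMPTION: approximate set of FINDSTR regex meta characters.
META = set(".*^$[]")


def has_unescaped_meta(term: str) -> bool:
    """True if the term contains a meta character not preceded by a backslash."""
    i = 0
    while i < len(term):
        if term[i] == "\\":
            # \< and \> are word-boundary terms, so they count as meta here;
            # any other backslash pair is treated as an escaped literal.
            if term[i + 1:i + 2] in ("<", ">"):
                return True
            i += 2
            continue
        if term[i] in META:
            return True
        i += 1
    return False


def search_mode(search: str) -> str:
    """Mode for a search given without /C, /L or /R: decided by the 1st term only."""
    terms = search.split()
    if terms and has_unescaped_meta(terms[0]):
        return "regex"
    return "literal"


def matches(line: str, search: str) -> bool:
    """A line matches if any term matches (terms are OR'ed together)."""
    terms = search.split()
    if search_mode(search) == "regex":
        # Python's re is only adequate for trivial terms such as "." or "a".
        return any(re.search(term, line) for term in terms)
    return any(term in line for term in terms)


if __name__ == "__main__":
    for search in (". a", "a ."):
        print(repr(search), search_mode(search), matches("#", search))
```

In this model ". a" is treated as a regex search and matches "#", while "a ." is treated as a literal search and does not, which mirrors the two examples above.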
Re: Undocumented FINDSTR features and limitations
Thanks for the info.
When I specify /L or /R using the /C option, I get a message that says that the /C option was omitted.
Code: Select all
C:\Users\Carlos>echo.#|findstr /c /l "#"
FINDSTR: se ha omitido /c
#
C:\Users\Carlos>echo.#|findstr /c /r "#"
FINDSTR: se ha omitido /c
#
Also, I have a doubt about the /O option.
I have this file:
Code: Select all
all#everybody#is#ok
If I use:
Code: Select all
findstr /N /O "#" file.txt
it prints:
Code: Select all
1:0:all#everybody#is#ok
Shouldn't the offset be 3, not 0?
Re: Undocumented FINDSTR features and limitations
It seems buggy,
file.txt
d#dd
aaaa#aa#
all#everybody#is#ok
aaa
d:\ABC>findstr /O /c:"#" file.txt
0:d#dd
6:aaaa#aa#
16:all#everybody#is#ok
Re: Undocumented FINDSTR features and limitations
Here again, the correct information is already in my SO post.
SO FINDSTR post wrote:lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified.
Note - it is the byte offset of the beginning of the line that matches (measured from the beginning of the file), not the offset of the beginning of the match itself. Also, don't forget to count the CarriageReturn/LineFeed line terminators.
The /N and /O options specify the same locations within the file, but the /N option counts the number of lines, whereas the /O option counts the number of bytes. The /N option is 1 based, the /O option is 0 based.
So the results given by the Carlos and Foxidrive examples are correct, not bugged.
Dave Benham
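The arithmetic behind those offsets can be reproduced with a short model. This is a Python sketch, not FINDSTR itself; the function name line_offsets is hypothetical, and the sample data assumes every line of file.txt ends with <CR><LF>, which is what the reported offsets 6 and 16 imply.

```python
def line_offsets(data: bytes, needle: bytes):
    """Yield (line_number, byte_offset, line) for each line containing needle.

    line_number is 1-based like /N; byte_offset is 0-based like /O and
    always points at the start of the line, not at the match.
    """
    offset = 0
    for number, raw in enumerate(data.splitlines(keepends=True), start=1):
        if needle in raw:
            yield number, offset, raw.rstrip(b"\r\n")
        # The line terminator bytes count toward the next line's offset.
        offset += len(raw)


if __name__ == "__main__":
    # Foxidrive's file, assuming <CR><LF> line endings.
    sample = b"d#dd\r\naaaa#aa#\r\nall#everybody#is#ok\r\naaa\r\n"
    for number, offset, line in line_offsets(sample, b"#"):
        print(f"{number}:{offset}:{line.decode('ascii')}")
    # prints 1:0:d#dd, 2:6:aaaa#aa# and 3:16:all#everybody#is#ok
```

The first line is 4 bytes plus 2 terminator bytes, so the second line starts at 6; the second line is 8 bytes plus 2, so the third starts at 16.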
Re: Undocumented FINDSTR features and limitations
I've updated my SO post to clarify the /O option:
SO FINDSTR post wrote:lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.
Dave Benham
Re: Undocumented FINDSTR features and limitations
dbenham wrote: I've updated my SO post to clarify the /O option:
SO FINDSTR post wrote: lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.
Dave Benham
Thanks Dave.
That could be used to count the length of a line also, or several lines.
It was very confusing, as you normally expect a character offset to point to the match of the regexp/literal. The help text reads:
/O Prints character offset before each matching line.
It should say "prints file offset before each matching line."
Re: Undocumented FINDSTR features and limitations
foxidrive wrote: That could be used to count the length of a line also, or several lines.
Cool idea foxidrive
Code: Select all
@echo off
setlocal
set "test=Hello world!"
:: Echo the length of TEST
call :strLen test
:: Store the length of TEST in LEN
call :strLen test len
echo len=%len%
exit /b
:strLen strVar [rtnVar]
setlocal disableDelayedExpansion
set len=0
if defined %~1 for /f "delims=:" %%N in (
'"(cmd /v:on /c echo !%~1!&echo()|findstr /o ^^"'
) do set /a "len=%%N-3"
endlocal & if "%~2" neq "" (set %~2=%len%) else echo %len%
exit /b
I haven't figured out why I must subtract 3 instead of 2, but it appears to work.
Dave Benham
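One possible explanation for the 3, not confirmed anywhere in this thread, is that the piped ECHO emits one trailing space after the value, so the first line is the text plus a space plus <CR><LF>. The Python sketch below only checks that arithmetic under that assumption; the function name and the assumed stream contents are hypothetical.

```python
def line_start_offsets(stream: bytes):
    """Byte offset of the start of every line, as /O is described as reporting."""
    offsets, position = [], 0
    for raw in stream.splitlines(keepends=True):
        offsets.append(position)
        position += len(raw)
    return offsets


if __name__ == "__main__":
    text = "Hello world!"
    # ASSUMPTION: the first ECHO writes the text, one trailing space and
    # <CR><LF>; the second ECHO writes an empty line (<CR><LF> only).
    stream = text.encode("ascii") + b" \r\n" + b"\r\n"
    # The FOR /F loop keeps the offset of the last line it sees.
    last = line_start_offsets(stream)[-1]
    print("last offset:", last, "length:", last - 3)
```

Under that assumption the empty second line starts at len + 3, so subtracting 3 (space, <CR>, <LF>) recovers the string length.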
Matching Whole Words
What am I not understanding about matching two whole words?
Given the following input
Code: Select all
squash, 22, 14, 15, 12, 18, 19
squashman,22,14,15,12,18,19
josh,10, 16, 19, 3, 5, 19, 18, 7, 2, 4
joshua,10, 16, 19, 3, 5, 19, 18, 7, 2, 4
And using this code
Code: Select all
@echo off
set "userid=squash"
set "number=15"
echo match whole word userid
findstr "\<%userid%\>" "wholetest.txt"
echo match whole word number
findstr "\<%number%\>" "wholetest.txt"
echo match two whole words
findstr "\<%userid%\>.*\<%number%\>" "wholetest.txt"
pause
goto :EOF
I get this output
Code: Select all
match whole word userid
squash, 22, 14, 15, 12, 18, 19
match whole word number
squash, 22, 14, 15, 12, 18, 19
squashman,22,14,15,12,18,19
match two whole words
Why does it not match two whole words?
Code: Select all
squash, 22, 14, 15, 12, 18, 19
Re: Undocumented FINDSTR features and limitations
The explanation is in the following quote from the 2nd answer in my SO Q&A; see the sentences requiring \< to be the very first term and \> to be the very last term in the regex:
dbenham on StackOverflow wrote: Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.
\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.
Dave Benham
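The placement rule quoted above can be expressed as a simple check, together with one way around it: run one whole-word search per word and keep only the lines that pass every search. This is a Python model under stated assumptions, not FINDSTR: the function names are hypothetical, Python's \b is only an approximation of FINDSTR's word boundaries, and the check ignores edge cases such as an escaped backslash in front of < or >.

```python
import re


def boundaries_well_placed(regex: str) -> bool:
    r"""True when \< occurs only at the very start and \> only at the very end."""
    starts_ok = regex.rfind(r"\<") <= 0          # -1 (absent) or 0 (first term)
    first_end = regex.find(r"\>")
    ends_ok = first_end in (-1, len(regex) - 2)  # absent or the last term
    return starts_ok and ends_ok


def chained_whole_words(lines, *words):
    """Keep the lines that contain every word as a whole word, one pass per word."""
    for word in words:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b")
        lines = [line for line in lines if pattern.search(line)]
    return lines


LINES = [
    "squash, 22, 14, 15, 12, 18, 19",
    "squashman,22,14,15,12,18,19",
    "josh,10, 16, 19, 3, 5, 19, 18, 7, 2, 4",
    "joshua,10, 16, 19, 3, 5, 19, 18, 7, 2, 4",
]

if __name__ == "__main__":
    print(boundaries_well_placed(r"\<squash\>.*\<15\>"))  # False
    print(boundaries_well_placed(r"\<squash\>"))          # True
    print(chained_whole_words(LINES, "squash", "15"))
```

In batch, the analogous idea would presumably be to pipe the output of the first single-word FINDSTR into a second one. Note that chaining drops the ordering constraint: it no longer requires the userid to appear before the number on the line.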