InStr, ReplStr: Case sensitive search and replace routines
Posted: 23 Jun 2011 23:44
Windows batch processing has fast and convenient string search and replace using %var:search=replace%. But this has significant limitations:
- The search is always case insensitive. There is no native way to perform a case sensitive search and replace.
- There is no good way to replace =, * or :
- We can't replace ! if delayed expansion is enabled
- We can't replace % if delayed expansion is disabled
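As a quick illustration of the first two limitations, here is a minimal sketch using only native expansion. The variable name and sample text are made up for the demonstration.

```batch
@echo off
setlocal disableDelayedExpansion
set "str=Apple apple APPLE"

rem The native replace ignores case, so every spelling of "apple" is replaced.
rem Prints: pear pear pear
echo %str:apple=pear%

rem A leading * in the search term acts as a wildcard: everything up to and
rem including the first match is replaced, so a literal * cannot be searched for.
rem Prints: X apple APPLE
echo %str:*apple=X%
```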
There is no native way to identify the location of a substring within a string. There are efficient methods posted on this site to do this, but they rely on search and replace and so they have the same limitations as above.
I have developed both macro and function libraries that perform both tasks: locating a substring and replacing it. The implementation is far from elegant; it relies on brute-force, character-by-character parsing of the string. But the routines are extremely flexible and powerful. There are no restrictions on the characters that can be searched for or replaced. The search is case sensitive by default, or it can be case insensitive. The routines can be called with delayed expansion enabled or disabled.
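The brute-force idea can be sketched in a few lines. This is a simplified illustration, not the library code: the variable names are made up, the lengths are hard-coded rather than computed, and it only finds the first match.

```batch
@echo off
setlocal enableDelayedExpansion

rem Illustrative values; searchLen and last (targetLen - searchLen) are
rem hard-coded here, whereas the real routines compute string lengths.
set "target=abcABCabc"
set "search=ABC"
set "searchLen=3"
set "last=6"

rem Test the substring at each offset. IF without /I compares case
rem sensitively; adding /I would make the search case insensitive.
set "pos="
for /l %%o in (0,1,%last%) do if not defined pos (
  if "!target:~%%o,%searchLen%!"=="!search!" set "pos=%%o"
)

rem Prints 3, the 0-based offset of the first exact-case match.
echo !pos!
```

Because the comparison is a plain IF on delayed-expansion substrings, no character in the search string is given special meaning by the search and replace syntax.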
The key routine is InStr. It can find the Nth occurrence of a substring within the target, searching from the beginning or from the end, and return its position. Alternatively, it can find all occurrences and return their positions as a space-delimited string.
Once we have InStr, it is a simple matter to write a ReplStr routine that uses the position information along with standard substring operations to execute the replace functionality. I can envision that InStr could be the basis of many useful functions/macros.
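To show how a known position turns into a replacement, here is a minimal sketch using standard substring operations. The position and length are supplied by hand for illustration; in the real routines they come from InStr and a string-length routine.

```batch
@echo off
setlocal enableDelayedExpansion

rem Illustrative values: pos is the 0-based offset an InStr-style search
rem would report for "cat", and searchLen is the length of "cat".
set "str=The cat sat"
set "repl=dog"
set "pos=4"
set "searchLen=3"

rem Keep everything before the match, add the replacement, then append
rem whatever follows the matched text.
set /a "tail=pos+searchLen"
set "result=!str:~0,%pos%!!repl!!str:~%tail%!"

rem Prints: The dog sat
echo !result!
```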
The macro version is contained in the macroLib_SearchStr.bat file. It is self documenting. It requires the following libraries, which can be found in my 'Batch "macros" with arguments - Major Update' post:
- macroLib_Base.bat
- macroLib_Return.bat
- macroLib_String.bat
- callMacro.bat (only needed if you don't want to use the %macro_call% syntax)
The embedded help is difficult to read within the source code. It is best to load the library (run the batch file) and then use my margs and mhelp DOSKEY macros to read the help.
For those who are uncomfortable with using macros, I have also included a pure function implementation at the end of this post. The function version has no dependencies, but it is older and I believe it is less robust. It is also only partially documented. I can't remember all of the differences between the macros and the older functions.
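A call to the function version might look like the sketch below. It assumes the function listing at the end of this post has been saved as strFuncs.bat, which is a hypothetical file name; the script dispatches its arguments with call :%*, so the first argument is the function name.

```batch
@echo off
setlocal

rem Assumption: the function listing from this post is saved as strFuncs.bat
rem (a hypothetical name) in the current directory or somewhere on the PATH.
set "target=abcABCabc"
set "search=ABC"
set "repl=xyz"

rem Case sensitive search for the 1st occurrence; the 0-based position
rem is returned in pos.
call strFuncs.bat instr target search 1 pos
echo Position: %pos%

rem Replace all occurrences (OccuranceVal 0), case sensitive.
call strFuncs.bat replStr target search repl 0 result
echo Result: %result%

rem The same replacement, case insensitive via /I.
call strFuncs.bat replStr /I target search repl 0 result
echo Result: %result%
```

With these sample values the expected output is position 3, then abcxyzabc, then xyzxyzxyz. Note that the arguments are variable names, not the strings themselves.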
macroLib_SearchStr.bat
Code:
@echo off
:: File = macroLib_SearchStr.bat
:: Dependencies: macroLib_Base.bat, macroLib_Return.bat, macroLib_String.bat
:: This batch file will fail if called while delayed expansion is enabled.
::
:: This library defines macros that involve searching for a string within
:: a string.
::
:: The library is designed to be installed in a directory in your PATH.
:: Any batch file that requires it can include it by simply placing the
:: following line of code at the top before any SETLOCAL:
::
:: IF NOT DEFINED macro\load.MacroLib_SearchStr CALL macroLib_SearchStr
::
:: In this way the library becomes resident in your command shell environment
:: where it is available to any batch file that may need it. The IF condition
:: prevents unnecessary reloads of the same library.
::
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Conditionally load dependencies (normally residing somewhere in the PATH)
if not defined macro\load.macroLib_String call macroLib_String
set macro\args.InStr= CaseOption TargetVar SearchVar OccurenceVal [RtnVar]
set macro\help.InStr= CaseOption TargetVar SearchVar OccurenceVal [RtnVar]%\n%
%\n%
Computes the position of the Nth occurrence or all occurrences of a search%\n%
string within a target string.%\n%
%\n%
CaseOption must have one of the following two values:%\n%
I (or i) = case insensitive%\n%
S (or s) = case sensitive%\n%
%\n%
The target string is contained within variable TargetVar.%\n%
%\n%
The search string is contained within variable SearchVar.%\n%
%\n%
The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
be specified using any expression supported by SET /A. A positive%\n%
OccurenceVal indicates the search starts from the beginning. A negative%\n%
OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
of 0 directs instr to return all matches as a space delimited string%\n%
of positions in increasing order.%\n%
%\n%
The resulting position(s) is always reported relative to the beginning%\n%
of the targetStr with 0 being the first character.%\n%
%\n%
The result is an empty string if an error occurs%\n%
%\n%
The result is returned in variable RtnVar%\n%
or the result is echoed if RtnVar is not specified%\n%
%\n%
The ERRORLEVEL is set as follows:%\n%
0 - Success%\n%
1 - The Nth occurrence of SearchStr was not found%\n%
2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.InStr=do (%\n%
setlocal enableDelayedExpansion%\n%
set "macro.instr.err="%\n%
if "%%~d"=="" (set macro.instr.err=2) else (%\n%
set "macro.instr.targetStr=^!%%~b^!"%\n%
set "macro.instr.searchStr=^!%%~c^!"%\n%
if not defined macro.instr.targetStr set macro.instr.err=2%\n%
if not defined macro.instr.searchStr set macro.instr.err=2%\n%
if /i "%%~a" neq "I" if /i "%%~a" neq "S" set macro.instr.err=2%\n%
set /a "occ=(%%~d)" 2^^^>nul%\n%
if errorlevel 1 set macro.instr.err=2%\n%
)%\n%
if not defined macro.instr.err (%\n%
!macro_call! ("macro.instr.targetStr targetLen") !macro.StrLen!%\n%
!macro_call! ("macro.instr.searchStr searchLen") !macro.StrLen!%\n%
if ^^^!searchLen^^^! gtr ^^^!targetLen^^^! set macro.instr.err=1%\n%
)%\n%
if not defined macro.instr.err (%\n%
if ^^^!occ^^^! geq 0 (%\n%
set /a "beg=0, step=1, end=targetLen-searchLen"%\n%
) else (%\n%
set /a "beg=targetLen-searchLen, step=-1, end=0"%\n%
)%\n%
if ^^^!occ^^^! neq 0 (set /a "occStep=step") else set /a "occStep=0"%\n%
set "off="%\n%
set "done=0"%\n%
set /a skip=0%\n%
for %%l in (^^^!searchLen^^^!) do for /l %%o in (^^^!beg^^^!,^^^!step^^^!,^^^!end^^^!) do if ^^^!done^^^! equ 0 (%\n%
if ^^^!skip^^^! equ 0 (%\n%
set "match="%\n%
if /i %%~a==S if "^!macro.instr.targetStr:~%%o,%%l^!"=="^!macro.instr.searchStr^!" set match=1%\n%
if /i %%~a==I if /i "^!macro.instr.targetStr:~%%o,%%l^!"=="^!macro.instr.searchStr^!" set match=1%\n%
if defined match (%\n%
set /a occ-=occStep%\n%
if ^^^!occ^^^! equ 0 (%\n%
set "off=^!off^! %%o"%\n%
set /a done=occStep%\n%
)%\n%
set /a skip=searchLen-1%\n%
)%\n%
) else set /a skip-=1%\n%
)%\n%
)%\n%
if not defined macro.instr.err if not defined off set macro.instr.err=1%\n%
if defined macro.instr.err (set rtn=) else (%\n%
set "rtn=^!off:~1^!"%\n%
set macro.instr.err=0%\n%
)%\n%
!macro_call! ("^!macro.instr.err^!") !macro.SetErr!%\n%
for /f "delims=" %%v in (""^^^!rtn^^^!"") do (%\n%
endlocal%\n%
if "%%~e" neq "" (set "%%~e=%%~v") else echo(%%~v%\n%
)%\n%
)
%macro_Call% ("macro.instr") %macro.EndDef%
%macro_EndAnyRtn%
set macro\args.ReplStr= CaseOption TargetVar SearchVar ReplaceVar OccurenceVal [RtnVar]
set macro\help.ReplStr= CaseOption TargetVar SearchVar ReplaceVar OccurenceVal [RtnVar]%\n%
%\n%
Replaces the Nth occurrence or all occurrences of a search string found%\n%
within a target string with a replacement string.%\n%
%\n%
The return value may contain any combination of characters supported by DOS%\n%
except 0x0A ^<Line Feed^> or 0x0D ^<Carriage Return^>.%\n%
%\n%
CaseOption must have one of the following two values:%\n%
I (or i) = case insensitive search%\n%
S (or s) = case sensitive search%\n%
%\n%
The target string is contained within variable TargetVar.%\n%
%\n%
The search string is contained within variable SearchVar.%\n%
%\n%
The replacement string is contained within variable ReplaceVar.%\n%
An empty replacement string may be specified by an undefined variable%\n%
or by "".%\n%
%\n%
The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
be specified using any expression supported by SET /A. A positive%\n%
OccurenceVal indicates the search starts from the beginning. A negative%\n%
OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
of 0 directs ReplStr to replace all occurrences.%\n%
%\n%
The result is returned in variable RtnVar%\n%
or the result is echoed if RtnVar is not specified%\n%
%\n%
The ERRORLEVEL is set as follows:%\n%
0 - Success%\n%
2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.ReplStr= do (%\n%
!macro_InitRtn!%\n%
setlocal enableDelayedExpansion%\n%
if "%%~e"=="" (set macro.replstr.err=2) else (%\n%
set "macro.replstr.str=^!%%~b^!"%\n%
!macro_Call! ("%%c macro.replstr.searchLen") !macro.StrLen!%\n%
!macro_Call! ("%%a %%b %%c %%e macro.replstr.found") !macro.InStr!%\n%
set "macro.replStr.err=^!errorlevel^!"%\n%
)%\n%
if ^^^!macro.replstr.err^^^! lss 2 (%\n%
set macro.replstr.err=0%\n%
set "repl=^!%%~d^!"%\n%
set "rtnVar=%%~f"%\n%
set "rtn="%\n%
set beg=0%\n%
for %%f in (^^^!macro.replStr.found^^^!) do (%\n%
set /a len=%%f-beg%\n%
for /f "tokens=1,2" %%a in ("^!beg^! ^!len^!") do set "rtn=^!rtn^!^!macro.replstr.str:~%%a,%%b^!^!repl^!"%\n%
set /a beg=%%f+macro.replstr.searchLen%\n%
)%\n%
for %%a in (^^^!beg^^^!) do set "rtn=^!rtn^!^!macro.replstr.str:~%%a^!"%\n%
) else (%\n%
set "rtn="%\n%
set "rtnVar="%\n%
)%\n%
!macro_Call! ("macro.replstr.err 1 rtn ^!rtnVar^!") !macro.Rtn1!%\n%
)
%macro_Call% ("macro.ReplStr") %macro.EndDef%
%macro_EndAnyRtn%
set macro\args.AnyReplStr= CaseOption TargetVar SearchVar ReplaceVar OccurenceVal [RtnVar]
set macro\help.AnyReplStr= CaseOption TargetVar SearchVar ReplaceVar OccurenceVal [RtnVar]%\n%
%\n%
Replaces the Nth occurrence or all occurrences of a search string found%\n%
within a target string with a replacement string.%\n%
%\n%
The return value may contain any combination of characters supported by DOS%\n%
including 0x0A ^<Line Feed^> and 0x0D ^<Carriage Return^>.%\n%
%\n%
%%macro_EndAnyRtn%% must follow a call to this macro, and it cannot share%\n%
a code block with the call.%\n%
%\n%
CaseOption must have one of the following two values:%\n%
I (or i) = case insensitive search%\n%
S (or s) = case sensitive search%\n%
%\n%
The target string is contained within variable TargetVar.%\n%
%\n%
The search string is contained within variable SearchVar.%\n%
%\n%
The replacement string is contained within variable ReplaceVar.%\n%
An empty replacement string may be specified by an undefined variable%\n%
or by "".%\n%
%\n%
The Nth occurrence is specified by the OccurenceVal. OccurenceVal may%\n%
be specified using any expression supported by SET /A. A positive%\n%
OccurenceVal indicates the search starts from the beginning. A negative%\n%
OccurenceVal indicates the search starts from the end. An OccurenceVal%\n%
of 0 directs AnyReplStr to replace all occurrences.%\n%
%\n%
The result is returned in variable RtnVar%\n%
or the result is echoed if RtnVar is not specified%\n%
%\n%
The ERRORLEVEL is set as follows:%\n%
0 - Success%\n%
2 - A required argument was missing or invalid%xLF%
%macro_BeginDef%
set macro.AnyReplStr= do (%\n%
!macro_InitRtn!%\n%
setlocal enableDelayedExpansion%\n%
if "%%~e"=="" (set macro.AnyReplStr.err=2) else (%\n%
set "macro.AnyReplStr.str=^!%%~b^!"%\n%
!macro_Call! ("%%c macro.AnyReplStr.searchLen") !macro.StrLen!%\n%
!macro_Call! ("%%a %%b %%c %%e macro.AnyReplStr.found") !macro.InStr!%\n%
set "macro.AnyReplStr.err=^!errorlevel^!"%\n%
)%\n%
if ^^^!macro.AnyReplStr.err^^^! lss 2 (%\n%
set macro.AnyReplStr.err=0%\n%
set "repl=^!%%~d^!"%\n%
set "rtnVar=%%~f"%\n%
set "rtn="%\n%
set beg=0%\n%
for %%f in (^^^!macro.AnyReplStr.found^^^!) do (%\n%
set /a len=%%f-beg%\n%
for /f "tokens=1,2" %%a in ("^!beg^! ^!len^!") do set "rtn=^!rtn^!^!macro.AnyReplStr.str:~%%a,%%b^!^!repl^!"%\n%
set /a beg=%%f+macro.AnyReplStr.searchLen%\n%
)%\n%
for %%a in (^^^!beg^^^!) do set "rtn=^!rtn^!^!macro.AnyReplStr.str:~%%a^!"%\n%
) else (%\n%
set "rtn="%\n%
set "rtnVar="%\n%
)%\n%
!macro_Call! ("macro.AnyReplStr.err 1 rtn ^!rtnVar^!") !macro.AnyRtn1!%\n%
)
%macro_Call% ("macro.AnyReplStr") %macro.EndDef%
%macro_EndAnyRtn%
::----------------------------------------------------
:: Mark that this library has been loaded.
set macro\load.%~n0=1
Older function versions of the routines:
Code:
@echo off
call :%*
exit /b
:replStr [/I] TargetVar SearchVar ReplaceVar OccuranceVal [RtnVar]
%InitFcnRtn%
setlocal enableDelayedExpansion
if /i "%~1"=="/I" (
set "replStr.opt=/I"
shift /1
) else set "replStr.opt="
call :instr %replStr.opt% %1 %2 %4 replStr.found
set replStr.err=%errorlevel%
if %replStr.err% lss 2 (
set replStr.err=0
set "replStr.str=!%~1!"
call :strLen %2 replStr.searchLen
set "repl=!%~3!"
set "rtnVar=%~5"
set "rtn="
set beg=0
for %%f in (!replStr.found!) do (
set /a len=%%f-beg
for /f "tokens=1,2" %%a in ("!beg! !len!") do set "rtn=!rtn!!replStr.str:~%%a,%%b!!repl!"
set /a beg=%%f+replStr.searchLen
)
for %%a in (!beg!) do set "rtn=!rtn!!replStr.str:~%%a!"
) else (
set "rtn="
set "rtnVar="
)
( endlocal
if "%~5" neq "" (set "%~5=%rtn%") else echo(%rtn%
exit /b %replStr.err%
)
exit /b
:instr [/I] TargetVar SearchVar OccuranceVal [RtnVar]
::
:: Computes the position of the Nth occurrence of a search string within
:: a target string.
::
:: The case insensitive /I option directs instr to perform a case
:: insensitive search.
::
:: The target string is contained within variable TargetVar.
::
:: The search string is contained within variable SearchVar.
::
:: The Nth occurrence is specified by the OccuranceVal. OccuranceVal may
:: be specified using any expression supported by SET /A. A positive
:: OccuranceVal indicates the search starts from the beginning. A negative
:: OccuranceVal indicates the search starts from the end. An OccuranceVal
:: of 0 directs instr to return all matches as a space delimited string
:: of positions in increasing order.
::
:: The resulting position(s) is always reported relative to the beginning
:: of the targetStr with 0 being the first character.
::
:: The result is an empty string if an error occurs
::
:: The result is returned in variable RtnVar
:: or the result is echoed if RtnVar is not specified
::
:: The ERRORLEVEL is set as follows:
:: 0 - Success
:: 1 - The Nth occurrence of SearchStr was not found
:: 2 - A required argument was missing or invalid
::
setlocal enableDelayedExpansion
set instr.err=
if /i "%~1"=="/I" (
set "instr.opt=/I"
shift /1
) else set "instr.opt="
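rem The IF branch must be parenthesized for ELSE to be recognized, and the
rem SET /A expression is quoted so its ")" cannot close the block early.
if "%~3"=="" (set instr.err=2) else (
set "instr.targetStr=!%~1!"
set "instr.searchStr=!%~2!"
if not defined instr.targetStr set instr.err=2
if not defined instr.searchStr set instr.err=2
set /a "occ=(%~3)" 2>nul
if errorlevel 1 set instr.err=2
)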
if not defined instr.err (
call :strlen instr.targetStr targetLen
call :strlen instr.searchStr searchLen
if !searchLen! gtr !targetLen! set instr.err=1
)
if not defined instr.err (
if !occ! geq 0 (
set /a "beg=0, step=1, end=targetLen-searchLen"
) else (
set /a "beg=targetLen-searchLen, step=-1, end=0"
)
if !occ! neq 0 (set /a "occStep=step") else set /a "occStep=0"
set "off="
set "done=0"
set /a skip=0
for %%l in (!searchLen!) do for /l %%o in (!beg!,!step!,!end!) do if !done! equ 0 (
if !skip! equ 0 (
if %instr.opt% "!instr.targetStr:~%%o,%%l!"=="!instr.searchStr!" (
set /a occ-=occStep
if !occ! equ 0 (
set off=!off! %%o
set /a done=occStep
)
set /a skip=searchLen-1
)
) else set /a skip-=1
)
)
if not defined instr.err if not defined off set instr.err=1
if defined instr.err (set rtn=) else set "rtn=!off:~1!"& set instr.err=0
(endlocal & rem -- return values
if "%~4" neq "" (set %~4=%rtn%) else (echo:%rtn%)
exit /b %instr.err%
)
exit /b
:strLen string len -- returns the length of a string
:: -- string [in] - variable name containing the string being measured for length
:: -- len [out] - variable to be used to return the string length
:: Many thanks to 'sowgtsoi', but also 'jeb' and 'amel27' dostips forum users helped making this short and efficient
:$created 20081122 :$changed 20101116 :$categories StringOperation
:$source http://www.dostips.com
( SETLOCAL ENABLEDELAYEDEXPANSION
set "str=A!%~1!"&rem keep the A up front to ensure we get the length and not the upper bound
rem it also avoids trouble in case of empty string
set "len=0"
for /L %%A in (12,-1,0) do (
set /a "len|=1<<%%A"
for %%B in (!len!) do if "!str:~%%B,1!"=="" set /a "len&=~1<<%%A"
)
)
( ENDLOCAL & REM RETURN VALUES
IF "%~2" NEQ "" SET /a %~2=%len%
)
EXIT /b
Dave Benham