Bug in "removing a substring using string substitution"
Posted: 05 Apr 2011 09:25
Source: Remove - Remove a substring using string substitution
Code tested: WinXP
Code:
C:\work>type test.bat
@echo off
set str=the cat in the hat soothe my heart
echo.%str%
set str=%str:the =%
echo.%str%
C:\work>test.bat
the cat in the hat soothe my heart
cat in hat soomy heart
The code is supposed to remove every occurrence of the word "the" from the string, but the search pattern "the " also matches the tail of any word that merely ends in "the", such as "soothe", "breathe" or "blithe". In the run above, "soothe my" became "soomy".
It's better to split the text into tokens (i.e., words delimited by spaces) and compare each token as a whole. An implementation in VBScript:
Code:
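' Split the first argument on spaces and print every word except "the".
s = Split(WScript.Arguments(0))
For i = 0 To UBound(s)
    ' LCase makes the comparison case-insensitive.
    If LCase(s(i)) <> "the" Then
        WScript.StdOut.Write s(i) & " "
    End If
Next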
Note: VBScript is installed by default on most Windows versions, so there is no reason not to get to know it and harness its capabilities (same with PowerShell).
Example test:
Code:
C:\work>cscript //nologo test1.vbs "the cat in the hat soothe my heart"
cat in hat soothe my heart
Now only the exact word "the" is removed. Another method is to use a regular expression, where the engine can check for word boundaries (normally with the \b anchor), but that's another story.
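Also note that the comparison is case-insensitive, so "the", "The" and "THe" are all removed. The search string in cmd.exe's %str:old=new% substitution ignores case as well, so the real shortcoming of the Batch version is the lack of whole-word matching rather than case handling.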
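For anyone curious about the regular-expression route mentioned above, here is a minimal sketch using VBScript's built-in RegExp object. It is not part of the original code: RemoveWord is a hypothetical helper name, and the sketch assumes the word contains no regex metacharacters, since it is pasted into the pattern unescaped.

```vbscript
' Sketch only: remove a whole word using a regular expression.
' RemoveWord is a hypothetical helper, not code from the original post.
Function RemoveWord(text, word)
    Dim re
    Set re = New RegExp
    ' \b anchors the match at word boundaries, so "soothe" is left alone.
    ' \s* also swallows the spaces that follow the removed word.
    ' Assumption: word has no regex metacharacters (it is not escaped here).
    re.Pattern = "\b" & word & "\b\s*"
    re.IgnoreCase = True   ' match "the", "The", "THe", ...
    re.Global = True       ' replace every occurrence, not just the first
    RemoveWord = re.Replace(text, "")
End Function

' Only run the command-line part when an argument was supplied.
If WScript.Arguments.Count > 0 Then
    WScript.StdOut.Write RemoveWord(WScript.Arguments(0), "the")
End If
```

Saved under any name and run through cscript //nologo with the same test sentence, this should print "cat in hat soothe my heart". Unlike the Split version, it leaves no trailing space unless the removed word was the last one in the text.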
DosItNotHelp