FindRepl.bat: New regex utility to search and replace strings

#1 Post by Aacini » 26 Jun 2013 21:00

FIND.EXE was the first DOS command designed to search for strings in files, followed by FINDSTR.EXE, which includes limited regular expression support. The Windows Script Host, present in all Windows versions from XP on, provides advanced regular expression support that has been used to write "search and replace" JScript and VBScript programs designed for use in a Batch environment, such as dbenham's large REPL.BAT program or foxidrive's small sar.bat (http://www.dostips.com/forum/viewtopic.php?f=3&t=3676&p=19387#p19387).
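
As a rough illustration of the regular-expression search and replace that such script-based utilities perform, here is a minimal JavaScript sketch. The helper name replaceInText and the sample text are assumptions made for this example and are not part of FindRepl.bat, although the "gm" flags mirror how the program below builds its regular expressions.

```javascript
// Minimal sketch of a regex-based "search and replace" over a block of text.
// replaceInText is a hypothetical helper, not a function of FindRepl.bat.
function replaceInText(text, pattern, replacement, ignoreCase) {
   // "g" replaces every match; "m" lets ^ and $ anchor at each line.
   var regex = new RegExp(pattern, "gm" + (ignoreCase ? "i" : ""));
   return text.replace(regex, replacement);
}

var sample = "Hello world\nhello again";
var result = replaceInText(sample, "^hello", "Bye", true);
// result is "Bye world\nBye again"
```

With the ignoreCase argument set to false, only text matching the exact case of the pattern is replaced.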

FindRepl.bat is my own version of this type of program. I have included several additional features that allow it to output a more extensive set of results, so the same program may be used to solve a wider range of similar problems. The source code of the program is included below, with detailed documentation after it. To make the documentation easier to access, I divided it into numbered sections with a header before each one, so you may use the "Find in this page" feature of your browser and look for the "#. " (number-dot-space) of the desired section to locate it quickly. This is the index:

  • FIND STRINGS
    1. Basic find string usage
    2. Finding blocks of lines (/E and /O switches)
    3. Basic usage of regular expressions
    4. Storing search strings in Batch variables
    5. Search an extra string in a block of lines (/B switch)
    6. Defining and using subexpressions (/$ switch)
    7. A large example
  • FIND AND REPLACE STRINGS
    1. Basic find and replace string usage
    2. Replacing line delimiters
    3. Find and replace multiple strings (/A switch)
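
The /A switch listed in the index replaces several literal strings in one pass: "a|b|c" /A "x|y|z" replaces a with x, b with y, and so on. The idea can be sketched in JavaScript as follows. The function name replaceAlternatives is hypothetical, and the sketch assumes the alternatives contain no regular expression special characters and no escaped pipes, both of which the real program handles.

```javascript
// Sketch of literal alternation replacement: "a|b|c" replaced by "x|y|z".
// replaceAlternatives is a hypothetical name used only for this example.
// Assumption: alternatives are plain literals with no regex metacharacters.
function replaceAlternatives(text, searchList, replaceList) {
   var searchA = searchList.split("|"),
       replaceA = replaceList.split("|"),
       map = {};
   for (var i = 0; i < searchA.length; i++) {
      // A missing replacement alternative maps to the empty string.
      map[searchA[i]] = (i < replaceA.length) ? replaceA[i] : "";
   }
   // One pass over the text: each matched alternative is looked up in the map.
   return text.replace(new RegExp(searchList, "g"), function (matched) {
      return map[matched];
   });
}

var out = replaceAlternatives("red green blue", "red|green|blue", "R|G|B");
// out is "R G B"
var swapped = replaceAlternatives("a-b", "a|b", "b|a");
// swapped is "b-a"
```

Because the text is scanned once and each match is looked up in the table, two values can be swapped without one replacement undoing the other.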

EDIT 2014-11-20: A new version of the FindRepl.bat program has been released. The code below includes the new modifications; the new features of FindRepl.bat version 2 are described in this post.
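
One rule in the usage text of the listing below is easy to misread: in numbered-block operation (/O:s:e with no rSearch), a negative S or E counts lines from the end of the file, and a missing E defaults to the last line. The following JavaScript sketch restates that rule. The function name resolveBlock is hypothetical; the arithmetic follows the corresponding lines of the program.

```javascript
// Sketch of how /O:s:e is resolved to a line range when no rSearch is given.
// resolveBlock is a hypothetical name; lastLine is the number of lines in the file.
function resolveBlock(s, e, lastLine) {
   if (e === undefined) e = lastLine;        // E not given: up to the last line
   if (s < 0) s = s + lastLine + 1;          // -1 is the last line, -2 the one before
   if (e < 0) e = e + lastLine + 1;
   // Line numbers below 1 are clamped to the first line.
   return [s > 0 ? s : 1, e > 0 ? e : 1];
}

var lastThree  = resolveBlock(-3, undefined, 10);  // [8, 10]
var allButLast = resolveBlock(1, -2, 10);          // [1, 9]
```

So, for a 10-line file, /O:-3 would select lines 8 to 10 and /O:1:-2 would select lines 1 to 9.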

Code: Select all

@if (@CodeSection == @Batch) @then

:: The first line above is...
:: - in Batch: a valid IF command that does nothing.
:: - in JScript: a conditional compilation IF statement that is false,
::               so this Batch section is omitted until next "at-sign end".


@echo off

rem FindRepl.bat: Utility to search and search/replace strings in a file
rem http://www.dostips.com/forum/viewtopic.php?f=3&t=4697
rem Antonio Perez Ayala

rem   - Jun/26/2013: first version.
rem   - Jul/02/2013: use /Q in submatched substrings, use /VR for /V in Replace, eliminate "\r\n" in blocks.
rem   - Jul/07/2013: change /VR by /R, search for "\r\n" with /$ and no /V, /N nor blocks.

rem   - Nov/20/2014: Version 2   - New switches: /J, /L, /G, /ARG#, and || separate regexp's in /Alternations.
rem   - Nov/30/2014: Version 2.1 - Changed /ARG# by /VAR. Replacements on numbered-blocks of lines.
rem                                New sets of predefined functions for Date, File, Folder and Drive management.
rem   - Dec/15/2014: Version 2.2 - New "data generating" predefined functions for /S switch combined with /J

if "%~1" equ "/?" goto showUsage
if /I "%~1" equ "/help" goto showHelp

CScript //nologo //E:JScript "%~F0" %*
if %errorlevel% equ -1 goto showUsage
exit /B %errorlevel%

<usage>
Searches for strings in Stdin file, and prints or replaces them.

FINDREPL [/I] [/V] [/N] [rSearch] [/E:rEndBlk] [/O:s:e] [/B:rBlock] [/$:1:2...]
         [[/R] [/A] sReplace] [/Q:c] [/S:sSource]
         [/J[:n] [/L:jLast]] [/G:file] [/VAR:name=value,...]

  /I         Specifies that the search is not to be case-sensitive.
  /V         Prints only lines that do not contain a match.
  /N         Prints the line number before each line that matches.
  rSearch    Text to be searched for. Also marks the Start of a block of lines.
  /E:rEndBlk Text to be searched for that marks the End of a block of lines.
  /O:s:e     Specifies signed numbers to add to Start/End lines of blocks.
  /B:rBlock  Extra text to be searched for in the blocks of lines.
  /$:1:2...  Specifies to print saved submatched substrings instead of lines.
  sReplace   Text that will replace the matched text.
  /R         Prints only replaced lines.
  /A         Specifies that sReplace has alternative values matching rSearch.
  /Q:c       Specifies a character that is used in place of quotation marks.
  /S:sSource Text to be processed instead of Stdin file.
  /J[:n]     Specifies that sReplace/sSource texts have JScript expressions.
  /L:jLast   Execute jLast as a JScript expression after the last line of file.
  /G:file    Specifies the file to get rSearch and sReplace texts from.
  /VAR:n=v,..Specifies a series of JScript "name=value" var's to initialize.

The /Q:c switch lets you use the given character in the search/replacement
texts in the places where quotes are needed; it is replaced by quotes later.

If the first character of any text is an equal-sign, it specifies the name of
a Batch variable that contains the text to be processed.

The /J switch specifies that sReplace/sSource texts contain JScript expressions
that are evaluated in order to provide the actual replacement and source texts.

All search texts must be given in VBScript regular expression format (regexp).
Matching characters: .=any char, \d=[0-9], \D=non-digit, \w=[A-Za-z0-9_],
\W=non-alphanumeric, \s=[ \t\r\n], \S=non-space. Anchors: ^=begin of line,
$=end of line, \b=begin/end of word. Quantifiers (repeats previous match):
*=zero or more times, +=one or more times, ?=zero or one time. Use parentheses
to delimit these constructs and create "subexpressions". To search for these
special characters, place a back-slash before each one of them: \*+?^$.[]{}()|

The /A (alternation) switch lets you define several alternatives separated by
pipe in rSearch/sReplace: "a|b|c" /A "x|y|z" replaces a with x, b with y, etc.;
in this case the alternatives can only be case-sensitive literal strings. If /J
switch is added, the alternatives in rSearch are always regular expressions and
in sReplace are JScript expressions, and they must be separated by double
pipes: "red|green|blue||Peter|Paul|Mary" /A "'a color'||'a name'" /J


The operation performed and the output displayed vary depending on the mix of
parameters and switches given, which can be divided into three general cases.

FINDREPL  [/V] [rSearch] [/E:rEndBlk] [/O:s:e] [/B:rBlock]  [[/R] sReplace]


A) Find-only operation: when sReplace parameter is not given.

rSearch                            Show matching lines.
rSearch /V                         Show non-matching lines.
rSearch /O:s:e                     Add S and E to each matching line (block).
rSearch /E:rEndBlk                 From rSearch line(s) to rEndBlk line(s).
rSearch /E:rEndBlk /O:s:e          Add S to rSearch line and E to rEndBlk line.
                                   The last three cases define a "block".
rSearch block /B:rBlock            Search rBlock in the previous block.

When rSearch/rEndBlk includes subexpressions, the /$ switch may be added in
order to print only saved submatched substrings instead of complete lines.


B) Find-Replace operation: when sReplace parameter is added to previous ones.

rSearch sReplace                   Show all file lines after replacements.
rSearch sReplace /R                Show only replaced lines.
rSearch block sReplace             Replaces text in blocks only.
rSearch block /B:rBlock sReplace   Replaces only text matched by rBlock.

When rSearch/rBlock includes subexpressions, the replacement text may
include $0, $1, ... in order to retrieve saved submatched substrings.


C) Numbered-block operation: when rSearch parameter is null (requires /O).

/O:s:e                             Show block of lines, from line S to line E.
/O:s:e /B:rBlock                   Search rBlock in the previous block.
"" /O:s:e /B:rBlock sReplace       Replaces only text matched by rBlock.

In this case if S or E is negative, it specifies counting lines from the end of
file. If E is not given, it defaults to the last line of the file (same as -1).


The total number of matchings/replacements is returned in ERRORLEVEL.


</usage>

:showUsage
< "%~F0" CScript //nologo //E:JScript "%~F0" "^<usage>" /E:"^</usage>" /O:+1:-1
echo -^> For further help, type: %0 /help
goto :EOF

<help>

A web site may be opened in order to get further help on the following topics:

   1- FindRepl.bat documentation:
         Detailed description of FindRepl features with multiple examples.

   2- FindRepl.bat version 2 documentation:
         Additional descriptions on /J, /L, /G switches and || alternatives.

   3- Regular Expressions documentation:
         Describe the features that may be used in rSearch, rEndBlk and rBlock.

   4- Alternation and Subexpressions documentation:
         Describe how to use | to separate values in rSearch with /A switch
         and features of subexpressions for /$ switch and $0..$n in sReplace.

   5- JScript expressions documentation (/J switch):
         Describe the operators that may be used in JScript expressions.

   6- Data types and functions for JScript expressions.
         Describe additional operations available in JScript:
         - String Object: functions to manipulate strings.
         - Math Object: arithmetic functions.
         - Date Object: functions for date calculations.
         See also Topic 2- Section 3. on predefined functions.

   7- FileSystemObject objects documentation:
         Describe the properties that may be used in Property functions
         and the rest of File/Folder/Drive predefined functions.

   8- Special folders documentation:
         Describe the values used in specialFolders predefined function.

   9- Windows Management Instrumentation FAQ:
         General description about the features and capabilities (classes and
         properties) that may be used in wmiCollection predefined function.

</help>

:showHelp
setlocal EnableDelayedExpansion
set n=1
set "choices="
for %%a in ("http://www.dostips.com/forum/viewtopic.php\Qf=3&t=4697"
            "http://www.dostips.com/forum/viewtopic.php\Qf=3&t=4697&p=38121#p38121"
            "http://msdn.microsoft.com/en-us/library/6wzad2b2(v=vs.84).aspx"
            "http://msdn.microsoft.com/en-us/library/kstkz771(v=vs.84).aspx"
            "http://msdn.microsoft.com/en-us/library/ce57k8d5(v=vs.84).aspx"
            "http://msdn.microsoft.com/en-us/library/htbw4ywd(v=vs.84).aspx"
            "http://msdn.microsoft.com/en-us/library/bkx696eh(v=vs.84).aspx"
            "http://msdn.microsoft.com/en-us/library/0ea7b5xe(v=vs.84).aspx"
            "http://technet.microsoft.com/en-us/library/ee692772.aspx"
           ) do (
   set "choices=!choices!!n!"
   set "option[!n!]=%%~a"
   set /A n+=1
)
< "%~F0" CScript //nologo //E:JScript "%~F0" "^<help>" /E:"^</help>" /O:+1:-1

:getOption
echo/
choice /C %choices%0 /N /M "Select one of the previous topics, or press 0 to end:"
if errorlevel %n% goto :EOF
set "choice=%errorlevel%"
explorer "!option[%choice%]:\Q=?!"
echo  - Help on topic %choice% started...
goto getOption


End of Batch section


@end


// JScript section


// FINDREPL [/I] [/V] [/N] rSearch [/E:rEndBlk] [/O:s:e] [/B:rBlock] [/$:1:2...]
//          [[/R] [/A] sReplace] [/Q:c] [/S:source]
//          [/J[:n] [/L:jEnd]] [/G:file]

var options = WScript.Arguments.Named,
    args    = WScript.Arguments.Unnamed,
    env     = WScript.CreateObject("WScript.Shell").Environment("Process"),
    fso     = new ActiveXObject("Scripting.FileSystemObject"), file,

    ignoreCase   = options.Exists("I")?"i":"",
    notMatched   = options.Exists("V"),
    showNumber   = options.Exists("N"),
    search       = "",
    endBlk       = undefined,
    offset       = undefined,
    block        = undefined,
    submatches   = undefined,
    justReplaced = options.Exists("R"),
    alternation  = options.Exists("A"),
    replace      = undefined,
    quote        = options.Item("Q"),
    inputLines,
    Jexpr        = options.Exists("J"),

    lineNumber = 0, range = new Array(),
    procLines = false, procBlocks = false,
    nextMatch, result  = 0,

    match = function ( line, regex ) { return line.search(regex) >= 0; },

    parseInts =
       function ( strs ) {
          var Ints = new Array();
          for ( var i = 0; i < strs.length; ++i ) {
             Ints[i] = parseInt(strs[i]);
          }
          return Ints;
       },

    getRegExp =
       function ( param, justLoad ) {
          var result = param;
          if ( result.substr(0,1) == "=" ) result = env(result.substr(1));
          if ( quote != undefined ) result = result.replace(eval("/"+quote+"/g"),"\\x22");
          if ( ! justLoad ) result = new RegExp(result,"gm"+ignoreCase);
          return result;
       }
    ; // end var


// PREDEFINED VARIABLES AND FUNCTIONS

if ( Jexpr) {
   JexprN = options.Item("J") ? parseInt(options.Item("J")) : 10;
   var SUM = new Array(JexprN+1), N = new Array(JexprN+1), PROD = new Array(JexprN+1),
       MAX = new Array(JexprN+1), MIN = new Array(JexprN+1), n = 0;
   for ( var i = 1; i <= JexprN; i++ ) {
      SUM[i] = 0; N[i] = 0; PROD[i] = 1;
      MAX[i] = Number.NEGATIVE_INFINITY; MIN[i] = Number.POSITIVE_INFINITY;
   }
}

// Range functions

function choose(arg,i){return(arg[i]);}
function hlookup(arg,lim,a,b) {
   var ind=a;
   for ( var i=a; i<=b; i++ ) if ( arg[i]>arg[ind] && arg[i]<=lim ) ind=i;
   return(ind);
}
function sum(arg,a,b) {
   var val = 0, n = 0, v;
   for ( var i=a; i<=b; i++ ) {
      if ( ! isNaN(v=parseFloat(arg[i])) ) { val+=v; SUM[i]+=v; N[i]++; n++; }
   }
   return(val);
}
function prod(arg,a,b) {
   var val = 1, n = 0, v;
   for ( var i=a; i<=b; i++ ) {
      if ( ! isNaN(v=parseFloat(arg[i])) ) { val*=v; PROD[i]*=v; n++; }
   }
   return(val);
}
function max(arg,a,b) {
   var val=Number.NEGATIVE_INFINITY, v;
   for ( var i=a; i<=b; i++ ) {
      if ( ! isNaN(v=parseFloat(arg[i])) ) {
         if ( v>val ) val=v;
         if ( v>MAX[i] ) MAX[i]=v;
      }
   }
   return(val);
}
function min(arg,a,b) {
   var val=Number.POSITIVE_INFINITY, v;
   for ( var i=a; i<=b; i++ ) {
      if ( ! isNaN(v=parseFloat(arg[i])) ) {
         if ( v<val ) val=v;
         if ( v<MIN[i] ) MIN[i]=v;
      }
   }
   return(val);
}

// Date functions

file = fso.CreateTextFile("FindRepl.tmp", true);
file.WriteLine( (new Date("12/31/2000")).getVarDate() );  // give proper credit, please...
file.Close();
file = fso.OpenTextFile("FindRepl.tmp", 1);
var date1st = file.Read(2);
file.Close();
file = fso.GetFile("FindRepl.tmp");
file.Delete();

function toDate(s) {  // Convert a string or a number (of milliseconds) to Date
   try {
      var d = s.split("/");
      if ( date1st == "31" ) { var aux = d[0]; d[0] = d[1]; d[1] = aux; }  // DD/MM/YYYY
      if ( date1st == "20" ) { aux = d[0]; d[0] = d[1]; d[1] = d[2]; d[2] = aux; }  // YYYY/MM/DD
      d = new Date(d[0]+"/"+d[1]+"/"+d[2]);
   } catch(e) {
      var d = new Date();
      d.setTime(s);
   }
   return(d);
}
function showDate(d,fmt) {  // Show a Date or a number (of milliseconds) with Date formats
   var d2 = d, s = "Unknown date format";
   if ( ! Date.prototype.isPrototypeOf(d2) ) d2 = toDate(d2);
   if ( fmt == 0 || fmt == undefined ) {
      s = d2.toDateString().split(" ");
      s = s[1]+"/"+s[2]+"/"+s[3];
   } else if ( fmt == 1 ) {
      s = d2.toDateString();
   } else if ( fmt == 2 ) {
      s = d2.toString();
   } else if ( fmt == 3 ) {
      s = d2.toString().split(" ");
      s = s[3];
   } else if ( fmt == 4 ) {
      s = d2.toLocaleString().split(" ");
      s = s[s.length-3]+" "+s[s.length-2]+" "+s[s.length-1];
   } else if ( fmt == 5 || fmt == 6 ) {
      d2.setTime(d2.getTime()+d2.getTimezoneOffset()*60000);
      s = d2.toString().split(" ");
      s = s[3]+((fmt==6)?"."+d2.getTime().toString().slice(-3):"");
   } else if ( fmt == 11 ) {
      s = d2.toLocaleDateString();
   } else if ( fmt == 12 ) {
      s = d2.toLocaleString();
   } else {
      var D = (100+d2.getDate()).toString().substr(1),
          M = (101+d2.getMonth()).toString().substr(1),
          Y = d2.getFullYear();
      if ( fmt == 7 ) {
         s = D+"-"+M+"-"+Y;
      } else if ( fmt == 8 ) {
         s = Y+"-"+M+"-"+D;
      } else {
         var t = (d2.toString().split(" "))[3].split(":");
         if ( fmt == 9 ) {
            s = Y+"-"+M+"-"+D+"@"+t.join(".");
         } else if ( fmt == 10 ) {
            s = Y+M+D+t.join("");
         }
      }
   }
   return(s);
}

var millisecsPerDay = 1000 * 60 * 60 * 24,
    startTime = new Date(),
    daysNow = Math.floor(startTime.getTime()/millisecsPerDay);
function dateDiff(d1,d2) {
   return( Math.floor((Date.prototype.isPrototypeOf(d1)?d1.getTime():d1)/millisecsPerDay) -
           Math.floor((Date.prototype.isPrototypeOf(d2)?d2.getTime():d2)/millisecsPerDay) );
}
function days(d){return( daysNow - Math.floor((Date.prototype.isPrototypeOf(d)?d.getTime():d)/millisecsPerDay) );}
function dateAdd(d,n) {
   var newD = new Date();
   newD.setTime( (Date.prototype.isPrototypeOf(d)?d.getTime():d) + n*millisecsPerDay );
   return(newD);
}

// File, Folder and Drive functions

function fileExist(name) {
   return(fso.FileExists(name));
}

// === Start of new section added in FindRepl V2.2
function fileCopy(name,destination) {
   fso.CopyFile(name,destination);
   return('File(s) "'+name+'" copied');
}
function fileMove(name,destination) {
   fso.MoveFile(name,destination);
   return('File(s) "'+name+'" moved');
}
function fileDelete(name) {
   fso.DeleteFile(name);
   return('File(s) "'+name+'" deleted');
}

var A = "Attributes", TC = "DateCreated", TA = "DateLastAccessed", TW = "DateLastModified",
    D = "Drive", NX = "Name", N = "NameOnly", P = "ParentFolder", F = "Path", SN = "ShortName",
    SF = "ShortPath", Z = "Size", X = "Extension", T = "Type";
function fileProperty(fileName,property) {
   var p,x;
   if ( property == "Extension" ) {
      p = fso.GetExtensionName(fileName);
   } else if ( property == "NameOnly" ) {
      p = fso.GetFile(fileName).Name;
      if ( x=fso.GetExtensionName(fileName) ) p = p.slice(0,-(x.length+1));
   } else {
      p = eval("fso.GetFile(fileName)."+property);
   }
   return(p);
}

function folderExist(name) {
   return(fso.FolderExists(name));
}

function fileRename(fileName,newName) {
   var file = fso.GetFile(fileName);
   if ( fileName.toUpperCase() == newName.toUpperCase() ) file.Name = "_  .  _";
   file.Name = newName;
   return('File "'+fileName+'" renamed to "'+newName+'"');
}
function folderRename(folderName,newName) {
   var folder = fso.GetFolder(folderName);
   if ( folderName.toUpperCase() == newName.toUpperCase() ) folder.Name = "_  .  _";
   folder.Name = newName;
   return('Folder "'+folderName+'" renamed to "'+newName+'"');
}

function Copy(name,destination) {
   fso.GetFile(name).Copy(destination);
   return('File/Folder "'+name+'" copied');
}
function Move(name,destination) {
   fso.GetFile(name).Move(destination);
   return('File/Folder "'+name+'" moved');
}
function Delete(name) {
   fso.GetFile(name).Delete();
   return('File/Folder "'+name+'" deleted');
}

function folderCopy(name,destination) {
   fso.CopyFolder(name,destination);
   return('Folder(s) "'+name+'" copied');
}
function folderMove(name,destination) {
   fso.MoveFolder(name,destination);
   return('Folder(s) "'+name+'" moved');
}
function folderDelete(name) {
   fso.DeleteFolder(name);
   return('Folder(s) "'+name+'" deleted');
}

function driveProperty(drivePath,property) {
   return( eval("fso.GetDrive(fso.GetDriveName(drivePath))."+property) );
}

function folderProperty(folderName,property) {
   var p,x;
   if ( property == "Extension" ) {
      p = fso.GetExtensionName(folderName);
   } else if ( property == "NameOnly" ) {
      p = fso.GetFolder(folderName).Name;
      if ( x=fso.GetExtensionName(folderName) ) p = p.slice(0,-(x.length+1));
   } else {
      p = eval("fso.GetFolder(folderName)."+property);
   }
   return(p);
}

// Data generating functions

function drivesCollection ( propList ) {
   var e = new Enumerator(fso.Drives), drives = "";
   if ( arguments.length ) {
      for ( ; !e.atEnd(); e.moveNext() ) {
         var drive = e.item();
         for ( var i = 0; i < arguments.length; i++ ) {
            var a = arguments[i];
            drives += (i?",":"")+'"'+(((a=="DriveLetter"||a=="DriveType"||a=="Path")||drive.IsReady)?
                                      eval("drive."+arguments[i]):"Not ready")+'"';
         }
         drives += "\r\n";
      }
   } else {
      for ( ; !e.atEnd(); e.moveNext() ) {
         drives += '"'+e.item().Path+'"\r\n';
      }
   }
   return(drives);
}

function filesCollection ( pathOrCol, propList ) {
   var files = "", fc;
   if ( arguments.length ) {
      try {  // if ( pathOrCol is path ) {
         fc = fso.GetFolder(pathOrCol).Files;
      } catch (e) {  // } else {  // pathOrCol is filesCol
         fc = pathOrCol;
      }
      for ( fc = new Enumerator(fc); !fc.atEnd(); fc.moveNext() ) {
         var file = fc.item();
         for ( var i = 1; i < arguments.length; i++ ) {
            files += (i>1?",":"")+'"'+eval("file."+arguments[i])+'"';
         }
         files += "\r\n";
      }
   } else {
      for ( fc = new Enumerator(fso.GetFolder(".").Files); !fc.atEnd(); fc.moveNext() ) {
         files += '"'+fc.item().Name+'"\r\n';
      }
   }
   return(files);
}

function foldersCollection ( pathOrCol, propList ) {
   var folders = "", fc;
   if ( arguments.length ) {
      try {  // if ( pathOrCol is path ) {
         fc = fso.GetFolder(pathOrCol).SubFolders;
      } catch (e) {  // } else {  // pathOrCol is foldersCol
         fc = pathOrCol;
      }
      for ( fc = new Enumerator(fc); !fc.atEnd(); fc.moveNext() ) {
         var folder = fc.item();
         for ( var i = 1; i < arguments.length; i++ ) {
            if ( arguments[i].indexOf(".SubFolders") >= 0 ) {
               if ( folder.SubFolders && eval(arguments[i]) ) {
                  arguments[0] = folder.SubFolders;
                  folders += (folders.slice(-2)!="\r\n"?"\r\n":"") + foldersCollection.apply(undefined,arguments);
                  if ( i+1 < arguments.length ) eval(arguments[i+1]);
               }
               i = arguments.length;
            } else if ( arguments[i].indexOf(".Files") >= 0 ) {
               folders += (folders.slice(-2)!="\r\n"?"\r\n":"") + eval(arguments[i]);
            } else {
               folders += (i>1?",":"")+'"'+eval("folder."+arguments[i])+'"';
            }
         }
         if ( folders.slice(-2) != "\r\n" ) folders += "\r\n";
      }
   } else {
      for ( fc = new Enumerator(fso.GetFolder(".").SubFolders); !fc.atEnd(); fc.moveNext() ) {
         folders += '"'+fc.item().Name+'"\r\n';
      }
   }
   return(folders);
}

function specialFolders ( specialFolder, propList ) {
   var WshShell = WScript.CreateObject("WScript.Shell"), special = "", fc;
   if ( arguments.length ) {
      var fc = specialFolder.split("."), folder = WshShell.SpecialFolders(fc[0]);
      for ( fc = new Enumerator(eval("fso.GetFolder(folder)."+fc[1])); !fc.atEnd(); fc.moveNext() ) {
         var s = fc.item();
         for ( var i = 1; i < arguments.length; i++ ) {
            special += (i>1?",":"")+'"'+eval("s."+arguments[i])+'"';
         }
         special += "\r\n";
      }
   } else {
      for ( fc = new Enumerator(WshShell.SpecialFolders); !fc.atEnd(); fc.moveNext() ) {
         special += fc.item()+"\r\n";
      }
   }
   return(special);
}

function wmiCollection ( className, propList ) {
   // http://msdn.microsoft.com/en-us/library/aa393741(v=vs.85).aspx
   var wmi = "", colItems;
   if ( arguments.length > 1 ) {
      // http://msdn.microsoft.com/en-us/library/windows/desktop/aa394606(v=vs.85).aspx
      colItems = GetObject("WinMgmts:").ExecQuery("Select * from " + className);
      for ( var e = new Enumerator(colItems); ! e.atEnd(); e.moveNext() ) {
         var s = e.item();
         for ( var i = 1; i < arguments.length; i++ ) {
            wmi += (i>1?",":"")+'"'+eval("s."+arguments[i])+'"';
         }
         wmi += "\r\n";
      }
   } else if ( arguments.length == 1 ) {
      // Method suggested by: http://msdn.microsoft.com/en-us/library/aa392315(v=vs.85).aspx
      //                      https://gallery.technet.microsoft.com/f0666124-3b67-4254-8ff1-3b75ae15776d
      colItems = GetObject("WinMgmts:").Get(className).Properties_;
      for ( var e = new Enumerator(colItems); ! e.atEnd(); e.moveNext() ) {
         wmi += e.item().Name+"\r\n";
      }
   } else {
      // https://gallery.technet.microsoft.com/scriptcenter/5649568b-074e-4f5d-be52-e8b7d8fe4517
      colItems = GetObject("WinMgmts:");  // imply ("WinMgmts:\root\cimv2")
      for ( var e = new Enumerator(colItems.SubclassesOf()); ! e.atEnd(); e.moveNext() ) {
         wmi += e.item().Path_.Class+"\r\n";
      }
   }
   return(wmi);
}

// === End of new section added in FindRepl V2.2
// Other functions

function prompt(s){WScript.Stderr.Write(s); return(keyb.ReadLine());}
function FOR($,init,test,inc,body) {
   var For="";
   for ( eval(init); eval(test); eval(inc) ) For+=eval(body);
   return(For);
}
var arguments="";


// PROCESS PARAMETERS

if ( file=options.Item("G") ) {
   file = fso.OpenTextFile(file, 1);
   search = "";
   if ( alternation ) replace = "";
   var sepIn = Jexpr?/(\\\\|\\\||[^\|]|[^\|][^\|])+/g:/(\\\\|\\\||[^\|])+/g, sepOut = "";
   while ( ! file.AtEndOfStream ) {
      if ( block=file.ReadLine() ) {
         if ( block.substr(0,4) == "var " && Jexpr ) {
            eval ( block );
         } else if ( block.substr(0,2) != "//" ) {
            if ( block.slice(0,1)+block.slice(-1) == '""' ) block = block.slice(1,-1);
            block = block.match(sepIn);
            search += sepOut+block[0];
            if ( alternation ) replace += sepOut+(block[1]?block[1]:"");
            sepOut = Jexpr?"||":"|";
         }
      }
   }
   file.Close();
   if ( quote != undefined ) search = search.replace(eval("/"+quote+"/g"),"\\x22");
} else {  // No option /G given
   if ( args.length > 0 ) {
      search = getRegExp(args.Item(0),true);
   }
   if ( args.length > 1 ) {
      replace = args.Item(1);
      if ( replace.substr(0,1) == "=" ) replace = env(replace.substr(1));
   }
}
if ( replace ) {
   if ( quote != undefined ) replace = replace.replace(eval("/"+quote+"/g"),"\\x22");
}

if ( ! WScript.Arguments.length ) WScript.Quit(-1);

if ( options.Exists("E") ) {
   endBlk = getRegExp(options.Item("E"));
   procBlocks = true;
}
if ( options.Exists("O") ) {
   offset = parseInts(options.Item("O").split(":"));
   procBlocks = true;
}
block = undefined;
if ( options.Exists("B") ) {
   block = getRegExp(options.Item("B"),true);
}
if ( options.Exists("$") ) submatches = parseInts(options.Item("$").split(":"));
var removeCRLF = false;


if ( replace != undefined ) {
   removeCRLF = (block == "\\r\\n") && (replace == "");

   if ( alternation ) {  // Enable alternation replacements from "Se|ar|ch" to "Re|pla|ce"
      if ( ! Jexpr ) {  // Original version

         var searchA = search.match(/(\\\\|\\\||[^\|])+/g),
             replaceA = replace.match(/(\\\\|\\\||[^\|])+/g),
             repl = new Array();
         for ( var i = 0; i < searchA.length; i++ ) {
            repl[eval('"'+searchA[i]+'"')] = replaceA?replaceA[i]?eval('"'+replaceA[i]+'"'):"":"";
         }
         replace = function($0,$1,$2) { return repl[$0]; };
         searchA.length = 0;
         if ( replaceA ) replaceA.length = 0;

      } else {  // Version 2: Search alternation has regular expressions:  "regexp1||regexp2||regexp3"
                //            Replace alternation has JScript expressions: "'Re'||'pla'||'ce'"

         var searchA = search.match(/(\\\\|\\\||[^\|]|[^\|][^\|])+/g), // divide search "regexp1||regexp2" in parts
             repl = replace.match(/(\\\\|\\\||[^\|]|[^\|][^\|])+/g);   // the same for "replace1||replace2"

         search = "";
         replace = "$0,";                               // "function($0,"
         for ( var i = 0; i < searchA.length; i++ ) {   // process each regexpI
            search += (i?'|':'')+'('+searchA[i]+')';    // re-assemble search regexp as "(regexp1)|(regexp2)"
            var subexprs = searchA[i].match(/[^\\]?\(/g);// count subexpressions in this regexpI
            subexprs = subexprs?subexprs.length:0;      // zero if no one
            for ( var j = 0; j <= subexprs; j++ ) {     // insert each subexpression
                  replace += '$'+(i+1)+'$'+j+',';       // 'function($0,' + "$i$0,$i$j,..."
            }
            repl[i] = repl[i].replace(/\$(\d{1,2})/g,"$$"+(i+1)+"$$$1");        // change "$n" by "$i$n" in replaceI
         }

         replace += "$offset,$string";                  // 'function($0, $i$0,$i$1,$i$2, ...' + "$offset, $string)"
         eval ("replace=function("+replace+"){for (var i=1; !eval('$'+i+'$0'); i++);return(eval(repl[i-1]));};");

         Jexpr = false;                                 // the replace function already includes the "eval" part

         searchA = undefined;
         if ( subexprs ) subexprs = undefined;

      }

   } else {  // No alternation

      var evalReplace = "";
      if ( Jexpr) {
         for ( var i = 0; i <= JexprN; i++ ) evalReplace += (i?',':'')+'$'+i;
         eval ( "evalReplace = function("+evalReplace+") { return(eval(replace)); };" );
      }

   }

} else {  // replace == undefined: Find-only operation (to do: adjust values of /$ switch)

   if ( Jexpr ) {  // Find a "second-level" alternation
      var searchA = search.split("||");                 // divide search "regexp1||regexp2" in parts
      search = "";
      for ( var i = 0; i < searchA.length; i++ ) {
         search += (i?'|':'')+'('+searchA[i]+')';       // re-assemble search regexp as "(regexp1)|(regexp2)"
      }
      searchA = undefined;
   }

}

if ( block=options.Item("VAR") ) eval ( "var "+block+";" );
if ( search != "" ) search = new RegExp(search, "gm"+ignoreCase);
if ( block  != undefined ) block  = new RegExp(block , "gm"+ignoreCase);
var keyb = fso.OpenTextFile("CONIN$", 1);


// PROCESS INPUT FILE


// FINDREPL [/I] [/V] [/N] rSearch [/E:rEndBlk] [/O:s:e] [/B:rBlock] [/$:s1...]
//          [[/R] [/A] sReplace] [/Q:c] [/S:sSource]

//          In Search and Replace operations:
//            /V or /N switches imply line processing
//            /E or /O switches imply block (and line) processing
//          If a Search operation (with no previous switches) has no /$ switch:
//            it implies line processing (otherwise it is file processing)

var severalLines = false;
if ( options.Exists("S") ) {  // Process Source string instead of file
   var source = options.Item("S");
   if ( source.substr(0,1) == "=" ) source = env(source.substr(1));
   if ( ! Jexpr ) {
      inputLines = new Array(); lastLine = 1;
      inputLines[0] = source;
      procLines = true;
   } else {
      inputLines = eval(source);
      severalLines = true;
   }
} else {  // Process Stdin file
   inputLines = WScript.StdIn.ReadAll();
   severalLines = true;
}

if ( severalLines ) {

   if ( notMatched || showNumber || procBlocks ) procLines = true;
   if ( replace==undefined && submatches==undefined ) procLines = true;

   if ( procLines ) {  // Separate file contents in lines
      var lastByte = inputLines.slice(-1);
      inputLines = inputLines.replace(/([^\r\n]*)\r?\n/g,"$1\n").match(/^.*$/gm);
      lastLine = inputLines.length - ((lastByte == "\n")?1:0);
   }

   if ( procBlocks ) {  // Create blocks of lines
      if ( search != "" ) {  // Blocks based on Search lines:
         if ( offset == undefined ) offset = new Array(0,0);
         for ( var i = 1; i <= lastLine; i++ ) {
            if ( match(inputLines[i-1],search) ) {
               if ( endBlk != undefined ) {  // 1- from Search line to EndBlk line [+offsets].
                  for ( var j=i+1; j<=lastLine && !match(inputLines[j-1],endBlk); j++ );
                  if ( j <= lastLine ) {
                     var s = i+offset[0], e = j+offset[1];
                     // Insert additional code here to cancel overlapped blocks
                     range.push(s>0?s:1, e>0?e:1);
                  }
                  i = j;
               } else {  // 2- surrounding Search lines with offsets.
                  s = i+offset[0], e = i+offset[1];
                  range.push(s>0?s:1, e>0?e:1);
               }
            }
         }
      } else {  // Offset with no Search: block is range of lines
         if ( offset.length < 2 ) offset[1] = lastLine;
         s = offset[0]<0 ? offset[0]+lastLine+1 : offset[0];
         e = offset[1]<0 ? offset[1]+lastLine+1 : offset[1];
         range.push(s>0?s:1, e>0?e:1);
      }
      if ( range.length == 0 ) WScript.Quit(0);
      range.push(0xFFFFFFFF,0xFFFFFFFF);
   }

}
// endif severalLines

if ( replace == undefined ) {  // Search operations
   if ( procLines ) {  // Search on lines
      if ( procBlocks ) {  // Process previously created blocks
         for ( var r=0, lineNumber=1; lineNumber <= lastLine; lineNumber++ ) {
            if ( (range[r]<=lineNumber && lineNumber<=range[r+1]) != notMatched ) {
               if ( submatches != undefined ) {
                  if ( showNumber ) WScript.Stdout.Write(lineNumber+":");
                  while ( (nextMatch = block.exec(inputLines[lineNumber-1])) != null ) {
                     for ( var s = 0; s < submatches.length; s++ ) {
                        WScript.Stdout.Write(" " + (quote!=undefined?quote:'"') +
                                                   nextMatch[submatches[s]] +
                                                   (quote!=undefined?quote:'"'));
                     }
                     result++;
                  }
                  WScript.Stdout.WriteLine();
               } else {
                  if ( block == undefined  ||  match(inputLines[lineNumber-1],block) ) {
                     if ( showNumber ) WScript.Stdout.Write(lineNumber+":");
                     WScript.Stdout.WriteLine(inputLines[lineNumber-1]);
                     result++;
                  }
               }
            }
            if ( lineNumber >= range[r+1] ) r += 2;
         }
      } else {  // Process all lines for Search
         for ( lineNumber = 1; lineNumber <= lastLine; lineNumber++ ) {
            if ( match(inputLines[lineNumber-1],search) != notMatched ) {
               if ( showNumber ) WScript.Stdout.Write(lineNumber+":");
               if ( submatches != undefined ) {
                  search.lastIndex = 0;
                  while ( (nextMatch = search.exec(inputLines[lineNumber-1])) != null ) {
                     for ( var s = 0; s < submatches.length; s++ ) {
                        WScript.Stdout.Write(" " + (quote!=undefined?quote:'"') +
                                                   nextMatch[submatches[s]] +
                                                   (quote!=undefined?quote:'"'));
                     }
                     result++;
                  }
                  WScript.Stdout.WriteLine();
               } else {
                  WScript.Stdout.WriteLine(inputLines[lineNumber-1]);
                  result++;
               }
            }
         }
      }

   } else {  // Search on entire file and show submatched substrings
      if ( submatches != undefined ) {
         while ( (nextMatch = search.exec(inputLines)) != null ) {
            for ( var s = 0; s < submatches.length; s++ ) {
               WScript.Stdout.Write(" " + (quote!=undefined?quote:'"') +
                                          nextMatch[submatches[s]] +
                                          (quote!=undefined?quote:'"'));
            }
            result++;
            WScript.Stdout.WriteLine();
         }
      }
   }

} else {  // Replace operations

   if ( procLines ) {  // Replace on lines
      if ( procBlocks ) {  // Process previously created blocks
         if ( block == undefined ) block = search;  // Replace rSearch or rBlock (the last one)
         var CRLFremoved = false;
         for ( var r=0, lineNumber=1; lineNumber <= lastLine; lineNumber++ ) {
            if ( range[r]<=lineNumber && lineNumber<=range[r+1] ) {
               if ( removeCRLF ) {
                  WScript.Stdout.Write(inputLines[lineNumber-1]);
                  CRLFremoved = true;
                  result++;
               } else {
                  if ( match(inputLines[lineNumber-1],block) ) {
                     if ( CRLFremoved ) { WScript.Stdout.WriteLine(); CRLFremoved = false; }
                     WScript.Stdout.WriteLine(inputLines[lineNumber-1].replace(block,Jexpr?evalReplace:replace));
                     result++;
                  } else {
                     if ( CRLFremoved ) { WScript.Stdout.WriteLine(); CRLFremoved = false; }
                     if ( ! justReplaced ) WScript.Stdout.WriteLine(inputLines[lineNumber-1]);
                  }
               }
            } else {
               if ( CRLFremoved ) { WScript.Stdout.WriteLine(); CRLFremoved = false; }
               if ( ! justReplaced ) WScript.Stdout.WriteLine(inputLines[lineNumber-1]);
            }
            if ( lineNumber >= range[r+1] ) r += 2;
         }
         if ( CRLFremoved ) { WScript.Stdout.WriteLine(); CRLFremoved = false; }
      } else {  // Process all lines for Replace
         for ( lineNumber = 1; lineNumber <= lastLine; lineNumber++ ) {
            if ( match(inputLines[lineNumber-1],search) ) {
               WScript.Stdout.WriteLine(inputLines[lineNumber-1].replace(search,Jexpr?evalReplace:replace));
               result++;
            } else {
               if ( ! justReplaced ) WScript.Stdout.WriteLine(inputLines[lineNumber-1]);
            }
         }
      }

   } else {  // Replace on entire file
      WScript.Stdout.Write(inputLines.replace(search,Jexpr?evalReplace:replace));
   }

}

if ( Jexpr && (lastLine=options.Item("L")) ) {
   if ( lastLine.substr(0,1) == '=' ) lastLine = env(lastLine.substr(1));
   if ( quote != undefined ) lastLine = lastLine.replace(eval("/"+quote+"/g"),"\\x22");
   WScript.Stdout.WriteLine( eval(lastLine) );
}

WScript.Quit(result);

FIND STRINGS

1. Basic find string usage

Although it may seem complex, FindRepl.bat is straightforward to use. Optional switches may be given in any order, as in most standard "DOS" commands. The program processes lines read from Stdin and has two base parameters, a Search part and a Replace part: < theFile.txt FindRepl "Search" "Replace". If the Replace part is not given, FindRepl.bat behaves like the FINDSTR command.

- Simple find:

Code: Select all

< theFile.txt FindRepl "any string"

- Enumerate/Count all lines, even if the file has Unix-type end-of-line delimiters (<LF> only), the last line lacks the end-of-line delimiter, or both:

Code: Select all

< theFile.txt FindRepl /N
echo Number of lines: %errorlevel%

- Eliminate all lines with "XYZ" text:

Code: Select all

< theFile.txt FindRepl /V "XYZ"
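
The line count stays correct for <LF>-only files and for a missing final delimiter because of the way the text is split into lines. The following sketch restates that step from the source listing as a standalone function; the name splitLines and the returned object are illustrative assumptions, not part of FindRepl.bat, and the code is plain JavaScript written in the same style as the JScript listing.

```javascript
// Sketch: split raw text into lines as the listing above does.
// splitLines is a hypothetical name, not part of FindRepl.bat.
function splitLines(text) {
   var lastByte = text.slice(-1);
   // Normalize CR+LF to LF, then take every line (possibly empty)
   var lines = text.replace(/([^\r\n]*)\r?\n/g, "$1\n").match(/^.*$/gm);
   // A trailing LF yields one extra empty element that is not a real line
   var count = lines.length - (lastByte == "\n" ? 1 : 0);
   return { lines: lines.slice(0, count), count: count };
}
```

With this logic, "a\r\nb\r\n" and "a\nb" both count as two lines.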


2. Finding blocks of lines (/E and /O switches)

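Besides individual lines, FindRepl.bat can also output blocks of lines, which may be defined in several ways. The simplest is a numbered range of lines given in the /O switch (without a Search part): FindRepl /O:s:e. A negative number counts back from the end of the file, and if only one number is given the block extends to the last line.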

- Show from line 12 to line 34:

Code: Select all

< theFile.txt FindRepl /O:12:34

- Show the first 15 lines (head):

Code: Select all

< theFile.txt FindRepl /O:1:15

- Show the last 20 lines (tail):

Code: Select all

< theFile.txt FindRepl /O:-20

- Show both the first 15 lines and the last 20 lines (with line numbers):

Code: Select all

< theFile.txt FindRepl /V /O:16:-21 /N
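
The way the numbers in /O:s:e are resolved when there is no Search part can be sketched as below. It follows the corresponding lines of the source listing; the function name resolveRange and its array arguments are assumptions made for illustration only.

```javascript
// Sketch of how /O:s:e is resolved when no Search part is given.
// resolveRange is a hypothetical helper; offset is [s] or [s, e].
function resolveRange(offset, lastLine) {
   var s = offset[0];
   var e = offset.length < 2 ? lastLine : offset[1];
   // Negative numbers count back from the end of the file
   if (s < 0) s = s + lastLine + 1;
   if (e < 0) e = e + lastLine + 1;
   // Line numbers below 1 are clamped to the first line
   return [s > 0 ? s : 1, e > 0 ? e : 1];
}
```

For a 100-line file, /O:-20 resolves to lines 81 to 100 and /O:16:-21 to lines 16 to 80, which /V then inverts into the first 15 and the last 20 lines.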


If both a Search part and the /O switch are given, the numbers in the /O switch (with optional signs) are added to the line number of each matching line: FindRepl "Search" /O:s:e.

- Search for a certain string and show a block of 3 lines for each match: 1 line before and 1 line after the matching line (http://www.dostips.com/forum/viewtopic.php?f=3&t=3801):

Code: Select all

< theFile.txt FindRepl "the string" /O:-1:+1
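
As a rough sketch of the rule above, each matching line number plus the two /O numbers gives one block. The function name blockRanges is hypothetical, and the search argument is assumed to be a RegExp without the "g" flag so that test() keeps no state between lines.

```javascript
// Sketch: blocks built by adding the /O offsets to each matching line number.
// blockRanges is a hypothetical name; offset is [s, e].
function blockRanges(lines, search, offset) {
   var ranges = [];
   for (var i = 1; i <= lines.length; i++) {      // 1-based line numbers
      if (search.test(lines[i - 1])) {
         var s = i + offset[0], e = i + offset[1];
         // As in the listing, only values below 1 are clamped
         ranges.push([s > 0 ? s : 1, e > 0 ? e : 1]);
      }
   }
   return ranges;
}
```

With /O:-1:+1, a match on line 4 produces the block of lines 3 to 5, while a match on line 1 produces lines 1 to 2 because the start is clamped.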


A block may also be defined by an ending line given in the /E switch (instead of /O): FindRepl "Start" /E:"End"; the block consists of all lines from the line that matches the Start string to the next line that matches the End string.

- Eliminate a block of lines surrounded by certain delimiting lines (http://stackoverflow.com/questions/17126655/how-to-remove-lines-or-text-in-given-lines-from-file-in-batch):

Code: Select all

< theFile.txt FindRepl /V "start remove" /E:"end remove"


Both /E and /O switches may be combined: FindRepl "Start" /E:"End" /O:s:e, so that the s number is added to the Start line number and the e number is added to the End line number.

- Show a block of lines surrounded by delimiting lines, but not including them:

Code: Select all

< theFile.txt FindRepl "start show" /E:"end show" /O:+1:-1

For example, the built-in help screen that appears when FindRepl.bat is executed with the /? parameter is obtained via a command equivalent to this one:

Code: Select all

< "FindRepl.bat" FindRepl "^<usage>" /E:"^</usage>" /O:+1:-1
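
The Start/End block selection with offsets may be pictured with the sketch below, which mirrors the loop in the source listing. The name blocksBetween is an assumption, and both regular expressions are assumed to be created without the "g" flag.

```javascript
// Sketch: blocks from a Start line to the next End line, with offsets.
// blocksBetween is a hypothetical name; offset is [s, e].
function blocksBetween(lines, start, end, offset) {
   var ranges = [], last = lines.length;
   for (var i = 1; i <= last; i++) {
      if (start.test(lines[i - 1])) {
         // Look for the first End line after the Start line
         var j = i + 1;
         while (j <= last && !end.test(lines[j - 1])) j++;
         if (j <= last) {
            var s = i + offset[0], e = j + offset[1];
            ranges.push([s > 0 ? s : 1, e > 0 ? e : 1]);
         }
         i = j;   // continue searching after the End line
      }
   }
   return ranges;
}
```

A Start line with no End line after it produces no block, and /O:+1:-1 trims the two delimiting lines from each block.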


3. Basic usage of regular expressions

The Search string is not plain text, but a regular expression ("regexp"). In its most basic form, a regexp just includes the literal string to search: "Search this".

Several features may be included in a regexp, most of them selected via a backslash character: \. You may use a backslash followed by another character to specify control characters; for example, to specify a <TAB> character you may use \t or \cI (Ctrl-I) or \x09 (ASCII hexadecimal code 09). For a <LF> (newline) character you may use \n or \cJ or \x0A, and use \r or \cM or \x0D for <CR> (return).

Certain combinations of \ and the next character specify common sets of characters. For example, use \d for any digit (equivalent to [0-9]), \D for non-digit characters (equivalent to [^0-9]), \w for alphanumeric characters and underscore ([A-Za-z0-9_]), \W for any other character ([^A-Za-z0-9_]), \s for separators (space, tab, return, newline) and \S for non-separators. A dot alone matches any character except a newline; to search for a literal dot, precede it with a backslash.

You may anchor the match at the beginning of a line with ^ or at the end of a line with $, or use \b to anchor it at the beginning or end of a word. An asterisk repeats the previous item zero or more times, so "\d*" matches an optional number. A plus sign repeats the previous item one or more times ("\d+" matches a required number), and a question mark repeats it zero or one time ("\d?" matches an optional digit).
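
A few of these rules can be checked directly with plain JavaScript regular expressions, which follow essentially the same syntax as the JScript engine. The table below is only an illustration; the names checks and allChecksPass are made up for this sketch.

```javascript
// Each entry pairs a regexp with a sample text and whether it should match.
var checks = [
   [/^\d*$/,   "",            true ],   // * : zero or more digits
   [/^\d+$/,   "",            false],   // + : at least one digit is required
   [/^\d+$/,   "2013",        true ],
   [/^\d?$/,   "42",          false],   // ? : at most one digit
   [/\bcat\b/, "a cat",       true ],   // \b anchors at word boundaries
   [/\bcat\b/, "concatenate", false]
];

// Returns true when every regexp behaves as stated in the table.
function allChecksPass() {
   for (var i = 0; i < checks.length; i++) {
      if (checks[i][0].test(checks[i][1]) != checks[i][2]) return false;
   }
   return true;
}
```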

For a complete description of this topic, see: http://msdn.microsoft.com/en-us/library/1400241x(v=vs.84).aspx Please note that you must always escape the following special characters with a backslash in order to search for them literally: \*+?^$.[]{}()|

- Show lines that include a dot followed by a <TAB>:

Code: Select all

< theFile.txt FindRepl "\.\t"
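
When the text to search comes from elsewhere, the escaping of the special characters listed above can be automated. This is a minimal sketch, assuming a hypothetical helper named escapeRegExp; FindRepl.bat itself does not provide such a function.

```javascript
// Sketch: put a backslash before each special character \*+?^$.[]{}()|
// so that a literal text can be used as a Search regexp.
// escapeRegExp is a hypothetical helper.
function escapeRegExp(text) {
   return text.replace(/[\\*+?^$.\[\]{}()|]/g, "\\$&");
}
```

For example, escapeRegExp("a.b") gives a\.b, which matches "a.b" but not "axb".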


All text parameters are usually strings enclosed in quotes, or a single word with no quotes if it does not contain special Batch characters. Note that you cannot include a quote in a regexp; in that case, use another character in its place and give that character in the /Q switch. Although you may use the \x22 hexadecimal value to insert a quote in any text, the /Q:c method is much clearer. The colons in all switches are mandatory.

- Show lines that include this text: He said: "Here I am" and gone

Code: Select all

< theFile.txt FindRepl "He said: 'Here I am' and gone" /Q:'
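
The effect of /Q:c can be pictured as a substitution of the stand-in character by \x22 before the regexp is compiled. The source listing shows such a substitution for the /L expression; that the Search text is treated the same way is an assumption of this sketch, and the name withQuotes is hypothetical.

```javascript
// Sketch: replace the /Q stand-in character with \x22 before the text
// is compiled as a regexp. withQuotes is a hypothetical name.
function withQuotes(text, quoteChar) {
   return text.split(quoteChar).join("\\x22");
}

// The apostrophes below stand for quotes, as in the /Q:' example above
var search = new RegExp(withQuotes("He said: 'Here I am' and gone", "'"));
```

The resulting regexp matches the text with real quotes and no longer matches the version written with apostrophes.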


4. Storing search strings in Batch variables

If the first character of any text parameter is an equal sign, the rest of the parameter is the name of a Batch variable that contains the text to process; if you need a literal string that begins with an equal sign, just escape it with a backslash. The /S switch processes a given text instead of the Stdin file.

- Check if "C:\Certain\Path" already exists in system PATH variable (http://stackoverflow.com/questions/17086292/how-to-insert-a-new-path-into-system-path-variable-if-it-is-not-already-there):

Code: Select all

rem A backslash is special in a regexp, so each literal one is doubled
FindRepl "C:\\Certain\\Path" /S:=PATH > NUL
if %errorlevel% gtr 0 echo The given path exists in system PATH
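
The equal-sign convention amounts to one lookup per parameter, as the source listing does for the /S switch. In the sketch below, resolveParam is a hypothetical name and env is a plain object standing in for the process environment that the real program reads.

```javascript
// Sketch: resolve a text parameter that may name a Batch variable.
// resolveParam is hypothetical; env maps variable names to values.
function resolveParam(param, env) {
   // "=NAME" means: use the contents of variable NAME
   if (param.charAt(0) == "=") return env[param.substring(1)];
   // Anything else is used as given ("\=" already matches a literal "=")
   return param;
}
```

So resolveParam("=PATH", env) returns the value stored under PATH, while a parameter such as "\=x" is passed through unchanged and, as a regexp, matches the literal text "=x".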


5. Search an extra string in a block of lines (/B switch)

When blocks of lines are defined, their resulting lines may be searched again using an extra regexp given in the /B:rBlock switch. In this case, if the /V switch is also given, it applies only to the first regexp (rSearch).

- Search for the value of the "Name" field inside the "Category X" tag of an XML file, assuming that all values are placed on separate lines:

Code: Select all

< theFile.xml FindRepl "<Category X>" /E:"</Category X>" /B:"Name"


6. Defining and using subexpressions (/$ switch)

A regular expression may consist of several subexpressions enclosed in parentheses. Normally a search with such a regexp displays the matching lines, but if the /$ switch is included, only the individual submatched subexpressions are displayed. The /$ switch specifies the desired subexpressions via a series of numbers separated by colons (a zero shows the whole matched expression); each subexpression is shown enclosed in quotes or in the character given in the /Q switch. For a complete description of this point, see: http://msdn.microsoft.com/en-us/library/kstkz771(v=vs.84).aspx

- Search for the line next to "BEGIN DSJOB" and extract the value enclosed in quotes after "Identifier" (http://www.dostips.com/forum/viewtopic.php?f=3&t=4679):

Code: Select all

< theFile.txt FindRepl "BEGIN DSJOB" /O:+1:+1 /B:"Identifier '([^']*)'" /Q:' /$:1

- Search for lines that contain a keyword, get the line that is located 3 lines below and copy the characters 5 to 12 of those lines (http://stackoverflow.com/questions/27013999/using-a-batch-program-to-search-for-a-keyword-and-copy-offset-content-to-a-file):

Code: Select all

for /F %a in ('^< input.txt FindRepl "keyword" /O:+3:+3 /B:".{4}(.{8})" /$:1') do @echo %~a
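
The combination used in the last example (a keyword, an offset of +3 lines, a block regexp and one subexpression) can be sketched as follows. The function name extractBelow and its parameters are assumptions, and both regular expressions are assumed to be created without the "g" flag.

```javascript
// Sketch: for each line matching keyword, take the line located
// linesBelow lines further down and return one submatch of blockRegex.
// extractBelow is a hypothetical name.
function extractBelow(lines, keyword, linesBelow, blockRegex, group) {
   var found = [];
   for (var i = 0; i < lines.length; i++) {
      if (keyword.test(lines[i])) {
         var target = lines[i + linesBelow];
         if (target !== undefined) {           // the offset may run past the end
            var m = blockRegex.exec(target);
            if (m) found.push(m[group]);
         }
      }
   }
   return found;
}
```

With the regexp .{4}(.{8}) and group 1, the first four characters of the target line are skipped and characters 5 to 12 are returned.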


7. A large example

The next example uses the "(\w+)" regexp to match words separated by special characters and the "(\w+)@(\w+)\.(\w+)" regexp to match email addresses consisting of a user name, an at sign, a server name, a dot, and a domain name.

Code: Select all

C:\> < theFile.txt FindRepl /N
1:Line with an email address: joedoe@unknown.org
2:Please send mail to george@contoso.com and someone@example.com. Thanks!
3:Line number 3 with no email address

C:\> echo Number of lines: %errorlevel%
Number of lines: 3

C:\> set email=(\w+)@(\w+)\.(\w+)

C:\> < theFile.txt FindRepl /N =email
1:Line with an email address: joedoe@unknown.org
2:Please send mail to george@contoso.com and someone@example.com. Thanks!

C:\> echo Lines with email addresses: %errorlevel%
Lines with email addresses: 2

C:\> rem Extract email addresses:

C:\> < theFile.txt FindRepl /N =email /$:0
1: "joedoe@unknown.org"
2: "george@contoso.com" "someone@example.com"

C:\> echo Number of email addresses: %errorlevel%
Number of email addresses: 3

C:\> rem Separate email parts in domain, server, and user order:

C:\> < theFile.txt FindRepl /N =email /$:3:2:1
1: "org" "unknown" "joedoe"
2: "com" "contoso" "george" "com" "example" "someone"

C:\> rem Separate words:

C:\> < theFile.txt FindRepl "(\w+)" /$:1
 "Line" "with" "an" "email" "address" "joedoe" "unknown" "org"
 "Please" "send" "mail" "to" "george" "contoso" "com" "and" "someone" "example" "com" "Thanks"
 "Line" "number" "3" "with" "no" "email" "address"

C:\> echo Number of words: %errorlevel%
Number of words: 27

C:\> rem Find/count capital letters:

C:\> < theFile.txt FindRepl "[A-Z]" /$:0
 "L"
 "P" "T"
 "L"

C:\> echo %errorlevel%
4

C:\> rem Count letters:

C:\> < theFile.txt FindRepl /I "[A-Z]" /$:0 > NUL

C:\> echo %errorlevel%
124
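
The numbers returned in ERRORLEVEL in this session can be reproduced with a small sketch of the submatch loop, modeled on the exec() loop of the source listing. The name showSubmatches and the returned object are assumptions; the guard against empty matches is an addition of this sketch.

```javascript
// The three lines of theFile.txt used in the session above
var sampleLines = [
   "Line with an email address: joedoe@unknown.org",
   "Please send mail to george@contoso.com and someone@example.com. Thanks!",
   "Line number 3 with no email address"
];

// Sketch: collect the selected submatches of every match on every line.
// showSubmatches is a hypothetical name; indexes plays the role of /$.
function showSubmatches(lines, pattern, flags, indexes) {
   var search = new RegExp(pattern, "g" + flags);
   var output = [], count = 0;
   for (var n = 0; n < lines.length; n++) {
      var text = "", m;
      search.lastIndex = 0;
      while ((m = search.exec(lines[n])) != null) {
         for (var s = 0; s < indexes.length; s++) {
            text += ' "' + m[indexes[s]] + '"';
         }
         count++;                                  // one result per match
         if (m[0] == "") search.lastIndex++;       // guard against empty matches
      }
      if (text != "") output.push(text);
   }
   return { output: output, count: count };
}
```

For instance, showSubmatches(sampleLines, "(\\w+)", "", [1]).count is 27, the same number of words reported above.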


FIND AND REPLACE STRINGS

1. Basic find and replace string usage

If the Replace part is given, FindRepl.bat replaces the matched Search text with the Replace part. Note that "the matched Search text" is the same text described in the Search part above, including blocks and the /B:rBlock regexp. You may even replace text in a numbered block of lines; to do that, put "" in the Search text, use /O:s:e to select the block of lines and place the actual Search text in the /B: switch. The only switches that do not work in a Replace operation are /N and /$, but you may use a $ match variable in the Replace text in order to retrieve the saved submatched substrings, as described at this site: http://msdn.microsoft.com/en-us/library/t0kbytzc(v=vs.84).aspx

- Eliminate all "XYZ" words:

Code: Select all

< theFile.txt FindRepl.bat "\bXYZ\b" ""

- Replace all "ACSD" strings with "XYZ" (http://www.dostips.com/forum/viewtopic.php?f=3&t=3282):

Code: Select all

< theFile.txt FindRepl "ACSD" "XYZ"

- Search for a specific word and replace the next 4 characters (http://stackoverflow.com/questions/17085650/find-a-string-and-replace-specific-letters-in-batch-file):

Code: Select all

< theFile.txt FindRepl "word ...." "word NEWC"

- Replace text in the line number 3 of the file only (http://www.dostips.com/forum/viewtopic.php?f=3&t=4697&p=34923#p34923):

Code: Select all

< file.txt FindRepl "" /O:3:3 /B:"VALUETOCHANGE" "Given Value"


In the Replace text you may insert the usual escaped characters: \t (TAB), \n (LF), \r (CR), etc., but you must use a double dollar sign ($$) in order to insert a literal $ character; all other characters are taken literally because the Replace text is not a regular expression. Also, because of Batch-JScript interface problems, you cannot insert a quote in the Replace text even if it is stored in a Batch variable, but the /Q switch solves this problem: choose a character to insert in the Search and Replace parts in place of quotes, and give the same character in the /Q:c switch.

- Replace each run of 1 to 8 spaces with a <TAB>:

Code: Select all

< theFile FindRepl " {1,8}" "\t"

- Change the data value of a certain tag in an XML file (http://stackoverflow.com/questions/17054275/changing-tag-data-in-an-xml-file-using-windows-batch-file):

Code: Select all

< theFile.xml FindRepl "(<TagName>).*(</TagName>)" "$1NewValue$2"
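
The $ match variables behave as in the standard replace() method, which this short sketch illustrates with made-up sample strings (xml, changed and price are names chosen for this example only).

```javascript
var xml = "<TagName>OldValue</TagName>";

// $1 and $2 re-insert the saved submatches around the new value
var changed = xml.replace(/(<TagName>).*(<\/TagName>)/, "$1NewValue$2");

// $$ in the replacement text produces a single literal $,
// so "$$$1" means: a dollar sign followed by submatch 1
var price = "Total: 10 USD".replace(/(\d+) USD/, "$$$1");
```

Here changed ends up as "<TagName>NewValue</TagName>" and price as "Total: $10".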


2. Replacing line delimiters

If the /B:rBlock text is "\r\n" (<CR><LF>, the end-of-line delimiter) and the sReplace string is "" (empty), the output lines are shown joined together in one long string without line separators. When the entire file is processed for Replace (neither /V nor /N switches nor blocks are given), the end-of-line delimiting characters ("\r\n" = <CR><LF>) may be searched and replaced in any way you wish. For example:

- Change <LF> Unix line delimiters to <CR><LF> Windows ones:

Code: Select all

< unixFile.txt FindRepl "\n" "\r\n" > windowsFile.txt

- Change <CR><LF> Windows line delimiters to <LF> Unix ones:

Code: Select all

< windowsFile.txt FindRepl "\r\n" "\n" > unixFile.txt
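
One caveat when converting to Windows delimiters: a plain "\n" to "\r\n" replacement applied to a file that already contains some <CR><LF> pairs turns them into <CR><CR><LF>. The sketch below shows both conversions in JavaScript with an optional <CR> in the first pattern; the function names toWindows and toUnix are illustrative assumptions.

```javascript
// Sketch: delimiter conversions that tolerate mixed input.
// toWindows and toUnix are hypothetical names.
function toWindows(text) {
   // \r? also consumes an existing CR, so CR+LF is not doubled
   return text.replace(/\r?\n/g, "\r\n");
}
function toUnix(text) {
   return text.replace(/\r\n/g, "\n");
}
```

Since the Search part is a regexp, the same "\r?\n" pattern should presumably work as the Search text of FindRepl for files with mixed delimiters, although that was not tested here.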


If the entire file is processed for Replace this way, the value returned in ERRORLEVEL is zero.

If the entire file is processed for Search (no /V switch, no /N switch and no blocks) and the /$ switch is given, the end-of-line delimiters may be searched, but only the results of the /$ switch are displayed.


3. Find and replace multiple strings (/A switch)

Finally, support for fast replacement of multiple strings in just one pass of the file has been added. This feature is achieved via the /A switch and a series of alternative values separated by pipe character and placed in both Search and Replace text in the form of an "alternation", as described at this site: http://msdn.microsoft.com/en-us/library/kstkz771(v=vs.84).aspx

- Replace certain names (three of them) by three different ones (http://www.dostips.com/forum/viewtopic.php?f=3&t=3848):

Code: Select all

< theFile.txt FindRepl "Bob Jones|Mary|Tom Riley" /A "Fred Thomas|Jane|Doug Smith"


- Translate day-of-the-week names from Spanish to English:

Code: Select all

set "spanish=lunes|martes|miércoles|jueves|viernes|sábado|domingo"
set "english=Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
< theFile.txt FindRepl =spanish /A =english


Please note that this multi-string replacement only works with case-sensitive literal strings. If a search alternative can match different input texts (i.e. if it is a regular expression), the result will be wrong; for example: FindRepl /I "One|Two|Three" /A "Uno|Dos|Tres". This example replaces the word "One" with "Uno", but any other text matched as a result of the /I switch, like "one", "onE", etc., is deleted. You may use FindRepl "One|one|Two|two" /A "Uno|uno|Dos|dos" as a provisional fix, but the next version of the program will solve this problem.

The next example loads a series of values from a replacements file in "old:new" format and processes a data file to replace all the strings. The multi-string replacement selects each replacement by direct access to an array element, so its speed is not affected by the number of elements. The only limit is the 8KB total size of the Batch variables that store the sets of replacements.

Code: Select all

@echo off
setlocal EnableDelayedExpansion
set search=
set replace=
for /F "tokens=1,2 delims=:" %%a in (replacements.txt) do (
   set "search=!search!|%%a"
   set "replace=!replace!|%%b"
)
set "search=!search:~1!"
set "replace=!replace:~1!"
< theFile FindRepl =search /A =replace

You may also use an alternation to just Search a file for a long list of words.
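
The replacement function used by the /A switch is not part of the listing shown here, so the following is only one way to obtain a one-pass multi-string replacement with a direct table lookup, consistent with the description above. The names replaceMultiple and table are assumptions, and the alternatives are assumed to be literal strings without special regexp characters.

```javascript
// Sketch: replace several literal strings in one pass.
// replaceMultiple is a hypothetical name; both lists are separated by "|".
function replaceMultiple(text, searchAlt, replaceAlt) {
   var searchA = searchAlt.split("|"), replaceA = replaceAlt.split("|");
   var table = {};
   for (var i = 0; i < searchA.length; i++) table[searchA[i]] = replaceA[i];
   var search = new RegExp(searchAlt, "g");
   return text.replace(search, function (matched) {
      // Direct lookup: the cost does not depend on the number of alternatives.
      // Unknown keys are left unchanged here (a choice made for this sketch).
      return table.hasOwnProperty(matched) ? table[matched] : matched;
   });
}
```

A lookup keyed by the matched text also explains the /I limitation described above: a match such as "one" has no entry of its own in the table, so there is no replacement to insert for it.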


FindRepl.bat has no error detection of any kind; if you provide wrong parameters, the CScript run-time support will issue an error message. Although error detection code could easily be added, it would increase the size of the (already large) JScript program.

I encourage you to review the sites given in the previous links about the regular expression capabilities of the Windows Script Host. Mastery of this topic is the key to fully exploiting the capabilities of this and other similar utilities.

You are invited to use FindRepl.bat and report any problem or bug you may find. Be aware that the processing of very large files depends on the available memory, so if the file is huge, the program may take a long time to complete. I tested FindRepl.bat with a 10.5 MB file on my very old and limited Windows XP computer and got the right results after a couple of minutes.


Antonio
Last edited by Aacini on 15 Dec 2014 22:02, edited 16 times in total.

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: New regex utility to search and replace strings in files

#2 Post by Squashman » 26 Jun 2013 21:20

How about an option to have your search strings in a file like the /G option of FINDSTR?

On a side note: I have been meaning for years to figure out a better way to split a file based on the search strings in a FILE without having to use a FOR loop or use two separate FINDSTR commands.

Input
Inputfile.txt
Searchstrings.txt

Output
Match.txt (lines that match the search strings text file)
NoMatch.txt (doesn't match the search strings.)

Years ago I used a for loop to read the file in and then used FINDSTR and then checked the errorlevel. But this was extremely slow and clunky.

Always wondered if there was a better or faster way to do it.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: New regex utility to search and replace strings in files

#3 Post by foxidrive » 26 Jun 2013 21:59

That's impressive Aacini - it looks like a very powerful tool - and I only got 2/3 of the way through the great documentation before I ran out of steam. I'll come back to finish later.

Just two typos so far - this is missing findrepl.bat

< theFile.xml "<Category X>" /E:"</Category X>" /B:"Name"

and I think where you write arroba in English it is commonly known as an 'at sign'

One more point - the http links need to have spaces at the start and end, I think, otherwise they are plain text and not links.

Cheers

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: New regex utility to search and replace strings in files

#4 Post by Aacini » 27 Jun 2013 08:52

Squashman wrote:How about an option to have your search strings in a file like the /G option of FINDSTR?
@Squashman,

I didn't include this option because it would add even more code and resources (like FileSystemObject) to the already large program. A JScript program is compiled every time it is used, and FindRepl is intended to be used frequently with small files (the /G switch is not one of the most used options). Besides, loading the search strings from a file is easily done in Batch. If a /G option were included and the strings were used on several files, the strings file would be loaded again for each file (unless the program were modified to also process files by itself via wildcards). If the strings file is processed separately, you may use the resulting strings variable in several FindRepl executions.

For example, your program could be written this way:

Code: Select all

@echo off
setlocal EnableDelayedExpansion
rem FINDSTRINGS.BAT inputFile stringsFile
set search=
for /F "delims=" %%a in (%2) do set "search=!search!|%%a"
set "search=!search:~1!"
< %1 call FindRepl =search > Match.txt
< %1 call FindRepl =search /V > NoMatch.txt



foxidrive wrote:One more point - the http links need to have spaces at the start and end, I think, otherwise they are plain text and not links.
@foxidrive,

Thanks a lot for reporting the typos foxi, I fixed them...

In this forum the number of links in each post is limited to just 2, so in order to include more we need to disguise them in a way that the site engine doesn't detect them. The trick (that I borrowed from someone else, I don't remember who) is to enclose the double slashes between italic tags. Of course, you need to copy the link and paste it in the browser's address bar in order to use it...

Antonio

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: New regex utility to search and replace strings in files

#5 Post by Squashman » 27 Jun 2013 09:17

Aacini wrote:For example, your program could be written this way:

Code: Select all

< %1 call FindRepl =search > Match.txt
< %1 call FindRepl =search /V > NoMatch.txt

Antonio

Trying to avoid running it twice. I work with very large files. I guess I will have to get back into programming again.

Endoro
Posts: 244
Joined: 27 Mar 2013 01:29
Location: Bozen

Re: New regex utility to search and replace strings in files

#6 Post by Endoro » 27 Jun 2013 12:11

I miss a "/g:file" option in sed every day (or every other day).
I make do with unpleasant constructions such as

Code: Select all

sed -r "s#(.*)#/\1/d#" fileB | sed -f - fileA


It would be nice to have something better :)

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

Re: New regex utility to search and replace strings in files

#7 Post by brinda » 27 Jun 2013 19:02

Antonio,

thanks for providing this. Something to replace grep in batch + it has replace functions as well.

Now for reading the manual :D

thank you.

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: New regex utility to search and replace strings in files

#8 Post by foxidrive » 27 Jun 2013 21:06

Aacini wrote:
foxidrive wrote:One more point - the http links need to have spaces at the start and end, I think, otherwise they are plain text and not links.
@foxidrive,

In this forum the number of links in each post is limited to just 2, so in order to include more we need to disguise them in a way that the site engine doesn't detect them. The trick (that I borrowed from someone else, I don't remember who) is to enclose the double slashes between italic tags. Of course, you need to copy the link and paste it in the browser's address bar in order to use it...

Antonio


I hadn't realised there was limit - I see. ta.

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: New regex utility to search and replace strings in files

#9 Post by Aacini » 28 Jun 2013 09:11

@Squashman,

I wrote a small program that fulfills your specific needs:

Code: Select all

@if (@CodeSection == @Batch) @then

@echo off
if "%~2" neq "" if "%~1" neq "/?" goto begin
   echo Load strings from a file and search them in another file
   echo/
   echo FINDSTRINGS [/I] [/N1] [/N2] dataFile stringsFile
   echo/
   echo   /I        Specifies that the search is not to be case-sensitive.
   echo   /N1 /N2   Prints line numbers before lines that match/do not match.
   echo/
   echo Matching lines are printed in Stdout and non-matching lines in Stderr.
   goto :EOF
:begin
   CScript //nologo //E:JScript "%~F0" %*
   exit /B %errorlevel%

@end

// JScript section

var options = WScript.Arguments.Named,
    args    = WScript.Arguments.Unnamed,
    ignoreCase  = options.Exists("I")?"i":"",
    showNumber1 = options.Exists("N1"),
    showNumber2 = options.Exists("N2");

if ( args.Length < 2 ) WScript.Quit(1);

var fso = new ActiveXObject("Scripting.FileSystemObject"),
    file = fso.OpenTextFile(args.Item(1), 1),  // stringsFile, ForReading
    search = file.ReadLine().replace(/([][*+?^$.{}()|/\\])/g,"\\$1");
while ( ! file.AtEndOfStream ) {
   search += "|" + file.ReadLine().replace(/([][*+?^$.{}()|/\\])/g,"\\$1");
}
file.Close();
search = new RegExp(search,"g"+ignoreCase);
file = fso.OpenTextFile(args.Item(0), 1);  // dataFile, ForReading
while ( ! file.AtEndOfStream ) {
   var line = file.ReadLine();
   if ( line.search(search) >= 0 ) {
      if ( showNumber1 ) WScript.Stdout.Write((file.Line-1)+":");
      WScript.Stdout.WriteLine(line);
   } else {
      if ( showNumber2 ) WScript.Stderr.Write((file.Line-1)+":");
      WScript.Stderr.WriteLine(line);
   }
}
file.Close();
WScript.Quit(0);

For example:

Code: Select all

FINDSTRINGS Inputfile.txt Searchstrings.txt > Match.txt 2> NoMatch.txt


Enjoy! :mrgreen:

Antonio

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

Re: New regex utility to search and replace strings in files

#10 Post by dbenham » 28 Jun 2013 17:37

Very nice work Antonio. You have implemented many features on my wish list for a find utility, and added others I hadn't even thought of.

I'm not convinced that combining a search utility with a replace utility is the best option. I'm guessing that it might be easier to provide two specialized utilities that are cumulatively more powerful than a single combined utility. I'm thinking easier both from a coding standpoint, and from a documentation and usability standpoint. But that really is a guess - I haven't studied your code or done any design work.

One feature I miss in your utility is the lost ability to read one line at a time. It is not uncommon to have a chain of pipes, and your use of ReadAll() serializes that step in the chain. The next pipe can't begin until your process is complete. ReadAll() also is less than ideal for really large files. I wonder what it would take to restructure things to read and process one line at a time whenever possible.
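
To make the idea concrete, a line-at-a-time filter might look like the sketch below. It is not taken from either utility: filterLines is a hypothetical name, and the input argument is assumed to be any object with the AtEndOfStream property and ReadLine() method of a TextStream, such as WScript.StdIn under CScript.

```javascript
// Sketch: process a stream one line at a time instead of using ReadAll().
// filterLines is a hypothetical name; writeLine receives each matching line.
function filterLines(input, search, writeLine) {
   var matched = 0;
   while (!input.AtEndOfStream) {
      var line = input.ReadLine();
      search.lastIndex = 0;            // harmless without "g", needed with it
      if (search.test(line)) {
         writeLine(line);              // output is available immediately
         matched++;
      }
   }
   return matched;
}

// Possible use under CScript (assumption, not tested here):
// filterLines(WScript.StdIn, /pattern/, function (l) { WScript.StdOut.WriteLine(l); });
```

Features that need to see the whole file, such as blocks counted from the end or regexps that span line delimiters, could not be handled this way and would still require the complete text in memory.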

One feature I particularly like is your option to only print modified lines when doing a replace. I borrowed that idea and added it to my REPL.BAT utility.


Dave Benham

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

Re: New regex utility to search and replace strings in files

#11 Post by brinda » 30 Jun 2013 23:47

Antonio,

this is great. No need to get permission on using downloaded exe from web to use for potential virus etc.

was using grep from http://unxutils.sourceforge.net/. Very little knowledge on regex

sample on grep + repl

Code: Select all

for %%g IN (*%%f_LN*) do (
    ..\grep -B 7 -P "Check Voltage is" "%%g" |findstr /v /c:"Check Voltage is" | ..\repl "\r|\n" "" M >> ..\test.htm
)


replace by findrepl

Code: Select all

for %%g IN (*%%f_LN*) do (
< "%%g" ..\findrepl  "Check Voltage is" /O:-7:0 |findstr /v /c:"Check Voltage is" | ..\findrepl "\r\n" "" >> ..\test.htm
)


Not only that, I could not see any noticeable difference when running through 58 files, each less than 1 MB in size, on Windows 2000 SP4.

Thanks again for this.

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: New regex utility to search and replace strings in files

#12 Post by Squashman » 01 Jul 2013 05:35

brinda wrote:No need to get permission on using downloaded exe from web to use for potential virus etc.

:?: :?: :?:

brinda
Posts: 78
Joined: 25 Apr 2012 23:51

Re: New regex utility to search and replace strings in files

#13 Post by brinda » 01 Jul 2013 06:15

Squashman wrote:
brinda wrote:No need to get permission on using downloaded exe from web to use for potential virus etc.

:?: :?: :?:


squashman,

At my workplace they do not allow grep.exe or anything else to be downloaded for fear of viruses, even though an anti-virus is in place; that is the policy. A lot of proof is needed before a downloaded program gets approved, and even then, if anything goes wrong while using it, the fault lies with the user who downloaded it. Sorry, I left a word out. :)

Squashman
Expert
Posts: 4486
Joined: 23 Dec 2011 13:59

Re: New regex utility to search and replace strings in files

#14 Post by Squashman » 01 Jul 2013 09:56

brinda wrote:
Squashman wrote:
brinda wrote:No need to get permission on using downloaded exe from web to use for potential virus etc.

:?: :?: :?:


squashman,

At my workplace they do not allow grep.exe or anything else to be downloaded for fear of viruses, even though an anti-virus is in place; that is the policy. A lot of proof is needed before a downloaded program gets approved, and even then, if anything goes wrong while using it, the fault lies with the user who downloaded it. Sorry, I left a word out. :)

Then your English is a bit broken. You basically said you did not need permission to download executable files from the web.

Aacini
Expert
Posts: 1914
Joined: 06 Dec 2011 22:15
Location: México City, México
Contact:

Re: New regex utility to search and replace strings in files

#15 Post by Aacini » 02 Jul 2013 04:51

brinda wrote:replace by findrepl

Code: Select all

for %%g IN (*%%f_LN*) do (
< "%%g" ..\findrepl  "Check Voltage is" /O:-7:0 |findstr /v /c:"Check Voltage is" | ..\findrepl "\r\n" "" >> ..\test.htm
)


@brinda,

If I understood your code correctly, you first get a block of 8 lines that ends at the "Check Voltage is" line (with the first FindRepl), and then eliminate the bottom line (with findstr /v). In this case the findstr is not necessary; just get the appropriate block of lines directly with FindRepl (from -7 to -1):

Code: Select all

for %%g IN (*%%f_LN*) do (
< "%%g" ..\findrepl  "Check Voltage is" /O:-7:-1 | ..\findrepl "\r\n" "" >> ..\test.htm
)
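
The offset arithmetic behind /O:-7:0 versus /O:-7:-1 can be sketched in a few lines. This is only an illustration of the idea and not FindRepl.bat's own code: the name selectBlocks is hypothetical, and the sketch assumes that each match produces its block independently and that offsets falling outside the file are simply clipped; the real program may treat overlapping blocks or file edges differently.

```javascript
// Sketch only: selects lines by offset relative to each matching line.
// selectBlocks is a hypothetical name; this is not FindRepl.bat's code.
// from/to are offsets from the matching line (0 = the matching line itself).
function selectBlocks(lines, regex, from, to) {
  var result = [];
  for (var i = 0; i < lines.length; i++) {
    if (!regex.test(lines[i])) continue;               // not a matching line
    var first = Math.max(i + from, 0);                 // clip at start of file
    var last = Math.min(i + to, lines.length - 1);     // clip at end of file
    for (var j = first; j <= last; j++) result.push(lines[j]);
  }
  return result;
}

var sample = ["h1", "h2", "h3", "Check Voltage is 5", "tail"];

// Offsets -2..0 keep the matching line at the bottom of the block:
var withMatch = selectBlocks(sample, /Check Voltage is/, -2, 0);
// ["h2", "h3", "Check Voltage is 5"]

// Offsets -2..-1 stop one line earlier, so no findstr /v step is needed:
var withoutMatch = selectBlocks(sample, /Check Voltage is/, -2, -1);
// ["h2", "h3"]
```

With offsets -7 and -1 the same arithmetic yields the seven lines that precede each "Check Voltage is" line.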


I slightly modified the FindRepl.bat program so that it can also eliminate the end-of-line delimiters when the /B:rBlock is "\r\n" and the sReplace string is "". If you use the modified version of this program (just copy it again from above), you may solve your problem this way:

Code: Select all

for %%g IN (*%%f_LN*) do (
< "%%g" ..\findrepl  "Check Voltage is" /O:-7:-1 /B:"\r\n" "" >> ..\test.htm
)


Antonio
