file and text manipulation.....

Posted: 04 Jul 2011 11:57
by amichaelglg
I need to search through a text file and write to a new file every string that starts with either E:\ or M:\, up to (but not including) the next # sign or space.
Example current file:
dsffdfe:\abc\cde rffrv m:\abc\efd#
dfe:\abc\cde rffrv m:\abc\efd#m:\abc\ggg.lst # e:\abc\hhh

Output should be:
e:\abc\cde
m:\abc\efd
e:\abc\cde
m:\abc\efd
m:\abc\ggg.lst
e:\abc\hhh
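
For comparison only, here is a minimal Python sketch of the extraction rule described above. It is not part of the batch solution in this thread. The names `PATH_RE`, `extract_paths` and `sample` are made up for illustration, and it assumes a line break also ends a path, which the request does not state.

```python
import re

# e:\ or m:\ in any letter case, followed by everything up to
# the next space, '#', or line break (line break is an assumption).
PATH_RE = re.compile(r"[em]:\\[^ #\r\n]*", re.IGNORECASE)


def extract_paths(text):
    """Return every matching run found in text, in order of appearance."""
    return PATH_RE.findall(text)


# The two example lines from the post above.
sample = (
    "dsffdfe:\\abc\\cde rffrv m:\\abc\\efd#\n"
    "dfe:\\abc\\cde rffrv m:\\abc\\efd#m:\\abc\\ggg.lst # e:\\abc\\hhh\n"
)

for path in extract_paths(sample):
    print(path)
```

Run against the two example lines, this prints the six paths listed under "Output should be", in the same order.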

And if that wasn't bad enough... I would then like to run the output file through some kind of file/folder checker, i.e. to see whether any of the files or folders listed do not exist. Most of them should exist.
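
The existence check can be sketched the same way. Again this is a Python illustration rather than batch, and `split_by_existence` is a hypothetical helper name. It assumes the paths have already been extracted into a list and that files and folders are treated alike.

```python
import os


def split_by_existence(paths):
    """Return (existing, missing) lists, keeping the input order."""
    existing, missing = [], []
    for path in paths:
        # os.path.exists is true for files and for folders.
        if os.path.exists(path):
            existing.append(path)
        else:
            missing.append(path)
    return existing, missing


# Demo with one path that exists and one made-up name that should not.
here = os.getcwd()
bogus = os.path.join(here, "no_such_entry_for_demo_0001")
found, not_found = split_by_existence([here, bogus])
print("exists:    ", found)
print("not exists:", not_found)
```

Each of the two lists could then be written to its own file, one path per line.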

Many thanks...in anticipation.

Re: file and text manipulation.....

Posted: 04 Jul 2011 12:39
by Cleptography
I am sorry and I don't care but this kind of parsing is just pointless and retarded using cmd. Such a waste of time. Please use a tool that can handle this sort of task easily, like vbscript or powershell.

Re: file and text manipulation.....

Posted: 04 Jul 2011 13:55
by dbenham
I split the output into 2 files - one for existing paths and one for non-existing paths.
The code has been tested with your "example current file", but no other testing was done:

Code: Select all

@echo off
setlocal disableDelayedExpansion

set inFile="input.txt"
set existsFile="exists.txt"
set notExistsFile="notExists.txt"

if exist %existsFile% del %existsFile%
if exist %notExistsFile% del %notExistsFile%

set lf=^


::Do not remove above 2 blank lines!
set ForEntireLine=^^^"eol^^=^^^%lf%%lf%^%lf%%lf%^^ delims^^=^^^"

set "out="
for /f %ForEntireLine% %%a in ('findstr /il /c:"e:\\" /c:"m:\\" %inFile%') do (
  set "ln=%%a"
  setlocal enableDelayedExpansion
  for %%l in ("!lf!") do (
    set "ln=!ln: =%%~l!"
    set "ln=!ln:#=%%~l!"
    for /f %ForEntireLine% %%b in ("!ln!") do (
      setlocal disableDelayedExpansion
      set "ln=%%b"
      setlocal enableDelayedExpansion
      if "!ln:e:\=!" neq "!ln!" (
        set "out=e:\!ln:*e:\=!"
      ) else if "!ln:m:\=!" neq "!ln!" (
        set "out=m:\!ln:*m:\=!"
      )
      if defined out (
        if exist !out! (echo !out!>>%existsFile%) else echo !out!>>%notExistsFile%
      )
      endlocal
      endlocal
    )
  )
  endlocal
)

2011-07-05 bug fix: out -> !out! in EXIST test

Dave Benham

Re: file and text manipulation.....

Posted: 04 Jul 2011 15:11
by orange_batch
Cleptography wrote:I am sorry and I don't care but this kind of parsing is just pointless and retarded using cmd. Such a waste of time. Please use a tool that can handle this sort of task easily, like vbscript or powershell.

This isn't as hard as it may sound, but dbenham solved it so whatev. 8)

Try building a function to retrieve unicode folder names with exception for duplicate unicode character lengths not including uniquely identifying ASCII like I did. Now that's a challenge. :wink:

Re: file and text manipulation.....

Posted: 04 Jul 2011 15:54
by Cleptography
orange_batch wrote:Try building a function to retrieve unicode folder names with exception for duplicate unicode character lengths not including uniquely identifying ASCII like I did. Now that's a challenge. :wink:

@Orange
....and a bloody waste of time.

@Dave
You are introducing weird syntax to a 9 poster that makes no sense. Should provide an explanation of your code or at least a link to the non standard methods used. Nice script by the way. I am still puzzled and blown away by what you and Ed have managed to accomplish.

Re: file and text manipulation.....

Posted: 04 Jul 2011 16:09
by amichaelglg
thanks I will try this out tomorrow and....yes I am lost at what I am looking at and impressed at the same time. :)

Someone did mention doing this in Perl, but it's not something I know. I thought I might understand something, but....

Re: file and text manipulation.....

Posted: 04 Jul 2011 16:34
by Ed Dyreen

Cleptography wrote:Should provide an explanation of your code or at least a link to the non standard methods used.
Batch "macros" with arguments
http://www.dostips.com/forum/viewtopic.php?f=3&t=1827

Re: file and text manipulation.....

Posted: 04 Jul 2011 16:39
by dbenham
My code is mostly vanilla with a few tricks:

1) lf = <LF> = line feed: Usually accessed via delayed expansion

2) ForEntireLine: a little macro that sets the FOR /F options to preserve the entire line. When expanded it sets EOL=<LF> and disables DELIMS by setting it to nothing. This guards against lines starting with ; for example, which FOR /F would otherwise skip because ; is the default EOL character.

3) Substituting <LF> for # and <space>. This breaks up the line into multiple lines for the 2nd FOR statement.

4) Using a FOR loop to put the <LF> character into a FOR variable so we can use it in the substitution expression.

5) Repeatedly switching between enabled and disabled delayed expansion so as to preserve any ! and/or ^ that might be in the text file contents. Both are valid in DOS file/directory names.
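
The splitting logic in points 3 and 4 can be mirrored outside of batch. The following is a rough Python sketch under that reading, not a translation of the script: `extract_like_batch` is a hypothetical name, and the delayed-expansion handling from point 5 has no counterpart here. As in the batch code, each piece is checked for e:\ first and then m:\, everything before the marker is dropped, and the marker is put back in lowercase.

```python
def extract_like_batch(line):
    """Split on space and '#', then keep the e:\\ or m:\\ tail of each piece."""
    # Points 3 and 4: turn every space and '#' into a line feed,
    # so that one input line becomes several short lines.
    pieces = line.replace(" ", "\n").replace("#", "\n").split("\n")

    results = []
    for piece in pieces:
        lowered = piece.lower()  # batch search/replace ignores case
        for marker in ("e:\\", "m:\\"):
            pos = lowered.find(marker)
            if pos != -1:
                # Like !ln:*e:\=! : drop everything up to and including
                # the first marker, then prefix the marker again.
                results.append(marker + piece[pos + len(marker):])
                break
    return results


for item in extract_like_batch("dsffdfe:\\abc\\cde rffrv m:\\abc\\efd#"):
    print(item)
```

Pieces that contain neither marker, including the empty pieces left by adjacent delimiters, simply produce no output.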

Dave Benham

Re: file and text manipulation.....

Posted: 05 Jul 2011 02:08
by amichaelglg
might have to read that a few times to begin to understand it.

One small thing after running the script: it does manage to extract all the files and folders into a list, but they all end up in the notExists.txt file, and I know for sure some do exist.

I appreciate very much the help.

Re: file and text manipulation.....

Posted: 05 Jul 2011 05:19
by dbenham
amichaelglg wrote:One small thing after running the script: it does manage to extract all the files and folders into a list, but they all end up in the notExists.txt file, and I know for sure some do exist.

I edited the code in my original post and it is all fixed. I forgot to wrap OUT in !...! in the IF EXIST test, so the variable was never expanded and the test looked for a path literally named "out".

Dave Benham

Re: file and text manipulation.....

Posted: 05 Jul 2011 06:26
by amichaelglg
All good now.

Dave, you're a star - thanks again

Re: file and text manipulation.....

Posted: 05 Jul 2011 15:20
by orange_batch
Cleptography wrote:@Orange
....and a bloody waste of time.

I find it's a break-through innovation I came up with as far as DOS goes. 8) When processing multiple paths, it allows scripts to still work with unicode.

But Dave, Clept is right about macros. People won't learn anything if you're giving those solutions, it's an interesting and useful technique but it's useless for most people and totally obfuscating.

Re: file and text manipulation.....

Posted: 05 Jul 2011 16:28
by Acy Forsythe
*Raises Hand*

I'm learning from it...

I do agree that it's not helpful if you are just learning batch files, it does make them seem more complicated than they already are, but in trying to parse through Dave's scripts to see what he was doing, why and how, I learned a lot, mostly I learned just how much I didn't already know.

Re: file and text manipulation.....

Posted: 05 Jul 2011 16:32
by dbenham
orange_batch wrote:But Dave, Clept is right about macros. People won't learn anything if you're giving those solutions, it's an interesting and useful technique but it's useless for most people and totally obfuscating.

I agree indiscriminate use of macros is counter-instructive. But simple macros without arguments, where code fragments are stored in a variable, are pretty basic for this site.

In this case I only used one simple macro, and I fail to see how

Code: Select all

for /F ^"eol^=^

delims^=^" %%a in ...
is any less obfuscated than

Code: Select all

set lf=^


set ForEntireLine=^^^"eol^^=^^^%lf%%lf%^%lf%%lf%^^ delims^^=^^^"
for /F %ForEntireLine% %%a in ...
Especially when the technique is needed in two places. At least the ForEntireLine macro name gives an indication of its purpose.

For a discussion of FOR /F EOL problems and solutions - see this thread, starting with the 5th post

Buried in the url that Ed posted is jeb's initial derivation of a macro form of the technique.

I think the real "problem" is that DOS limitations make it inherently tricky to write a solution that can handle any valid file or directory name, along with text file lines that could start with any character. Hence Cleptography's initial response.

Dave Benham