
How can I remove trailing space(s) between text and CR/LF ?

Posted: 25 Dec 2011 03:25
by alan_b
Happy Christmas everyone.

I am writing a script to process lines of text from multiple contributors.
I find that some have left a trailing space after the text and before the CR/LF,
and this is no problem for my script but may cause problems with the final application that uses what I process.

Do I just need FINDSTR with the option "xyz\> Word position: end of word"
and if so what is the command I should invoke ?

If it can also "DOSIFY" (with CR/LF) a file that includes some UNIX (LF) style line terminators, that would be a bonus,
because at the moment I first have to use MORE to de-Unix the input file.

I can use SET /P with something like this :-
if "!ln:~-1!"==" " (
  REM Found one or more trailing spaces - need to remove them ...
)

With enough effort and experiments I could knock out one trailing space at a time,
and then a trivial but tedious adaptation to keep iterating until that line has no more spaces to be stripped,
and then on to the next line, but then do I have to iterate or can all trailing spaces be done at once ?
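
For reference, the one-space-at-a-time approach described above might look like the following minimal sketch. The variable name ln follows the snippet above; the sample value and the :trim label are assumptions made up for illustration.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample line with three trailing spaces.
set "ln=some text   "

:trim
rem Drop one trailing space per pass until none is left.
if defined ln if "!ln:~-1!"==" " (
  set "ln=!ln:~0,-1!"
  goto :trim
)

echo [!ln!]
```

Note that GOTO breaks out of a surrounding FOR block, so inside a line-reading FOR loop this logic would have to live in a CALLed subroutine instead.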

Regards
Alan

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 08:21
by dbenham
Ho Ho Ho :)

I don't see how FINDSTR can help.

If you are already comfortable using SET /P to read your file, then you can exploit the fact that it strips trailing control characters as a useful feature.

Code:


@echo off
setlocal enableDelayedExpansion

set "test=%~1"
echo before="!test!"


::convert <space> to <lf>
if defined test set ^"test=!test: =^

!^"


::strip trailing control characters
echo(!test!>temp.txt
set "test="
<temp.txt set/p"test="


::convert remaining <lf> back to <space>
if defined test set ^"test=!test:^

= !^"


echo  after="!test!"

a few test results:

Code:

C:\test>rtrim "Hello world!    "
before="Hello world    "
 after="Hello world"

C:\test>rtrim ""
before=""
 after=""

C:\test>rtrim "     "
before="     "
 after=""

C:\test>


I haven't tested performance of this vs a loop that trims one space at a time.

With this solution I suppose you will end up writing the file 3 times :!: :(

1 - convert from unix to Windows using MORE
2 - read using SET /P and write with <space> converted to <lf>
3 - read using SET /P and write with <lf> converted back to <space>


Dave Benham

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 10:41
by aGerman
Merry Christmas everyone!

@Dave
Good idea, but my first tests showed that single carets are removed :(

@Alan
To be honest I would avoid Batch in this case. Other languages could solve your LF problem and the right-trimming at once. Tell me if you would be interested.
However, this can remove up to 1023 trailing spaces (512+256+...+1) in only 10 loop iterations:

Code:

@echo off

:: Predefine variables and the macro $RTrim
setlocal DisableDelayedExpansion
  :: LineFeed
set LF=^


set ^"\n=^^^%LF%%LF%^%LF%%LF%^^"&rem TWO EMPTY LINES ABOVE REQUIRED!
  :: create variable containing 512 spaces
setlocal EnableDelayedExpansion
set "spcs=                "
for /l %%i in (1 1 5) do (
  set "spcs=!spcs!!spcs!"
)
endlocal &set "spcs=%spcs%"
  :: macro for right-trim
set $RTrim=(%\n%
  set /a "k=512"%\n%
  set "spc=%spcs%"%\n%
  for /l %%j in (1 1 10) do (%\n%
    for %%k in (!k!) do (%\n%
      if "!LN:~-%%k!"=="!spc!" (%\n%
        set "LN=!LN:~0,-%%k!"%\n%
      )%\n%
    )%\n%
    set /a "k/=2"%\n%
    for %%k in (!k!) do set "spc=!spc:~%%k!"%\n%
  )%\n%
)

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: Test
set "file=test.txt"
setlocal EnableDelayedExpansion
<"%file%" (
  for /f %%n in ('type "%file%"^|find /c /v ""') do (
    for /L %%i in (1 1 %%n) do (
      set "LN=" &set /p "LN="
      echo "!LN!"&rem BEFORE
      %$RTrim%
      echo "!LN!"&rem AFTER
      echo(
    )
  )
)
endlocal
pause


Regards
aGerman

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 11:49
by dbenham
aGerman wrote:@Dave
Good idea, but my first tests showed that single carets are removed :(
:?: :?
I like your RTRIM macro much better, but I don't see how my proposal strips carets.
C:\test>rtrim "hello^world "
before="hello^world "
after="hello^world"
The quotes in the argument list are just to preserve the caret until it gets into a variable. But once any character is in a variable, it seems to me that the technique should preserve it properly.

I really do like your RTRIM macro. It seems like it could be extended rather easily to support the maximum variable size and converted into a general purpose callable macro (or function)

Dave Benham

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 12:37
by alan_b
Thanks for your efforts.

I am writing a script to assist another engineer in his task,
and till now I would have been able to explain any part of the script that he queried.

I am afraid that Dave's code is beyond me tonight, but perhaps I will understand after a good night's sleep.

Unfortunately aGerman's code gives me even more difficulty, starting with the very first ^ in the line
set ^"\n=^^^%LF%%LF%^%LF%%LF%^^"&rem TWO EMPTY LINES ABOVE REQUIRED!

I can only guess that the first ^ has some relationship to the previous two empty lines.

I have decided to give up,
and instead will produce not only the final text file but also a variant that encapsulates each line with braces {text and stuff}
and produce an error report on lines that need fixing with

Code:

FIND " }" < %file%


Thanks for your efforts, but I am not man enough to use them

Regards
Alan

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 12:55
by aGerman
dbenham wrote:I like your RTRIM macro much better, but I don't see how my proposal strips carets.

Hmm ... that's what I get:

Code:

> rtrim "hello^ world!     "
before="hello world     "
 after="hello world"

> rtrim "hello^world!     "
before="helloworld     "
 after="helloworld"

>

dbenham wrote:I really do like your RTRIM macro. It seems like it could be extended rather easily to support the maximum variable size and converted into a general purpose callable macro (or function)

So do I :wink:


alan_b wrote:Unfortunately aGerman's code gives me even more difficulty, starting with the very first ^ in the line
set ^"\n=^^^%LF%%LF%^%LF%%LF%^^"&rem TWO EMPTY LINES ABOVE REQUIRED!

I can only guess that the first ^ has some relationship to the previous two empty lines.

Yes and no. %LF% expands to a real linefeed character (and the empty lines are necessary to create it). Imagine what it would look like if there were linefeeds instead of %LF%: the result is a multiline set command. For that reason you have to escape the quotation marks.
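
A minimal sketch of the linefeed trick on its own, assuming delayed expansion is enabled; the variable name twoLines is made up for illustration and is not part of the macro above.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Define LF as a single linefeed character.
rem The two empty lines after the caret are required.
set LF=^


rem With delayed expansion, !LF! puts a real line break inside the value.
set "twoLines=first!LF!second"
echo !twoLines!
```

The echo should print "first" and "second" on separate lines, which shows that the variable really holds a line break.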

alan_b wrote:I have decided to give up

:( Only because of the code? Is there anything else I should explain?

Regards
aGerman

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 13:48
by dbenham
aGerman wrote:Hmm ... that's what I get:

Code:

> rtrim "hello^ world!     "
before="hello world     "
 after="hello world"

> rtrim "hello^world!     "
before="helloworld     "
 after="helloworld"

>



Try this. The command-line games have nothing to do with the technique. They are just there to get the desired value into a variable.

Code:

C:\test>rtrim "hello^^world^!     "
before="hello^world!     "
 after="hello^world!"


Dave Benham

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 14:21
by alan_b
aGerman wrote:
alan_b wrote:I have decided to give up

:( Only because of the code? Is there anything else I should explain?
Regards
aGerman

I have no criticism of your explanation.
The problem is my comprehension.

Unfortunately I have excelled at the use of BAT in command.com to such an extent that I naturally think of those techniques and limitations,
and when I see an escalating series of ^escape codes thus ^^^^^
it feels as though I am diving into a series of rabbit holes towards the Mad Hatter's Tea Party.

Unfortunately at least 50% of my initial experiments with SET /P have instantly killed CMD.EXE before it ever got to "PAUSE".

I was happy during my career to create and use, with well defined and documented rules :-
#DEFINE procedures in ANSI 'C' code; and
MACRO procedures in assembler.

MACRO procedures in BATCH code are a new game with no defined rules, just exceptions waiting to trip me up :twisted:
Sorry, but I can only cope with "baby steps" at the moment - my comfort zone is crawling on hands and knees :D

Regards
Alan

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 14:34
by aGerman
@Dave,
of course that works.
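Swapping the lines setlocal enableDelayedExpansion and set "test=%~1" would fix it as well, because the argument is then stored before delayed expansion can strip the ^ and ! characters.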

@Alan,
a "macro" is nothing but a piece of code in an environment variable in this case :wink:
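
As a minimal sketch of that idea (the variable name sayHello is hypothetical and not taken from the scripts in this thread):

```batch
@echo off
rem Store a command in an ordinary environment variable ...
set "sayHello=echo Hello from a macro"
rem ... and run it by expanding the variable where a command is expected.
%sayHello%
```

The $RTrim macro works the same way, except that its value spans several lines.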
To make things easier to understand:

Code:

@echo off
setlocal EnableDelayedExpansion

set "file=test.txt"

set "spcs=                "
for /l %%i in (1 1 5) do (
  set "spcs=!spcs!!spcs!"
)

<"%file%" (
  for /f %%n in ('type "%file%"^|find /c /v ""') do (
    for /L %%i in (1 1 %%n) do (
      set "LN=" &set /p "LN="
      echo "!LN!"&rem BEFORE

      set /a "k=512"
      set "spc=%spcs%"
      for /l %%j in (1 1 10) do (
        for %%k in (!k!) do (
          if "!LN:~-%%k!"=="!spc!" (
            set "LN=!LN:~0,-%%k!"
          )
        )
        set /a "k/=2"
        for %%k in (!k!) do set "spc=!spc:~%%k!"
      )

      echo "!LN!"&rem AFTER
      echo(
    )
  )
)

pause


Regards
aGerman

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 16:20
by alan_b
Thanks

It is easier to understand now I see the macro code expanded.

Life is so unfair :-
Why does "!LN:~-%%k!" work for you, but not for me ?
i.e.

Code:

      for /f %%n in ("!LN!") do (
        FOR /L %%r in (!END_%%n:~0,4!,1,!END_%%n:~-4!) DO (
          ECHO %%r {!ST_%%r!}
        )
      )

The above failed and had to be replaced with:-

Code:

      for /f %%n in ("!LN!") do (
        SET /A LO=!END_%%n:~0,4! & SET /A HI=!END_%%n:~-4!
        FOR /L %%r in (!LO!,1,!HI!) DO (
          ECHO %%r {!ST_%%r!}
        )
      )

Apparently CMD.EXE refuses to evaluate within "FOR /L ..." the variable boundaries held by the variable !END_%%n!
CMD.EXE however will evaluate within an "if" test your apparently similar "!LN:~-%%k!"

Code:

        for %%k in (!k!) do (
          if "!LN:~-%%k!"=="!spc!" (
            set "LN=!LN:~0,-%%k!"
          )
        )


Every time I think I understand what will either work or fail,
I seem to then observe that you and other experts here can do the impossible.

It could also be significant that it is past my bed time.
Perhaps I will learn a bit more tomorrow.

Regards
Alan

Re: How can I remove trailing space(s) between text and CR/L

Posted: 25 Dec 2011 18:33
by aGerman
Hi Alan,

you know that a comma is parsed as a separator inside the parentheses. All you have to do is escape it in the variable expression:

Code:

        FOR /L %%r in (!END_%%n:~0^,4!,1,!END_%%n:~-4!) DO (


Regards
aGerman

Re: How can I remove trailing space(s) between text and CR/L

Posted: 26 Dec 2011 01:28
by alan_b
aGerman wrote:Hi Alan,

you know that a comma is parsed as a separator inside the parentheses. All you have to do is escape it in the variable expression:

Code:

        FOR /L %%r in (!END_%%n:~0^,4!,1,!END_%%n:~-4!) DO (


Regards
aGerman

Thanks, that is one of the many things I did NOT know.
I will try that when I go back off-line.

Re: How can I remove trailing space(s) between text and CR/L

Posted: 26 Dec 2011 06:41
by aGerman
Hi Alan.

alan_b wrote:Thanks, that is one of the many things I did NOT know.

I don't believe you :lol: You already used commas as separators.
(just kidding)

To shed some light on it:
When CMD parses the expression inside the parentheses, there are basically two things it has to do:
- expand the variables
- separate it into tokens (the 3 values for Start, Step and End in this case)
The question is when are the variables expanded, before or after the CMD separated the tokens? EnableDelayedExpansion is the hint I could give you :wink:
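
A self-contained sketch of the escaped-comma fix, assuming a made-up variable END_x whose first four digits are the start value and whose last four digits are the end value:

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical data: first four digits = start, last four = end.
set "END_x=10031007"

rem The caret keeps the parser from splitting at the comma inside the
rem substring expression before delayed expansion takes place.
for %%n in (x) do (
  for /L %%r in (!END_%%n:~0^,4!,1,!END_%%n:~-4!) do echo %%r
)
```

With these assumed values the loop should count from 1003 to 1007. Without the caret, the comma in ~0,4 is treated as a list separator before the variable is expanded.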

One of the most interesting and important explanations I recommend is what jeb worked out. See:
http://stackoverflow.com/questions/4094 ... 33#4095133

Regards
aGerman

Re: How can I remove trailing space(s) between text and CR/L

Posted: 26 Dec 2011 10:42
by alan_b
What I should have said was that I never knew that a comma might need a ^escape within the parentheses.

Thanks for that information, your code fix worked for me.

Thanks for the link to what jeb worked out.
I just wish it was easier for me to understand.

Regards
Alan