split string into substrings based on delimiter

Sponge Belly
Posts: 231
Joined: 01 Oct 2012 13:32
Location: Ireland

split string into substrings based on delimiter

#1 Post by Sponge Belly » 02 May 2015 09:17

Hi All! :)

The syntax for extracting everything after the first occurrence of a substring is well known:

Code: Select all

set "tail=%str:*x=%"


And there’s a kludgy way to get the start of a string up to the first occurrence of the substring:

Code: Select all

set "head=%str:x=" & rem."%"
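
For anyone who wants to try both idioms side by side, here is a minimal sketch. The sample value and the "-" delimiter are my own assumptions, and it uses the plain "rem" spelling rather than "rem.":

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Hypothetical sample value; "-" is the substring searched for.
set "str=2015-05-02"

rem Everything after the first "-"
set "tail=%str:*-=%"

rem Everything before the first "-": the line expands to
rem   set "head=2015" & rem "05" & rem "02"
set "head=%str:-=" & rem "%"

echo head=%head%
echo tail=%tail%
```

This should print head=2015 and tail=05-02.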


I was messing around with the latter when I stumbled across the following:

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion

set "x=monotonous"
set "x1=%x:o=" & set "x2=%"
set x

endlocal & goto :eof


Var x1 contains: m, and x2 ends up with: us. From the last occurrence of the substring to the end of string, in other words. 8)

All the usual caveats apply, of course. The search substring is matched case-insensitively, but the replacement string is used exactly as written. Quotes must be doubled. Percent signs, tildes, asterisks and equal signs must be encoded. And it only works with %-variables.

But there’s more. Run my little snippet again with echo on. The x2 var is set four times, each time with the contents of the substring between the previous occurrence of the letter o and the next one. :shock:
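
To see why, it may help to write out by hand what the line looks like after the percent expansion. This is only a sketch of the equivalent commands, not the exact text cmd echoes:

```batch
@echo off
setlocal EnableExtensions DisableDelayedExpansion

rem Each o in monotonous has been replaced by the text between = and the
rem closing percent sign, which closes one set command and opens the next.
set "x1=m" & set "x2=n" & set "x2=t" & set "x2=n" & set "x2=us"

rem x2 was overwritten three times, so only the last assignment survives.
echo x1=%x1%
echo x2=%x2%
```

This should print x1=m and x2=us, the same values the original snippet ends up with.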

BFN!

- SB
Last edited by Sponge Belly on 03 May 2015 08:23, edited 2 times in total.

Aacini
Expert
Posts: 1913
Joined: 06 Dec 2011 22:15
Location: México City, México

Re: from last occurrence to end of string

#2 Post by Aacini » 02 May 2015 09:32

I like it! :D

This reminds me of the good old times, when interesting Batch discoveries were frequently made...

EDIT: THIS WORKS! :shock:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set i=1
set "x=monotonous"
set "x!i!=%x:o=" & set /A i+=1 & set "x!i!=%"
set x

Output:

Code: Select all

x=monotonous
x1=m
x2=n
x3=t
x4=n
x5=us
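
The same idea with a different delimiter, plus a loop over the resulting elements. This is just a sketch; the list value and the names "list", "part" and "i" are made up for the example:

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical input: a semicolon-separated list.
set "list=red;green;blue"

rem Split on ";" and store the pieces in part1, part2, ...
set i=1
set "part!i!=%list:;=" & set /A i+=1 & set "part!i!=%"

rem i now holds the number of pieces, so the elements can be walked with FOR /L.
for /L %%n in (1,1,%i%) do echo part%%n=!part%%n!
```

It should print part1=red, part2=green and part3=blue, one per line.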


SB: Perhaps you should change the topic title to "Split string in all substrings separated by a delimiter!" 8)

Antonio

Re: split string into substrings based on delimiter

#3 Post by Sponge Belly » 02 May 2015 13:25

Hi Aacini,

Clever use of delayed expansion and set /a! 8)

Your method of storing all the intermediary results was so obvious when I read the example… so why didn’t I think of it myself? :cry:

Anyways, I changed the subject line as you suggested.

Laters!

- SB
Last edited by Sponge Belly on 03 May 2015 17:39, edited 1 time in total.

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: split string into substrings based on delimiter

#4 Post by aGerman » 03 May 2015 07:13

Great find :) I never would have believed we could make use of the internal iterations that happen during text replacement.

Regards
aGerman

carlos
Expert
Posts: 503
Joined: 20 Aug 2010 13:57
Location: Chile

Re: split string into substrings based on delimiter

#5 Post by carlos » 03 May 2015 18:27

Great discovery. Thanks for sharing it.
How was it found?

npocmaka_
Posts: 516
Joined: 24 Jun 2013 17:10
Location: Bulgaria

Re: split string into substrings based on delimiter

#6 Post by npocmaka_ » 04 May 2015 01:45

nice!

Re: split string into substrings based on delimiter

#7 Post by Aacini » 04 May 2015 01:46

The modification below makes it possible to "Replace each occurrence of the substring with a different string from a series":

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR

set i=0
set "x=monotonous"
set "x2=%x:o=" & set /A i+=1 & call set "x2=!x2!!p!r!i!!p!%"
set x

At the end, x2 contains mONEnTWOtTHREEnFOURus. 8)

Antonio

Re: split string into substrings based on delimiter

#8 Post by npocmaka_ » 04 May 2015 01:52

carlos wrote: Great discovery. Thanks for sharing it.
How was it found?

The &rem trick is comparatively old. I think I saw it first here - viewtopic.php?t=194 and here viewtopic.php?f=3&t=381

jeb
Expert
Posts: 1055
Joined: 30 Aug 2007 08:05
Location: Germany, Bochum

Re: split string into substrings based on delimiter

#9 Post by jeb » 04 May 2015 04:05

npocmaka_ wrote:The &rem trick is comparatively old. I think I saw it first here - viewtopic.php?t=194 and here viewtopic.php?f=3&t=381

I think it's much older, but until now I had only seen the &REM variant.

But Aacini's trick of using different replacement strings is really cool. :o

The only drawback of the command injection is that it's really tricky to make it bulletproof against quotes, linefeeds and carriage returns.

Re: split string into substrings based on delimiter

#10 Post by Aacini » 04 May 2015 11:01

Another one! :mrgreen:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>"

set "a=%x%,"
set "b=%a:,=" & (if "!b:<two>=!" neq "!b!" set "c=!b!") & set "b=%"
for /F "tokens=2 delims=><" %%a in ("%c%") do set "xTwo=%%a"
set x

At end: xTwo=2 8)

EDIT: The modification below gets all fields from the line:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>"

set q="
set p=%%
set "a=%x%,"
set "b=%a:,=" & set "b=!b:~1!" & set "b=!b:>==!" & call set !q!x!p!b:^</=!q!!p! & set "b=%"
set x

Output:

Code: Select all

x=<one>1</one>,<two>2</two>,<three>3</three>,<four>4</four>
xfour=4
xone=1
xthree=3
xtwo=2

I tried to insert the "& rem." command of the original trick in place of the "</" string in order to eliminate the undesired part after it, but I didn't find a way to make it work. However, just enclosing the desired part of the value in quotes was enough, although this method will fail if the undesired part contains special characters.

Note that the "call set !q!x!p!b:^</=!q!!p!" part is a nested replacement that is executed with each one of the substrings of the original replacement. This way, this method consists of three stages:

  1. The original string is split into several parts via the first %expansion%.
  2. Each part is processed using delayed expansion !variables! to assemble the final expression. This makes it possible to insert quotes and other special characters in places that the original %expansion% cannot handle.
  3. The final expression in each part is evaluated via the nested CALL command.

This means that this method may be used instead of a FOR command in certain cases, when the processing of each part is not too complex.
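
As a rough comparison, here is a sketch that splits the same comma-separated value both ways. The variable names (f, s, n, m) are only placeholders for this example:

```batch
@echo off
setlocal EnableDelayedExpansion

set "x=3,1,5,4"

rem Conventional way: a plain FOR treats commas as delimiters.
set n=0
for %%e in (%x%) do (
    set /A n+=1
    set "f!n!=%%e"
)

rem Split trick: the same elements without a FOR command.
set m=1
set "s!m!=%x:,=" & set /A m+=1 & set "s!m!=%"

rem Show both results side by side.
for /L %%k in (1,1,%n%) do echo f%%k=!f%%k! s%%k=!s%%k!
```

Both approaches should yield the same four elements, so every printed line shows matching f and s values.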

Antonio
Last edited by Aacini on 05 May 2015 21:50, edited 1 time in total.

Re: split string into substrings based on delimiter

#11 Post by Aacini » 04 May 2015 13:48

I like the following one! :P

"Replace a list of comma-separated subscripts by their corresponding array elements"

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR
set r5=FIVE

set "x=3,1,5,4"
set "x2=%%r%x:,=!p!," & call set "x2=!x2!!p!r%%%"
set x

Output:

Code: Select all

x=3,1,5,4
x2=THREE,ONE,FIVE,FOUR

Antonio

Re: split string into substrings based on delimiter

#12 Post by Sponge Belly » 12 May 2015 13:52

Hi Antonio,

Your last example was amazing. Too bad I can’t understand it! :lol:

Anyways, I was wondering if it’s possible to change the value of the original string from inside the loop caused by the string split operation… because if it is, we could append to the string what was just taken off, the string would never grow shorter, and the loop would go on indefinitely.

Just thinking out loud. ;)

- SB

Ed Dyreen
Expert
Posts: 1569
Joined: 16 May 2011 08:21
Location: Flanders(Belgium)

Re: split string into substrings based on delimiter

#13 Post by Ed Dyreen » 13 May 2015 01:11

Sponge Belly wrote:Hi Antonio,

Your last example was amazing. Too bad I can’t understand it! :lol:

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set p=%%
set r1=ONE
set r2=TWO
set r3=THREE
set r4=FOUR
set r5=FIVE

set "x=3,1,5,4"
set "x2=%%r%x:,=!p!," & call set "x2=!x2!!p!r%%%"
set x

This is still the &REM trick, but with REM replaced by a different command. During the first percent expansion, each , inside the x variable expands into the string !p!," & call set "x2=!x2!!p!r so you get
set "x2=%r3!p!," &call set "x2=!x2!!p!r1!p!," &call set "x2=!x2!!p!r5!p!," &call set "x2=!x2!!p!r4%"
When the first command is executed, delayed (exclamation mark) expansion takes place, so !p! expands into % and the command becomes
set "x2=%r3%,"
No further percent expansion happens at this point, so x2 holds the literal text %r3%,
The second command starts with CALL, so after delayed expansion call set "x2=!x2!!p!r1!p!," becomes set "x2=%r3%,%r1%,"

CALL then runs one more round of percent expansion before the set command executes, and the result is assigned to x2
set "x2=THREE,ONE,"
and so on.

This also answers your 2nd question: what you call a loop is a fixed series of commands. The whole line is built once by the percent expansion, before any of its commands runs, so changing x afterwards cannot add more commands to it.
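
A small sketch to check this, using a made-up value: x is extended inside the "loop", yet the number of passes stays the same.

```batch
@echo off
setlocal EnableDelayedExpansion

set "x=a,b,c"
set n=0

rem Each pass appends ",z" to x, but the command list was already built.
set "part=%x:,=" & set /A n+=1 & set "x=!x!,z" & set "part=%"

echo n=%n%
echo x=%x%
```

This should print n=2 and x=a,b,c,z,z: two commas in the original value, so exactly two passes, no matter how much x grows along the way.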

dosItHelp? :wink:

Re: split string into substrings based on delimiter

#14 Post by Sponge Belly » 22 May 2015 10:10

Hi Ed, :)

Thanks for the explanation. Still can’t quite wrap my head around Aacini’s code, though. Will keep trying.

In the meantime, I’ve rekindled my obsession with finding the best way to trim leading and trailing whitespace from a string:

Code: Select all

@echo off & setlocal enableextensions disabledelayedexpansion
set ^"str= ^^^"    ^^^&^^    ^"^^^&^"^& !^^!^^^^! %%   %%OS%%    ^"
for /f delims^=^ eol^= %%A in ('
cmd /von /c echo(^^!str:^^^"^=^^^"^^^"^^!^| more /t1
') do set "x= %%A "

set /a i=j=0 & set "k="
set "x=%x: =" & (if defined x if not defined k set /a k=i) & (if defined x set /a j=i) & set /a i+=1 & set "x=%"
set "x="
set /a pos=k-1,len=j-i+1
if %len% lss 0 (set "len=,%len%") else set "len="

setlocal enabledelayedexpansion
for /f delims^=^ eol^= %%A in ("!str:~%pos%%len%!") do (
endlocal & set "xstr=%%A" & echo([%%A])
set xstr

endlocal & goto :eof


Quotes must be doubled. Any tabs are turned into spaces by more /t1. The cmd /von is necessary to avoid %-variable expansion. A space is added to either end of the resultant string, which is stored in var x. This is to ensure that x is undefined the first and last time the string is split.

Sorry about the overlong line, btw. If anyone can help me optimise the… whatayacallit… statements between the opening and closing percent signs, which are executed every time the string is split, please get in touch.

Anyways, i is the number of times the string is split, j is the value of i the last time x was defined, and k is the value of i the first time x was defined. The amount of whitespace to be trimmed from both ends of the string can be inferred from these values.

Interesting approach, but I don’t know how practical it is. :|

- SB

Re: split string into substrings based on delimiter

#15 Post by Aacini » 22 May 2015 11:20

I LIKE THIS! :mrgreen:

"Trim leading and trailing whitespace from a string"

EDIT 2015/05/23 - I slightly modified the code, exchanging the initialization of the "x2" and "word" variables; this makes it possible to eliminate the inserted space at the beginning of the string (and makes the code more coherent).

Code: Select all

@echo off
setlocal EnableDelayedExpansion

set "x=     String with spaces     "
set "x=%x% "
set "x2="
set "word=%x: =" & (if "!word!" neq "" set "x2=!x2! !word!") & set "word=%" & set "x2=!x2:~1!"

echo "%x:~0,-1%"
echo "%x2%"
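
One side effect worth knowing, shown here with a made-up input: because empty words are skipped, runs of spaces inside the string are collapsed to a single space as well.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical input with extra spaces between the two words.
set "x=  two   words  "
set "x=%x% "
set "x2="
set "word=%x: =" & (if "!word!" neq "" set "x2=!x2! !word!") & set "word=%" & set "x2=!x2:~1!"

rem Brackets make leading and trailing spaces visible.
echo [%x2%]
```

This should print [two words], so the method suits values where interior spacing does not need to be preserved.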

Antonio
Last edited by Aacini on 23 May 2015 11:20, edited 1 time in total.
