strLen boosted
Posted: 24 Oct 2010 16:46
Hi !
I was wondering how to retrieve the length of a string ; I've seen «:strLen» (http://www.dostips.com/DtCodeCmdLib.php#Function.strLen) ; yay !
I've also managed to improve it, at least on this configuration : Intel Core Duo T2300 1.66GHz / Windows XPSP3 {5.1.2600}.
The code is towards the end of this post.
Short version :
Perfectly knowing what I was doing from the get-go {...}, I decided to speed up «:strLen» twofold and to remove its 1023 chars limit at that.
Long version :
Reading the code of «:strLen», the redundancy was bugging me, and, cocksure like an idiot, I was thinking :
~ "Why ? Why ? Why not *sim-ply* put a loop here ?!".
Crafting a «for» loop, I was soon st(h)umbling on the «"!str:~%len%!"» thingy with this nasty problem of delayed expansion that can not be nested.
~ "... tee hee ...", I *ne-ver* thought it was possible, promise !
Now seeing why a «for» was not used, I was still perplexed by the absence of any loop here, be it of the «goto:loop» kind.
~ "Is it a case of «goto» allergy ? Is it due to the risk of conflict when putting a label in a function ?".
Anyway, I was devising a manual «goto:loop», renaming «:loop» in a manner unlikely to be encountered outside of «:strLen».
This was working correctly .. except for the fact that it was completing in 220% of the time sufficient for the loop-free version.
~ Zing ! "This redundancy : it's .. it's a case of loop-unrolling ! Hats off !".
Abandoning the idea to tweak «:strLen», I was lurking around, improving my DOS-fu, when I saw at line 7 of «:getFunctions» (http://www.dostips.com/DtCodeCmdLib.php#Function.getFunctions) that this «call set» nesting expansion trick could also be used in a way remarkably similar to «!»-style delayed expansion.
~ "Let's see if a «for» is possible after all ...".
Well, I did not manage to use this trick to good effect ; that said, while trying, I found a way to nest delayed expansions of the «!» variety, which permitted to write the sought-after «for», as slow as it was expected to be.
Knowing what to search, I found that I was not the first to have thought about it (http://judago.webs.com/variablecatches.htm#fw-footer) and replaced my clunky «(set tmp=^^!str:~!len!^^!) & for %%t in ("!tmp!") do if "%%~t"=="" ...» with a leaner «for %%l in ("!len!") do if "!str:~%%~l!"=="" ...».
Now, this «for» is anything but slow : it happens to complete in a good 50% of the time taken by the loop-unrolled implementation of «:strLen».
Maybe that internal jumps are way cheaper than manual ones ; this cannot account for the improvement compared to the loop-unrolling though.
Maybe that some step of the interpreting needs to occur once only for the body of a «for» kind of loop.
With this fancy «for» came an index that was usable to directly compute the notional value of the binary search guess, thus allowing to eliminate the variable «n» and the need to access it ; things then completed in 90% of the previous time.
Positing that strings in the wild tend to be short rather than long, I skewed the search in guessing by quarters rather than halves but this was producing disappointing results speed wise.
All these percents, compounded, made «:strLen» complete in a small 50% of the time taken the loop-unrolled way.
While I was at it, I also focused on the maximum length limit of 1023 characters.
It was tempting to simply increase the number of iterations, and cursory tests were successful, but the limit was still a property of the function, not the language.
Instead, I chose to push the maximum close to whatever limit is inherent to the underlying machinery at hand by counting the number of full chunks of 1024 characters and letting the binary search occur on the rest.
This way, the new version should work wherever the loop-unrolled version works, the search domain being kept untouched.
Attempt 1/4)
I knew that a usual «for» loop can be exited through a «goto» and now being aware that such loops are faster than those of the «goto:loop» kind, I first tried to combine an infinite «for /l %%i in (0,0,1) do ...» with a «goto:break» ... the script stalled. In fact, if a «goto:break» does redirect the flow outside, some echoing shows that this occurs only after the loop has iterated as many times as instructed to, no matter what. The post-«goto:break» iterations do not seem to have any other effects than to take time and to suggest the avoidance of «goto»s in «for»s.
Attempt 2/4)
Falling back to a manual «goto:loop», I obtained something that was working well. Despite the overhead of counting the chunks, the new «:strLen» was completing in 60% of the time taken by the loop-unrolled version.
Attempt 3/4)
Due to the «goto:loop», a label was now embedded in «:strLen». Not really a practical problem, but still ; I entertained the idea of trying a recursive approach, sort of. In fact, I zeroed in on looping directly on the function's label. This required to protect the instructions standing between the label «:strLen» and the body of the loop : they had to be run once only per call from outside «:strLen». Things got a bit dirty. Just before looping, I was appending the string "local_to_strLen" to the semantically numeric chunks' counter : no such thing ought to happen outside of «:strLen», which provides a way to discriminate outside calls and inside jumps. The instructions to protect were placed in a block «(...)» guarded by an «if» : they were to be run only on the condition that "local_to_strLen" was not a suffix of the chunks' counter. When jumped to from inside, the chunks' counter is properly conditioned and the instructions protected / when called from outside, either a homonym of the chunks' counter exists or this variable name is unassigned, anyway neither case exhibits such a suffix.
Again, it worked well .. and somehow quicker than with {Attempt 2} for strings of the first chunk where it was hovering under 90% of the previous time.
But why ? After all, it had more housekeeping to do.
Apparently, the mere fact of enclosing some instructions in a block «(...)» makes them faster , even when delayed expansion is not involved, which nonetheless sees its general importance accrued. The speed tests consisting of executing «:strLen» repeatedly, it seems that some cacheability is improved. This puts the «for»-related gain in speed under a new light too.
Whatever.
Attempt 4/4)
I put this fresh awareness to good use on {Attempt 2} ...
Enclosing as much as possible of «:strLen»'s body between parentheses worked OK and fast. I was not comfortable with letting a label in a block though, so split it in two, just to see ; it worked even faster.
This enclosing led to a decrease of the time taken by {Attempt 2} to a small 85% !. give or take 5 points depending on the lengths of the tested strings. Which, compared to {Attempt 3}, made it complete as fast for short strings and in 70% of the time for long strings.
Ultimately, this version was/is completing in a good 50%, overhead included, of the time of its loop-unrolled counterpart.
Having read these rules (http://www.dostips.com/DtGeneTermsOfUse.php), I submit this code to your attention :
Side notes :
Thanks for your time !
I was wondering how to retrieve the length of a string ; I've seen «:strLen» (http://www.dostips.com/DtCodeCmdLib.php#Function.strLen) ; yay !
I've also managed to improve it, at least on this configuration : Intel Core Duo T2300 1.66GHz / Windows XPSP3 {5.1.2600}.
The code is towards the end of this post.
Short version :
Perfectly knowing what I was doing from the get-go {...}, I decided to speed up «:strLen» twofold and to remove its 1023 chars limit at that.
Long version :
Reading the code of «:strLen», the redundancy was bugging me, and, cocksure like an idiot, I was thinking :
~ "Why ? Why ? Why not *sim-ply* put a loop here ?!".
Crafting a «for» loop, I was soon st(h)umbling on the «"!str:~%len%!"» thingy with this nasty problem of delayed expansion that can not be nested.
~ "... tee hee ...", I *ne-ver* thought it was possible, promise !
Now seeing why a «for» was not used, I was still perplexed by the absence of any loop here, be it of the «goto:loop» kind.
~ "Is it a case of «goto» allergy ? Is it due to the risk of conflict when putting a label in a function ?".
Anyway, I was devising a manual «goto:loop», renaming «:loop» in a manner unlikely to be encountered outside of «:strLen».
This was working correctly .. except for the fact that it was completing in 220% of the time sufficient for the loop-free version.
~ Zing ! "This redundancy : it's .. it's a case of loop-unrolling ! Hats off !".
Abandoning the idea to tweak «:strLen», I was lurking around, improving my DOS-fu, when I saw at line 7 of «:getFunctions» (http://www.dostips.com/DtCodeCmdLib.php#Function.getFunctions) that this «call set» nesting expansion trick could also be used in a way remarkably similar to «!»-style delayed expansion.
~ "Let's see if a «for» is possible after all ...".
Well, I did not manage to use this trick to good effect ; that said, while trying, I found a way to nest delayed expansions of the «!» variety, which permitted to write the sought-after «for», as slow as it was expected to be.
Knowing what to search, I found that I was not the first to have thought about it (http://judago.webs.com/variablecatches.htm#fw-footer) and replaced my clunky «(set tmp=^^!str:~!len!^^!) & for %%t in ("!tmp!") do if "%%~t"=="" ...» with a leaner «for %%l in ("!len!") do if "!str:~%%~l!"=="" ...».
Now, this «for» is anything but slow : it happens to complete in a good 50% of the time taken by the loop-unrolled implementation of «:strLen».
Maybe that internal jumps are way cheaper than manual ones ; this cannot account for the improvement compared to the loop-unrolling though.
Maybe that some step of the interpreting needs to occur once only for the body of a «for» kind of loop.
With this fancy «for» came an index that was usable to directly compute the notional value of the binary search guess, thus allowing to eliminate the variable «n» and the need to access it ; things then completed in 90% of the previous time.
Positing that strings in the wild tend to be short rather than long, I skewed the search in guessing by quarters rather than halves but this was producing disappointing results speed wise.
All these percents, compounded, made «:strLen» complete in a small 50% of the time taken the loop-unrolled way.
While I was at it, I also focused on the maximum length limit of 1023 characters.
It was tempting to simply increase the number of iterations, and cursory tests were successful, but the limit was still a property of the function, not the language.
Instead, I chose to push the maximum close to whatever limit is inherent to the underlying machinery at hand by counting the number of full chunks of 1024 characters and letting the binary search occur on the rest.
This way, the new version should work wherever the loop-unrolled version works, the search domain being kept untouched.
Attempt 1/4)
I knew that a usual «for» loop can be exited through a «goto» and now being aware that such loops are faster than those of the «goto:loop» kind, I first tried to combine an infinite «for /l %%i in (0,0,1) do ...» with a «goto:break» ... the script stalled. In fact, if a «goto:break» does redirect the flow outside, some echoing shows that this occurs only after the loop has iterated as many times as instructed to, no matter what. The post-«goto:break» iterations do not seem to have any other effects than to take time and to suggest the avoidance of «goto»s in «for»s.
Attempt 2/4)
Falling back to a manual «goto:loop», I obtained something that was working well. Despite the overhead of counting the chunks, the new «:strLen» was completing in 60% of the time taken by the loop-unrolled version.
Attempt 3/4)
Due to the «goto:loop», a label was now embedded in «:strLen». Not really a practical problem, but still ; I entertained the idea of trying a recursive approach, sort of. In fact, I zeroed in on looping directly on the function's label. This required to protect the instructions standing between the label «:strLen» and the body of the loop : they had to be run once only per call from outside «:strLen». Things got a bit dirty. Just before looping, I was appending the string "local_to_strLen" to the semantically numeric chunks' counter : no such thing ought to happen outside of «:strLen», which provides a way to discriminate outside calls and inside jumps. The instructions to protect were placed in a block «(...)» guarded by an «if» : they were to be run only on the condition that "local_to_strLen" was not a suffix of the chunks' counter. When jumped to from inside, the chunks' counter is properly conditioned and the instructions protected / when called from outside, either a homonym of the chunks' counter exists or this variable name is unassigned, anyway neither case exhibits such a suffix.
Again, it worked well .. and somehow quicker than with {Attempt 2} for strings of the first chunk where it was hovering under 90% of the previous time.
But why ? After all, it had more housekeeping to do.
Apparently, the mere fact of enclosing some instructions in a block «(...)» makes them faster , even when delayed expansion is not involved, which nonetheless sees its general importance accrued. The speed tests consisting of executing «:strLen» repeatedly, it seems that some cacheability is improved. This puts the «for»-related gain in speed under a new light too.
Whatever.
Attempt 4/4)
I put this fresh awareness to good use on {Attempt 2} ...
Enclosing as much as possible of «:strLen»'s body between parentheses worked OK and fast. I was not comfortable with letting a label in a block though, so split it in two, just to see ; it worked even faster.
This enclosing led to a decrease of the time taken by {Attempt 2} to a small 85% !. give or take 5 points depending on the lengths of the tested strings. Which, compared to {Attempt 3}, made it complete as fast for short strings and in 70% of the time for long strings.
Ultimately, this version was/is completing in a good 50%, overhead included, of the time of its loop-unrolled counterpart.
Having read these rules (http://www.dostips.com/DtGeneTermsOfUse.php), I submit this code to your attention :
Code: Select all
:strLen string len -- returns the length of a string via binary search
:: -- string [in] - variable name containing the string being measured for length
:: -- len [out] - variable to be used to return the string length
:$created 20081122 :$changed 20101020 :$categories StringOperation
:$source http://www.dostips.com
:: the binary search occurs after all full chunks of 1024 characters (1 KiC) have been counted
:: this enables the measurement of lengths nearly as great as the command line limit
:: the shortness of the names s for "string" and r for "remainder" helps in this matter
:: some blocks seem superfluous, they improve speed
:: when initializing s, keep the A up front to ensure we get the length and not the upper bound
:: it also avoids trouble in case of empty string
( SETLOCAL ENABLEDELAYEDEXPANSION
set s=A!%~1!
set nbKiC=0
)
:strLen#repeat_until_less_than_1_KiC_remains_to_search_through
( set r=!s:~1024!
if not "!r!"=="" (
set /a nbKiC+=1
set s=!r!
goto:strLen#repeat_until_less_than_1_KiC_remains_to_search_through
)
set len=0
for /l %%e in (9,-1,0) do (
set /a "len|=1<<%%e" & for %%l in ("!len!") do if "!s:~%%~l!"=="" set /a "len&=~(1<<%%e)"
)
set /a "len+=nbKiC<<10"
)
( ENDLOCAL & REM RETURN VALUES
IF "%~2" NEQ "" SET "%~2=%len%"
)
EXIT /b
Side notes :
- While lurking here, I've seen a previous version (http://www.dostips.com/DtCodeFunctions.php#_Toc128586394) of «:strLen» (http://www.dostips.com/DtCodeCmdLib.php#Function.strLen).
- Tests of correctness and speed are provided in the next post.
- Testing required a less locale dependent version of «:getTod» (http://www.dostips.com/DtCodeFunctions.php#_Toc128586395), here is it :
Code: Select all
:getTod -- get a Time of Day value in 1/100th seconds
:: -- %~1: out - time of day
SETLOCAL
set t=%time: =0%
set /a t=((1%t:~0,2%*60+1%t:~3,2%)*60+1%t:~6,2%)*100+1%t:~9,2%-36610100
( ENDLOCAL & REM RETURN VALUES
IF "%~1" NEQ "" SET %~1=%t%
)
GOTO:EOF
Thanks for your time !