Why does SET performance degrade as environment size grows?
Posted: 05 Dec 2011 22:06
There are many clever uses for environment variables that can cause the size of the environment to grow dramatically. For example:
- Loading rows of data or lines of a file into a pseudo-array for sorting (or other) purposes.
- Loading commands into memory (macros) for the speed and convenience of an include-able library of functions.
I remember seeing Ed "too complex" Dyreen report that he saw poor performance as his memory usage increased. I filed that in the back of my mind, but didn't worry much until I ran into performance issues myself with a Stack Overflow question and answer: http://stackoverflow.com/a/8369403/1012053.
I've done some extensive timing tests that demonstrate large environment sizes can dramatically impact the performance of SET and SETLOCAL/ENDLOCAL. Absolute time values are highly machine dependent, but I think the qualitative results are relevant to everyone.
Code:
Approx. Env Size  Set a var (sec)  Setlocal/Endlocal (sec)  Expand a var (sec)
----------------  ---------------  -----------------------  ------------------
            10KB           0.0001                   0.0005              0.0003
          1293KB           0.0126                   0.0322              0.0003
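The summary rows above appear to be the first and last result blocks from the run posted at the end of this message, divided by the 1000 iterations and rounded to four decimals. A minimal sketch of that arithmetic, in Python rather than batch because batch has no decimal arithmetic; the helper name `per_op` and the half-up rounding are assumptions chosen because they reproduce the table:

```python
from decimal import Decimal, ROUND_HALF_UP

ITERATIONS = 1000  # each posted figure is the total for 1000 iterations


def per_op(total_seconds: str) -> Decimal:
    """Convert a 1000-iteration total (seconds) to seconds per operation, 4 decimals."""
    return (Decimal(total_seconds) / ITERATIONS).quantize(
        Decimal("0.0001"), rounding=ROUND_HALF_UP
    )


# (environment size in bytes, set, setlocal/endlocal, expand), copied from the posted run
rows = [
    (10352, "0.09", "0.45", "0.25"),
    (1324081, "12.57", "32.15", "0.26"),
]

for size, set_t, setlocal_t, expand_t in rows:
    print(f"{round(size / 1024)}KB", per_op(set_t), per_op(setlocal_t), per_op(expand_t))
```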
Based on my tests, the amount of time it takes to SET a value or do a SETLOCAL/ENDLOCAL toggle is roughly linear with the size of the environment. This can have a devastating impact on performance with large data sets and/or with large macro libraries.
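The "roughly linear" claim can be sanity-checked by fitting a straight line to the figures from the run posted at the end of this message. The sketch below uses Python rather than batch, since batch arithmetic is integer only; `linear_fit` is an illustrative helper, not part of the original test script.

```python
# Figures copied from the posted run: environment size in bytes and
# total seconds for 1000 iterations of each operation.
env_bytes = [10352, 21051, 41711, 83033, 165726, 331166, 662046, 1324081]
set_secs = [0.09, 0.18, 0.31, 0.58, 1.23, 2.12, 5.91, 12.57]
setlocal_secs = [0.45, 0.58, 0.84, 1.34, 3.02, 5.27, 15.80, 32.15]
expand_secs = [0.25, 0.28, 0.29, 0.27, 0.24, 0.27, 0.28, 0.26]


def linear_fit(xs, ys):
    """Ordinary least-squares fit ys ~ slope * xs + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


for name, ys in (
    ("set", set_secs),
    ("setlocal/endlocal", setlocal_secs),
    ("expand", expand_secs),
):
    slope, intercept, r = linear_fit(env_bytes, ys)
    # slope is seconds per byte for 1000 iterations; convert to ns per byte per operation
    ns_per_byte = slope / 1000 * 1e9
    print(f"{name:18s} {ns_per_byte:8.3f} ns per byte per op, r = {r:.3f}")
```

A correlation near 1 for SET and SETLOCAL/ENDLOCAL, together with a slope for expansion that is tiny by comparison, would be consistent with the linear description. Eight points from a single run are a rough check, not proof.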
It only stands to reason that the performance of SETLOCAL/ENDLOCAL will suffer as the size of the environment grows. Obviously the amount of memory that must be allocated grows with the size of the environment.
But I'm shocked that the time it takes to SET a single value suffers just as badly. The only thing I can think of is that CMD.EXE must store the entire environment in one contiguous block of memory and reallocate a new block every time a single value changes. The SET tests used a variable that resides near the beginning of the environment (a). I ran some additional tests using a variable that resides near the end (z) (not shown). SET was as much as 25% faster when working with z vs. a, so there is some positional dependency as well.
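To make the guess above concrete, here is a toy model in Python of what "one contiguous block, rebuilt on every change" would cost. It is only an illustration of the hypothesis: the layout (NAME=value entries separated by NUL bytes, sorted by name) and the function names `pack`, `set_by_rebuild` and `demo` are assumptions, and nothing here is based on knowledge of CMD.EXE internals. It also does not try to explain the positional difference between a and z.

```python
def pack(env):
    """Pack a dict into one contiguous block: NAME=value NUL ... NUL, sorted by name."""
    body = b"".join(
        f"{name}={value}\0".encode("ascii") for name, value in sorted(env.items())
    )
    return body + b"\0"


def set_by_rebuild(env, name, value):
    """Model one SET as: change the entry, then copy everything into a fresh block.

    Returns the new block and the number of bytes that had to be written.
    """
    env[name] = value
    block = pack(env)
    return block, len(block)


def demo(n_filler):
    """Bytes written for a single SET when n_filler 1024-character variables exist."""
    env = {"a": "a"}
    for i in range(1, n_filler + 1):
        env[f"test{i}"] = "a" * 1024  # same shape as the filler in the test script
    _, copied = set_by_rebuild(env, "a", "b")
    return copied


for n in (10, 20, 40, 80):
    print(n, "filler variables ->", demo(n), "bytes copied for one SET")
```

Under this model, changing a single one-character value costs a copy of the whole block, so the work per SET doubles whenever the environment doubles, which is the shape seen in the measurements.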
It is interesting that the time it takes to expand a variable appears to be independent of the environment size, thank goodness.
Question: Does anyone have any better insight as to why SET degrades linearly as the environment grows? I suppose a definitive answer would have to come from someone with knowledge of the internal workings of CMD.EXE. Even better would be a suggestion on how to improve performance when using a large environment, but I doubt there is much that can be done.
Here is my actual timing test code. The timer routine is a macro that is already loaded into memory prior to running the test script.
Code:
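@echo off
setlocal enableDelayedExpansion

rem Build a 1024-character string by doubling a single character ten times
set "test=a"
for /l %%n in (1 1 10) do set "test=!test!!test!"

rem Extra filler variables of 1024, 1024 and 512 characters
set buf1=%test%
set buf2=%test%
set "buf3=a"
for /l %%n in (1 1 9) do set "buf3=!buf3!!buf3!"

set cnt=0
set "a=a"

rem Each call adds more 1024-character variables, then times the operations
for %%n in (0 10 20 40 80 160 320 640) do call :test %%n
exit /b

:test
rem Grow the environment by the requested number of variables named test1, test2 and so on
for /l %%n in (1 1 %1) do (
  set /a cnt+=1
  set test!cnt!=%test%
)
rem Dump the environment to a file so its size can be reported later
set >env.txt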
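rem Time six loops of 1000 iterations each. The timestamps t1 to t7 bracket, in order:
rem   empty loop, setlocal+endlocal, setlocal+set+endlocal,
rem   setlocal+unset+endlocal, delayed expansion of a, set+unset
set t1=%time%
for /l %%n in (1 1 1000) do (
  rem
)
set t2=%time%
for /l %%n in (1 1 1000) do (
  rem
  setlocal
  endlocal
)
set t3=%time%
for /l %%n in (1 1 1000) do (
  rem
  setlocal
  set "a=b"
  endlocal
)
set t4=%time%
for /l %%n in (1 1 1000) do (
  rem
  setlocal
  set "a="
  endlocal
)
set t5=%time%
for /l %%n in (1 1 1000) do (
  rem
  echo !a!>nul
)
set t6=%time%
for /l %%n in (1 1 1000) do (
  set "a=b"
  set "a="
)
set t7=%time%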
%macro.diffTimeRaw% t1 t2 base
%macro.diffTimeRaw% t2 t3 base_setlocal_endlocal
%macro.diffTimeRaw% t3 t4 base_setlocal_set_endlocal
%macro.diffTimeRaw% t4 t5 base_setlocal_unset_endlocal
%macro.diffTimeRaw% t5 t6 base_expand
%macro.diffTimeRaw% t6 t7 base_set_unset
set /a time_setlocal_endlocal=base_setlocal_endlocal-base
set /a time_set=base_setlocal_set_endlocal-base_setlocal_endlocal
set /a time_unset=base_setlocal_unset_endlocal-base_setlocal_endlocal
set /a time_expand=base_expand-base
set /a time_set_unset_predicted=base+time_set+time_unset
call :padNum time_setlocal_endlocal
call :padNum time_set
call :padNum time_unset
call :padNum time_expand
call :padNum time_set_unset_predicted
call :padNum base_set_unset
echo ---------------------------------------
for %%f in (env.txt) do echo approximate environment size = %%~zf
echo(
echo setlocal/endlocal = %time_setlocal_endlocal%
echo set = %time_set%
echo unset = %time_unset%
echo expand = %time_expand%
echo(
echo predicted set/unset = %time_set_unset_predicted%
echo actual set/unset = %base_set_unset%
echo(
del env.txt
exit /b
:padNum
rem Left-pad with 7 spaces so the substring offsets below always have enough characters
set "%1=       !%1!"
set "%1=!%1:~-7,5!.!%1:~-2!"
set "%1=!%1: .=0.!"
set "%1=!%1:. =.0!"
exit /b
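The script isolates each operation by subtracting a baseline loop from a loop that contains the operation. The same arithmetic is sketched below in Python for readers who find the batch version hard to follow. The function names `derive` and `as_seconds` are illustrative, and the raw centisecond values are reconstructed from the first result block of the posted run (the empty loop evidently measured 0 there, since predicted = set + unset), not taken from a separate measurement.

```python
def derive(raw):
    """Turn raw loop timings (centiseconds) into per-operation figures.

    Keys of `raw` mirror the variable names used in the batch script.
    """
    net = {
        "setlocal/endlocal": raw["base_setlocal_endlocal"] - raw["base"],
        "set": raw["base_setlocal_set_endlocal"] - raw["base_setlocal_endlocal"],
        "unset": raw["base_setlocal_unset_endlocal"] - raw["base_setlocal_endlocal"],
        "expand": raw["base_expand"] - raw["base"],
    }
    # If set and unset cost the same outside setlocal/endlocal as inside,
    # a plain set+unset loop should take about this long.
    net["predicted set/unset"] = raw["base"] + net["set"] + net["unset"]
    net["actual set/unset"] = raw["base_set_unset"]
    return net


def as_seconds(centiseconds):
    """Format centiseconds as seconds with two decimals, like :padNum does."""
    return f"{centiseconds / 100:.2f}"


# Raw timings reconstructed from the first posted result block (10352 bytes)
first_run = {
    "base": 0,
    "base_setlocal_endlocal": 45,
    "base_setlocal_set_endlocal": 54,
    "base_setlocal_unset_endlocal": 55,
    "base_expand": 25,
    "base_set_unset": 20,
}

for label, value in derive(first_run).items():
    print(f"{label:20s} = {as_seconds(value)}")
```

The predicted and actual set/unset figures serve as a cross-check: if they agree, the cost measured for SET inside a SETLOCAL/ENDLOCAL pair is probably not an artifact of the pair itself.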
Here are the results of one run. Each time is the total for 1000 iterations of the operation, shown in seconds and measured to a resolution of 1/100 second.
Code:
---------------------------------------
approximate environment size = 10352
setlocal/endlocal = 0.45
set = 0.09
unset = 0.10
expand = 0.25
predicted set/unset = 0.19
actual set/unset = 0.20
---------------------------------------
approximate environment size = 21051
setlocal/endlocal = 0.58
set = 0.18
unset = 0.15
expand = 0.28
predicted set/unset = 0.33
actual set/unset = 0.31
---------------------------------------
approximate environment size = 41711
setlocal/endlocal = 0.84
set = 0.31
unset = 0.23
expand = 0.29
predicted set/unset = 0.54
actual set/unset = 0.51
---------------------------------------
approximate environment size = 83033
setlocal/endlocal = 1.34
set = 0.58
unset = 0.41
expand = 0.27
predicted set/unset = 1.00
actual set/unset = 0.91
---------------------------------------
approximate environment size = 165726
setlocal/endlocal = 3.02
set = 1.23
unset = 0.92
expand = 0.24
predicted set/unset = 2.15
actual set/unset = 2.10
---------------------------------------
approximate environment size = 331166
setlocal/endlocal = 5.27
set = 2.12
unset = 1.56
expand = 0.27
predicted set/unset = 3.69
actual set/unset = 3.82
---------------------------------------
approximate environment size = 662046
setlocal/endlocal = 15.80
set = 5.91
unset = 4.74
expand = 0.28
predicted set/unset = 10.66
actual set/unset = 11.66
---------------------------------------
approximate environment size = 1324081
setlocal/endlocal = 32.15
set = 12.57
unset = 10.25
expand = 0.26
predicted set/unset = 22.83
actual set/unset = 24.88
Dave Benham