Page 1 of 1

Subtotals in DOS Batch

Posted: 19 Oct 2009 21:14
by NetBackUp_Admin
Ok, I am pretty new to batch scripting, maybe you can help me:

Let's say that I have a text file that looks like this:

svr_list_1.txt



server1 1
server1 5
server1 2
server2 22
server2 1
server2 1005
server2 110
server3 4
server3 2002
server4 14
(and so on)



The goal is to get a new svr_list_2.txt file with something like this:

server1 8
server2 1138
server3 2006
server4 14
(and so on)

(As you can see, I need to sum all the numbers in the right column that share the name on the left column) (something like MS Excel "Subtotal" function :D. I know that batch and maths are not good friends but... hope is the last thing you lose... right?) Thank you all in advance for all your support.

Posted: 20 Oct 2009 18:00
by avery_larry
I think this will work:

Code: Select all

@echo off
setlocal enabledelayedexpansion
set "servlist=nul"

for /f "tokens=1,2" %%a in (svr_list_1.txt) do (
   set /a "%%a += %%b"
   echo.!servlist!|find " %%a" >nul 2>nul ||set "servlist=!servlist! %%a"
)

for %%a in (%servlist:~4,999999%) do echo %%a !%%a!>>svr_list_2.txt
)

DOS Batch Math!!!

Posted: 20 Oct 2009 22:45
by NetBackUp_Admin
Let me say something: :shock: :shock: :shock: U simply ROCK man!!! Real Genius!!! Thank you, thank you, thank you!!!

This works great with the example list!

However, it is not working with the real list, but I know it is not your fault, I did not give a good explanation in the begining. Here I go:

The real list looks like this:

paatan-S1600 210229561
paatan-S1600 162188734
paatan-S1900 6063697
paatan-S1900 5917962
paatan-S1900 5684244
paatan-S1900 89128636
paatan-S1900 4199444
paatan-S1900 72385866
paatan-S1900 70132397
paatan-S1900 70103914
paatan-S1900 69827918
paatan-S1900 67529567
paatan-S2800 100000774
paatan-S2800 99052983
paatan-S3601 35736331
paatan-S3602 15115541
paatan-S3602 655169
paatan-S3602 39902

and after the batch process it, we get this :(

paatan-S1600
paatan-S1900
paatan-S2800
paatan-S3601
paatan-S3602

Now... I am sure that it is the "-" that is screwing the maths up because i later removed them... and it worked great again:

svr_list_1.txt

paatanS1600 210229561
paatanS1600 162188734
paatanS1900 6063697
paatanS1900 5917962
paatanS1900 5684244
paatanS1900 89128636
paatanS1900 4199444
paatanS1900 72385866
paatanS1900 70132397
paatanS1900 70103914
paatanS1900 69827918
paatanS1900 67529567
paatanS2800 100000774
paatanS2800 99052983
paatanS3601 35736331
paatanS3602 15115541
paatanS3602 655169
paatanS3602 39902



svr_list_2.txt

paatanS1600 372418295
paatanS1900 460973645
paatanS2800 199053757
paatanS3601 35736331
paatanS3602 15810612

In the "real world" i cannot remove the "-" so... can it be fixed up? I mean... is there a way to make it work when the left column has a "-" in the middle of the name?

Thank you in advance for all your support!

Posted: 21 Oct 2009 11:56
by avery_larry

Code: Select all

@echo off 
setlocal enabledelayedexpansion
set "idx=0"

for /f "tokens=1,2" %%a in (svr_list_1.txt) do (
   if "!idx!"=="0" (
      set /a idx+=1 & set "name1=%%a" & set /a "count1+=%%b"
      ) else (
         call :process %%a %%b
   )
)


for /l %%a in (1,1,%idx%) do echo !name%%a! !count%%a!>>svr_list_2.txt
)

goto :eof


:process
for /l %%c in (1,1,%idx%) do (
   if "!name%%c!"=="%1" set /a "count%%c+=%2" & goto :eof
)
set /a idx+=1
set "name%idx%=%1"
set /a "count%idx%+=%2"
goto :eof


Note that the number summation is limited by DOS to just a little over 2000000000.

Posted: 21 Oct 2009 20:11
by NetBackUp_Admin
Oh...!!! You had to change the whole code... sorry about that... :(

************************************************************************************************
NOTE: If you want to quit now, I will understand. You've already helped me a lot. I owe you man!
************************************************************************************************


Now...
I am not really understanding the following statement:
"Note that the number summation is limited by DOS to just a little over 2000000000"

I have 2 possible meanings for that:
1) The total amount of numbers to be sum cannot exceed 2000000000.
or...
2) The result of the sum operation cannot be above 2000000000.

Regardless of which the correct answer is, I noticed that for the largest servers it is not showing the correct result... I have even had some negative numbers as a result... :(

If the correct answer is number 1, let me say that:
I will never, ever ,ever ,ever reach that number. Data on the left column is never repeated for more than 400 lines.

If the correct answer is number 2, let me say that:
The numbers in the right are nothing else than backup images shown in KBs. What I am currently doing is changing it to GBs (*/1024/1024) for easier reading. So I thought... if it could be done, for every number, at the begining of the process (before the sum)... that would solve the "large-numbers-not-supported" issue.

If you give me your email address, I can send you the real and complete list "svr_list_1.txt" along with the (currently java-processed) final list "svr_list_2.txt" for your better understanding. (the lists are waaaaay too large to post here and I do not kwow how to attach filers here... as a matter of fact, i do not even know if it possible to do so) (If you cannot (or do not want to) give me you email address, I can rather create a new email account, attach the files there, and give you the pass so you can access these lists)

In addition to this, I am attaching below the contents of my current "my_precious.html" javascript (which works perfectly, only that I now need to do it all in Batch) so you can have a better idea of what I need:

Code: Select all

<html>
<head></head></br><text><b>Paste the contents of the "svr_list_1.txt" file here,</b></text></br><text><b>then hit the "DO THE MATHS!" button:</b></text></br><body>
<textarea id="left" cols="55" rows="10"></textarea>
</br></br>
<button onclick="doIt();" >DO THE MATHS!</button>
</br></text></br><text><b>Grab the contents of this box, and save it as "svr_list_2.txt"</b></text>
</br><textarea id="right" cols="55" rows="10"></textarea>
<script>

function doIt()
{
   var hLeft = document.getElementById('left');
   var hRight = document.getElementById('right');
   var sourceTxt =  hLeft.value;
   var preList = sourceTxt.split("\n");
   var multiArray = new Array();
   var outterIndex;
   var resultArray = new Array();


   for(outterIndex = 0; outterIndex<preList.length; outterIndex++)
   {
      var temp = preList[outterIndex];
      var temp2 = temp.split(' ');
      multiArray[outterIndex] = new Array(temp2[0],temp2[1]);
//      hRight.value = hRight.value + multiArray[outterIndex][0] + ' ' + multiArray[outterIndex][1] + "\n" ;
   }
   
   for (outterIndex=0; outterIndex<preList.length; outterIndex++)
   {
      temporal = multiArray[outterIndex][0];
      temporalSum = 0;
      flag=false;
      for (innerIndex=0; innerIndex<preList.length; innerIndex++)
      {
         if(multiArray[innerIndex][0]==temporal)
         {
            temporalSum = temporalSum + parseInt(multiArray[innerIndex][1]);
            if (flag==false)
            {
               resultArray[outterIndex] = new Array(null,null);
               flag=true;               
            }
         }
      }
      resultArray[outterIndex][0] = temporal;
//      temporalSum = (temporalSum / 1048576);
//      resultArray[outterIndex][1] = Math.round((temporalSum/1048576)*100);
      resultArray[outterIndex][1] = Math.round((temporalSum/1048576)*100)/100;
   }

   hRight.value = '';

   for(outterIndex = 0; outterIndex<resultArray.length; outterIndex++)
   {
      if (hRight.value.indexOf(resultArray[outterIndex][0])==-1)
      {
         hRight.value = hRight.value + resultArray[outterIndex][0] + ' ' + resultArray[outterIndex][1].toFixed(2) +          "\n" ;
      }
   }
}


</script>

</body>



</html>


Once again... thank you for all your dedication to my succcess!!! (Even though it is pretty obvious that you master the Batch arts, that code did not write itself alone, and I am completely sure it was not a "just-a-couple-of-minutes" task, not even for you, man!)

Posted: 22 Oct 2009 09:44
by avery_larry
Maybe 15 minutes.

DOS can't handle past 2 Gb in the builtin arithmetic -- it does a funky rollover to negative numbers. I typically use the poor man's shortcut of stripping the last 6 or 9 digits off the file sizes (when in bytes) and using that as a "close enough" approximation of megabytes or gigabytes.

It does take a little more coding doing it that way, because you have to handle the exception when it's less than 1 Mb or 1 Gb.

backup images shown in KBs
If you truly meant KBs instead of bytes, then your arithmetic limit is 2 Tb. Are they that big?

Anyway -- I think it should be easy to add /1024/1024 to the script . . .


<5 minutes later>

Yep. Just add /1024/1024 to a few spots in the script (or you can just use /1024):

Code: Select all

@echo off 
setlocal enabledelayedexpansion
set "idx=0"

for /f "tokens=1,2" %%a in (svr_list_1.txt) do (
   if "!idx!"=="0" (
      set /a idx+=1 & set "name1=%%a" & set /a "count1+=%%b/1024/1024"
      ) else (
         call :process %%a %%b
   )
)


for /l %%a in (1,1,%idx%) do echo !name%%a! !count%%a!>>svr_list_2.txt
)

goto :eof


:process
for /l %%c in (1,1,%idx%) do (
   if "!name%%c!"=="%1" set /a "count%%c+=%2/1024/1024" & goto :eof
)
set /a idx+=1
set "name%idx%=%1"
set /a "count%idx%+=%2/1024/1024"
goto :eof

Re: DOS Batch Math!!!

Posted: 23 Oct 2009 01:15
by ghostmachine4
NetBackUp_Admin wrote:paatan-S1600 210229561
paatan-S1600 162188734
paatan-S1900 6063697
paatan-S1900 5917962
paatan-S1900 5684244
paatan-S1900 89128636
paatan-S1900 4199444
paatan-S1900 72385866
paatan-S1900 70132397
paatan-S1900 70103914
paatan-S1900 69827918
paatan-S1900 67529567
paatan-S2800 100000774
paatan-S2800 99052983
paatan-S3601 35736331
paatan-S3602 15115541
paatan-S3602 655169
paatan-S3602 39902


for parsing files and formatting reports, use a tool appropriate for the job. If you can download gawk for windows : http://gnuwin32.sourceforge.net/packages/gawk.htm, here's how to do it with one line

Code: Select all

C:\test>gawk "{s[$1]+=$2}END{for(i in s)print i,s[i]}" file
paatan-S1900 460973645
paatan-S3601 35736331
paatan-S3602 15810612
paatan-S2800 199053757
paatan-S1600 372418295

U did it! THANKS!

Posted: 26 Oct 2009 21:09
by NetBackUp_Admin
@ghostmachine4: Sorry man, I cannot use anything but plain DOS. Thans anyway!

@avery_larry: U rock man! this works just great, (almost) flawless!
The only thing i noticed is that i am getting always a result a lil under the real one. I mean, for my biggest server, all of the images do a total of 139TB (I know that due to my calculator, and then confirmed via excel). and with this dos script i am getting 137TB. I am getting a diference of about 2050GB less. I guess this is caused by DOS rounding down the decimals... and since we are doing the /1024/1024 before the sum... I guess all little images are being rounded down to "0". But that is ok... 2TB in about 139... does not hurt!

Thankyou once again for all your time and dedication! May I be of help, do not hesitate to contact me, man! Bye :)

Posted: 27 Oct 2009 15:17
by avery_larry
Yep -- it's always going to round down. You may want to just divide by 1024 in the various script areas, and then divide by 1024 again at the very end (or just report in gigabytes instead of terabytes).

You'd have to change the for /l loop, either just strip the final 3 digits or add the code to divide each %countn% by 1024. I'd probably just /1024 once and then do this:

for /l %%a in (1,1,%idx%) do echo !name%%a! !count%%a:~0,-3!>>svr_list_2.txt

At least I think that will work with the variable substring.