Optimization of Mass Copy Via Batch
Posted: 18 May 2022 13:22
So this is more of a theoretical discussion to get an algorithm down as the coding itself should be fairly simple.
So there are x number of NAS units, each with a unique IP and path, e.g. NAS1 with path \\IP_NAS1\PATH_NAS1, NAS2 with path \\IP_NAS2\PATH_NAS2, etc. Both the IPs and the paths are unique.
The data set is the same, but only one unit is the source or original. The others are copies that need to be replicated to. The source unit will always have the same IP and path.
Since there are multiple units, once replication to another NAS unit has finished, that unit can also be used as a source, so now you have 2 sources. Once 4 units hold a copy, you have 4 sources, etc. This can have a tremendous speed advantage over simply copying from the one source over and over to each destination.
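To put a number on that advantage, here is a rough sketch in Python (not batch, just to show the arithmetic). It assumes every copy takes about the same time and that each unit holding the data feeds only one destination at a time. The function name `rounds_needed` is made up for illustration.

```python
def rounds_needed(destinations: int, sources: int = 1) -> int:
    """Count copy rounds when every unit that holds the data copies to one new unit per round."""
    if sources < 1:
        raise ValueError("need at least one source")
    rounds = 0
    remaining = destinations
    while remaining > 0:
        copied = min(sources, remaining)  # each source feeds at most one destination
        remaining -= copied
        sources += copied                 # finished destinations become sources
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (1, 3, 7, 20):
        print(n, "destinations:", rounds_needed(n), "rounds, versus", n, "copies one at a time")
```

Under those assumptions the number of sources doubles each round, so 20 destinations take 5 rounds instead of 20 back-to-back copies.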
There is a certain point where you will have a 1:1 ratio between sources and destinations, which would allow the optimal speed for replication since each unit can run at full speed (negating any network bandwidth issues).
But how do you know when you've hit that 1:1 ratio? And how do you dynamically assign destinations to become sources?
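One possible way to answer both questions is to plan the work in rounds: pair every current source with one pending destination, then move the finished destinations into the source list. The 1:1 point is simply the first round in which the number of sources is at least the number of pending destinations, and that round is also the last one. A minimal sketch in Python, assuming equal copy times and one copy per source at a time (`plan_rounds` and the NAS names are hypothetical):

```python
from typing import List, Tuple


def plan_rounds(source: str, destinations: List[str]) -> List[List[Tuple[str, str]]]:
    """Return a list of rounds; each round is a list of (source, destination) pairs."""
    sources = [source]
    pending = list(destinations)
    rounds = []
    while pending:
        # zip stops at the shorter list, so no source gets two jobs in one round
        batch = list(zip(sources, pending))
        rounds.append(batch)
        pending = pending[len(batch):]
        # every destination that was just filled can serve as a source next round
        sources.extend(dst for _, dst in batch)
    return rounds


if __name__ == "__main__":
    plan = plan_rounds("NAS0", ["NAS1", "NAS2", "NAS3", "NAS4", "NAS5"])
    for number, batch in enumerate(plan, start=1):
        print("round", number, batch)
```

Because the plan is built from whatever list of destinations is passed in, adding a new unit only means adding it to that list.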
Now let's throw another thing into the mix--not all x units will be on all the time, so you may have let's say 5 on one day and 20 another.
There are also some constants--certain NAS units that will always be on 24x7, as well as the source, which will be on 24x7. Let's say there are 5 of these (because I can't find my paper where I put the real number).
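One heuristic that might help here (an assumption, not something tested on this setup) is to replicate to the always-on units first, so the units most likely to stay up become the extra sources early. A small Python sketch, with `order_destinations` and the unit names invented for the example:

```python
from typing import Iterable, List


def order_destinations(online: Iterable[str], always_on: Iterable[str]) -> List[str]:
    """Put always-on units ahead of the rest, keeping the original order within each group."""
    always = set(always_on)
    # sorted() is stable and False sorts before True, so always-on units come first
    return sorted(online, key=lambda unit: unit not in always)


if __name__ == "__main__":
    print(order_destinations(["NAS7", "NAS2", "NAS9", "NAS3"], ["NAS2", "NAS3"]))
```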
Currently, I have a batch file that replicates from the source NAS to a second NAS, and then both are used to replicate to the other NAS units. But I haven't figured out how to make this scale automatically beyond that. I think checking whether a NAS unit is available will be a simple 'IF EXIST \\IP\PATH\nul', and there's a single ROBOCOPY command which handles the replication.
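For comparison, the same two building blocks could look roughly like this in Python. This is only a sketch: the unit table, the function names, and the ROBOCOPY switches (`/MIR`, `/R:2`, `/W:5`) are assumptions, since the post does not say which switches the real command uses. Note that `/MIR` also deletes files on the destination that are missing from the source.

```python
import os
import subprocess
from typing import Callable, Dict, List


def online_units(units: Dict[str, str], exists: Callable[[str], bool] = os.path.isdir) -> List[str]:
    """Return the names of the units whose share path is reachable right now."""
    return [name for name, path in units.items() if exists(path)]


def robocopy_command(src: str, dst: str) -> List[str]:
    """Build the ROBOCOPY command line; the switches are assumed, adjust to the real job."""
    return ["robocopy", src, dst, "/MIR", "/R:2", "/W:5"]


def replicate(src: str, dst: str) -> bool:
    """Run one copy. ROBOCOPY exit codes below 8 mean the copy finished without failures."""
    result = subprocess.run(robocopy_command(src, dst))
    return result.returncode < 8


if __name__ == "__main__":
    # Placeholder paths in the same style as the post, not real shares
    units = {"NAS1": r"\\IP_NAS1\PATH_NAS1", "NAS2": r"\\IP_NAS2\PATH_NAS2"}
    print("online:", online_units(units))
```

Keeping the units in one table means a new NAS is one new entry rather than another hard-coded line.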
Okay, that's it! I'd love to hear ideas on how to approach this better than just hard coding everything and then having to change it each time a new NAS unit is added. Thank you in advance!