Algorithm

Algorithm for assign+ is described in http://www.isi.edu/~mirkovic/publications/imc12-final.pdf

Code

Code resides in assign_plus branch of the repository, in assign+ folder. There are three files:

  • assign+
    • PURPOSE: reads resource requirements and available resources and finds an allocation with minimum interswitch bw use
    • INPUT: two command line arguments ptopfile and topfile. These files are usually produced in the process of resource assignment by assign_wrapper, now called mapper. There are a few other command line options, that should be self-explanatory when you run the code. If you want very verbose output run it with -d.
    • OUTPUT: textual output denoting nodes and links and their assignments that can be passed further to resource allocation
  • getptopdetail.pl
    • PURPOSE: parse a ptopfile into something human-readable. Assign+ also uses this output.
    • INPUT: ptopfile
    • OUTPUT: human-readable summary of ptopfile
  • gettopdetail.pl
    • PURPOSE: parse a topfile into something human-readable. Assign+ also uses this output.
    • INPUT: topfile
    • OUTPUT: human-readable summary of topfile

Testing

Testing code resides in assign_plus branch of the repository, in assign+/test folder. There are several files that jointly perform tests that compare performance of assign and assign+.

  • removefixed.pl
    • PURPOSE: Remove fixed nodes from a topfile
    • INPUT: topfile
    • OUTPUT: topfile in correct format minus lines that talk about fixed nodes
  • makefixed.pl
    • PURPOSE: Make a topfile with fixed nodes taken from the output of assign or assign+
    • INPUT: output-file-of-assign-or-assign+ topfile
    • OUTPUT: topfile in correct format plus lines that enforce fixed nodes and fixed interfaces for links
  • parseinfo.pl
    • PURPOSE: measure performance of assign or assign+
    • INPUT: a file containing output of assign or assign+
    • OUTPUT: success or failure of allocation, time the allocation took, number of nodes and interswitch links allocated
  • runtests.pl
    • PURPOSE: run a number of tests with assign and assign+. The tests we run are the following:
      • t1. assign with given ptopfile and topfile
      • t2. assign+ with given ptopfile and topfile
      • t3. assign with given ptopfile and modified topfile to remove fixed nodes
      • t4. assign+ with given ptopfile and modified topfile to remove fixed nodes
      • t5. assign with given ptopfile and modified topfile to remove fixed nodes, and then include fixed nodes and fixed interfaces from the output of assign+ in step t4
    • INPUT: a file containing lines with ptopfile and topfile
    • OUTPUT: file testresults that contains performance of these two allocation algorithms for each line of input
  • processresults.pl
    • PURPOSE: process the results file generated in the previous step
    • INPUT: name of the results file
    • OUTPUT: statistics about successes and failures of assign and assign+. The code also produces files that contain inputs that lead to following potentially anomalous cases:
      • checktests - assign+ failed and assign didn't, i.e. t1 was a success and t2 was a failure
      • failedtests - both assign+ and assign failed, i.e. both t1 and t2 were failures
      • weirdtests - assign+ succeeded and assign didn't, i.e. t1 was a failure and t2 and t5 were a success
      • impossibletests - assign+ succeeded and assign didn't with fixed nodes from output of assign+, i.e. t1 was a failure, t2 was a success and t5 was a failure. This most often happens because assign cannot deal properly with fixed interfaces on links.

Additionally, the folder contains a version of assign I have used in testing, that will run on boss. It also contains two sets of test cases:

  • expinfotests - all attempted allocations from start of DeterLab until Spring 2013
  • expfailedtests - all failed allocations roughly from Jan 2012 until Spring 2013

Performance

Expinfotests

There were total of 109,326 tests. Historically, in real operation some of these allocations succeeded and some failed.

Category Count Percentage Reason
Both assign and assign+ succeeded 98,641 90% n/a
assign succeeded, assign+ failed 98 0.08% In 93 of these cases assign doesn't detect a disconnected switch topology nor oversubscribed interswitch bandwidth, but it should. In 2 cases we overestimate what we need in one vclass. I can fix this issue but it will have to wait. In 3 cases the ptopfile has an error - two interfaces on a physical node are called the same. assign doesn't pick up on this but assign+ does, which should be correct behavior. So effectively in only 2 out of 109,326 assign is better than assign+.
Both assign and assign+ failed 6,341 5.8% n/a
assign failed, assign+ succeeded, assign succeeded with fixed nodes from assign+ 666 0.61% There is a good solution but assign cannot find it in a given number of trials.
assign failed, assign+ succeeded, assign failed with fixed nodes from assign+ 3,580 3.24% This happens because assign cannot properly deal with fixed interfaces. I checked quite a few of these solutions manually and they are possible solutions.

Expfailedtests

There were total of 9,861 tests. Historically, all these allocations failed.

Category Count Percentage Reason
Both assign and assign+ succeeded 300 3% assign sometimes fails due to "luck". There is a good solution but it runs out of trials to find it.
assign succeeded, assign+ failed 4 0.04% In one case there was a disconnected switch topology. In other 3 cases assign does not properly check for OS support and ends up assigning nodes that cannot load a requested OS. Thus effectively there were 0 cases in these tests where assign was better than assign+.
Both assign and assign+ failed 9,409 95.4% n/a
assign failed, assign+ succeeded, assign succeeded with fixed nodes from assign+ 147 1.49% There is a good solution but assign cannot find it in a given number of trials.
assign failed, assign+ succeeded, assign failed with fixed nodes from assign+ 1 0.01% This happens because assign cannot properly deal with fixed interfaces. I checked quite a few of these solutions manually and they are possible solutions.

Performance Results

The image above shows the ratio of time taken by assign+ vs assign on y axis vs time taken by assign+ on x axis. Values of y smaller than 1 (blue line) are better. We can see that the ratio varies a lot for short times, e.g., under 10 seconds. Sometimes assign is better and sometimes assign+. As we go to more complex topologies that take longer to allocate assign+ becomes decidedly better, often taking 1/10 of the time that assign needs. In the extreme this is 1 minute for assign+ vs 10 minutes for assign.

The image above shows the ratio of the nodes allocated by assign+ vs assign on y axis vs number of nodes allocated by assign+ on x axis. All values are very close to 1 (blue line) indicating that both algorithms allocate similar numbers of nodes for a given topology.

The image above shows the number of interswitch links allocated by assign on y axis vs number of interswitch links allocated by assign+ on x axis. Values of y greater than 1 (above blue line) are better. We can see that assign+ often allocates 10-100 times fewer interswitch links than assign.

Last modified 11 years ago Last modified on Aug 7, 2013 4:00:35 PM

Attachments (3)

Download all attachments as: .zip