This page discusses the language used to design experiments. It should be usable both to design metadescriptions and, perhaps, to make them more specific to a particular experiment the user wants to run.
I'll start with a few examples of experiments that we should be able to design in this language.
We may end up with a single language or a set of related languages. Here is what we need to express:
Note that this is all intentionally pretty high-level and orthogonal to any generator used to generate topologies, traffic, etc. There must be a mapping process that selects eligible generators for each dimension, takes their output, and maps objects and events to it. More about this mapping process later.
I'll now set aside the question of which language to use to design experiments, because I think pretty much any language can be used once we know what we want to say. To figure this out I'll try to use a mix of FSAs (finite state automata), protocol diagrams, and Arun's adaptation of TLA (Temporal Logic of Actions) to describe the example experiments from above. If the level of detail is right, we can decide on an appropriate language in the next step.
This example used two metadescriptions. Let's go through each of them:
Dimensions:
(in English: There must be two sets of hosts, with at least one infected host in the infected set and at least one vulnerable host in the vulnerable set. There can be a third set of hosts that are neither vulnerable nor infected. These sets are disjoint. All objects here are of type Nodes.)
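The set constraints above can be checked mechanically. Here is a minimal sketch in Python, assuming hosts are identified by plain strings. The function name `check_worm_dimensions` and the host names are hypothetical and only serve as illustration.

```python
def check_worm_dimensions(infected, vulnerable, other=frozenset()):
    """Return a list of violated constraints (empty if the dimensions are valid)."""
    infected, vulnerable, other = set(infected), set(vulnerable), set(other)
    problems = []
    if len(infected) < 1:
        problems.append("need at least one infected host")
    if len(vulnerable) < 1:
        problems.append("need at least one vulnerable host")
    # The three sets must be pairwise disjoint.
    if infected & vulnerable or infected & other or vulnerable & other:
        problems.append("host sets must be disjoint")
    return problems


# Hypothetical host names, for illustration only.
ok = check_worm_dimensions({"h1"}, {"h2", "h3"}, {"h4"})
bad = check_worm_dimensions({"h1"}, {"h1"})
```

Returning a list of problems rather than raising on the first one lets a design tool show the user everything that is wrong with a candidate experiment at once.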
(in English: Each infected host generates scan events that target a vulnerable host; a double line means one object acts upon another. There is at least one such event per vulnerable host and at least one scan + vulnerable host pair in the experiment. Once an infection event occurs on a vulnerable host, it transitions to the infected state. An infected host may also scan other, non-vulnerable hosts.)
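The behaviour described above amounts to a small state machine per host. The following Python sketch is one possible encoding, not a proposal for the final language. The `HostState` enum, the `(kind, source, target)` event tuples and the host names are all assumptions made for illustration.

```python
from enum import Enum


class HostState(Enum):
    VULNERABLE = "vulnerable"
    INFECTED = "infected"
    OTHER = "other"  # neither vulnerable nor infected


# (state, event) -> next state. Anything not listed leaves the state unchanged.
TRANSITIONS = {(HostState.VULNERABLE, "infection"): HostState.INFECTED}


def step(state, event):
    return TRANSITIONS.get((state, event), state)


def run(hosts, events):
    """hosts: dict name -> HostState; events: list of (kind, source, target)."""
    hosts = dict(hosts)
    for kind, source, target in events:
        # Only infected hosts are allowed to generate scan events.
        if kind == "scan" and hosts[source] is not HostState.INFECTED:
            raise ValueError(f"{source} is not infected and cannot scan")
        hosts[target] = step(hosts[target], kind)
    return hosts


final = run(
    {"a": HostState.INFECTED, "b": HostState.VULNERABLE, "c": HostState.OTHER},
    [("scan", "a", "b"), ("infection", "a", "b"), ("scan", "b", "c")],
)
```

In this run host `b` becomes infected and may then scan `c`, which stays in its original state because it is not vulnerable.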
Note that I haven't yet defined what scan and infection events mean. I have to do this somewhere, but I think the right place is a common repository of domain knowledge rather than each experiment class, since many classes of experiments may need the same definitions. Ultimately, what I'd like these definitions to say in plain English is:
Also note that I haven't said whether each infected host scans ALL or SOME vulnerable hosts, or how many. We should have a mechanism to specify this. The same comment applies to any "acts upon" relationship.
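One way such a mechanism could look is to attach a quantifier to every "acts upon" relationship. This is only a sketch: the `ActsUpon` class, its `quantifier` and `count` fields, and the host names are hypothetical, and choosing SOME targets uniformly at random is an assumption rather than anything decided here.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActsUpon:
    event: str
    quantifier: str = "some"      # "all" or "some"
    count: Optional[int] = None   # for "some": number of targets per actor (default 1)

    def targets_for(self, actor, candidates, rng):
        """Pick the objects that `actor` acts upon, excluding the actor itself."""
        pool = sorted(c for c in candidates if c != actor)
        if self.quantifier == "all":
            return pool
        if self.quantifier == "some":
            k = 1 if self.count is None else self.count
            if k > len(pool):
                raise ValueError("not enough candidate targets")
            return sorted(rng.sample(pool, k))
        raise ValueError(f"unknown quantifier: {self.quantifier}")


rng = random.Random(0)  # seeded so that an experiment can be reproduced
vulnerable = {"v1", "v2", "v3"}
scan_all = ActsUpon("scan", "all").targets_for("i1", vulnerable, rng)
scan_two = ActsUpon("scan", "some", count=2).targets_for("i1", vulnerable, rng)
```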
(in English: There must be two sets of hosts, with at least two eligible peers and at least one leader. Nothing is said about the relationship between the sets, so they may overlap. All objects here are of type Nodes.)
(in English: Each eligible peer contacts some other eligible peers, asking them to peer with it. If they agree, both go into the "peer" state; otherwise both revert to the eligible peer state. In any of the peer states (elpeer, maypeer, peer) an object may somehow learn about a leader and go into the lpeer state, in which it knows the leader's identity. In the lpeer state it may learn about other leaders as well. An object in the lpeer state receives commands from the leader and may report back to the leader.)
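The peer life cycle above can be written down as a transition table. The sketch below is one reading of the description, with two assumptions that the text does not state: `wannapeer` moves a peer into `maypeer`, and a refusal is modelled by a hypothetical `nopeer` event.

```python
# (state, event) -> next state
P2P_TRANSITIONS = {
    ("elpeer", "wannapeer"): "maypeer",  # assumption: asking to peer enters maypeer
    ("maypeer", "yespeer"): "peer",
    ("maypeer", "nopeer"): "elpeer",     # "nopeer" is a hypothetical event name
    # A leader can be learned in any of the three peer states.
    ("elpeer", "leaderis"): "lpeer",
    ("maypeer", "leaderis"): "lpeer",
    ("peer", "leaderis"): "lpeer",
    # In lpeer the object may learn of more leaders, receive commands and report.
    ("lpeer", "leaderis"): "lpeer",
    ("lpeer", "cmd"): "lpeer",
    ("lpeer", "report"): "lpeer",
}


def run_peer(events, state="elpeer"):
    """Replay a sequence of events and return the final state."""
    for event in events:
        key = (state, event)
        if key not in P2P_TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {state!r}")
        state = P2P_TRANSITIONS[key]
    return state


accepted = run_peer(["wannapeer", "yespeer", "leaderis", "cmd", "report"])
rejected = run_peer(["wannapeer", "nopeer"])
```

Unlike a table that silently ignores unknown events, this one rejects them, so a `cmd` arriving before any leader is known is flagged as a design error.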
Note that I haven't defined what the wannapeer, yespeer, leaderis, cmd and report events are; I should define them in the common domain knowledge base.
Now I'm a user who wants to design my experiment. I need to combine the two metadescriptions and somehow tie them to generator choices. To combine them, I need to specify how the outputs of the worm metadescription match the inputs of the P2P metadescription. This is simple and I'll just do something like:
i.e. each infected host becomes elpeer.
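A rough sketch of how such a composition could be represented, assuming each metadescription exposes its states by name. The `COMPOSITION` table, the `compose` function and the host names are hypothetical.

```python
# (source metadescription, output state) -> (target metadescription, input state)
COMPOSITION = {("worm", "infected"): ("p2p", "elpeer")}


def compose(worm_states):
    """Given the worm state of each host, return the initial P2P state of each host."""
    p2p_states = {}
    for host, state in worm_states.items():
        target = COMPOSITION.get(("worm", state))
        if target is not None:
            _, p2p_state = target
            p2p_states[host] = p2p_state
    return p2p_states


initial_p2p = compose({"h1": "infected", "h2": "vulnerable", "h3": "infected"})
```

Hosts that are not infected get no P2P state at all, so they do not take part in the second metadescription.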
The system now needs to offer me several generators:
The user either chooses each generator or accepts the default one for each choice. The user can then manipulate the generators (their parameters) and the workflow. For example, the user may add a "patched" state after the "infected" one, with a "patch" event triggering the transition.
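Both steps can be sketched in a few lines of Python. Everything named here is an assumption: the dimensions `topology` and `traffic` are only borrowed from the earlier remark about generators, the generator names are placeholders, and treating the first eligible generator as the default is just one possible convention.

```python
# Hypothetical catalogue: dimension -> eligible generators, first entry is the default.
ELIGIBLE = {
    "topology": ["gen_topo_a", "gen_topo_b"],
    "traffic": ["gen_traffic_a"],
}


def choose_generators(user_choices=None):
    """Use the user's choice where given, otherwise fall back to the default."""
    user_choices = user_choices or {}
    selected = {}
    for dimension, options in ELIGIBLE.items():
        choice = user_choices.get(dimension, options[0])
        if choice not in options:
            raise ValueError(f"{choice!r} is not eligible for {dimension!r}")
        selected[dimension] = choice
    return selected


# The workflow as a (state, event) -> next state table, extended by the user.
workflow = {("vulnerable", "infection"): "infected"}
workflow[("infected", "patch")] = "patched"

selected = choose_generators({"topology": "gen_topo_b"})
```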
This example used two metadescriptions. Let's go through each of them:
Dimensions:
(in English: There are attacker and legitimate client nodes, one each. There is one true resource and one fake resource, both of type Info, which means that each is a piece of text someone gives to the legitimate client when it asks for that resource. A cache is simply a collection of one or more Info items. The attacker and the legitimate client are disjoint sets.)
Note: I haven't said whether these objects are nodes, resources, or something else. I should have a way of saying this.
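One way to make the object types explicit is to give each kind of object its own type and check the dimensions against it. In this sketch the `Info` class, its fields, the function `check_cache_dimensions` and the sample values are all hypothetical, and treating the attacker and the legitimate client as nodes named by strings is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Info:
    name: str
    text: str  # what the legitimate client receives when it asks for this resource


def check_cache_dimensions(attackers, clients, cache):
    """Return a list of violated constraints (empty if the dimensions are valid)."""
    problems = []
    if len(attackers) != 1:
        problems.append("need exactly one attacker")
    if len(clients) != 1:
        problems.append("need exactly one legitimate client")
    if set(attackers) & set(clients):
        problems.append("attacker and legitimate client must be disjoint")
    if len(cache) < 1 or not all(isinstance(item, Info) for item in cache):
        problems.append("cache must hold one or more Info items")
    return problems


true_res = Info("resource", "true answer")
fake_res = Info("resource", "fake answer")
valid = check_cache_dimensions({"attacker"}, {"client"}, {true_res})
invalid = check_cache_dimensions({"n1"}, {"n1"}, set())
```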