Version 66 (modified by sunshine, 12 years ago) (diff)

--

This page talks about language used to design experiments. It should both be used to design metadescriptions and perhaps to make them more specific to a particular experiment the user wants to run.

Examples

I'll start with a few examples of experiments first, that we should be able to design in this language.

  1. [BotnetExample]
  2. [CachePoisonExample]
  3. [MitmExample]

Expressiveness of the Meta-description

We may end up with a single language or a set of related languages. Here is what we need to express:

  1. Logical topology - both at the level of individual nodes or groups of nodes. We are expressing a logical topology of the experiment where there are objects that do something in the experiment - generate traffic, change state, hold data, whatever. Whether these objects are individually generated or generated as a group of entities, whether they are physical nodes or virtual, etc. does not matter. The expressiveness should be such that the actual implementation of objects and the cardinality of each object is orthogonal to the topology description. We should however be able to give hints such as "these objects are in the same network or on same physical node or object A resides on object B". Here is a rough list of hints we'd like to be able to give:
    • What type of object this is - we need to enumerate all possible types such as Node, Info, DNSRecord, ... As new meta-descriptions and new generators are added type list will grow
    • What is the cardinality of this object - this is just a hint about cardinality. I need to be able to say "this is one of many" vs "one and only one", like in [BotnetExample] I should have many vulnerable hosts, but in [MitmExample] I have exactly two nodes and one attacker in the middle of them. I could have multiple MITM triplets but the minimum size is 1 triplet.
    • Can objects overlap or are they distinct
    • An object is (not) located on another object (e.g., cache on a node)
    • An object is (not) contained in another object (e.g. cache record in cache)
  1. Timeline of events - we need to express the ordering of actions that some objects will take in the experiment, their duration, repetition and concurrency. We also need to express state transitions in objects. In some domains this is called a workflow. It could be pre-created in the experiment design stage or it could be generated manually during the experiment (mined from events that happen as user takes manual actions) or a mix of those. Each experiment class must have some default workflow that user can manipulate during experiment design. Here is a rough list of things to express here:
    • Parameters of an action
    • An action must happen vs may happen
    • Additional actions are allowed vs not allowed
    • An action can(not) be split into multiple smaller pieces that have the same effect
    • State changes in objects due to some action, at random, due to a timeout ... State changes that generate an action
    • Conditions that lead to state change or to an action
    • One (or none or N) of a number of actions must happen
    • Loops with and without conditions
  1. Invariants - we need to express what MUST happen in the experiment for it to be valid. This is not a complete set, just the necessary one. If any of the invariants were violated the experiment would become invalid. Valid here means "for it to belong to a class of experiments whose metadescription we used" plus any other conditions that user wants to impose. There are two types of invariants:
    1. those that deal with objects and their states ("cache must be poisoned")
    2. those that deal with events and their features ("traffic must flow from A to B for 5 minutes at 100Mbps")

In general case invariants are defined in the logical topology and timeline of events. Additional invariants may be defined in this section but so far I had hard time coming up with those for a meta-description. Naturally when the user designs her experiment starting from the metadescription this will lead to more invariants being defined automatically and to some that a user can choose to define.

Note that intentionally this is all pretty high-level and is orthogonal to any generator used to generate topologies, traffic, etc. There must be a mapping process that selects eligible generators for each dimension and takes their output and maps objects and events to it. More about this mapping process later.

Domain Knowledge

Generator Descriptions

Mapping elements of metadescription to generators during experiment design

TODO

  • How is ordering of events defined?
  • What do we denote "all", "each", "none", "some"
  • How do we denote state transitions because of an event, vs. self-initiated, vs. those that emit an event