Version 3 (modified by sunshine, 14 years ago)

This page discusses the language used to design experiments. It should be usable both to design metadescriptions and perhaps to make them more specific to a particular experiment the user wants to run.

Examples

I'll start with a few examples of experiments that we should be able to design in this language.

  1. A botnet experiment where a worm infects some vulnerable hosts, which then organize into a P2P botnet with a botmaster and start exchanging C&C traffic. The experimenter wants to observe the evolution of the botnet and the amount of traffic that the botmaster receives. There are three classes of experiments here that need to be combined:
    1. an experiment where a worm spreads and infects vulnerable hosts
    2. an experiment where some hosts organize into a P2P network and somehow elect a botmaster
    3. an experiment where peers start exchanging some C&C botnet traffic
  1. A cache poisoning experiment where the attacker poisons a DNS cache to take over authority for a given domain. The attacker then creates a phishing page and tries to steal users' usernames/passwords. There are two classes of experiments that need to be combined:
    1. an experiment where a DNS cache is poisoned (a subclass of cache poisoning experiments)
    2. an experiment where a phishing attack is conducted via a Web page to steal usernames/passwords
  1. An ARP spoofing experiment where the attacker gets in between two nodes and then modifies their traffic. There are two classes of experiments that need to be combined:
    1. an experiment where the attacker performs ARP poisoning between two nodes
    2. an experiment where an attacker changes the traffic passing through it
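To make the idea of combining classes concrete, here is a minimal Python sketch of the first example. It is only an illustration, not a proposed syntax: the names `ExperimentClass`, `Experiment` and `combine`, and the class labels such as `worm_spread`, are hypothetical and do not come from any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ExperimentClass:
    """A reusable class of experiments (hypothetical name)."""
    name: str


@dataclass
class Experiment:
    """A concrete experiment built by combining experiment classes."""
    name: str
    parts: List[ExperimentClass] = field(default_factory=list)

    def combine(self, cls: ExperimentClass) -> "Experiment":
        # Record the class and return self so calls can be chained.
        self.parts.append(cls)
        return self


# The botnet example: three classes combined into one experiment.
botnet = (
    Experiment("botnet")
    .combine(ExperimentClass("worm_spread"))
    .combine(ExperimentClass("p2p_organization"))
    .combine(ExperimentClass("cc_traffic"))
)

print([part.name for part in botnet.parts])
```

The other two examples would be built the same way from their two classes each. How the combined classes share objects and ordering is exactly what the requirements below need to cover.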

Requirements

We may end up with a single language or a set of related languages. Here is what we need to express:

  1. Topologies - at the level of both individual nodes and groups of nodes. In fact, we should be expressing a logical topology of the experiment, made up of objects that do something in the experiment: generate traffic, change state, hold data, and so on. Whether these objects are generated individually or as a group of entities, and whether they are physical or virtual nodes, does not matter. The language should be expressive enough that the actual implementation of objects and the cardinality of each object are orthogonal to the topology description. We should, however, be able to give hints such as "these objects are in the same network", "these objects are on the same physical node" or "object A resides on object B".
  2. Timeline of events - we need to express the ordering of actions that objects will take in the experiment, along with their duration, repetition and concurrency. In some domains this is called a workflow. We also need to express state transitions in objects, but these should be deterministic consequences of actions, so they are implicit in the workflow. The workflow could be pre-created in the experiment design stage, generated manually during the experiment (mined from events that happen as the user takes manual actions), or a mix of the two.
  3. Invariants - we need to express what MUST happen in the experiment for it to be valid. Valid here means that the experiment belongs to the class of experiments whose metadescription we used, and that it meets any other conditions the user wants to impose. There are two types of invariants:
    1. those that deal with objects and their states ("cache must be poisoned")
    2. those that deal with events and their features ("traffic must flow from A to B for 5 minutes at 100Mbps")
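As a rough sketch of how the three requirements could fit together, the Python below models a logical topology with hints, a timeline of actions, and the two kinds of invariants. Everything here is an assumption made for illustration: the names `Topology`, `Action`, `Invariant`, `concurrent` and `violated`, the hint label `same_network`, the event fields (`src`, `dst`, `duration_s`, `rate_mbps`) and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Topology:
    """Logical objects plus placement hints; says nothing about cardinality."""
    objects: set = field(default_factory=set)
    hints: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, *names: str) -> None:
        self.objects.update(names)

    def hint(self, relation: str, a: str, b: str) -> None:
        # Hints may only refer to objects already declared in the topology.
        if a not in self.objects or b not in self.objects:
            raise ValueError(f"unknown object in hint: {a}, {b}")
        self.hints.append((relation, a, b))


@dataclass
class Action:
    """One timeline entry: who does what, when, for how long, how many times."""
    actor: str
    name: str
    start: float      # seconds from experiment start
    duration: float   # seconds per repetition
    repeat: int = 1

    @property
    def end(self) -> float:
        # Repetitions are assumed to run back to back.
        return self.start + self.duration * self.repeat


def concurrent(a: Action, b: Action) -> bool:
    """True if the two actions overlap in time."""
    return a.start < b.end and b.start < a.end


@dataclass
class Invariant:
    """A condition over object states and observed events."""
    description: str
    check: Callable[[Dict[str, str], List[dict]], bool]


def violated(invariants: List[Invariant], state: Dict[str, str],
             events: List[dict]) -> List[str]:
    """Return the descriptions of all invariants that do not hold."""
    return [inv.description for inv in invariants
            if not inv.check(state, events)]


topo = Topology()
topo.add("attacker", "dns_cache", "victim")
topo.hint("same_network", "dns_cache", "victim")

timeline = sorted([
    Action("victim", "browse", start=0, duration=600),
    Action("attacker", "poison_cache", start=60, duration=30, repeat=2),
], key=lambda action: action.start)

invariants = [
    # Type 1: deals with an object and its state.
    Invariant("cache must be poisoned",
              lambda state, events: state.get("dns_cache") == "poisoned"),
    # Type 2: deals with events and their features.
    Invariant("traffic must flow from A to B for 5 minutes at 100Mbps",
              lambda state, events: any(
                  e["src"] == "A" and e["dst"] == "B"
                  and e["duration_s"] >= 300 and e["rate_mbps"] >= 100
                  for e in events)),
]

state = {"dns_cache": "poisoned"}
events = [{"src": "A", "dst": "B", "duration_s": 300, "rate_mbps": 100}]
print(violated(invariants, state, events))
```

Note that the topology only names logical objects and hints. Whether `victim` is one physical node or a group of virtual ones is left to a later mapping step, which keeps implementation and cardinality orthogonal to the description.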