Version 59 (modified by sunshine, 12 years ago)

This page describes the language used to design experiments. It should be usable both to write metadescriptions and to make them more specific to a particular experiment the user wants to run.

Examples

I'll start with a few examples of experiments that we should be able to design in this language.

  1. [BotnetExample]
  2. [CPExample]
  3. [MITMExample]

  1. An ARP spoofing experiment where the attacker places itself between two nodes and then modifies their traffic. Two classes of experiments need to be combined:
    1. an experiment where the attacker performs ARP poisoning against two nodes
    2. an experiment where an attacker changes traffic passing through it

Requirements

We may end up with a single language or a set of related languages. Here is what we need to express:

  1. Logical topologies - at the level of both individual nodes and groups of nodes. We are expressing a logical topology of the experiment, in which objects do something in the experiment: generate traffic, change state, hold data, and so on. Whether these objects are generated individually or as a group of entities, whether they are physical or virtual nodes, etc. does not matter. The language should be expressive enough that the actual implementation of objects and the cardinality of each object are orthogonal to the topology description. We should, however, be able to give hints such as "these objects are in the same network", "these objects are on the same physical node" or "object A resides on object B". These hints are called constraints.
  2. Timeline of events - we need to express the ordering of actions that objects take in the experiment, along with their duration, repetition and concurrency. We also need to express state transitions in objects. In some domains this is called a workflow. It could be created in advance at the experiment design stage, generated manually during the experiment (mined from events that happen as the user takes manual actions), or a mix of the two. Each experiment class must have some default workflow that the user can manipulate during experiment design.
  3. Invariants - we need to express what MUST happen in the experiment for it to be valid. Valid here means "it belongs to the class of experiments whose metadescription we used", plus any other conditions that the user wants to impose. There are two types of invariants:
    1. those that deal with objects and their states ("cache must be poisoned")
    2. those that deal with events and their features ("traffic must flow from A to B for 5 minutes at 100Mbps")
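The three dimensions above could be captured in a small data model. The following is only a sketch under my own assumptions: the names Constraint, Event, Flow and Experiment, the relation vocabulary, and the reading of "at 100Mbps" as "at least 100 Mbps" are all hypothetical and not part of any agreed language.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Constraint:
    """Topology hint, e.g. Constraint("resides_on", "cache", "node1")."""
    relation: str  # hypothetical vocabulary: "resides_on", "same_network", ...
    subject: str
    target: str


@dataclass(frozen=True)
class Event:
    """Timeline entry; `after` names the events that must precede this one."""
    name: str
    actor: str
    target: str
    after: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Flow:
    """An observed traffic event, used by event invariants."""
    src: str
    dst: str
    seconds: float
    mbps: float


@dataclass
class Experiment:
    objects: Dict[str, str] = field(default_factory=dict)  # name -> kind
    constraints: List[Constraint] = field(default_factory=list)
    timeline: List[Event] = field(default_factory=list)
    # Invariants are predicates over (object states, observed flows).
    invariants: List[Callable[[dict, List[Flow]], bool]] = field(default_factory=list)

    def is_valid(self, states: dict, flows: List[Flow]) -> bool:
        """A run is valid only if every invariant holds."""
        return all(inv(states, flows) for inv in self.invariants)


def cache_poisoned(states, flows):
    # State invariant: "cache must be poisoned".
    return states.get("cache") == "poisoned"


def sustained_flow(states, flows):
    # Event invariant: "traffic must flow from A to B for 5 minutes at 100Mbps"
    # (assumption: read as at least 300 seconds and at least 100 Mbps).
    return any(f.src == "A" and f.dst == "B" and f.seconds >= 300 and f.mbps >= 100
               for f in flows)
```

Keeping invariants as plain predicates means object-state checks and event checks share one validation path, while the topology and timeline stay independent of whichever generator realizes them.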

Note that all of this is intentionally high-level and orthogonal to any generator used to generate topologies, traffic, etc. There must be a mapping process that selects eligible generators for each dimension, takes their output and maps objects and events to it. More about this mapping process later.

Diving in

I'll now ignore the question of which language to use to design experiments, because I think that pretty much any language can be used once we know what we want to say. To figure this out I'll try to use some variation of UML that can express both protocol diagrams and state transitions. If the level of detail is right, we can decide on an appropriate language in the next step.

Example 3: ARP poisoning with MITM attack

This example uses two metadescriptions. The first is ARP poisoning, which is a flavor of cache poisoning, and the second is the MITM attack.

ARP poisoning metadescription

This is a special case of cache poisoning where the target is the ARP cache. I've highlighted the customizations made to the general cache poisoning metadescription to arrive at this one.

Dimensions:

  • Logical topology: [figure "arpcpobj.jpg" not attached]

(in English: There is one attacker node. There is a fakeIP of type IPaddress. A cache is simply a collection of one or more ARPRecord items. These are subtypes of Info, and the domain knowledge DB defines the syntax of an ARPRecord. The cache does not reside at the attacker.)

  • Timeline of events:

[figure "arpwf.jpg" not attached]

(in English: The attacker sends an ARP reply that maps a MAC address to somebody's IP address. This really could be anybody's MAC address, but in most cases it is the attacker's.)

  • Invariants:

Nothing in addition to the topology and timeline above.
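Since the diagrams are missing, here is a rough sketch of the same metadescription in code, built from the English descriptions above. The field names, the `host` attribute and the `reply` function are my own assumptions; only Info, IPaddress, ARPRecord and the notion of a cache come from the text.

```python
from dataclasses import dataclass
from typing import List


class Info:
    """Base type for pieces of information (stand-in for the domain knowledge DB)."""


@dataclass(frozen=True)
class IPaddress(Info):
    value: str


@dataclass(frozen=True)
class ARPRecord(Info):
    ip: IPaddress
    mac: str  # hardware address the IP is mapped to


@dataclass
class Cache:
    host: str                 # node the cache resides on (hypothetical field)
    records: List[ARPRecord]  # one or more records

    def __post_init__(self):
        if not self.records:
            raise ValueError("a cache holds one or more ARPRecord items")


def check_topology(attacker: str, cache: Cache) -> bool:
    """Constraint from the topology: the cache does not reside at the attacker."""
    return cache.host != attacker


def reply(cache: Cache, fake_ip: IPaddress, mac: str) -> Cache:
    """Timeline event: an ARP reply after which the cache maps fake_ip to mac."""
    kept = [r for r in cache.records if r.ip != fake_ip]
    return Cache(cache.host, kept + [ARPRecord(fake_ip, mac)])
```

For example, `reply(cache, IPaddress("10.0.0.2"), attacker_mac)` returns a new cache in which that IP resolves to the attacker, leaving the original cache untouched so both states can be compared.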

MITM attack metadescription

Dimensions:

  • Logical topology: [figure "mitmobj.jpg" not attached]

(in English: There is one attacker node, and two regular nodes who want to communicate. These are all different nodes.)

  • Timeline of events:

[figure "mitmwf.jpg" not attached]

(in English: The attacker replaces each message between the nodes with a modified version.)

  • Invariants:

Nothing in addition to the topology and timeline above.
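The MITM metadescription is small enough to sketch in a few lines. The function names and the use of bytes for messages are assumptions made for illustration only.

```python
from typing import Callable, Iterable, List


def check_distinct(attacker: str, node_a: str, node_b: str) -> bool:
    """Topology: the attacker and the two communicating nodes are all different."""
    return len({attacker, node_a, node_b}) == 3


def mitm(messages: Iterable[bytes], mod: Callable[[bytes], bytes]) -> List[bytes]:
    """Timeline: every message between the nodes is replaced by mod(message)."""
    return [mod(m) for m in messages]
```

Passing `mod` as a parameter keeps the metadescription generic: the user picks the concrete modification when specializing it for an experiment.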

Experiment design

Now I'm a user who wants to design an experiment. I need to combine two metadescriptions (ARP poisoning and MITM attack) and somehow tie them down to generator choices. To combine them, I'll do something like this:

[figure "arpmitmcomb.jpg" not attached]

That is, the ARP experiment needs to run twice to create the mappings at node1 and node2 that put the attacker on the path between node1 and node2. The caches we're poisoning are at node1 and node2. The poison links the IP address of node2 (in node1's cache) and the IP address of node1 (in node2's cache) with the attacker's MAC address.
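A toy simulation can make the combination concrete. This is a sketch under stated assumptions, not the intended implementation: the addresses are made up, caches are modeled as plain dictionaries from IP to hardware address, and the attacker is assumed to forward every modified message to its real destination.

```python
# Made-up addresses for the three nodes in the combined experiment.
MAC = {"node1": "00:00:00:00:00:01", "node2": "00:00:00:00:00:02",
       "attacker": "00:00:00:00:00:0a"}
IP = {"node1": "10.0.0.1", "node2": "10.0.0.2"}
OWNER = {mac: name for name, mac in MAC.items()}


def arp_poison(caches, victim, fake_ip, mac):
    """One run of the ARP experiment: victim's cache now maps fake_ip to mac."""
    caches[victim][fake_ip] = mac


def send(caches, src, dst, payload, mod):
    """Deliver payload from src to dst, following src's ARP cache.

    If the cache points at the attacker, the attacker applies `mod`
    and forwards the result to the real destination.
    """
    first_hop = OWNER[caches[src][IP[dst]]]
    if first_hop == "attacker":
        return [src, "attacker", dst], mod(payload)
    return [src, dst], payload


def combined_experiment(mod):
    # Clean caches: each node knows the other's real hardware address.
    caches = {"node1": {IP["node2"]: MAC["node2"]},
              "node2": {IP["node1"]: MAC["node1"]}}
    # ARP poisoning runs twice, once per victim cache.
    arp_poison(caches, "node1", IP["node2"], MAC["attacker"])
    arp_poison(caches, "node2", IP["node1"], MAC["attacker"])
    # MITM: traffic in both directions now crosses the attacker.
    return (send(caches, "node1", "node2", "hello", mod),
            send(caches, "node2", "node1", "hi", mod))
```

Calling `combined_experiment(str.upper)` yields, for each direction, the path taken and the payload that arrives, so the invariant "the attacker is on the path" can be checked directly on the returned paths.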

The system now needs to offer me several generators:

  • It should offer a topology generator and map the nodes (Node1, Node2, Attacker) to the topology that gets generated. Caches have to reside at Node1 and Node2.
  • It should offer an event generator for each of the events: reply (for ARP) and mod (for messages).
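One possible shape for this selection step is a catalog lookup followed by a node mapping. Everything here is hypothetical: the generator names, the `provides` capability sets and the host names are invented for illustration and do not refer to any existing generator.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Generator:
    name: str
    dimension: str            # "topology" or "event"
    provides: FrozenSet[str]  # roles or event names it can realize


# Hypothetical catalog; none of these names come from a real system.
CATALOG = [
    Generator("lan_topo", "topology", frozenset({"node", "cache_host"})),
    Generator("arp_replier", "event", frozenset({"reply"})),
    Generator("payload_rewriter", "event", frozenset({"mod"})),
]


def eligible(dimension: str, needed: str) -> List[str]:
    """Names of generators in the given dimension that can realize `needed`."""
    return [g.name for g in CATALOG
            if g.dimension == dimension and needed in g.provides]


def map_nodes(logical: List[str], hosts: List[str]) -> Dict[str, str]:
    """Map each logical node to one generated host, one-to-one."""
    if len(hosts) < len(logical):
        raise ValueError("topology has too few hosts")
    return dict(zip(logical, hosts))
```

With a mapping such as `map_nodes(["Node1", "Node2", "Attacker"], hosts)`, the constraint that caches reside at Node1 and Node2 reduces to placing each cache on the host that its node was mapped to.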

TODO

  • How is ordering of events defined?
  • How do we denote "all", "each", "none", "some"?
  • How do we denote state transitions caused by an event vs. self-initiated ones vs. those that emit an event?