Changes between Initial Version and Version 1 of LifeCycle


Timestamp: May 4, 2011 2:16:51 PM
Author: Alefiya Hussain

== Design Notes ==

This document serves the following purposes:
* It discusses the high-level concepts of an experiment life cycle. This design is preliminary and constantly evolving, and this document will be updated periodically.
* Building on the concepts and constructs within the experiment lifecycle, it describes how ELM integrates with fedd, SEER, and CEDL.
* It discusses goals for the August review.

== Overview ==
The current experimental testbed services primarily focus on providing experimenters access to testbed resources, with little or no help to configure, correctly execute, and systematically analyze the experiment data and artifacts. Additionally, while it is well known that experimentation is inherently iterative, there are limited mechanisms to integrate and cumulatively build upon experimentation assets and artifacts during the configure-execute-analyze phases of the experiment lifecycle.

The Eclipse ELM plug-in provides an integrated environment with a large collection of tools and workbenches to support and manage artifacts from all three phases of the experiment life cycle. Each workbench, or perspective, integrates several tools to support a specific experimentation activity. It provides a consistent interface for easy invocation of tools and tool chains, along with access to data repositories to store and recall artifacts in a uniform way. For example, the topology perspective allows the experimenter to define a physical topology by merging topology elements based on the specified constraints and then validate the resultant topology.
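
As a rough sketch of that merge-and-validate flow, the following minimal example uses the networkx Python library; the element names and the connectivity check are illustrative assumptions, not the ELM implementation:

{{{#!python
import networkx as nx

# Two hypothetical topology fragments defined by the experimenter.
attacker_net = nx.Graph([("attacker-1", "internet-cloud"),
                         ("attacker-2", "internet-cloud")])
service_net = nx.Graph([("ids", "wan-cloud"), ("wan-cloud", "service")])

# Merge the fragments into a single candidate topology.
topology = nx.compose(attacker_net, service_net)
topology.add_edge("internet-cloud", "ids")  # link the two fragments

# A simple validation pass before the topology is realized on the testbed.
assert nx.is_connected(topology), "topology has unreachable elements"
}}}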

The key capabilities of the ELM plug-in include:
* Mechanisms to record variations and derivations of the experiment assets and artifacts, along with their inter-relationships, for the entire set of tasks over which an experimenter iterates during the study.

* Inform design and analysis tools to obtain maximum information with the minimum number of experiment trials for a particular study. Every measured value in an experiment is fundamentally a random variable, so measurements vary slightly between trials even when all experimentation factors are kept constant. Characterizing such stochastic behavior therefore requires executing multiple repetitions and identifying confidence levels. Leveraging the tools in the analysis phase, feedback can be used to control the number of repetitions required for statistically significant results.

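As a sketch of how analysis-phase feedback could drive the repetition count (standard-library Python only; the measurement values and the 95% interval criterion are illustrative assumptions):

{{{#!python
import statistics

def needs_more_trials(measurements, rel_width=0.05, z=1.96):
    """Return True if the ~95% confidence interval around the mean is
    wider than rel_width * mean, i.e. more repetitions are needed."""
    n = len(measurements)
    if n < 2:
        return True
    mean = statistics.mean(measurements)
    sem = statistics.stdev(measurements) / n ** 0.5  # standard error
    return (z * sem) > rel_width * abs(mean)

# Hypothetical response-time measurements (ms) from repeated trials.
trials = [102.3, 98.7, 101.1, 99.4]
print(needs_more_trials(trials))  # False: interval already tight enough
}}}
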
* Facilitate composition of functional and structural elements of the experiment based on stated and unstated constraints. The ELM workbenches allow creating and linking functional elements of the experiment without specifying the underlying structure and topology. Resolving the constraints to configure a set of realizable and executable experiment trials is a complex constraint satisfaction problem.

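A toy illustration of that resolution step, using only the Python standard library; the component choices and the single constraint are invented for this example:

{{{#!python
from itertools import product

# Hypothetical choices for each functional element.
choices = {
    "attacker": ["volume", "stealth"],
    "ids":      ["snort", "bro"],
    "link_bw":  [10, 100, 1000],  # Mb/s
}

# A stated constraint: volume attacks need at least 100 Mb/s links.
def satisfies(config):
    return not (config["attacker"] == "volume" and config["link_bw"] < 100)

candidates = [dict(zip(choices, combo)) for combo in product(*choices.values())]
realizable = [c for c in candidates if satisfies(c)]
print(len(candidates), "candidates,", len(realizable), "realizable trials")
}}}
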
* Facilitate experiment monitoring and analysis for accuracy of results and availability of resources and services. ELM+SEER will enable monitoring the experiment configuration and the performance of resources to ensure the experiment is executed correctly. While resource misconfigurations and failures are easier to spot, identifying "incorrect performance" of a resource or service is extremely hard. For stochastic processes, as typically seen in networked systems, it is very important to be able to identify such experimentation errors, as they can significantly impact results and bias measurements.

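One simple form such a check might take, sketched here with invented throughput numbers; real detection of incorrect performance would need far richer models than this deviation test:

{{{#!python
import statistics

def flag_suspect(history, latest, z_threshold=3.0):
    """Flag a measurement that deviates more than z_threshold standard
    deviations from the history collected for the same configuration."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(latest - mean) > z_threshold * stdev

# Hypothetical per-trial throughput samples (Mb/s) for one link.
history = [94.1, 95.6, 93.8, 94.9, 95.2]
print(flag_suspect(history, 95.0))  # False: within normal variation
print(flag_suspect(history, 60.0))  # True: likely misbehaving resource
}}}
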
* Enable reuse of experiment assets and artifacts. Reuse is driven by the ability to discover the workflows, scenarios, and data. The ELM environment will provide a registry and registry views, along with (RDF-based, DAML+OIL) metadata to facilitate the discovery process. ELM will provide tools to index and search semantically rich descriptions and identify experimentation components, including models, workflows, services, and specialized applications. To promote sharing, ELM will provide annotation workbenches that allow experimenters to add sufficient metadata and dynamically link artifacts based on these annotations.

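A minimal sketch of such RDF metadata using the rdflib Python library; the vocabulary, URIs, and property names are placeholders, not a DETER or ELM schema:

{{{#!python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/elm/")  # hypothetical vocabulary

g = Graph()
artifact = URIRef("http://example.org/artifacts/attack-trace-42")
g.add((artifact, RDF.type, EX.Dataset))
g.add((artifact, EX.producedBy, EX.VolumeAttackModel))
g.add((artifact, EX.description, Literal("SYN flood trace, 100 Mb/s link")))

# Discover all artifacts produced by a given model.
q = """SELECT ?a WHERE { ?a <http://example.org/elm/producedBy>
                             <http://example.org/elm/VolumeAttackModel> }"""
for row in g.query(q):
    print(row.a)
}}}
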
* Support for multi-party experiments, where a particular scenario can be personalized for a team in an ''appropriate'' way by providing restricted views and control over only certain aspects of the experiment. The registry view will allow the team to access only a restricted set of services. The analysis perspectives and views will present relevant animations and graphs to the team. Thus, by personalizing a scenario view, the same underlying scenario can be manipulated and observed in different ways by multiple teams.

We define a '''scenario''' to encompass related experiments used to explore a scientific inquiry. The scenario explicitly couples the experimenter's '''intent''' with the '''apparatus''' to create a series of related experiment trials. The experimenter's intent is captured as '''workflows''' and '''invariants'''. A workflow is a sequence of interdependent actions or steps, and invariants are properties of an experiment that should remain unchanged throughout the lifecycle. The '''apparatus''', on the other hand, includes the topology and services that are instantiated on the testbed during the execution phase. Separating the experimentation intent from the apparatus also enables experiment portability, where the underlying apparatus can consist of heterogeneous, abstract, and virtualized experiment elements.
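
Very roughly, these constructs might be captured by data structures like the following sketch (plain Python dataclasses; the field names are our own illustration, not a fixed ELM schema):

{{{#!python
from dataclasses import dataclass, field

@dataclass
class Intent:
    workflows: list   # ordered, interdependent actions or steps
    invariants: list  # properties that must hold through the lifecycle

@dataclass
class Apparatus:
    topology: dict  # possibly abstract / virtualized elements
    services: list  # services instantiated at execution time

@dataclass
class Scenario:
    """Couples experimenter intent with the apparatus to yield trials."""
    intent: Intent
    apparatus: Apparatus
    trials: list = field(default_factory=list)
}}}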

== Steps for creating an experiment ==
Given the above ELM environment, the basic process of creating a scenario consists of the following steps in a spiral:

'''Composition Phase'''
* defining the functional components and functional topology of the study
* defining the abstractions, models, parameters, and constraints for each functional component
* identifying/defining the experiment workflow and invariants
* identifying/defining the structural physical topology
* composing the experiment trials by resolving the constraints and exploring the parameter space

'''Execution Phase'''
* sequential or batched execution of experiment trials
* monitoring for errors and configuration problems

'''Analysis Phase'''
* analyzing completed trials (some trials may still be executing)
* presenting results to the experimenter
* feeding parameters back into the composition tools
* annotating data and artifacts and storing them in the repositories
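
Stitched together, the spiral could be driven by a control loop like this sketch, where compose_trials, execute, and analyze are placeholders for the ELM, testbed, and SEER tooling:

{{{#!python
def run_spiral(scenario, compose_trials, execute, analyze, max_rounds=5):
    """One pass per spiral turn: compose -> execute -> analyze -> feed back."""
    feedback = None  # compose_trials must accept None on the first turn
    for _ in range(max_rounds):
        trials = compose_trials(scenario, feedback)  # composition phase
        results = [execute(t) for t in trials]       # execution phase
        feedback = analyze(results)                  # analysis phase
        # Assumption: analyze() returns a dict with a "significant" flag
        # once enough repetitions have accumulated.
        if feedback.get("significant"):
            break
    return feedback
}}}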

== Integration with DETER Technologies ==
The diagram below describes how ELM, fedd, SEER, and CEDL interact:

{{{
   +----------------------------+
   v                            |
  ELM --> CEDL --> fedd --> SEER
                    |        ^
                    +--------+
}}}

(Placeholder: need to update.)

== August Review Demo ==

Suppose my intent is to study the response time of an intrusion detection system. I design a scenario that connects attacker components to the IDS component through an internet-cloud component. The IDS component is then connected to a service component through a wan-cloud component, as shown below.

[[File:Attacker-ids.png]]

I am interested in exploring the effects of the attacker on the response time of the IDS, and not in any other aspect of the experiment. The ELM framework should then enable me, the experimenter, to focus solely on creating a battery of experimentation trials by varying the number of attacker components, the attacker model, the model parameters, etc. All other aspects of the experiment should be defined, configured, controlled, and monitored based on standard experimentation methodologies and practices.

Each component that affects the response time of the IDS and has several alternatives is called a ''factor''. In the above example, there are four factors: attacker type, internet-cloud type, wan-cloud type, and service type. The alternative models that a factor can assume are called ''levels''. Thus the attacker type has two levels: volume attack and stealth attack. Each level can be further parameterized to give additional sub-levels, for example, low-volume vs. high-volume attacks.

Factors whose effects need to be quantified are called primary factors; for example, in the above study we are interested in quantifying the effects of the attack type. All other factors are secondary; we are not currently interested in exploring or quantifying their effects.

Hence the experiment design tool defines individual trials varying each factor and level (and possibly also trial repetitions for statistical significance) to create a battery of experiment trials that explores every possible combination of all levels of all primary factors.
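
A sketch of that design step in Python; the factor names, levels, and repetition count are illustrative:

{{{#!python
from itertools import product

# Primary factor and its levels, plus fixed secondary factors.
primary = {"attacker": ["low-volume", "high-volume", "stealth"]}
secondary = {"internet_cloud": "default", "wan_cloud": "default",
             "service": "web"}
REPETITIONS = 5  # repeats per combination for statistical significance

battery = []
for combo in product(*primary.values()):
    config = dict(zip(primary, combo), **secondary)
    battery.extend({"trial": i, **config} for i in range(REPETITIONS))

print(len(battery), "trials")  # 3 levels x 5 repetitions = 15 trials
}}}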