Version 10 (modified by faber, 11 years ago)


DETER Testbed API

The DETER API is an interface with two sides: an outward-facing side (the testbed API) that allows people to manage resources on the testbed, and an inward-facing side (the containers API) that coordinates the containers that make up an experiment environment. This document defines these interfaces and documents how to use them. It is a living document that will gain detail as the specification grows and implementations are put into service. At this point (July 2013) it is mostly a roadmap.

The testbed API is responsible for managing the following things:

Users
researchers
Projects
groups of related users, used to manage what users can see and do to other testbed resources
Resources
building blocks for realizing experiments: computers, disk images, external access
Experiments
a research environment that may be stored, edited, and realized on testbed resources

The testbed API is how users ask a DETER testbed to do things for them. It will generally be called from a more user-friendly front end, tuned to the user's experience and goals, such as the evolving DETER Beginners Interface.

The containers API is used by the DETER control system to manage the resources that make up an experimental topology in progress. The goal of this API is to take raw resources inside the testbed that have been allocated using the testbed API, and configure them into a usable environment. Each logical element of an experiment (a computer, a router, etc.) is represented by a container which must be managed by DETER.

Managing a container consists of:

  • Installing and configuring any virtualization software or configuring hardware
    • This involves translating from a more generic topology/configuration description into the setup for a specific container type
  • Configuring and starting MAGI software necessary to connect to the experiment management system
  • Exposing and using container-specific features to provide DETER services

We discuss the testbed API and then the containers API. Then we discuss some initial implementation details and present a roadmap for development.

DETER Testbed API

The testbed API manipulates a few entities to provide the experimentation environment. This section discusses the key abstractions of the testbed API and how they work together to create an experimental environment.

Users

A user is a researcher who uses the DETER testbed. They request testbed services and allocate testbed resources. Users are the actors that make things happen through the testbed API.

Each user is identified by a unique string, their userid. Userids are assigned when the user is created on the testbed and guaranteed to be unique.

In addition to the user identifier, DETER keeps metadata about users. Currently that metadata consists of:

  • Projects the user is in (see below)
  • Experiments the user owns (see below)
  • A password to authenticate the user
  • A valid e-mail address for communication and password resetting
  • Resource access information, e.g.,
    • ssh public keys
    • Windows authentication credentials
  • General metadata, e.g.,
    • Real name
    • Affiliation
    • Phone number
    • Address

A user identifies themselves to the testbed API by proving that they hold a specific public/private keypair. An initial such keypair is issued when the user is created, and a user can acquire another valid pair at any time by proving they know their password. Generally those pairs are short-lived to guard against loss or theft, but the password is administered using local testbed policies.

Projects

Projects are groups of users. They are used to confer rights to groups of users, identify groups of users, and control how resources are configured when experiments are realized on the testbed. This is a more general use of projects than in an Emulab testbed.

A project always has an owner, the user responsible for its creation. Ownership can be changed, but only by the owner.

When a user is added to a project, their rights are also defined within that project. The rights are:

  • Can add other users or projects to the project
  • Can remove users or projects from the project
  • Can realize experiments under this project (granting members of the project access to the resources in use)

When a project is added to another project, the rights granted to the added project are given to each of its members.
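The rights model above can be sketched as a small data structure. This is only an illustration under stated assumptions: the names `Right`, `Project`, `rights_of`, and `has_member` are hypothetical and not part of the DETER API, and the sketch assumes that rights granted to a member project apply to every user reachable through it and that projects are not nested in a cycle.

```python
from dataclasses import dataclass, field
from enum import Flag, auto


class Right(Flag):
    """The three per-member rights listed above (names are illustrative)."""
    ADD_MEMBER = auto()     # can add other users or projects to the project
    REMOVE_MEMBER = auto()  # can remove users or projects from the project
    REALIZE = auto()        # can realize experiments under the project


@dataclass
class Project:
    name: str
    owner: str
    # Direct members: userid -> rights granted within this project.
    users: dict[str, Right] = field(default_factory=dict)
    # Member projects, each paired with the rights granted to it.
    subprojects: list[tuple["Project", Right]] = field(default_factory=list)

    def has_member(self, userid: str) -> bool:
        """True if the user is a direct member or belongs to a member project."""
        return userid in self.users or any(
            sub.has_member(userid) for sub, _ in self.subprojects
        )

    def rights_of(self, userid: str) -> Right:
        """Union of direct rights and rights inherited through member projects."""
        rights = self.users.get(userid, Right(0))
        for sub, granted in self.subprojects:
            if sub.has_member(userid):
                rights |= granted
        return rights
```

A user who belongs to a member project thus holds whatever rights that project was given, in addition to any rights assigned to them directly.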

Projects can also have attributes attached to them. The most important of these is the vetting attribute. A vetting project (one with the vetting attribute) can allow users to have access to testbed resources. Until a user is a member of a vetting project, minimal testbed resources are allocated to them and they cannot perform any useful actions, except attempting to create one vetting project. DETER testbeds can be configured to alert staff when a vetting project is requested and require authorization from an administrator.

Vetting projects solve the problem of distributed user management on a large testbed. DETERlab has thousands of users and a small staff. The staff needs to make sure that users meet certain criteria. Having the staff review each user application is unreasonable; reviewing the creation of a comparatively small set of vetting projects is not.

When a researcher wants to start a project on DETERlab, the researcher creates a user and requests a vetting project. The vetting project is reviewed and, if valid, created with the user as owner. The user is then added to the vetting project with the right to add others. From then on, that user is responsible for managing new people in their project. A new person creates a user that has no power until a member of a vetting project who holds the right to add users adds them to that project.

Though the testbed API sees this as two steps, the user interface presented by a web page would roll these two steps together. A user sees it as a single application page.

A testbed with looser constraints on the vetting process can skip or remove the review process.

If a user is removed from all vetting projects, they return to a powerless state.
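The vetting rule can be expressed as a single check. The following is a minimal sketch, assuming a simplified project record; `ProjectRecord` and `is_vetted` are hypothetical names introduced for illustration and do not appear in the API.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectRecord:
    """Hypothetical, minimal view of a project for this example."""
    name: str
    members: set[str] = field(default_factory=set)
    vetting: bool = False  # True when the project carries the vetting attribute


def is_vetted(userid: str, projects: list[ProjectRecord]) -> bool:
    """A user has testbed access only while in at least one vetting project."""
    return any(p.vetting and userid in p.members for p in projects)
```

Removing a user from their last vetting project makes `is_vetted` return `False` again, which corresponds to the powerless state described above.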

Resources

Resources are the physical and conceptual objects managed by the testbed that are used to build experimental environments. They are the computers, network ports, externally routable addresses, virtual machine images, et al. from which experiments are constructed.

Resources are a class of objects that are used to build experiments. Currently there are a few well-known resources that are visible to the API, but generic resource objects are also supported by the API and provide a way to integrate new building blocks. Some of the specialized resources are:

  • computers
  • links
  • disk images
  • ssh keys

Access to and configuration of resources is affected by the project(s) a user is acting as a member of. When a user requests resources, they specify the projects under which they are requesting them. A user requesting resources as a member of a project representing a university class may have access to different resources than one acting as a member of the testbed administration. How membership affects the resources a group can claim is set by testbed policy.

In addition, the membership in projects controls how resources will be configured. A resource in use by a particular project will generally be configured to be accessible to all members of that project. A student who allocates resources as a member of a small design group while implementing a class project may later allocate resources as a member of the whole class when presenting the work to the class's TA and professor.

Experiments

All of the testbed API is ultimately geared toward the creation of experiments. An experiment is:

A description of the experimental environment
topology of computers and other resources in which the experiment will take place, including infrastructure necessary to carry out and gather data from the experiment
A set of constraints on the resources needed for experiment validity
failures of nodes or software can render an experiment invalid
A procedure to carry out
the repeatable sequence of events and reaction to those events that tests a hypothesis
Data to be collected and methods for doing so
mechanisms to understand the experiment without influencing it

Not all experiments in the sense of the API data structure will have all these elements. Testbeds are often used to create an environment in which to try ideas out and explore ideas without intending to reproduce the experience. The other end of the spectrum is rigorous, repeatable hypothesis testing. The API supports both by allowing some of these aspects to be omitted for some experiments (in the API sense).
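One way to picture an experiment whose aspects may be omitted is a record with optional fields. This is a sketch only: the class and field names are assumptions made for illustration, and plain strings stand in for whatever formats the specification eventually defines.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Experiment:
    """Illustrative experiment record; every aspect except the name is optional."""
    name: str
    environment: Optional[str] = None      # topology and supporting infrastructure
    constraints: Optional[str] = None      # conditions required for validity
    procedure: Optional[str] = None        # repeatable sequence of events
    data_collection: Optional[str] = None  # what to collect and how

    def omitted_aspects(self) -> list[str]:
        """Names of the aspects this experiment leaves unspecified."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]
```

An exploratory experiment might supply only `environment`, while a rigorous hypothesis test would fill in all four aspects.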

This API is primarily concerned with:

  • Storing the experiment specification for repeated use
  • Sharing the experiment specification between users subject to policy
  • Assigning resources to carry out the experiment
  • Configuring resources using containers so that the experiment can be carried out successfully on limited resources
  • Initializing and supporting an experiment control system like MAGI to carry out the experiment's procedure, police invariants, and gather data.

These last three bullets construct the environment for experimentation. We call that process realizing the experiment.

In order to support experiments that are making minimal use of experiment control systems, the API allows a user to manipulate a realized experiment, including

  • viewing topology and state of containers
  • low-level operations on containers (start, shutdown, reconfigure)

These interfaces use filtering and graph analysis to present useful views of large experiments.

When realizing the experiment, the testbed uses the containers API to configure and control the hardware and software that create the environment.

The Containers API

The containers API is responsible for allocating resources to an experiment and configuring that hardware in ways appropriate for the user's goals, that is, realizing the experiment.

The key abstraction for realizing an experiment is the container. A container holds some of a physical resource's computational and networking power and uses it to create part of the experiment at a level of realism appropriate to the researcher's goals. A researcher who is interested in end-system behavior will not put very much computational power into realizing the routers that forward packets, but a lot into realizing end systems.

The containers API coordinates:

  • Allocating the resources from the testbed's pool of resources
  • Configuring the physical resources to realize the experiment
  • Partitioning the computational and networking power of each resource into appropriate realizations

This API supports communication between the testbed control system and the containers that make up an experiment being realized (and after it is realized).

General Containers

A clear kind of container is a computer running a virtual machine monitor that partitions its CPU resources between virtual machines. This is a very useful kind of container, but not the only one. The containers interface is intended to support new forms of computer virtualization, efficient network simulation, or new kinds of physical hardware that can be part of testbeds.

To support these goals, the operations that the testbed can perform on a container are simple and can be specialized. In particular, the operations guaranteed to work on any container are:

Start
Begin operating as an experiment element. (For a computer, boot; for a flux capacitor, begin travelling in time)
Stop
Become quiescent. Only containers API requests will work on a stopped container
Describe
Tell the testbed what the container's state is, extended operations the container supports, the configuration format to use, and resources allocated to it
Configure
Set up the container's internal state. (For a computer, establish accounts and mount filesystems; for a flux capacitor, set travel rate and destination date)
Clean
Undo the effects of any Configure commands
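The five guaranteed operations suggest a common interface that every container type implements. The sketch below is an assumption about how that might look in code, not the actual containers API: the `Container` base class, the method signatures, the dictionary returned by `describe`, and the `ToyComputer` example are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any


class Container(ABC):
    """Hypothetical interface covering the five guaranteed operations."""

    @abstractmethod
    def start(self) -> None:
        """Begin operating as an experiment element."""

    @abstractmethod
    def stop(self) -> None:
        """Become quiescent."""

    @abstractmethod
    def describe(self) -> dict[str, Any]:
        """Report state, extended operations, and configuration format."""

    @abstractmethod
    def configure(self, config: dict[str, Any]) -> None:
        """Set up the container's internal state."""

    @abstractmethod
    def clean(self) -> None:
        """Undo the effects of any configure calls."""


class ToyComputer(Container):
    """In-memory stand-in for a computer-like container."""

    def __init__(self) -> None:
        self.running = False
        self.accounts: list[str] = []

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False

    def describe(self) -> dict[str, Any]:
        return {
            "running": self.running,
            "accounts": list(self.accounts),
            "extended_operations": [],
            "config_format": "toy-computer",  # made-up format name
        }

    def configure(self, config: dict[str, Any]) -> None:
        self.accounts.extend(config.get("accounts", []))

    def clean(self) -> None:
        self.accounts.clear()
```

Keeping the base interface this small is what lets very different container types share it; anything type-specific travels in the configuration payload or in operations the container advertises through `describe`.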

A container can be in the following states:

Down
The container is in communication with the testbed, but not yet configured to act as an experiment element
Configured
The container is set up to act as an experiment element but is not yet doing so
Pinned
The container is not acting as an experiment element, but is carrying out an operation that renders it otherwise unusable. For example, a container that is capturing its state may be in this state.
Up
The container is acting as an experiment element
None
Nothing is known about the container. It is (as yet) unresponsive.

Containers report state in response to a describe operation and can also spontaneously report changes in state to the testbed.

Generally a container's life cycle looks like:

  • Container starts in None state.
  • The hardware boots and the containers code begins running on the resource. When that code comes up, the container is Down and reports this state to the testbed.
  • The testbed asks the container to describe itself and sends an appropriate configuration request.
  • When the configuration is successful, the container moves to Configured and reports it.
  • When the testbed tells the container to start, it reports when it comes up and changes its state to Up. At this point, if an experiment control system like MAGI was part of the configuration, that system will take over running any experiment procedure.
  • When the experiment (or a phase of the experiment) is done, the testbed can issue a Stop to the container, which will move to Configured.

From that point the testbed can issue Clean, Configure, and Start commands to adjust the container state and operation.
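The life cycle can be read as a small state machine. The transition table below is a sketch under assumptions: the text does not say which state Clean leads to (Down is assumed here), whether Configure may be repeated while Configured (assumed allowed), or how a container enters and leaves Pinned (left out). The event name `"boot"` and the helpers `step` and `run` are hypothetical.

```python
from enum import Enum


class State(Enum):
    NONE = "None"
    DOWN = "Down"
    CONFIGURED = "Configured"
    PINNED = "Pinned"  # listed for completeness; no transitions modelled
    UP = "Up"


# (current state, event) -> next state
TRANSITIONS = {
    (State.NONE, "boot"): State.DOWN,                 # containers code comes up
    (State.DOWN, "Configure"): State.CONFIGURED,
    (State.CONFIGURED, "Configure"): State.CONFIGURED,  # assumed: reconfigure
    (State.CONFIGURED, "Start"): State.UP,
    (State.UP, "Stop"): State.CONFIGURED,
    (State.CONFIGURED, "Clean"): State.DOWN,          # assumed target state
}


def step(state: State, event: str) -> State:
    """Return the next state, or raise if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed in state {state.value}") from None


def run(events: list[str], state: State = State.NONE) -> State:
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["boot", "Configure", "Start", "Stop"])` follows the life cycle listed above and ends in `State.CONFIGURED`.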

Configuration and Specialization

The most common form of container specialization is the configuration format that the container expects. We have defined formats for containers that look like computers that define accounts, filesystems, software installations, etc. Other configuration formats will evolve and move into the mainstream.

In addition, we define low-level commands in an experiment description that bind a configuration message content to a node in an experiment. This allows new container configurations to be communicated even when the testbed does not understand them.

In addition to custom configuration systems, containers can export custom operations. For example, some virtual machines support snapshotting their state, which can be useful to experimenters. Again, containers can advertise operations that the testbed is unaware of, and the testbed API presents a way for users to pass a request through to a container directly.
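A pass-through of this kind might look like the following sketch. Everything here is hypothetical: the `extended_operations` key, the `extended` method, the `pass_through` helper, and the `SnapshotVM` class are invented for illustration and are not defined by the containers API.

```python
from typing import Any


class SnapshotVM:
    """Toy container that advertises one custom operation, "snapshot"."""

    def __init__(self) -> None:
        self.snapshots: list[str] = []

    def describe(self) -> dict[str, Any]:
        # The container, not the testbed, decides what it advertises.
        return {"extended_operations": ["snapshot"]}

    def extended(self, operation: str, **args: Any) -> Any:
        if operation == "snapshot":
            self.snapshots.append(args.get("label", "unnamed"))
            return len(self.snapshots)  # number of snapshots taken so far
        raise ValueError(f"unsupported operation: {operation}")


def pass_through(container: Any, operation: str, **args: Any) -> Any:
    """Forward a request the testbed does not interpret, if it is advertised."""
    advertised = container.describe().get("extended_operations", [])
    if operation not in advertised:
        raise ValueError(f"container does not advertise {operation!r}")
    return container.extended(operation, **args)
```

The testbed only checks that the operation is advertised; the meaning of the operation and its arguments stays with the container.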

Putting It All Together

To understand how the pieces above fit together, here is an example of a researcher using DETER.

A person has an idea for research they would like to conduct on DETER. That researcher accesses a DETER user interface, for example the DETER web interface, and fills out a form that gathers information about them and their proposed research. The web interface translates this into a request to create a user and then to create a vetting project for that user. The information about the research is routed to the DETER administration.

After the DETER administration decides that the research is reasonable and meets their criteria, they authorize creation of the vetting project. The administration sends the user e-mail telling them they can use DETER. Additionally the testbed may add the new user to a project for new users or other projects for researchers in similar fields.

When the user visits the DETER web interface, it notices that the user is a member of the new users project and presents an opportunity to take a tutorial. In addition, the user has access to sample experiments accessible to users in the new users project. Working from the tutorial and documentation and starting from a sample experiment, the user may create a new experiment with an environment, constraints, procedure, and data collection specified.

The web interface uses the testbed API to review an experiment in the new users project, copy it to the user's vetting project and edit the new experiment to meet the user's goals.

The user then realizes the experiment using the testbed API and uses the experiment control API, usually MAGI, to carry out the experiment and collect the data. When the experiment is complete, the testbed API is used to release resources.

Now the user assesses the data and may go on to publish results or create a new experiment from this one that tests a different hypothesis.

More Information

From here, readers may be interested in the