Emulab Event System Reference
Introduction
The Emulab event system provides a means for automating your experiments. The event system consists of several types of "agents" that implement some sort of functionality, such as running programs or generating traffic, and a scheduler that triggers the events at the appropriate time. When your experiment is swapped in, any agents specified in your NS file are automatically set up on the experimental nodes and the ops node. A short time after the experiment becomes active, "event time" begins to flow. As event time progresses, any events scheduled in the NS file for a particular time offset are sent to the appropriate agents. Alternatively, events can be sent at runtime using the tevc command from ops or an experimental node. For a detailed walkthrough of using the event system, see the advanced example.
Recently, we have added some experimental extensions to make the event system even more capable. Note that many of these features are subject to change and are only available when using the latest versions of the FBSD410-STD and RHL90-STD disk images. The NS file below gives an example of using these extensions to automate the process of creating disk images. First, it downloads a network traffic analyzer, iftop, then proceeds to build and install the software. Next, the source directory is removed and a snapshot is taken of the node's disk. Finally, after the snapshot completes and the node has finished rebooting, the experiment is swapped out.
set opt(VERSION) 0.16
set ns [new Simulator]
source tb_compat.tcl
set node [$ns node]
tb-set-node-tarfiles $node \
/tmp http://www.ex-parrot.com/~pdw/iftop/download/iftop-$opt(VERSION).tar.gz
set builder [$node program-agent -dir "/tmp/iftop-$opt(VERSION)"]
set cleaner [$node program-agent]
set build [$ns event-sequence {
$builder run -command "./configure"
$builder run -command "gmake"
$builder run -command "sudo gmake install"}]
set clean [$ns event-sequence {
$cleaner run -command "sudo rm -rf /tmp/iftop-$opt(VERSION)" }]
set doit [$ns event-sequence {
$build run
$clean run
$node snapshot-to RHL90-CUSTOMIZED
$ns swapout }]
$ns at 0.0 "$doit start"
$ns run
An example NS file that automates the process of installing software on a node and taking a snapshot of the disk image.
We also have a small package containing a more complicated experiment that runs BitTorrent on a bunch of nodes, collects their output, and generates a simple report on how they performed: BitTorrent experiment package
The rest of this document is intended as a reference manual for the available set of agents and the events they can handle.
NS "Simulator" Agent
Constructor: new Simulator
The simulator agent provides control over your Emulab experiment as a whole. The simulator agent listens for the following events:
- swapout - Swap out the experiment.
- terminate - Terminate the experiment. Warning: This event will completely destroy every trace of the experiment and there is no confirmation.
- report [-digester script] - Automatically generate and send a "report" e-mail to the user. Typically, this event should be sent at the end of an experimental "trial", when all of the data has been produced and it is time to gather, analyze, and archive the data. Gathering and archiving the data is handled by the loghole utility, which copies log files on the nodes to the experiment's log directory. Simple analysis can be done by specifying a "digester" script that processes the log files. Once all of the data processing has been finished, an e-mail will be sent to the user containing the following:
- The contents of any message events sent to the simulator agent.
- The output from the digester script.
- The captured NS file parameters
- Any log messages sent to the simulator, along with log messages automatically generated by the simulator.
- For any programs that exited with an error, a description of the command that failed and the tails of their standard error and output files.
- message string - Append a string to the head of the e-mail sent by the report event.
- log string - Append a log message to the tail of the e-mail sent by the report event.
NS Examples:
set ns [new Simulator]
...
set doit [$ns event-sequence {
$ns message "Testing one way, then the other..."
$thisway run
$thatway run
$ns report }]
Example 1: Adds some text to the report e-mail, runs an application twice, and then sends a report to the user.
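The digester mentioned for the report event can be any script that reads the collected log files and prints a summary. The following is only a sketch: the file name digest.sh, the digest function, and the assumption that the log directory arrives as the first argument (with the logs laid out flat in that directory) are all hypothetical, since this document does not describe how the digester is invoked.

```shell
#!/bin/sh
# digest.sh: hypothetical digester script.
# ASSUMPTION: the log directory is passed as the first argument (default:
# the current directory) and holds the ".err" files directly.

digest() {
    logdir="${1:-.}"
    total=0
    failed=0
    for err in "$logdir"/*.err; do
        [ -e "$err" ] || continue          # the glob matched nothing
        total=$((total + 1))
        if [ -s "$err" ]; then             # non-empty stderr capture
            failed=$((failed + 1))
            echo "stderr output in: ${err##*/}"
        fi
    done
    echo "$failed of $total runs wrote to stderr"
}

digest "$@"
```

A script like this would presumably be named in the event itself, for example $ns report -digester digest.sh, following the syntax listed above.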
Event Sequence
Constructor: $ns event-sequence [body]
- body - The list of events to be sent. If none is specified, events can be added using the sequence's append method.
An event sequence agent is an ordered list of events, each of which is sent when the previous event in the list has reported its completion. For example, in a sequence consisting of a pair of events that run programs, the first event will be sent immediately and the second will be sent when the run of the first program completes. While running two programs in a row may be trivial using conventional means, this capability works across machines and can interact with other operations like reloading disks and rebooting machines.
The semantics of when an event "completes" depend on the type of agent and event. Many events complete instantaneously, such as those used to set a property, so the next event in the sequence is sent immediately. Other events, such as running a program, take a variable amount of time to complete. Some agents provide two types of events to support non-blocking and blocking operation, usually called start and run. Whereas the start event completes instantly, the run event blocks the sequence until the agent is finished.
Event sequences listen for the following events:
- start, run - Begins the execution of the sequence. When the run event is used inside another sequence, this sequence will complete when the last event completes.
NS Examples:
set doit [$ns event-sequence {
$prog0 run -command "setup.sh"
$node0 reboot
$prog0 run -command "test.sh" }]
Example 1: A sequence that performs some setup on a node, reboots it, and then starts the test.
set doit [$ns event-sequence {
$serverprog start;   # Start the server,
$clientprogs run;    # run the clients to completion, then
$serverprog stop;    # stop the server.
}]
Example 2: A sequence that asynchronously starts a server, runs some clients, and finally, stops the server.
set testseq [$ns event-sequence]
foreach test $tests {
$testseq append "$prog0 run -command \"$test\""
}
Example 3: A sequence that is constructed incrementally instead of being fully specified in the constructor.
Event Timeline
Constructor: $ns event-timeline
An event timeline agent sends other events at a relative offset to the overall start time of the timeline. In other words, a timeline is a first class version of the existing "$ns at" syntax.
Event timelines listen for the following events:
- start, run - Starts the timeline. When run is used in a sequence, the timeline completes when it sends the last event.
NS Examples:
set tl [$ns event-timeline]
$tl at 0s "$prog0 start"
$tl at 15s "$prog0 stop"
set seq [$ns event-sequence {
$tl run
$ns swapout }]
Example 1: A timeline that runs a program for 15 seconds and then swaps out the experiment.
Program Agent
Constructor: $node program-agent [-command cmdline] [-dir dir] [-timeout seconds] [-tag string] [-expected-exit-code code]
- -command "cmdline" - Specifies the command-line to run. Defaults to the last command that was run or the command specified in the NS file. See below for additional notes on command lines.
- -dir directory - Specifies the directory to run the command within. Defaults to the last directory that was specified, the directory in the NS file, or "/tmp".
- -timeout seconds - Specifies the timeout, in seconds, for the command or zero for no timeout. If the command does not complete before the timeout, it will be stopped forcefully. Defaults to the last timeout used for this agent or no timeout.
- -tag string - Specifies the symbolic tag to be attached to this invocation of the agent and its output log file names. By default, invocations are identified by a unique number, so this option allows the user to attach a more meaningful identifier.
- -expected-exit-code number - The expected exit code for the command. This value is compared against the actual exit code to determine whether or not the command completed successfully. An unsuccessful command run by a sequence will cause the sequence to stop executing and fail. Defaults to the last value used or zero.
Program agents listen for the following events:
- start, run [options] - Starts the program by running the command-line in the specified directory and capturing its standard output and error. The agent will then switch into "management" mode and only accept stop and kill events until the command terminates. The event accepts the same options as the constructor, so you can change the command to be run on the fly. The output from the command is stored in the "/local/logs" directory on the node. Each invocation of the agent is stored in a separate file tagged with a unique id. In addition, the stdout and stderr data are stored separately in ".out" and ".err" files. To make it easier to locate the last invocation of the agent, soft links are created with file names that lack the unique id (e.g. "prog0.out" -> "prog0.out.5"). If a "tag" is specified, a soft link will also be created that refers to the actual file (e.g. "prog0.baseline.out" -> "prog0.out.5"). The command will be executed with the following environment variables set:
  - PATH - The default path for binaries is set to the standard path (e.g. /usr/bin, /bin, /usr/sbin, /sbin), the binary directories in /usr/local, and the directory containing Emulab specific binaries.
  - EXPDIR - The experiment's directory in NFS space (e.g. /proj/foo/exp/bar).
  - LOGDIR - The preferred directory for log files on the local machine.
  - USER - The name of the user that swapped in this experiment.
  - HOME - The path to the user's home directory.
  - GROUP - The name of the unix group for the user that swapped in this experiment.
  - PID - The project ID for the experiment this agent is running within.
  - EID - The experiment ID for the experiment this agent is running within.
  - NODECNET - The fully-qualified name of the node this program agent is running on. This name resolves to the IP address of the control network interface of the node.
  - NODECNETIP - The IP address of the control network interface. This address should not be advertised to, or used by, applications within an experiment as it will cause all traffic to flow over the control network rather than the experimental network.
  - NODE - The unqualified name of the node this program agent is running on. For nodes with experimental interfaces, this name resolves to the IP address of an experimental interface on the node. For nodes with more than one experimental interface, there is no guarantee which one it will resolve to. For nodes with no experimental interfaces, the name will not resolve.
  - NODEIP - The IP address of the experiment network interface that NODE resolves to. For nodes with no experimental interfaces, this variable will not be set.
  - set opt(VAR) values - Any entries in the "opt" array of the NS file will automatically be added to the environment. For example, to set a variable named "DURATION" with a value of "100", you would add "set opt(DURATION) 100" to the top of your NS file. See captured parameters.
- stop - Stops the program, if it is currently running, by sending a SIGTERM to the process group.
- kill signal - Signals the program with the given signal name. For example, to send a SIGHUP to the process you would use "sighup", or for tevc, "SIGNAL=SIGHUP".
- set - Sets the properties of the program agent. Accepts the same arguments as the start event.
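As a rough illustration of the environment described for the start and run events, a script launched by a program agent might pick up its settings like this. The script name trial.sh, the summary function, and the log file name are made up for the example; DURATION assumes that "set opt(DURATION) 100" appears in the NS file, and the fallback values only matter when the script is run outside Emulab.

```shell
#!/bin/sh
# trial.sh: hypothetical script started by a program agent.
# The values after ":-" are fallbacks for running outside Emulab.

LOGDIR="${LOGDIR:-/tmp}"       # preferred log directory on the node
NODE="${NODE:-localhost}"      # unqualified name of this node
DURATION="${DURATION:-100}"    # from "set opt(DURATION) 100" in the NS file

summary() {
    # One line saying which experiment and node this trial belongs to.
    echo "experiment=${PID:-unknown}/${EID:-unknown} node=$NODE duration=${DURATION}s"
}

# Append the line to a (hypothetical) summary file in the log directory.
summary >> "$LOGDIR/trial-summary.log"
```

Writing under LOGDIR rather than a hard-coded path keeps the script's own output alongside the agent's captured output on the node.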
Notes on Command Lines
In general, if you have complicated or multiple commands to execute, it is best to put them in a script and specify the script name in -command. But if you insist, here are some things to be aware of.
The command line is executed with "csh -c." Yes, that is the Berkeley C-shell and not the Bourne shell or bash. Sorry, it is an historical thing. So be aware of differences in redirection and expansion syntax (e.g., ">&" and "{}"). When in doubt, put your command in a script and set the command line to "sh -c myscript.sh".
Quoting is fragile and happens at a couple of levels:
- Quoting for TCL. Putting curly braces ({...}) or double quotes ("...") around the entire command line will quote the string to TCL (i.e., the NS script parser language). Double quotes allow for TCL variable expansion; curly braces allow no expansion. Either way, these quotes will be stripped off before the command line is given to csh. Use this mechanism if your command line has white space (i.e., arguments to the command), otherwise the Emulab NS parser will flag an error.
- Quoting for csh. Recall that the program agent runs a shell to interpret your commands, so you may need additional quoting to get special characters past it. For example, if one of your command arguments has an embedded space, you will need to quote it with single or double quotes. Backslash quoting also works.
A sick example might look like this:
... -command {echo arg{1,2} "arg3 has spaces" arg4\ has\ \'\ \'\ too}
where the echo command would have four arguments:
arg1
arg2
arg3 has spaces
arg4 has ' ' too
To summarize: put your commands in a script.
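Following that advice, a small Bourne shell script keeps all of the quoting in one place, where only one shell has to interpret it. This sketch uses a hypothetical show_args function (in a hypothetical file args.sh) to print each argument on its own line, which makes it easy to check how a command line was actually split.

```shell
#!/bin/sh
# args.sh: hypothetical wrapper script. The agent would only need a simple
# command line such as "sh args.sh", so no csh quoting is involved.

show_args() {
    # Print every argument on its own line so word splitting is visible.
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}

# The same four arguments as the csh example above, written for the Bourne
# shell (the braces are spelled out because plain sh has no brace expansion).
show_args arg1 arg2 "arg3 has spaces" arg4\ has\ \'\ \'\ too
```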
Other Notes:
- Many of the features described here are only available on recent FBSD{410,54,61}-STD, RHL90-STD, and FC4-STD disk images.
- This page currently only covers the agent at a high level. You can find more detail in the program-agent(8) man page on ops or an experimental node.
NS Examples:
set prog0 [$node0 program-agent]
set prog1 [$node0 program-agent -command "/usr/bin/env"]
set prog2 [$node0 program-agent -command "inf_loop_bug" -timeout 10]
set prog3 [$node0 program-agent -command "ls" -dir "/foo/bar"]
Example 1: Creates four program agents with different default properties.
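The soft links described for the start and run events make it straightforward to find the output of the most recent invocation. The helper below is a sketch only: latest_log is a made-up name, and it assumes the readlink utility is installed on the node.

```shell
#!/bin/sh
# latest_log AGENT [DIR]: print the target of AGENT's ".out" soft link,
# e.g. "prog0.out.5". DIR defaults to the documented /local/logs.
latest_log() {
    readlink "${2:-/local/logs}/$1.out"
}

# Typical use on a node (prints nothing if the link does not exist):
latest_log prog0 || true
```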
Event Group
Constructor: $ns event-group [list-of-agents]
- list-of-agents - A TCL list of the agents to be in the group.
The event group agent is used to broadcast events to a group of agents of the same type. For example, if you wanted to start a program on a large number of nodes at the same time, you can create a group consisting of those program-agents and send a single start event to the group. An event group can also act as a simple synchronization method when used inside an event-sequence. In this case, the next event in the sequence won't be sent until all of the agents in the group have signalled completion.
NS Examples:
set group [$ns event-group]
for {set i 0} {$i < 4} {incr i} {
set nodes($i) [$ns node]
set progs($i) [$nodes($i) program-agent]
$group add $progs($i)
}
set doit [$ns event-sequence {
$group run -command "setup.sh"
$group run -command "client.sh" }]
Example 1: Runs the "setup.sh" script on a group of nodes and when they have all completed, runs the "client.sh" script.
set group [$ns event-group [list $rnode $lnode]]
set doit [$ns event-sequence {
$group reboot
$ns log "Reboot finished" }]
Example 2: Reboots a pair of nodes and logs a message with the simulator.
Node Agent
Constructor: $ns node
In addition to allocating an actual machine, the "$ns node" constructor will create a node agent so the node can be controlled from the event system.
Node agents listen for the following events:
- reboot - Reboot the node. When used in a sequence, this event will complete when the node has finished booting and is considered "up".
- snapshot-to imagename - Snapshot the node's disk into the given disk image. Before the snapshot is taken, the node's logs will be sync'd back to ops using the loghole utility and the "/local/logs" directory will be cleaned out. When used in a sequence, this event will complete when the snapshot has been taken and the node has finished booting and is considered "up".
- reload [-image imagename] - Reload the node's disk with the default image or the given image. When used in a sequence, this event will complete when the node has finished booting and is considered "up".
- setdest x y speed [-orientation degrees] - (mobile nodes only) This event will set the next physical destination for the node. When used in a sequence, this event will complete when the node has reached its destination. If another setdest event is sent to a node before it has reached its current destination, the new destination will overwrite the old one.
Console Agent
Constructor: $node console
Console agents operate on the serial consoles attached to some Emulab nodes. Currently, they only support capturing a slice of the output received on the serial line.
Console agents listen for the following events:
- start - Start recording the serial console output from a node.
- stop id - Stop recording the serial console output from a node and save it to a file named "agentname-id.log" in the experiment's log directory.
Traffic Generator
Traffic generation agents output network traffic at a constant bit rate over a link. Consult the advanced example for more information and examples of their use.
Traffic generators listen for the following events:
- start - Start sending traffic.
- stop - Stop sending traffic.
- set - Change characteristics of the traffic.
Disk Agent
The disk agent can be used to create and modify virtual disks on test nodes. Its primary purpose is to help experimenters test how well their applications tolerate disk failures and errors.
Disk agents listen for the following events:
- set/run/start - Mount a virtual disk with the given parameters. This is a simplified version of create/modify where users don't have to specify the geometry of virtual disk.
- create - Creates a virtual disk. You must specify the complete geometry of the virtual disk and mount it manually.
- modify - Modifies a virtual disk. You must specify the complete geometry of the virtual disk and mount it manually.
Notes
Disk-agent uses the device mapper library to create and modify virtual disks with different properties. The syntax is similar to that of the dmsetup tool, but it is nowhere close to the full set of features that dmsetup provides. The virtual disks support these types:
- linear - linear type of disk is simply a 1:1 mapping of the sectors from virtual disk to the real disk.
- delay - Delay type of disk supports delaying disk I/O's by a specified number of milliseconds. This is useful to simulate slow disks.
- flakey - flakey type of disk returns disk errors for a specified period of downtime. This is useful in inducing probabilistic disk sector errors.
- error - Useful to designate a particular sector to have I/O errors.
Please refer to the kernel device-mapper documentation for more details. Note that not all features listed there are implemented here.
Constructor: $node disk-agent [-type type] [-mountpoint directory] [-parameters string] [-command cmdline]
- -type "type" - Specifies the type of virtual disk which could be one from the above list.
- -mountpoint "directory" - Specifies the mountpoint to mount the virtual disk.
- -parameters "string" - Specifies the optional parameters that the type supports. For example, the flakey type supports <up_interval> and <down_interval>, both in seconds: the disk works normally for <up_interval> and then returns I/O errors for <down_interval>. If you need 50% of your I/Os to fail, the parameters to flakey would be '1 1'.
- -command "cmdline" - Specifies the complete geometry of the disk. The general format is
"<start sector> <size in sectors> <type> <device path> <offset> <additional parameters>", for example "0 10000 flakey /dev/sdb 0 1 1". This option must be used with the create or modify event types only; it is ignored if the event type in the NS file is start, run, or set. Using this option does not mount the disk, which must be done manually. The virtual disk appears as /dev/mapper/<name>. You can then use mkfs to create a filesystem on it and mount it somewhere to put it to use.
NS Examples:
Example 1: Creates a disk object disk0 with different properties, disk0 starts out being a linear disk (good disk) and 20 seconds later starts giving IO errors.
set disk0 [$nodeA disk-agent -type "linear" -mountpoint "/mnt"]
$ns at 10 "$disk0 run"
set disk0 [$nodeA disk-agent -type "flakey" -mountpoint "/mnt" params="1 1"]
$ns at 20 "$disk0 run"
Example 2: Creates a disk object disk0 but this time we specify the geometry. Note that if we use events create or modify, we expect that command line is defined and other fields are ignored. Using the same disk object with different type modifies the disk. Here, disk0 starts out by being a flaky disk and then becomes a slow disk where every IO takes 500ms to complete.
set disk0 [$nodeA disk-agent -command "0 10000 flakey /dev/sdb 0 1 1"]
$ns at 10 "$disk0 create"
set disk0 [$nodeA disk-agent -command "0 10000 delay /dev/sdb 0 500"]
$ns at 20 "$disk0 modify"
Note: We have to manually run mkfs on /dev/mapper/disk0 and mount it somewhere before the disk can be used.
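The manual steps after a create event amount to making a filesystem on the mapped device and mounting it. The sketch below only prints the commands so they can be reviewed first; the function name vdisk_commands, the ext3 filesystem type, and the use of sudo are assumptions for illustration, not something the disk agent requires.

```shell
#!/bin/sh
# vdisk_commands NAME MOUNTPOINT: print the commands that would put a
# filesystem on /dev/mapper/NAME and mount it at MOUNTPOINT.
vdisk_commands() {
    dev="/dev/mapper/$1"
    printf 'sudo mkfs -t ext3 %s\n' "$dev"    # ext3 is an assumed choice
    printf 'sudo mkdir -p %s\n' "$2"
    printf 'sudo mount %s %s\n' "$dev" "$2"
}

vdisk_commands disk0 /mnt
```

Piping the output to sh (vdisk_commands disk0 /mnt | sh) would run the commands on the node.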
Tevc Examples
We can dynamically create/modify virtual disks on nodes and inject errors with tevc.
Example 1: Create a virtual disk and mount it on a given mount point.
tevc -e experimentname now disk run "name=disk1 type=linear mountpoint=/mnt"
Example 2: Modify some property of a virtual disk. Let's say we want to change disk1 to a flakey disk or to a delay disk.
tevc -e experimentname now disk run "name=disk1 type=flakey mountpoint=/mnt params='1 1'"
or
tevc -e experimentname now disk run "name=disk1 type=delay mountpoint=/mnt params=500"
Note
- If you specify an existing disk name with run event, then it is assumed that you want to modify the properties of that disk.
- You must quote the arguments while using tevc to accommodate spaces.
How to use Disk-agent with event type create/modify?
Basic syntax for create and modify is,
tevc -e experimentname now disk create/modify "<virtual disk name> <start sector> <size in sectors> <type> <device path> <offset> <additional parameters>"
Note: You must specify the complete geometry along with any additional parameters that the particular type of virtual disk supports. The virtual disk appears as /dev/mapper/<name>. We can then use mkfs to create a filesystem on top and mount it somewhere to put it to use.
Example 3: Creating a linear type of virtual disk.
tevc -e experimentname now disk create "disk2 0 10000 linear /dev/sdb 0"
Creating a flakey type of virtual disk with 50% failure rate.
tevc -e experimentname now disk create "disk2 0 10000 flakey /dev/sdb 0 1 1"
Creating a delay type of virtual disk which delays both read and write I/Os by 500ms.
tevc -e experimentname now disk create "disk2 0 10000 delay /dev/sdb 0 500"
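Two bits of arithmetic help when choosing the numbers in these geometry strings. Device-mapper tables count in 512-byte sectors, and a flakey disk spends down/(up + down) of each cycle returning errors, so "1 1" fails about half of the I/Os if they are spread evenly over time. The function names below are invented for this sketch.

```shell
#!/bin/sh
# mib_to_sectors MIB: number of 512-byte sectors in MIB mebibytes.
mib_to_sectors() {
    echo $(( $1 * 1024 * 1024 / 512 ))
}

# flakey_error_percent UP DOWN: share of each up/down cycle, in percent
# (integer division), during which the disk returns errors.
flakey_error_percent() {
    echo $(( 100 * $2 / ($1 + $2) ))
}

mib_to_sectors 100          # prints 204800
flakey_error_percent 1 1    # prints 50
flakey_error_percent 9 1    # prints 10
```

By the same arithmetic, the 10000-sector disks in the examples above are 10000 * 512 = 5,120,000 bytes, a little under 5 MiB.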
