MultiMachineTestingInfraSpec

Summary

This specification describes a distributed cluster that supports automated integration testing. This cluster would facilitate the implementation of automated test plans that replace manual tests. It is distributed so that machines worldwide can be contributed to the infrastructure.

Rationale

Inside Ubuntu, there are plenty of automated test systems. From the unit tests run by make check to piuparts, these all check for implementation defects. However, all of these systems test only the package itself; there is no facility for higher-level testing.

Since integration testing cannot currently be automated, it must be done by hand. As Ubuntu grows, with more packages and more supported platforms, this becomes increasingly infeasible.

User stories

  • Andrew tests whether Apache works by installing the package, setting up a dynamic website, and using it through his web browser. He wants to write an automated test script that does this all for him.
  • Biella wants to test file sharing between her Ubuntu server and her Windows workstation. She also wants to write an automated test script that does this.
  • Charles wants to test that some new machines work well with Ubuntu. He wants to run the entire test suite on this new hardware.
  • Dora is a system administrator at a university which makes an Ubuntu-derived distribution. She wants to run the entire test suite against a test lab of computers, whenever she rolls out a new version of her distro. Some of these test results may be private, so it must be possible to run an isolated instance of the system.
  • Eric is an Ubuntu developer who has made a large library transition. He wants to confirm that he hasn't introduced any regressions into the distro before uploading his new packages.
  • Fiona is a system administrator who wants to help Ubuntu by donating hardware resources. These should be added to the resource pool for the automated test system.
  • Greg is a release manager for an Ubuntu-derived distribution. He looks at a nightly report generated by this system to determine that no regressions have been introduced.
  • Hania is a porter who wants to port Ubuntu from one architecture to the next. She wants to get a nightly report of which features still need work for this new architecture.
  • Ivan fixes a bug and wants to prevent it from regressing. He writes a test script and uploads it into the testing infrastructure, as part of uploading his fix.
  • Joanna notices that bugs keep cropping up in a certain area that she's interested in. She writes a suite of test scripts that exercise this functionality, some of which may fail, and uploads them into the infrastructure.
  • Kyle wants to look at a summary of automated testing results in Launchpad. He would be able to look at why a certain test has passed or failed.
  • Linda is working in Malone on a bug when she realizes that it is related to a bug discovered by an automated test. It should be trivial to link the Malone report to the test results.

Scope

This specification covers the design and implementation of the automated testing infrastructure. It does not specify any behaviour for Launchpad itself.

Design

The system is based around the idea of resource pools. These are groups of networked machines, which may or may not be computers running Ubuntu, that can be reserved and used. Each resource pool is controlled by a scheduler, which ensures that tests are run, allocates resources to them, and ensures that no test hogs any resource. The scheduler decides which tests to run based on the work queue, and tests report their results back to a scoreboard.

Resource pool

A resource pool could be populated by:

  • computers, of varying architectures, which have Ubuntu installed,
  • computers which have other operating systems installed (Red Hat, SuSE, Debian, Windows, Mac OS X),
  • virtual machines instead of real computers,
  • appliances, like DSL routers,
  • any other device that can be controlled remotely.

Each resource in the pool will be connected to a private network that has no connection to the Internet.

Each resource in the pool will be connected to a VLAN-capable switch, so that various resources can be connected to their own private test networks.

A resource pool will provide a simulated Internet connection that offers minimal services, such as DNS and hosts that resolve. It may provide a cache of the Ubuntu archive. This allows machines inside the resource pool to connect to an Internet-style network without being exposed to the Internet itself. This is not required for the initial implementation.

Each resource in the pool will be plugged into a networked power bar. This would allow it to be rebooted without manual intervention. This is not required for the initial implementation.

Each resource in the pool should have some mechanism to reset it to a known-good configuration. For instance, on modern servers, it should be possible to netboot into a bootloader that decides whether to boot directly off the hard disk. If not, it boots off the network into an image that wipes the existing hard disks and loads a clean image.
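
To make this concrete, a resource in the pool might be described by a record like the one below. This is a purely illustrative sketch; the field names are invented here and are not part of the specification.

{{{#!python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Resource:
    """One machine or device in a resource pool (field names invented)."""
    name: str                    # e.g. "amd64-box-3" or "dsl-router-1"
    kind: str                    # "ubuntu", "windows", "vm", "appliance", ...
    architecture: Optional[str]  # None for appliances such as routers
    vlan_port: int               # port on the VLAN-capable switch
    power_port: Optional[int]    # outlet on the networked power bar, if any
    resettable: bool             # can be reset to a known-good image
    busy: bool = False           # currently allocated to a test run
}}}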

Scheduler

The scheduler is responsible for continually running tests, which it picks up off the work queue. To support a test, it will (a minimal loop is sketched after this list):

  • Allocate resources out of its resource pool,
  • Configure these resources so the test can run, which may involve cleaning up a resource or configuring a VLAN,
  • Command the resources to run a test,
  • Monitor the test and terminate it if it runs for too long,
  • Collect the results of the test and feed them to the scoreboard,
  • Clean up the resources so they are ready for the next test run.
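
A minimal sketch of this loop follows. Every helper name here (pool, queue, scoreboard and their methods) is hypothetical; the sketch only illustrates the ordering of the steps above.

{{{#!python
def scheduler_loop(pool, queue, scoreboard, time_limit=3600):
    """Illustrative main loop; pool, queue and scoreboard are hypothetical."""
    while True:
        # 1. Ask the work queue for a plan that fits the free resources.
        plan = queue.request_plan(pool.free_resources())
        if plan is None:
            queue.wait_for_new_plans()   # notification instead of polling
            continue
        # 2. Allocate and configure resources (clean image, private VLAN).
        resources = pool.allocate(plan.requirements)
        pool.configure(resources, plan.network)
        # 3. Command the resources to run the test.
        run = pool.execute(resources, plan.files, plan.commands)
        # 4. Monitor the run; terminate it if it takes too long.
        result = run.wait(timeout=time_limit)
        if result is None:
            pool.power_cycle(resources)  # kill a run hogging its resources
            result = "timed-out"
        # 5. Feed results and logs to the scoreboard, and inform the queue.
        scoreboard.record(plan, result, run.collect_logs())
        queue.mark_done(plan)
        # 6. Clean up so the resources are ready for the next test run.
        pool.release(resources)
}}}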

The scheduler will not execute any test plans itself, but delegates this task to the resources requested by the test. This reduces the risk of the scheduler becoming the bottleneck of the testing framework.

Each distributed instance should be able to run its own scheduler, which does not depend on the existence of any other scheduler.

Schedulers must be able to pick up tests off of an arbitrary work queue.

Schedulers must be able to send test results to an arbitrary scoreboard.

In order to communicate with test runs, there should be a wire-protocol that describes passing and failing test cases within an automated test plan. In addition, this wire-protocol should allow the test to send log files back to the scheduler. These log files must be sent back to the scheduler even if the automated test script dies unrecoverably.
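
The subunit protocol linked under "See also" is one candidate. The sketch below assumes a simple line-oriented format with success:/failure:/error: lines; the keywords are illustrative, and a real protocol would also need a channel for log files.

{{{#!python
def parse_results(stream):
    """Yield (test name, outcome) pairs from a subunit-style line stream.

    The keywords below are illustrative; see the subunit link under
    'See also' for a real proposal.
    """
    for line in stream:
        line = line.strip()
        for outcome in ("success", "failure", "error"):
            prefix = outcome + ":"
            if line.startswith(prefix):
                yield line[len(prefix):].strip(), outcome
                break

# For example, ["test: apache-basic", "success: apache-basic"] yields
# ("apache-basic", "success"); log files would travel on a separate channel.
}}}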

Schedulers must be able to kill a test run that is hogging resources. For example, a test that takes too long will have all its resources turned off and restarted.

Schedulers request tests from the work queue if there are free resources available. To prevent polling, it should be possible to request notifications of new tests from the work queue.

When a test completes, a scheduler must inform the work queue that the test plan has been run.

An authorized administrator must be able to describe which resources the scheduler has control over.

Work queue

The work queue is an ordered set of tests. Each test needs to express the following (one possible encoding is sketched after this list):

  • Declare the properties of the resources it needs. For instance, a test may express that it needs a Windows 2000 Server and an Ubuntu machine.
  • Declare the way the resources are connected. For instance, the Windows and Ubuntu machine should be connected with a VLAN.
  • Declare any files to send to each resource. For instance, an automated test script should be uploaded to the Ubuntu machine.
  • Declare which commands are run on each resource. For instance, the automated test script should be executed.
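
One possible encoding of such a declaration, with invented field names, is sketched below; the actual format is left open by this specification.

{{{#!python
from dataclasses import dataclass

@dataclass
class TestPlan:
    """Illustrative encoding of a work-queue entry; field names invented."""
    plan_id: str
    # Properties of the resources needed, e.g. two machines:
    requirements: list  # [{"os": "windows-2000-server"}, {"os": "ubuntu"}]
    # How the resources are wired together:
    network: list       # [("vlan", ["windows", "ubuntu"])]
    # Files to upload to each resource before the run:
    files: dict         # {"ubuntu": ["smbclient-test.py"]}
    # Commands to execute on each resource:
    commands: dict      # {"ubuntu": ["python smbclient-test.py"]}
    continuous: bool = True  # False for a once-off plan
}}}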

A scheduler must be able to request a new test plan to run. The work queue prevents starvation of test plans by feeding schedulers unique test plans. To request a test plan, the scheduler informs the work queue of which free resources it has, and the work queue provides the scheduler with a plan that fits those resources.

When a test plan has been completed, it gets added back to the end of the work queue. If a test plan has run for too long without completing, it also gets added back to the end of the work queue. This guarantees that all test plans will be run, even if a scheduler has crashed.
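
A sketch of this requeue discipline, assuming a hypothetical in-memory queue with a lease deadline per dispatched plan (a real implementation would need persistence and authentication):

{{{#!python
import time
from collections import deque

class WorkQueue:
    """Illustrative queue; the plan objects and fits() are hypothetical."""

    def __init__(self, lease_seconds=7200):
        self.pending = deque()  # ordered set of test plans
        self.running = {}       # plan_id -> (plan, lease deadline)
        self.lease_seconds = lease_seconds

    def request_plan(self, free_resources):
        """Hand out a plan that fits the scheduler's free resources."""
        self._requeue_expired()
        for _ in range(len(self.pending)):
            plan = self.pending.popleft()
            if plan.fits(free_resources):  # hypothetical matching test
                deadline = time.time() + self.lease_seconds
                self.running[plan.plan_id] = (plan, deadline)
                return plan
            self.pending.append(plan)      # rotate and keep looking
        return None

    def mark_done(self, plan):
        """Called by the scheduler when a run completes."""
        self.running.pop(plan.plan_id, None)
        if plan.continuous:
            self.pending.append(plan)  # back to the end of the queue

    def _requeue_expired(self):
        """Plans leased for too long go back on the queue (crashed scheduler)."""
        now = time.time()
        for plan_id, (plan, deadline) in list(self.running.items()):
            if deadline < now:
                del self.running[plan_id]
                self.pending.append(plan)
}}}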

An authorized user must be able to add new test plans to the queue. This plan can either be:

  • Continuously running, so it will be scheduled at regular priority,
  • Once-off, so it will be scheduled as soon as possible.

When a new test plan has been added to the queue, all connected schedulers are informed so that they may request a new plan, if they have free resources.

An authorized administrator must be able to manage the queue. Some of these operations include, but are not limited to, disabling tests, removing tests, and user management.

Scoreboard

The scoreboard records the results of each run. It needs to track the following (a sketch of a matching record follows this list):

  • Timestamp for each run,
  • Identifier for the test plan that was run,
  • Work queue that supplied this test plan,
  • Scheduler that allocated the resources for the test,
  • Final result of each run,
  • Links to log files extracted from each run.
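
These fields map naturally onto a single table. Below is a sketch using sqlite3 from the Python standard library; the schema and function names are invented for illustration.

{{{#!python
import sqlite3

def open_scoreboard(path="scoreboard.db"):
    """Create or open the scoreboard database (schema invented)."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS runs (
            run_id     INTEGER PRIMARY KEY,
            started_at TEXT NOT NULL,  -- timestamp for the run
            plan_id    TEXT NOT NULL,  -- identifier of the test plan
            queue      TEXT NOT NULL,  -- work queue that supplied the plan
            scheduler  TEXT NOT NULL,  -- scheduler that allocated resources
            result     TEXT NOT NULL,  -- final result of the run
            log_urls   TEXT            -- links to extracted log files
        )""")
    return db

def record_run(db, started_at, plan_id, queue, scheduler, result, log_urls):
    """Called by an authorized scheduler to add one run."""
    db.execute(
        "INSERT INTO runs (started_at, plan_id, queue, scheduler, result,"
        " log_urls) VALUES (?, ?, ?, ?, ?, ?)",
        (started_at, plan_id, queue, scheduler, result, log_urls))
    db.commit()

def failures_since(db, timestamp):
    """Example of the query interface: failing runs since a given time."""
    return db.execute(
        "SELECT plan_id, started_at, log_urls FROM runs"
        " WHERE result = 'failure' AND started_at >= ?",
        (timestamp,)).fetchall()
}}}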

The scoreboard must provide an interface to query it for test run information.

The scoreboard should present a web interface for constructing queries and displaying results that would be useful to developers, testers, release managers, and infrastructure administrators.

The scoreboard should integrate some subset of this information into Launchpad. Which subset is currently unspecified, as it requires further discussion.

An authorized scheduler must be able to add test results to the scoreboard.

Implementation

Code

Data preservation and migration

Outstanding issues

See also

  • http://www.robertcollins.net/unittest/subunit/ — a proposed wire-protocol for tests to return results to the scheduler.


CategorySpec
