Summary

With our next LTS coming up, this is a great time to focus on stability and QA. As such, we'll work on setting up automated testing of as much of Ubuntu Server as possible. This includes (but is not necessarily limited to) daily runs of the security team's regression test suite, enabling upstreams' test suites at build time, performance regression tests, and automating ISO testing.

Release Note

A considerable amount of time has been spent on setting up automated testing of the server product. We hope that this will result in a more solid release of Ubuntu Server than ever before.

Rationale

Regression testing can be a repetitive job. Thankfully, much of it can be done automatically. Many packages have test suites that we're not running (for one reason or another), we have qa-regression-testing, and there are lots of other means of testing things with minimal day-to-day effort.

User stories

  1. Soren uploads a new version of dovecot, but is worried he might break compatibility with some IMAP client. However, he sleeps well that evening, knowing that the automatic regression testing suite will raise an alert overnight if something broke.
  2. Jamie notices a bug in libvirt and works on a patch. While test-building the package locally, the test suite informs him that he broke a little-used feature in an obscure corner of the lxc driver in libvirt. He promptly fixes it, the test suite is happy again, and Jamie can proceed to upload the package to Ubuntu.
  3. Matthias uploads an update to eglibc which triggers a bug in php5. The next morning, the server team receives a report from the automated test system telling them there's a regression in php5. Comparing the versions of the packages in the dependency chain between the last successful test run and this new one, they quickly pinpoint the culprit and start working on a fix.

Assumptions

Design

The goal is to detect as many problems as early as possible.

Implementation

qa-regression-testing scripts

We will integrate the security team's qa-regression-testing collection into checkbox, and have it run on a daily basis. Feedback will be collected by the QA team and turned into bugs for the server team to deal with.

Performance testing

The Phoronix test suite seems to be reasonably comprehensive. We will run it on a daily basis and keep an eye on performance regressions. This needs to run on the same hardware every time, so that results from different days are comparable.
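As an illustration of what "keeping an eye on performance regressions" could look like, here is a minimal sketch. It assumes the daily results have already been reduced to one number per benchmark and that higher scores are better; the function name, the benchmark names and the 5% tolerance are hypothetical and are not taken from the Phoronix test suite itself.

```python
from statistics import mean


def find_regressions(history, latest, tolerance=0.05):
    """Return {benchmark: relative drop} for results that got noticeably worse.

    history   -- {benchmark: [scores from earlier daily runs]} (higher is better)
    latest    -- {benchmark: score from today's run}
    tolerance -- relative drop against the historical mean that is still
                 treated as noise (hypothetical default of 5%)
    """
    regressions = {}
    for name, score in latest.items():
        past = history.get(name, [])
        if not past:
            continue  # no baseline yet, nothing to compare against
        baseline = mean(past)
        if baseline <= 0:
            continue  # cannot compute a relative drop
        drop = (baseline - score) / baseline
        if drop > tolerance:
            regressions[name] = round(drop, 3)
    return regressions


if __name__ == "__main__":
    # Made-up numbers, only to show the calling convention.
    history = {"disk-write": [100.0, 100.0], "http-requests": [50.0]}
    latest = {"disk-write": 90.0, "http-requests": 49.0}
    print(find_regressions(history, latest))
```

Comparing against the mean of earlier runs rather than only yesterday's result is one way to dampen day-to-day noise; the right tolerance would have to be found by experiment on the actual test hardware.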

Upstream test suites

A number of server packages are known to provide test suites:

The packages that provide a build time test suite will be rebuilt in a PPA every day to catch regressions introduced by things further down the dependency chain.
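When a daily rebuild starts failing, user story 3 above suggests comparing package versions between the last successful run and the failing one. The sketch below shows one way this could be done, assuming each run saves a snapshot of installed packages in the two-column "package, version" form that `dpkg-query -W` prints by default. The function names and the version strings in the example are hypothetical.

```python
def parse_snapshot(text):
    """Turn 'package<whitespace>version' lines into a {package: version} dict."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # skip blank lines and entries without a version
            versions[parts[0]] = parts[1]
    return versions


def changed_packages(last_good, current):
    """List (package, old_version, new_version) for everything that differs.

    A version of None means the package was absent from that snapshot.
    """
    changes = []
    for pkg in sorted(set(last_good) | set(current)):
        old = last_good.get(pkg)
        new = current.get(pkg)
        if old != new:
            changes.append((pkg, old, new))
    return changes


if __name__ == "__main__":
    # Made-up version strings, for illustration only.
    last_good = parse_snapshot("libc6\t1.0-1\nphp5\t2.0-1\n")
    current = parse_snapshot("libc6\t1.1-1\nphp5\t2.0-1\n")
    for pkg, old, new in changed_packages(last_good, current):
        print(f"{pkg}: {old} -> {new}")
```

The output is only a list of candidates; someone on the server team would still need to decide which of the changed packages is the actual culprit.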

ISO testing

BoF agenda and discussion

Automated testing is a great way to prevent/detect regressions.

Security team qa-regression-testing scripts:

What we want:

Running tests in EC2.

Test results reporting:

Inclusion in milestone reports presented during the release team meeting.

QA team: easy to run the tests and process the test results internally (black box).

How are test suites updated because of changes in the system? Who?

What needs testing?

Integration list:

  1. qa-regression-testing scripts
    1. enable selected phoronix tests
  2. upstream test suites
    • integrate postgresql test suite
    • integrate puppet-testsuite package
    • integrate dovecot imap test suite (not packaged)
    • apache tests (has a framework, use documented in QRT)
    • libvirt test suite (not run during build, but could be), also tests in QRT (but not python-unit)
    • mysql test suite runs during the build
    • openldap test suite runs during the build
    • cups test suite runs in the build, but has concept of other levels (eg smbtorture)
    • samba
      • 'make test', but needs to be built with --enable-socket-wrapper
      • smbtorture
    • php5
  3. integrate iso testing tests in checkbox:
  4. review all the packages on the server CD
  5. Multi-system environments: documentation.
    • pacemaker
    • drbd

What sort of testing do we want to perform?

Misc

chat with Steve Beattie on 2010-06-09

2010-06-09T15:04:35

hggdh

sbeattie: so now it is us...

2010-06-09T15:04:48

hggdh

brb

2010-06-09T15:05:14

sbeattie

hggdh: no worries, I need a beverage refill.

2010-06-09T15:12:29

hggdh

sbeattie: I am back

2010-06-09T15:13:46

sbeattie

hggdh: moi aussi.

2010-06-09T15:15:35

hggdh

sbeattie: ça va. So... on qa-r-t: you were saying some of the tests are potentially complex/impossible to set up

2010-06-09T15:16:02

sbeattie

Yes, digging up my notes now.

2010-06-09T15:17:55

sbeattie

hggdh: here's what I had, last updated around beta 1 or so in lucid: http://paste.ubuntu.com/447390/

2010-06-09T15:20:09

hggdh

cool. Are they all under checkbox (those committed)?

2010-06-09T15:20:45

sbeattie

hggdh: committed means I'd committed to a local bzr tree and was awaiting merger into checkbox trunk; I'm updating my checkbox checkout to see if I'd gotten the committed ones merged.

2010-06-09T15:21:20

hggdh

sbeattie: ah, OK

2010-06-09T15:25:19

hggdh

sbeattie: another Q -- I see coreutils there. Upstream delivers coreutils with an extensive test suite, which is run everytime we build it

2010-06-09T15:26:11

hggdh

so, do we need it in qa-r-t? or can we just run a build (say) every day with updated packages?

2010-06-09T15:26:57

sbeattie

hggdh: heh, our coreutils test is very weak; it's basically an example test of /bin/{true,false} I used in a presentation to demonstrate how to write qa-r-t tests.

2010-06-09T15:27:09

hggdh

oh, OK

2010-06-09T15:27:16

hggdh

I had not yet looked at it

2010-06-09T15:27:29

sbeattie

hggdh: their testsuite is not included in a package?

2010-06-09T15:27:52

sbeattie

is it run during our coreutils package build?

2010-06-09T15:27:58

hggdh

sbeattie: no, it is not packaged as coreutils-tests, say. But it is run on every build

2010-06-09T15:28:23

hggdh

I had a brief look at it, and it is fully immersed into their makefile environment

2010-06-09T15:29:00

hggdh

also, I remember one of the maintainers stating that the utilities we run some few thousands of times during the tests

2010-06-09T15:29:23

hggdh

s/we run/were run/

2010-06-09T15:30:10

sbeattie

hggdh: I think build-time is sufficient for testing to ensure coreutils is okay; if you're hoping to catch bugs that coreutils depends on (glibc, kernel) then kicking off a frequent/daily rebuild may make sense.

2010-06-09T15:30:31

sbeattie

(all assuming package build fails if some threshhold of tests fail)

2010-06-09T15:30:51

hggdh

sbeattie: yes, build fails on a test error (I know, had them myself ;-)

2010-06-09T15:30:59

sbeattie

hggdh: awesome!

2010-06-09T15:31:18

hggdh

sbeattie: I will add them on the regression builds we currently do daily

2010-06-09T15:31:19

sbeattie

okay, looks like cups got merged, you can cross that one off.

2010-06-09T15:33:21

sbeattie

( http://bazaar.launchpad.net/~checkbox-dev/checkbox/trunk/annotate/head:/jobs/qa_regression.txt.in is the reference for what's been already merged)

2010-06-09T15:38:06

sbeattie

hggdh: okay, based on review, all the ones that are listed as COMMITTED have been merged and are in fact DONE

2010-06-09T15:40:00

hggdh

sbeattie: OK. I am updating my local copy of your list with a :1,$s/COMMITTED/DONE/

2010-06-09T15:40:38

sbeattie

hggdh: yep, now reviewing the list of tasks you have on the blueprint

2010-06-09T15:43:11

sbeattie

hggdh: ao cups, cyrus-sasl2, and mysql tasks are already done.

2010-06-09T15:43:15

sbeattie

s/ao/so/

2010-06-09T15:44:27

sbeattie

clamav used to have a need to wait between startup and the tests running, requiring manual intervention; this may have been fixed and needs exploration.

2010-06-09T15:44:58

sbeattie

fetchmail: don't recall the issues, needs exploration

2010-06-09T15:46:36

sbeattie

libvirt starts virtual machines (as you might expect); I had passed on that because I was using ESX guests as a testrun environment (to have an accurate idea of the limitations of the test network)

2010-06-09T15:47:03

sbeattie

... and thus I wasn't going to be able to kick off kvm guests

2010-06-09T15:47:05

hggdh

and it does not make sense to run libvirt on virt...

2010-06-09T15:47:24

sbeattie

yeah

2010-06-09T15:47:30

hggdh

OK. updating the ones done on the blueprint (and crediting you)

2010-06-09T15:48:06

sbeattie

net-snmp: the test script took arguments of some kind, and thus needs reworking before it can be integrated.

2010-06-09T15:49:11

sbeattie

apache2: IIRC, the same script was used to test the various flavors of apache (worker, threaded, etc.) and needs some thought before integration can occur.

2010-06-09T15:49:51

sbeattie

dhcp3: sets up a dhcp server; needs re-work to bind this to a fake interface or somesuch.

2010-06-09T15:50:39

sbeattie

dnsmasq: my note is unclear to me, needs exploration (sorry)

2010-06-09T15:50:46

hggdh

heh

2010-06-09T15:51:26

sbeattie

freeradius: our lucid packages appear to have some breakage.

2010-06-09T15:52:26

sbeattie

ipsec-tools: needs a setup environment of hosts/networks to test setting up vpns.

2010-06-09T15:53:51

sbeattie

httpd tests: qa-r-t doesn't have a script named that, not sure if it's a copy/waste error with lighttpd (which is also there)

2010-06-09T15:54:20

sbeattie

http://bazaar.launchpad.net/~ubuntu-bugcontrol/qa-regression-testing/master/files/head:/scripts/ is the listing of the test scripts

2010-06-09T15:56:20

sbeattie

libnet-dns-perl: my note isn't helpful, my guess is that errors may have been related to networking restrictions in the datacenter, needs exploration

2010-06-09T15:56:45

hggdh

sbeattie: those are the ones already integrated, correct?

2010-06-09T15:58:08

sbeattie

hggdh: the scripts in that directory? Some are, some aren't; the tree was mostly developed by the security team to test their updates and they run them manually on the packages they're working on.

2010-06-09T15:59:48

sbeattie

our goal here is to run as many of these as we can going forward to catch regressions in the development release/milestones.

2010-06-09T16:01:30

sbeattie

lighttpd: requires apache is not running, which is tricky if we enable the apache test script, as checkbox installs everything at once, and apache's postinstall starts it up.

2010-06-09T16:03:06

sbeattie

nagios3: I didn't explore this much because of the existence of nagios1 and nagios2 tests; we could probably get away with just enabling the nagios3 test. Needs exploration.

2010-06-09T16:03:40

sbeattie

nfs-utils: needs external to the host nfs clients and servers.

2010-06-09T16:04:15

hggdh

huh... thunderstorm arriving...

2010-06-09T16:04:56

sbeattie

ntp: needs access to external ntp servers.

2010-06-09T16:05:21

sbeattie

hggdh: heh, good luck. :-)

2010-06-09T16:07:35

sbeattie

we don't get many thunderstorms out west, though I heard one rumble this morning; I miss a good thunderstorm.

2010-06-09T16:09:14

sbeattie

nut: had unknown failures, needs exploration with the test script. Though I don't recall how useful the tests are for systems without a UPS attached.

2010-06-09T16:10:17

sbeattie

ah, nut has a dummy driver that the test script uses.

2010-06-09T16:11:27

sbeattie

pptpd: test has some hardcoded networking assumptions that cause failures, I think.

2010-06-09T16:13:12

sbeattie

python: script needs a little re-working as it takes an argument to specify which version of python (2.4, 2.5, 2.6) to test.

2010-06-09T16:13:47

sbeattie

ruby: similar issues as python

2010-06-09T16:14:40

sbeattie

samba: needs working external clients and servers in its environment

2010-06-09T16:16:16

sbeattie

squid: test requires multiple protocol (http, https, ftp) access to various ubuntu.com hosts.

2010-06-09T16:16:42

sbeattie

hggdh: I think that covers all the ones on your task list.

2010-06-09T16:17:24

hggdh

sbeattie: thank you. I am updating the blueprint with your notes (so that we have a reference)

2010-06-09T16:17:54

hggdh

sbeattie: is python 2.4 still in use?

2010-06-09T16:18:49

sbeattie

hggdh: looks like it got purged in lucid.

2010-06-09T16:19:29

sbeattie

(it's in main for dapper, hardy, jaunty, and karmic, which is why the security team cares)

2010-06-09T16:19:42

hggdh

K, so it stays

2010-06-09T16:20:10

sbeattie

well, for checkbox integration, we can possibly drop it.

2010-06-09T16:21:01

sbeattie

and just focus on the "current" supported python.

2010-06-09T16:21:18

sbeattie

python2.5 also got dropped in lucid, if rmadison is to be believed.

2010-06-09T16:21:24

hggdh

so, look at 2.6 only right now

2010-06-09T16:22:01

sbeattie

hggdh: that would be the short-term approach I'd take.

2010-06-09T16:22:38

hggdh

sbeattie: thank you. I will probably have Qs later on, if you do not mind

2010-06-09T16:23:25

sbeattie

hggdh: happy to answer what I can. I've been meaning to document this more, both for our internal uses and to encourage community members to contribute testcases.

2010-06-09T16:27:02

sbeattie

I, heh, do have a work item to add; late in the lucid cycle, zul added a mysql-testsuite which contains upstreams test infrastructure (and, AFAIK, he didn't test it's packaging at all); integrating it into our mysql test script has not made it to the top of my todo list.

2010-06-09T16:27:27

sbeattie

woo; grammer/english fail.

2010-06-09T16:27:42

hggdh

sbeattie: heh. I will check with zul

2010-06-09T16:30:08

*

sbeattie needs to step away for a bit

2010-06-09T16:30:14

--

sbeattie is now known as sbeattie-afk


CategorySpec

AutomatedServerTestingSpec (last edited 2010-07-14 15:14:23 by pool-71-252-251-234)