## page was renamed from MeetingLogs/devweek1001/devweek1001/AutoServerTests
== Dev Week -- Automated server testing -- soren -- Tue, Jan 26 ==
UTC
{{{#!IRC
[20:00] * soren clears throat
[20:00] Hi, everyone.
[20:00] Thanks for coming to my session on Automated Server Testing.
[20:00] So..
[20:00] In the server team, we've traditionally had a problem with collecting test results.
[20:01] (questions in #ubuntu-classroom-chat, by the way. Please put "QUESTION" so that I will spot them)
[20:01] This is because our target audience and most of our users are using Ubuntu on servers that are being used to service real users.
[20:01] Real users, as you are probably aware, depend on their servers to work.
[20:01] They need their mail server to be up and delivering mail so that they can get their daily dosage of spam..
[20:02] They need their file server to be around so they can get access to their music and various pirated software..
[20:02] They need their proxy server to work so that they can log onto Facebook..
[20:02] They need the LDAP server to work so that they can look up the phone number for the pizza guy..
[20:02] And other important things.
[20:02] You get the idea.
[20:02] If something should fail, it means pain and suffering for the poor sysadmin.
[20:02] Hence, sysadmins are very hesitant to upgrade anything before it's been through lots and lots of QA.
[20:03] However, unless /some/ of them /do/ upgrade, there's not going to be much QA work done.
[20:03] This places us in a rather unfortunate situation, where a significant portion of our bug reports don't come in until after release.
[20:03] Anyone involved in Ubuntu development will know that this is a hassle, since fixing things after release is much more tedious than before release: we have much less freedom to make changes.
[20:04] This is very difficult to change, and I haven't come up with a golden solution.
[20:04] However, the sooner we catch problems, the more time we have to work on fun stuff, since we'll be putting out fewer fires in the end.
[20:04] See, while we're cursed with a user base that doesn't start testing our product until it's essentially too late..
[20:05] ..we are blessed with a type of software that traditionally comes with a good test suite.
[20:05] MySQL, for instance, comes with an extensive test suite.
[20:05] This test suite runs every time we upload a new version of MySQL to Ubuntu.
[20:06] If the test suite fails, the build fails, and the uploader gets an e-mail.
[20:06] ...and it's all very obvious that something needs fixing.
[20:06] This is great.
[20:06] Well..
[20:06] Sort of.
[20:06] The thing is, every package in Ubuntu has dependencies of some sort.
[20:06] For instance, almost everything depends on libc.
[20:07] This means that a change in libc will inevitably affect MySQL somehow.
[20:07] Luckily, if this causes problems, it is (hopefully) caught by MySQL's test suite.
[20:07] Less luckily, this test suite, as I just mentioned..
[20:07] is run when MySQL is uploaded..
[20:07] not when libc is uploaded.
[20:08] So we may not notice a problem until the next time someone uploads MySQL. This could be weeks or even months!
[20:08] And trying to narrow down the change that broke something is hard with all the stuff going on in Ubuntu development over the course of months.
[20:08] So..
[20:09] To address this, we've set up an automated system that rebuilds MySQL (and a bunch of other stuff) every night in a PPA.
[20:09] That way, if we trust the test suite, we can relax and know that MySQL still works, despite any changes in its dependency chain.
[20:09] We do the same for libvirt, php5, postgresql, etc.
[20:10] Basically, anything that has a test suite that runs at build time and that causes the build to fail if it doesn't pass should be added.
[20:10] This at least makes me sleep better :)
[20:11] So, the automated testing stuff in Lucid consists of two parts.
[20:11] The above is the first part, which is pretty nice.
[20:11] The second part is awesome:
[20:11] :)
[20:11] It's an automated ISO testing system.
[20:11] ISO testing is the thankless and tedious job of installing Ubuntu from an ISO over and over again..
[20:12] ..with small adjustments each time to make sure things haven't changed unexpectedly.
[20:12] QUESTION: ~Shkodrani> why not run the test suite only when a package which, for instance, MySQL relies on changes?
[20:13] The cost of checking whether something in MySQL's dependency chain has changed is rather high. At the very least, it's tedious.
[20:13] ..and just doing the rebuild is cheap and simple to get up and running.
[20:13] It's all run by a 10 line shell script or thereabouts.
[20:13] Ok, ISO testing..
[20:14] Every time we come close to an alpha, beta or any other kind of release..
[20:14] ..we all spend a lot of time going through this install process.
[20:14] Well, we /should/ anyway. I positively suck at getting it done, but there you go.
[20:14] My fellow server team member, Mathias Gug, has had a preseed based setup running for a while now.
[20:15] Basically, preseeding is a way to answer all of the installer's questions up front.
[20:15] So, he takes all the answers..
[20:15] passes them to the installer using clever hacks..
[20:15] ..and the installer zips through the installation without bothering Mathias with questions.
[20:15] In the end, he can log into the installed system and run the last parts of the test cases.
[20:16] This has served us well, and has probably saved us several man days (or weeks?) of testing time over the last few years.
[20:16] However, it doesn't actually test the same things as the ISO test cases describe.
[20:16] The ISO test cases speak of the interaction between the user and the installer..
[20:16] However, the point of preseeding is to /avoid/ interaction, and to skip it entirely.
[20:16] Don't get me wrong..
[20:17] Preseed testing is super valuable.
[20:17] Installing that way is a supported install method, so having this well tested is wicked cool and really important.
[20:17] ...but I wanted to test the interactivity as well.
[20:18] So, being the virtualisation geek that I am..
[20:18] I decided to use the KVM autotest framework to do the ISO testing.
[20:18] Now, KVM autotest was designed to test KVM.
[20:19] KVM developers use it to install a bunch of different operating systems and test things to make sure they didn't change anything in KVM that broke functionality in one of the guest operating systems.
[20:19] What we want to do, though, is somewhat the opposite.
[20:19] We assume that KVM works and instead want to test the operating system.
[20:20] So, the KVM autotest framework works by running a virtual machine..
[20:20] grabbing a screenshot every second..
[20:20] ..and when the screenshot looks a particular way (e.g. when a particular dialog comes up),
[20:21] it can respond with a series of key presses or mouse events.
[20:21] This way, we can emulate a complete, interactive install session.
[20:21] Awesome stuff.
[20:21] I've started documenting this, but haven't gotten all that far, since I kept changing things faster than I could update the docs :)
[20:22] The documentation lives at https://wiki.ubuntu.com/AutomatedISOTesting
[20:22] If you all open that page..
[20:22] ..and scroll down to the "step files" section..
[20:23] you can see a sample step from a "step file".
[20:23] A step file is a description of a test case.
[20:23] Now, looking at the sample, you can see a "step 9.45" and a "screendump" line.
[20:23] They're pretty much just meta-data for the creator or editor of the step file,
[20:24] so don't worry about those.
[20:24] The important lines are the "barrier_2" and "key" ones.
[20:24] The barrier_2 line tells the testing system to wait..
[20:24] ..until the rectangle of the screen of size 117x34, starting at 79x303..
[20:24] has md5sum de7e18c10594ab288855a570dee7f159, within the next 47 seconds.
[20:25] If this doesn't happen, the test will fail, and a report will be generated.
[20:25] If it does pass, it goes on to the next step: "key ret"
[20:25] As you can probably guess, "key ret" sends a keypress to the guest, namely Return.
[20:26] The result of those two lines is: Wait for the language prompt right after boot to show up, and once it does, press Return to accept the default "English".
[20:26] Now, pretty soon, it became obvious that there was going to be a lot of duplication involved here.
[20:26] ...all the installs would have to wait for that prompt and respond to it in the same way.
[20:27] Even worse: If that prompt were to change, /every/ step file would need to be updated.
[20:27] Even worse again: In the beginning there was no concept of "updating" step files. You had to start all over.
[20:28] Starting over makes plain old ISO testing feel like a fun time.
[20:28] It's not.
[20:28] Just so you know.
[20:28] I love people for doing it, but it's really not that much fun. :)
[20:28] Ok, so to address the mass duplication of steps and stuff, I added a step file generator.
[20:29] The step file generator generates a step file (you probably guessed this much) based on the task to be installed and the partitioning scheme to be used.
[20:30] This means that I can tell the test framework: Hey, please test an install of the LAMP task, with LVM partitioning, and do it on amd64.
[20:30] And it does so.
[20:30] See, this is all running in a virtual machine.
[20:30] Virtual machines are cool.
[20:30] So cool, in fact...
[20:30] That you can use them to make installer videos.
[20:30] So, to see what happens during a test run, you can attach a recorder thingie and turn the result into an avi.
[20:31] Now, like any decent TV chef, I've cheated and done this all in advance.
[20:31] Now, unlike most decent TV chefs, what I did in advance failed.
[20:31] And even more unlike TV chefs, I'm going to show it to you anyway, because it's useful.
[20:32] Without further ado:
[20:32] heh..
[20:32] wait for it..
[20:32] http://people.canonical.com/~soren/lamplvminstall.avi
[20:32] There we go.
[20:32] wget http://people.canonical.com/~soren/lamplvminstall.avi ; mplayer lamplvminstall.avi
[20:32] This test case failed.
[20:33] Somewhat surprisingly.
[20:33] If you fast forward all the way to the end..
[20:33] (watch the rest as well, it's fun to watch the test system typing the username "John W. Doe III" and the password and whatnot)
[20:34] ..at the end, you'll see it breaks off before the install actually finishes.
[20:34] Like... seconds before it would have finished.
[20:34] Honestly, I did not mean for this to happen, but it's a good learning experience :)
[20:34] Ok, if we all look at..
[20:34] * soren digs through launchpad, bear with me.
[20:34] http://bazaar.launchpad.net/~soren/autotest/automated-ubuntu-server-tests/files/head:/client/tests/kvm/generator_data/lucid/
[20:35] Those are the input files for the step file generator.
[20:35] Yes, they are poorly named, but please appreciate that just days ago, they were all named "foo", "bar", "wibble", "wobble", etc., so this is a massive improvement.
[20:36] QUESTION: That method could be used for UI testing in a *lot* of different GUI apps, not just ISO installations. Any plans to document/release it more generally?
[20:36] (from rmunn)
[20:36] Yes!
[20:36] I meant to get that done for today, but the real world intervened and made a mockery of my plans.
[20:36] This can totally be used to do GUI installs as well.
[20:37] Looking at http://bazaar.launchpad.net/~soren/autotest/automated-ubuntu-server-tests/files/head:/client/tests/kvm/generator_data/lucid/ again..
[20:37] Specifically, 060-finish_install_and_reboot.steps
[20:37] http://bazaar.launchpad.net/~soren/autotest/automated-ubuntu-server-tests/annotate/head:/client/tests/kvm/generator_data/lucid/060-finish_install_and_reboot.steps
[20:38] This is the step that failed.
[20:38] For some reason (that I have yet to figure out, I only spotted this failure an hour ago) this times out.
[20:38] It says 579, but perhaps those are a special kind of seconds that are not as long as most people's seconds.
[20:39] The point is this: I only have to change the timeout in this one place, and all the test cases will be updated.
[20:39] < ~rmunn> QUESTION: I see a lot of keystrokes used to select various dialog widgets. Can the KVM testing system simulate mouse clicks and/or mouse
[20:39] movements (e.g., for testing mouseover stuff) as well?
[20:39] cut'n'paste for the lose :(
[20:39] Well..
[20:39] Yes.
[20:39] Sort of :)
[20:40] The autotest framework supports it, and I've added support for it to the frontend, but kvm has an.. um.. issue :)
[20:40] It used to emulate a mouse, so it would move the cursor relative to the current position.
[20:40] However, these days, GNOME and such give you...
[20:40] mouse acceleration!
[20:40] Yay!
[20:40] No. Not yay.
[20:41] Mouse acceleration is the enemy when you're actually warping the mouse from one place to another, because it thinks you just moved your mouse /really/ fast, and then moves it even further than you wanted it to.
[20:41] This took me /forever/ to realise.
[20:41] So, I've made it pretend to use a tablet.
[20:41] Tablets offer absolute positioning, so this helped a lot.
[20:42] However, the command to tell kvm to click on something internally translated into "mouse_event(click_button1, 0, 0, 0)", where 0,0,0 are the coordinates.
[20:42] Now, if you're in relative positioning mode (using a regular mouse), this is good.
[20:42] You want to click right where you are.
[20:42] ..if you're using a tablet, it means you can only click in the top left corner.
[20:42] No fun.
[20:43] I wrote a patch for that. I'm not sure it's in upstream KVM yet, but it'll be in Lucid half an hour after I start working on those GUI test cases :)
[20:44] So, yes, GUI testing is totally an option, too.
[20:44] Another problem I had with this is that it was designed to test a variable KVM against a static set of OSes.
[20:45] The OSes should look and act the same regardless of what changed in KVM.
That is the whole point of these tests: to make sure they don't change.
[20:45] However, we change the OS all the time. That's what we do :)
[20:46] ..but since the designers of this test system never meant for it to be used this way, they didn't add an option to edit these step files very conveniently.
[20:46] To fix this, I've added an option to the test system to fall back to the stepmaker (the GUI used to create step files) if a test fails.
[20:46] This is great if you're running tests on your laptop or a machine you have direct access to, rather than a machine running in a dusty corner of a data center.
[20:47] It really comes in useful when the screens change (wording changes, extra/fewer dialogs, change of theme (in the GUI)).
[20:47] Having to start over is, as I mentioned, no fun at all.
[20:48] Please shoot any questions you may have. I haven't really prepared much more than this.
[20:48] Still, questions belong in #ubuntu-classroom-chat
[20:50] If there are no more questions, I'll sing for the rest of the time slot.
[20:50] 21:50:25 < ~Omahn23> soren: QUESTION As an end user/sysadmin, is there anything I can do to help in testing with this new framework?
[20:51] Well, seeing as these things run in virtual machines, running them in more places is not going to make much difference, so /running/ the tests is probably not something we need help with.
[20:51] However!
[20:51] The more test cases we can include, the better.
[20:51] The more, the merrier.
[20:51] I'd love to have more test cases to include in our daily runs of this system.
[20:52] 21:52:01 < hggdh> QUESTION: so we can automate pseudo-interactive testing. How to deal with the tests that require meat between the keyboard and the chair?
[20:52] Examples?
[20:52] 21:50:12 < Ramonster> soren: Any thoughts on testing servers while they actually perform one of the roles you talked about at the start?
[20:52] Ramonster: You mean functional testing of e.g. a LAMP server?
[20:53] 21:52:23 < mscahill> QUESTION: you briefly mentioned PPA testing. what packages are included in this testing?
[20:53] * soren looks that up.
[20:53] 21:53:08 < Ramonster> soren: Yeah
[20:53] alright.
[20:53] Um, yes, but it's not part of this work I've been doing.
[20:53] We're not very strong in that area at all.
[20:53] ...and that's a shame.
[20:54] PKGS="libvirt postgresql-8.3 postgresql-8.4 mysql-dfsg-5.0 mysql-dfsg-5.1 openldap php5 python2.6 atlas"
[20:54] is the list of packages built daily.
[20:55] Well, the security team has a bunch of tests they run whenever they change anything. They often can't rely on anyone else to test anything (since they don't go through -proposed), so they need to be really thorough.
[20:55] I'm working on getting those run every day as well. They should provide some amount of functional testing.
[20:55] 21:53:12 < yltsrc> QUESTION: is writing test cases required for bugfixing?
[20:55] Not per se.
[20:56] Most test cases need updating once a bug is fixed, and most things I can think of would be covered by this, so new test cases (for this system, I mean) wouldn't be a requirement for bug fixes.
[20:57] 21:54:55 < mscahill> QUESTION: are there plans to allow automated testing for package maintainers with their own PPA?
[20:57] Sure, anyone is free to run that script and do their own testing.
[20:57] Hm... I may not have published it anywhere.
[20:57] * soren fixes that.
[20:57] Well, /me makes a note to fix that
[20:58] I do have a few ideas for doing functional testing of upgrades of various sorts, but most of those ideas are only a few hours old, so they're not even half baked yet :)
[20:59] Did I miss any questions?
[20:59] 21:59:33 < Ramonster> soren: That's the problem atm everyone is walking around with these half-baked ideas :)
[20:59] Whoops, didn't mean to post that here :)
[21:00] Thanks for showing up, everyone.
[21:00] that's it!
}}}
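
The nightly PPA rebuilds mentioned around 20:09 and 20:54 are driven by "a 10 line shell script or thereabouts", which was not pasted into the session. The following is only a minimal sketch of what such a script could look like, assuming each package in the PKGS list is fetched from the archive, given a rebuild version suffix, and uploaded to a rebuild PPA; the PPA name and version suffix here are placeholders, not the actual setup.

{{{
#!/bin/sh
# Hypothetical sketch of a nightly "rebuild in a PPA" script.
# Only the PKGS list comes from the session; the PPA name and
# version suffix are made-up placeholders.
set -e

PKGS="libvirt postgresql-8.3 postgresql-8.4 mysql-dfsg-5.0 mysql-dfsg-5.1 openldap php5 python2.6 atlas"
SUFFIX="~rebuild$(date +%Y%m%d)"

for pkg in $PKGS; do
    workdir=$(mktemp -d)
    cd "$workdir"
    apt-get source "$pkg"                                  # fetch the current source package
    cd "$pkg"-*/
    dch --local "$SUFFIX" "Automated nightly rebuild."     # bump the version so the PPA accepts it
    debuild -S -sd                                         # build a source-only upload
    dput ppa:server-rebuilds/ppa ../*_source.changes       # PPA builds it; build-time test suites run there
    cd /
    rm -rf "$workdir"
done
}}}

If any of those packages' build-time test suites fail, the PPA build fails and the uploader is notified, which is the whole point of the exercise.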
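The barrier_2 step described around 20:24 boils down to: crop a fixed rectangle out of each screenshot, hash it, and wait (up to a timeout) for the hash to match before sending the next keypress. kvm-autotest implements this itself; the shell sketch below only illustrates the idea using ImageMagick and md5sum. The screenshot path and the monitor-socket plumbing are assumptions for illustration, not the framework's actual interface.

{{{
#!/bin/sh
# Illustrative sketch of "barrier_2 117 34 79 303 <md5> 47" followed by "key ret":
# wait up to 47 seconds for the 117x34 region at offset 79,303 of the latest
# screenshot to have the expected md5sum, then press Return in the guest.
# /tmp/screendumps/latest.ppm and the monitor socket path are assumed names.
EXPECTED=de7e18c10594ab288855a570dee7f159
DEADLINE=$(( $(date +%s) + 47 ))

while [ "$(date +%s)" -lt "$DEADLINE" ]; do
    sum=$(convert /tmp/screendumps/latest.ppm -crop 117x34+79+303 +repage ppm:- \
          | md5sum | cut -d' ' -f1)
    if [ "$sum" = "$EXPECTED" ]; then
        # "key ret": send Return via the QEMU human monitor socket
        echo sendkey ret | socat - UNIX-CONNECT:/tmp/kvm-monitor.sock
        exit 0
    fi
    sleep 1
done
echo "barrier timed out: screen region never matched" >&2
exit 1
}}}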
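On the mouse discussion around 20:41: KVM/QEMU can expose a USB tablet instead of a PS/2 mouse, which gives the guest absolute pointer coordinates and sidesteps the in-guest mouse acceleration problem described above. A minimal example of starting a guest that way (the disk image name is just a placeholder):

{{{
# Use a USB tablet so pointer positions are absolute, not relative deltas
# that get distorted by the guest's mouse acceleration.
kvm -m 512 -hda lucid-server.img -usbdevice tablet -vnc :1
}}}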