BootPerformance

Revision 4 as of 2009-01-23 05:35:00

Clear message

Summary

TBD

Release Note

Rationale

Use Cases

Assumptions

Definition

The very term "boot" is confusing: no two people agree exactly where the boot sequence begins, and where it ends.

From a user's point of view, the time it takes a machine to boot the time from when they first press the power button to the time that the system is fully loaded and settled down.

Quick study of any Windows user, an operating system that employs tricks of bringing a login screen up earlier and deferring many services to start during login, will show that the user doesn't trust the system until the hard drive light is off and things have stopped changing on the screen.

Put simpler, the real boot time is from the moment the user wants their machine for something to the moment they think it's ready to do that.

Unfortunately the system goes through seven distinct phases from our point of view:

  1. hardware initialisation, BIOS, etc.
  2. boot loader, including loading kernel and initramfs images
  3. kernel initialisation
  4. initramfs
  5. core system startup ("plumbing")
  6. X startup
  7. desktop startup

The first is completely out of our control, until such time as we are able to work sufficiently closely with the hardware vendor that we can remove BIOS from the equation. It's only from the second stage that we are executing code that we can modify and improve.

The boot loader is tiny and takes an immeasurably small amount of time to load, the primary amount of time is the delay to allow interaction and the loading phase afterwards. This varies from platform to platform, and is a complex subject in its own right; for the purposes of this specification, we assume the boot loader is also out of our control.

Thus our timer shall begin with kernel initialisation; this is also the easiest and most reliable measurement, since the kernel keeps its own timer from the moment its code execution begins and this serves as a very accurate clock available throughout userspace.

Our timer ends once the desktop has been loaded, all applets in the session have appeared, and the disk activity has ceased.

For the purposes of testing, auto-login is desired.

Process

There are two schools of thought as to the best approach to improving the boot time.

The first is that you take what you have, profile and chart it, and identify areas for improvement. You iterate over this process steadily reducing the time it takes for your system to boot. Usually you have no clear goal other than "faster than before".

The second is that you decide up front what time you intend to boot in, and plan a budget for your boot sequence accordingly. If you intend to reach a full desktop in ten seconds, you may only allocate two seconds to the kernel for example. Following this budget, you work on each piece individually until it's fast enough, and move on to the next piece.

It's hard to deny that this second school of thought gives brilliant results, it was the method followed by Intel's Arjan van de Ven for his "5 second boot" talk. And although their system was rather more stripped down and less flexible than a generic distribution image, his process has clear merit.

On the other hand, it also has risk associated with it. It's easiest to do by starting from scratch and putting a system together in pieces. When you have a complete distribution to upgrade, maintaining user configuration without regressions, it's somewhat harder to pull off.

At this point in time, we have a lot of low hanging fruit. Much of our core system could do with updating and generally tidying up. There's also a significant number of boot speed related bugs that we're aware of, and which would not be difficult to fix.

In short, without significant investigation, we know of enough work to fill a release cycle or more that will bring a noticeable reduction in boot speed.

Thus we will follow the first school of thought for now. We'll correct the problems we already know about, get everything up to date and cleaned up, and give ourselves a sane base to work from in future.

No doubt in a short number of releases time, we'll reach a point where we simply can't make it any faster by iterative reduction. It's at this point we'll set ourselves an aggressive budget, and begin focusing on each component individually to bring them in under budget.

Hardware

It is difficult to compare results of tests made on different pieces of hardware. Since the length of time to boot is ultimately related to the hardware underneath, especially the disk drive, changes must be made on one platform and compared there.

So that different people may work on boot performance and compare their results, it makes sense that reference hardware platforms be used. These should be systems that are stable in their configuration so that two different models should give near identical results.

This does not mean that the effect of changes on other platforms shall be discounted, indeed it is important to measure the effect generally to ensure regressions are not occurring. It's simply not useful to know that "my machine boots in 32s" unless we know how long it took to boot before that, and what was changed in the meantime.

Since the netbook form factor is one of the driving forces behind boot performance work, it makes sense to use a netbook as one of the reference hardware platforms. The chosen model is the Dell Mini 9, this has the Intel Atom processor and an SSD disk, so makes an excellent benchmark for this form factor.

It's also recommended that as the ARM architecture gains more prominence, an ARM-based platform is chosen.

Finally it's suggested that a standard laptop model, with a rotary disk, is selected to serve as the reference for the mass market.

Implementation

Investigate Behdad's posts on improving GNOME login time:

UI Changes

Code Changes

Migration

Test/Demo Plan

Unresolved issues

BoF agenda and discussion


CategorySpec