LucidBetterArchiveCrawler

Summary

Changelog generation is currently rather slow and requires a local mirror of the archive. By switching to launchpadlib, we can ask for updates via the API every 30 minutes and provide a much nicer experience for our users.

Release Note

The changelogs.ubuntu.com data is now updated much more frequently, so the changelogs presented in update-manager stay current.

Rationale

Users who update frequently want to see the changelogs in update-manager/aptitude/synaptic as soon as possible.

User stories

Joe is a sysadmin and likes to know what is going on. He likes that he now sees the changelogs for proposed changes right away.

Assumptions

Launchpad should not deal with the changelog queries directly (to avoid the additional load on it). Instead, we use a cache of static changelog files that is served via HTTP.

Design

A new script is written that uses launchpadlib and getPublishedSources(since_date) to fetch information only about recently changed packages. The source files are downloaded and unpacked, and the copyright, NEWS.Debian and changelog files are extracted into changelogs/pool/. The script then creates symlinks for any binary packages/versions that differ from the source package name/version.
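The symlink step could look roughly like the following sketch. It is not taken from the crawler code; the directory layout (pool/<component>/<prefix>/<source>/<name>_<version>) and the helper names pool_prefix, pool_dir and link_binaries are assumptions made for illustration, and versions with an epoch are not treated specially.

```python
import os


def pool_prefix(source_name):
    """Debian-style pool prefix: 'libf' for libfoo, otherwise the first letter."""
    return source_name[:4] if source_name.startswith("lib") else source_name[:1]


def pool_dir(root, component, source_name, name, version):
    """Directory for one name/version below the source package's pool directory.

    The layout is an assumption for this sketch, not a documented format.
    """
    return os.path.join(root, "pool", component, pool_prefix(source_name),
                        source_name, "%s_%s" % (name, version))


def link_binaries(root, component, source_name, source_version, binaries):
    """Symlink every binary (name, version) that differs from the source.

    The extracted files are assumed to already live in the source's own
    directory. Returns the list of symlinks that were newly created.
    """
    target = pool_dir(root, component, source_name, source_name, source_version)
    created = []
    for bin_name, bin_version in binaries:
        if (bin_name, bin_version) == (source_name, source_version):
            continue  # same directory as the source, nothing to link
        link = pool_dir(root, component, source_name, bin_name, bin_version)
        if os.path.lexists(link):
            continue  # already linked (or present) from an earlier run
        # Link and target share a parent directory, so a relative link is enough
        # and survives moving the whole pool to another machine.
        os.symlink(os.path.basename(target), link)
        created.append(link)
    return created
```

Relative links are used here so that the pool can be copied to a new host without rewriting them.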

It needs to run once initially with the full list of published sources to populate pool/. From that point on, it only needs to ask for updates since its last run, which should make it really fast. A cron job that runs it every ~30 minutes should be fine. There is no need for a local mirror; all it needs is HTTP access to Launchpad and a current launchpadlib.
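A minimal sketch of the "only ask for updates since the last run" bookkeeping is shown below. The state file format and the names load_last_run, save_last_run and run_once are hypothetical; fetch_since stands in for whatever wraps the launchpadlib query and is passed None on the initial full run.

```python
import json
import os
from datetime import datetime, timezone


def load_last_run(path):
    """Return the timestamp of the previous run, or None on the first run."""
    try:
        with open(path) as f:
            return datetime.fromisoformat(json.load(f)["last_run"])
    except FileNotFoundError:
        return None


def save_last_run(path, when):
    """Store the timestamp atomically so an interrupted run cannot corrupt it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"last_run": when.isoformat()}, f)
    os.replace(tmp, path)


def run_once(state_path, fetch_since, process, now=None):
    """Process everything published since the last run; return the item count.

    fetch_since(since) is a caller-supplied callable (hypothetical here).
    The timestamp is taken before querying, so sources published while the
    run is in progress are picked up again next time instead of being lost.
    """
    now = now or datetime.now(timezone.utc)
    since = load_last_run(state_path)
    count = 0
    for publication in fetch_since(since):
        process(publication)
        count += 1
    # Only reached if every item was processed without an exception.
    save_last_run(state_path, now)
    return count
```

Because the timestamp is saved only after a successful pass, a failed cron run is simply retried over the same window by the next one.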

As a bonus, we can update an RSS feed file while going over the list of changes.
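One way the feed could be produced is sketched below with the standard library only. The channel title, link and description, as well as the entry tuple layout (source, version, url, published), are placeholders chosen for this example rather than anything specified here.

```python
import xml.etree.ElementTree as ET
from email.utils import format_datetime


def build_feed(entries,
               title="Recent changelogs",
               link="http://changelogs.ubuntu.com/",
               description="Recently updated source packages"):
    """Build a minimal RSS 2.0 document as a string.

    entries is an iterable of (source, version, url, published) tuples,
    where published is a timezone-aware datetime.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for source, version, url, published in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "%s %s" % (source, version)
        ET.SubElement(item, "link").text = url
        # RSS expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(published)
    return ET.tostring(rss, encoding="unicode")
```

The crawler would collect one entry per processed source while iterating and write the result out once at the end of the run.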

Implementation

Some code in lp:~mvo/+junk/lp-changelogs-crawler.

Blocked because of bug #487597.

Migration

A new machine should be used that will eventually replace changelogs.ubuntu.com. It needs to be populated with the changelogs once. It should be carefully tested, and scripts should be used to ensure there are no missing changelogs (this can be done via python-apt). Once we are confident in the new service, the DNS should be switched to point changelogs.ubuntu.com to the new machine. We keep the old changelogs service running for a certain amount of time to be able to switch back in an emergency.

When switching to a new changelogs.ubuntu.com, we *must* also copy the various "meta-release" files.

Test/Demo Plan

To test before the switch, update-manager needs to have a configurable changelogs location. This can then be used to test whether all changelogs are available.
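A check for missing changelogs against a configurable location could be sketched as follows. The base URL, the URL layout and the function names are assumptions for illustration; the list of (component, source, version) tuples would come from elsewhere, for example from python-apt, and versions with an epoch are not treated specially.

```python
from urllib.request import Request, urlopen

# Assumed default; point this at the new machine while testing.
DEFAULT_BASE = "http://changelogs.ubuntu.com/changelogs"


def changelog_url(base, component, source, version):
    """Build the changelog URL below a configurable base location."""
    prefix = source[:4] if source.startswith("lib") else source[:1]
    return "%s/pool/%s/%s/%s/%s_%s/changelog" % (
        base.rstrip("/"), component, prefix, source, source, version)


def url_exists(url, timeout=10):
    """Return True if a HEAD request for url succeeds with status 200."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers HTTP errors, connection failures and timeouts.
        return False


def find_missing(packages, base=DEFAULT_BASE, exists=url_exists):
    """Return the (component, source, version) tuples without a changelog."""
    return [pkg for pkg in packages
            if not exists(changelog_url(base, *pkg))]
```

Running find_missing once against the old host and once against the new one, with the same package list, gives a direct comparison before the DNS switch.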

BoF agenda and discussion

During the session, the Soyuz team said they would not be able to do such a feature; it would have to be done via the API.


CategorySpec

FoundationsTeam/Specs/LucidBetterArchiveCrawler (last edited 2009-11-24 15:00:36 by p5B09D7A3)