MaverickFinishUpstart

Differences between revisions 6 and 7
Revision 6 as of 2010-12-08 09:58:25
Size: 10090
Editor: 89
Comment:
Revision 7 as of 2010-12-08 10:07:26
Size: 10036
Editor: 89
Comment:

Summary

  1. Introduce Debug Stanza
  2. Support for Override Files
  3. Visualization of Jobs/Events

Release Note

  1. Upstart now supports a new "debug" stanza that, if specified in a job configuration file, causes Upstart to leave the child process in a paused state. This allows developers to attach a debugger to the child to trace its behaviour.
  2. Support has been added for "override" files. These ".override" files -- which are not created by default -- allow tools to modify the behaviour of any job by overriding the job configuration file in various ways.
  3. It is now possible to visualize the flow of jobs and events as an aid to analysing system behaviour.

Rationale

Introduce Debug Stanza

This feature will make it significantly easier to debug both Upstart and any child processes it creates.

Support for Override Files

Although Upstart has a well-defined configuration syntax it is useful for system administrators and tools to be able to modify the default behaviour of a job without disrupting the provided job configuration file. Overrides provide this facility.

Visualization of Jobs/Events

Upstart processes jobs and events in a very efficient manner, with jobs (tasks) often completing before the administrator has had a chance to run "initctl status/list". This can cause confusion since users sometimes wonder why their jobs haven't started, not realizing that they have already run. Similarly, events are processed rapidly. What is needed is a way to log and view the flow of jobs, events and state changes in an easy and flexible manner.

User stories

Introduce Debug Stanza

  • Jennifer is an application developer and wants to single-step a daemon process in a production-like environment to help her understand a bug reported by one of her users.
  • Donald has written a new daemon and is trying to understand why it doesn't run as he expects.

Support for Override Files

  • Ben is a system administrator who manages a large group of database servers. The vendor has provided a buggy Upstart job configuration file which needs to be worked around to prevent database servers from failing to start. Ben cannot change the existing ".conf" file since server security auditing software would flag this as an error.
  • Hermione wants to write a simple tool that allows a job to be disabled without modifying the existing ".conf" files.
  • Rafael wants to be able to change the run-levels a job runs in without modifying the existing ".conf" file.

Visualization of Jobs/Events

  • Finn is an administrator and wants a log of all jobs showing when they ran and how long they took to run.
  • Lauren is a developer and wants to know how long the server application she has written actually ran before it crashed and what other jobs were running at the time of the crash.

Assumptions

Design

Introduce Debug Stanza

A new job configuration file stanza "debug" will be added which takes no parameters.

Support for Override Files

Recognize files of the form /etc/init/<job>.override. If present, these files will be parsed after the corresponding /etc/init/<job>.conf. Any stanza specified in the ".override" file will replace any specified stanza in the corresponding ".conf" file.

Override files accept the same syntax as job configuration files with one addition: a new stanza called "manual". This stanza will only be recognized in an override file (it will be an error for it to exist in a ".conf" file) and, if specified, makes Upstart ignore any "start on" stanza found in the corresponding ".conf" file. If manual is specified, a job can only be started manually by an administrator using start or initctl(8).

If the log priority is set appropriately or if the --verbose option has been passed to init, Upstart will display a log message when it runs a job whose environment has been modified by an override file.

Like ".conf" files, any changes to ".override" files will be automatically detected and applied to all relevant future jobs.
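As an illustration only, the override rules above could be sketched in Python as follows. The one-line-per-stanza parser, the function names parse_stanzas and apply_override, and the example job contents are assumptions made for this sketch; they are not Upstart's actual parser or API. How "manual" interacts with a "start on" stanza given in the same override file is not stated above, so the sketch only drops the "start on" that came from the ".conf" file.

```python
def parse_stanzas(text):
    """Map each stanza keyword to its argument string.

    Simplified assumption: every stanza fits on one line and '#'
    always starts a comment.
    """
    stanzas = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        words = line.split()
        # Treat "start on" and "stop on" as two-word stanza names.
        if words[0] in ("start", "stop") and len(words) > 1 and words[1] == "on":
            key, rest = " ".join(words[:2]), " ".join(words[2:])
        else:
            key, rest = words[0], " ".join(words[1:])
        stanzas[key] = rest
    return stanzas


def apply_override(conf_text, override_text=None):
    """Parse the .conf first, then let .override stanzas replace its stanzas."""
    merged = parse_stanzas(conf_text)
    if "manual" in merged:
        raise ValueError('"manual" is only valid in an override file')
    if override_text is None:
        return merged
    override = parse_stanzas(override_text)
    if "manual" in override:
        # "manual" makes the "start on" from the .conf file be ignored.
        merged.pop("start on", None)
    merged.update(override)
    return merged


# Hypothetical job: disable automatic start without touching the .conf file.
conf = "start on runlevel [2345]\nexec /usr/sbin/exampled\n"
job = apply_override(conf, "manual\n")
```

In this sketch, job no longer has a "start on" entry but keeps the "exec" line from the ".conf" file, which is the behaviour Hermione's disabling tool would rely on.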

Visualization of Jobs/Events

Initially, the following forms of visualization will be provided:

  • a structured log file (ASCII)
  • a parser or tool that generates graphviz / dotty diagrams.

Implementation

UI Changes

Should cover changes required to the UI, or specific UI that is required to implement this

Code Changes

Introduce Debug Stanza

If the debug stanza is specified, Upstart will:

  1. Log a message (including the PID of the child process) stating that it intends to pause the process.
  2. Pause the child process using pause(2).
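A rough Python analogue of these two steps, assuming a POSIX system, might look like the sketch below. signal.pause() wraps pause(2). The function name spawn_paused and the log text are hypothetical, and Upstart itself is written in C, so this is not its implementation. How the child is later resumed (for example, by a debugger) is not described above and is left open here.

```python
import os
import signal
import sys


def spawn_paused(argv):
    """Fork a child that logs its PID and pauses before exec'ing argv."""
    pid = os.fork()
    if pid == 0:
        try:
            # Step 1: log the PID so a developer knows what to attach to.
            # (In this sketch the child logs for itself to keep the order.)
            sys.stderr.write("debug: pausing child with PID %d\n" % os.getpid())
            sys.stderr.flush()
            # Step 2: sleep until a signal is delivered to this process.
            signal.pause()
            os.execvp(argv[0], argv)
        finally:
            # Only reached if pause or exec failed; never return into
            # the parent's code path from the child.
            os._exit(127)
    return pid
```

The parent receives the child's PID immediately, while the child stays blocked before the exec, which is the window in which a debugger could be attached.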

Support for Override Files

  • Add inotify watches for ".override" files.
  • If a ".override" file exists without a corresponding ".conf" file, and a ".conf" file is then created, the ".conf" file must still be parsed first.

Visualization of Jobs/Events

  • Increase default and verbose logging.
  • Write a parser to generate summaries and dot language output.
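The parser item above could be sketched as follows. The log format is not defined in this specification, so the four-field line layout used here (time, job, action, triggering event) is purely an assumption for illustration, as are the function names parse_log, durations and to_dot.

```python
def parse_log(text):
    """Parse assumed log lines of the form: <time> <job> <start|stop> <event>."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        time, job, action, event = fields
        records.append((float(time), job, action, event))
    return records


def durations(records):
    """Summary: how long each job ran, from its start to its stop record."""
    started, result = {}, {}
    for time, job, action, _event in records:
        if action == "start":
            started[job] = time
        elif action == "stop" and job in started:
            result[job] = time - started.pop(job)
    return result


def to_dot(records):
    """Emit dot language output with one edge per event that started a job."""
    lines = ["digraph jobs {"]
    for _time, job, action, event in records:
        if action == "start":
            lines.append('  "%s" -> "%s";' % (event, job))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical log excerpt; "-" marks a record with no triggering event.
sample = """\
0.25 mountall start startup
0.5 gdm start filesystem
1.5 mountall stop -
"""
records = parse_log(sample)
```

The string returned by to_dot(records) can be fed to the graphviz tools, and durations(records) gives the per-job run times that Finn's user story asks for.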

Migration

Introduce Debug Stanza

No forward migration issues perceived. However, note that the new debug stanza will introduce a backwards compatibility issue should you try to use a job configuration file containing the "debug" stanza with older versions of Upstart. The solution is to remove the "debug" stanza.

Support for Override Files

No migration or compatibility issues perceived: older versions of Upstart only consider ".conf" files.

Visualization of Jobs/Events

No migration or compatibility issues perceived.

Test/Demo Plan

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

Unresolved issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.

Please document the outcome of this session at:
https://wiki.ubuntu.com/UDSProceedings/N/PackageSelectionAndSystemDefaults#Finish Upstart

Last UDS (recap): Stable version of Upstart in Ubuntu now for a few releases, working out reasonably well, but there are a number of things we need to fix (the fact that mountall is needed, user services, etc.).  Used last UDS to get a sense of what complaints were from various people and groups, and made sure that appropriate bugs were filed.

Upstart's design is too simple.  The goal of the next version is to fix problems based on deployment experience while retaining the "Upstartishness", and reach an elegant, simple design that we don't need to change again: i.e. 1.0.

https://bugs.edge.launchpad.net/upstart/+bugs
https://bugs.edge.launchpad.net/upstart/+bug/447654

BUG: upstart events can trigger apparent deadlocks

Biggest upstart bug:
  job maurice has "start on A and B"
  then emit event A
  process emitting event A hangs until event B occurs

The issue is when we use or:

  start on A and (B or C)

If you now emit A, then B, then C, the C will block waiting for a _second_ A.  

(Diagram: http://bit.ly/ddfeGG)

The solution is to make states from events.  And then jobs can wait on states in addition to events.
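A minimal sketch of the idea, not a proposed implementation: each emitted event sets a state that persists, and a job's condition is evaluated against the set of current states. The class name StateTracker, its methods, and the way states are cleared are all assumptions; the notes above do not say how or when a state should end.

```python
class StateTracker:
    """Keep states derived from events so conditions can be re-evaluated."""

    def __init__(self):
        self.states = set()

    def emit(self, event):
        # The event itself is transient; the state it sets persists.
        self.states.add(event)

    def clear(self, state):
        # Assumption: something explicitly ends a state.
        self.states.discard(state)

    def holds(self, condition):
        return condition(self.states)


def a_and_b_or_c(states):
    """The condition from the example: start on A and (B or C)."""
    return "A" in states and ("B" in states or "C" in states)


tracker = StateTracker()
for event in ("A", "B", "C"):
    tracker.emit(event)
    ready = tracker.holds(a_and_b_or_c)
```

Because A remains set as a state, the later C is evaluated against it directly and nothing has to wait for a second A.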

BUG: upstart keeps files open on /

BUG: pid tracking can be defeated which leads to upstart breakage

We should be using the proc connector to track children; this would resolve the tracking issue.

BUG: service which is slow to start can appear started

Jobs which are starting slowly can appear started to a subsequent start of that job, which makes that second start return immediately when the service is not actually started. For example, this can cause gdm to start too early.

This can be fixed by queuing new events against a job when a job is transitioning.
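The queuing idea can be sketched as below. The Job class, its state names and its methods are hypothetical and only illustrate the principle that a start request arriving during a transition is held until the transition completes.

```python
from collections import deque


class Job:
    """Toy job with three assumed states: stopped, starting, running."""

    def __init__(self, name):
        self.name = name
        self.state = "stopped"
        self.pending = deque()  # requesters waiting for the job to be up

    def request_start(self, requester):
        """Return True only if the job is really running already."""
        if self.state == "running":
            return True
        # Queue the request instead of reporting success too early.
        self.pending.append(requester)
        if self.state == "stopped":
            self.state = "starting"
        return False

    def finish_starting(self):
        """Mark the job running and release everyone who was queued."""
        self.state = "running"
        released = list(self.pending)
        self.pending.clear()
        return released
```

With this shape, a second start issued while the job is still "starting" is queued alongside the first and is released only once finish_starting() runs, so a dependent such as gdm would not proceed too early.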

ISSUE: it's very hard to find out why jobs are running

We want to be able to generate a dependency graph from a boot to find out why jobs have run.  A solution here would also allow an interactive boot.

ISSUE: chroots do not work, as you talk to the 'wrong' upstart

Likely solution: tell upstart about 'chroots' so that it can track them and use the right job tree
 * if upstart is explicitly told about chroot/etc/init, then automatically start jobs in that chroot at boot
 * otherwise, assume /etc/init relative to /proc/PID/root when event received

Proposed Changes:
 - add the concept of states which are based on events and persist beyond the event, which jobs can depend on
 - child tracking should use proc connector
 - events should be queued against jobs when the job is transitioning
 - overrides to local configuration of jobs without editing them
 - new hook on starting * to allow tracking of job dependencies
 - upstart will know about chroots and make itself available in there if it is going to start jobs automatically; and use the root directory for local start within a chroot to start the right jobs

Need to get buy-in for these features


CategorySpec

FoundationsTeam/Specs/MaverickFinishUpstart (last edited 2010-12-15 09:22:10 by 92)