KernelKarmicBugHandling

Differences between revisions 6 and 7
Revision 6 as of 2009-05-26 07:04:02
Size: 5421
Editor: 80
Comment: add uds discussion
Revision 7 as of 2009-06-05 20:00:07
Size: 8221
Editor: c-76-105-148-120
Comment: Draft Spec post UDS discussions
Line 11: Line 11:
Due to the high volume of incoming kernel bugs an improved approch to bug management is being introduced. See KernelKarmicBugHandling for more information. Due to the high volume of incoming kernel bugs an improved approach to bug management is being introduced. See KernelKarmicBugHandling and [[http://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies|KernelTeamBugPolicies]]for more information.
Line 15: Line 15:
At the time of the Jaunty 9.04 release there were over 5000 [[https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bugs|open kernel bugs]]. That's a 1300+ net increase in bugs compared to the number of open bugs when Intrepid released. The kenrel team can not realistically be expected to close 5000 bugs in a single release cycle. If the total number of incoming kernel bugs continues to grow at a faster rate than the existing number of bugs can be closed, it's obvious that a new approach to dealing with the incoming bug volume needs to be addressed. Otherwise the probability of a critical or high priority bug not being seen by the kernel team becomes a greater concern. At the time of the Jaunty 9.04 release there were over 5000 [[https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bugs|open kernel bugs]]. That's a 1300+ net increase in bugs compared to the number of open bugs when Intrepid released. The kernel team can not realistically be expected to close 5000 bugs in a single release cycle. If the total number of incoming kernel bugs continues to grow at a faster rate than the existing number of bugs can be closed, it's obvious that a new approach to dealing with the incoming bug volume needs to be addressed. Otherwise the probability of a critical or high priority bug not being seen by the kernel team becomes a greater concern.
Line 20: Line 20:
 * Sue reports a high impact bug against the Karmic Alapha release. Unfortunately her bug is lost amongst the massive volume of existing kernel bugs.
 * Joe reports a regression he's seen from Intrepid to Jaunty. Again this bug is lost in the masses and continues to be a regression for Karmic.
 * Sue reports a high impact bug against the Karmic Alpha release. Unfortunately her bug is lost amongst the massive volume of existing kernel bugs.
 * Joe reports a regression he's seen from Intrepid to Jaunty that may be resolved in the latest upstream mainline kernel. However, Joe is unaware he should test the upstream kernel and the bug remains unresolved.
Line 24: Line 24:

The biggest issue is making sure that incoming bugs are in an appropriate debug state for developers to begin working on them. It may take weeks of back-and-forth communication before a reporter has attached all the appropriate logs. It's been decided that much of this could be streamlined by using a series of Arsenal scripts. Below is a description of how the Kernel Arsenal scripts will work. The end goal is that the reporter of a bug against the Ubuntu kernel in Launchpad will also have tested the latest upstream mainline kernel to verify whether the issue exists or is fixed upstream. The Ubuntu Kernel Team's focus will shift to those bugs which are confirmed to exist upstream, or which are fixed upstream but still exist in the Ubuntu kernel. We are also updating the [[http://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies|KernelTeamBugPolicies]] wiki to match the changes being introduced here.

=== Kernel Arsenal Scripts ===

==== arsenal/contrib/linux/process-new-bugs.py ====
{{{
if Status == New
    if has_tag("omit"|"workflow"|"review-request")
        exit

    if has_tag("apport-bug")
        tag "needs-upstream-testing"
        logs_complete = True
        message = test upstream

    if has_tag("apport-package")
        logs_complete = True

    if Importance == Wishlist
        exit

    if bug_tasks > 3
        tag "review-request"
        exit

    if symptom == workflow
        tag "workflow"
        exit

    if symptom == sound bug
        tag "kernel-sound"

    if attachments == 0
        tag "needs-kernel-logs"
        tag "needs-upstream-testing"
        message = apport-collect
    elif parse_attachments()
        logs_complete = True

    if old bug (ie last comment > 120 days ago)
        tag "needs-verification"

    tag "kj-triage"

    if logs_complete
        status = Confirmed
    else
        status = Incomplete

    if message
        post message as a comment to the bug
}}}

==== arsenal/contrib/linux/process-incomplete-bugs.py ====
{{{
if status="Incomplete (with response)" and has_tag("needs-kernel-logs")
    if "apport-collect data" in message_collection
        remove tag "needs-kernel-logs"
        status = Confirmed
    else
        tag "review-request"
}}}

==== arsenal/contrib/linux/process-incomplete-bugs.py ====
{{{
if status="Confirmed" and has_tag("kj-triage") and not has_tag("needs-kernel-logs"&&"needs-upstream-testing")
    status = Triaged
}}}

== Test/Demo Plan ==

The Kernel Arsenal scripts should land soon at https://code.edge.launchpad.net/~arsenal-devel. They have a dry-run option enabled to ensure bugs are handled appropriately before the scripts are turned on.

Additionally, we will target small subsets of the kernel bugs until it's deemed reasonable to run the scripts against every kernel bug. A cron job will then run the Kernel Arsenal scripts regularly.

== Unresolved issues ==


== BoF agenda and discussion ==
Line 42: Line 122:
== Test/Demo Plan ==

A wiki must be created to document any changes to the bug reporting/triaging process.

== Unresolved issues ==

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

== BoF agenda and discussion ==

Summary

The incoming volume of kernel bugs has become difficult to manage. The resources available will not scale with the number of incoming bugs as the volume continues to increase. The goal of this spec is to introduce better bug management workflows and practices to combat this growing number of bugs.

Release Note

Due to the high volume of incoming kernel bugs, an improved approach to bug management is being introduced. See KernelKarmicBugHandling and KernelTeamBugPolicies for more information.

Rationale

At the time of the Jaunty 9.04 release there were over 5000 open kernel bugs. That's a net increase of more than 1300 compared to the number of open bugs when Intrepid was released. The kernel team cannot realistically be expected to close 5000 bugs in a single release cycle. If incoming kernel bugs continue to arrive faster than existing bugs can be closed, a new approach to dealing with the incoming bug volume is needed. Otherwise it becomes increasingly likely that a critical or high-priority bug will not be seen by the kernel team.

User stories

  • Bob reported a bug originally against Hardy and hasn't updated his bug since. However, this bug remains open and contributes to the large volume of bugs that must be tracked. This bug should be closed.
  • Sue reports a high impact bug against the Karmic Alpha release. Unfortunately her bug is lost amongst the massive volume of existing kernel bugs.
  • Joe reports a regression he's seen from Intrepid to Jaunty that may be resolved in the latest upstream mainline kernel. However, Joe is unaware he should test the upstream kernel and the bug remains unresolved.

Implementation

The biggest issue is making sure that incoming bugs are in an appropriate debug state for developers to begin working on them. It may take weeks of back-and-forth communication before a reporter has attached all the appropriate logs. It's been decided that much of this could be streamlined by using a series of Arsenal scripts. Below is a description of how the Kernel Arsenal scripts will work. The end goal is that the reporter of a bug against the Ubuntu kernel in Launchpad will also have tested the latest upstream mainline kernel to verify whether the issue exists or is fixed upstream. The Ubuntu Kernel Team's focus will shift to those bugs which are confirmed to exist upstream, or which are fixed upstream but still exist in the Ubuntu kernel. We are also updating the KernelTeamBugPolicies wiki to match the changes being introduced here.

Kernel Arsenal Scripts

arsenal/contrib/linux/process-new-bugs.py

if Status == New
    if has_tag("omit"|"workflow"|"review-request")
        exit

    if has_tag("apport-bug")
        tag "needs-upstream-testing"
        logs_complete = True
        message = test upstream

    if has_tag("apport-package")
        logs_complete = True

    if Importance == Wishlist
        exit

    if bug_tasks > 3
        tag "review-request"
        exit

    if symptom == workflow
        tag "workflow"
        exit

    if symptom == sound bug
        tag "kernel-sound"

    if attachments == 0
        tag "needs-kernel-logs"
        tag "needs-upstream-testing"
        message = apport-collect
    elif parse_attachments()
        logs_complete = True

    if old bug (ie last comment > 120 days ago)
        tag "needs-verification"

    tag "kj-triage"

    if logs_complete
        status = Confirmed
    else
        status = Incomplete

    if message
        post message as a comment to the bug
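
The rules above can be sketched as ordinary Python. This is only an illustration of the decision order, not the actual Arsenal script: the `Bug` class, its field names, the comment texts, and `logs_look_complete()` (a stand-in for `parse_attachments()`) are all assumptions, and no Launchpad API is called.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

SKIP_TAGS = {"omit", "workflow", "review-request"}


@dataclass
class Bug:
    """Hypothetical stand-in for a Launchpad bug; not a launchpadlib type."""
    status: str = "New"
    importance: str = "Undecided"
    tags: set = field(default_factory=set)
    task_count: int = 1
    symptom: Optional[str] = None
    attachments: list = field(default_factory=list)
    last_comment: Optional[datetime] = None


def logs_look_complete(attachments):
    """Placeholder for parse_attachments(); assumes any attachment suffices."""
    return len(attachments) > 0


def process_new_bug(bug, now):
    """Apply the New-bug rules in order; return the comment to post, or None."""
    if bug.status != "New" or bug.tags & SKIP_TAGS:
        return None

    logs_complete = False
    message = None

    if "apport-bug" in bug.tags:
        bug.tags.add("needs-upstream-testing")
        logs_complete = True
        message = "Please test the latest upstream mainline kernel."
    if "apport-package" in bug.tags:
        logs_complete = True

    # Early exits: wishlist items, bugs with many tasks, workflow bugs.
    if bug.importance == "Wishlist":
        return None
    if bug.task_count > 3:
        bug.tags.add("review-request")
        return None
    if bug.symptom == "workflow":
        bug.tags.add("workflow")
        return None
    if bug.symptom == "sound":
        bug.tags.add("kernel-sound")

    if not bug.attachments:
        bug.tags.update({"needs-kernel-logs", "needs-upstream-testing"})
        message = "Please run apport-collect to attach kernel logs."
    elif logs_look_complete(bug.attachments):
        logs_complete = True

    # "Old" means the last comment is more than 120 days in the past.
    if bug.last_comment is not None and now - bug.last_comment > timedelta(days=120):
        bug.tags.add("needs-verification")

    bug.tags.add("kj-triage")
    bug.status = "Confirmed" if logs_complete else "Incomplete"
    return message
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the 120-day rule easy to check against a fixed date.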

arsenal/contrib/linux/process-incomplete-bugs.py

if status="Incomplete (with response)" and has_tag("needs-kernel-logs")
    if "apport-collect data" in message_collection
        remove tag "needs-kernel-logs"
        status = Confirmed
    else
        tag "review-request"
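
A minimal Python sketch of this rule follows. The function name, the plain-string comments, and matching on the literal marker "apport-collect data" are assumptions made for illustration; the real script may detect apport-collect output differently.

```python
def process_incomplete_bug(status, tags, comments):
    """Return (new_status, new_tags) for a bug that was waiting on kernel logs.

    `comments` is a list of comment strings. Bugs that do not match the
    precondition are returned unchanged.
    """
    tags = set(tags)
    if status != "Incomplete (with response)" or "needs-kernel-logs" not in tags:
        return status, tags

    if any("apport-collect data" in comment for comment in comments):
        # The reporter supplied logs: drop the request tag and confirm.
        tags.discard("needs-kernel-logs")
        return "Confirmed", tags

    # The reporter replied without logs: hand the bug to a human.
    tags.add("review-request")
    return status, tags
```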

arsenal/contrib/linux/process-incomplete-bugs.py

if status="Confirmed" and has_tag("kj-triage") and not has_tag("needs-kernel-logs"&&"needs-upstream-testing")
    status = Triaged
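
The tag test in this rule is ambiguous as written. The sketch below assumes the intended reading is that a bug is promoted only when neither pending tag remains; the function and constant names are hypothetical.

```python
PENDING_TAGS = {"needs-kernel-logs", "needs-upstream-testing"}


def promote_if_ready(status, tags):
    """Return "Triaged" for a Confirmed, kj-triage bug with no pending tags."""
    tags = set(tags)
    if status == "Confirmed" and "kj-triage" in tags and not (tags & PENDING_TAGS):
        return "Triaged"
    return status
```

Under the stricter literal reading, where only the combination of both tags blocks promotion, the final condition would become `not PENDING_TAGS <= tags`.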

Test/Demo Plan

The Kernel Arsenal scripts should land soon at https://code.edge.launchpad.net/~arsenal-devel. They have a dry-run option enabled to ensure bugs are handled appropriately before the scripts are turned on.

Additionally, we will target small subsets of the kernel bugs until it's deemed reasonable to run the scripts against every kernel bug. A cron job will then run the Kernel Arsenal scripts regularly.
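
One way to picture the dry-run and small-subset approach is the sketch below. It is an assumption-laden illustration, not the Arsenal implementation: bugs are plain dicts with "id", "status", and "tags" keys, and `run_rule`, `limit`, `dry_run`, and `tag_for_triage` are hypothetical names.

```python
import copy


def run_rule(bugs, rule, limit=None, dry_run=True):
    """Run `rule` over the first `limit` bugs and report what it would change.

    With dry_run=True the original bugs are left untouched and only the
    planned changes are returned, as (id, old status, new status, added tags).
    """
    planned = []
    subset = bugs if limit is None else bugs[:limit]
    for bug in subset:
        candidate = copy.deepcopy(bug)
        rule(candidate)
        if candidate != bug:
            added = sorted(candidate["tags"] - bug["tags"])
            planned.append((bug["id"], bug["status"], candidate["status"], added))
            if not dry_run:
                bug.update(candidate)
    return planned


def tag_for_triage(bug):
    """Toy rule used only for this illustration."""
    if bug["status"] == "New":
        bug["tags"].add("kj-triage")
        bug["status"] = "Incomplete"
```

Reviewing the returned list for a small `limit` before switching `dry_run` off mirrors the plan of widening the target set gradually.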

Unresolved issues

BoF agenda and discussion

Below are a few ideas that have been suggested (some more drastic than others):

  • Mark all open bugs as Won't Fix after every release. Reporter must reopen the bug once it's confirmed against the latest development kernel.
    • Consider some exceptions, like don't close bugs tagged regression-*
    • Alternatively consider closing out all old Incomplete kernel bugs. These account for 1400+ open bugs.
    • Also consider closing out New, Confirmed, Triaged bugs which have not had a comment for say 2+ months.
  • Only allow Ubuntu specific kernel bugs to be reported. If the bug exists upstream it should be reported upstream. Ubuntu kernel devs can help resolve bugs reported upstream.
  • Modify the bug reporting process to incorporate a series of questions to be answered such as:
    • Is this a regression?
    • Have you tested the upstream kernel?
    • Has this bug been confirmed against the upstream kernel?
    • How reproducible is this bug? What are the steps to reproduce?
  • Automated bug handling? Similar to what xorg does.
    • For example, a bot will ask reporters to run apport-collect if general debug files are missing.
  • Disable (i.e. get rid of) the "Report a Bug" button from https://bugs.launchpad.net/ubuntu/+source/linux
    • Encourage the use of 'ubuntu-bug linux' instead.

A new kernel bug bankruptcy policy was proposed for treating kernel bugs.

 1. Bugs reported against the linux package in LP must be tested against the latest upstream kernel. If they're still valid, a new upstream bug report must be opened and an upstream bug watch must be put in place in LP.
 2. If no response is received after 30 days, the bug would be marked as Won't Fix.
 3. This policy would be discussed with Jono in order to get community input.
 4. The upstream bug report must include the kernel version.
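
The 30-day rule could be checked along these lines. The proposal does not say when the clock starts; this sketch assumes it starts when the reporter is asked to test upstream, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

RESPONSE_WINDOW = timedelta(days=30)


def should_mark_wont_fix(last_request, last_response, now):
    """True when a request has gone unanswered for more than 30 days.

    `last_response` is None when the reporter has never replied.
    """
    answered = last_response is not None and last_response >= last_request
    return not answered and now - last_request > RESPONSE_WINDOW
```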

A few ideas discussed at the meeting:

  • Launchpad: Ubuntu-specific bugs. Check with the upstream kernel whether the reported bug still exists.
  • apport would help us get more data, but we need more data. LP bugs must be linked against upstream.
  • Higher priority if a bug can be tested with an upstream build.
  • Existing 5000 bugs: bug bankruptcy. Close them automatically after 30 days.
  • Educate our users to indicate which kernel they are reporting the bug against, especially upstream.
  • Take a survey of Fedora and openSUSE about their bug bankruptcy policies/upstream bug report flow.
  • Support for old hw, i.e.
  • 95% of time is spent looking for patches to solve downstream problems.
  • Test the upstream kernel to check whether the bug is still there.
  • Ask for a new bug report if the bug exists in the latest upstream kernel.
  • compat-wireless stack against ... if you report a bug against it you need the build date. Fast-moving target.
  • Encourage people to follow the path...
  • Talk to Jono about bug bankruptcy.
  • Upstream mainline kernel rights.
  • After release, focus on regressions.
  • Be careful about flooding upstream with unrelated bug reports.
  • Take a sample and run a report, looking for bugs against older releases still present in the latest cycle.


CategorySpec

specs/KernelKarmicBugHandling (last edited 2009-06-12 18:34:10 by c-76-105-148-120)