KernelKarmicHWdbWorkshop

Revision 7 as of 2009-05-26 20:11:07


Summary

The goal of this spec is to produce a set of scripts which will allow the kernel team to mine data from the HWDB.

Release Note

A series of HWDB scripts have been uploaded to [insert bzr tree]. These scripts provide the ability to extract interesting data from the hardware database. This data can help determine statistics regarding users of a specific device and bugs reported against that device. Those metrics can also help produce a prioritization of work based on the number of affected users with that hardware.

Rationale

Leading up to the Jaunty UDS, the Ubuntu kernel team held discussions with the Launchpad team about building a hardware database to which end users could submit their hardware profiles. In theory, those profiles could then be linked to bug reports, or mined more generally to determine, for example, how much of the overall Ubuntu user base a hardware-specific issue may affect. Scripts to extract this data would let the Ubuntu kernel team estimate how many users an issue may affect, which helps prioritize work and allocate resources. The scripts could also quickly determine which bugs relate to a specific piece of hardware or driver so that they can be grouped under one master bug.

User stories

  • We ran into a real-world use case with bug 359392 for Jaunty. It would have been helpful if we could have queried all hwdb profiles to discover who had an i965 graphics controller [8086:2a02] and how widespread this issue would have been for the Ubuntu user base.
  • The kernel team would like to query all bugs tagged suspend/resume which use the rtl8187 driver. See the ubuntu kernel-team ml thread "Commonly found hardware in suspend/resume bugs".

Design

All scripts will be written in Python and use launchpadlib. In particular, refer to the hwdb API.

Implementation

Existing Scripts

Some scripts have already been written and are available at http://bazaar.launchpad.net/~adeuring/+junk/hwdb-scripts/files

  • bug-related-hardware.py

  • subscribed-bugs-of-device-owners.py

    • Search for users owning the given device or having a device controlled by the given driver, and return Ubuntu bugs these people are subscribed to. You must specify a driver name, or bus, vendor ID and product ID.
    • Examples:
    • python subscribed-bugs-of-device-owners.py -d iwl3945

  • device-statistics.py

    • Queries the entire hwdb and returns the total number of device owners, submitters, submissions, etc.
    • Examples:
      $./device-statistics.py -nor -b PCI -v 0x8086 -p 0x2a02
      
      device owners: 45301 all submitters: 877776
      submissions with device: 55983 all submissions: 1179316
      total number of devices:  56537
      • Query only HWDB reports for Intrepid:
        • ./device-statistics.py  -b PCI -v 0x8086 -p 0x2a02 -s intrepid

      • Only intrepid/amd64:
        • ./device-statistics.py  -b PCI -v 0x8086 -p 0x2a02 -s intrepid -a amd64

      • You can also limit the counts to a driver and/or a (kernel) package name, though this returns 0 for this device: it seems that most HWDB reports do not associate any driver with it. The PCI device [0x10de:0x0429] (an NVidia graphics card), for example, yields more interesting driver/package-related numbers.
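The raw counts above are easier to compare once they are turned into shares of the whole database. The following is a minimal sketch, assuming the script's output keeps the "label: number" layout shown in the example; the helper names `parse_device_statistics` and `percentage` are made up for illustration and are not part of the hwdb scripts.

```python
import re

# Sample text copied from the device-statistics.py run shown above.
SAMPLE_OUTPUT = """\
device owners: 45301 all submitters: 877776
submissions with device: 55983 all submissions: 1179316
total number of devices:  56537
"""


def parse_device_statistics(text):
    """Return a dict mapping each 'label: number' pair in the output to an int."""
    # Labels are runs of letters and spaces; several pairs may share one line.
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*?):\s*(\d+)", text)
    return {label: int(value) for label, value in pairs}


def percentage(part, whole):
    """Percentage of `whole` represented by `part`; 0.0 if `whole` is zero."""
    return 100.0 * part / whole if whole else 0.0


stats = parse_device_statistics(SAMPLE_OUTPUT)
owner_share = percentage(stats["device owners"], stats["all submitters"])
submission_share = percentage(
    stats["submissions with device"], stats["all submissions"]
)
print("owners with device: %.2f%% of submitters" % owner_share)
print("submissions with device: %.2f%% of submissions" % submission_share)
```

A share like this is only as representative as the set of people who submitted profiles, so it is best read as a rough indicator for prioritization.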

Script Requests

  1. I don't believe it is currently possible to search based on something like system.hardware.vendor="Dell Inc." and system.hardware.product="Inspiron 1420". I think the closest that can be done is specifying something like vendor_id=0x8086.
    1. Use Case: I want to examine all bugs tagged "suspend" and determine which hardware vendor and model (eg. system.hardware.vendor="Dell Inc." and system.hardware.product="Inspiron 1420") each bug is being reported against. This will help to determine possible duplicate bugs and ensure they are duped to one master bug.
    2. This is currently possible. We store these properties as the vendor/product IDs for the "bus" called "System". In the "real world", this can nevertheless be a problem with many IBM/Lenovo laptops. For these machines, system.hardware.product is something unreadable like "6457BAG", while the "ordinary" product name is stored in system.hardware.version.
  2. Need to be able to expire profiles.
    1. Use Case: Bug reporter's hw has died.
  3. Restrict hwdb searches to just a bug *reporter's* profile
    1. Use Case: I just want to search for bugs that have a reporter with the specified hw. I don't care about the subscribers of the bug, since some people subscribe to a bug just to monitor it but don't have the affected hw. Also, people often have the same symptom of a bug but in reality don't have the same hw, so it's a different bug.
  4. Restrict hwdb searches to just the bug *subscribers*
    1. Use Case: I can take a known hw specific bug and find subscribers who need to open a new bug because they have different hardware. Otherwise they just end up spamming the bug with unrelated comments and making it more difficult for developers and triagers to follow the relevant issue.
    2. Should this work like the following?
      • take a bug number
      • loop over all bug subscribers (or people saying they are affected by this bug)
      • show data about all hardware profiles from these people? (Filters for this data, like "show only PCI devices" or "show only the device drivers"?)
  5. Restrict hwdb searches to a specific package a bug was filed against.
    1. Use Case: A person may have submitted a hw profile but I only want to see bugs they reported against the "linux" kernel source package that the profile may be related to. I don't necessarily want to know about bugs they filed against say xorg for example.
  6. I don't want to have to specify a bus type when searching for a vendor and device id
    1. Use Case: It's just annoying specifying a bus type, it should be able to query across all buses
      • You might end up getting results for USB devices when you are looking for a PCI device, so I'm hesitating to remove this requirement.
    2. Use Case: Find the top 10 drivers with the most bugs assigned against them in the last 30 days.
      • This is a very interesting use case. But consider how slow the current scripts are: they either start with one given device (a driver would be the same, speed-wise), look up device owners and then look for related bugs; or they start with one given bug and then list devices from affected users. With the currently available webservice API, we would need to loop over a number of bugs, find affected users and finally look up drivers in hardware reports from these users. I think this use case really needs (but for sure deserves ;) an "advanced" API method.

  7. Intermixing search options. I think this could be resolved through a better API.
    1. Use Case: For example, I'd like to search for all subscribers of bug 359392 who have the i965 graphics controller [8086:2a02]. Or I'd like to search for all bugs tagged "suspend" whose bug reporter has a graphics card which uses the nvidiafb driver.
      • Right, better API methods would make these searches easier. But your first question can be answered today, I think; see the new script "bug-related-device-owners.py" in lp:~adeuring/+junk/hwdb-scripts. For your second example, I suspect that the run time would be _really_ bad with the API method we have today.
  8. Customize output
    1. Use Case: Sometimes I only want to see affected users (eg launchpad id or email) or just bug numbers and titles. Or, as in the first use case I only want to see the system.hardware.vendor and system.hardware.product info for bugs tagged "suspend". Again, I think if the api were more extensive I could script my own custom queries and specify the output. It'd be great if I could query the hwdb through the api and it could return me a bug collection that I could iterate over or give me a hw profile object that I could print info like the system.hardware.vendor and system.hardware.product info from.

  9. Need to get authorization from people who submit hw profiles that it be okay for us to mine the data they've provided and additionally contact them regarding their hw.
    1. Use Case: I noticed that bug 359392 has the potential to affect a large number of users. A query of the hwdb validated this concern. It would be great if we could then be able to contact the affected users to issue calls for testing or give an advanced warning of the issue.

  10. The Documentation Team has proposed/suggested that it would be nice to consider exposing the information from the hwdb in a more palatable interface than strictly an API.
    • The only alternatives right now are the HardwareSupport section of the Ubuntu wiki and the Hardware section of the community help wiki.

    • This information has become out-of-date, poorly maintained, too large to manage, and often inaccurate. Instead, perhaps we could create a cron job to mine the relevant data from the hwdb and generate data that would be accessible through a web interface.
    • The Documentation Team envisions something similar to "http://www.ubuntuhcl.org/". Once this goes live, the Documentation Team would like to be informed so they can remove all hardware references (specifically outdated and inaccurate ones) from the wiki and direct users to this proposed new database site instead. They have offered to help get this off the ground in any way they can, as this could be a joint project.

    • This may sound "too" ambitious, but perhaps this proposed/theoretical website, when searching for a particular piece of hardware, could also search launchpad for known bugs and the forums for related posts. This would help centralize all hardware related documentation while still keeping them semi-separate.
    • For more information on this topic, check out: https://answers.edge.launchpad.net/checkbox/+question/70686.

  11. Request from Bryce to run device-statistics on the following devices. Basically it is all the Intel chipsets, which are those numbered [8086:*]. The specific numbers are as follows:
    • define PCI_CHIP_I810 0x7121
    • define PCI_CHIP_I810_DC100 0x7123
    • define PCI_CHIP_I810_E 0x7125
    • define PCI_CHIP_I815 0x1132
    • define PCI_CHIP_I810_BRIDGE 0x7120
    • define PCI_CHIP_I810_DC100_BRIDGE 0x7122
    • define PCI_CHIP_I810_E_BRIDGE 0x7124
    • define PCI_CHIP_I815_BRIDGE 0x1130
    • define PCI_CHIP_I830_M 0x3577
    • define PCI_CHIP_I830_M_BRIDGE 0x3575
    • define PCI_CHIP_845_G 0x2562
    • define PCI_CHIP_845_G_BRIDGE 0x2560
    • define PCI_CHIP_I855_GM 0x3582
    • define PCI_CHIP_I855_GM_BRIDGE 0x3580
    • define PCI_CHIP_I865_G 0x2572
    • define PCI_CHIP_I865_G_BRIDGE 0x2570
    • define PCI_CHIP_I915_G 0x2582
    • define PCI_CHIP_I915_G_BRIDGE 0x2580
    • define PCI_CHIP_I915_GM 0x2592
    • define PCI_CHIP_I915_GM_BRIDGE 0x2590
    • define PCI_CHIP_E7221_G 0x258A
    • define PCI_CHIP_E7221_G_BRIDGE 0x2580
    • define PCI_CHIP_I945_G 0x2772
    • define PCI_CHIP_I945_G_BRIDGE 0x2770
    • define PCI_CHIP_I945_GM 0x27A2
    • define PCI_CHIP_I945_GM_BRIDGE 0x27A0
    • define PCI_CHIP_I945_GME 0x27AE
    • define PCI_CHIP_I945_GME_BRIDGE 0x27AC
    • define PCI_CHIP_G35_G 0x2982
    • define PCI_CHIP_G35_G_BRIDGE 0x2980
    • define PCI_CHIP_I965_Q 0x2992
    • define PCI_CHIP_I965_Q_BRIDGE 0x2990
    • define PCI_CHIP_I965_G 0x29A2
    • define PCI_CHIP_I965_G_BRIDGE 0x29A0
    • define PCI_CHIP_I946_GZ 0x2972
    • define PCI_CHIP_I946_GZ_BRIDGE 0x2970
    • define PCI_CHIP_I965_GM 0x2A02
    • define PCI_CHIP_I965_GM_BRIDGE 0x2A00
    • define PCI_CHIP_I965_GME 0x2A12
    • define PCI_CHIP_I965_GME_BRIDGE 0x2A10
    • define PCI_CHIP_G33_G 0x29C2
    • define PCI_CHIP_G33_G_BRIDGE 0x29C0
    • define PCI_CHIP_Q35_G 0x29B2
    • define PCI_CHIP_Q35_G_BRIDGE 0x29B0
    • define PCI_CHIP_Q33_G 0x29D2
    • define PCI_CHIP_Q33_G_BRIDGE 0x29D0
    • define PCI_CHIP_GM45_GM 0x2A42
    • define PCI_CHIP_GM45_BRIDGE 0x2A40
    • define PCI_CHIP_IGD_E_G 0x2E02
    • define PCI_CHIP_IGD_E_G_BRIDGE 0x2E00
    • define PCI_CHIP_G45_G 0x2E22
    • define PCI_CHIP_G45_G_BRIDGE 0x2E20
    • define PCI_CHIP_Q45_G 0x2E12
    • define PCI_CHIP_Q45_G_BRIDGE 0x2E10
    • define PCI_CHIP_G41_G 0x2E32
    • define PCI_CHIP_G41_G_BRIDGE 0x2E30
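Running device-statistics once per chipset by hand is tedious, so the command lines can be generated from the list above. This is only a sketch: it uses the `-b`, `-v` and `-p` options shown in the earlier examples, includes just three of the IDs for brevity, and the helper name `device_statistics_argv` is made up for illustration.

```python
# A few (name, product ID) pairs copied from the list above; extend as needed.
INTEL_CHIPS = [
    ("PCI_CHIP_I965_GM", 0x2A02),
    ("PCI_CHIP_I965_GME", 0x2A12),
    ("PCI_CHIP_GM45_GM", 0x2A42),
]

INTEL_VENDOR_ID = 0x8086


def device_statistics_argv(product_id, vendor_id=INTEL_VENDOR_ID, bus="PCI"):
    """Build the argument list for one device-statistics.py run."""
    return [
        "./device-statistics.py",
        "-b", bus,
        "-v", "0x%04x" % vendor_id,
        "-p", "0x%04x" % product_id,
    ]


# Map each chipset name to the command that would query it.
commands = {name: device_statistics_argv(pid) for name, pid in INTEL_CHIPS}

for name, argv in commands.items():
    print(name, " ".join(argv))
```

Each argument list could then be handed to `subprocess.run(argv)` from a checkout of the scripts. Given the run times reported under "Unresolved issues", it is probably wise to try a single device first.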

Test/Demo Plan

It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during testing, and to show off after release. Please add an entry to http://testcases.qa.ubuntu.com/Coverage/NewFeatures for tracking test coverage.

This need not be added or completed until the specification is nearing beta.

Unresolved issues

  1. Doesn't work on Jaunty.
  2. The API documentation is pretty poor. For example, you have to just "know" that most if not all properties of "team" are also available for "person". The documentation doesn't indicate that in any way. Bug 363440 was filed about this.

  3. Performance. It doesn't matter how good the documentation or APIs are if it takes too long to get the information out. Queries should take seconds, not hours. For example, running `subscribed-bugs-of-device-owners.py -b PCI -v 0x8086 -p 0x2a02` to see how many people may be affected by the i965 freeze issues (bug 359392) took 10+ hours! I actually don't know the exact total run time because I walked away after 10 hours and came back in the morning to see if it had finished.

    • We should indeed provide a number of API methods. Queries "canned on SQL level" should be quite fast compared with the existing scripts which need thousands of API calls to achieve a result.
  4. The main issue is the lack of linkage between bugs and hw-info. From what Abel has explained, the hardware information is submitted via checkbox while the bug reports are generated by apport (or filed through launchpad directly). The two utilities don't seem to know about each other. The linkage between hw profiles and bug reports needs to be bi-directional. Also, the way that existing scripts try to match up hw profiles to a bug doesn't take into account that a single user could have made multiple hw-info submissions related to different systems (desktop vs. laptop for example).
    • Right, the best option would be to automatically create a HW profile when apport reports a bug and to link the profile and the bug. That's something we should discuss during UDS, I think. The next-best option would be to allow bug reporters to link a bug with a hardware profile via a web UI. The problem here is that ca. 1/3 of all people who provided at least one HW profile actually submitted two or more profiles: it can be quite difficult to figure out which HW report belongs to the machine affected by a bug. Martin had some interesting ideas about how we can present the HW profiles on a web page, but I am not sure if/when we will have enough "developer capacity" to implement this. The third option is to provide a webservice API method which creates these links (this would also be needed for option 1), and to write a script which creates these links. I think we should discuss options for better ways to link bug reports and hardware profiles at UDS in Barcelona. There are so many different people involved (the QA team, checkbox developers, apport developers, LP developers) that a meeting during UDS is probably the best way to get something started.
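To illustrate why multiple profiles per person complicate any linking script, here is a minimal sketch that separates bug reporters whose profile can be linked automatically (exactly one submission) from those who would need to choose one by hand. The data layout, user names and submission keys are hypothetical; a real script would have to fetch them through the webservice API.

```python
def split_by_profile_count(profiles_by_user):
    """Split users into those with exactly one HW profile and those with more.

    Users without any profile end up in neither result.
    """
    unambiguous = {}
    ambiguous = {}
    for user, profiles in profiles_by_user.items():
        if len(profiles) == 1:
            # Only one machine on record, so the link is safe to create.
            unambiguous[user] = profiles[0]
        elif len(profiles) > 1:
            # Desktop vs. laptop, for example: someone has to pick.
            ambiguous[user] = list(profiles)
    return unambiguous, ambiguous


# Hypothetical data: user names and submission keys are made up.
profiles_by_user = {
    "alice": ["submission-a1"],
    "bob": ["submission-b1", "submission-b2"],
    "carol": [],
}

unambiguous, ambiguous = split_by_profile_count(profiles_by_user)
print("link automatically:", unambiguous)
print("needs manual choice:", ambiguous)
```

Even the unambiguous case is only a guess, since a reporter's single profile may describe a different machine from the one affected by the bug.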

BoF agenda and discussion

Use this section to take notes during the BoF; if you keep it in the approved spec, use it for summarising what was discussed and note any options that were rejected.


CategorySpec