MetricsBasedTesting

Summary

All our current testing is binary pass/fail, but we would also like to track the evolution of certain parameters, such as boot speed and power usage, that may not have a clear pass/fail threshold but for which historical data is desirable. We will extend our infrastructure to collect and display such data.

Rationale

Work to improve performance requires access to rich performance data rather than binary pass/fail.

User stories

Design

Gather non-binary data during various checkbox tests and feed it on to the developers who need it for their work (a rough sketch of such a test follows the examples below).

Examples

  • Tracking bootspeed (using bootchart)
  • Power usage
  • I/O throughput
  • Amount of memory being used
    • polled during install, application testing, immediately after boot, etc.
  • Webcam framerate
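
As an illustration only, the sketch below shows what a checkbox-style metric test might look like: it records the amount of memory in use immediately after boot, prints the single number we would eventually want to track, and keeps the raw data as a file attachment. The attachment directory and output format are assumptions made up for this sketch, not an existing checkbox interface.

{{{#!python
#!/usr/bin/env python3
# Hypothetical sketch of a checkbox-style metric test. It records how much
# memory is in use and emits both a single numeric value (the kind of number
# Phase II would store in a database) and the raw data as an attachment.
# File names and layout are assumptions, not an existing API.

import os
import shutil
import time

ATTACHMENT_DIR = os.path.expanduser("~/metric-attachments")  # assumed location


def memory_in_use_kb():
    """Return used memory in kB, computed from /proc/meminfo."""
    values = {}
    with open("/proc/meminfo") as meminfo:
        for line in meminfo:
            key, rest = line.split(":", 1)
            values[key] = int(rest.strip().split()[0])  # sizes are in kB
    return values["MemTotal"] - values["MemFree"]


def main():
    os.makedirs(ATTACHMENT_DIR, exist_ok=True)
    timestamp = time.strftime("%Y%m%d-%H%M%S")

    # The single number that could later be tracked over time.
    print("memory-in-use-kb: %d" % memory_in_use_kb())

    # The raw data kept as a file/blob attachment for detailed analysis.
    shutil.copy("/proc/meminfo",
                os.path.join(ATTACHMENT_DIR, "meminfo-%s.txt" % timestamp))


if __name__ == "__main__":
    main()
}}}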

Implementation

The initial pass should be quick to implement and provide value to the groups who need this data.

Phase I (Karmic)

  • Gather metric data as attachments (files/blobs)
  • Provide access to gathered metric data for developers
  • Do not parse it in the certification system or results tracker -- just gather and regurgitate

Provide access to the data without doing any parsing (a rough sketch of this layout follows the list below)

  • Have a directory containing the data
  • .../bootchart/$machinename/$timestamp.tgz or .../$machinename/metrics/bootchart/$timestamp.tgz
    • linked from the certification 'librarian' so there is no necessity for copying the files around
  • Have a file containing metadata about the machine from this view
    • this data is already available from the website and can be retrieved programmatically
    • information such as processor speed, ram, etc.
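
The following is a minimal sketch of how the attachments and the per-machine metadata file could be laid out on disk, assuming the second path scheme above; the root directory, the metadata file name and its fields are placeholders invented for illustration.

{{{#!python
# Sketch of the second layout proposed above:
#   <root>/$machinename/metrics/bootchart/$timestamp.tgz
#   <root>/$machinename/machine-info.json   <- metadata file (assumed name)
# Paths, file names and metadata fields are illustrative assumptions.

import json
import os
import shutil
import time

ROOT = "/srv/metrics"  # assumed root directory exposed to developers


def store_metric(machine, metric, tarball, metadata):
    """Copy one metric tarball into place and (re)write the metadata file."""
    metric_dir = os.path.join(ROOT, machine, "metrics", metric)
    os.makedirs(metric_dir, exist_ok=True)

    timestamp = time.strftime("%Y%m%d-%H%M%S")
    shutil.copy(tarball, os.path.join(metric_dir, "%s.tgz" % timestamp))

    # Machine details are already available from the certification website;
    # here they are simply passed in and written next to the data.
    with open(os.path.join(ROOT, machine, "machine-info.json"), "w") as info:
        json.dump(metadata, info, indent=2)


# Example call with made-up values:
# store_metric("dell-mini-9", "bootchart", "/tmp/bootchart.tgz",
#              {"processor": "Atom N270 @ 1.6GHz", "ram_mb": 1024})
}}}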

Phase II (Later)

  • Store metric data in the database as a numeric value
    • Do not yet parse or analyze heuristically
  • Provide simple access for ad-hoc reporting
  • Generate alerts when metrics change by a sufficiently large amount or cross a threshold (a rough sketch of this follows below)
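
To make the Phase II idea concrete, here is a small sketch assuming a SQLite table with one numeric value per result and arbitrarily chosen alert limits; the schema, database path and the 20% change figure are illustrative only, not the certification system's actual design.

{{{#!python
# Sketch of Phase II: one numeric value per result in a database, plus a very
# simple alert check. Schema, database path and limits are assumptions.

import sqlite3

ALERT_THRESHOLD = 60.0   # e.g. a boot time (seconds) considered too slow
ALERT_CHANGE = 0.20      # alert if the value moves by more than 20%

conn = sqlite3.connect("metrics.db")
conn.execute("""CREATE TABLE IF NOT EXISTS metric (
                    machine TEXT, name TEXT, recorded_at TEXT, value REAL)""")


def record(machine, name, recorded_at, value):
    """Store a single numeric result and print any alerts it triggers."""
    previous = conn.execute(
        """SELECT value FROM metric WHERE machine = ? AND name = ?
           ORDER BY recorded_at DESC LIMIT 1""", (machine, name)).fetchone()
    conn.execute("INSERT INTO metric VALUES (?, ?, ?, ?)",
                 (machine, name, recorded_at, value))
    conn.commit()

    if value > ALERT_THRESHOLD:
        print("ALERT: %s on %s is above threshold: %s" % (name, machine, value))
    if previous and previous[0] and \
            abs(value - previous[0]) / previous[0] > ALERT_CHANGE:
        print("ALERT: %s on %s changed sharply: %s -> %s"
              % (name, machine, previous[0], value))


# Ad-hoc reporting stays as plain SQL, for example:
#   SELECT recorded_at, value FROM metric
#   WHERE machine = 'dell-mini-9' AND name = 'boot-time' ORDER BY recorded_at;
}}}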

Unresolved issues

  • Should this data be public or private?
    • Will proprietary data be exposed if the data is public? (E.g. unreleased hardware)
    • We should be able to make this split the same way we do with the existing released data
      • this is done with a flag in the certification database marking systems as private
  • Should benchmarks be pigeonholed into the pass/fail/skip system?
    • should numeric data be an addition to the existing result type or a new one?
      • heno feels that this should be a new type -- our existing results ain't broke so don't fix 'em
      • also, certain data doesn't necessarily have a pass or fail, e.g. boot speed


CategorySpec
