Package license statements are not always accurate, because individual files inside a package may carry licenses that conflict with the package-level statement. The goal is to improve the process for documenting the license(s) actually in effect for a package before it is accepted into main.


Package license statements do not always accurately reflect all of the licenses that must be complied with for that package. See [1].

When a new package is nominated for main, the individual files should be scanned for licenses as part of the license check. Current open source tools can survey the file level efficiently, so the licenses can be recorded accurately in the package before it is accepted into main.
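As a rough illustration of what a file-level scan involves, the following minimal Python sketch reads the header of every file in an unpacked source tree, looks for a few well-known license phrases, and reports files whose license is missing from the package-level statement. The pattern table and the function names (`detect_licenses`, `scan_tree`, `undeclared_licenses`) are assumptions made for this example; real scanners recognise far more licenses and variants, and this is not how any particular tool works.

```python
import os
import re

# Hypothetical, deliberately tiny pattern table (an assumption for this sketch).
LICENSE_PATTERNS = {
    "GPL": re.compile(r"GNU General Public License", re.IGNORECASE),
    "LGPL": re.compile(r"GNU Lesser General Public License", re.IGNORECASE),
    "Apache-2.0": re.compile(r"Apache License,? Version 2\.0", re.IGNORECASE),
    "MIT": re.compile(r"Permission is hereby granted, free of charge", re.IGNORECASE),
}


def detect_licenses(text):
    """Return the set of license names whose phrase occurs in text."""
    return {name for name, pattern in LICENSE_PATTERNS.items() if pattern.search(text)}


def scan_tree(root, header_bytes=4096):
    """Map each file under root (as a relative path) to the licenses in its header."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Only the first few KiB are read; license notices usually sit at the top.
            with open(path, "rb") as handle:
                head = handle.read(header_bytes).decode("utf-8", errors="replace")
            results[os.path.relpath(path, root)] = detect_licenses(head)
    return results


def undeclared_licenses(scan, declared):
    """Return files carrying a license that the package statement does not list."""
    return {path: found - declared for path, found in scan.items() if found - declared}
```

A reviewer could call `undeclared_licenses(scan_tree("unpacked-source"), {"GPL"})` and follow up by hand on every file it returns. Phrase matching of this kind produces false positives and misses, so the result is a list of things to check, not a verdict.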

Use Cases


This specification covers the promotion of a new package into main. Once a new package has been adopted, ongoing scans may be required to prevent the inadvertent inclusion of files under undeclared licenses.


When a new package is nominated to be promoted into main:


This specification proposes a process change to be applied before new packages are accepted into main: file-level scanning tools are used to check the license analysis.


There is increased awareness in the commercial supply chain of the need to ensure that the licenses under which code is released can be complied with [3]. There is a legacy of projects where the intent of the package maintainer does not match what the licenses in the contents say [1]. This step is a low-cost way to keep the problem from getting worse.

Scope and Use Cases

Scope at this time is limited to packages being added to main. We may want to extend it so that updates to packages already in main also go through file-level scrutiny, but the first step is to make sure the licenses of new packages are accurately documented as a condition of addition.

Use Cases

Implementation Plan


Outstanding Issues

BoF agenda and discussion

Taken from whiteboard on blueprint (pre-UDS)

Notes from UDS

It's common in the FOSS world for licenses and files to be mismatched, and for the top-level AUTHORS file or debian/copyright to not be fully accurate.

Proposal to check that debian/copyright is accurate during main inclusion process. Should we also do this before any package even hits the archive? If it's an automated tool, it's no harder to do universe than main. (there are false positives, though) Upstream isn't necessarily involved in MIR process to help with any issues anyway.

Do we know how accurate the archive is right now? Not at all. We're not even sure how bad it is.

Tools out there now are fossology, ninka(?), license-check.

Fossology is already set up for Debian... somewhere.

Sometimes the source package license is actually wrong and debian/copyright is right, because we've checked with upstream. Patching the source to correct it is a maintenance headache. DEP-5 needs a way to specify this, to avoid false positives. SPDX has a way around this.
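To make the false-positive problem concrete, here is a minimal Python sketch of a comparison step that accepts a list of files whose declared license has been confirmed with upstream. The function `find_mismatches` and its `confirmed` parameter are hypothetical names invented for this example; neither DEP-5 nor SPDX is assumed to express the override in this way.

```python
def find_mismatches(declared, detected, confirmed=frozenset()):
    """Return {path: (declared, detected)} for files whose licenses disagree.

    declared:  path -> license taken from debian/copyright
    detected:  path -> license reported by a file-level scanner
    confirmed: paths where the declared license was verified with upstream
               (a hypothetical override list), so a disagreement is not reported
    """
    mismatches = {}
    for path, found in detected.items():
        stated = declared.get(path)
        if found != stated and path not in confirmed:
            mismatches[path] = (stated, found)
    return mismatches


declared = {"src/a.c": "GPL-2+", "src/b.c": "MIT"}
detected = {"src/a.c": "GPL-2+", "src/b.c": "BSD-3-Clause"}

# Without an override, src/b.c is flagged for review.
print(find_mismatches(declared, detected))

# Once the maintainer records that upstream confirmed MIT, the scan stays quiet.
print(find_mismatches(declared, detected, confirmed={"src/b.c"}))
```

Keeping the override outside the source tree means the upstream files stay unpatched, which is the maintenance concern raised above.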

Merging SPDX and DEP-5? DEP-5 needs to be human readable, but SPDX is going to XML. Fedora and Linux Foundation seem to be moving towards SPDX. As long as both are semantically compatible and machine-readable, we're fine.

SPDX isn't finished yet. RC before end of year.

Kyle has written a tool called get-licenses that will iterate through copyright files and produce a spreadsheet about which licenses it found. Parses DEP-5 files, calls license-check, will ping fossology server. Looks at installed packages, not just a source package sitting there. Still useful for flavors and such that have such publication requirements. Kyle plans to publish as a Launchpad project.
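The general idea of turning copyright files into a spreadsheet can be sketched in a few lines of Python. This is not the get-licenses tool and makes no claim about how it works: the functions `parse_dep5`, `license_rows` and `to_csv` are hypothetical, the parser handles only the basic stanza layout of a DEP-5 style file, and the licensecheck and fossology steps mentioned above are left out.

```python
import csv
import io


def parse_dep5(text):
    """Split a DEP-5 style copyright file into stanzas of field -> value."""
    stanzas, current, field = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current stanza.
            if current:
                stanzas.append(current)
            current, field = {}, None
        elif line[0] in " \t":
            # An indented line continues the previous field.
            if field is not None:
                current[field] += "\n" + line.strip()
        elif ":" in line:
            field, _, value = line.partition(":")
            field = field.strip()
            current[field] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas


def license_rows(package, text):
    """Return (package, files pattern, license short name) for each Files stanza."""
    rows = []
    for stanza in parse_dep5(text):
        if "Files" in stanza and "License" in stanza:
            # The first line of the License field is the short name.
            short_name = stanza["License"].split("\n", 1)[0]
            rows.append((package, " ".join(stanza["Files"].split()), short_name))
    return rows


def to_csv(rows):
    """Render the rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["package", "files", "license"])
    writer.writerows(rows)
    return out.getvalue()
```

On an installed system the input text would typically come from `/usr/share/doc/<package>/copyright`. Copyright files that are not in a machine-readable layout yield no rows with this sketch and would need a scanner or manual review instead.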

Action Plan:

Action items for next six months:



LicenseReviewProcessImprovementSpec (last edited 2010-10-26 13:56:51 by host194)