Encoding binary packages for linux distros #202

Open
oliverchang opened this issue Sep 15, 2023 · 7 comments

@oliverchang
Contributor

A common paradigm for Linux distros is a distinction between source and binary packages.

Currently, all of our Linux distro ecosystems refer to source packages, but this may not be the most convenient for all consumers.

Some possibilities:

Extend "package" field

One possibility is to extend the existing "package" field so that it also lists the binary packages built from the source package.

e.g.

"package": {
  "ecosystem": "<Linux distro>",
  "name": "<src package>",
  "binary_packages": [  // may need a better name to generalize this more
    "binary_package_name",
    "binary_package_name2"
  ]
}
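
To make this option concrete, here is a minimal Python sketch of how a consumer might check an installed binary package against such an entry. The "binary_packages" field is only the name proposed above and is not part of the OSV schema today; the function name and the sample data are hypothetical.

```python
def affects_binary(package: dict, installed_binary: str) -> bool:
    """Return True if the installed binary package is listed for this entry.

    Assumes the proposed (not standardized) "binary_packages" field.
    Entries without the field simply match nothing.
    """
    return installed_binary in package.get("binary_packages", [])


# Hypothetical entry: one source package and two binaries built from it.
package = {
    "ecosystem": "Debian",
    "name": "openssl",
    "binary_packages": ["libssl3", "openssl"],
}

print(affects_binary(package, "libssl3"))  # True
print(affects_binary(package, "curl"))     # False
```

Consumers that only know about source packages could keep matching on "name" and ignore the extra field.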

Introduce convention to ecosystem definitions

Alternatively, we can introduce a convention to package names for Linux ecosystems.

e.g.

"package": {
  "ecosystem": "<Linux distro>",
  "name": "src:<src package>"
}

"package": {
  "ecosystem": "<Linux distro>",
  "name": "binary:<src package>"
}
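
For the naming convention, a consumer would need to split the prefix off the name. The following Python sketch shows one way that could look. The "src:" and "binary:" prefixes are the ones proposed above; the type name, the function name, and the fallback for unprefixed names are assumptions made for illustration.

```python
from typing import NamedTuple


class DistroPackageName(NamedTuple):
    kind: str  # "src" or "binary"
    name: str


def parse_package_name(raw: str, default_kind: str = "src") -> DistroPackageName:
    """Split a prefixed name such as "src:openssl" into (kind, name).

    Names without a recognized prefix fall back to default_kind. This is an
    assumption so that existing source-package entries keep working.
    """
    prefix, sep, rest = raw.partition(":")
    if sep and prefix in ("src", "binary"):
        return DistroPackageName(prefix, rest)
    return DistroPackageName(default_kind, raw)


print(parse_package_name("src:openssl"))
print(parse_package_name("binary:libssl3"))
print(parse_package_name("openssl"))
```

Only the two known prefixes are treated specially, so a name that happens to contain a colon for another reason is returned unchanged.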

@rhalar

rhalar commented Sep 15, 2023

This would be a good addition, I believe.

The first option seems cleaner at first (source artifact as a name perhaps?) but note that currently purls are defined per package; anyone wanting to store purls per binary would have to construct them themselves. Not that that is too difficult to do though.
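
As a rough illustration of constructing purls per binary, here is a simplified Python sketch. It assumes the deb purl type with an arch qualifier, in the style of the examples in the purl specification, and it is not a complete purl implementation; the function name and the sample values are hypothetical.

```python
from urllib.parse import quote


def binary_purls(distro: str, version: str, binaries: list[str], arch: str) -> list[str]:
    """Build one deb-style purl string per binary package name.

    Simplified: every component is percent-encoded with urllib, which is
    stricter than the purl specification requires.
    """
    return [
        "pkg:deb/{}/{}@{}?arch={}".format(
            quote(distro, safe=""),
            quote(name, safe=""),
            quote(version, safe=""),
            quote(arch, safe=""),
        )
        for name in binaries
    ]


for purl in binary_purls("debian", "3.0.11-1", ["libssl3", "openssl"], "amd64"):
    print(purl)
# pkg:deb/debian/libssl3@3.0.11-1?arch=amd64
# pkg:deb/debian/openssl@3.0.11-1?arch=amd64
```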

Another similar thing we encountered is packages that are distributed in more than one format, e.g. PyPI has the 'wheel' and 'sdist' formats, and a given package may have one or both. They normally correspond but don't necessarily contain the same files. This is perhaps less commonly relevant for vulnerabilities but has some impact on malicious packages in rare cases.
May or may not be relevant, just putting it out there.

@kurtseifried
Contributor

Some more examples are compile options, e.g.:

https://nvd.nist.gov/vuln/search/results?form_type=Basic&results_type=overview&query=compile+option&search_type=all&isCpeNameSearch=false

E.g.

The go command may execute arbitrary code at build time when using cgo. This may occur when running "go get" on a malicious module, or when running any other command which builds untrusted code. This can be triggered by linker flags, specified via a "#cgo LDFLAGS" directive. The arguments for a number of flags which are non-optional are incorrectly considered optional, allowing disallowed flags to be smuggled through the LDFLAGS sanitization. This affects usage of both the gc and gccgo compilers.

So there's also the inverse case where the source code is vulnerable, but the binary isn't.

@pombredanne

@oliverchang This is something handled as qualifiers in PURLs.
Note that when there is a difference between the binary and source package, there are typically many binaries derived from a single source. Each of these binaries is a different package and may be subject to different vulnerabilities.
Tracking the relationships between multiple packages (e.g. a source and its binaries) is IMHO best done separately and outside of the advisory itself, as this is a package attribute, not an advisory attribute. So I would advise against trying to combine the source package and binary packages in the same advisory. At scale they can be different advisories, because the packages are related but effectively different things.

@msmeissn
Contributor

Hmm, I thought "name" could already mean binary packages, I had planned my SUSE work like this.
But I can do other stuff... I added purls too.

@oliverchang
Contributor Author

> Hmm, I thought "name" could already mean binary packages, I had planned my SUSE work like this. But I can do other stuff... I added purls too.

The definition would be part of the OSV schema spec, so in theory you could define it as binary packages for the SUSE ecosystem.

That said, all our other current Linux distros decided to go with source packages. For example, Ubuntu does this, and also encodes binary packages in an ecosystem_specific field: https://github.com/canonical/ubuntu-security-notices/blob/0e4bc518ea69af64ff64f8ec171cbceced02e061/osv/USN-2169-1.json#L33

@oliverchang
Contributor Author

Circling back to this one. It appears to me that many distro vulnerability DBs today (e.g. Debian, Ubuntu, Rocky Linux) key vulns on source packages, and do not triage whether the individual binary packages are impacted or not.

Examples:

In both cases it looks like the full list of binary packages is just expanded from the source package. Are there any instances today of distro vuln DBs not doing this?

Of course, this doesn't mean we shouldn't support encoding this -- just that there's perhaps no urgency to adding this if nobody is going to use this today.

@msmeissn
Contributor

msmeissn commented Mar 4, 2024

I have now changed our OSV output to look similar to Ubuntu's: it uses the source package as the primary affected index and includes the binaries in the ecosystem_specific section.

For me the relevance is that automated checkers on Linux would look for binary packages mostly, not for source packages.
People looking for security information would look for sources.
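
An automated checker working from binary package names could build a reverse index over such records. Below is a minimal Python sketch of that idea. The "id", "affected" and "ecosystem_specific" fields are part of OSV, but the flat "binaries" list of names is an assumed layout for illustration; real feeds may structure this section differently. The function name and the sample advisory are hypothetical.

```python
from collections import defaultdict


def index_by_binary(advisories: list[dict]) -> dict[str, set[str]]:
    """Map each binary package name to the IDs of advisories that list it."""
    index = defaultdict(set)
    for advisory in advisories:
        for affected in advisory.get("affected", []):
            specific = affected.get("ecosystem_specific", {})
            # Assumed layout: a flat list of binary package names.
            for binary in specific.get("binaries", []):
                index[binary].add(advisory["id"])
    return dict(index)


# Hypothetical advisory keyed on a source package, with binaries attached.
advisories = [
    {
        "id": "EXAMPLE-2024-0001",
        "affected": [
            {
                "package": {"ecosystem": "ExampleDistro", "name": "openssl"},
                "ecosystem_specific": {"binaries": ["libssl3", "openssl"]},
            }
        ],
    }
]

index = index_by_binary(advisories)
print(index.get("libssl3"))  # {'EXAMPLE-2024-0001'}
print(index.get("curl"))     # None
```

The advisory stays keyed on the source package for people reading it, while the index gives checkers a lookup by installed binary name.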
