... interoperability between different SBOM standards, handling missing
information, imprecise definitions of SBOM elements, multiple formats for SBOM
elements (e.g., component name, version), and difficulties with ingesting/parsing data in
producing SBOM elements.
As an example, Section 4.3.3 Version observes: "There are many variations in how product versions are named, identified, and cited", including Numbers, Dates, Code Names, Version Indicators, and Git hashes.
The NTIA framing document suggests formats and sources for obtaining content, observes that a common approach is to create a set of canonical names/representations, but with respect to version says:
As there is a wide range of versioning schemes in use, recording what is provided from the supplier accurately is the primary goal. Semantic versioning is preferred. Git hashes are also acceptable.
As a minimum expectation, declare the version string as provided by the supplier.
An information model cannot do much about bad, missing, or inconsistent input data, but it can attempt to classify the data that it does find, flag data that cannot be classified, and canonicalize what it can. For example, a Version type could be defined as a Choice among known formats: a value matching the SemVer option would be classified as SemVer and broken out into major, minor, and patch components. And in response to the common practice of giving up and declaring an SBOM Version to be a "String", such a model would flag examples like "four score and seven years ago, our fathers ...", which is a valid string but not any recognizable Version format.
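As a rough illustration of the classify/flag/canonicalize idea, here is a minimal Python sketch. It is not a proposed OSIM model: the `Version` class, the `classify_version` function, the set of formats, and the order in which they are tried are all assumptions made for the example, and the SemVer pattern is a simplified one.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Hypothetical patterns for a few of the formats mentioned above.
# The SemVer pattern is simplified (core triple plus optional
# pre-release and build metadata).
SEMVER = re.compile(
    r"(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?"
)
DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")   # shape only, calendar not validated
NUMBER = re.compile(r"[0-9]+(?:\.[0-9]+)?")        # e.g. "7" or "6.5"
GIT_HASH = re.compile(r"[0-9a-f]{7,40}")           # abbreviated or full SHA-1


@dataclass(frozen=True)
class Version:
    kind: str                      # "semver", "date", "number", "git_hash", "unclassified"
    raw: str                       # the string exactly as the supplier provided it
    parts: Tuple[int, ...] = ()    # (major, minor, patch) when kind == "semver"


def classify_version(raw: str) -> Version:
    """Classify a supplier-provided version string, or flag it as unclassified."""
    text = raw.strip()
    m = SEMVER.fullmatch(text)
    if m:
        major, minor, patch = (int(g) for g in m.groups())
        return Version("semver", raw, (major, minor, patch))
    if DATE.fullmatch(text):
        return Version("date", raw)
    # NUMBER is tried before GIT_HASH so an all-digit string is not taken for a hash.
    if NUMBER.fullmatch(text):
        return Version("number", raw)
    if GIT_HASH.fullmatch(text):
        return Version("git_hash", raw)
    # Still a valid string, but not a recognized Version format: flag it.
    return Version("unclassified", raw)


for s in ["1.2.3", "2024-01-15", "6.5", "9fceb02",
          "four score and seven years ago, our fathers ..."]:
    print(classify_version(s).kind, "<-", s)
```

The raw string is always kept, which matches the NTIA minimum expectation of declaring the version as provided by the supplier; classification only adds information on top of it.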
Questions:
Should the OSIM TC attempt to tackle the SBOM Data Normalization Challenge?
If so, who will do the work, and who are the stakeholders?
I'm not sure I'd put it at the top of the list, but I do think we should address it. I do think we should 80/20 it and not worry about the edge cases that would take us down too many rabbit holes. I don't think we need a global versioning system - we just need to extract the version information to make it useful within the ecosystem it works in. The objective is unambiguous identification of a piece of software - including version. For at least 80% of the most-needed apps, a given piece of software uses whatever its ecosystem uses and it only needs to be internally consistent. So linux can use linux_6.5, linux_6.6; and Phoenix LiveView can use SemVer, and it doesn't matter that they are using something different. So I think your list of common formats is a good one, with a generic text string as a catch-all. And we probably should make it extensible. We don't even need to figure it out automagically - it could be 'hardcoded' for each ecosystem.
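To make the "hardcoded per ecosystem, text as a catch-all, extensible" suggestion concrete, here is one way it could look in Python. This is only a sketch: the registry `SCHEMES`, the functions `register_ecosystem` and `identify`, the two parsers, and the ecosystem keys are all hypothetical names chosen for illustration.

```python
import re
from typing import Callable, Dict, Optional, Tuple

Parts = Tuple[int, ...]
Parser = Callable[[str], Optional[Parts]]


def parse_dotted(version: str) -> Optional[Parts]:
    """Dotted numeric versions such as '6.5'; returns None if the shape is wrong."""
    if re.fullmatch(r"[0-9]+(?:\.[0-9]+)*", version) is None:
        return None
    return tuple(int(p) for p in version.split("."))


def parse_semver_core(version: str) -> Optional[Parts]:
    """MAJOR.MINOR.PATCH, ignoring any pre-release or build suffix (simplified)."""
    m = re.fullmatch(r"([0-9]+)\.([0-9]+)\.([0-9]+)(?:[-+].*)?", version)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


# Hardcoded per-ecosystem choice of scheme. The keys are illustrative only.
SCHEMES: Dict[str, Tuple[str, Parser]] = {
    "linux": ("dotted", parse_dotted),
    "phoenix_liveview": ("semver", parse_semver_core),
}


def register_ecosystem(ecosystem: str, scheme: str, parser: Parser) -> None:
    """Extension point: add or override the scheme used by an ecosystem."""
    SCHEMES[ecosystem] = (scheme, parser)


def identify(ecosystem: str, name: str, version: str) -> dict:
    """Identify a component within its own ecosystem.

    Falls back to the generic 'text' scheme when the ecosystem is unknown
    or the version does not fit that ecosystem's scheme.
    """
    entry = SCHEMES.get(ecosystem)
    parts = entry[1](version) if entry else None
    scheme = entry[0] if parts is not None else "text"
    return {"ecosystem": ecosystem, "name": name, "version": version,
            "scheme": scheme, "parts": parts}


print(identify("linux", "linux", "6.5"))
print(identify("phoenix_liveview", "phoenix_live_view", "0.20.1"))
print(identify("unknown", "widget", "Sunday build"))
```

Versions are only compared within one ecosystem, so nothing here requires the schemes to agree with each other; the original version string is always carried along unchanged.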
The MITRE white paper "Data Normalization Challenges and Mitigations in SBOM Processing" highlights technical challenges in automating the production of SBOMs.