Using UUIDv5 for SDOs and SROs #293
Relationships have a strong use case for UUIDv5’s when recording DNS resolutions. While you can store resolution information for domain-names in the Unfortunately this does not effectively convey the information since the Instead using an external relationship makes this far easier. A relationship has an explicit Using a UUIDv5 based relationship with created set equal to the “If This technique allows any vendor that can map things like domain resolution or certificate hosting history over time already to quickly provide a STIX output that is easy to ingest for existing systems. Our current proposal suggests the following properties to be used to generate a UUIDv5 for relationships: If it lapses and DNS had a hole in it then a new ID would be generated for the new start time with no relationships showing resolution in that time window |
I find it odd that SCOs use UUIDv5 but the Observed Data SDO doesn't. Observed Data SDO is effectively a container for SCOs, logically equivalent to a log event. Having a unique ID here helps a ton in data deduplication. |
@pcoccoli - I'm not sure I understand your question. Yes - the observed data object is essentially a log event. Wouldn't each log event have a different first observed/last observed time? If you are seeing the same event over and over, you could create a new version of that observed data object with an updated last observed time, and keep track of the number of times you saw the event in the number_observed property. |
IMHO, this sentence alone raises problems regarding the STIX principles for versioning, but if your point was strictly about what you said, I would agree: "A simpler solution would be to use a UUIDv5, based on the CVE id. All producers could determine what the appropriate vulnerability id is without having to store the object or obtain it from the common object repository, and just use the id for references to the CVE." However, this also means that people could only use the OASIS STIX namespace to determine the STIX ID of the Vulnerability object, but NEVER to generate new Vulnerability objects with that ID. I believe this points to a need I've been spotting: a library on top of stix2 that deals with these use cases. |
@rpiazza what about talking to the cve.org folks about also providing a STIX version in their repo?
Last year they launched the JSON 5 format; this year, with MITRE's help, they could launch a version in STIX format. This approach ensures that the source of vulnerability management is also the producer of the STIX objects and keeps them updated following STIX principles. |
@pcoccoli Yes, totally understand your point. The only reason an Observed Data, as it is today, cannot be an SCO with a UUIDv5 is because of its field One possible approach to convert Observed Data to an object with no versions (SCO UUIDv5) would be something like:
Again, this would force producers to generate a lot of observed-data objects. So I believe the TC chose the current approach to avoid a large number of objects, even though it does not allow deduplication as you would understandably expect. |
@SYNchroACK Observed Data is a deprecated object. It is an artifact from when we first started building STIX 2. It represents a graph inside of a graph. The reason we went that way is that we did not want every IP address to have a unique ID. It was not until we better understood how to use UUIDv5 that we looked at making that change. To address your other comment, SCOs are "facts", or empirical data that does not change and is not open to debate, confidence, or other bits of data. You connect SCOs to intelligence, and that intelligence can change or be added to. This is why there are UUIDv4 IDs for SDOs and UUIDv5 IDs for SCOs. |
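For readers less familiar with the deterministic SCO IDs mentioned above, here is a minimal sketch of the idea using only the Python standard library. It assumes the SCO namespace value published in STIX 2.1 and that `value` is the only ID-contributing property of `ipv4-addr`. The helper name `sco_id` is made up for illustration, and the `json.dumps` call only approximates the canonical JSON serialization the spec calls for (it should be adequate for flat ASCII string values, not in general).

```python
import json
import uuid

# Namespace assumed to be the one STIX 2.1 defines for deterministic SCO IDs.
STIX_SCO_NAMESPACE = uuid.UUID("00abedb4-aa42-466c-9c01-fed23315a9b7")


def sco_id(sco_type: str, contributing: dict) -> str:
    """Hypothetical helper: build a UUIDv5-based ID from ID-contributing properties."""
    # Approximation of canonical JSON: sorted keys, no insignificant whitespace.
    name = json.dumps(
        contributing, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return f"{sco_type}--{uuid.uuid5(STIX_SCO_NAMESPACE, name)}"


# Two producers describing the same IP address arrive at the same ID.
first = sco_id("ipv4-addr", {"value": "198.51.100.3"})
second = sco_id("ipv4-addr", {"value": "198.51.100.3"})
other = sco_id("ipv4-addr", {"value": "198.51.100.4"})

print(first == second)  # same input, same ID
print(first == other)   # different value, different ID
```

Because the ID depends only on the type, the namespace, and the contributing properties, no lookup or shared storage is needed to deduplicate.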
@jordan2175 You mean this Observed Data is deprecated? |
I think there might be some confusion here. The usage of the In this context the usage of deterministic IDs for both Observed Data and Sightings (as a type of relationship) would likely be extremely useful to prevent data duplication. |
Yup, exactly!
Well, in fact, even the Relationship object should have a deterministic ID; however, with the current core structure of the objects, that cannot be achieved. To meet that goal (which I totally agree with), a core restructure is needed, splitting objects into the following types:
Particles
An object with or without deterministic IDs that represents a set of properties like the following, and that must always have an embedded reference to an Atom object:
Notes
A particle ID may be UUIDv4 or UUIDv5 depending on the scenario:
In practice, a particle can have a deterministic ID if the producer will never have to update it; otherwise, the versioning mechanism needs to be in place (like in STIX 2.1), which then makes the case for using UUIDv4.
Atoms
An object with deterministic IDs that represents a base STIX element like:
Notes
On objects that represent threats like
Molecules
An object with deterministic IDs that represents a set of Atom objects like:
Compounds
An object without deterministic IDs that represents a special set of Atom objects like:
I have a draft of a proposal for a possible STIX 3.0; in case you find it interesting, ping me. ;) |
The specification was written to encourage use of UUIDv5 for SCOs to avoid duplication of objects that represent the same thing - e.g., an IP address. There is an algorithm in the spec that should be used to generate the UUIDv5 ids, based on specified properties for each SCO and an explicitly defined namespace. Other algorithms may be used, as described in this text from section 3.4:
Using UUIDv5 ids for SDO/SROs is not explicitly discussed in the spec, but is not explicitly prohibited either. The following text from section 2.9 can imply that UUIDv5 ids can be used for them:
There is at least one use case for using UUIDv5 ids for SDOs - representing CVEs using the Vulnerability SDO.
It was recognized that having many duplicate Vulnerability objects to represent a particular CVE is not ideal. For this reason, the common STIX object repository includes a "canonical" Vulnerability object for each CVE, and the repository is updated nightly to include the CVEs created that day.
However, because of the large number of CVEs (over 100,000), this does not seem to be an ideal solution. A simpler solution would be to use a UUIDv5 based on the CVE id. All producers could determine what the appropriate vulnerability id is without having to store the object or obtain it from the common object repository, and just use the id for references to the CVE.
Based on the text from section 2.9, this is already possible to do, but the explicit namespace CAN NOT BE USED. This implies that producers would pick a namespace, which would most likely differ from other producers, defeating the whole purpose of the use of UUIDv5s. Of course, this namespace could be published so it is known to the community - but that seems problematic.
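To make the namespace concern concrete, here is a rough sketch, not a proposed algorithm. Both namespaces and the `vulnerability_id` helper are hypothetical, and hashing the upper-cased CVE id string directly is an assumption made only for this illustration; the issue does not define which properties or serialization would be used.

```python
import uuid

# HYPOTHETICAL namespaces: each producer derives its own from a made-up URL.
PRODUCER_A_NS = uuid.uuid5(uuid.NAMESPACE_URL, "https://producer-a.example/stix")
PRODUCER_B_NS = uuid.uuid5(uuid.NAMESPACE_URL, "https://producer-b.example/stix")


def vulnerability_id(namespace: uuid.UUID, cve_id: str) -> str:
    """Hypothetical helper: derive a Vulnerability SDO id from a CVE id."""
    # Assumption: normalize case and whitespace so "cve-2021-44228" and
    # "CVE-2021-44228" map to the same name before hashing.
    normalized = cve_id.strip().upper()
    return f"vulnerability--{uuid.uuid5(namespace, normalized)}"


a1 = vulnerability_id(PRODUCER_A_NS, "CVE-2021-44228")
a2 = vulnerability_id(PRODUCER_A_NS, "cve-2021-44228 ")
b1 = vulnerability_id(PRODUCER_B_NS, "CVE-2021-44228")

print(a1 == a2)  # same namespace: the id is reproducible without any lookup
print(a1 == b1)  # different namespaces: the ids diverge for the same CVE
```

Within one namespace every producer computes the same id independently; across privately chosen namespaces the ids no longer match, which is the situation an explicitly sanctioned shared namespace would avoid.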
The proposal suggested in this issue is to explicitly allow the use of UUIDv5 for certain SDO/SROs.