JSON Output does not match with OSCAL JSON Schema #15

Open
NotReallyJustin opened this issue Jun 26, 2024 · 4 comments

@NotReallyJustin

NotReallyJustin commented Jun 26, 2024

Hello there! First off, I wanted to say I came across this tool for writing OSCAL and it's been amazing so far. There are just a few issues I ran into while trying to validate OSCAL Pydantic output against the expected JSON Schema. So far, I've only worked with the catalog model, but the issues should apply to all models, such as SSPs and Profiles.

  • .json() in Pydantic fills in undefined JSON keys with null, which breaks the OSCAL schema (proper OSCAL leaves the key-value pair out of the final JSON altogether). I've attached a quick CLI tool to fix this in the comments below, but there's probably a cleaner way to do it in OSCAL Pydantic.
  • In the JSON keys, _ should be replaced with -. The exception is "class_", which should become "class".
  • Small issue, but Pydantic's datetime cannot handle 0 microseconds (it'll leave off the microseconds altogether), which breaks ISO formatting. For instance, 2003-04-12T00:00:00.000000-05:00 becomes 2003-04-12T00:00:00-05:00. I would recommend either adding a custom JSON encoder as a configuration (as shown below in DateTimeISOFix.py), or changing the datatype from datetime to str in /src/oscal_pydantic/*.py for classes such as PublicationTimestamp (which might be better, since I saw some RegEx being commented out too).
  • For all models, the output JSON should be wrapped in another layer of JSON. That sounds a bit confusing, so I'll give an example below:

This is what OSCAL Pydantic outputs:

{
    "uuid": "42a21239-b76b-43ec-b9e7-250553511544",
    "metadata": {
        "title": "Galactic Security Controls",
        "published": "2004-04-04T00:00:00.000000-05:00",
        "last-modified": "2024-06-26T14:58:18.540733-05:00",
        "version": "1.2",
        "oscal-version": "1.0.2",....

This is what's valid according to the schema:

{
    "catalog": {
        "uuid": "42a21239-b76b-43ec-b9e7-250553511544",
        "metadata": {
            "title": "Galactic Security Controls",
            "published": "2004-04-04T00:00:00.000000-05:00",
            "last-modified": "2024-06-26T14:58:18.540733-05:00",
            "version": "1.2",
            "oscal-version": "1.0.2" ...

Notice how the JSON is wrapped in another key-value pair {"catalog":...}.
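For anyone reading along, here is a rough sketch of how the four points above could be handled as a post-processing step on a plain dict (for example, the dict form of a model). It uses only the Python standard library and is not how OSCAL Pydantic itself does it. The names oscal_key, clean, and to_oscal_json are made up for this example, and the sample data (including the "groups" entry) is only illustrative.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any


def oscal_key(key: str) -> str:
    """Map a Python-style field name to an OSCAL-style JSON key (hypothetical helper)."""
    # "class_" -> "class", "oscal_version" -> "oscal-version"
    return key.rstrip("_").replace("_", "-")


def clean(value: Any) -> Any:
    """Recursively drop None values, rename keys, and format datetimes."""
    if isinstance(value, dict):
        return {oscal_key(k): clean(v) for k, v in value.items() if v is not None}
    if isinstance(value, (list, tuple)):
        return [clean(v) for v in value if v is not None]
    if isinstance(value, datetime):
        # timespec="microseconds" keeps ".000000" even when microseconds are 0
        return value.isoformat(timespec="microseconds")
    return value


def to_oscal_json(model_dict: dict, root: str) -> str:
    """Wrap the cleaned dict under a root key such as "catalog" and serialize it."""
    return json.dumps({root: clean(model_dict)}, indent=4)


# Illustrative input only; field names mimic Python attribute names.
raw = {
    "uuid": "42a21239-b76b-43ec-b9e7-250553511544",
    "metadata": {
        "title": "Galactic Security Controls",
        "published": datetime(2004, 4, 4, tzinfo=timezone(timedelta(hours=-5))),
        "oscal_version": "1.0.2",
        "remarks": None,
    },
    "groups": [{"id": "g1", "class_": "family", "props": None}],
}

print(to_oscal_json(raw, "catalog"))
```

One caveat with this approach: it renames every dict key it sees, so it assumes no key is meant to keep an underscore as-is.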

Take this with a grain of salt, since I'm writing this OSCAL Pydantic code without documentation, relying on my existing knowledge of the Pydantic library itself and on Pylance. If there is already a function that resolves these issues, or any documentation out there in general, I would love to know about it. It's odd: I found this tool from a NIST presentation, but I couldn't find anything else online about OSCAL Pydantic.

@NotReallyJustin
Author

NotReallyJustin commented Jun 26, 2024

GitHub would not allow me to attach .js and .py files, so here are Pastebin links for them:

Let me know if there are any other files I could send on my end to help you.

@RS-Credentive
Owner

Hey @NotReallyJustin, I appreciate the feedback!

Couple of questions:

  1. Are you using v1 or v2 of the library?
  2. For v2, a call to ".model_dump_json()" should work, since I have overridden that method in the BaseModel to set "exclude_none = True", as well as defining a generic serializing function that automatically replaces "_" with "-" and fixes the "class" issue. If this does not work, please let me know as it is definitely a bug in that case.

Documentation of the library is definitely an issue, but I discovered that the approach taken to build the oscal-pydantic library is inherently limited (for reasons I could discuss at length if you are interested, but can spare you if you're not). I am working now on the "metaschema-python" library here: (https://github.com/Credentive-Sec/metaschema-python). Check the dev branch to see what we're working on.

I still use oscal-pydantic v2 for my projects while the metaschema work is being completed, but plan to retire this library as soon as the other one is working.

@RS-Credentive
Owner

I am going on vacation for the next week so won't be as responsive to issues, but please feel free to send me any further questions and I will reply when I can.

@NotReallyJustin
Author

Hello!
That sounds awesome; if Metaschema Python is ready to use, I'll definitely check that out!

  • I'm using v1 of the library since I saw that v2 is still in development (and that v1 was still being updated as of 3 months ago). I could check out v2, but I don't have any documentation to go off of and Pylance is not giving me autocomplete for those so it might take some time

Do you maybe have a list of functions/classes for OSCAL Pydantic v2 somewhere I could refer to? I remember seeing a few slides on the NIST website about how OSCAL Pydantic v2 improves on v1, so it'll be cool to fiddle with that a bit
