Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds SARIF traces support to SAPP #93

Closed
wants to merge 2 commits into from

Conversation

the-storm
Copy link
Contributor

Pre-submission checklist

  • [✅] I've ran the following linters locally and fixed lint errors related to the files I modified in this PR
    • [✅] black .
    • [✅] usort format .
    • [✅] flake8
  • [✅] I've installed dev dependencies pip install -r requirements-dev.txt and completed the following:
    • [✅] I've ran tests with ./scripts/run-tests.sh and made sure all tests are passing

Summary

This PR adds a sarif output to SAPP traces. Currently SARIF output from SAPP doesn't output the trace of an issue. It just outputs the root node making it triage almost impossible.

Example of SARIF output on exercise_3 from Pysa tutorial is

{
          "ruleId": "5001",
          "level": "warning",
          "message": {
            "text": "Data from [UserControlled] source(s) may reach [RemoteCodeExecution] sink(s)"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "views.py"
                },
                "region": {
                  "startLine": 24,
                  "startColumn": 19,
                  "endColumn": 36
                }
              }
            }
          ]
        }

Notice locations section only shows the root of where the issue is happening but not a full trace
Below is the SARIF output of the same issue after applying this PR

{
          "ruleId": "5001",
          "level": "warning",
          "message": {
            "text": "Data from [UserControlled] source(s) may reach [RemoteCodeExecution] sink(s)"
          },
          "locations": [
            {
              "physicalLocation": {
                "artifactLocation": {
                  "uri": "views.py"
                },
                "region": {
                  "startLine": 24,
                  "startColumn": 19,
                  "endColumn": 36
                }
              }
            }
          ],
          "codeFlows": [
            {
              "threadFlows": [
                {
                  "locations": [
                    {
                      "location": {
                        "physicalLocation": {
                          "artifactLocation": {
                            "uri": "views.py",
                            "uriBaseId": "%SRCROOT%"
                          },
                          "region": {
                            "startLine": 14,
                            "startColumn": 16,
                            "endColumn": 28
                          }
                        },
                        "message": {
                          "text": "flow from views.get_operator_safe(...result...) -[into]-> Obj{django.http.request.HttpRequest.POST}(...source...)  features: ['has:first-index', 'first-index:operator']"
                        }
                      },
                      "nestingLevel": 0
                    },
                    {
                      "location": {
                        "physicalLocation": {
                          "artifactLocation": {
                            "uri": "views.py",
                            "uriBaseId": "%SRCROOT%"
                          },
                          "region": {
                            "startLine": 22,
                            "startColumn": 16,
                            "endColumn": 42
                          }
                        },
                        "message": {
                          "text": "flow from views.operate_on_twos(...root...) -[into]-> views.get_operator_safe(...result...) via 1 propagators features: ['always-via:obscure:model', 'always-via:format-string', 'always-via:tito']"
                        }
                      },
                      "nestingLevel": 1
                    },
                    {
                      "location": {
                        "physicalLocation": {
                          "artifactLocation": {
                            "uri": "views.py",
                            "uriBaseId": "%SRCROOT%"
                          },
                          "region": {
                            "startLine": 24,
                            "startColumn": 19,
                            "endColumn": 36
                          }
                        },
                        "message": {
                          "text": "flow from views.operate_on_twos(...root...) -[into]-> eval(...sink...)"
                        }
                      },
                      "nestingLevel": 2
                    }
                  ]
                }
              ]
            }
          ]
        }

Notice the new object in SARIF output codeFlows
The PR is heavily inspired by _generate_trace_from_issue from the interactive output for SAPP

Test Plan

Analyzed exercise 3 with Pysa https://github.com/facebook/pyre-check/blob/main/documentation/pysa_tutorial/exercise3/ and attached the SARIF before and after
SARIF before this PR: exercise3_old.sarif.txt
SARIF after this PR: exercise3.sarif.txt
Screenshot before this PR Screenshot 2023-07-14 at 12 33 46

Screenshot after this PR Screenshot 2023-07-14 at 12 33 25

PS: I noticed there are no unit tests for SARIF output. I am happy to follow-up later with A PR that add some unit tests for SARIF export.
PS: I have also run the pre-submission checklist but I have noticed that the tests are already broken without any changes from my side.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 14, 2023
Copy link
Contributor

@arthaud arthaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work!

sapp/sarif.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Outdated
},
"nestingLevel": nesting_level,
}
return location
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move all the logic regarding sarif into sarif.py?

Copy link
Contributor Author

@the-storm the-storm Jul 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part exactly you'd like to move to sarif.py. The to_sarif function body just find the traces and calls _sarif_codeflow_location_from_trace_tuple. I don't want to move the latter to sarif.py because it is specific to traces. It is a private function that takes a trace_frame, tool and return the sarif codeflows object.

Copy link
Contributor

@arthaud arthaud Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well IMO anything related to sarif should go into sarif.py. If we start putting function from every format into trace.py it will get overwhelming soon.
trace.py should provide a clean API that allows any formatter to get the data and do it's formatting.
From what I see, everything we use in the sarif functions is public? Tell me if I'm wrong.

sapp/sarif.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Show resolved Hide resolved
sapp/ui/trace.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Outdated Show resolved Hide resolved
sapp/ui/trace.py Show resolved Hide resolved
sapp/ui/trace.py Outdated Show resolved Hide resolved
@the-storm the-storm requested a review from arthaud July 20, 2023 14:52
sapp/ui/trace.py Outdated Show resolved Hide resolved
@arthaud
Copy link
Contributor

arthaud commented Jul 21, 2023

I still believe we should move that code into sarif.py.

@facebook-github-bot
Copy link
Contributor

@arthaud has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@arthaud
Copy link
Contributor

arthaud commented Jul 21, 2023

FYI There are type errors we need to fix:

[tools/sapp/sapp/sarif_types.py:45:0] Invalid type parameters [24]: Generic type `list` expects 1 type parameter, use `typing.List[<element type>]` to avoid runtime subscripting errors.
[tools/sapp/sapp/ui/issues.py:257:36] Incompatible parameter type [6]: In call `dict.__setitem__`, for 2nd positional argument, expected `Union[Dict[str, str], List[Dict[str, Dict[str, Union[Dict[str, int], Dict[str, str]]]]], str]` but got `List[Dict[str, List[Dict[str, List[typing.Any]]]]]`.
[tools/sapp/sapp/ui/trace.py:514:4] Incompatible return type [7]: Expected `Dict[str, Union[Dict[str, Union[Dict[str, Union[Dict[str, int], Dict[str, str]]], Dict[str, str]]], Dict[str, str]]]` but got `Dict[str, Union[Dict[str, Union[Dict[str, Dict[typing.Any, typing.Any]], Dict[str, str]]], int]]`.
Type `Dict[str, Union[Dict[str, Union[Dict[str, Union[Dict[str, int], Dict[str, str]]], Dict[str, str]]], Dict[str, str]]]` expected on line 534, specified on line 494. See [https://pyre-check.org/docs/errors...](https://pyre-check.org/docs/errors#covariance-and-contravariance) for mutable container errors.

@arthaud
Copy link
Contributor

arthaud commented Jul 21, 2023

I fixed the type errors and I am landing this. Please pull before working on follow-ups since I had to make many changes.

@facebook-github-bot
Copy link
Contributor

@arthaud merged this pull request in c4e6e2d.

@the-storm the-storm deleted the support-sarif-traces-sapp branch July 24, 2023 08:41
@the-storm the-storm mentioned this pull request Jul 25, 2023
facebook-github-bot pushed a commit that referenced this pull request Jul 27, 2023
Summary:
**Pre-submission checklist**
- [✅] I've ran the following linters locally and fixed lint errors related to the files I modified in this PR
    - [✅] `black .`
    - [✅] `usort format .`
    - [✅] `flake8`
- [✅]] I've installed dev dependencies `pip install -r requirements-dev.txt` and completed the following:
    - [✅] I've ran tests with `./scripts/run-tests.sh` and made sure all tests are passing
    - Tests are failing but already failing without this PR

This PR refactors the sarif relevant functions in the codebase and move it into `sarif.py` (centralized location). This was discussed as a follow-up to #93.

Pull Request resolved: #96

Test Plan: Run sapp on [exercise_3](https://github.com/facebook/pyre-check/blob/main/documentation/pysa_tutorial/exercise3/views.py) from Pysa tutorial and compared the output with this PR and without this PR and noticed no changes in the output.  SARIF output of [exercise3.txt](https://github.com/facebook/sapp/files/12164331/exercise3.txt)

Reviewed By: yuhshin-oss

Differential Revision: D47792963

Pulled By: arthaud

fbshipit-source-id: c58f2c5a1e1af5cd98210c4cd644c4c3d250154f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants