chore(iast): refactor iast request context by core.context #10988

Open
wants to merge 77 commits into
base: main

@avara1986 avara1986 commented Oct 9, 2024

This PR continues the work Christophe started in PR #10899, focusing on a refactor of the IAST context. Key highlights of this refactor include:

  • Similar to PR #10899 (chore(asm): refactor replacing old asm_request context by core.context), an IASTEnvironment class has been introduced. It holds request_enabled and the IastSpanReporter, which no longer depend on spans, only on the core context.
  • The IAST context is still created in the Span Processor on_span_start.
  • The context is now finalized within the with core.context_with_data block: when the WSGI request ends, the listener for context.ended.wsgi.__call__ is called. Previously, the context was sometimes lost by the time Span Processor on_span_finish was called. Finalization now checks that the context has not already been reported, which covers cases like the Django tests, where Span Processor.on_span_finish is called instead of context.ended.wsgi.__call__.
  • The thread_local struct ThreadContextCache_ has been temporarily removed to align the C++ context behavior with Python's. This is reflected in the test test_oce_concurrent_requests_futures_in_spans. The removal is a rollback of PR #8452 (fix(iast): improve overhead control logic).
  • The function _patched_fastapi_function and the Header patching were removed due to an error in FastAPI v0.92.
  • Additional validation points have been added for is_iast_request_enabled to avoid unnecessary operations.
  • The logic from AppSecIastSpanProcessor.is_span_analyzed has been moved to is_iast_request_enabled.
  • _asm_request_context.listen has been renamed to _asm_request_context.asm_listen to avoid confusion with other listen methods.
  • _asm_request_context.in_context has been renamed to _asm_request_context.in_asm_context to differentiate it from in_iast_context.
  • The import of DDWaf_result from ddtrace.appsec._ddwaf has been moved inside the if TYPE_CHECKING block to prevent circular imports caused by this refactor.
  • IAST wrappers: check whether the wrapped function has a func, to avoid the "RecursionError: maximum recursion depth exceeded in comparison" raised in some Python versions (see https://gitlab.ddbuild.io/DataDog/apm-reliability/dd-trace-py/-/jobs/668530031).

Tests:

  • All IAST tests have been refactored to accommodate the new way of creating and using the context.
  • The test test_weak_hash_new_with_child_span has been removed because we no longer depend on spans or child spans.
  • The deprecated attribute oce._enabled = True has been removed.
  • An IAST test in tests/tracer/test_trace_utils.py had to be updated and was moved to taint_sinks/test_insecure_cookie.py, so it requires validation from the @DataDog/apm-sdk-api-python team.
  • Context creation/finalization has been updated in the benchmarks, requiring validation from the @DataDog/apm-core-python team.

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

christophe-papazian and others added 30 commits October 2, 2024 11:19
@avara1986 avara1986 marked this pull request as ready for review October 14, 2024 10:24
@avara1986 avara1986 requested review from a team as code owners October 14, 2024 10:24
@avara1986 avara1986 requested a review from a team as a code owner October 14, 2024 12:50
@avara1986 avara1986 force-pushed the avara1986/refactor_iast_request_context_to_core branch from ccc8a9e to e4cc5f4 Compare October 14, 2024 14:04
Comment on lines +199 to +201
# log_messages = [record.message for record in caplog.get_records("call")]
# if not any(message.startswith("[IAST] no vulnerability quota") for message in log_messages):
# pytest.fail()

Suggested change
# log_messages = [record.message for record in caplog.get_records("call")]
# if not any(message.startswith("[IAST] no vulnerability quota") for message in log_messages):
# pytest.fail()

for message in log_messages:
    if IAST_VALID_LOG.search(message):
        pytest.fail(message)
# TODO(avara1986: Django raises message like "no quota", its ok

Maybe create a Jira task instead?

Comment on lines +139 to +154
# try_wrap_function_wrapper(
# "starlette.datastructures",
# "Headers.__getitem__",
# functools.partial(if_iast_taint_returned_object_for, OriginType.HEADER),
# )
# try_wrap_function_wrapper(
# "starlette.datastructures",
# "Headers.get",
# functools.partial(if_iast_taint_returned_object_for, OriginType.HEADER),
# )
# try_wrap_function_wrapper(
# "fastapi",
# "Header",
# functools.partial(_patched_fastapi_function, OriginType.HEADER),
# )
# _set_metric_iast_instrumented_source(OriginType.HEADER)

Is this related to the FastAPI issue? Maybe just remove it?

Suggested change
# try_wrap_function_wrapper(
# "starlette.datastructures",
# "Headers.__getitem__",
# functools.partial(if_iast_taint_returned_object_for, OriginType.HEADER),
# )
# try_wrap_function_wrapper(
# "starlette.datastructures",
# "Headers.get",
# functools.partial(if_iast_taint_returned_object_for, OriginType.HEADER),
# )
# try_wrap_function_wrapper(
# "fastapi",
# "Header",
# functools.partial(_patched_fastapi_function, OriginType.HEADER),
# )
# _set_metric_iast_instrumented_source(OriginType.HEADER)

@@ -139,6 +142,9 @@ def is_pyobject_tainted(pyobject: Any) -> bool:

def taint_pyobject(pyobject: Any, source_name: Any, source_value: Any, source_origin=None) -> Any:
# Pyobject must be Text with len > 1

This comment is now far from the logic it explains

Labels
ASM: Application Security Monitoring
changelog/no-changelog: A changelog entry is not required for this PR.
3 participants