
[uss_qualifier] Adding the interactions and invalid op checks steps #376

Merged

Conversation

punamverma (Contributor)

  1. Implemented test steps for checking interuss interactions and invalid data shared.
  2. Added a Mock USS FlightPlannerClient for clearing the area in prep_planners.

@punamverma (Contributor Author)

@BenjaminPelletier @barroco Need help. Not sure about the CI failure. Thanks.

@BenjaminPelletier (Member)

The end of the CI logs is:

2023-11-29 21:49:18.512 | ERROR    | monitoring.uss_qualifier.reports.validation.report_validation:validate_report:306 - Validation criterion 2 failed to validate.  Criterion definition:
applicability:
  skipped_actions: {}
pass_condition:
  elements:
    count:
      equal_to: 0.0

2023-11-29 21:49:18.512 | ERROR    | __main__:run_config:162 - Validation failed on test run report for configuration 'configurations.dev.uspace'

So, the criterion that the number of skipped actions = 0 was not validated. That means something was skipped. Downloading the monitoring-test-uss_qualifier-reports artifact and looking at the sequence view for U-space, I see:
[Screenshot: sequence view for the U-space configuration showing the skipped action]

Does that clarify?

@punamverma (Contributor Author)

Thanks @BenjaminPelletier. I think I understand it. I'll try fixing it and let you know. Thanks.

@punamverma (Contributor Author)

@BenjaminPelletier Fixed. PR is ready for your review.

@punamverma punamverma marked this pull request as ready for review November 30, 2023 06:35
@punamverma punamverma changed the title [uss_qualifier]Adding the interactions and invalid op checks steps [uss_qualifier] Adding the interactions and invalid op checks steps Dec 1, 2023
@BenjaminPelletier (Member)

Let's put this PR into draft until ready for the next review

@punamverma (Contributor Author)

@BenjaminPelletier Addressed the PR comments; the PR is ready for your review again. Thanks.

@BenjaminPelletier (Member) left a comment

Once you start to work on these comments, let's put this PR into draft state until it is ready for the next review.

@punamverma punamverma marked this pull request as draft December 4, 2023 07:03
@punamverma punamverma marked this pull request as ready for review December 12, 2023 07:01
@punamverma (Contributor Author)

@BenjaminPelletier Made the changes. Ready for your review.

self.begin_test_step("Validate flight2 GET interaction")
# ToDo - Add the test step details
self.end_test_step()
get_requested, already_notified = expect_get_requests_to_mock_uss(
Member

Why even make this call when tested_uss_notified is True, since it doesn't do anything in that case? It seems like this test step is simply not applicable when tested_uss received a notification, and therefore it should be skipped. It would be ok to stop the test scenario entirely (exactly as in the original precondition), but we could continue the scenario in the presence of a notification.

In other words, we don't `expect_get_requests_to_mock_uss` when `tested_uss_notified`.

@punamverma (Contributor Author), Dec 13, 2023

We have to do expect_get_requests_to_mock_uss because some USS implementations might make a GET request even if they are notified, and in that case we don't want to stop the test. We want to check GET requests, and we don't care whether they make GET requests with or without being notified.

We stop the test when 'GET request is not made' and 'the USS has been notified'.
But we don't stop if 'GET request is made', whether or not 'the USS has been notified'; that depends on the USS implementation.
And we fail the check if 'GET request is not made' and 'the USS is not notified'.

Member

It seems like this is what we want:

| GET request made | USS notified | Desired outcome |
| --- | --- | --- |
| No | No | Fail "Expect GET request" check |
| Yes | No | Pass "Expect GET request" check |
| No | Yes | Skip "Expect GET request" check since it's not applicable |
| Yes | Yes | Skip "Expect GET request" check since there is no way the USS could fail a requirement |

To accomplish this, simply enclose the `expect_get_requests_to_mock_uss` call in an `if not tested_uss_notified` block and get rid of the evaluation of `get_requested` and `already_notified` just afterward.
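
To make the intended branching concrete, here is a small self-contained sketch of the decision logic in the table above (the names `get_requested` and `tested_uss_notified` follow this discussion; this is illustrative only and does not reproduce the actual uss_qualifier helpers):

```python
from enum import Enum


class Outcome(Enum):
    FAIL = "fail"  # check performed and failed
    PASS = "pass"  # check performed and passed
    SKIP = "skip"  # check not applicable; reported as "Not tested"


def expect_get_request_outcome(get_requested: bool, tested_uss_notified: bool) -> Outcome:
    """Desired outcome of the "Expect GET request" check per the table above."""
    if tested_uss_notified:
        # With a notification delivered via a subscription, the absence or
        # presence of a GET request proves nothing about SCD0035, so the
        # check is skipped rather than passed or failed.
        return Outcome.SKIP
    return Outcome.PASS if get_requested else Outcome.FAIL


assert expect_get_request_outcome(False, False) is Outcome.FAIL
assert expect_get_request_outcome(True, False) is Outcome.PASS
assert expect_get_request_outcome(False, True) is Outcome.SKIP
assert expect_get_request_outcome(True, True) is Outcome.SKIP
```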

@punamverma (Contributor Author), Dec 14, 2023

I agree with rows 1, 2 and 3, but the 4th should be Pass instead of Skip.

GET request made is what we want the tested_uss to do. In both rows 2 and 4 in the above table, GET request is made.

In the test, the tested_uss makes a GET request for flight 2 while planning flight1. And the data through GET request is invalid, hence we expect tested_uss to fail a plan. Depending on the implementation, a USS might GET flight2 even though it received a notification for it. They might not use notifications for planning, instead make a fresh GET request when planning. We should not Skip the test for them, as they did make a GET request, and we know that plan was failed from previous step.

So, in both rows of the following table, I think the desired outcome should be Pass.

| GET request made | USS notified | Desired outcome |
| --- | --- | --- |
| Yes | No | Pass "Expect GET request" check |
| Yes | Yes | Pass "Expect GET request" check because this USS did make a GET request for flight2, which was invalid |

Would that be incorrect?

Member

If a USS already has the op intent details via a notification to their subscription, we would not expect them to perform a GET request. They are not strictly prohibited from making a GET request when they already have the details, but doing so would be unexpected and inefficient. The "Expect GET request" check is documented to say "[...] the only way to have verified this is by knowing all operational intent details, and (from previous checks of no notifications) the only way to know the operational intent details of flight is to have requested them via a GET details interaction." That check is not applicable and the documentation is not correct when the operational intent details are obtained a different way. It is therefore inappropriate to perform that check, and therefore it is inappropriate to mark it Passed.

> GET request made is what we want the tested_uss to do.

Emphatically no; there is no requirement for a USS to make a GET request. The requirement (SCD0035) is that they get (and evaluate) the operational intent details before doing a particular thing. When they have no subscription, we know the only way they can get the details is via a GET request, so in that situation we can conclude that they violated SCD0035 if they didn't make that GET request before doing the particular thing. When they do have a subscription, we can no longer conclude anything from the lack of a GET request, which is why checking for a subscription used to be a precondition for performing this test step.

> In the test, the tested_uss makes a GET request for flight 2 while planning flight1.

When they don't have a subscription, yes.

> And the data through GET request is invalid, hence we expect tested_uss to fail a plan.

Not just the GET request, by the way -- the notification contains the same invalid data as well.

> Depending on the implementation, a USS might GET flight2 even though it received a notification for it. They might not use notifications for planning, instead make a fresh GET request when planning.

Agreed (though this would be strange, inefficient, and eyebrow-raising; if they're not interested in the information they're getting through the subscription, then there is no reason for them to be loading the system with it)

> We should not Skip the test for them, as they did make a GET request

No; the check says that the only way to have met SCD0035 in the situation we set up (where the check is performed) was to have performed a GET request. That is not true if they received a notification and also made a GET request (the situation to which the check pertains was never successfully set up).

Member

As discussed separately at length, stopping the test scenario if we found that tested_uss has a subscription is not appropriate because the test scenario can continue (so ScenarioCannotContinue is not correct), and the rest of the scenario includes valuable checks that we can successfully make even when tested_uss has a subscription. This includes, for instance, the only checks currently for SCD0085.

It is definitely appropriate to skip the checks in "Validate flight2 GET interaction test step" because they are not applicable when tested_uss has a subscription (as we do by placing it within an if block). However, the following step "Validate flight1 Notification sent to Control_uss test step" can be performed regardless of whether a subscription was detected. Likewise for the remaining two cleanup test steps in the test case. The second test case is independent of the first test case and can be performed regardless of subscription status in the first test case. Even if we were confident that we would need to skip "Validate flight 2 GET interaction test step" in the second test case (which we are not; some circumstances and implementations may result in being able to perform just one or the other), proceeding with the test scenario would still be valuable in order to perform checks like the ones in "Tested_uss attempts to plan flight 1, expect failure test step" and "Validate flight 1 not shared by tested_uss test step".

For these reasons, the else block should be removed.

My understanding of your hesitance to remove the else block is that you want the scenario to somehow not be successful if that "main" step/check cannot be performed, and raising ScenarioCannotContinueError is the means to accomplish that. We discussed that scenarios do not fail except for execution errors that indicate the test itself is broken; rather success is based on all checks passing when they are, in fact, made during the test, and ensuring that the test was capable of detecting noncompliance for each of the requirements that needed to be verified. Therefore, raising ScenarioCannotContinueError does not indicate that the scenario was not successful. We also discussed that we generally try to verify as many requirements as we can with a given simple test scenario, so there is no one "main" step/check without which the scenario does not have value. In this case specifically, checking SCD0085 is very valuable because this is the only scenario in which that requirement is checked. Therefore, we should not forego checking SCD0085 just because we are unable to check SCD0035 in this particular way.

@punamverma (Contributor Author), Dec 15, 2023

Ok, got it. I understand that ScenarioCannotContinueError is not the correct error to stop the scenario, and that, as you say, we should not stop the test. I think I am good with removing the else block, but that raises a question for me: how do I use the report to show that a check got skipped? That could probably be my lack of knowledge of the InterUSS reporting framework.

I know that when notifications are made, the GET request check is going to be marked Not Tested. Is that the indication we need to use?

My hesitance is because I am not clear: how in the report do we indicate that GET request validation was skipped? You had mentioned that SCD0035 will be verified elsewhere, hence this wouldn't show as not verified.
Then is there a way that I can use to indicate GET request validation could not be performed and was skipped? And that brought up the question of whether we can create an InterUSS requirement that can help us show that the check got skipped, so that we are not dependent on SCD0035, which is being tested at other places.
Or is there any other way we could indicate skipping of a validation test?

Member

The top-level goal of automated testing is verifying participant compliance to requirements. The way automated testing accomplishes this is to set up many situations where particular outcomes of observations can allow us to conclude that the participant did not meet one or more requirements. We can never positively declare that a participant meets all requirements; we can only say that we checked a bunch of situations and were unable to declare the participant non-compliant with any of a set of requirements with which we could have potentially declared them non-compliant. With that in mind,

> how in the report do we indicate that GET request validation was skipped

Skipped checks are indicated as Not tested in the tested requirements artifact. An example tested requirements artifact that is generated with every merge to main may be found here. However, skipping a check does not have any impact on whether we can verify compliance to the associated requirement unless that check is the only check verifying compliance to the requirement. This is because of the above: verification of compliance is based on 1) no failures and 2) capability to detect noncompliance.

> Then is there a way that I can use to indicate GET request validation could not be performed and was skipped?

It shouldn't be important to know this. If a USS is required to perform actions that will allow us to set up the check, then we can indicate failure if the USS does not perform those actions, and therefore we will either perform the check or report a failure. If a USS is not required to perform actions that will allow us to set up the check, then we can perform the check if possible, but if we can't perform the check then we can't report any failures because the USS was not required to do the things that would enable us to perform the check. We are in the latter situation here. The problem is that you want to claim that there was a failure in that latter situation when in fact there was not. The fact that GET request validation could not be performed and was skipped is not relevant to whether the USS under test is fully compliant with all ASTM F3548-21 requirements.

InterUSS sometimes imposes additional requirements to make it possible to perform effective automated testing -- for instance, implementation of the flight_planning interface is not prescribed by ASTM or any other standards or regulatory body, rather it is something InterUSS requires a USS to do in order to be tested by uss_qualifier for ASTM F3548-21. We try hard to keep these additional requirements to a minimum, and we try extremely hard to make it so these requirements do not preclude any reasonable system that is compliant with the requirements we want to test from passing the test suite after some additional development wholly outside the system under test. We are open to potential additional requirements that follow these criteria, but I am not sure what requirement we could reasonably add here. "GET request validation must be performed" is not a requirement a USS can use to design their system. "A USS may not proactively make any subscriptions before a concrete flight plan is proposed and subscriptions must be removed after all flights in the area are finished" is a more reasonable requirement, but F3548-21 specifically allows this design freedom and InterUSS would not want to take away this design freedom.

NASA can make whatever requirements it wants if there is a NASA program under which USSs would be tested -- so, for instance, NASA could specify "A USS may not proactively [...]". However, that requirement would be above and beyond ASTM F3548-21 and therefore would be checked in a separate NASA program test suite and not in the ASTM F3548-21 test suite. NASA could even add a "GET request validation must be performed" requirement for the NASA test program, but I would expect USSs to challenge the value of that requirement because it is not apparent what outcome could not be accomplished without this requirement. ASTM, through its standards development process, has come to the conclusion that a USS needs to verify an operational intent does not conflict with an equal priority operational intent under certain circumstances (SCD0035). This is a performance-based requirement that is purposefully agnostic of the means by which that requirement is accomplished. Making a GET request is one path to compliance with this requirement, but it is not the only one. If NASA requires that USSs must take the GET request path specifically, that limits design freedom that F3548 allows, and it is unclear what objective could not be met if an alternate path were taken (yet the F3548 performance-based requirement were still met).

Contributor Author

> The problem is that you want to claim that there was a failure in that latter situation when in fact there was not. The fact that GET request validation could not be performed and was skipped is not relevant to whether the USS under test is fully compliant with all ASTM F3548-21 requirements.

No, I do not want to show it as a failure. I want to show it as a skipped test that we could not verify on that USS because of certain conditions that occurred for this USS, in this case notifications due to a subscription.

> We are open to potential additional requirements that follow these criteria, but I am not sure what requirement we could reasonably add here. "GET request validation must be performed" is not a requirement a USS can use to design their system. "A USS may not proactively make any subscriptions before a concrete flight plan is proposed and subscriptions must be removed after all flights in the area are finished" is a more reasonable requirement, but F3548-21 specifically allows this design freedom and InterUSS would not want to take away this design freedom.

No, I don't want "GET request validation must be performed"; instead it is "If a USS makes GET requests, they must validate the data received". And if they don't, we want to be able to know that the check got skipped because they never made a GET request, due to notifications from subscriptions.

I am not sure how feasible it would be to ask for the requirement "A USS may not proactively make any subscriptions before a concrete flight plan is proposed and subscriptions must be removed after all flights in the area are finished". But if it is feasible, that would definitely force a USS to make a GET request.
I am not sure if we want to go to that extent, but we can discuss it in our meeting.

> NASA can make whatever requirements it wants if there is a NASA program under which USSs would be tested -- so, for instance, NASA could specify "A USS may not proactively [...]". However, that requirement would be above and beyond ASTM F3548-21 and therefore would be checked in a separate NASA program test suite and not in the ASTM F3548-21 test suite. NASA could even add a "GET request validation must be performed" requirement for the NASA test program, but I would expect USSs to challenge the value of that requirement because it is not apparent what outcome could not be accomplished without this requirement. ASTM, through its standards development process, has come to the conclusion that a USS needs to verify an operational intent does not conflict with an equal priority operational intent under certain circumstances (SCD0035). This is a performance-based requirement that is purposefully agnostic of the means by which that requirement is accomplished. Making a GET request is one path to compliance with this requirement, but it is not the only one. If NASA requires that USSs must take the GET request path specifically, that limits design freedom that F3548 allows, and it is unclear what objective could not be met if an alternate path were taken (yet the F3548 performance-based requirement were still met).

I don't think there is any such NASA requirement at the moment; we can check on that during our meeting, just in case.
But through the report we would want to know if GET request validation could not be done for a USS because they had notifications due to subscriptions. So the USS is not non-compliant, but there are some tests we could not perform on that USS.

Contributor Author

Made changes as discussed. Removed else block.

self._scenario.end_test_step()
return oi_full.reference

def expect_shared_with_invalid_data(
Member

I think `expect_shared_with_specified_data` is the better thing to do here because that clarifies/separates the responsibilities of 1) providing invalid data (responsibility of uss_qualifier/test designer) and 2) sharing the data provided (responsibility of mock_uss). But we certainly shouldn't check for invalid data in a function whose name indicates it is going to check for specified data. If the implementation is going to stay as-is, then it's better if the name stays as it was, to accurately describe what the function does.
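
A minimal, self-contained sketch of the responsibility split being suggested (the names and types here are illustrative stand-ins, not the actual uss_qualifier API): the test designer supplies the details (valid or deliberately invalid), and the check only verifies that mock_uss shared exactly what was specified.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OpIntentDetails:
    """Illustrative stand-in for operational intent details shared by mock_uss."""
    priority: int
    volumes: List[dict] = field(default_factory=list)


def expect_shared_with_specified_data(shared: OpIntentDetails, specified: OpIntentDetails) -> bool:
    """Check only that mock_uss shared the data uss_qualifier told it to share.

    Whether that data is valid or invalid is the test designer's choice and is
    deliberately not evaluated here.
    """
    return shared == specified


# The test designer specifies intentionally invalid details (e.g. a negative
# priority); the check still only compares shared data against specified data.
specified = OpIntentDetails(priority=-1)
shared = OpIntentDetails(priority=-1)
assert expect_shared_with_specified_data(shared, specified)
```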

@punamverma (Contributor Author) commented Dec 13, 2023

Made all changes as required, except for one comment, to which I have replied: #376 (comment)

self.begin_test_step("Validate flight2 GET interaction")
# ToDo - Add the test step details
self.end_test_step()
get_requested, already_notified = expect_get_requests_to_mock_uss(
Member

It seems like this is what we want:

| GET request made | USS notified | Desired outcome |
| --- | --- | --- |
| No | No | Fail "Expect GET request" check |
| Yes | No | Pass "Expect GET request" check |
| No | Yes | Skip "Expect GET request" check since it's not applicable |
| Yes | Yes | Skip "Expect GET request" check since there is no way the USS could fail a requirement |

To accomplish this, simply enclose the `expect_get_requests_to_mock_uss` call in an `if not tested_uss_notified` block and get rid of the evaluation of `get_requested` and `already_notified` just afterward.

@punamverma (Contributor Author)

@BenjaminPelletier I removed the else block as discussed.

self.begin_test_step("Validate flight2 GET interaction")
# ToDo - Add the test step details
self.end_test_step()
if tested_uss_notified:
Member

A useful tool to check the correctness/behavior of uss_qualifier functionality is to inspect the various artifacts produced from a test run. The CI runs most of the configurations in configurations/dev with output going to uss_qualifier/output -- this output is available as a downloadable artifact when the CI is run on GitHub, or it will be on your local machine if run locally. Here is a portion of the tested requirements artifact for the f3548_self_contained test configuration:

[Screenshot: portion of the tested requirements artifact for the f3548_self_contained test configuration]

It seems like we would want to make sure that check happened.
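
For reference, this is the shape of the corrected guard implied by this exchange and the earlier table; it is a sketch only, with stub functions standing in for the real uss_qualifier scenario methods and helper:

```python
def begin_test_step(name: str) -> None:
    print(f"Begin test step: {name}")  # stand-in for self.begin_test_step


def end_test_step() -> None:
    print("End test step")  # stand-in for self.end_test_step


def expect_get_requests_to_mock_uss() -> None:
    # Stand-in for the real check, which inspects mock_uss interactions.
    print("Checking that tested_uss made a GET details request to mock_uss")


def validate_flight2_get_interaction(tested_uss_notified: bool) -> None:
    begin_test_step("Validate flight2 GET interaction")
    if not tested_uss_notified:
        # The check applies only when tested_uss did not receive a notification;
        # otherwise it is skipped and appears as "Not tested" in the tested
        # requirements artifact.
        expect_get_requests_to_mock_uss()
    end_test_step()


validate_flight2_get_interaction(tested_uss_notified=False)  # check performed
validate_flight2_get_interaction(tested_uss_notified=True)   # check skipped
```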

Contributor Author

Thanks. Fixed the condition.
But I see that there are some subscriptions that are not getting cleaned up, which is causing notifications to happen.

@BenjaminPelletier (Member) left a comment

I think you'll agree this was a challenging review. To try and make the next one more efficient, I'd like to try extra hard to follow the InterUSS policy on large contributions. It targets a maximum of 30 committer comments total for a PR (and ideally many fewer!). This PR gathered 91 committer comments (excluding 5 nits and 2 invalid comments) with 460 lines added when initially marked ready for review, so to target no more than 30 comments, let's not exceed 151 lines added in the next PR.

General guidance for making PRs small is linked in item 6 of General principles. I find Google's guidance very helpful also. In the case of this PR specifically, the first obvious place to split is the description, which includes two separate bullet points -- each of those bullets could have been a separate PR. In addition, implementation of the first test case could have been one PR, followed by the implementation of the second test case in a follow-up PR. Similarly, only one test step or a few test steps could be included in the first PR and later steps placed in follow-up PRs. Once it was obvious a non-trivial refactoring was necessary in scenarios/astm/utm/test_steps.py, this PR could have been paused to perform that refactoring in its own PR first.

InterUSS committers can also help suggest where to split. Ideally, that suggestion would be based solely on a plain-English summary of the work as the idea is to split by concept rather than by any particular features of the code files. As a last resort, we could also potentially look at the code of a larger PR to make a suggestion, but this would be undesirable as it would require reading and understanding the entire code of the larger PR before being able to make a suggestion -- that's exactly one of the things small PRs try to avoid.

@BenjaminPelletier BenjaminPelletier merged commit 6139bde into interuss:main Dec 17, 2023
9 checks passed
github-actions bot added a commit that referenced this pull request Dec 17, 2023
* Adding the interactions and invalid op checks steps

* Added mock_uss to prep_planners doc

* fix format

* Fix doc

* Adding missing code

* Adding missing mock_uss resource to PrepareFlightPlanners

* Fix per PR comments

* Removing post check for a particular url

* Fixing check details

* Adding 5 s back, removing url check no post interactions

* Wait only for checking  POST notifications

* Fix per PR comments

* Fix per PR review

* Fix format

* Refactoring code, removed wait time for other than post interactions check

* querying for interactions optimistically

* Fix passing ref id

* Update monitoring/uss_qualifier/scenarios/astm/utm/data_exchange_validation/test_steps/validate_no_notification_operational_intent.md

Co-authored-by: Benjamin Pelletier <[email protected]>

* Fix per review comments

* Fix per review

* Adding missing code for Query

* Fixed documentation for wait time as per PR comments

* Adding subscription check with GET request check

* Fixed per PR comments

* Fix per review

* Removing args from comments

* Conditional call of GET request based on notification

* Refining check message

* Removing else condition, and renaming test step and check as discussed

* Changing method name per review

* fixing the condition for GET request check

* Want to trigger CI run that failed on netrid test

---------

Co-authored-by: Benjamin Pelletier <[email protected]> 6139bde
github-actions bot added a commit to BradNicolle/monitoring that referenced this pull request Dec 18, 2023