PASS_TO_PASS and FAIL_TO_PASS test cases #257

Open
chenzimin opened this issue Nov 21, 2024 · 0 comments
Labels
documentation Improvements or additions to documentation

chenzimin commented Nov 21, 2024

Describe the issue

Hi, sorry if I missed this in your paper or somewhere in your GitHub repository.

I tried to run all existing test cases for a given SWE-bench project, for example using the Docker image (which I assume the SWE-bench team uploaded) from https://hub.docker.com/r/swebench/sweb.eval.x86_64.psf_1776_requests-1142:

docker run -it swebench/sweb.eval.x86_64.psf_1776_requests-1142:v1
pytest -rA

Here is the full list of test results:

PASSED test_requests.py::RequestsTestCase::test_basic_building
PASSED test_requests.py::RequestsTestCase::test_entry_points
PASSED test_requests.py::RequestsTestCase::test_invalid_url
PASSED test_requests.py::RequestsTestCase::test_params_are_added_before_fragment
PASSED test_requests.py::RequestsTestCase::test_path_is_not_double_encoded
FAILED test_requests.py::RequestsTestCase::test_BASICAUTH_TUPLE_HTTP_200_OK_GET - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_DIGESTAUTH_WRONG_HTTP_401_GET - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_DIGEST_HTTP_200_OK_GET - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_200_OK_GET_ALTERNATIVE - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_200_OK_GET_WITH_MIXED_PARAMS - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_200_OK_GET_WITH_PARAMS - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_200_OK_HEAD - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_200_OK_PUT - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_HTTP_302_ALLOW_REDIRECT_GET - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_POSTBIN_GET_POST_FILES - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_POSTBIN_GET_POST_FILES_WITH_DATA - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_custom_content_type - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_decompress_gzip - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_different_encodings_dont_break_post - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_links - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_prepared_request_hook - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_request_ok_set - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_status_raising - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_unicode_get - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_urlencoded_get_query_multivalued_param - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_user_agent_transfers - TypeError: __init__() got an unexpected keyword argument 'strict'
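In case it helps with triage, here is a minimal sketch of how the summary lines above could be tallied by status and by failure message. The regular expression and the `parse_summary` helper are my own assumptions for illustration, not part of pytest or the SWE-bench harness, and the pattern assumes test IDs contain no spaces (true for the list above, but not for every parametrized test).

```python
import re
from collections import Counter

# Assumed pattern for pytest "-rA" short summary lines, e.g.
# "PASSED path::Class::test" or "FAILED path::Class::test - message".
SUMMARY_LINE = re.compile(r"^(PASSED|FAILED|ERROR)\s+(\S+)(?:\s+-\s+(.*))?$")


def parse_summary(report_text):
    """Return (status, test_id, reason) tuples; reason is None when absent."""
    rows = []
    for line in report_text.splitlines():
        match = SUMMARY_LINE.match(line.strip())
        if match:
            rows.append(match.groups())
    return rows


# A few lines copied from the report above.
sample = """\
PASSED test_requests.py::RequestsTestCase::test_basic_building
PASSED test_requests.py::RequestsTestCase::test_invalid_url
FAILED test_requests.py::RequestsTestCase::test_links - TypeError: __init__() got an unexpected keyword argument 'strict'
FAILED test_requests.py::RequestsTestCase::test_unicode_get - TypeError: __init__() got an unexpected keyword argument 'strict'
"""

rows = parse_summary(sample)
status_counts = Counter(status for status, _, _ in rows)
failure_reasons = Counter(reason for status, _, reason in rows if status == "FAILED")

print(status_counts)
print(failure_reasons)
```

If every failure collapses to a single message, as with the `strict` keyword error here, that suggests one shared cause rather than many independent failures.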

My first question is: why do not all test cases pass? I found the same behavior in several other instances.

My second question is: what is your reasoning for not allowing the use of PASS_TO_PASS test cases when evaluating on SWE-bench? This is related to the first question. The test cases that pass before the issue is fixed shouldn't be a secret, and when existing tests fail it is harder to tell whether the failures are caused by the bug the issue describes or by a problem in SWE-bench itself.

I come from an automated program repair background, where the usual assumption is that the existing test cases pass, so we use them as regression tests to check that our patch did not break existing functionality.
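To make the regression-testing convention concrete, here is a minimal sketch of how I would compare test statuses before and after a patch. The `classify` helper, the output keys, and the test IDs are all hypothetical and chosen for illustration; this is the convention I am used to from program repair, not a description of how the SWE-bench harness actually computes PASS_TO_PASS or FAIL_TO_PASS.

```python
def classify(before, after):
    """Compare two {test_id: status} maps taken before and after a patch."""
    passed_before = {t for t, s in before.items() if s == "PASSED"}
    passed_after = {t for t, s in after.items() if s == "PASSED"}
    failed_before = set(before) - passed_before
    return {
        # Passed before and still pass: the regression suite held.
        "pass_to_pass": sorted(passed_before & passed_after),
        # Failed before and pass now: the patch fixed these.
        "fail_to_pass": sorted(failed_before & passed_after),
        # Passed before but not after: the patch broke these.
        "regressions": sorted(passed_before - passed_after),
    }


# Hypothetical statuses, not taken from any real SWE-bench run.
before = {"test_a": "PASSED", "test_b": "PASSED", "test_c": "FAILED"}
after = {"test_a": "PASSED", "test_b": "FAILED", "test_c": "PASSED"}

report = classify(before, after)
print(report)
```

Under this convention a patch is acceptable only when `regressions` is empty, which is why knowing the pre-patch passing set matters.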

Suggest an improvement to documentation

No response

chenzimin added the documentation label Nov 21, 2024