Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mc_revise_pr_type_checking_and_linting #77

Merged
merged 30 commits into from
Apr 24, 2024

Conversation

maxachis
Copy link
Collaborator

Fixes

Description

  • The prior implementation include a lot of logic within the yaml file itself and was consequently difficult to debug and prone to unexpected errors.
  • The new version migrates the majority of logic into python scripts, where the logic is rendered more comprehensible.
  • Additionally, substantial logging has been added, to further improve debugging capabilities.

Testing

  • Internal testing has been added via test_type_and_docstyle_checking.py

Prior testing was demonstrated in the following PRs:

  • This PR, which (after a few modifications) showcases how it would work on a PR
  • This other PR, which shows it functioning again, but now only on the new file I've created. The other file, despite not being corrected, is not checked -- it only runs these checks on files that have been modified.

Performance

  • As before, the Github action is light, and should run in 30 seconds or less
  • Some overhead is added by transferring logic to Python, but this performance cost should be in the realm of milliseconds.
  • In the case of performance dips, additional enhancements can be done to cache mypy and pydocstyle modules, although these are fairly light regardless

This change introduces a new utility script that checks type hints and docstring styles in Python code files. It looks specifically for modified files in a Git repository and runs mypy and pydocstyle on them. The utility is tested with various scenarios, including cases with no file modifications and cases where errors are detected.
Added a new function named 'get_project_root' in miscellaneous_functions.py. This function locates the root directory of the project by looking for root markers. It starts by checking the current file's directory and moving up levels until a root marker is found. This enhances code reusability and makes reference to root directory easier.
Enhanced the script to provide a clearer error reporting method. It now segregates 'mypy' and 'pydocstyle' errors into distinct sections. Furthermore, the script can find and use a ".pydocstyle" configuration file if present at the repository root.
Removed specific mypy and pydocstyle tests. Improved error detection and implemented changes to test the type and docstyle checking functionalities more efficiently and thoroughly. The error reporting method is also enhanced to segregate 'mypy' and 'pydocstyle' errors.
The test function `test_check_files_with_errors` has been removed from `test_type_and_docstyle_checking.py`. This function previously simulated a scenario where errors are found in scripts and expected a `SystemExit` to be raised. It will no longer be part of our test suite.
The `run_command` function in `util/type_and_docstyle_checking.py` has been refactored to improve error handling, using a try/except block. Debugging output has also been significantly enhanced across all functions, aiding in easier detection and tracing of potential issues.
The code for finding and checking modified Python files in the GitHub workflow has been simplified. It now makes use of a Python script `util/type_and_docstyle_checking.py` which improves error handling and debugging outputs, ideally leading to an easier detection and tracing of potential issues.
@maxachis maxachis requested a review from mbodeantor April 18, 2024 12:07
@maxachis maxachis requested a review from mbodeantor as a code owner April 18, 2024 12:07
@maxachis maxachis added the enhancement New feature or request label Apr 18, 2024
A code snippet has been added in `util/type_and_docstyle_checking.py` to adjust the current working directory and path settings. This code addition aids in solving import issues that might otherwise crop up, due to the scripts being run from different locations.
The command_to_run assignment in `util/type_and_docstyle_checking.py` has been modified to use a simplified one-line string format. This was done to prevent possible errors stemming from having a multi-line shell command.
The process for handling exceptions in util/type_and_docstyle_checking.py has been adjusted to include the standard output. This was done because some error responses will still generate standard output, which can provide valuable debugging information.
The error handling in the function run_command() of util/type_and_docstyle_checking.py has been simplified. It no longer explicitly handles subprocess.CalledProcessError exceptions, as there is no longer a specific need to print any error messages or outputs in case of failure. The standard output is simply returned.
The function run_command() in util/type_and_docstyle_checking.py has been wrapped in a try/except block to handle subprocess.CalledProcessError exceptions. When subprocess.run encounters a CalledProcessError, the standard output of that subprocess is returned instead of letting the error halt the execution.
A new function named print_header has been added to miscellaneous_functions.py. This function prints a given text as a header-type string with borders above and below it. This will be utilized to improve output readability.
The error announcement segment in 'type_and_docstyle_checking.py' has been updated to use the 'print_header' function. This improves readability by delineating results from MYPY and PYDOCSTYLE checks.
The regular expression used for identifying error counts in 'type_and_docstyle_checking.py' has been revised. The change now allows for matching any digit, thereby increasing the flexibility and accuracy when matching with the number of errors in the output.
@josh-chamberlain
Copy link
Contributor

Thanks for making test issues so we can quickly see the behavior!

  1. I can't find where I commented this before, but this adds some comment/notifications noise. Is there always going to be a comment, or is there a way for it to show up as a test as in the PR I'm commenting on? Is there a neutral option, so it's not a pass/fail test, or a way to label it "optional" or "advisory"? This is the perfect contextual place to deliver that kind of information, and update it when it changes:
Screen Shot 2024-04-19 at 12 04 33 PM
  1. From your comment:

One challenge here seems to be that my current iteration doesn't play nice with forked branches. I'll see if I can tweak that.

Is that resolved? Anyone not explicitly added as a contributor will be making forks instead of branches.

Updated the fetching logic in the Github Actions workflow for Python checks to improve readability and clarity. The fetching of base and head branches is now separated into two distinct parts with clear naming conventions. Introduced a new environment variable, HEAD_REPO_PATH, to reference checked-out forks.
@maxachis
Copy link
Collaborator Author

maxachis commented Apr 20, 2024

@josh-chamberlain

  1. Unfortunately, there's no advisory option when it comes to Github Actions. However, there are other options. One of them is to post a single comment for the checks, and have any subsequent runs update that same comment. That would in my view be the best way to balance it, but I'll defer to your thoughts on the matter.
  2. Just resolved that, so yes!

Adjusted the Github Actions workflow for Python checks by replacing the HEAD_REPO_PATH environment variable with a common directive, "working-directory". This optimizes the fetching logic and makes the checked-out path easier to reference.
The Github Actions workflow for Python code and docstyle checking has been modified. A new environment variable named 'HEAD_REPO_PATH' now aids in accurately representing the checked-out path. This update also simplifies the error messages highlighting unset required environment variables to facilitate easier debugging.
This update introduces 'HEAD_REPO_PATH' environment variable to better track repository paths in the docstyle checking section of the python workflow. In addition, the error message for missing environment variables has been simplified for better clarity. This enhances user feedback and may facilitate faster debugging.
@josh-chamberlain
Copy link
Contributor

@maxachis

  1. can we use check runs? they support a few different conclusions. https://docs.github.com/en/rest/checks/runs?apiVersion=2022-11-28
  2. thanks!

The commit adds write permissions for issues and pull-requests in the GitHub workflow file (.github/workflows/python_checks.yml). These permissions allow modifications to be performed in issues and pull-requests during workflow execution.
The changes aim to enhance the Python linting process during CI/CD. The source repository checkout and Python environment setup steps have been updated to use newer versions. Additionally, the action used for flake8 linting has been changed to
The changes aim to enhance the Python linting process during CI/CD. The source repository checkout and Python environment setup steps have been updated to use newer versions. Additionally, the action used for flake8 linting
The workflow file '.github/workflows/python_checks.yml' has been modified to enhance Python code quality checks. It now uses the 'reviewdog/action-flake8' instead of 'py-actions/flake8', allowing for more extensive code checks via newly added flake8 plugins. Also, it only runs on pull requests, and has been updated to use older versions of 'checkout' and 'setup-python' actions.
The 'python_checks.yml' workflow has been updated to use newer versions of the 'checkout' and 'setup-python' actions. The upgraded versions are expected to improve the efficiency and reliability of the Python code quality checks. Moreover, additional flake8 plugins have been incorporated to extend code verification.
The 'python_checks.yml' workflow has been modified to integrate Reviewdog with flake8 for linting Python code. This allows linting results to be reported in a more user-friendly manner. Prior to this update, basic flake8 linting was used.
The update in 'python_checks.yml' integrates Reviewdog with flake8, enhancing our Python code linting process. This upgrade allows linting outcomes to be presented in a more efficient and user-friendly style, instead of the previous basic flake8 linting.
The update in 'python_checks.yml' integrates Reviewdog with flake8, enhancing our Python code linting process. This upgrade allows linting outcomes to be presented in a more efficient and user-friendly style, instead of the previous basic flake8 linting.
The update in 'python_checks.yml' integrates Reviewdog with flake8, enhancing our Python code linting process. This upgrade allows linting outcomes to be presented in a more efficient and user-friendly style, instead of the previous basic flake8 linting.
The update in 'python_checks.yml' integrates Reviewdog with flake8, enhancing our Python code linting process. This upgrade allows linting outcomes to be presented in a more efficient and user-friendly style, instead of the previous basic flake8 linting.
@maxachis
Copy link
Collaborator Author

maxachis commented Apr 23, 2024

@josh-chamberlain After much trial and error, I seem to have found the best solution, one that indeed utilizes check runs and is much simpler in execution, to boot. If you look at the "Files Changed" tab, you can see the results. Have a look and give me your thoughts. The files modified will be deleting as they're no longer necessary, but I figured it's useful to keep them here as a demonstration, for the moment.

One next step might be to decide which annotations we don't want. It's probably easier to suggest modifications to this as we go along and find ones we think are disruptive.

@josh-chamberlain
Copy link
Contributor

thanks for being willing to go back to various drawing boards @maxachis ! I think it's much nicer in the check runs section, and the inline notes are perfect. As someone who never contributes Python code, I will defer to others on which annotations we care about, but it seems these can all be safely ignored if needed

@maxachis
Copy link
Collaborator Author

@mbodeantor What are your thoughts? How does this look, and what annotations might we want to keep?

@mbodeantor
Copy link
Contributor

@maxachis couldn't we reuse the github actions using black from data-sources-app?

@maxachis
Copy link
Collaborator Author

@maxachis couldn't we reuse the github actions using black from data-sources-app?

@mbodeantor Black functions as a Code formatter, not a code linter. Thus, while Black is useful for ensuring that code adheres to a specific format, it will not check for docstring formatting, type hinting issues, unused variables, or other problems.

@mbodeantor
Copy link
Contributor

mbodeantor commented Apr 24, 2024

@maxachis Ah nice. Yeah I would say we go with this for now and make changes as seem appropriate moving forward. There's a lot of tickets around the V2 changes so it would be a huge help to get some focus on those

@mbodeantor mbodeantor merged commit 203cc0b into main Apr 24, 2024
3 checks passed
@maxachis maxachis deleted the mc_revise_pr_type_checking_and_linting branch April 24, 2024 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants