git diff support #296

Closed
wants to merge 23 commits into from

msftcangoblowm

  • feat: entrypoint sphobjinv-textconv act as textconv so can git diff inventory #295
  • test: add sphobjinv-textconv unittests offline and online
  • test: add integration test. Demonstrate git diff objects.inv
  • test: add pytest module with class to interact with git
  • docs: add step by step guide to configure git. Both bash and python code examples
  • docs: added sphobjinv-textconv API docs

Is the PR a fix or a feature?
This is a new feature release

Bump the version up to 2.4.0

Describe the changes in the PR
Closes #295

Does this PR close any issues?
Closes #295

Does the PR change/update the following, if relevant?

  • Documentation
  • Tests
  • CHANGELOG
  • version bump
  • doctest
  • checklink
  • tox -e flake8-noqa
  • tox -e interrogate
  • pre-commit
  • Integration tests
  • 100% coverage
  • [no] static type checking
  • [no] modifies sphobjinv entrypoint codebase
  • [no] ran CI/CD workflows (can't)

- feat: entrypoint sphobjinv-textconv ([bskinn#295])
- test: add sphobjinv-textconv unittests offline and online
- test: add integration test. Demonstrate git diff objects.inv
- test: add pytest module with class to interact with git
- docs: add step by step guide to configure git. Both bash and python code examples
- docs: added sphobjinv-textconv API docs
- fix: py38 with statements multiple --> nested
- fix: py38 standard types typing isms e.g. List
- fix: use Windows friendly line separators
- test: print diagnostic info on Windows bin and site folders
- test: print entire os.environ rather than a single key
- test: for Windows, attempt add SCRIPTS folder to sys.path
- test: for Windows, walk SCRIPTS folder print files and folders
- test: wrong params for list.append instead use list.insert
@msftcangoblowm
Author

I'm flying blind

I'm not a Windows user, nor do I have access to a Windows box or VM.

This is a plea for help!

On Windows, how do I get the absolute path of these executables: sphobjinv and sphobjinv-textconv?

These attempts are incorrect:

path_soi = Path(sys.executable).parent.joinpath("sphobjinv")
path_soi_textconv = Path(sys.executable).parent.joinpath("sphobjinv-textconv")

Affects

  • tests/test_cli_textconv
  • tests/test_cli_textconv_with_git
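
One way to look up entrypoint paths portably is sketched below. This is a minimal illustration, not the code from the PR: the helper name `resolve_entrypoint` is hypothetical, and it assumes the scripts were installed into the interpreter's own scripts directory (a user-site install, as in the CI output later in this thread, may place them elsewhere, in which case the PATH fallback applies).

```python
import shutil
import sysconfig
from pathlib import Path
from typing import Optional


def resolve_entrypoint(name: str) -> Optional[Path]:
    """Return the absolute path of a console script, or None if not found.

    Hypothetical helper for illustration only.
    """
    # The interpreter's scripts directory: "bin" on POSIX, "Scripts" on Windows.
    scripts_dir = sysconfig.get_path("scripts")
    # shutil.which honours PATHEXT on Windows, so "sphobjinv" can match
    # "sphobjinv.exe". Fall back to a normal PATH search if needed.
    found = shutil.which(name, path=scripts_dir) or shutil.which(name)
    return Path(found).resolve() if found else None


for script in ("sphobjinv", "sphobjinv-textconv"):
    print(script, "->", resolve_entrypoint(script))
```

Unlike joining a name onto `Path(sys.executable).parent`, this does not assume that the scripts sit next to the interpreter or that they lack a file extension.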

@msftcangoblowm
Author

Packages site path: C:\Users\VssAdministrator\AppData\Roaming\Python\Python38\site-packages

site packages: ['c:\hostedtoolcache\windows\python\3.8.10\x64', 'c:\hostedtoolcache\windows\python\3.8.10\x64\lib\site-packages']

user site packages: 'C:\Users\VssAdministrator\AppData\Roaming\Python\Python38\site-packages'

path_scripts: C:\Users\VssAdministrator\AppData\Roaming\Python\Python38\SCRIPTS

@bskinn
Owner

bskinn commented Aug 27, 2024

I would say, take a look at the other CLI tests and see if it's possible to adapt the approach I took with them to your new tests.

It's possible the needs here are different enough that my existing technique is insufficient, but I'd at least give it a shot first.

(That is, if you haven't already.)

@bskinn
Owner

bskinn commented Aug 28, 2024

I'm not sure where the script is; looking at the output of this job step, though, it looks like it's in D:\a\1\s, at least for this run.

You could try the where command -- it's analogous to which on Linux.

@bskinn
Owner

bskinn commented Aug 28, 2024

I'm also not sure why you need to invoke sphobjinv-textconv with the full path to the entrypoint script. It should be on the PATH; can't you just call it without any path specified?

@msftcangoblowm
Author

Will follow your advice and try with a relative path rather than an absolute path. You are right, the venv bin|SCRIPTS folder should already be on the path

- test: for subprocess calls use relative path not absolute path
- test: carefully escape regex metacharacters
- test: print source .inv file and diff. On windows, issue with regex
- test: remove fixture windows_paths
- test: .git/config textconv executable resolve path
- test: shutil.which to resolve path to executable
- test: resolve both soi and soi-textconv executables path
- test: regression when to use resolved or unresolved executable path
- test: read inventory on disk
- test: print diagnostic before assertions
- test: consistently use sphobjinv.cli.load:import_infile so compare apples with apples
Owner

@bskinn bskinn left a comment

A few high-level thoughts/questions.

@@ -8,6 +8,101 @@ and this project follows an extension of
fourth number represents an administrative maintenance release with no code
changes.

### [2.4.1.10] - 2024-08-28
Owner

@msftcangoblowm Unless it strongly benefits your development process, please don't bother with bumping versions and documenting microchanges in the CHANGELOG this way.

I'll be cutting the actual v2.4.0 release only after this PR is merged to main, and anything you've added to the CHANGELOG I'll coalesce and probably squash.

Author

This is fine.

These are actually development prereleases, not administrative prereleases. So these entries are wrong and spammy. The final changelog entries are left to your discretion

Also refrained from making any tags


# .gitattributes
# Informs git: .inv are binary files and which cmd converts .inv --> .txt
path_ga = path_cwd / ".gitattributes"
Owner

Is there a reason you're doing this programmatically, instead of just revising the actual .gitattributes for the project?

Author

@msftcangoblowm msftcangoblowm Aug 29, 2024

Will follow your advice and use the project's .gitattributes file via a fixture

wd.add_and_commit(reason="sphobjinv-textconv", signed=False)

# Act
# make change to .txt inventory (path_dst_dec)
Owner

Is there a reason you're generating the objects.inv to be diffed on the fly?

It seems like it would be a lot easier to prepare one ahead of time and commit it to /tests/resource... that's what the dir is there for.

Author

This is a realistic workflow: a line is added to a .txt, then soi converts it to a .inv.

soi convert -q zlib dec cmp with absolute resource paths did not work on Windows. That issue would not have been found if I had not gone through the entire workflow.

I will isolate that issue into a separate test and then follow your advice on using two pre-built resources.

Owner

Never mind on this -- I was thinking about it incorrectly. git diff acts on the same file over time, not on two different files.

Instead, I'd recommend creating a small, trivial objects_textconv.inv for this, and then making a new version and committing a new version of that same objects_textconv.inv. That will give the two different versions of the files to be git diff-ed.

And then, the git diff command in the test would be something like git diff <sha1>..<sha2> tests/resource/objects_textconv.inv.

I might still not have this quite right, but it's at least closer.

Owner

> soi convert -q zlib dec cmp with absolute resource paths did not work on Windows. That issue would not have been found if I had not gone through the entire workflow.

This doesn't make sense to me, I can use absolute paths on Windows without problems in normal operation mode.

It could be something going funny with the Azure CI environment, but it's concerning that absolute paths aren't working correctly.

Author

In the next commit, followed your advice on fixtures and created

  • tests/resource/objects_attrs_plus_one_entry.inv
  • tests/resource/objects_attrs_plus_one_entry.txt

tests/resource/objects_attrs_plus_one_entry.inv overwrites cwd/objects.inv

Author

This could be a separate PR. Confirm, on Windows and Azure CI, what's the deal with absolute resource paths and soi convert

In this PR, that code has been removed

@@ -0,0 +1,239 @@
"""Class for working with pytest and git.
Owner

Would it work to make these artifacts available to the tests using fixtures, instead of via relative import?

return ret


class WorkDir:
Owner

If this class is indeed specific for working with git, its name should reflect that.

from typing import List


def run(
Owner

It looks like this is a simpler variation of the shell-running capability elsewhere in the test suite. What motivated you to implement your own here?

Author

tests/wd_wrapper.py is a standalone component under the setuptools-scm author's copyright. (Speaking for myself, not for the setuptools-scm author:) it should be fully functional by itself, without dependencies outside the standard library.

The subprocess wrapper function run that I provided replaces the one offered by setuptools-scm in the original tests/wd_wrapper.py. Adding a setuptools-scm dependency for just one function in one module may be a bad idea.

Elsewhere in the test suite, subprocess.check_output is used. It's superseded by subprocess.run.

sphobjinv has no dedicated _run_cmd module.

In test_cli_textconv.py, TestTextconvStdioFail.test_cli_textconv_zlib_inv_stdin uses the input parameter of subprocess.run to pipe in stdin and run the subprocess in one call. It's a little exotic, atypical usage, so it differs from what's normally in a _run_cmd module.

_run_cmd

wd_wrapper.py original
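
For readers unfamiliar with the `input` parameter mentioned above, here is a minimal sketch of the pattern. It is not the wrapper from the PR: the function name `run_with_stdin` is hypothetical, and a tiny Python child process stands in for the real CLI so the example runs anywhere.

```python
import subprocess
import sys


def run_with_stdin(cmd, data: bytes) -> subprocess.CompletedProcess:
    """Run cmd, feed data to its stdin, and capture stdout and stderr."""
    # input= writes the bytes to the child's stdin and then closes it,
    # so no explicit Popen/communicate handling is needed.
    return subprocess.run(cmd, input=data, capture_output=True, check=False)


# Stand-in child process: echoes stdin back in upper case.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
result = run_with_stdin(child, b"objects.inv contents")
print(result.returncode, result.stdout)
```

Leaving `check=False` keeps the return code available for assertions, which matters when a test inspects exit codes rather than error messages.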

Comment on lines +84 to +103
# Ideally, the only place sys.exit calls occur within a codebase is in
# entrypoint file(s). In this case, sphobjinv.cli.core
#
# Each unique custom Exception has a corresponding unique exit code.
#
# Testing looks at exit codes only.
#
# Not the error messages, which could change or be localized
#
# In command line utilities, relaying possible errors is common practice
#
# From a UX POV, running echo $? and getting 1 on error is
# useless and frustrating.
#
# Not relaying errors and giving exact feedback on how to rectify
# the issue is bad UX.
#
# So if the numerous exit codes of 1 look strange, they are; but this is
# a separate issue best solved within a dedicated commit
assert f"EXIT CODES{os.linesep}" in str_out
Owner

I'll definitely be taking a closer look at this, for the project as a whole. I appreciate you pointing it out.

Could you actually duplicate these thoughts into a new ticket on the project?

(I'm aware that the exception handling in the project is substandard and intend to do something about it eventually (see #118), but it hadn't occurred to me to tie an exception hierarchy directly to the CLI exit codes in this way.)
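
The idea of tying an exception hierarchy to CLI exit codes could look roughly like the sketch below. All class names, exit code values, and the `run` stand-in are hypothetical and chosen only for illustration; none of them come from sphobjinv.

```python
import sys


class CliError(Exception):
    """Base class for CLI failures; subclasses override exit_code."""

    exit_code = 1


class InputFileNotFound(CliError):
    exit_code = 2


class InventoryDecodeError(CliError):
    exit_code = 3


def run(path: str) -> str:
    """Stand-in for the real work; raises a specific error on failure."""
    if not path:
        raise InputFileNotFound("no input file given")
    if not path.endswith(".inv"):
        raise InventoryDecodeError(f"not an inventory: {path}")
    return f"converted {path}"


def main(argv) -> int:
    """Translate exceptions into exit codes in exactly one place."""
    try:
        print(run(argv[0] if argv else ""))
    except CliError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return exc.exit_code
    return 0


print(main(["objects.inv"]), main([]), main(["notes.txt"]))
```

With this layout, only the entrypoint needs to call `sys.exit(main(sys.argv[1:]))`, and tests can assert on the returned integer without depending on message text.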

Author

@msftcangoblowm msftcangoblowm Aug 29, 2024

Thanks for accepting the exception handling in cli entrypoints as a worthy issue.

Will create a new issue(s) ... after completing #295

In #296, purposefully avoiding modifying non-textconv modules. Figuring that it's outside of this PR's scope / mandate.

- test: add resources objects_attrs_plus_one_entry.{txt|inv}
- test: compare existing resources. Rather modify then .txt --> .inv
- test: add fixtures res_cmp_plus_one_line is_linux gitconfig gitattributes
- test(test_api_good_nonlocal): provide explicit reasons to skip test
- refactor .git/config append algo
- refactor: print git diff err message
- fix: specify encoding and linesep
@bskinn
Owner

bskinn commented Aug 30, 2024

I did some digging and testing -- a separate sphobjinv-textconv entrypoint that takes the single path argument isn't actually needed.

Combining this Stack Overflow answer with this section of the Git docs, if I add the following to my .gitattributes file:

*.inv diff=objects_inv

And add this to my .git/config:

[diff "objects_inv"]
	textconv = sh -c 'sphobjinv co plain "$0" -'

It works:

(sphobjinv) C:\git\sphobjinv>git diff
diff --git a/.gitattributes b/.gitattributes
index c6f20db..2a23375 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,2 +1,3 @@
 tests/resource/objects_mkdoc_zlib0.inv binary
 tests/resource/objects_attrs.txt binary
+*.inv diff=objects_inv




diff --git a/tests/resource/objects_textconv.inv b/tests/resource/objects_textconv.inv
index 31805f5..19a4323 100644
--- a/tests/resource/objects_textconv.inv
+++ b/tests/resource/objects_textconv.inv
@@ -2,6 +2,6 @@
 # Project: textconv-test
 # Version: 2.3
 # The remainder of this file is compressed using zlib.
-sphobjinv.cli.convert py:module 0 cli/implementation/convert.html#module-$ -
-sphobjinv.cli.convert.do_convert py:function 1 cli/implementation/convert.html#$ -
+sphobjinv.cli.convert23 py:module 0 cli/implementation/convert.html#module-$ -^M
+sphobjinv.cli.convert.do_convert234 py:function 1 cli/implementation/convert.html#$ -^M

So, while I really appreciate all of the effort you've gone to here, I believe you can already do what you want with the above recipe.

Open to correction, though, if I'm missing something.
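
For intuition about what the textconv step hands to git, here is a stdlib-only sketch of the conversion. It assumes the Sphinx v2 inventory layout (four plain-text header lines followed by a zlib-compressed body); the function name `inventory_to_text` is hypothetical, and this is not how sphobjinv itself implements `convert plain`.

```python
import zlib

HEADER_LINES = 4  # v2 inventories start with four plain-text "#" lines


def inventory_to_text(raw: bytes) -> str:
    """Decode a zlib-compressed v2 objects.inv into plain text."""
    # Split off exactly four header lines; the rest is the compressed body.
    parts = raw.split(b"\n", HEADER_LINES)
    if len(parts) <= HEADER_LINES or not parts[0].startswith(
        b"# Sphinx inventory version 2"
    ):
        raise ValueError("not a v2 Sphinx inventory")
    header = b"\n".join(parts[:HEADER_LINES]) + b"\n"
    body = zlib.decompress(parts[HEADER_LINES])
    return (header + body).decode("utf-8")


# Build a tiny sample inventory in memory and round-trip it.
header = (
    b"# Sphinx inventory version 2\n"
    b"# Project: textconv-test\n"
    b"# Version: 2.3\n"
    b"# The remainder of this file is compressed using zlib.\n"
)
body = b"sphobjinv.cli.convert py:module 0 cli/implementation/convert.html#module-$ -\n"
sample = header + zlib.compress(body)
print(inventory_to_text(sample))
```

Because git runs the textconv command on both versions of the file and diffs the resulting text, any converter that emits this plain-text form on stdout is enough to produce readable diffs like the one above.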

  - test: fix use git config to set textconv executable
  - test: add to WorkDir git config list/get/set support
@msftcangoblowm
Author

msftcangoblowm commented Aug 30, 2024

Editing the file manually is easy; doing it via code is nontrivial.

  • On Windows, the path to the sphobjinv executables must be resolved

  • On Windows, the inventory path must be surrounded by single quotes; otherwise bash strips the backslashes, resulting in:
    C:hostedtoolcachewindowsPython3.8.10x64Scriptssphobjinv-textconv.EXE

  • On Windows, double quotes do not protect the backslashes. No idea whether "$0" would work as expected

git config diff.inv.textconv "sh -c 'sphobjinv co plain \"\$0\" -'"
git config diff.inv.textconv

sh -c 'sphobjinv co plain "$0" -'

Bash quote escaping is a nightmare, best avoided. UX-wise, this is an ugly, non-obvious, non-trivial solution that requires advanced bash skills, causes stress, and hides code in a config file.

At the end of the day, the end user needs smooth UX. Details at this level of complexity would have to be hidden.
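
When the `git config` call is made from Python rather than typed into a shell, the outer layer of quoting can be sidestepped by passing an argument list. The sketch below is an illustration only: the helper name `textconv_config_argv` is hypothetical, the driver name `objects_inv` follows the recipe earlier in the thread, and the command is only built and printed, not executed.

```python
import shlex


def textconv_config_argv(driver: str, command: str, scope: str = "--local"):
    """Build the argv for a `git config` call that registers a textconv."""
    if scope not in ("--local", "--global", "--system"):
        raise ValueError(f"unsupported scope: {scope}")
    # Each list element reaches git as one argument, so the textconv
    # command needs no extra escaping for an outer shell.
    return ["git", "config", scope, f"diff.{driver}.textconv", command]


command = "sh -c 'sphobjinv co plain \"$0\" -'"
argv = textconv_config_argv("objects_inv", command)

# shlex.join shows the equivalent, fully quoted shell command line.
print(shlex.join(argv))
```

Inside a repository, the list could be passed to `subprocess.run(argv, check=True)`. The inner `sh -c '...'` quoting is still interpreted when git later runs the textconv command, so this removes only one of the two quoting layers.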

- test: fix inventory path must surround by single quotes
- test: attempt to diagnose WindowsPath getting garbled
- test: file not found create a .gitattributes
- ci: azure-pipelines coverage report omit setup.py
- ci: azure-pipelines need --testall to get 100% coverage
- ci: README.md affected by version. Update version
- test: WorkDir
- test: conftest fixture ensure_doc_scratch
- ci: Azure only issue skip
@bskinn
Owner

bskinn commented Aug 30, 2024

I used my above-noted setup successfully without modification on both Windows (with Git for Windows) and Debian WSL, so it definitely works across those platforms.


What is the goal of editing the files via code? Providing the setup machinery so a user can run a command and the git textconv is set up automatically for them?

I'm confident (but still possibly wrong!) that the sphobjinv-textconv entry point, and thus also all the new testing around it, are not actually needed.

I like this capability a lot, but I don't want to add an 'automatically set up inv textconv' functionality to sphobjinv. It seems like it would be a bottomless source of support requests and edge case fixes.

It seems to me this is best handled by documenting and publicizing the method for setting it up. Different users will want it set up differently, and so the knowledge will be more valuable than automation here anyways.

@msftcangoblowm
Author

msftcangoblowm commented Sep 1, 2024

  1. A simple Python solution is always better than a complicated bash solution. Just because you can doesn't mean you should.

  2. UX-wise, either method, but especially the bash method, needs a setup entrypoint, e.g. sphobjinv-textconv --setup. I could add that to soi-textconv, but didn't, to keep the PR scope limited. For the bash route, I recommend not adding it to the soi entrypoint, which is already complicated.

So it's a UX issue. One command ... done. Much nicer than having to explain the intricacies of a complex sewer system.

@msftcangoblowm
Author

msftcangoblowm commented Sep 1, 2024

> What is the goal of editing the files via code?

Simplicity. Zero learning curve. Doing less work. Type one uncomplicated command ... done.

> Seems like it would be a bottomless source of support requests and edge case fixes

Internally, soi-textconv runs git config. No one would have to manually edit the .git/config (nor the .gitattributes files) or care about the details.

Besides small changes to the parser, all the code for the setup has already been written and is known to work. Just refactor the integration unittest. After a refactor, the complexity of the integration test should be greatly reduced

The scope of soi-textconv is it configures git to understand inventory files. That scope shouldn't change besides the --setup and maybe --global flags.

@bskinn
Owner

bskinn commented Sep 8, 2024

@msftcangoblowm

TL;DR - You've done a lot of work here, but I'm afraid it's too much... more than I'm willing to add to the project. There's still an avenue to a smaller contribution, though.


I'm on board with:

  • A new sphobjinv-textconv entrypoint to allow the git config entry to be cleaner
  • A bit of basic documentation of how to set up the .inv textconv
    • Your current location for it in the docstring for the new cli/core_textconv.py makes sense for now, but I might change my mind
  • A couple of targeted tests to exercise the new entrypoint.

But I don't want to uptake anything more than that:

  • I would want the new entrypoint to be narrowly scoped.
    • It should take no options, and only the single INFILE argument.
    • The main sphobjinv entrypoint should remain the primary, and only configurable, CLI interface surface.
  • For tests:
    • No direct testing of the operation of the git textconv.
    • Nothing new in the test suite driving git or editing .gitattributes or .git/config.
  • For docs:
    • No tutorials on anything git other than brief instructions for setting up the textconv. It's just beyond the scope for the sphobjinv documentation.

So—if you're interested in paring down the PR to just the above (or, starting a new PR to only implement the above, which might be easier/faster), I think there's still a path to a merge here. Let me know if you want to move forward, and I'll leave more thorough comments/feedback on the PR.

I apologize that you went to all this work before getting this feedback, I didn't have any sense before you started that you were envisioning something of this scale. Lesson learned, I should've started a conversation about scale and approach before you started working!


All of this said, I'm still not convinced that the new entrypoint is actually required. Using user-scope .gitconfig and attributes files, and installing sphobjinv with pipx, I've been able to get the textconv working on both Git for Windows on Win11 and on Debian WSL without having a sphobjinv-containing virtualenv activated.

In $HOME/.gitconfig (WSL) & %USERPROFILE%/.gitconfig (Win11):

[diff "objects_inv"]
  textconv = sh -c 'sphobjinv co plain "$0" -'

In $HOME/.config/git/attributes (WSL) & %USERPROFILE%/.config/git/attributes (Win11):

*.inv diff=objects_inv

For WSL (similar but nonidentical on Win11):

$ sudo apt install -y pipx
$ pipx install sphobjinv

As we've already discussed, the benefit of sphobjinv-textconv would be a simpler invocation in the .gitconfig... but not that much simpler, given that either way, it's two canned snips of config for users to add into the right files.

It's really not much more than a quick post on a blog or Medium or wherever.

(I do understand that a magic command to insert those snips for users would make it even simpler for those users; but, that magic command would end up needing to accommodate a wide variety of operating systems and user setups, and I'm confident it would run into all sorts of hairy edge cases that would be a nightmare to debug and develop around. Far better IMO just to educate users on how to set it up, rather than trying to write something that will attempt to automatically set it up for a wide range of users.)

Regardless, again, if you're still interested in putting together the smaller-scope contribution for adding sphobjinv-textconv, I'm on board with working with you toward that.

@msftcangoblowm
Author

> The main sphobjinv entrypoint should remain the primary, and only configurable, CLI interface surface.

If you feel strongly that sphobjinv --setup is better UX than sphobjinv-textconv --setup, I'm totally on board with whatever results in better UX.

This is actually a good idea that did not occur to me.

If the flag should have another name, just suggest a better one.

> It should take no options, and only the single INFILE argument.

The --setup flag has to go somewhere. If it goes onto the sphobjinv CLI, then sphobjinv-textconv need not be changed any further.

> A bit of basic documentation of how to set up the .inv textconv

> No tutorials on anything git other than brief instructions for setting up the textconv. It's just beyond the scope for the sphobjinv documentation.

On board with simple docs that just say: for git diff support, run the optional command sphobjinv --setup. No explanation of the manual method or the nitty-gritty details.

> No direct testing of the operation of the git textconv.

> Nothing new in the test suite driving git or editing .gitattributes or .git/config.

Can move that out of fixtures and into code modules.

> So, if you're interested in paring down the PR to just the above

Yes, I would be interested. I would like to change the current PR rather than make a new one.

Please confirm I can go ahead.

@msftcangoblowm
Author

msftcangoblowm commented Sep 11, 2024

> but, that magic command would end up needing to accommodate a wide variety of operating systems and user setups, and I'm confident it would run into all sorts of hairy edge cases that would be a nightmare to debug and develop around.

When modifying .git/config, sphobjinv-textconv calls git config, so any hairy edge cases could only come from upstream. git config has flags controlling whether those changes apply locally or globally.

We, the maintainers, need to have read and understood the git config manual. Specifically how to get/set.

When modifying .gitattributes, sphobjinv-textconv does the edit itself, but it's one line. The only edge case is whether it needs \r\n or not.
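
A minimal sketch of such a one-line edit, including the line-ending edge case, might look like this. The helper name `ensure_attribute_line` is hypothetical and the code is not taken from the PR; it assumes the file is UTF-8 and that reusing whichever line ending the file already has is the desired behaviour.

```python
import tempfile
from pathlib import Path


def ensure_attribute_line(path: Path, line: str) -> bool:
    """Append line to a .gitattributes file unless it is already present.

    Returns True if the file was changed.
    """
    # Read bytes so that existing "\r\n" endings are not normalised away.
    existing = path.read_bytes().decode("utf-8") if path.exists() else ""
    if line in (entry.strip() for entry in existing.splitlines()):
        return False
    # Reuse the file's line ending; default to "\n" for a new file.
    newline = "\r\n" if "\r\n" in existing else "\n"
    prefix = "" if not existing or existing.endswith("\n") else newline
    # newline="" disables newline translation on write.
    with path.open("a", encoding="utf-8", newline="") as fh:
        fh.write(f"{prefix}{line}{newline}")
    return True


with tempfile.TemporaryDirectory() as tmp:
    attrs = Path(tmp) / ".gitattributes"
    attrs.write_bytes(b"tests/resource/objects_attrs.txt binary\r\n")
    print(ensure_attribute_line(attrs, "*.inv diff=objects_inv"))
    print(ensure_attribute_line(attrs, "*.inv diff=objects_inv"))
    print(attrs.read_bytes())
```

Making the function idempotent means running the setup twice does not add a duplicate line.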

The setup changes are local; affecting one repo. Down the road, someone may request a local/global switch. Global setup goes into ~/.config/git/config and where ever the global .gitattributes is located.

There is a Python package specifically for platform XDG user and site folders, so the platform-specific folder locations will not be an issue. I already have a wrapper module and unit test for this. So copy, paste ... done.

@bskinn
Owner

bskinn commented Sep 15, 2024

What I'm really hesitant about is whether I want to have a --setup (or whatever name) flag in sphobjinv at all. (On rereading what I wrote, I wasn't very clear about this -- apologies there.)

On one hand, I actually just used nbstripout seriously for the first time this week, which has a similar automated --install functionality, and it was really handy that it did it automatically for me.

On the other hand, these manipulations of git config and gitattributes really seem like something that should be encapsulated into their own library, and not rewritten in every library they're needed in. E.g., it seems like nbstripout has likely already implemented most or all of what would be required here (example)... and it would make way more sense to me, and I would be somewhat more inclined toward a sphobjinv-textconv --setup feature, if the git config/attributes manipulations were pulled into a library that --setup could call, instead of being re-implemented directly in the sphobjinv codebase. (Plus, from even a quick scan of the nbstripout code, I noticed multiple things that are sharp edges just waiting for users to eventually cut their fingers on... it's hard to actually do this the right way.)

> Down the road, someone may request a local/global switch.

Yeah... and I could see at least the following sorts of requests popping up, too:

  • Can you make it install system-wide, not global or per-repo?
  • Can you make it tell me where the textconv is installed?
  • Can you make it uninstall the textconv for me?
  • I have a complicated setup, and your --setup broke my git. Can you undo what you did?
  • Can you make it install other gitconfig/gitattributes settings?
  • Can you expose the --setup functionality in the API?

Just a really large possible CLI/API surface that might be requested.


Ultimately, this sort of git config/attributes manipulation is solidly outside sphobjinv's mission, and I really don't want to open the door to issues/features surrounding the config/attributes manipulations themselves being on-topic for its issue tracker.

I'm open to continued discussion, but for now I'm only willing to consider the sphobjinv-textconv entrypoint itself, as an alias for sphobjinv convert plain $0 -, and documentation of the manual process for setting up the textconv.
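
To show how narrow such an alias could be, here is a hypothetical sketch of the argument handling: a single positional INFILE, no options, and delegation to the main CLI. The function names are illustrative assumptions, and the entrypoint in the PR may be implemented differently (for example by calling sphobjinv's internals directly rather than building a command line).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser that accepts exactly one positional argument and no options."""
    parser = argparse.ArgumentParser(
        prog="sphobjinv-textconv",
        description="Print an objects.inv as plain text, for use as a git textconv.",
    )
    parser.add_argument("infile", metavar="INFILE", help="path to the inventory")
    return parser


def delegated_argv(argv):
    """Return the main-CLI command line that the alias stands for."""
    ns = build_parser().parse_args(argv)
    # "-" as the output argument sends the converted inventory to stdout.
    return ["sphobjinv", "convert", "plain", ns.infile, "-"]


print(delegated_argv(["objects.inv"]))
```

With an alias like this, the git config entry reduces to `textconv = sphobjinv-textconv`, since git appends the file path as the single argument.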

@msftcangoblowm
Author

msftcangoblowm commented Sep 16, 2024

You are right.

This should be a separate package, named git-filter, for interacting with git config and gitattributes.

It's not a sphobjinv specific issue. sphobjinv users could benefit from such a package.

git-filter install --local --package=sphobjinv

git-filter remove --local --package=sphobjinv

git-filter list --package=nbstripout

And YAML files describing the changes needed by each package. Was thinking plugins, but that is overkill.

@bskinn
Owner

bskinn commented Sep 24, 2024

Thanks again for the work you did on this, @msftcangoblowm -- sorry it didn't end up being merged.

FYI, I did a bit of searching for other projects that have attempted this sort of git config, and found one: https://github.com/d12frosted/git-config-manager. I'm not sure what language it was written in, but it might be a useful resource.
