Use JGit in favor of shelling to git #16
- replace all git execution and parsing with use of the [JGit](http://eclipse.org/jgit) library.
- much faster import and analysis!
Apologies, a mistake fell through the net, corrected in 840bddd.
- replace shallow-tree-walk with deep-tree-walk
- invert control of tree walker: rather than the recursive helper function in commit-tx-data, pass a walker function to deep-tree-walk
- a tree walker signals if the path is new, and if not, the deep-tree-walk will skip over uninteresting subtrees
- add index to :node/object and make check for existing node much more efficient
- fix logic for testing if path is new
- make sure future is deref’ed when transacting a commit’s tx data
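The inversion of control described in this commit message can be illustrated with a toy model. The sketch below is not the code from the pull request: it assumes a tree is a nested map (name to subtree or blob), and `deep-walk-sketch`, `seen`, `visited`, and `walker` are hypothetical names. The walker returns truthy when a path is new, and the walk only descends into subtrees whose path the walker reported as new.

```clojure
;; Toy model only: a tree is a nested map of entry name -> subtree (a map)
;; or blob (any non-map value). The real code walks git trees instead.
(defn deep-walk-sketch
  "Calls (walker path entry) for each entry, depth first. Descends into a
   subtree only when the walker returns truthy for that subtree's path."
  ([tree walker] (deep-walk-sketch tree walker []))
  ([tree walker prefix]
   (doseq [[entry-name entry] tree]
     (let [path (conj prefix entry-name)
           new? (walker path entry)]
       ;; Skip uninteresting subtrees: no descent when the path is not new.
       (when (and new? (map? entry))
         (deep-walk-sketch entry walker path))))))

;; Hypothetical state: paths already known, and a log of visited paths.
(def seen (atom #{["src"]}))
(def visited (atom []))

(defn walker
  "Records the visit and signals whether the path is new."
  [path _entry]
  (swap! visited conj path)
  (not (contains? @seen path)))

(deep-walk-sketch {"README" :blob
                   "src"    {"core.clj" :blob}}
                  walker)
;; "src" is already known, so "src/core.clj" is never visited.
```

Because the walker owns the decision, the walk itself stays generic; the caller can change what counts as "new" without touching the traversal.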
I believe I’ve discovered and corrected a bug in c1977d3. The code originates in 1d862fc. Here is the code from the current master:

    newpath (or (tempid? pathid) (tempid? nodeid)
                (not (ffirst (d/q '[:find ?node :in $ ?path
                                    :where [?node :node/paths ?path]]
                                  db pathid))))

I believe the logic is intended to determine if a path is new with respect to a tree node. The path is new for the node if…
I’m a little thrown by the intention of …

Now, suppose I copy a file … In this scenario, we are in the third case as the …

In c1977d3, I’ve replaced the code above with this:

    newpath (or (tempid? pathid) (tempid? nodeid)
                (every? #(not= nodeid (:e %))
                        (d/datoms db :vaet pathid :node/paths)))

In the third case, this is supposed to express that every node with the given path is distinct from the given node. I’d be happy to hear any further insight on this.
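To make the difference between the two checks concrete, here is a minimal sketch that models them on plain data rather than a Datomic database. Everything in it is an assumption for illustration: datoms are maps with `:e`, `:a`, `:v` keys, negative ids stand in for tempids, and `datoms-for-path`, `newpath-master?`, and `newpath-fixed?` are hypothetical names.

```clojure
;; Illustrative model only, not the codeq implementation.
;; Assumption: negative ids stand in for Datomic tempids.
(defn tempid? [id] (neg? id))

(defn datoms-for-path
  "All :node/paths datoms whose value is pathid (mimics the :vaet lookup)."
  [datoms pathid]
  (filter #(and (= :node/paths (:a %)) (= pathid (:v %))) datoms))

(defn newpath-master?
  "Check as on master: the path is new only if NO node has it."
  [datoms pathid nodeid]
  (or (tempid? pathid) (tempid? nodeid)
      (empty? (datoms-for-path datoms pathid))))

(defn newpath-fixed?
  "Check as in c1977d3: the path is new if THIS node does not have it."
  [datoms pathid nodeid]
  (or (tempid? pathid) (tempid? nodeid)
      (every? #(not= nodeid (:e %)) (datoms-for-path datoms pathid))))

;; Node 10 already has path 100; node 20 exists but lacks that path.
(def example-datoms
  [{:e 10 :a :node/paths :v 100}])

(newpath-master? example-datoms 100 20) ; => false
(newpath-fixed?  example-datoms 100 20) ; => true
```

In this model the two checks disagree exactly when both ids already exist and the path is attached to some other node but not to the node in question.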
no need to put an extra index on :node/object, just use the reverse index
- test a few scenarios to exhibit interesting corner cases
- utilities to perform git operations
- utilities to query and introspect the codeq db
I’ve developed some tests in a9ab20d. In the process I’ve hit some interesting edge cases…

In … Given a base commit …

Furthermore, if you revert a file as part of a commit to a state that exists in history, then the ids for the object, file name, file path, and node all already exist, and so that file will not show up as part of the commit in codeq.

The contrast between these two cases of revert operations is interesting. I suppose one just has to be careful when interpreting a query for commits from blobs or codeqs in the presence of reverts.

In …
This pull request replaces the use of `Runtime.exec` to invoke git with the use of the JGit library. This appears to result in much faster import and analysis times. Aside from the new functions, the major change to `datomic.codeq.core` is the need to pass around an instance of `org.eclipse.jgit.lib.Repository`.