Cedric injest attempt #9

Merged
merged 66 commits into from
Jan 17, 2024

cedricdcc
Contributor

  • Added auto-ingest of files.
  • Added rdfj2 for template design of the SPARQL queries.
  • Created a watcher that checks for updates on the mounted volume.
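As an illustration of the watcher idea in the last bullet, here is a minimal polling sketch. It is not the code from this PR: the function names (`snapshot`, `diff_snapshots`, `watch`), the polling interval, and the choice to compare modification times are all assumptions made for the example.

```python
import time
from pathlib import Path


def snapshot(root):
    """Map every file under root to its last-modified time (nanoseconds)."""
    return {
        str(path): path.stat().st_mtime_ns
        for path in Path(root).rglob("*")
        if path.is_file()
    }


def diff_snapshots(old, new):
    """Return (added, changed, removed) file paths between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, removed


def watch(root, on_change, interval=5.0, max_rounds=None):
    """Poll root every `interval` seconds; call on_change when files differ.

    max_rounds is a hypothetical knob so the loop can be stopped in tests.
    """
    previous = snapshot(root)
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        time.sleep(interval)
        current = snapshot(root)
        added, changed, removed = diff_snapshots(previous, current)
        if added or changed or removed:
            on_change(added, changed, removed)
        previous = current
        rounds += 1
```

Polling is used here because file-system event notifications are not always delivered for volumes mounted into a container; whether that applies to this setup is an assumption.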

laurianvm and others added 30 commits September 8, 2023 07:17
make it easy to get the jupyter token
but moving the log elsewhere
reverses symlink direction to make docker happy (and the other usage satisfied)

applies the poetry build inside docker image build to fix #1
further enhances the request from #3
some path changes along the way (conform the reversed symlinks)
this marks an important milestone for #4 as we can now insert triples!
completes the goal of "test driving the ingest of rdf into graphdb"

fixes #4
@cedricdcc cedricdcc marked this pull request as ready for review November 30, 2023 22:45
Contributor

@marc-portier marc-portier left a comment

minor tweaks now should be easy
the effort to actually split the deref stuff into another branch might not be worth the fuss, but it is surely the better practice, making for clearer follow-up


- name: Commit and push changes
  run: |
    git config --global user.name 'cedricdcc'
Contributor

? this auto-lint at server side is odd -- pls check how things were set up in pykg2tbl as part of the client-side commit (so no need for user in config)

Contributor

rather remove this one and add some todo to have the linting on the client - possibly with git-hook
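A client-side git hook along those lines could look roughly like the sketch below. Everything in it is an assumption for illustration: the choice of `black --check` as the linter, the helper names, and the decision to lint only staged `.py` files. How pykg2tbl actually wires this up is not shown in this thread.

```python
"""Hypothetical pre-commit hook body: lint staged Python files locally."""
import subprocess

# Assumption: the project lints with black; swap in the linter actually used.
LINT_COMMAND = ["black", "--check"]


def select_python_files(paths):
    """Keep only the .py paths from a list of staged file names."""
    return [p for p in paths if p.endswith(".py")]


def staged_files():
    """Ask git for files added, copied, or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def main():
    """Return the exit code the hook should use (non-zero aborts the commit)."""
    files = select_python_files(staged_files())
    if not files:
        return 0  # nothing to lint, allow the commit
    return subprocess.run(LINT_COMMAND + files).returncode
```

Saved as an executable `.git/hooks/pre-commit` script, the file would end with `import sys; sys.exit(main())`. Because the check runs before the commit is created, no `git config --global user.name` step is needed in the server-side workflow.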

- name: Run pytest
  run: |
    cd docker/lwua-ingest/lwua-py
    poetry run pytest ./tests/
Contributor

better to run this kind of stuff via Makefile --> again pykg2tbl shows how to set it up

Contributor

comment still applies

data/project.ttl (outdated, resolved)
data/project.ttl (outdated, resolved)
configs/dereference_test.yml (outdated, resolved)

# Assert
assert graph is not None
# Add more assertions based on what the read_graph function is supposed to
Contributor

indeed!

"../lwua/templates")
# init J2RDFSyntaxBuilder
context = f"{URN_BASE}:ADMIN"
j2rdf = J2RDFSyntaxBuilder(
Contributor

not doable to just reuse the syntax-builder in graph.py?

if so you would be testing that too

Contributor Author

Not really, the template folder location is different

docker/lwua-ingest/lwua-py/tests/test_queries.py (outdated, resolved)
print(query)
to_expect = "INSERT DATA { GRAPH <urn:lwua:INGEST:test_file.txt> { <http://example.com/subject> <http://example.com/predicate> <http://example.com/object> . } }"
# Assert
assert query == to_expect, f"Expected '{to_expect}', but got '{query}'"
Contributor

unsure -- but if RDFlib also supports insert statements onto an in-memory graph -- then you could actually provide a mock_gdb variable that allows for further testing
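Short of executing the statement against an in-memory graph, a lighter option is a test double that only records the updates it receives. The sketch below does not answer the RDFlib question above and does not run any SPARQL. `RecordingStore`, `render_insert`, `ingest_triple`, and `normalise` are hypothetical names standing in for the project's own store client and template rendering.

```python
class RecordingStore:
    """Test double that records SPARQL updates instead of sending them."""

    def __init__(self):
        self.updates = []

    def update(self, query):
        self.updates.append(query)


def render_insert(graph_iri, s, p, o):
    """Hypothetical stand-in for the template rendering step."""
    return f"INSERT DATA {{ GRAPH <{graph_iri}> {{ <{s}> <{p}> <{o}> . }} }}"


def ingest_triple(store, graph_iri, s, p, o):
    """Hypothetical ingest step: render the statement and hand it to the store."""
    store.update(render_insert(graph_iri, s, p, o))


def normalise(query):
    """Collapse whitespace so template formatting changes do not fail the test."""
    return " ".join(query.split())


store = RecordingStore()
ingest_triple(
    store,
    "urn:lwua:INGEST:test_file.txt",
    "http://example.com/subject",
    "http://example.com/predicate",
    "http://example.com/object",
)
```

Passing the store in as an argument lets a test check both that an update was issued and what it contained, without a running GraphDB instance.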

dotenv-example (outdated, resolved)
@cedricdcc
Contributor Author

For future comments on the dereferencing part of the project please use the following gh-issue.

This is a temporary solution until a separate dereferencer branch is created.

@cedricdcc cedricdcc merged commit 804c6a4 into lifewatch:main Jan 17, 2024
Contributor

@marc-portier marc-portier left a comment

Would like to see this added to the codebase, but also add some issues to follow up on the missing parts:

  1. reconsider how the linting works and is applied
  2. introduce a Makefile for tests, lints, docker-build, ...
  3. then use the make test in the testing workflow (not the direct call to pytest)
  4. provide more and better tests as the current ones might give a false feeling of security

@marc-portier marc-portier mentioned this pull request Jan 17, 2024
3 participants