
Form-based Entry for JSON-LD #325

Open
kzollove opened this issue Feb 29, 2024 · 2 comments
kzollove commented Feb 29, 2024

What if we were to use the approach developed by EHDEN and GOFAIR, described in Results #3 in this poster:

Here a human populates some YAML, which is converted into JSON-LD.

This is old tech that OHDSI used in a 2020 "studyathon", when the OHDSI community couldn't meet in person.
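The human-populated-YAML idea can be sketched in a few lines. This is a minimal illustration, assuming the YAML has already been parsed into a plain dict (a real pipeline would use a YAML parser); the record fields and the schema.org context are illustrative placeholders, not the project's actual template.

```python
import json

# Stand-in for a human-entered YAML record that has already been
# parsed into a dict by a YAML loader.
entry = {
    "name": "Example Dataset",
    "description": "Illustrative record for form-based entry.",
    "keywords": ["ocean", "acidification"],
}

def to_jsonld(record: dict) -> str:
    """Wrap a flat metadata record in a schema.org JSON-LD envelope."""
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        **record,
    }
    return json.dumps(doc, indent=2)

print(to_jsonld(entry))
```

The point is only that the human-facing format stays simple (flat keys, no `@`-prefixed terms) while the machine-facing JSON-LD is generated mechanically.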

jaygee-on-github commented
It looks like we have a product that is already aligned with the discovery metadata. See here.

Doug Fils is working on a second candidate, so when he is a little further along, we can review both alternatives.

fils commented May 23, 2024

Just wanted to follow up on this issue a bit. As mentioned last week, the RML approach to mapping from a structured source to JSON-LD (RDF) works for sources such as CSV, YAML, and others, so the approach mentioned in this poster is both aligned and compatible with that process.

We can refer to this converted document as a "data graph". Once in that form, many approaches can be employed. These objects can be placed in S3 object stores, either cloud-hosted (AWS or Google) or self-hosted with tools like Minio. One nice feature is that these stores are both a commodity resource and highly reliable. I am also working on a Linked Data Notifications pipeline for uploading that would allow for SHACL validation as well, to ensure the graphs are well formed.
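One small piece of the object-store step can be sketched with the standard library: deriving a stable object key from the document itself, so that re-uploading an identical graph is idempotent. This is a hedged sketch, not the pipeline's actual naming scheme; the `graphs/` prefix is a hypothetical convention, and the real upload and SHACL validation would use an S3 client and a SHACL engine, which are omitted here.

```python
import hashlib
import json

def object_key(jsonld_doc: dict, prefix: str = "graphs/") -> str:
    """Derive a stable object-store key from the canonicalized document.

    Sorting keys and stripping whitespace makes the digest independent
    of dict ordering, so identical graphs always map to the same key.
    """
    canonical = json.dumps(jsonld_doc, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}{digest}.jsonld"

doc = {"@context": "https://schema.org/", "@type": "Dataset", "name": "Example"}
print(object_key(doc))
```

Content-addressed keys like this also make it cheap to detect whether a graph in the store has changed since it was archived.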

Within the UNESCO project we are also defining a procedure to leverage this object-store approach and then publish the graphs to an archive (Zenodo, in our case) with a DOI. This allows for their use and citation in publications, and also provides a credit chain back to the authoritative source(s).

For data science work, you can then pull from the object stores and form your graphs and/or networks for use there.

For the UNESCO (Ocean InfoHub) and NSF (DeCODER) work, we use these JSON-LD documents to build search engines too. These are built from the same set of documents in the S3 stores mentioned above, which can be given a DOI, so the architecture stays simple. This is all being done with simple Docker-based components to allow easy deployment by multiple groups.
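The idea of building search over the same JSON-LD documents can be sketched as a toy inverted index. This is an illustration only, with made-up records and `@id` values; the production systems described here use real search infrastructure, not this dictionary.

```python
import json
from collections import defaultdict

# Two illustrative JSON-LD records, standing in for documents
# pulled from the object store.
records = [
    json.dumps({"@id": "urn:ex:1", "@type": "Dataset",
                "description": "Ocean acidification time series"}),
    json.dumps({"@id": "urn:ex:2", "@type": "Dataset",
                "description": "Coastal temperature survey"}),
]

def build_index(docs):
    """Map each lowercase description token to the @ids containing it."""
    index = defaultdict(set)
    for raw in docs:
        doc = json.loads(raw)
        for token in doc.get("description", "").lower().split():
            index[token].add(doc["@id"])
    return index

index = build_index(records)
print(sorted(index["ocean"]))  # → ['urn:ex:1']
```

Because the index is derived entirely from the archived JSON-LD, the search layer stays disposable: it can be rebuilt from the object store at any time.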

So, for example, this site: https://geocodes.earthcube.org/#/search/?q=ocean+acidification&resourceType=all
is fed with JSON-LD documents we work with from the geoscience community. If you look at a result like
https://geocodes.earthcube.org/#/dataset/urn:geocodes:hydroshare:5cdc0770a40a749ee718d556f153f72bd9e4c138 and scroll to the bottom, the metadata button shows the JSON-LD used in this record.

Similar for Ocean InfoHub:
Search: https://oceaninfohub.org/results/?search_text=ocean+acidification&page=0 where "view JSON-LD source" can be seen with each result.

Finally, these data graphs can also be sent into a triplestore as named graphs, allowing them to be pulled back out and serialized into JSON-LD or YAML again. So we maintain the capacity to serialize and de-serialize them.
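The round-trip guarantee can be shown in miniature with the standard library. A real pipeline would round-trip through a triplestore with RDF tooling; this sketch only demonstrates the lossless serialize/de-serialize property at the JSON level, with a hypothetical graph IRI standing in for the named graph.

```python
import json

# A small data graph; "@id" plays the role of the named-graph IRI
# under which it would be stored in a triplestore.
graph = {
    "@context": "https://schema.org/",
    "@id": "urn:graph:example",
    "@type": "Dataset",
    "name": "Example Dataset",
}

serialized = json.dumps(graph, sort_keys=True)
restored = json.loads(serialized)

# The round trip is lossless: the restored document equals the original.
assert restored == graph
print(restored["@id"])  # → urn:graph:example
```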

Happy to provide the architecture diagrams for these systems if you like. As mentioned, this is all open source and self-hosted, though you can easily leverage the cloud for it as well. The GeoCODES example is AWS-hosted and the UNESCO one self-hosted, but they both use the same basic architecture.
