[Use Case]: Evaluating fitness of WorldFAIR for OHDSI/GIS #324
Comments
The tasks so far do not include developing an upper model, based on Wild's exposome, that we can use to classify all the catalog entries at the dataset level. This idea appears in a presentation I made recently: Here is the presentation. If we had an "upper model" whose categories served as buckets for sorting the datasets, we would be positioned to create a catalog with three levels, following the Arcus schema that the library science group developed at CHOP. In the Arcus model a catalog consists of one or more collections, a collection contains one or more series, and a series consists of one or more files/datasets. INSPIRE has begun to engage with the Arcus group at CHOP, at least conceptually. Here is a presentation they recently made to INSPIRE: |
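A minimal sketch of the catalog / collection / series / dataset nesting described above, with each dataset tagged by an upper-model bucket. This is not the Arcus schema itself: the class names, the `exposome_domain` field, the bucket labels, and the example entries are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    exposome_domain: str  # hypothetical upper-model bucket label


@dataclass
class Series:
    name: str
    datasets: list[Dataset] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    series: list[Series] = field(default_factory=list)


@dataclass
class Catalog:
    name: str
    collections: list[Collection] = field(default_factory=list)

    def datasets_by_domain(self) -> dict[str, list[str]]:
        """Group dataset names by their upper-model bucket."""
        buckets: dict[str, list[str]] = {}
        for collection in self.collections:
            for series in collection.series:
                for dataset in series.datasets:
                    buckets.setdefault(dataset.exposome_domain, []).append(dataset.name)
        return buckets


# Made-up entries, only to show the nesting.
catalog = Catalog(
    name="Example catalog",
    collections=[
        Collection(
            name="Air",
            series=[
                Series(
                    name="Annual summaries",
                    datasets=[
                        Dataset("PM2.5 by county", "general external"),
                        Dataset("Ozone by county", "general external"),
                    ],
                ),
                Series(
                    name="Surveys",
                    datasets=[Dataset("Household fuel use", "specific external")],
                ),
            ],
        )
    ],
)
print(catalog.datasets_by_domain())
```

Keeping the bucket on the dataset rather than on the collection or series means the same upper model can classify entries no matter how they are grouped in the catalog.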
Jay, Doug, and Steve have built schema.org JSON-LD from LinkML (in a different context). They will meet separately to detail that pipeline and will then update the task to push non-functional metadata into JSON-LD after this meeting. There are other apps that run around that process that may be helpful; we will detail these (DB schemas, documentation). DB tables/schemas for capturing this metadata can be generated from schema.org JSON-LD, and natural language descriptions can be generated to describe these tables. Doug is exploring using graphs to analyze these. |
This use case contributes directly to the GIS WG by developing an authoring environment for discovery metadata that will go into the staging database alongside catalog entries.
|
@kzollove, we met on Tuesday. Tim, Doug Fils, Arofan Gregory and Jay were in attendance. We discussed metadata entry using YAML and forms. Tim demonstrated a recently developed DataCite form called DataCite Fabrica. We discussed middleware that would take us to JSON-LD and schema.org. Candidates included LinkML and RML.io technologies. Doug is going to put together a preliminary proposal working with Tim and the various approaches Tim has either used or wants to consider for the metadata entry. I will check with Doug later this week before our Friday meeting on 4/5 to find out our ETA on the proposal. |
@kzollove and @martyalvarez and @AEW0330 and @tibbben and @rtmill, we would like to present next week. We have two candidate authoring solutions. Both will support YAML or spreadsheet input and JSON-LD output, right now at the dataset level but extensible to the variable level. The output is an empty instance of schema.org JSON-LD that can be aligned with any standard (more or less). In one candidate the mapping is embedded in code (probably Python, if I recall). In the other candidate the mapping is declarative. We might want to talk about the maintainability of the two candidates. |
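To make the maintainability question concrete, here is a rough sketch of the declarative style using only the Python standard library. It is not either of the two candidates (neither LinkML nor RML is used here); the column names, the `COLUMN_TO_PROPERTY` table, and the sample row are hypothetical. The point is that the mapping lives in a data structure that could just as well be a YAML file, so changing it does not mean changing code.

```python
import csv
import io
import json

# Hypothetical declarative mapping: spreadsheet column -> schema.org property.
COLUMN_TO_PROPERTY = {
    "title": "name",
    "summary": "description",
    "landing_page": "url",
    "keywords": "keywords",
}


def row_to_jsonld(row, mapping=COLUMN_TO_PROPERTY):
    """Turn one spreadsheet row into a dataset-level schema.org JSON-LD node."""
    node = {"@context": "https://schema.org/", "@type": "Dataset"}
    for column, prop in mapping.items():
        value = (row.get(column) or "").strip()
        if value:  # skip empty cells instead of emitting empty strings
            node[prop] = value
    return node


def sheet_to_jsonld(csv_text, mapping=COLUMN_TO_PROPERTY):
    """Convert every row of a CSV export into a JSON-LD node."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_jsonld(row, mapping) for row in reader]


SAMPLE_SHEET = (
    "title,summary,landing_page,keywords\n"
    "County air quality,Annual PM2.5 means,https://example.org/air,air quality\n"
)

if __name__ == "__main__":
    print(json.dumps(sheet_to_jsonld(SAMPLE_SHEET), indent=2))
```

In the embedded style the same property assignments would be written out as statements inside the conversion function, which is easy to start with but harder for non-programmers to review or extend to the variable level.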
The design for the output schema.org is a little open-ended, as a feature. We have experience with, and are interested in following, the Science on Schema.org metadata guidance endorsed by the ESIP Partner Assembly a couple of years back. This guidance is remarkably cross-domain. The guidance can be found here. Note that some of the guidance is experimental, developed to address a few special use cases. We are thinking the experimental guidance may apply. |
@jaygee-on-github, once you find a time that works for you and Doug Fils (and whoever else should be present), please let us know and @martyalvarez can help set up the presentation on this work. My preference is for a Friday meeting, but will join whenever! Thanks for all your work on this. |
Look forward to talking about these on the scheduled call. Obviously YAML to JSON-LD (RDF) is doable, but so is CSV or just tabular data to RDF. I've been exploring RML (https://rml.io/), which allows for a declarative mapping from tabular (or structured) data to RDF. This would let people work in spreadsheets if they like and if that maps better to their current data model. A forms-based approach could also be used. Things like https://www.kobotoolbox.org/ are also possible alternatives to classic Google Forms. Connecting such transforms with validation via SHACL is another topic that might be of interest. I'll work up examples for the May 17th call. |
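As a rough illustration of the tabular-to-RDF idea and of pairing it with validation, the sketch below turns CSV rows into N-Triples lines and reports rows that lack required values. It uses only the standard library: it is not RML, and the required-column check is a plain Python stand-in for the kind of minimum-count constraint a SHACL shape would express. The column names, the base IRI, and the sample rows are assumptions.

```python
import csv
import io

SCHEMA = "https://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(csv_text, base_iri, columns=("name", "description")):
    """Emit one rdf:type triple per row plus one triple per non-empty column."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_iri}{row['id']}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{SCHEMA}Dataset> .")
        for column in columns:
            value = (row.get(column) or "").strip()
            if value:
                lines.append(
                    f'{subject} <{SCHEMA}{column}> "{escape_literal(value)}" .'
                )
    return lines


def missing_required(csv_text, required=("name", "description")):
    """Return {row id: [missing columns]} for rows that fail the check."""
    problems = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        missing = [c for c in required if not (row.get(c) or "").strip()]
        if missing:
            problems[row["id"]] = missing
    return problems


# Hypothetical spreadsheet export; the first row has no description.
SAMPLE_ROWS = (
    "id,name,description\n"
    "ds1,Air quality,\n"
    "ds2,Tree canopy,Percent cover\n"
)

if __name__ == "__main__":
    print("\n".join(rows_to_ntriples(SAMPLE_ROWS, "https://example.org/dataset/")))
    print(missing_required(SAMPLE_ROWS))
```

Running the check before the transform gives spreadsheet authors feedback in terms of their own columns, while a real SHACL shape would validate the resulting graph instead.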
@kzollove @jaygee-on-github just FYI, we finally published the latest version of the document referenced in the original post on this thread. You can find it here: https://zenodo.org/records/11236871 During the editing of this document I was always keeping in mind how I would connect the UNESCO Ocean InfoHub (OIH) work to these guidelines. Part of the group's follow-on work is to start looking at implementation examples and documenting those. So, I'm happy to look them over in the context of this work as well as OIH. Note that the guidance scopes the use of https://schema.org/StatisticalVariable along with the standard https://schema.org/variableMeasured. I am also meeting with the OBIS group (https://obis.org/) next week to talk about how we could align some of the discrete grid approaches we are working on. OBIS is developing what they call speciesgrids (https://github.com/iobis/speciesgrids) and I have been working on something similar to generate resources like the following (image.png). I am hoping we can generate these products in line with the CODATA recommendations. |
Thanks, Doug. Jay (email reply to Douglas Fils's Jun 13, 2024 comment above)
|
Project Lead:
@jaygee-on-github
Purpose:
This is the specification we will be evaluating to determine its fitness for purpose:
DiscoverabilityDraftForZenodo.pdf
The specification proposes some metadata content that we can use to mark up any digital object for the purpose of discovery. The metadata content has been taken from many standards including Dublin Core, ISO19115-1, schema.org conventions from ESIPFed Science on Schema.org and Ocean Data net, DCAT, DCAT-AP, and FDO Kernel Attributes-2.0.
The specification maps this content into a set of JSON-LD nodes in a knowledge graph. Each node has a property and ultimately a value taken from the use case. The knowledge graph is machine readable and can be queried by a software agent. It can also be validated using SHACL rules in specific use cases.
One use case for this specification is a catalog of datasets. In this context the specification provides mark up and a knowledge graph at both the dataset and the variable levels. Variable level metadata can be more or less advanced.
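A small sketch of what dataset-level and variable-level markup could look like, and how a software agent might query it. The two catalog entries are invented, and the lookup is a plain Python walk over the JSON-LD dictionaries rather than a SPARQL query over a real graph; `variableMeasured`, `PropertyValue`, and `unitText` are schema.org terms, everything else is an assumption.

```python
# Hypothetical catalog entries: one with a structured variable, one with plain text.
CATALOG = [
    {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": "County air quality",
        "variableMeasured": [
            {"@type": "PropertyValue", "name": "PM2.5", "unitText": "ug/m3"}
        ],
    },
    {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": "Tree canopy cover",
        "variableMeasured": "canopy fraction",  # less advanced: text only
    },
]


def as_list(value):
    """JSON-LD allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def variable_names(node):
    """Collect variable names from text or PropertyValue-style entries."""
    names = []
    for variable in as_list(node.get("variableMeasured")):
        if isinstance(variable, str):
            names.append(variable)
        elif isinstance(variable, dict) and "name" in variable:
            names.append(variable["name"])
    return names


def datasets_measuring(nodes, variable_name):
    """Return names of datasets that list the given variable (case-insensitive)."""
    wanted = variable_name.casefold()
    return [
        node.get("name")
        for node in nodes
        if any(name.casefold() == wanted for name in variable_names(node))
    ]


if __name__ == "__main__":
    print(datasets_measuring(CATALOG, "pm2.5"))
```

The second entry shows the "less advanced" end of the range: a bare text value is still discoverable by name, but it carries no unit or other structure for an agent to use.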
Tasks: