Index | Description | Folder Name | Link |
---|---|---|---|
0. | SDG Ontology compiled by Dr Nuria B. Puig and E. Mauleon | 0_PuigOntology | Dataset |
6. | Terms by Indicator from SDGIO Ontology | 6_SDGIO_Terms | Link to SDGIO GitHub |
Index | Description | Folder Name | Link |
---|---|---|---|
1. | Mapping from "FP7-4-SD" Project (edited VS and LP) | 1_FP7-4-SD_edited | Link to Project website |
2. | Concepts extracted by the UN LinkedSDG tool from academic publications | 2_LinkedSDG_Concepts | Link to LinkedSDG Tool |
3. | Concepts extracted via ML from SDG Pathfinder documents | 3_SDGPathfiner_DocumentConcepts | Document Collection ; Modelling Description |
4. | Keywords from SDG Pathfinder indicated by the SDG Pathfinder tool itself | 4_SDGPathfinder_Keywords | SDG Pathfinder |
5. | Concepts extracted by the UN LinkedSDG tool from administrative documents | 5_LinkedSDG_DocumentExtracts | Link to LinkedSDG Tool |
7. | Concepts linked to SDGs from EC Policy Documents | 7_EC_Policy_Doc_Terms | Skrynnyk & Stanciauskas (2020, upcoming) |
9. | Keywords from "Science4SDGs" project | 9_SIRIS_Science4SDGs | Link to "Science4SDGs" project |
Index | Description | Folder Name | Link |
---|---|---|---|
8. | FOS linked to NABS areas | 8_NABS_FOS | Link to Eurostat |
10. | A boost of SDG-relevant FOS compiled by PPMI researchers | 10_PPMI_boost | PPMI |
Assigned labels from raw data sources are assembled in two steps:
- Assembling terms

  `AssemblingTerms.py` assembles terms from `raw_data/0_add/` data sources.
  - Term label conflicts from `00_add_validated/` sources are ignored, meaning that if `term_1` is assigned to `SDG_1` by `source_1` and to `SDG_2` by `source_2` → `term_1` is assigned to both.
  - Conflicts for term labels from `01_add_generated/` data sources are managed in two ways:
    - If the conflict is between a validated and a generated term label → the generated term label is discarded while the validated one remains.
    - If the conflict is between generated & generated → both are discarded.

  → produces `InterimTerms.json`

  `{ 'SDG_1': { 'term_1': ['source_1', 'source_2', ...], 'term_2': ['source_1', 'source_3', ...] ... } ... }`
- Assembling OSDG Ontology

  `AssemblingOntology.py` assembles FOS from `InterimTerms.json` and `02_add_all_to_all/` data sources.
  - 2.1. Terms from `InterimTerms.json` are matched to the MAG Fields of Study subset `FOSMAP.json`, which contains over 150 thousand fields.
  - 2.2. Matched FOS are added to the final ontology `FOS-Ontology.json`.
  - 2.3. `02_add_all_to_all/` FOS are added to the final ontology `FOS-Ontology.json`.
  - 2.4. The final ontology `FOS-Ontology.json` is adjusted based on `1_replace/` and `2_remove/`.

  → produces `OSDG-Ontology.json`

  `{ 'SDG_1': ['fos_id_1', 'fos_id_2', ...], 'SDG_2': ['fos_id_3', 'fos_id_4', ...] ... }`
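The conflict rules of the term-assembly step can be sketched as follows. This is a minimal illustration, not the code in `AssemblingTerms.py`: the function name `assemble_terms` and its arguments are hypothetical, and it assumes that a "conflict" means the same term being assigned to different SDGs, and that a generated label agreeing with a validated one is kept.

```python
from collections import defaultdict


def assemble_terms(validated, generated):
    """Merge per-source term labels into {sdg: {term: [sources]}}.

    Hypothetical helper. Both arguments map a source name to that
    source's {sdg: [term, ...]} dictionary (the *_ProcessedKeyTerms.json shape).
    """
    interim = defaultdict(lambda: defaultdict(list))

    # Validated labels: conflicts are ignored, every (sdg, term) pair is kept.
    validated_sdgs = defaultdict(set)  # term -> SDGs assigned by validated sources
    for source, labels in validated.items():
        for sdg, terms in labels.items():
            for term in terms:
                interim[sdg][term].append(source)
                validated_sdgs[term].add(sdg)

    # Collect all generated labels per term before deciding what to keep.
    generated_labels = defaultdict(lambda: defaultdict(list))  # term -> sdg -> sources
    for source, labels in generated.items():
        for sdg, terms in labels.items():
            for term in terms:
                generated_labels[term][sdg].append(source)

    for term, by_sdg in generated_labels.items():
        if term in validated_sdgs:
            # Validated vs generated: drop generated labels that disagree
            # (assumption: agreeing generated labels are kept as extra sources).
            for sdg, sources in by_sdg.items():
                if sdg in validated_sdgs[term]:
                    interim[sdg][term].extend(sources)
        elif len(by_sdg) == 1:
            # All generated sources agree on a single SDG: no conflict.
            sdg, sources = next(iter(by_sdg.items()))
            interim[sdg][term].extend(sources)
        # Otherwise generated vs generated conflict: every label is discarded.

    return {sdg: dict(terms) for sdg, terms in interim.items()}


# Example input with made-up source names and terms.
validated = {
    "source_1": {"SDG_1": ["term_1"]},
    "source_2": {"SDG_2": ["term_1"]},
}
generated = {
    "source_3": {"SDG_3": ["term_1", "term_2"], "SDG_1": ["term_3"]},
    "source_4": {"SDG_4": ["term_2"], "SDG_1": ["term_1", "term_3"]},
}
interim_terms = assemble_terms(validated, generated)
```

In this example `term_2` is dropped entirely because the two generated sources disagree, while `term_1` keeps both validated SDGs and loses the generated `SDG_3` label.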
- `raw_data/`
  - `0_add/`
    - `00_add_validated/`: Expert validated term labels

      → each data source must produce: `*_ProcessedKeyTerms.json`

      `{ 'SDG_1': ['term_1', 'term_2', ...], 'SDG_2': ['term_3', 'term_4', ...], ... }`
    - `01_add_generated/`: Generated term labels

      → each data source must produce: `*_ProcessedKeyTerms.json`

      `{ 'SDG_1': ['term_1', 'term_2', ...], 'SDG_2': ['term_3', 'term_4', ...], ... }`
    - `02_add_all_to_all/`: Expert validated FOS labels

      → each data source must produce: `*_ProcessedFOS.json`

      `{ 'SDG_1': [['fos_id_1', 'fos_name_1'], ['fos_id_2', 'fos_name_2'], ...], 'SDG_2': [['fos_id_3', 'fos_name_3'], ['fos_id_4', 'fos_name_4'], ...], ... }`
  - `1_replace/`: Mapping for FOS SDG label reassignment from `SDG_a` to `SDG_b`

    → each data source must produce: `*_ReplaceFOS.json`

    `{ 'fos_id_1': [['SDG_1', 'SDG_2'], ...], 'fos_id_2': [['SDG_3', 'SDG_4'], ...], ... }`
  - `2_remove/`: FOS to remove from SDG-assigned FOS lists

    → each data source must produce: `*_RemoveFOS.json`

    `{ 'SDG_1': ['fos_id_1', 'fos_id_2', ...], 'SDG_2': ['fos_id_1', 'fos_id_3', ...], ... }`
  - `Blacklist`: Irrelevant FOS

    → each data source must produce: `*_Blacklist.csv`

    fos_id | fos_name |
    ---|---|
    fos_id_1 | fos_name_1 |
    fos_id_2 | fos_name_2 |
    ... | ... |
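The adjustment in step 2.4 can be illustrated with the `*_ReplaceFOS.json` and `*_RemoveFOS.json` shapes listed above. This is a hedged sketch rather than the logic of `AssemblingOntology.py`: the function name `adjust_ontology` is hypothetical, and it assumes that replacements are applied before removals and that a reassignment only takes effect when the FOS is currently assigned to `SDG_a`. The blacklist is not covered here.

```python
def adjust_ontology(ontology, replace, remove):
    """Return a copy of {sdg: [fos_id, ...]} after replace and remove rules.

    Hypothetical helper:
    replace has the *_ReplaceFOS.json shape {fos_id: [[sdg_a, sdg_b], ...]},
    remove has the *_RemoveFOS.json shape {sdg: [fos_id, ...]}.
    """
    # Copy the lists so the input ontology is left untouched.
    adjusted = {sdg: list(fos_ids) for sdg, fos_ids in ontology.items()}

    # Reassign FOS from SDG_a to SDG_b (assumption: applied before removals).
    for fos_id, moves in replace.items():
        for sdg_a, sdg_b in moves:
            if fos_id in adjusted.get(sdg_a, []):
                adjusted[sdg_a].remove(fos_id)
                target = adjusted.setdefault(sdg_b, [])
                if fos_id not in target:
                    target.append(fos_id)

    # Drop the listed FOS from the SDGs they should no longer belong to.
    for sdg, fos_ids in remove.items():
        if sdg in adjusted:
            unwanted = set(fos_ids)
            adjusted[sdg] = [f for f in adjusted[sdg] if f not in unwanted]

    return adjusted


# Example input with made-up identifiers.
ontology = {"SDG_1": ["fos_id_1", "fos_id_2"], "SDG_2": ["fos_id_3"]}
replace = {"fos_id_1": [["SDG_1", "SDG_2"]]}
remove = {"SDG_2": ["fos_id_3"]}
adjusted = adjust_ontology(ontology, replace, remove)
```

Here `fos_id_1` moves from `SDG_1` to `SDG_2`, and `fos_id_3` is then removed from `SDG_2`.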