Dev #28

Merged: 60 commits, May 21, 2024
Commits
767ba7f
add local notes
karacolada Jan 16, 2024
23bee5e
run CI for dev branch
karacolada Jan 16, 2024
5b87492
update index.php for dev
karacolada Jan 16, 2024
8dd2378
#23 added general metrics to YAML
karacolada Jan 18, 2024
656cd3d
#23 added CESSDA YAML
karacolada Jan 19, 2024
e10e308
#23 added new evaluators
karacolada Jan 19, 2024
d2c8d48
#23 added metric ID to existing evaluators
karacolada Jan 19, 2024
72a3177
#23 FRSM-04 added to minimal_metadata
karacolada Jan 22, 2024
3f59208
#23 added 6+17 to data_provenance, TODO should separate
karacolada Jan 22, 2024
72f24a9
split out FRSM-17
karacolada Jan 22, 2024
f9b6ff9
#23 FRSM-7
karacolada Jan 22, 2024
6c3f4e4
#23 FRSM-8
karacolada Jan 22, 2024
5d6c59c
#23 FRSM-09
karacolada Jan 23, 2024
66baa37
#23 FRSM-09 fix teststring
karacolada Jan 23, 2024
377aeff
#23 FRSM-10
karacolada Jan 23, 2024
235af89
#23 FRSM-12
karacolada Jan 23, 2024
e95aeeb
#23 moved FRSM-15 to its own evaluator, replaced by FRSM-16
karacolada Jan 23, 2024
582db82
#23 added new output models for new evaluators
karacolada Jan 23, 2024
5b971e0
remove github data from license check
karacolada Jan 23, 2024
14dbd5c
fix metadata preservation test
karacolada Jan 23, 2024
c97ae44
add logger to GH harvester
karacolada Jan 23, 2024
fbe9f13
#2 starting
karacolada Jan 23, 2024
d13ee50
started on #2 general-1
karacolada Feb 2, 2024
0e5f050
continue work on file parsing in #2
karacolada Feb 5, 2024
8b73c6e
#2 finished general-2
karacolada Feb 6, 2024
e3f36a9
#15 moved build script into requirements
karacolada Feb 6, 2024
a269d1d
fixed parsing errors
karacolada Feb 16, 2024
ea39e78
#2 finished/fixed 13-C1
karacolada Feb 16, 2024
b3f0f4e
#2 finish C2 and C3
karacolada Feb 28, 2024
a40fcc7
add test ID to debug messages
karacolada Feb 28, 2024
b1e760e
updated metric config for #2
karacolada Feb 28, 2024
7e680a8
make metric file configurable
karacolada Feb 28, 2024
db9e727
solved #25, max check for maturity
karacolada Mar 12, 2024
bd93a0a
fix bug when no license found
karacolada Mar 12, 2024
dcbd9e9
evaluation
karacolada Mar 12, 2024
39c4c01
add URL index to CSV
karacolada Mar 13, 2024
f2a3ddf
API key swapping
karacolada Apr 9, 2024
1c8c454
Update github.ini
karacolada Apr 9, 2024
7364bbe
document GH API token use
karacolada Apr 9, 2024
eab06b8
remove evaluation data
karacolada Apr 10, 2024
e90842c
Merge branch 'master' into dev
karacolada Apr 10, 2024
81ccd4d
fixed ruff linter errors
karacolada Apr 10, 2024
863eff9
tidy up dev branch
karacolada Apr 10, 2024
0410815
Update github.ini
karacolada Apr 10, 2024
e7e5708
dockerised client
karacolada Apr 16, 2024
7fd6854
switch webclient to http/s ports
karacolada Apr 16, 2024
5dbf58e
fix newline in token list
karacolada Apr 16, 2024
fefd154
fix dockerised webclient
karacolada Apr 16, 2024
44f3f59
add restart policy to docker compose
karacolada Apr 22, 2024
120a818
FRSM-15 docs update
karacolada Apr 22, 2024
4e70261
fix Jupyter Notebook bug
karacolada Apr 29, 2024
b66bca4
implement FRSM-16 general tests (#16)
karacolada Apr 29, 2024
985054b
make license header checks configurable
karacolada Apr 29, 2024
c843a9b
token rate limit logging
karacolada Apr 29, 2024
da42ad3
fix timeouts
karacolada Apr 29, 2024
534c356
fix fair letter for FRSM
karacolada May 2, 2024
0847e65
fix fair_letter for FRSM
karacolada May 2, 2024
8fc3dc6
update functional software test
karacolada May 3, 2024
4c6c528
Merge branch 'pangaea-data-publisher:master' into dev
karacolada May 3, 2024
0aadc41
fix linting
karacolada May 3, 2024
3 changes: 2 additions & 1 deletion .gitignore
@@ -29,7 +29,8 @@ fuji_server/helper/catalogue_helper_google_datasearch_copy.py
fuji_server/helper/create_google_cache_db_copy.py

# private config
-fuji_server/config/github.cfg
+fuji_server/config/github.ini
+fuji_server/data/github_api_tokens.txt

# Created by https://www.gitignore.io/api/python,linux,macos

23 changes: 23 additions & 0 deletions README.md
@@ -102,6 +102,28 @@ If you receive the exception `urllib2.URLError: <urlopen error [SSL: CERTIFICATE

F-UJI is using [basic authentication](https://en.wikipedia.org/wiki/Basic_access_authentication), so username and password have to be provided for each REST call which can be configured in `fuji_server/config/users.py`.

#### GitHub API

F-UJI can optionally use the GitHub API to evaluate software repositories hosted on GitHub.
However, unauthenticated requests to the GitHub API are subject to a very low rate limit, so authenticating with a personal access token is recommended.

To create an access token, log into your GitHub account and navigate to <https://github.com/settings/tokens>, either by clicking on the link or through Settings -> Developer Settings -> Personal access tokens -> Tokens (classic). Next, click "Generate new token" and select "Generate new token (classic)" from the drop-down menu.

Write the purpose of the token into the "Note" field (for example, *F-UJI deployment*) and set a suitable expiration date. Leave all the checkboxes underneath *unchecked*.

> Note: When the token expires, you will receive an e-mail asking you to renew it if you still need it. The e-mail provides a link to do so; afterwards, you only need to update the token in the F-UJI configuration as described below. Since renewal is this simple, setting no expiration date for a token is not recommended.

When you click "Generate new token" at the bottom of the page, the new token will be displayed. Make a note of it now.

To use F-UJI with a single access token, open [`fuji_server/config/github.ini`](./fuji_server/config/github.ini) locally and set `token` to the token you just created. When F-UJI receives an evaluation request that uses the GitHub API, it will run this request authenticated as your account.

If you still run into rate limiting issues, you can use multiple GitHub API tokens.
These need to be generated by different GitHub accounts, as the rate limit applies to the user, not the token.
F-UJI will automatically switch to another token when the rate limit is nearly reached.
To set this up, create a local file in [`fuji_server/data/`](./fuji_server/data/), called e.g. `github_api_tokens.txt`, and put all API tokens in it, one token per line. Then, open [`fuji_server/config/github.ini`](./fuji_server/config/github.ini) locally and set `token_file` to the absolute path of your local API token file.

> Note: If you push a change containing a GitHub API token, GitHub will usually recognise this and invalidate the token immediately. You will need to regenerate the token. Please take care not to publish your API tokens anywhere. Even though they have very limited scope if you leave all the checkboxes unchecked during creation, they can allow someone else to run a request in your name.
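
The token switching described in this README section could look roughly like the sketch below. It is not F-UJI's actual implementation: the `TokenRotator` class, the `threshold` default, and the round-robin order are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TokenRotator:
    """Hypothetical helper that cycles through API tokens when quota runs low."""

    tokens: list
    threshold: int = 10  # assumed cut-off, not taken from F-UJI
    index: int = field(default=0, init=False)

    @classmethod
    def from_lines(cls, lines, threshold=10):
        # One token per line; blank lines and surrounding whitespace are ignored.
        tokens = [line.strip() for line in lines if line.strip()]
        if not tokens:
            raise ValueError("no tokens found")
        return cls(tokens, threshold)

    @property
    def current(self):
        return self.tokens[self.index]

    def update(self, remaining):
        """Move to the next token if `remaining` requests is at or below the threshold."""
        if remaining <= self.threshold and len(self.tokens) > 1:
            self.index = (self.index + 1) % len(self.tokens)
        return self.current
```

In use, `remaining` would come from the rate-limit information returned by the GitHub API after each request, and `from_lines` would be fed the lines of the local token file.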

## Development

First, make sure to read the [contribution guidelines](./CONTRIBUTING.md).
@@ -140,6 +162,7 @@ server {
location ~ \.php$ {
include snippets/fastcgi-php.conf;
fastcgi_pass unix:/var/run/php/php8.1-fpm.sock;
fastcgi_read_timeout 3600s;
}

location ~ /\.ht {
16 changes: 16 additions & 0 deletions fuji_server/__init__.py
@@ -8,11 +8,17 @@
from __future__ import absolute_import

# import models into model package
from fuji_server.models.api import API
from fuji_server.models.api_output import APIOutput
from fuji_server.models.any_of_fair_results_results_items import AnyOfFAIRResultsResultsItems
from fuji_server.models.body import Body
from fuji_server.models.code_provenance import CodeProvenance
from fuji_server.models.code_provenance_output import CodeProvenanceOutput
from fuji_server.models.community_endorsed_standard import CommunityEndorsedStandard
from fuji_server.models.community_endorsed_standard_output import CommunityEndorsedStandardOutput
from fuji_server.models.community_endorsed_standard_output_inner import CommunityEndorsedStandardOutputInner
from fuji_server.models.component_identifier import ComponentIdentifier
from fuji_server.models.component_identifier_output import ComponentIdentifierOutput
from fuji_server.models.core_metadata import CoreMetadata
from fuji_server.models.core_metadata_output import CoreMetadataOutput
from fuji_server.models.data_access_level import DataAccessLevel
@@ -27,6 +33,8 @@
from fuji_server.models.data_provenance_output import DataProvenanceOutput
from fuji_server.models.data_provenance_output_inner import DataProvenanceOutputInner
from fuji_server.models.debug import Debug
from fuji_server.models.development_metadata import DevelopmentMetadata
from fuji_server.models.development_metadata_output import DevelopmentMetadataOutput
from fuji_server.models.fair_result_common import FAIRResultCommon
from fuji_server.models.fair_result_common_score import FAIRResultCommonScore
from fuji_server.models.fair_result_evaluation_criterium import FAIRResultEvaluationCriterium
@@ -55,6 +63,8 @@
from fuji_server.models.related_resource import RelatedResource
from fuji_server.models.related_resource_output import RelatedResourceOutput
from fuji_server.models.related_resource_output_inner import RelatedResourceOutputInner
from fuji_server.models.requirements import Requirements
from fuji_server.models.requirements_output import RequirementsOutput
from fuji_server.models.searchable import Searchable
from fuji_server.models.searchable_output import SearchableOutput
from fuji_server.models.semantic_vocabulary import SemanticVocabulary
@@ -64,8 +74,14 @@
from fuji_server.models.standardised_protocol_data_output import StandardisedProtocolDataOutput
from fuji_server.models.standardised_protocol_metadata import StandardisedProtocolMetadata
from fuji_server.models.standardised_protocol_metadata_output import StandardisedProtocolMetadataOutput
from fuji_server.models.test_case import TestCase
from fuji_server.models.test_case_output import TestCaseOutput
from fuji_server.models.unique_persistent_identifier_software import UniquePersistentIdentifierSoftware
from fuji_server.models.unique_persistent_identifier_software_output import UniquePersistentIdentifierSoftwareOutput
from fuji_server.models.uniqueness import Uniqueness
from fuji_server.models.uniqueness_output import UniquenessOutput
from fuji_server.models.version_identifier import VersionIdentifier
from fuji_server.models.version_identifier_output import VersionIdentifierOutput

from importlib.metadata import version

2 changes: 2 additions & 0 deletions fuji_server/config/github.ini
@@ -1,3 +1,5 @@
[ACCESS]
# set equal to access token if available to increase rate limit (usually starts with 'ghp_')
token =
# absolute path to file with tokens, optional for rotating through multiple tokens
token_file =
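
As a rough illustration of how a configuration of this shape might be consumed, the sketch below reads both keys with Python's standard `configparser`. The function name `load_tokens` and the rule that a single `token` is placed ahead of the entries from `token_file` are assumptions, not behaviour confirmed by this diff.

```python
import configparser
from pathlib import Path


def load_tokens(ini_path):
    """Return all tokens configured in an INI file shaped like github.ini."""
    # Interpolation is disabled so that a '%' in a value is read literally.
    config = configparser.ConfigParser(interpolation=None)
    config.read(ini_path)

    token = config.get("ACCESS", "token", fallback="").strip()
    token_file = config.get("ACCESS", "token_file", fallback="").strip()

    tokens = []
    if token_file:
        # The token file holds one token per line; blank lines are skipped.
        lines = Path(token_file).read_text(encoding="utf-8").splitlines()
        tokens = [line.strip() for line in lines if line.strip()]
    if token and token not in tokens:
        # Assumed precedence: the single token is tried first.
        tokens.insert(0, token)
    return tokens
```

With both keys left empty, as in the committed file, the function returns an empty list, which would correspond to unauthenticated requests.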
50 changes: 48 additions & 2 deletions fuji_server/controllers/fair_check.py
@@ -12,14 +12,18 @@
import pandas as pd

from fuji_server import __version__
from fuji_server.evaluators.fair_evaluator_api import FAIREvaluatorAPI
from fuji_server.evaluators.fair_evaluator_code_provenance import FAIREvaluatorCodeProvenance
from fuji_server.evaluators.fair_evaluator_community_metadata import FAIREvaluatorCommunityMetadata
from fuji_server.evaluators.fair_evaluator_data_access_level import FAIREvaluatorDataAccessLevel
from fuji_server.evaluators.fair_evaluator_data_content_metadata import FAIREvaluatorDataContentMetadata
from fuji_server.evaluators.fair_evaluator_data_identifier_included import FAIREvaluatorDataIdentifierIncluded
from fuji_server.evaluators.fair_evaluator_data_provenance import FAIREvaluatorDataProvenance
from fuji_server.evaluators.fair_evaluator_development_metadata import FAIREvaluatorDevelopmentMetadata
from fuji_server.evaluators.fair_evaluator_file_format import FAIREvaluatorFileFormat
from fuji_server.evaluators.fair_evaluator_formal_metadata import FAIREvaluatorFormalMetadata
from fuji_server.evaluators.fair_evaluator_license import FAIREvaluatorLicense
from fuji_server.evaluators.fair_evaluator_license_file import FAIREvaluatorLicenseFile
from fuji_server.evaluators.fair_evaluator_metadata_identifier_included import FAIREvaluatorMetadataIdentifierIncluded
from fuji_server.evaluators.fair_evaluator_metadata_preservation import FAIREvaluatorMetadataPreserved
from fuji_server.evaluators.fair_evaluator_minimal_metadata import FAIREvaluatorCoreMetadata
@@ -28,14 +32,21 @@
FAIREvaluatorPersistentIdentifierMetadata,
)
from fuji_server.evaluators.fair_evaluator_related_resources import FAIREvaluatorRelatedResources
from fuji_server.evaluators.fair_evaluator_requirements import FAIREvaluatorRequirements
from fuji_server.evaluators.fair_evaluator_searchable import FAIREvaluatorSearchable
from fuji_server.evaluators.fair_evaluator_semantic_vocabulary import FAIREvaluatorSemanticVocabulary
from fuji_server.evaluators.fair_evaluator_software_component_identifier import FAIREvaluatorSoftwareComponentIdentifier
from fuji_server.evaluators.fair_evaluator_standardised_protocol_data import FAIREvaluatorStandardisedProtocolData
from fuji_server.evaluators.fair_evaluator_standardised_protocol_metadata import (
FAIREvaluatorStandardisedProtocolMetadata,
)
from fuji_server.evaluators.fair_evaluator_test_cases import FAIREvaluatorTestCases
from fuji_server.evaluators.fair_evaluator_unique_identifier_data import FAIREvaluatorUniqueIdentifierData
from fuji_server.evaluators.fair_evaluator_unique_identifier_metadata import FAIREvaluatorUniqueIdentifierMetadata
from fuji_server.evaluators.fair_evaluator_unique_persistent_identifier_software import (
FAIREvaluatorUniquePersistentIdentifierSoftware,
)
from fuji_server.evaluators.fair_evaluator_version_identifier import FAIREvaluatorVersionIdentifier
from fuji_server.harvester.data_harvester import DataHarvester
from fuji_server.harvester.github_harvester import GithubHarvester
from fuji_server.harvester.metadata_harvester import MetadataHarvester
@@ -352,7 +363,7 @@ def harvest_all_data(self):

def harvest_github(self):
if self.use_github:
-github_harvester = GithubHarvester(self.id)
+github_harvester = GithubHarvester(self.id, self.logger)
github_harvester.harvest()
self.github_data = github_harvester.data
else:
@@ -442,6 +453,34 @@ def check_unique_persistent_metadata_identifier(self):
# self.metadata_harvester.get_signposting_object_identifier()
return self.check_unique_metadata_identifier(), self.check_persistent_metadata_identifier()

def check_unique_persistent_software_identifier(self):
unique_persistent_identifier_check = FAIREvaluatorUniquePersistentIdentifierSoftware(self)
return unique_persistent_identifier_check.getResult()

def check_software_component_identifier(self):
component_identifier_check = FAIREvaluatorSoftwareComponentIdentifier(self)
return component_identifier_check.getResult()

def check_version_identifier(self):
version_identifier_check = FAIREvaluatorVersionIdentifier(self)
return version_identifier_check.getResult()

def check_development_metadata(self):
development_metadata_check = FAIREvaluatorDevelopmentMetadata(self)
return development_metadata_check.getResult()

def check_open_api(self):
open_api_check = FAIREvaluatorAPI(self)
return open_api_check.getResult()

def check_requirements(self):
requirements_check = FAIREvaluatorRequirements(self)
return requirements_check.getResult()

def check_test_cases(self):
test_cases_check = FAIREvaluatorTestCases(self)
return test_cases_check.getResult()

def check_minimal_metatadata(self, include_embedded=True):
core_metadata_check = FAIREvaluatorCoreMetadata(self)
return core_metadata_check.getResult()
@@ -462,6 +501,10 @@ def check_license(self):
license_check = FAIREvaluatorLicense(self)
return license_check.getResult()

def check_license_file(self):
license_check = FAIREvaluatorLicenseFile(self)
return license_check.getResult()

def check_relatedresources(self):
related_check = FAIREvaluatorRelatedResources(self)
return related_check.getResult()
@@ -482,6 +525,10 @@ def check_data_provenance(self):
data_prov_check = FAIREvaluatorDataProvenance(self)
return data_prov_check.getResult()

def check_code_provenance(self):
code_prov_check = FAIREvaluatorCodeProvenance(self)
return code_prov_check.getResult()

def check_data_content_metadata(self):
data_content_metadata_check = FAIREvaluatorDataContentMetadata(self)
return data_content_metadata_check.getResult()
@@ -496,7 +543,6 @@ def check_semantic_vocabulary(self):

def check_metadata_preservation(self):
metadata_preserved_check = FAIREvaluatorMetadataPreserved(self)
-metadata_preserved_check.set_metric("FsF-A2-01M")
return metadata_preserved_check.getResult()

def check_standardised_protocol_data(self):
32 changes: 31 additions & 1 deletion fuji_server/controllers/fair_object_controller.py
@@ -90,6 +90,7 @@ async def assess_by_id(body):
access_level_result = ft.check_data_access_level()
# print('F-UJI checks: license')
license_result = ft.check_license()
license_file_result = ft.check_license_file()
# print('F-UJI checks: related')
related_resources_result = ft.check_relatedresources()
# print('F-UJI checks: searchable')
@@ -98,16 +99,24 @@ async def assess_by_id(body):
ft.harvest_all_data()
uid_data_result = ft.check_unique_content_identifier()
pid_data_result = ft.check_persistent_data_identifier()
upid_software_result = ft.check_unique_persistent_software_identifier()
software_component_result = ft.check_software_component_identifier()
version_identifier_result = ft.check_version_identifier()
development_metadata_result = ft.check_development_metadata()
open_api_result = ft.check_open_api()
requirements_result = ft.check_requirements()
test_cases_result = ft.check_test_cases()
data_identifier_included_result = ft.check_data_content_metadata()
metadata_identifier_included_result = ft.check_metadata_identifier_included_in_metadata()
data_file_format_result = ft.check_data_file_format()
# print('F-UJI checks: data file format')
community_standards_result = ft.check_community_metadatastandards()
data_provenance_result = ft.check_data_provenance()
code_provenance_result = ft.check_code_provenance()
formal_metadata_result = ft.check_formal_metadata()
# print('F-UJI checks: semantic vocab')
semantic_vocab_result = ft.check_semantic_vocabulary()
-ft.check_metadata_preservation()
+metadata_preserved_result = ft.check_metadata_preservation()
standard_protocol_data_result = ft.check_standardised_protocol_data()
standard_protocol_metadata_result = ft.check_standardised_protocol_metadata()
if uid_result:
@@ -118,6 +127,20 @@ async def assess_by_id(body):
results.append(uid_data_result)
if pid_data_result:
results.append(pid_data_result)
if upid_software_result:
results.append(upid_software_result)
if software_component_result:
results.append(software_component_result)
if version_identifier_result:
results.append(version_identifier_result)
if development_metadata_result:
results.append(development_metadata_result)
if open_api_result:
results.append(open_api_result)
if requirements_result:
results.append(requirements_result)
if test_cases_result:
results.append(test_cases_result)
if core_metadata_result:
results.append(core_metadata_result)
if content_identifier_included_result:
@@ -136,10 +159,14 @@ async def assess_by_id(body):
results.append(metadata_identifier_included_result)
if license_result:
results.append(license_result)
if license_file_result:
results.append(license_file_result)
if access_level_result:
results.append(access_level_result)
if data_provenance_result:
results.append(data_provenance_result)
if code_provenance_result:
results.append(code_provenance_result)
if community_standards_result:
results.append(community_standards_result)
if data_file_format_result:
@@ -148,6 +175,8 @@ async def assess_by_id(body):
results.append(standard_protocol_data_result)
if standard_protocol_metadata_result:
results.append(standard_protocol_metadata_result)
if metadata_preserved_result:
results.append(metadata_preserved_result)
debug_messages = ft.get_log_messages_dict()
# ft.logger_message_stream.flush()
summary = ft.get_assessment_summary(results)
@@ -178,6 +207,7 @@ async def assess_by_id(body):
if ft.pid_url:
idhelper = IdentifierHelper(ft.pid_url)
request["normalized_object_identifier"] = idhelper.get_normalized_id()
results.sort(key=lambda d: d["id"]) # sort results by metric ID
final_response = FAIRResults(
request=request,
start_timestamp=starttimestmp,
58 changes: 58 additions & 0 deletions fuji_server/data/software_file.json
@@ -0,0 +1,58 @@
{
"Jenkinsfile": {
"category": [
"automation"
],
"parse": "full",
"pattern": [
"(\\w*/)*Jenkinsfile"
]
},
"README": {
"category": [
"documentation"
],
"parse": "full",
"pattern": [
"(\\w*/)*README(\\.(txt|md))?"
]
},
"docs_directory": {
"category": [
"documentation"
],
"parse": "file_name",
"pattern": [
"(\\w*/)*docs(/\\w*\\.\\w*)*"
]
},
"github_actions": {
"category": [
"automation"
],
"parse": "full",
"pattern": [
"\\.github/workflows/"
]
},
"license_file": {
"category": [
"provenance"
],
"parse": "full",
"pattern": [
"(\\w*/)*LICEN(S|C)E(\\.\\w*)?",
"(\\w*/)*licen(s|c)e(\\.\\w*)?"
]
},
"maven_pom": {
"category": [
"documentation",
"automation"
],
"parse": "full",
"pattern": [
"pom\\.xml"
]
}
}
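
To make the role of the `pattern` entries more concrete, here is a minimal sketch of how repository file paths could be classified against such definitions. It assumes patterns are applied with `re.match` (anchored at the start of the path only); the diff does not show how F-UJI applies them, and both the `classify` function and the trimmed `SOFTWARE_FILES` dictionary are illustrative.

```python
import re

# Subset of the definitions from software_file.json, written as Python raw strings.
SOFTWARE_FILES = {
    "README": {
        "category": ["documentation"],
        "pattern": [r"(\w*/)*README(\.(txt|md))?"],
    },
    "github_actions": {
        "category": ["automation"],
        "pattern": [r"\.github/workflows/"],
    },
    "license_file": {
        "category": ["provenance"],
        "pattern": [r"(\w*/)*LICEN(S|C)E(\.\w*)?", r"(\w*/)*licen(s|c)e(\.\w*)?"],
    },
}


def classify(path, definitions=SOFTWARE_FILES):
    """Return the names of all entries with a pattern matching the start of `path`."""
    matches = []
    for name, spec in definitions.items():
        if any(re.match(pattern, path) for pattern in spec["pattern"]):
            matches.append(name)
    return matches
```

Because `re.match` does not require the whole path to match, a pattern such as `\.github/workflows/` also covers every file below that directory. A stricter variant could use `re.fullmatch`, at the cost of needing patterns that describe the full path.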
2 changes: 1 addition & 1 deletion fuji_server/evaluators/fair_evaluator.py
@@ -112,7 +112,7 @@ def isTestDefined(self, testid):
else:
self.logger.debug(
self.metric_identifier
-+ " : This test is not defined in the metric YAML and therefore not performed -: "
++ " : This test is not defined in the metric YAML and therefore not performed: "
+ str(testid)
)
return False