Closes #1483: Set collectedfrom to OpenAIRE in all the relations exported by the IIS #1484
My initial impression was that this would be a simple addition/extension, but it turned out I had to modify quite a substantial part of the exporter module. This is mostly due to having different Spark exporter classes for the different relation types bound to exported entities (patent, software, citation relations) and a whole different stack for plain relations coming from text mining algorithms, covered by classes extending `AbstractRelationBuilderModule`.

I am adding support for the `export_relation_collectedfrom_value` IIS input parameter, which specifies the datasource identifier of all relations exported by the IIS. So every exported relation is going to have `Relation#collectedfrom[0]#key` set to the predefined `OpenAIRE` value and `Relation#collectedfrom[0]#value` set to the `export_relation_collectedfrom_value` input parameter value provided at runtime (already added to the `default-config.xml` file: https://git.icm.edu.pl/openaire/iis-deployment/-/commit/e12d3e4a23517b8e1d5601aa077c0293d8f08d40).

The existing class hierarchy of the exporter modules was changed by introducing `AbstractRelationBuilderModule` to cover `collectedfrom` related processing when building a `Relation` object.

The existing code is simplified by moving the code responsible for building a `Relation` object to the `BuilderModuleHelper` class. The unit test suite is supplemented with the relevant tests in `BuilderModuleHelperTest`, which also add the previously missing tests for other utility methods of the `BuilderModuleHelper` class.

`export_relation_collectedfrom_value` was introduced in multiple `workflow.xml` files, starting from the IIS primary main workflow and going down to the exporter subworkflow.

Integration tests were extended with additional "expectations" encoded in multiple `*.properties` files with the expected `collectedfrom` `key` and `value`.
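To illustrate the `collectedfrom` handling described above, here is a minimal, self-contained sketch. It is not the code from this PR: the `KeyValue` and `Relation` classes below are simplified hypothetical stand-ins for the real schema classes, and `buildRelation` and `COLLECTEDFROM_KEY` are names made up for illustration. The null check on the parameter is also an assumption. The key/value assignment follows the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class CollectedFromSketch {

    // Hypothetical stand-in for the schema's key/value pair type.
    static final class KeyValue {
        final String key;
        final String value;

        KeyValue(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical, heavily simplified stand-in for the exported Relation record.
    static final class Relation {
        final String source;
        final String target;
        final List<KeyValue> collectedfrom = new ArrayList<>();

        Relation(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    // Predefined key, as stated in the description above.
    static final String COLLECTEDFROM_KEY = "OpenAIRE";

    // Builds a relation whose first collectedfrom entry carries the predefined key
    // and the runtime value of export_relation_collectedfrom_value.
    static Relation buildRelation(String source, String target, String collectedFromValue) {
        // Assumption: a missing parameter is treated as an error rather than silently skipped.
        Objects.requireNonNull(collectedFromValue, "export_relation_collectedfrom_value must be set");
        Relation relation = new Relation(source, target);
        relation.collectedfrom.add(new KeyValue(COLLECTEDFROM_KEY, collectedFromValue));
        return relation;
    }

    public static void main(String[] args) {
        // "datasource-id" is a placeholder, not a real datasource identifier.
        Relation relation = buildRelation("source-id", "target-id", "datasource-id");
        KeyValue first = relation.collectedfrom.get(0);
        System.out.println(first.key + " / " + first.value);
    }
}
```

Keeping this logic in one shared helper, as the PR does with `BuilderModuleHelper`, means every exporter path sets `collectedfrom` the same way and the behaviour can be unit tested in one place.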