Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate Detection: REST Contract submission section & item link #252

Merged
merged 6 commits into from
Mar 4, 2024

Conversation

kshepherd
Copy link
Member

@kshepherd kshepherd commented Jan 21, 2024

Adds REST contract descriptions for the Duplicate Detection feature, which has active PRs.

The initial work on this feature was supported by TU Berlin. This work was supported by FHNW and ZHAW.

Related PRs:

Backend: DSpace/DSpace#9265
Frontend: DSpace/dspace-angular#2749
Replaces: #198

The changes are made to items.md (new item link) and workspaceitems.md (new submission section), with the example data provided in a new workspaceitem-data-duplicates.md file.

items.md Outdated Show resolved Hide resolved
@kshepherd
Copy link
Member Author

@tdonohue i have addressed all review feedback for this feature and @MW3000 has tested the dspace and angular features successfully. The PRs are still waiting on a second approval. Are we able to get another review in for this feature?

@tdonohue tdonohue self-requested a review February 15, 2024 15:30
Copy link
Member

@abollini abollini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a pity to note that this feature is not longer a porting of the DSpace-CRIS detect duplicate feature available since 7.0 (actually version 4 more than 9 years ago).
The feature is much less extend than the one provided by DSpace-CRIS and look to be incompatible with this one. I see a risk in accepting this PR to make DSpace and DSpace-CRIS more far when instead we are trying to take the two projects as close as possible. I expect users to be confused by the differences and subtracting resources in the DSpace / DSpace-CRIS community that would be better spent in managing a convergence among these two projects.

items.md Outdated

### findDuplicates

**GET /api/core/items/search/findDuplicates?uuid=<:uuid>**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This endpoint is not aligned to our REST best practices. A search method in the items collection resource endpoint must return a list of Item Resource.
If we need this endpoint it should be moved to a new endpoint specific for duplicate groups but as the feature is intended to be used only in the submission I don't expect this to be really needed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have moved this to a separate controller

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshepherd : This REST Contract doesn't appear to be updated with your new endpoint/controller. Could you update it please?

Also, take a look at my feedback on the backend PR, as I suggested you might rename this to be something like..

/api/submission/duplicates/search?uuid=

That is modeled similar to the Controlled vocabulary endpoints which are also used primarily in Submission: https://github.com/DSpace/RestContract/blob/main/vocabularies.md

workspaceitem-data-duplicates.md Show resolved Hide resolved
@kshepherd
Copy link
Member Author

I have moved duplicates to a separate controller and documented it as such (see also: REST and angular PRs as linked in description). I have also shifted solr work so it just uses an existing non-tokenized field instead of introducing a _signature field to avoid confusion and conflict.
If the endpoint name should also be changed to avoid confusion or conflict with 4science work coming in the future let me know, the endpoint name/path is of no consequence to me.

Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshepherd : Overall, this looks good. I have a minor reorganization note though below that should be addressed.

submission.md Outdated

## Finding potential duplicate items

**GET /api/submission/duplicates/search?uuid=<:uuid>**
Copy link
Member

@tdonohue tdonohue Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend moving this documentation to a new duplicates.md page. This submission.md page is an overview of how to perform a submission from the REST API. So, this doesn't belong here, as it's not a task you need to do to perform a submission from the REST API.

Instead, this should have it's own "page" in the REST Contract, just like how the /api/submission/vocabularies endpoint (for controlled vocabs) has it's own page in vocabularies.md.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, done. also linked from endpoints.md

@github-actions github-actions bot added the merge conflict PR has a merge conflict that needs resolution label Feb 28, 2024
Copy link

Hi @kshepherd,
Conflicts have been detected against the base branch.
Please resolve these conflicts as soon as you can. Thanks!

@github-actions github-actions bot removed the merge conflict PR has a merge conflict that needs resolution label Feb 28, 2024
@kshepherd
Copy link
Member Author

@tdonohue applied changes and rebased -- i added duplicates.md to the main section of endpoints.md since if this is merged, it will not be experimental/under discussion

Copy link
Member

@abollini abollini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @kshepherd the search method needs a small modification to conform to our practice.

duplicates.md Outdated

## Search

**GET /api/submission/duplicates/search?uuid=<:uuid>**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not aligned to our REST architecture. We should have something like that
/api/submission/duplicates/search/byUUID

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my note in the implementation PR. I think this refactor would "happen naturally" if you were to move your *Controller into a *RestRepository class. So this comment is saying the same sort of thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. the angular data service will need updating too, if i am able to do this -- as i noted in the other PR comment, my understanding was that RestRepositories with searchBy methods etc require an addressable object, which the "duplicate" stubs are not (even though they are related to addressable Items).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kshepherd : I tried to clarify the RestRepository question in your implementation PR here. Please let us know if it still doesn't make sense though.

@kshepherd
Copy link
Member Author

Pushed update with new notes about get single / get all not being implemented, and the new search path

Copy link
Member

@tdonohue tdonohue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Thanks @kshepherd ! This looks good now. Verified that all feedback has been addressed from this PR & the backend PR

@tdonohue tdonohue merged commit 7b39dc0 into DSpace:main Mar 4, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component: submission Related to configurable submission system component: workflow Related to approval workflow or configurable workflow high priority new feature
Projects
No open projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

3 participants