[TLC-674] Update duplicate search path to /api/submission/duplicates,…
… include in submission.md
kshepherd committed Feb 28, 2024
1 parent 8feb9dc commit c68000a
Showing 2 changed files with 93 additions and 99 deletions.
98 changes: 0 additions & 98 deletions duplicates.md

This file was deleted.

94 changes: 93 additions & 1 deletion submission.md
@@ -32,7 +32,7 @@ This is the WorkspaceItem object you created.
It is **important** to keep the `id` of the WorkspaceItem, as this is necessary to update it or access it again.
For example, using the `id`, you can load up the current state of your WorkspaceItem
```
GET /api/sumission/workspaceitems/<:id>
GET /api/submission/workspaceitems/<:id>
```

In the response, you'll see a list of `sections` which are available to complete for this WorkspaceItem.
@@ -66,3 +66,95 @@ The final Item's UUID will be the same as it was in the WorkspaceItem (i.e. the
`/api/submission/workspaceitems/<:id>/item`)
* If the Collection has an approval workflow configured, then a WorkflowItem will be returned. Its `id` can be used
to access the WorkflowItem via `/api/workflow/workflowitems/<:id>`.

## Finding potential duplicate items

**GET /api/submission/duplicates/search?uuid=<:uuid>**

If duplicate detection is enabled, returns a list of items that may be duplicates of the item identified by the `uuid` parameter.
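
As a minimal sketch of how a client might build this request URL: the helper name `duplicates_search_url` and the base URL below are hypothetical and not part of the REST contract. Only the path and the `uuid` parameter come from the endpoint described above.

```python
from urllib.parse import urlencode


def duplicates_search_url(base_url: str, item_uuid: str) -> str:
    """Build the duplicate search URL for a given item UUID.

    `base_url` is an assumption: the root under which the REST API is served.
    """
    query = urlencode({"uuid": item_uuid})
    return f"{base_url.rstrip('/')}/api/submission/duplicates/search?{query}"


# Hypothetical server root, used for illustration only.
print(duplicates_search_url(
    "https://repository.example.org/server",
    "5ca83276-f003-460d-98b6-dd3c30708749",
))
```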

The potential duplicates in the response have all been detected by a Solr search that compares the
Levenshtein edit distance between the in-progress item title and other item titles, after normalisation.
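
To illustrate the metric only: the matching itself is performed by Solr, not by code like this. The `normalise` helper is a hypothetical stand-in (lowercasing and collapsing whitespace), since the actual normalisation rules are not described here.

```python
def normalise(title: str) -> str:
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    return " ".join(title.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# One substituted character separates these two titles.
print(levenshtein(normalise("Example Item"), normalise("Example Itom")))  # 1
```

A small distance relative to the title length suggests the two titles differ only by a typo, which is the kind of near-match this feature is meant to surface.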

Note that although this endpoint appears in the submission category, the `uuid` parameter can also refer to an archived item.
It is categorised under submission because the only current frontend use of this feature is in workspace and workflow.

Each potential duplicate has the following attributes:

* `title`: The item title
* `uuid`: The item UUID
* `owningCollectionName`: Name of the owning collection, if present
* `workspaceItemId`: Integer ID of the workspace item, if present
* `workflowItemId`: Integer ID of the workflow item, if present
* `metadata`: A list of metadata values copied from the item, as per configuration
* `type`: The value is always `DUPLICATE`. This is the 'type' category used for serialization/deserialization.

See `dspace/config/modules/duplicate-detection.cfg` for configuration properties and examples.

Example response:

```json
{
  "potentialDuplicates": [
    {
      "title": "Example Item",
      "uuid": "5ca83276-f003-460d-98b6-dd3c30708749",
      "owningCollectionName": "Publishers",
      "workspaceItemId": null,
      "workflowItemId": null,
      "metadata": {
        "dc.title": [
          {
            "value": "Example Item",
            "language": null,
            "authority": null,
            "confidence": -1,
            "place": 0
          }
        ],
        "dspace.entity.type": [
          {
            "value": "Publication",
            "language": null,
            "authority": null,
            "confidence": -1,
            "place": 0
          }
        ]
      },
      "type": "DUPLICATE"
    },
    {
      "title": "Example Itom",
      "uuid": "32f8f6e4-c79e-4322-aae7-07ee535f70a6",
      "owningCollectionName": null,
      "workspaceItemId": 51,
      "workflowItemId": null,
      "metadata": {
        "dc.title": [
          {
            "value": "Example Itom",
            "language": null,
            "authority": null,
            "confidence": -1,
            "place": 0
          }
        ]
      },
      "type": "DUPLICATE"
    },
    {
      "title": "Exaple Item",
      "uuid": "0647ff45-48f5-4c1b-b6d7-f5dbbc160856",
      "owningCollectionName": null,
      "workspaceItemId": 52,
      "workflowItemId": null,
      "metadata": {
        "dc.title": [
          {
            "value": "Exaple Item",
            "language": null,
            "authority": null,
            "confidence": -1,
            "place": 0
          }
        ]
      },
      "type": "DUPLICATE"
    }
  ]
}
```
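
A response of this shape could be consumed roughly as follows. This is a sketch, not a reference client: the function name `summarise_duplicates` and the `state` labels are hypothetical, and the sample payload is a trimmed copy of the example above.

```python
import json


def summarise_duplicates(response_text: str) -> list:
    """Return (title, uuid, state) tuples for each potential duplicate."""
    payload = json.loads(response_text)
    summaries = []
    for duplicate in payload.get("potentialDuplicates", []):
        if duplicate.get("workflowItemId") is not None:
            state = "workflow"
        elif duplicate.get("workspaceItemId") is not None:
            state = "workspace"
        else:
            # Assumption: with neither ID present the item is not in progress
            # (for example, it may be archived).
            state = "neither"
        summaries.append((duplicate["title"], duplicate["uuid"], state))
    return summaries


sample = """
{
  "potentialDuplicates": [
    {"title": "Example Item", "uuid": "5ca83276-f003-460d-98b6-dd3c30708749",
     "workspaceItemId": null, "workflowItemId": null, "type": "DUPLICATE"},
    {"title": "Example Itom", "uuid": "32f8f6e4-c79e-4322-aae7-07ee535f70a6",
     "workspaceItemId": 51, "workflowItemId": null, "type": "DUPLICATE"}
  ]
}
"""

for title, uuid, state in summarise_duplicates(sample):
    print(f"{title} ({uuid}): {state}")
```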
