Data pump: tasklistitem difference #628

Open
milanmajchrak opened this issue Apr 10, 2024 · 2 comments

Comments

@milanmajchrak
Collaborator

After import:
v7: cwf_pooltask = 2
v5: tasklistitem = 14

NOTE:

  • Data pump logged 14 imported tasklistitems
  • The tasklistitem table may have been split up, or it may work differently in v7. I think we have already looked into this strange difference.
  • Python Issue: 80-PY/import-tasklistitem dspace-import#35
@Paurikova2
Collaborator

@milanmajchrak
The number of values imported into cwf_pooltask differs from the number in tasklistitem because tasklistitem joins the workflow item with an eperson, while cwf_pooltask joins the workflow item with a group.

A workflow item is assigned to a collection, and each collection has an assigned group. The name of the group is based on the assigned collection and the workflow step. For example, in the clarin-dspace database the workflow item with id 2142 is assigned to the collection with id 2, and this collection is assigned to group 19, because the group's name is COLLECTION_2_WORKFLOW_STEP_2 [object_id_workflowStep].
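
The naming rule can be sketched in a few lines of Python. This is only an illustration: `workflow_group_name` is a hypothetical helper, not a function from DSpace or from the import scripts, and the pattern is taken from the group name quoted above.

```python
def workflow_group_name(collection_id, step):
    """Build a workflow group name from a collection id and a step number.

    Hypothetical helper; the pattern follows COLLECTION_2_WORKFLOW_STEP_2.
    """
    return f"COLLECTION_{collection_id}_WORKFLOW_STEP_{step}"


# Collection 2, workflow step 2, as in the clarin-dspace example.
print(workflow_group_name(2, 2))  # COLLECTION_2_WORKFLOW_STEP_2
```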

In the DSpace database, the workflow item is connected with the collection and the group in the same way as in the clarin-dspace database.

The tasklistitem table assigns an eperson to a workflow item, whereas a cwf_pooltask record is assigned not to an eperson but to the COLLECTION_[uuid]_WORKFLOW_STEP_2 group. It follows that all epersons listed in tasklistitem for one workflow item have to be members of the group that cwf_pooltask assigns to that workflow item. In tasklistitem there are 2 workflow items with 7 epersons assigned to each of them (14 rows), so in cwf_pooltask we have only 2 records, and the assignment of the epersons to that group is stored in the epersongroup2eperson table. Since both workflow items are assigned to the same collection, both records in the cwf_pooltask table are assigned to the same group, and the epersongroup2eperson table has 7 more records.
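
The counts can be reproduced with a small Python sketch. It is not the data pump code: the function `convert`, the second workflow item id 2143, and the eperson ids 1 to 7 are made up for illustration, and it assumes that the same 7 epersons are assigned to both workflow items. Only workflow item 2142, collection 2, and group 19 come from the example above.

```python
# v5-style rows: (workflow_item_id, eperson_id).
# 2 workflow items x 7 epersons = 14 tasklistitem rows (sample data, assumed).
tasklistitem = [(wf, ep) for wf in (2142, 2143) for ep in range(1, 8)]

# Both workflow items belong to collection 2 (assumed for 2143).
workflowitem_collection = {2142: 2, 2143: 2}

# Collection 2 uses group 19 (COLLECTION_2_WORKFLOW_STEP_2) for this step.
collection_step_group = {2: 19}


def convert(rows, item_to_collection, collection_to_group):
    """Collapse per-eperson rows into per-group pool tasks and memberships."""
    pooltasks = set()    # (workflow_item_id, group_id), like cwf_pooltask
    memberships = set()  # (group_id, eperson_id), like epersongroup2eperson
    for wf_id, ep_id in rows:
        group_id = collection_to_group[item_to_collection[wf_id]]
        pooltasks.add((wf_id, group_id))
        memberships.add((group_id, ep_id))
    return sorted(pooltasks), sorted(memberships)


pooltasks, memberships = convert(
    tasklistitem, workflowitem_collection, collection_step_group
)
print(len(tasklistitem), len(pooltasks), len(memberships))  # 14 2 7
```

The sets drop the repeats: each workflow item yields one pool task, and each eperson is added to the shared group only once.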

A less detailed description is in this issue:
dataquest-dev/dspace-import#35

@milanmajchrak
Collaborator Author

OK, I think I understand. Thank you.
