Discussion: How does the correlation id work? Do we need to change it? #3335
Comments
I went to demo.dspace.org and looked into the network and storage sections of the web developer tools.
From my understanding of the dspace-angular source code: we create the correlation id via JavaScript if we are not able to load it from the store or from cookies. The correlation id is stored for as long as the browser session exists. It is not regenerated unless we lose the stored value, which happens when the browser expires the session or someone deletes the correlation id manually.
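To make the behaviour described above concrete, here is a minimal sketch of the "load it, otherwise generate it" logic. It is not the actual dspace-angular code: `KeyValueStore`, `InMemoryStore`, `CORRELATION_ID_KEY`, `randomId` and `getOrCreateCorrelationId` are hypothetical names, and the in-memory store only stands in for the cookie or the store.

```typescript
// Hypothetical storage abstraction standing in for the cookie / store.
interface KeyValueStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// In-memory stand-in so the sketch runs without a browser.
class InMemoryStore implements KeyValueStore {
  private readonly values: { [key: string]: string } = {};
  get(key: string): string | undefined { return this.values[key]; }
  set(key: string, value: string): void { this.values[key] = value; }
}

// Assumed key name, chosen for illustration only.
const CORRELATION_ID_KEY = 'CORRELATION-ID';

// UUID-v4-shaped random id. Math.random is not cryptographically secure;
// it is used here only as a stand-in for a real UUID generator.
function randomId(): string {
  return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, (c) => {
    const r = Math.floor(Math.random() * 16);
    const v = c === 'x' ? r : (r & 0x3) | 0x8;
    return v.toString(16);
  });
}

// Reuse the stored id if there is one; otherwise generate and store a new one.
function getOrCreateCorrelationId(
  store: KeyValueStore,
  generate: () => string = randomId
): string {
  const existing = store.get(CORRELATION_ID_KEY);
  if (existing !== undefined && existing !== '') {
    return existing;
  }
  const created = generate();
  store.set(CORRELATION_ID_KEY, created);
  return created;
}
```

With this shape, a new id only appears once the stored value is gone, which matches the "not regenerated unless we lose the stored value" observation.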
The correlation id might still qualify as a strictly necessary cookie. We use it to investigate and track down bugs. I just don't know how GDPR defines strictly necessary cookies and tracking.
@abollini Can you please take a look at this? Do you have further information about the correlation id and the GDPR?
The correlation id is generated by Angular at the first access if one is not already available. This means that multiple tabs or windows will share the same correlation id. The goal is to eventually have a "session id" to understand what is going on in the log. The application can work without this cookie (or the x-correlation-id header). My recommendation is to include the cookie in our "klaro" consent management as an optional one, to help our system administrators better understand how the platform is used and troubleshoot issues. BTW: our x-correlation-id and x-request-id are custom-made solutions to the problem of distributed tracing, for which the rest of the world uses traceId and spanId. It would be nice for DSpace to move to the "standard" OpenTelemetry approach (https://opentelemetry.io/). Unfortunately, we haven't yet had the time to investigate this more deeply.
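As a rough illustration of treating the correlation id as optional, the sketch below only attaches the header when the user has consented and an id exists. Only the header name comes from this thread. `TracingContext`, `consentGiven` and `buildRequestHeaders` are hypothetical, and how the flag would be read from the consent manager is deliberately left out.

```typescript
// Hypothetical view of the tracing state. `consentGiven` is assumed to be
// derived from the consent manager; that lookup is not shown here.
interface TracingContext {
  consentGiven: boolean;
  correlationId?: string;
}

type HeaderMap = { [name: string]: string };

// Copy the base headers and add X-CORRELATION-ID only when allowed and available.
function buildRequestHeaders(ctx: TracingContext, base: HeaderMap = {}): HeaderMap {
  const headers: HeaderMap = {};
  for (const name of Object.keys(base)) {
    headers[name] = base[name];
  }
  if (ctx.consentGiven && ctx.correlationId) {
    headers['X-CORRELATION-ID'] = ctx.correlationId;
  }
  return headers;
}
```

Requests built without consent simply omit the header, which is consistent with the application working without it.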
In today's DevMtg we discussed this ticket in more detail. General consensus was that there seem to be two tasks here (also summarized in @abollini's notes above):
I don't have the required discussion context, but I asked our GDPR department whether we could store emails in logs for audit purposes. Here is my translation of their answer (I will omit some parts):
I would also like to add that a national data protection law forces us to temporarily store (for audit purposes) all of a user's activity when dealing with personal data (like CRUD operations). Hope it helps in this discussion!
@paulo-graca: Thank you for taking this question back to your GDPR department. That's extremely useful information. It also seems to align well with the idea of possibly replacing the email in logs with the EPerson UUID (mentioned in the second bullet of my prior comment). Replacing emails in logs with the EPerson UUID would provide the "pseudonymization" that your GDPR department mentioned. It would mean the logs by themselves are "anonymous" (meaning you cannot figure out who anyone is if you only have access to the logs). But a System Administrator could still audit the logs by matching up the EPerson UUID from the database to their activities in the logs. So, your feedback seems to say that we are heading in the right direction. It also verifies that storing the email in the logs isn't necessarily wrong. But DSpace sites need to be aware that emails can be found in the logs, so they can treat their logs appropriately. That said, it does seem like it'd be better to switch to using the EPerson UUID instead of the email.
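A small sketch of that pseudonymization idea, for illustration only: the log line carries the UUID, and only someone with access to the user records can map it back to a person. `EPersonRef`, `formatLogLine` and `resolveUser` are hypothetical names and the log format is invented. The real logging happens in the Java backend, so this only shows the principle.

```typescript
// Hypothetical minimal user record: the UUID is what gets logged,
// the email stays in the database only.
interface EPersonRef {
  uuid: string;
  email: string;
}

// Build a log line that identifies the user by UUID instead of email.
function formatLogLine(action: string, user: EPersonRef): string {
  return `user=${user.uuid} action=${action}`;
}

// Audit step: resolve a UUID from the logs against the user records.
// `directory` stands in for a database lookup.
function resolveUser(uuid: string, directory: EPersonRef[]): EPersonRef | undefined {
  for (const person of directory) {
    if (person.uuid === uuid) {
      return person;
    }
  }
  return undefined;
}
```

The logs alone reveal no email address, while an administrator with database access can still audit a specific account.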
In DSpace/DSpace#3303 we improved the logging of REST requests. As part of this, the backend started to log a correlation id if one was submitted in the request in an HTTP header called X-CORRELATION-ID. It also logs the page that triggered the request against the REST API if a uuid is submitted in a header called X-REFERRER. While the aforementioned PR implemented this in the backend, #1255 implemented it in the frontend. In #1465 the place to store the correlation id in the frontend was changed. Furthermore, we have an open issue that this is not documented in the REST contract: DSpace/RestContract#245.

During a DSpace developer meeting, questions about the correlation id came up: