June 2024: Keynote and session 1 notes
manics committed Jun 27, 2024
1 parent 136262e commit eada133
Showing 6 changed files with 196 additions and 79 deletions.
Original file line number Diff line number Diff line change
@@ -1,17 +1,12 @@
# Facilitating researcher access to data with PySyft
# Breakout: Facilitating researcher access to data with PySyft

**Leads**: Dave Buckley (OpenMined)

## Proposal
## Notes

### Summary
Dave gave a quick overview of OpenMined as an organisation and of PySyft, their flagship open-source product.

[OpenMined](https://openmined.org/) will present their [PySyft notebook](https://github.com/OpenMined/PySyft) which allows to "Perform data science on data that remains in someone else's server"
[OpenMined](https://openmined.org)
[PySyft](https://github.com/OpenMined/pysyft)

### Preparation

No required preparation beyond an open mind!

### Target audience

No specific target audience in mind - anybody interested!
The aim is to facilitate "remote data science" (cf. OpenSAFELY) by providing a framework for such services.
@@ -1,14 +1,8 @@
# Do we need an IG working group?

Breakout discussion
# Breakout: Do we need an IG working group?

**Leads**: Amy Tilbrook (UoE)

## Proposal

Breakout discussions are open talks around a topic. They are open ended, and while we hope actions and collaboration arise from them, there is no specific output expected by the end of the session.

### Prompts
## Prompts

- Is there a perception of lack of IG knowledge in UK TRE community ("an IG black hole"?)
- What support would benefit the community?
@@ -21,21 +15,65 @@ A) Specific IG working group? Thoughts on:

B) Is IG an element in ALL working groups? (as per previous group days: "working groups need a purpose, as creating and maintaining one takes effort, many people are interested in everything/all groups"). How could this be supported?

C) Something else?
C) Something else? Previous suggestions for remit of an IG working group:

- Previous suggestions for remit of an IG working group:
- "Something around Information Governance and policies" e.g. ISO27001 and also local e.g. University policies
- Advice from contributors for specific issues -
- Advice from contributors for specific issues
- Aligning strategies for dealing with TRE-specific IG challenges - e.g. AI/ML, commercial access, international access

### Summary

Open discussion on the need of an Information Governance working group, its potential remit and possible members.

### Preparation

No required preparation beyond an open mind!

### Target audience

No specific target audience in mind - anybody interested!
## Notes

- What do we mean by IG? Each organisation uses the term slightly differently, so we don't have a unified conception of what is meant.
- Grampian "it covers the whole process from the minute a researcher contacts us, to when the project data is deleted"
- Technical stuff is seen to be easy, IG stuff is hard - would like to know how we are supposed to interact with all the different projects
- Difference between umbrella TRE IG and project related IG
- IG = context and organisation dependent - might depend on risk appetite or data held.
- Looking to adapt existing processes to a TRE way of working (data access rather than data sharing)
- Definite appetite for support in this area.
- Goes beyond/outside of ethics/regulatory compliance.
- Is there an interest in standardisation? With such context/data related specifics is this even possible?

- For organisations who have been doing this a long time, the differences between controllers/processors and roles for TREs are fairly established.
- Scotland is working towards a federated approach to governance across Scottish TREs to facilitate/streamline data sharing.
- Often issues arise in research governance rather than IG or data protection teams - we can't change the rules/regulations around health research governance.
- Why does it take 2 years to get access to data?
- Does IG encompass data sharing (contracting?)
- legal base issues within NHS SDEs

- IG Challenges:

- Consented vs unconsented projects: How does consent unlock data to be used by a project?
- What advice and guidance can be given when talking to new organisations/other contexts?
- Consented vs. unconsented studies
- Research governance a barrier
- Anonymous vs. identifiable data
- How would the 'IG' world support a person's consent so that data held by a data controller organisation can be accessed by the project they have consented to?
- How would the IG world react to technology options that extend control across federated facilities, speeding up access to data while maintaining as much control and governance oversight as possible?
- What are the rules and how can we articulate them across the board?

- Ideas:
- A group that could give advice on how to negotiate the relationships?
- Work towards TRE specific research governance to support research governance teams - what?
- Playbook for open IG (need to be fairly high level) - e.g. survey of high level workflows of IG processes within 5 safes. How to determine safe people, safe settings etc?
- Have a look at these pages - transparency standards: https://www.abdn.ac.uk/research/digital-research/accessing-data-1688.php and https://www.abdn.ac.uk/research/digital-research/obtaining-permissions-1703.php
- A community where questions could be posed/answered, discussion forum on IG set up, definition.

## Summary

- Technical is easy, IG is hard - because:
- What is IG? What does it encompass (difference between IG and research governance?)
- IG is context, data, organisation dependent (risk appetites)
- Anything created for general use would need to be fairly high level
- Appetite
- to understand what IG set ups there are across the TRE community
- to develop something to support TRE specific research governance/open IG
- to have a forum where IG questions could be asked (both umbrella IG for TREs in general - especially for newer TREs, and project specific IG issues - e.g. consented studies)
- to understand what is specific TRE IG (data access) rather than data sharing IG

Suggested first things for an IG group to do:

- Find drivers/champions/leads
- See Grampian examples of workflows
- Set up survey of
- What does IG mean in your context?
- What workflows/governance processes can be shared?
@@ -1,19 +1,45 @@
# New research: language to use when explaining SDEs and TREs to the public
# Breakout: New research: language to use when explaining SDEs and TREs to the public

**Leads**: Emma Morgan (Understanding Patient Data)

## Proposal

### Summary

[Understanding Patient Data](https://understandingpatientdata.org.uk/)(UPD) has recently published their [final report](https://understandingpatientdata.org.uk/what-words-use) on the What Words To Use project with Research Works, which focused on exploring the best language to use when explaining Secure Data Environments and Trusted Research Environments to the public.

During the event UPD will make a 20 minutes presentation on the project and its results, followed by an open discussion with the community.

### Preparation

No required preparation beyond an open mind!

### Target audience

No specific target audience in mind - anybody interested!
[Understanding Patient Data](https://understandingpatientdata.org.uk/)(UPD) has recently published their [final report](https://understandingpatientdata.org.uk/what-words-use) on the _What Words To Use_ project with Research Works, which focused on exploring the best language to use when explaining Secure Data Environments and Trusted Research Environments to the public.

## Notes

- Emma took us through UPD project: how to explain TREs and related terms to the public, and generate some explainer materials.
- Part 1: Rapid evidence review
- Patients supportive of direction to data access through TREs
- limited evidence on specific aspects of TREs
- Commercial use of data sometimes controversial
- Comms around TREs: explaining TREs is hard
- Lack of consistency in terms used (SDE, TRE, etc.). The variety of names is confusing and needs to be resolved
- 5 Safes useful as conceptual basis
- Benefits of data use key
- Don't assume prior knowledge
- Part 2: workshops
- 7? workshops, 6 participants each, tried to provide a good demographic mix (age, ethnicity, gender, digital exclusion)
- People care about: Is the data identifiable? Who has access? Reassurance that the data is safe. What the data is being used for, and for what purpose/benefit.
- Some consensus in preferences over the use of certain terms/language.
- Part 3: Explainer materials/draft resource: different tiered 'levels' of information for different levels of interest
- 2x workshops
- Interviews with domain expertise to fact-check
- 1st level: Concise description of TRE/SDEs
- 2nd level: Animation being prepared w/story board and voiceover
- 3rd level: more detailed info on specific terms (e.g. 5 Safes)

## Discussion

- How might you use this information/resource?
- Honest broker service in NI, will flag this report with team who are leading on some work on public transparency (funding from UKRI). Liked the way the materials are adaptable for own use
- Works in HDR Global, lower and middle income countries, lots of interest in TREs there. Work could be useful across these different regions, approach could be taken and tested across different regions.
- RDS released a TRE explainer and will tweak it to reflect some of the findings from this work (e.g. overuse of the term 'de-identified').
- Concerns about methodology, findings or resources that would limit you adopting them?
- What do you think about the balance between transparency and accessibility?
- What other topics related to TREs would benefit from PPIE?

## Summary

Presentation then discussion with positive feedback.
There will be an animation that can be voiced over by different TREs with their specifics, accents...

Concerns about resources: trying to make something for everyone but there will always be gaps
10 changes: 5 additions & 5 deletions docs/events/wg_workshops/2024-06-05-june-meeting/index.md
@@ -75,11 +75,11 @@ There will be two sessions on the day of 45 minutes each.

#### Session 1

- [](./workshop-data-processing-tools.md) - workshop
- [](./discussion-information-governance.md) - discussion
- [](./discussion-what-words-to-use.md) - discussion
- [](./workshop-researcher-registry.md) - workshop
- [](./discussion-data-access-pysyft.md) - discussion
- [](./workshop-data-processing-tools.md) - Workshop
- [](./discussion-information-governance.md) - Discussion
- [](./discussion-what-words-to-use.md) - Discussion
- [](./workshop-researcher-registry.md) - Workshop
- [](./discussion-data-access-pysyft.md) - Discussion

#### Session 2

@@ -1,26 +1,43 @@
# Data processing tools

Workshop
# Workshop: Data processing tools

**Leads**: James Friel (University of Dundee), Aida Sanchez (UCL)

## Proposal
Discussion around data processing, de-identification, and cohort building.

## Required preparation

A general understanding of data anonymisation.
The ICO anonymisation guidance & the ADF (anonymisation decision making framework) may be of interest as a grounding in this.

## Target audience

People who work in data de-identification and data providers for TREs

### Prompts
## Prompts

- Risk appetite to deposit data in a TRE - What level on de-identification is comfortable for use within a TRE? e.g truncation, pseudo-anonymization
- Risk appetite to deposit data in a TRE - What level of de-identification is comfortable for use within a TRE? e.g. truncation, pseudo-anonymization
- What do current data processing pipelines look like? And are there pain points in the process?
- What De-identification tools are being used? What has worked? What hasn't?
- What de-identification tools are being used? What has worked? What hasn't?
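
To make the first prompt concrete, the sketch below illustrates the two techniques it names, truncation and pseudonymisation, on a single record. It is an illustration only and was not presented in the session: the field names (`patient_id`, `postcode`, `date_of_birth`, `diagnosis`) and the helper functions are hypothetical, and keyed hashing with HMAC-SHA256 is just one common way to derive a pseudonym.

```python
import hashlib
import hmac


def truncate_postcode(postcode: str) -> str:
    """Keep only the part before the space, e.g. 'AB10 1XG' -> 'AB10'.

    Assumes the postcode is written with a space; otherwise it is returned whole.
    """
    return postcode.strip().upper().split(" ")[0]


def truncate_date_of_birth(iso_date: str) -> str:
    """Reduce a YYYY-MM-DD date to its year."""
    return iso_date[:4]


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the mapping cannot be rebuilt without the key.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with direct identifiers replaced or truncated."""
    return {
        "pseudo_id": pseudonymise(record["patient_id"], secret_key),
        "postcode_area": truncate_postcode(record["postcode"]),
        "birth_year": truncate_date_of_birth(record["date_of_birth"]),
        "diagnosis": record["diagnosis"],  # hypothetical non-identifying field, kept as-is
    }
```

Whether output like this is "comfortable" for a given TRE remains the risk-appetite question the prompt asks; the code only shows the mechanics.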

### Summary
## Notes

Intro discussion around data processing, de-identification , and cohort building.
CPRD (Clinical Practice Research Datalink)

### Preparation
- https://www.cprd.com/cprd-tre-features-guide-users

A general understanding of data anonymisation.
The ICO anonymisation guidance & the ADF (anonymisation decision making framework) may be of interest as a grounding in this.
Canon: have non-opensource tools (DICOM, FHIR, CSV, Free Text, 'omics, Pathology)

- Only available via agreement with https://research.eu.medical.canon/

NetCDF, ArcGIS Enterprise, 100+TB data, SPARK to process data

- Provide data to federated TREs

Plans for using OpenShift. Possible batch schedulers:

- https://www.coreweave.com/blog/sunk-slurm-on-kubernetes-implementations
- https://kueue.sigs.k8s.io/

### Target audience
## Summary

People who work in data de-identification and data providers for TRES
General discussion of approaches and tools used
@@ -1,22 +1,63 @@
# Researcher Registry Project
# Workshop: Researcher Registry Project

**Leads**: Emily Jefferson (HDR UK)
**Leads**: Emily Jefferson (HDR UK), Rachel Tesfaye (Senior Technical Programme Manager, HDR UK)

## Proposal
- Project brief can be accessed [here](https://hdruk.box.com/s/spjh7o8cgr7arejvzna6dg386ndkuv2a)
- Presentation slides can be found [here](https://hdruk.box.com/s/jyyhabcwrks9b6vk7osd4uqwn1opou5y).

### Summary
## Session Details

Researcher Registry Project
The Researcher Registry is one of four [cloud pilot projects](https://www.ukri.org/news/pilot-projects-will-aid-better-and-safer-use-of-data-in-research/) funded by UKRI with the aim of aiding better and safer use of data in research. The project is a partnership initiative working towards an integrated ‘Know Your Researcher’ system, aligned with the ‘Five Safes’ Safe People principle, surfacing historical and current information about a researcher requesting access to a TRE.

HDR is developing a UK-wide standard for a "Safe Person" and also a technical solution to support TREs to be able to see information on Researchers and their organisations to help them make decisions on whether or not a Researcher is "safe".
The registry would also keep a record of which projects researchers have been approved to work on across TREs.
The aims of this project workshop are as follows:

HDR UK would like input into the current design from the community to ensure that it meets needs.
- **Project introduction**: high-level project overview, technical developments (blockchain technology implementation, rules engine development, Identity Document Validation Technology)
- **Researcher and Organisation Registry data flow:** explore key components of the ‘Know Your Researcher’ system
- **Open discussion:** key questions to ensure the capability of the Researcher Registry tool meets the needs of the UK TRE Community and adds value by streamlining existing researcher verification processes (see prompts below)
- **Summary and next steps:** summary of key outputs and actions from the session alongside next steps related to upcoming development and operational milestones

### Preparation
### Required preparation

No required preparation beyond an open mind!
Project brief can be accessed [here](https://hdruk.box.com/s/spjh7o8cgr7arejvzna6dg386ndkuv2a) which outlines the project background, goals, scope, timelines, stakeholders, current common model, and high-level data flow.

### Target audience

No specific target audience in mind - anybody interested!
All UK TRE Community members are welcome to attend this session including Community members with an interest or expertise in the Five Safes framework, particularly Safe People.

### Prompts

- What are the key considerations or concerns surrounding the researcher workflow?
- What are the key considerations or concerns surrounding the organisational workflow?
- What are the key considerations or concerns surrounding manual vs automated researcher verification processes? e.g. up-to-date project information, up-to-date training information, etc.
- What other systems (upstream or downstream) may be impacted by the implementation of this tool?

## Notes

Open discussion on key questions to ensure the capabilities of the project serve the community.

### Researcher workflow

- Account for multiple organisational/institutional links to an individual researcher.
- Given the nature of researchers often wearing multiple hats across different organisations/affiliations, the Registry will have the functionality for users to have a primary/lead organisation alongside adding additional affiliations.
- Delegate sponsorship will also be available for an extra degree of verification via [Identity Document Validation Technology](https://www.gov.uk/government/publications/identity-document-validation-technology/identification-document-validation-technology) (IDVT).

### Manual vs automated processes

- The researcher registry will have a rules engine capability enabling Issuers (TREs/SDEs) to write automated decision models
- Not weight limiting for OPC use case – doesn’t need to be automated, no issue with current manual checks
- What benefit is it yielding, e.g. reducing potential delays in access to data? Security systems could enable quicker access to data for frequent fliers, with thresholds for quicker approvals.
- Depending on the TRE, checks take time for administration staff, so there is a potential efficiency benefit. Another benefit: most organisations have responsibility for their employees' access to sensitive data (including good/bad behaviour), but they have no view of who is currently accessing data, where, and under what process, and there is no way for an organisation's DPO to see all its researchers and which projects they are on. A further benefit: one TRE would be able to see which other TREs a researcher has worked on, giving a better-connected community of TREs.
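
As an illustration of what an automated decision model with a fast-track threshold might look like, here is a minimal sketch. It is not the Researcher Registry's design or API: the `ResearcherRecord` fields, the `Rule` type, the rule names and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ResearcherRecord:
    """Hypothetical summary of what a registry might hold about a researcher."""
    training_in_date: bool
    identity_verified: bool
    approved_projects: int  # prior approvals across TREs


@dataclass
class Rule:
    """A named check that a record must pass."""
    name: str
    check: Callable[[ResearcherRecord], bool]


# Example rules an Issuer (TRE/SDE) might define; purely illustrative.
DEFAULT_RULES: List[Rule] = [
    Rule("training_in_date", lambda r: r.training_in_date),
    Rule("identity_verified", lambda r: r.identity_verified),
]


def evaluate(record: ResearcherRecord, rules: List[Rule], fast_track_threshold: int = 3) -> str:
    """Apply every rule; any failure falls back to a manual check.

    Researchers who pass all rules and have at least `fast_track_threshold`
    prior approvals are flagged for quicker approval.
    """
    failed = [rule.name for rule in rules if not rule.check(record)]
    if failed:
        return "manual_review: " + ", ".join(failed)
    if record.approved_projects >= fast_track_threshold:
        return "fast_track"
    return "standard_approval"
```

Routing every failure to manual review reflects the point raised above that existing manual checks are not a problem in themselves; automation here only shortens the path for cases that clearly pass.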

### User testing

- Early testing of prototype and MVP is a priority
- We will be testing with NHS SDE Network exemplar pilots over the summer as early adopters. We will also test the prototype with the UK TRE Community in autumn and the MVP in winter.
- Also working with SAIL as a delivery partner

## Next steps

- Communicate information around IG and Policy Working Group to interested UK TRE Community members.
- Communicate information around prototype stakeholder reviews and testing to interested UK TRE Community members.
- Potentially host follow-up session at September UK TRE Community meeting.

Please contact [email protected] with any questions or comments.
