Improve link from run parameters table to run table #20

donkirkby opened this issue Oct 1, 2014 · 0 comments

The Problem

We currently join the two tables this way:

miseqqc_runparameters.experimentname = lab_miseq_run.runname

Of course, it's not that easy, because the two fields are dates in string format that were entered by the user. This results in a join like this:

TO_DATE(L.RUNNAME, 'DD-MON-YYYY') = TO_DATE(REGEXP_SUBSTR(EXPERIMENTNAME, '\\d{1,2}-\\w{3,4}-\\d{2,4}'))

As ugly as that code is, it still breaks when someone decides that the date in the sample sheet is "wrong", and edits it after the run has started. Now the date in the miseqqc_runparameters table has the old value, and the date in the lab_miseq_run table has the new value. The report will not find the data for that run.
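The failure mode can be reproduced outside the database. This is only a rough sketch: `parse_run_date` is a hypothetical helper, the sample strings are made up, and it assumes English month abbreviations and four-digit years, which is stricter than the regular expression in the real join.

```python
import re
from datetime import datetime

# Same pattern as the REGEXP_SUBSTR call in the join above.
DATE_PATTERN = re.compile(r"\d{1,2}-\w{3,4}-\d{2,4}")


def parse_run_date(text):
    """Return the first DD-MON-YYYY date found in text, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    try:
        # %b depends on the locale; an English locale is assumed here.
        return datetime.strptime(match.group(0), "%d-%b-%Y").date()
    except ValueError:
        # Matched the pattern but is not a valid date (e.g. "Sept", 2-digit year).
        return None


# Made-up values: the QC upload kept the original experiment name,
# while the run name was "corrected" by the user afterwards.
experiment_name = "01-Oct-2014 test run"
run_name_after_edit = "02-Oct-2014"

uploaded_date = parse_run_date(experiment_name)
edited_date = parse_run_date(run_name_after_edit)
join_matches = uploaded_date == edited_date  # False: the report loses the run
```

Both strings parse cleanly, so nothing raises an error; the two rows simply stop matching.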

Proposed Solution

Foreign keys are a powerful tool, so let's use one. Add a lab_run_id field to the miseqqc_runparameters table, and populate it when the QC data is uploaded. That way, the join becomes trivial, and later changes to the date fields won't break the link. If the upload can't find a matching run to link to, it will fail, and we have to clean up the files immediately, rather than trying to clean up old data months later.
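Here is a minimal sketch of the idea, using an in-memory SQLite database in place of our real one. The `id` primary key column on lab_miseq_run, the column types, and the sample values are assumptions for illustration; only the table names, runname, experimentname, and lab_run_id come from this issue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE lab_miseq_run (
        id INTEGER PRIMARY KEY,          -- assumed key column name
        runname TEXT NOT NULL
    );
    CREATE TABLE miseqqc_runparameters (
        id INTEGER PRIMARY KEY,
        experimentname TEXT NOT NULL,
        lab_run_id INTEGER NOT NULL REFERENCES lab_miseq_run (id)
    );
""")

# The run is registered, then the QC upload records which run it belongs to.
conn.execute("INSERT INTO lab_miseq_run (id, runname) VALUES (1, '01-Oct-2014')")
conn.execute(
    "INSERT INTO miseqqc_runparameters (experimentname, lab_run_id) "
    "VALUES ('01-Oct-2014', 1)")

# Someone later "corrects" the date on the run.
conn.execute("UPDATE lab_miseq_run SET runname = '02-Oct-2014' WHERE id = 1")

# The join no longer looks at the date strings, so it still finds the run.
rows = conn.execute(
    "SELECT r.runname, p.experimentname "
    "FROM miseqqc_runparameters p "
    "JOIN lab_miseq_run r ON r.id = p.lab_run_id").fetchall()

# An upload that points at a run that does not exist is rejected right away.
try:
    conn.execute(
        "INSERT INTO miseqqc_runparameters (experimentname, lab_run_id) "
        "VALUES ('03-Oct-2014', 99)")
    upload_rejected = False
except sqlite3.IntegrityError:
    upload_rejected = True
```

The last insert is the "fail early" behaviour: the bad upload never reaches the table, so there is nothing to clean up later.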

Joining on the date may simply be too brittle, because people often want to "correct" the date they entered. Would it work to put the primary key from lab_miseq_run in the sample sheet's experiment name field? Then if a user wants to change the date in the sample sheet, they just change the project name field, and the experiment name is unchanged. Does the experiment name show up on the instrument, and will anyone care if it becomes a meaningless number?

  • Add lab_run_id foreign key to miseqqc_runparameters table.
  • Discuss using that same id number in the sample sheet's experiment name field.
  • Actually make the change to experiment name.
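
If the experiment name does end up holding the run's primary key, the QC upload could read lab_run_id straight from it. A possible sketch, where `lab_run_id_from_experiment_name` is a hypothetical helper and the rule "the whole field is just the number" is an assumption that still needs the discussion above:

```python
def lab_run_id_from_experiment_name(experiment_name):
    """Return the run id held in the experiment name, or raise ValueError."""
    text = experiment_name.strip()
    if not text.isdecimal():
        # Old-style names such as dates are refused instead of guessed at.
        raise ValueError(
            f"Experiment name {experiment_name!r} is not a run id.")
    return int(text)
```

Refusing anything that isn't a plain number keeps the upload failing early on old-style sample sheets, rather than silently linking to the wrong run.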