Revamp how CDash removes old data #2093

Open
13 of 36 tasks
zackgalbreath opened this issue Mar 20, 2024 · 1 comment

Comments

@zackgalbreath
Contributor

zackgalbreath commented Mar 20, 2024

Feature Request

How can we make CDash better?

PRs #1655, #1656, and #1657 added foreign keys to many of CDash's tables, which helps protect our data integrity and ensures that old data gets deleted automatically when it is no longer referenced.
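
For readers unfamiliar with the mechanism, here is a minimal sketch of how a cascading foreign key removes dependent rows automatically. It uses Python's built-in `sqlite3` purely for illustration; CDash itself runs on a different database, and the column layout below is a simplified assumption, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, simplified versions of two tables (columns are assumptions).
conn.executescript("""
CREATE TABLE build (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE configureerror (
    id INTEGER PRIMARY KEY,
    buildid INTEGER NOT NULL REFERENCES build(id) ON DELETE CASCADE,
    text TEXT
);
INSERT INTO build (id, name) VALUES (1, 'old build'), (2, 'recent build');
INSERT INTO configureerror (buildid, text)
VALUES (1, 'warning A'), (1, 'warning B'), (2, 'warning C');
""")

# Deleting the parent row also deletes every child row that references it.
conn.execute("DELETE FROM build WHERE id = 1")

remaining = conn.execute(
    "SELECT buildid, text FROM configureerror ORDER BY id"
).fetchall()
print(remaining)
```

Only the row belonging to build 2 survives, with no explicit `DELETE` issued against the child table.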

I audited the rest of CDash's tables and came up with the following list of recommendations.

Unused tables we could probably drop without impacting existing functionality

  • apitoken

Tables that would benefit from foreign keys

Tables whose rows contain a timestamp that could be used for periodic deletion

  • dailyupdate
  • lockout
  • password_resets
  • subproject
  • subprojectgroup

It's worth noting here that the following tables are already cleaned up in addDailyChanges():

  • buildgroup
  • build2grouprule
  • failed_jobs
  • successful_jobs
  • usertemp
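
As a rough illustration of the timestamp-based periodic deletion suggested above, the sketch below purges rows older than a cutoff. It uses Python's `sqlite3` for demonstration only; the helper name `purge_older_than`, the `PURGEABLE` allowlist, the `created_at` column, and the 7-day retention window are all assumptions, not CDash's actual code or settings.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical allowlist mapping each purgeable table to its timestamp column.
# Identifiers cannot be bound as SQL parameters, so they are never taken
# from untrusted input.
PURGEABLE = {"password_resets": "created_at"}

def purge_older_than(conn, table, cutoff):
    """Delete rows older than `cutoff` and return how many were removed."""
    column = PURGEABLE[table]  # raises KeyError for tables not on the allowlist
    cur = conn.execute(
        f"DELETE FROM {table} WHERE {column} < ?",
        (cutoff.strftime("%Y-%m-%d %H:%M:%S"),),
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE password_resets (email TEXT, token TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO password_resets VALUES (?, ?, ?)",
    [
        ("a@example.com", "t1", "2024-01-01 00:00:00"),  # stale
        ("b@example.com", "t2", "2024-03-19 12:00:00"),  # recent
    ],
)

# A fixed "now" keeps the example deterministic; a real job would use the clock.
now = datetime(2024, 3, 20)
deleted = purge_older_than(conn, "password_resets", now - timedelta(days=7))
print(deleted)
```

A scheduled job could call such a helper once per table on the list, each with its own retention period.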

Shared data that could be deleted by periodic NOT IN (...) queries:

  • buildfailuredetails (buildfailure.detailsid)
  • buildfailureargument (buildfailure2argument.argumentid)
  • buildupdate (build2update.updateid)
  • configure (build2configure.configureid)
  • coveragefile (coverage.fileid)
  • image (test2image.imgid)
  • label -- this one seems tricky, we would have to check every label2* table.
  • note (build2note.noteid)
  • repositories (project2repositories.repositoryid)
  • site (build.siteid)
  • test (build2test.testid and/or testoutput.testid)
  • testoutput (build2test.outputid)
  • uploadfile (build2uploadfile.fileid)

Many of these tables are already being handled through clever queries in remove_builds(), but if a row somehow "slips through the cracks", deleting it later on currently requires manual intervention.
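
To make the periodic `NOT IN (...)` idea concrete, here is a minimal sketch for the `note` / `build2note.noteid` pair, again using Python's `sqlite3` for illustration. The column layout and sample rows are assumptions. Note the `IS NOT NULL` filter: in standard SQL, a `NOT IN` subquery that yields any NULL matches nothing, so the cleanup would silently delete zero rows without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the shared table and its pivot table.
conn.executescript("""
CREATE TABLE note (id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE build2note (buildid INTEGER, noteid INTEGER);
INSERT INTO note (id, text) VALUES (1, 'shared'), (2, 'orphaned'), (3, 'also shared');
INSERT INTO build2note (buildid, noteid) VALUES (10, 1), (11, 1), (12, 3), (13, NULL);
""")

# Delete every note that no build references any more.
cur = conn.execute("""
    DELETE FROM note
    WHERE id NOT IN (
        SELECT noteid FROM build2note WHERE noteid IS NOT NULL
    )
""")
orphans_deleted = cur.rowcount

surviving_ids = [row[0] for row in conn.execute("SELECT id FROM note ORDER BY id")]
print(orphans_deleted, surviving_ids)
```

The same shape would apply to the other pairs in the list, while `label` would need one subquery per `label2*` table.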

Functionality to more generally reconsider:

  • coveragefile2user -- this association seems tied to a particular version of a source file, I'm not sure that's actually useful?
  • dailyupdatefile -- we might not need this at all anymore since we dropped the "feed" a while back?
@williamjallen
Collaborator

It's worth noting that some tables like the banner table contain "global" rows (using the project ID 0, for example), which makes it more difficult than it initially appears.

Great work putting together this list though! I'll gradually work through it as I have time.
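
The "global rows" caveat can be illustrated with a small sketch: once a foreign key points `banner.projectid` at `project.id`, a row using the sentinel project ID 0 no longer has a parent to reference. This uses Python's `sqlite3` with a hypothetical, simplified schema; how CDash actually handles such rows is not stated in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical, simplified schema (column names are assumptions).
conn.executescript("""
CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE banner (
    projectid INTEGER NOT NULL REFERENCES project(id) ON DELETE CASCADE,
    text TEXT
);
INSERT INTO project (id, name) VALUES (1, 'MyProject');
INSERT INTO banner (projectid, text) VALUES (1, 'per-project banner');
""")

# A "global" banner stored under project ID 0 has no matching project row,
# so the constraint rejects it.
try:
    conn.execute("INSERT INTO banner (projectid, text) VALUES (0, 'global banner')")
    global_row_accepted = True
except sqlite3.IntegrityError:
    global_row_accepted = False

print(global_row_accepted)
```

Any existing rows of this kind would therefore have to be migrated or represented differently before such a constraint could be added.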

github-merge-queue bot pushed a commit that referenced this issue Sep 24, 2024
In support of #2093, this PR adds
foreign-key constraints to each of the label pivot tables, as well as a
multitude of missing indexes.
github-merge-queue bot pushed a commit that referenced this issue Sep 27, 2024
This PR adds a foreign-key constraint to the `dynamicanalysisdefect`
table, as well as associated indexing, in support of #2093. Dynamic
analysis results are now cleaned up 100% automatically when their
associated build results are removed.
github-merge-queue bot pushed a commit that referenced this issue Nov 4, 2024
This continues our ongoing effort described in #2093 to add foreign-key
constraints wherever possible for better data integrity.
zackgalbreath added a commit that referenced this issue Nov 18, 2024
This PR adds a foreign-key constraint to the `configureerror` table
in support of #2093.
github-merge-queue bot pushed a commit that referenced this issue Nov 18, 2024
This PR adds a foreign-key constraint to the `configureerror` table in
support of #2093.