Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Y24-402 - Specify 'no. pools' rather than 'no. samples per pool' in submission #4467

Open
wants to merge 42 commits into
base: migrating-to-number-of-pools-epic
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
42 commits
Select commit Hold shift + click to select a range
9a47613
Adding columns and conditional formatting
dasunpubudumal Oct 28, 2024
ae562ca
Removing samples_per_pool column
dasunpubudumal Oct 29, 2024
f4ea40a
Adding study-project uniqueness validation
dasunpubudumal Oct 30, 2024
6054967
Merge remote-tracking branch 'origin/develop' into 4423-y24-402-speci…
dasunpubudumal Oct 30, 2024
73b3530
Refactoring with the column name change
dasunpubudumal Oct 30, 2024
273d79a
Refactoring with the column name change
dasunpubudumal Oct 30, 2024
7b9d50e
Refactoring with the column name change
dasunpubudumal Oct 31, 2024
7ced469
New tests
dasunpubudumal Nov 4, 2024
af5ef2a
More tests
dasunpubudumal Nov 4, 2024
0883d04
Making prettier happy
dasunpubudumal Nov 5, 2024
5f0691f
Making prettier happy
dasunpubudumal Nov 5, 2024
cb26850
Fixing linters
dasunpubudumal Nov 5, 2024
04c49ae
Fixing linters
dasunpubudumal Nov 5, 2024
8887732
Refactoring for tubes and plates
dasunpubudumal Nov 5, 2024
5aa28cb
Refactoring for tubes and plates
dasunpubudumal Nov 5, 2024
7aba186
Reverting some changes that aren't required
dasunpubudumal Nov 5, 2024
8223742
Replacing no of sample per pool with no of pools
dasunpubudumal Nov 5, 2024
ce715de
Adding validation logic for tubes
dasunpubudumal Nov 5, 2024
217a62f
Adding validation logic for tubes
dasunpubudumal Nov 5, 2024
89a86bd
Changing column name
dasunpubudumal Nov 5, 2024
f4649f8
Refactoring the existing bulk submission tests to include the new ones
dasunpubudumal Nov 5, 2024
7058a7c
Adding number of pools to the UI
dasunpubudumal Nov 5, 2024
6b25c81
Removing redundant column validations
dasunpubudumal Nov 7, 2024
0f93a2d
Refactoring for reviews
dasunpubudumal Nov 7, 2024
3ddc5b9
Refactoring for reviews
dasunpubudumal Nov 7, 2024
d47250d
Linting
dasunpubudumal Nov 7, 2024
c2c5e20
Update number of pools validation range in columns.yml
dasunpubudumal Nov 14, 2024
fa10bc9
Merge remote-tracking branch 'origin/develop' into 4423-y24-402-speci…
dasunpubudumal Nov 26, 2024
bdd639d
Raising exception for invalid labware types
dasunpubudumal Nov 26, 2024
eaacc48
[skip ci] Update comment
dasunpubudumal Nov 26, 2024
933afad
Fixing rubocop issue
dasunpubudumal Nov 26, 2024
b881745
Adding more tests
dasunpubudumal Nov 26, 2024
682785c
Adding method docs
dasunpubudumal Nov 27, 2024
dfce479
Refactor validation logic
dasunpubudumal Nov 27, 2024
8c75444
Adding tests for tubes
dasunpubudumal Nov 27, 2024
ff64214
Adding a test for invalid scenario
dasunpubudumal Nov 27, 2024
fea780f
[skip ci] refactoring some comments in tests
dasunpubudumal Nov 27, 2024
3375001
Updating tests to add shared examples for invalid scenarios
dasunpubudumal Nov 27, 2024
d58bd71
[skip ci] Updating tests to add shared examples for invalid scenarios
dasunpubudumal Nov 27, 2024
e409e5e
Changing shared example names
dasunpubudumal Nov 27, 2024
e44c0ce
Refactoring long methods
dasunpubudumal Nov 28, 2024
9ef7c10
Refactoring long methods
dasunpubudumal Nov 28, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .rubocop_todo.yml
Original file line number Diff line number Diff line change
Expand Up @@ -297,6 +297,7 @@ Metrics/AbcSize:
- 'app/controllers/api/v2/transfers/transfers_controller.rb'
- 'app/jobs/export_pool_xp_to_traction_job.rb'
- 'app/sample_manifest_excel/sample_manifest_excel/manifest_type_list.rb'
- 'spec/models/bulk_submission_scrna_spec.rb'

# Offense count: 1
# Configuration parameters: CountComments, Max, CountAsOne.
Expand Down Expand Up @@ -857,6 +858,7 @@ RSpec/LetSetup:
- 'spec/models/transfer_request_collection_spec.rb'
- 'spec/requests/api/v2/comments_spec.rb'
- 'spec/requests/api/v2/plates_spec.rb'
- 'spec/models/bulk_submission_scrna_spec.rb'

# Offense count: 40
# Configuration parameters: EnforcedStyle.
Expand Down
10 changes: 8 additions & 2 deletions app/models/bulk_submission.rb
Original file line number Diff line number Diff line change
Expand Up @@ -231,7 +231,7 @@ def process # rubocop:todo Metrics/CyclomaticComplexity
'gigabases expected',
'priority',
'flowcell type',
'scrna core number of samples per pool',
'scrna core number of pools',
'scrna core cells per chip well'
].freeze

Expand Down Expand Up @@ -312,6 +312,12 @@ def add_study_to_assets(assets, study)
assets.map(&:samples).flatten.uniq.each { |sample| sample.studies << study unless sample.studies.include?(study) }
end

# Assigns a value from the source object to the target object if the source value is present.
#
# @param [Hash] source_obj The source object containing the value to be assigned.
# @param [String, Symbol] source_key The key to look up the value in the source object.
# @param [Hash] target_obj The target object where the value will be assigned.
# @param [String, Symbol] target_key The key to assign the value in the target object.
def assign_value_if_source_present(source_obj, source_key, target_obj, target_key)
target_obj[target_key] = source_obj[source_key] if source_obj[source_key].present?
end
Expand All @@ -329,7 +335,7 @@ def extract_request_options(details)
['gigabases expected', 'gigabases_expected'],
['primer panel', 'primer_panel_name'],
['flowcell type', 'requested_flowcell_type'],
['scrna core number of samples per pool', 'number_of_samples_per_pool'],
['scrna core number of pools', 'number_of_pools'],
['scrna core cells per chip well', 'cells_per_chip_well']
].each do |source_key, target_key|
assign_value_if_source_present(details, source_key, request_options, target_key)
Expand Down
2 changes: 1 addition & 1 deletion app/models/pbmc_pooling_customer_request.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
# A class for customer requests that need the extra metadata fields used for PBMC pooling calculations
class PbmcPoolingCustomerRequest < CustomerRequest
has_metadata as: Request do
custom_attribute(:number_of_samples_per_pool, integer: true, required: false, default: nil)
custom_attribute(:cells_per_chip_well, integer: true, required: false, default: nil)
custom_attribute(:number_of_pools, integer: true, required: false, default: nil)
end
end
237 changes: 224 additions & 13 deletions app/models/submission/validations_by_template_name.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
# frozen_string_literal: true
# rubocop:todo Metrics/ModuleLength
module Submission::ValidationsByTemplateName
# Template names
SCRNA_CORE_CDNA_PREP_GEM_X_5P = 'Limber-Htp - scRNA Core cDNA Prep GEM-X 5p'
Expand All @@ -7,9 +8,13 @@
HEADER_TEMPLATE_NAME = 'template name'
HEADER_STUDY_NAME = 'study name'
HEADER_PROJECT_NAME = 'project name'
HEADER_NUM_SAMPLES = 'scrna core number of samples per pool'
HEADER_BARCODE = 'barcode'
HEADER_PLATE_WELLS = 'plate well'
HEADER_NUMBER_OF_POOLS = 'scrna core number of pools'
HEADER_CELLS_PER_CHIP_WELL = 'scrna core cells per chip well'

SAMPLES_PER_POOL = { max: 25, min: 5 }.freeze

# Applies additional validations based on the submission template type.
#
# This method determines the submission template type from the CSV data and calls the appropriate
Expand All @@ -32,11 +37,23 @@
case submission_template_name
# this validation is for the scRNA pipeline cDNA submission
when SCRNA_CORE_CDNA_PREP_GEM_X_5P
validate_consistent_column_value(HEADER_NUM_SAMPLES)
validate_consistent_column_value(HEADER_NUMBER_OF_POOLS)
validate_consistent_column_value(HEADER_CELLS_PER_CHIP_WELL)
validate_samples_per_pool_for_labware
end
end

def apply_number_of_samples_per_pool_validation
# Creates groups of rows based on the study and project name (pool_number, study-project) combinations
group_rows_by_study_and_project

Check warning on line 48 in app/models/submission/validations_by_template_name.rb

View check run for this annotation

Codecov / codecov/patch

app/models/submission/validations_by_template_name.rb#L48

Added line #L48 was not covered by tests
end

def group_rows_by_study_and_project
index_of_study_name = headers.index(HEADER_STUDY_NAME)
index_of_project_name = headers.index(HEADER_PROJECT_NAME)
csv_data_rows.group_by { |row| [row[index_of_study_name], row[index_of_project_name]] }
end

# Validates that the specified column is consistent for all rows with the same study and project name.
#
# This method groups the rows in the CSV data by the study name and project name, and checks if the specified column
Expand All @@ -45,25 +62,219 @@
#
# @param column_header [String] The header of the column to validate.
# @return [void]
# rubocop:disable Metrics/MethodLength,Metrics/AbcSize
def validate_consistent_column_value(column_header)
index_of_study_name = headers.index(HEADER_STUDY_NAME)
index_of_project_name = headers.index(HEADER_PROJECT_NAME)
index_of_column = headers.index(column_header)

grouped_rows = csv_data_rows.group_by { |row| [row[index_of_study_name], row[index_of_project_name]] }
grouped_rows = group_rows_by_study_and_project

grouped_rows.each do |study_project, rows|
unique_values = rows.pluck(index_of_column).uniq
validate_unique_values(study_project, rows, index_of_column, column_header)
end
end

# Validates the number of samples per pool for labware.
#
# This method checks if the headers for barcode and plate wells are present.
# If they are, it groups the rows by study and project, and processes each group.
# The processing involves determining if the labware is a plate or tube and
# validating the number of samples per pool accordingly.
#
# @return [void]
def validate_samples_per_pool_for_labware
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add method comments for new methods.
codepilot is pretty good at doing most of it if you highlight the method and ask it to 'add a method comment'

return if headers.index(HEADER_BARCODE).nil? && headers.index(HEADER_PLATE_WELLS).nil?

grouped_rows = group_rows_by_study_and_project
grouped_rows.each_value { |rows| process_rows(rows) }
end

private

# Validates that the specified column has unique values for each study and project.
#
# This method checks if the specified column has unique values for each study and project.
# If inconsistencies are found, an error is added to the errors collection.
#
# @param study_project [Array<String>] The study and project names.
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @param index_of_column [Integer] The index of the column to validate.
# @param column_header [String] The header of the column to validate.
# @return [void]
def validate_unique_values(study_project, rows, index_of_column, column_header)
unique_values = rows.pluck(index_of_column).uniq
return unless unique_values.size > 1

errors.add(
:spreadsheet,
"Inconsistent values for column '#{column_header}' for Study name '#{study_project[0]}' and Project name " \
"'#{study_project[1]}', all rows for a specific study and project must have the same value"
)
end

# Processes the rows to determine the type of labware and validate accordingly.
#
# This method extracts the barcodes and well locations from the rows and determines if the labware is a plate or tube.
# It then calls the appropriate validation method based on the labware type.
#
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @return [void]
# rubocop:disable Metrics/MethodLength
def process_rows(rows)
barcodes = rows.pluck(headers.index(HEADER_BARCODE))
well_locations = rows.pluck(headers.index(HEADER_PLATE_WELLS))

if plate?(barcodes, well_locations)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's not a plate or tube? e.g. we have tube racks now.
raise exception maybe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus add test for tube

validate_for_plates(barcodes, well_locations, rows)
elsif tube?(barcodes, well_locations)
validate_for_tubes(barcodes, rows)
else
errors.add(

Check warning on line 129 in app/models/submission/validations_by_template_name.rb

View check run for this annotation

Codecov / codecov/patch

app/models/submission/validations_by_template_name.rb#L129

Added line #L129 was not covered by tests
:spreadsheet,
'Invalid labware type. Please provide either a plate barcode with well locations or tube barcodes only'
)
end
end
# rubocop:enable Metrics/MethodLength

# Validates the number of samples per pool for plates.
#
# This method finds the plate using the provided barcodes and retrieves the wells located at the specified well
# locations.
# It then calculates the total number of samples per study and project and the number of pools.
# Finally, it validates the number of samples per pool.
#
# @param barcodes [Array<String>] The barcodes of the plates.
# @param well_locations [Array<String>] The well locations on the plate.
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @return [void]
# rubocop:disable Metrics/AbcSize
def validate_for_plates(barcodes, well_locations, rows)
plate = Plate.find_from_any_barcode(barcodes.uniq.first)
return if plate.nil?

wells = plate.wells.for_bulk_submission.located_at(well_locations)
total_number_of_samples_per_study_project = wells.map(&:samples).flatten.count.to_i
number_of_pools = rows.pluck(headers.index(HEADER_NUMBER_OF_POOLS)).uniq.first.to_i

validate_samples_per_pool(rows, total_number_of_samples_per_study_project, number_of_pools)
end
# rubocop:enable Metrics/AbcSize

# Validates the number of samples per pool for tubes.
#
# This method finds the tubes using the provided barcodes and calculates the total number of samples per study and
# project.
# It then retrieves the number of pools from the rows and validates the number of samples per pool.
#
# @param barcodes [Array<String>] The barcodes of the tubes.
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @return [void]
def validate_for_tubes(barcodes, rows)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor this? rubocop clearly not happy
ask codepilot for ideas?
plus add more test coverage

tubes = find_tubes(barcodes)
total_number_of_samples_per_study_project = calculate_total_samples(tubes)
number_of_pools = extract_number_of_pools(rows)

validate_samples_per_pool(rows, total_number_of_samples_per_study_project, number_of_pools)
end

# Finds the tubes using the provided barcodes.
#
# This method retrieves the tubes that match the provided barcodes and raises an error if any barcodes are missing.
#
# @param barcodes [Array<String>] The barcodes of the tubes.
# @return [Array<Receptacle>] The found tubes.
def find_tubes(barcodes)
Receptacle
.on_a(Tube)
.for_bulk_submission
.with_barcode(barcodes)
.tap do |found|
missing = find_missing_barcodes(barcodes, found)
raise ActiveRecord::RecordNotFound, "Could not find Tubes with barcodes #{missing.inspect}" if missing.present?
end
end

# Finds the missing barcodes from the found tubes.
#
# This method checks which barcodes are not present in the found tubes.
#
# @param barcodes [Array<String>] The barcodes of the tubes.
# @param found [Array<Receptacle>] The found tubes.
# @return [Array<String>] The missing barcodes.
def find_missing_barcodes(barcodes, found)
barcodes.reject { |barcode| found.any? { |tube| tube.any_barcode_matching?(barcode) } }
end

# Calculates the total number of samples from the tubes.
#
# This method calculates the total number of samples by flattening the samples from the tubes and counting them.
#
# @param tubes [Array<Receptacle>] The tubes to calculate samples from.
# @return [Integer] The total number of samples.
def calculate_total_samples(tubes)
tubes.map(&:samples).flatten.count.to_i
end

# Extracts the number of pools from the rows.
#
# This method retrieves the number of pools from the specified column in the rows.
#
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @return [Integer] The number of pools.
def extract_number_of_pools(rows)
rows.pluck(headers.index(HEADER_NUMBER_OF_POOLS)).uniq.first.to_i
end

# Determines if the labware is a plate based on the presence of barcodes and well locations.
#
# This method checks if both barcodes and well locations are present to determine if the labware is a plate.
#
# @param barcodes [Array<String>] The barcodes of the labware.
# @param well_locations [Array<String>] The well locations on the labware.
# @return [Boolean] Returns true if both barcodes and well locations are present, indicating the labware is a plate.
def plate?(barcodes, well_locations)
barcodes.present? && well_locations.none?(&:nil?)
end

# Determines if the labware is a tube based on the presence of barcodes and absence of well locations.
#
# This method checks if barcodes are present and well locations are absent to determine if the labware is a tube.
#
# @param barcodes [Array<String>] The barcodes of the labware.
# @param well_locations [Array<String>] The well locations on the labware.
# @return [Boolean] Returns true if barcodes are present and well locations are absent, indicating the labware is a
# tube.
def tube?(barcodes, well_locations)
barcodes.present? && well_locations.all?(&:nil?)
end

# Validates the number of samples per pool.
#
# This method calculates the number of samples per pool by dividing the total number of samples by the number of
# pools.
# It then iterates through each pool and checks if the number of samples per pool is within the allowed range.
# If the number of samples per pool is less than the minimum or greater than the maximum allowed, an error is added.
#
# @param rows [Array<Array<String>>] The rows of CSV data to process.
# @param total_samples [Integer] The total number of samples.
# @param number_of_pools [Integer] The number of pools.
# @return [void]
# rubocop:disable Metrics/MethodLength
def validate_samples_per_pool(rows, total_samples, number_of_pools)
int_division = total_samples / number_of_pools
remainder = total_samples % number_of_pools

number_of_pools.times do |pool_number|
samples_per_pool = int_division
samples_per_pool += 1 if pool_number < remainder
next unless samples_per_pool > SAMPLES_PER_POOL[:max] || samples_per_pool < SAMPLES_PER_POOL[:min]

next unless unique_values.size > 1
errors.add(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test

:spreadsheet,
"Inconsistent values for column '#{column_header}' for Study name '#{study_project[0]}' and Project name " \
"'#{study_project[1]}', " \
'all rows for a specific study and project must have the same value'
"Number of samples per pool for Study name '#{rows.first[headers.index(HEADER_STUDY_NAME)]}' " \
"and Project name '#{rows.first[headers.index(HEADER_PROJECT_NAME)]}' " \
"is less than 5 or greater than 25 for pool number #{pool_number}"
)
end
end
# rubocop:enable Metrics/MethodLength,Metrics/AbcSize
# rubocop:enable Metrics/MethodLength
end
# rubocop:enable Metrics/ModuleLength
18 changes: 9 additions & 9 deletions config/bulk_submission_excel/columns.yml
Original file line number Diff line number Diff line change
Expand Up @@ -269,26 +269,26 @@ requested_flowcell_type:
conditional_formattings:
empty_cell:
is_error:
number_of_samples_per_pool:
heading: scRNA Core Number of Samples per Pool
attribute: :number_of_samples_per_pool
number_of_pools:
heading: scRNA Core Number of Pools
attribute: :number_of_pools
unlocked: true
type: :integer
validation:
options:
type: :whole
operator: :between
formula1: "5"
formula2: "25"
formula1: "1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just checking 1 is the minimum, I thought it was 2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you have 2-8 in the ranges below

formula2: "8"
allowBlank: true
showInputMessage: true
promptTitle: "Samples per Pool"
prompt: "The requested number of samples per pool (between 5 and 25 inclusive)"
promptTitle: "Number of pools"
prompt: "The requested number pools (between 1 and 8 inclusive)"
conditional_formattings:
empty_cell:
is_text:
is_num_samples_per_pool_in_valid_range:
is_num_samples_per_pool_outside_valid_range:
is_num_of_pools_in_valid_range:
is_num_of_pools_outside_valid_range:
cells_per_chip_well:
heading: scRNA Core Cells per Chip Well
attribute: :cells_per_chip_well
Expand Down
Loading
Loading