Setting a relationship explicitly to null that uses a CPK triggers odd and incorrect validation error #612

erikbelusic · 2023-11-08T17:50:46Z

We are on rails 6.1 and CPK 13, however this persists in rails 7.0 and CPK 14 and rails 7.1 without CPK

class CreateOrganizations < ActiveRecord::Migration[6.1]
  def change
    create_table :organizations do |t|
      t.text :name
      t.timestamps
    end
  end
end

class CreateV1PmtUsers < ActiveRecord::Migration[6.1]
  def change
    create_table :v1_pmt_users do |t|
      t.belongs_to :organization, null: false
      t.text :name
      t.timestamps
    end
  end
end

class CreateV1Issues < ActiveRecord::Migration[6.1]
  def change
    create_table :v1_issues do |t|
      t.belongs_to :organization, null: false
      t.belongs_to :reporter_pmt_user
      t.text :description
      t.timestamps
    end
  end
end


class CreateV2Issues < ActiveRecord::Migration[6.1]
  def change
    create_table :v2_issues, primary_key: [:organization_id, :pmt_id, :external_id] do |t|
      t.integer :organization_id, null: false
      t.integer :pmt_id, null: false
      t.text :external_id, null: false

      t.text :reporter_pmt_user_external_id
      t.text :title, null: false

      t.timestamps
    end
  end
end

class CreateV2PmtUsers < ActiveRecord::Migration[6.1]
  def change
    create_table :v2_pmt_users, primary_key: [:organization_id, :pmt_id, :external_id] do |t|
      t.integer :organization_id, null: false
      t.integer :pmt_id, null: false
      t.text :external_id, null: false
      t.text :name, null: false

      t.timestamps
    end
  end
end

class Organization < ApplicationRecord
end

class V1Issue < ApplicationRecord
  belongs_to :organization
  belongs_to :reporter_pmt_user, class_name: "V1PmtUser", optional: true
end

class V1PmtUser < ApplicationRecord
  belongs_to :organization
end

class V2Issue < ApplicationRecord
  self.primary_keys = :organization_id, :pmt_id, :external_id

  belongs_to :organization
  belongs_to :reporter_pmt_user, class_name: "V2PmtUser", foreign_key: %i[organization_id pmt_id reporter_pmt_user_external_id], optional: true
end

class V2PmtUser < ApplicationRecord
  self.primary_keys = :organization_id, :pmt_id, :external_id

  belongs_to :organization

end

Tests for non CPK models in rspec - all pass

require 'rails_helper'

describe "models without CPK - saving relationships" do
  before(:each) do
    @organization = Organization.create!(name: "test org")
  end

  it "creates without setting reporter" do
    issue = V1Issue.find_or_initialize_by(organization_id: @organization.id, description: "test1")
    expect do
      issue.save!
    end.to change(V1Issue, :count).by(1)
  end

  it "creates with setting reporter" do
    issue = V1Issue.find_or_initialize_by(organization_id: @organization.id, description: "test1")
    expect do
      reporter_pmt_user = V1PmtUser.create!(organization_id: @organization.id, name: "test1")
      issue.update!(reporter_pmt_user: reporter_pmt_user)
    end.to change(V1Issue, :count).by(1).and change(V1PmtUser, :count).by(1)
  end

  it "creates with explicitly setting reporter to nil" do
    issue = V1Issue.find_or_initialize_by(organization_id: @organization.id, description: "test1")
    expect do
      issue.update!(reporter_pmt_user: nil)
    end.to change(V1Issue, :count).by(1)
  end
end

Tests for CPK models - last test fails

require 'rails_helper'

describe "models with CPK - saving relationships" do
  before(:each) do
    @organization = Organization.create!(name: "test org")
    @pmt_id = 1
    @issue_external_id = "1"
  end

  it "creates without setting reporter" do
    issue = V2Issue.find_or_initialize_by(organization_id: @organization.id, pmt_id: @pmt_id, external_id: @issue_external_id)
    expect do
      issue.update!(title: "test")
    end.to change(V2Issue, :count).by(1)
  end

  it "creates with setting reporter" do
    issue = V2Issue.find_or_initialize_by(organization_id: @organization.id, pmt_id: @pmt_id, external_id: @issue_external_id)
    reporter_pmt_user = nil # init
    expect do
      reporter_pmt_user = V2PmtUser.create!(organization_id: @organization.id, pmt_id: @pmt_id, external_id: "abc", name: "test1")
      issue.update!(title: "test", reporter_pmt_user: reporter_pmt_user)
    end.to change(V2Issue, :count).by(1).and change(V2PmtUser, :count).by(1)

    expect(issue).to have_attributes(title: "test", reporter_pmt_user: reporter_pmt_user)
  end

  # The failing spec
  it "creates with explicitly setting reporter to nil" do
    issue = V2Issue.find_or_initialize_by(organization_id: @organization.id, pmt_id: @pmt_id, external_id: @issue_external_id)
    expect do
      issue.update!(reporter_pmt_user: nil)
    end.to change(V2Issue, :count).by(1)
  end
end

failure:

1) models with CPK - saving relationships creates with explicitly setting reporter to nil
     Failure/Error: issue.update!(title: "test", reporter_pmt_user: nil)

     ActiveRecord::RecordInvalid:
       Validation failed: Organization must exist

The expected behavior is that i can null out the relationship the same way i can in the non CPK example

The text was updated successfully, but these errors were encountered:

erikbelusic · 2023-11-08T17:55:26Z

After posting this, I have realized what the issue is. I am going to close.

erikbelusic · 2023-11-08T18:03:26Z

Maybe this should actually be re-opened as a feature request. The issue here is that we are sharing columns between PK and relationship "FK". So when assigning the null relationship, we end up nulling out the organization_id accidently.

What is odd and made me think this was a bug is that on rails 6.0 and cpk 12 I do not believe this was an issue, as we only saw issues when upgrading to 6.1 and 13.

Would love your thoughts. Sharing something like a tennant_id in CPK is a fairly common strategy

cfis · 2023-11-10T00:28:11Z

Sure, feel free to reopen. Would be good to have a test case based on the CPK test data if you have time to put one together.

erikbelusic · 2023-11-10T00:44:30Z

Can you elaborate on what you mean by the "CPK test data". Happy to help out, but I am not sure on what you feel the correct behavior should be in this scenario.

I know there is a similar thread on rails mainline i found rails/rails#49671 where this is being discussed a bit as well.

In our project we have patched _write_attribute as seen here: rails/rails#49671 (comment) to solve our short term issue.

cfis · 2023-11-14T08:56:40Z

The CPK test data are the models and fixtures found here - https://github.com/composite-primary-keys/composite_primary_keys/tree/master/test/fixtures.

Hopefully there are models setup close to what you need?

erikbelusic · 2023-11-14T17:44:19Z

I can take a look, but the part I am not clear on is what additional test you would like to see here? Are you agreeing that setting a relationship should never overwrite parts of the primary key?

cfis · 2023-11-15T05:38:41Z

That's a good question. If you don't update the part of the primary key shared with the relationship, then you can't set the relationship. So what is supposed to happen, throw an error?

erikbelusic · 2023-11-15T06:08:11Z

So, I think depending on the design of your primary keys, you can still set the relationship. I think the only time columns in a relationship and the PK would be shared is if you primary key design is what I have seen referred to as a tennant-key design, where some tennant id is in every models primary key, and data is strictly segmented by those values. In this case skipping the shared columns when setting the relationship, would still set the non-shared columns, which should result in the same end state.

In the project we are using this gem on, we allow each of the PK fields to be updated either directly, or if the column is currently null. In the example above, you can see the expectation is that once the PK fields are set, they are not changed inadvertently by setting a relationship that shares one of those columns with the PK.

# The failing spec
  it "creates with explicitly setting reporter to nil" do
    issue = V2Issue.find_or_initialize_by(organization_id: @organization.id, pmt_id: @pmt_id, external_id: @issue_external_id)
    expect do
      issue.update!(reporter_pmt_user: nil)
    end.to change(V2Issue, :count).by(1)
  end

While here this makes sense for us I am not sure it would make sense for everyones' use cases, but because our CPK contains a column that is essentially a tenant key, setting the relationship with a VcsUser that belongs to another org should never happen anyway.

We also have an example where we do want the relationship to set part of the PK, and thats the following

class Commit < BaseRecord
    belongs_to :organization
   
    has_many :commit_issue_associations, foreign_key: %i[organization_id commit_vcs_id commit_repository_external_id commit_oid]
end

class CommitIssueAssociation < BaseRecord
    belongs_to :issue, foreign_key: %i[organization_id issue_pmt_id issue_external_id]
    belongs_to :commit, foreign_key: %i[organization_id commit_vcs_id commit_repository_external_id commit_oid]
  end

commit.commit_issue_associations.find_or_create_by!(issue: issue)

and so here we allow the relationship being set to set part of the PK, but only because its null at that time.

I think some options are:

take our approach as the solution - disallow changing of the PK with the relationship
make our approach an opt-in/configurable feature, either globally or per model
- something like cpk_overwrite_primary_key_columns false as a class method called in the AR definition
more configurable
- something like cpk_protected_columns primary_key or cpk_protected_columns :col1, :col2 as a class method called in the model definition

PS flexible on the method names. I tend to prefer verbose clear methods, and I am unsure if prefixing with something for a namespacing effect is needed here or not.

cfis · 2023-11-23T08:47:55Z

Thanks for the explanation. I do have a hard time seeing why there would be a setup like this. The tenant_id kinda makes sense, but why would the tennant_id need to be part of the primary key when there is likely already a unique autogenerated key?

But anyway, I can convince myself the current behavior is correct. If you update a relationship that happens to to share a column with a primary key then it should update the primary key too. If you don't allow that, how else could you change the relationship?

I guess the choice after that (your choice #1) is if updating a relationship causes a primary key value to change, then disallow it and throw an error. Not seeing the null case you mention because databases (well postgresql and mysql) don't allow null values in primary keys. Maybe the record hasn't been created yet?

I'm not sure this is worth adding to CPK though. Could you implement a one-off solution using ActiveRecord callbacks for the cases where you want to disallow updates to the PK?

erikbelusic · 2023-11-29T15:19:10Z

I do have a hard time seeing why there would be a setup like this. The tenant_id kinda makes sense, but why would the tennant_id need to be part of the primary key when there is likely already a unique autogenerated key?

Our scenario is that we do not use auto incrementing ids - we are using natural primary keys. Because two customers could create data with the same natural components, the tennant_id becomes a requirement. e.g. data only needs to be unique within a customers dataset.

If you update a relationship that happens to to share a column with a primary key then it should update the primary key too. If you don't allow that, how else could you change the relationship?

Right, so this is where it gets interesting. In our scenario, changing a primary key column via setting a relationship NEVER makes sense, because it could be saying move this data from customer X to customer Y. I recognize this may not make sense for all use cases. But I do think setting part of the PK to null when trying to null out a relationship should also never happen, as the PK cannot have null component in most RDBMS, if not all.

Not seeing the null case you mention because databases (well postgresql and mysql) don't allow null values in primary keys. Maybe the record hasn't been created yet?

This comes into play for join models using CPK - where the primary key is a combination of the two related objects "FKs".

I'm not sure this is worth adding to CPK though. Could you implement a one-off solution using ActiveRecord callbacks for the cases where you want to disallow updates to the PK?

We have implemented a one off solution, however, we had to monkey patch and not use callbacks, as many relations could attempt to mutate a PK column, and the callbacks only have access to the original and current values, and you could lose data depending on what you are doing. For us it made more sense to overwrite the attr writer.

If you don't feel preventing changing any part of the PK should be part of the CPK behavior, thats understandable. But I would say disallowing a PK column value to be set as null should probably be prevented.

cfis · 2023-12-01T08:01:30Z

Thanks for the example of moving data between two tennants - that makes sense.

Is your monkey patch suitable for a PR - or could be made to be suitable? I'm ok with this idea as long as its opt-in and doesn't overcomplicate the code.

erikbelusic closed this as completed Nov 8, 2023

cfis reopened this Nov 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Setting a relationship explicitly to null that uses a CPK triggers odd and incorrect validation error #612

Setting a relationship explicitly to null that uses a CPK triggers odd and incorrect validation error #612

erikbelusic commented Nov 8, 2023

erikbelusic commented Nov 8, 2023

erikbelusic commented Nov 8, 2023

cfis commented Nov 10, 2023

erikbelusic commented Nov 10, 2023

cfis commented Nov 14, 2023

erikbelusic commented Nov 14, 2023

cfis commented Nov 15, 2023

erikbelusic commented Nov 15, 2023

cfis commented Nov 23, 2023

erikbelusic commented Nov 29, 2023

cfis commented Dec 1, 2023

Setting a relationship explicitly to null that uses a CPK triggers odd and incorrect validation error #612

Setting a relationship explicitly to null that uses a CPK triggers odd and incorrect validation error #612

Comments

erikbelusic commented Nov 8, 2023

erikbelusic commented Nov 8, 2023

erikbelusic commented Nov 8, 2023

cfis commented Nov 10, 2023

erikbelusic commented Nov 10, 2023

cfis commented Nov 14, 2023

erikbelusic commented Nov 14, 2023

cfis commented Nov 15, 2023

erikbelusic commented Nov 15, 2023

cfis commented Nov 23, 2023

erikbelusic commented Nov 29, 2023

cfis commented Dec 1, 2023