Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot partition a table based on a foreign key field (one-to-many) #237

Open
victorhaynes opened this issue Mar 6, 2024 · 1 comment
Open

Comments

@victorhaynes
Copy link

victorhaynes commented Mar 6, 2024

Tool versions:

  • Django version 4.2.7
  • PostgreSQL 14.3
  • django-postgres-extra-2.0.8

Scenario:

I am refactoring an existing app I created to support table partitioning. The table that needs partitioning will have 100s of millions into the billions of rows and have substantial row size. Right now I am trying to implement this simple example below in a totally separate/unrelated sandbox environment.

from django.db import models
from psqlextra.types import PostgresPartitioningMethod
from psqlextra.models import PostgresPartitionedModel

# This represents a table of codes. In production there will be ~10 , not 3, and they are all unique
class Platform(models.Model):
    code_options = [("na1","na1"), ("euw1","euw1"), ("br1","br1")]
    code = models.CharField(choices=code_options, primary_key=True)

# This is a table of events occurring, in prod there will be up to billions. A Platform will have many Matches
# Want to partition by platform which is a foreign key
class Match(PostgresPartitionedModel):
    class PartitioningMeta:
        method = PostgresPartitioningMethod.LIST
        key = ["platform"]
    
    matchId = models.CharField(max_length=20)
    platform = models.ForeignKey(Platform, on_delete=models.CASCADE)

Problem:

I want to partition Matches based on the platform it belongs to. There are a fixed number of options so I want to use list partitioning. This seems reasonable to me because I want "parent" level & also partition level reporting. I don't think what I'm trying to do is controversial and I would like to not make ~10 separate tables separate that are almost identical.

I run python3 manage.py pgmakemigrations and my migration is created:

Click to expand
# Generated by Django 4.2.7 on 2024-03-06 18:11

from django.db import migrations, models
import django.db.models.deletion
import psqlextra.backend.migrations.operations.add_default_partition
import psqlextra.backend.migrations.operations.create_partitioned_model
import psqlextra.manager.manager
import psqlextra.models.partitioned
import psqlextra.types


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        psqlextra.backend.migrations.operations.create_partitioned_model.PostgresCreatePartitionedModel(
            name='Match',
            fields=[
                ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('matchId', models.CharField(max_length=20)),
            ],
            options={
                'abstract': False,
                'base_manager_name': 'objects',
            },
            partitioning_options={
                'method': psqlextra.types.PostgresPartitioningMethod['LIST'],
                'key': ['platform'],
            },
            bases=(psqlextra.models.partitioned.PostgresPartitionedModel,),
            managers=[
                ('objects', psqlextra.manager.manager.PostgresManager()),
            ],
        ),
        psqlextra.backend.migrations.operations.add_default_partition.PostgresAddDefaultPartition(
            model_name='Match',
            name='default',
        ),
        migrations.CreateModel(
            name='Platform',
            fields=[
                ('code', models.CharField(choices=[('na1', 'na1'), ('euw1', 'euw1'), ('br1', 'br1')], primary_key=True, serialize=False)),
            ],
        ),
        migrations.AddField(
            model_name='match',
            name='platform',
            field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='wrs_api.platform'),
        ),
    ]

Error when I migrate:

  Applying wrs_api.0001_initial...Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/models/options.py", line 681, in get_field
    return self.fields_map[field_name]
KeyError: 'platform'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/schema.py", line 534, in _partitioning_properties_for_model
    model._meta.get_field(field_name)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/models/options.py", line 683, in get_field
    raise FieldDoesNotExist(
django.core.exceptions.FieldDoesNotExist: Match has no field named 'platform'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/victorhaynes/Development/riot/playground/backend/manage.py", line 22, in <module>
    main()
  File "/Users/victorhaynes/Development/riot/playground/backend/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 412, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 458, in execute
    output = self.handle(*args, **options)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 106, in wrapper
    res = handle_func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/commands/migrate.py", line 356, in handle
    post_migrate_state = executor.migrate(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 135, in migrate
    state = self._migrate_all_forwards(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 167, in _migrate_all_forwards
    state = self.apply_migration(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 252, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/migration.py", line 132, in apply
    operation.database_forwards(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/migrations/operations/create_partitioned_model.py", line 47, in database_forwards
    schema_editor.create_partitioned_model(model)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/schema.py", line 162, in create_partitioned_model
    meta = self._partitioning_properties_for_model(model)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/schema.py", line 536, in _partitioning_properties_for_model
    raise ImproperlyConfigured(
django.core.exceptions.ImproperlyConfigured: Model 'Match' is not properly configured to be partitioned. Field 'platform' in partitioning key ['platform'] is not a valid field on 'Match'.

Analysis / Trying to Fix:

I think this is happening because Django/PSQL thinks that my Match model does not have the field platform yet when it tries to create the partition. I noticed when I create the migration the order of operations is:

Migrations for 'wrs_api':
  wrs_api/migrations/0001_initial.py
    - Create partitioned model Match
    - Creates default partition 'default' on Match
    - Create model Platform
    - Add field platform to match

The platform field is not getting added to the Match model/table until after Match tries to partition itself by platform. It makes sense why this is an error. If I change my partion key to key = ["foobar"] I get the the exact same error language....

...but if I manually change my migration to create my generic model Platform first & I manually add the platform field with a foreign key constraint to Match in the .PostgresCreatePartitionedModel() config like so:

Click to see manually altered migration
# Generated by Django 4.2.7 on 2024-03-06 20:48

from django.db import migrations, models
import django.db.models.deletion
import psqlextra.backend.migrations.operations.add_default_partition
import psqlextra.backend.migrations.operations.create_partitioned_model
import psqlextra.manager.manager
import psqlextra.models.partitioned
import psqlextra.types


class Migration(migrations.Migration):

    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Platform',
            fields=[
                ('code', models.CharField(choices=[('na1', 'na1'), ('euw1', 'euw1'), ('br1', 'br1')], primary_key=True, serialize=False)),
            ],
        ),
        psqlextra.backend.migrations.operations.create_partitioned_model.PostgresCreatePartitionedModel(
            name='Match',
            fields=[
                ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('matchId', models.CharField(max_length=20)),
                ('platform', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='wrs_api.platform')),

            ],
            options={
                'abstract': False,
                'base_manager_name': 'objects',
            },
            partitioning_options={
                'method': psqlextra.types.PostgresPartitioningMethod['LIST'],
                'key': ['platform'],
            },
            bases=(psqlextra.models.partitioned.PostgresPartitionedModel,),
            managers=[
                ('objects', psqlextra.manager.manager.PostgresManager()),
            ],
        ),
        psqlextra.backend.migrations.operations.add_default_partition.PostgresAddDefaultPartition(
            model_name='Match',
            name='default',
        ),
        # migrations.AddField(
        #     model_name='match',
        #     name='platform',
        #     field=models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='wrs_api.platform'),
        # ),
    ]

I get this error instead:

 Applying wrs_api.0001_initial...Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
psycopg2.errors.UndefinedColumn: column "platform" named in key does not exist
LINE 1: ...har(20) NOT NULL, "platform_id" varchar NOT NULL, PRIMARY KE...
                                                             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/victorhaynes/Development/riot/playground/backend/manage.py", line 22, in <module>
    main()
  File "/Users/victorhaynes/Development/riot/playground/backend/manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
    utility.execute()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/__init__.py", line 436, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 412, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 458, in execute
    output = self.handle(*args, **options)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/base.py", line 106, in wrapper
    res = handle_func(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/core/management/commands/migrate.py", line 356, in handle
    post_migrate_state = executor.migrate(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 135, in migrate
    state = self._migrate_all_forwards(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 167, in _migrate_all_forwards
    state = self.apply_migration(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/executor.py", line 252, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/migrations/migration.py", line 132, in apply
    operation.database_forwards(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/migrations/operations/create_partitioned_model.py", line 47, in database_forwards
    schema_editor.create_partitioned_model(model)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/psqlextra/backend/schema.py", line 189, in create_partitioned_model
    self.execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/postgresql/schema.py", line 45, in execute
    return super().execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/base/schema.py", line 201, in execute
    cursor.execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 102, in execute
    return super().execute(sql, params)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 84, in _execute
    with self.db.wrap_database_errors:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
django.db.utils.ProgrammingError: column "platform" named in key does not exist
LINE 1: ...har(20) NOT NULL, "platform_id" varchar NOT NULL, PRIMARY KE...

TL;DR:

isn't this a normal use case? And if so how do you partition a table by a foreign key field

@arshdoda
Copy link

Can you please try with key = "platform_id"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants