Character Sets and Collations

Laurents Meyer edited this page Jan 26, 2022 · 1 revision

Pomelo.EntityFrameworkCore.MySql supports several approaches to charset and collation handling. If you have no specific requirements, the default approach should work best: it explicitly applies charsets and collations to all objects recursively.

If you do not specify anything explicitly, the following default option is implicitly used:

modelBuilder.HasCharSet("utf8mb4"); // same as: modelBuilder.HasCharSet("utf8mb4", DelegationModes.ApplyToAll);

Explicitly apply global charset and collation recursively

You set a global charset and/or collation and let Pomelo recursively apply it explicitly to the database, all tables and all columns. Pomelo will also explicitly apply it to new objects that you might add later.

You can override this behavior on any level (database, table, column) for specific objects. If you remove an override later, the default value from the parent level will be applied again explicitly.

For example, you can make your whole database use latin1, but make the LegacyAsciiStuff table use the ascii charset for all text-based columns, except for its LegacyUnicodeTranslation column, which is set to the deprecated utf8mb3 character set:

// The database, tables and columns will explicitly be set to "latin1" by default.
modelBuilder.HasCharSet("latin1");

modelBuilder.Entity<LegacyAsciiStuff>(entity => {
    // The "LegacyAsciiStuff" table and all its columns will explicitly be set to "ascii" by default.
    entity.HasCharSet("ascii");

    // The "LegacyUnicodeTranslation" column uses the deprecated "utf8mb3" character set.
    entity.Property(e => e.LegacyUnicodeTranslation)
        .HasCharSet("utf8mb3");
});

Using this approach ensures that your database charsets and collations are always in a known and expected state.

This is the default approach in releases >= 5.0.
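The example above only sets charsets. As a minimal sketch of combining a global charset with a global collation and a column-level override, the following could serve as a starting point. The `Customer` entity and `ShopContext` class are hypothetical, and the sketch assumes that Pomelo's `UseCollation` for the model builder accepts the same `DelegationModes` argument as `HasCharSet`:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity, used only for illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = string.Empty;
    public string Code { get; set; } = string.Empty;
}

// Hypothetical context; provider configuration (UseMySql) is omitted.
public class ShopContext : DbContext
{
    public DbSet<Customer> Customers => Set<Customer>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Global charset, delegated to the database, all tables and all columns.
        // Assumption: DelegationModes is visible through the namespace imported above.
        modelBuilder.HasCharSet("latin1", DelegationModes.ApplyToAll);

        // Assumption: UseCollation accepts the same DelegationModes argument.
        modelBuilder.UseCollation("latin1_general_ci", DelegationModes.ApplyToAll);

        modelBuilder.Entity<Customer>(entity =>
        {
            // Column-level override: the "Code" column uses a binary collation.
            entity.Property(e => e.Code)
                .UseCollation("latin1_bin");
        });
    }
}
```

The collation chosen for an object has to belong to the charset in effect for that object, which is why both latin1 collations are paired with the latin1 charset here.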

Apply global charset and collation only to database itself

You set a global charset and/or collation and let Pomelo apply it only to the database itself.

Any tables created later will inherit the charset and collation that the database has at that point in time. If you later change the database charset and/or collation, existing tables will keep the previous charset and collation unless you explicitly change them individually:

// The database will explicitly be set to "latin1".
modelBuilder.HasCharSet("latin1", DelegationModes.ApplyToDatabases);

Using this approach can lead to unexpected charsets and/or collations if you change the charset or collation later and are not careful, because the order in which DDL statements are executed becomes essential. In those cases, using DbContext.Database.EnsureCreated() could result in a different database model (with regard to charsets and collations) than applying the migrations that represent the same EF Core model.

This was the default approach in releases <= 3.2.
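If you use this approach and want individual tables to stay in a known state regardless of what the database default is when they are created, one option is to set them explicitly. The following is a rough sketch only: `AuditEntry` and `AuditContext` are hypothetical names, and how far the table-level setting propagates to the table's columns depends on the delegation mode in effect, which this sketch does not try to settle:

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical entity, used only for illustration.
public class AuditEntry
{
    public int Id { get; set; }
    public string Message { get; set; } = string.Empty;
}

// Hypothetical context; provider configuration (UseMySql) is omitted.
public class AuditContext : DbContext
{
    public DbSet<AuditEntry> AuditEntries => Set<AuditEntry>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Only the database itself is explicitly set to "latin1".
        modelBuilder.HasCharSet("latin1", DelegationModes.ApplyToDatabases);

        // This table is set explicitly, so it does not depend on whatever
        // default the database has at the time the table is created.
        modelBuilder.Entity<AuditEntry>(entity => entity.HasCharSet("utf8mb4"));
    }
}
```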

Apply global charset and collation only to columns

You set a global charset and/or collation and let Pomelo explicitly apply it only to columns. Any column created later will explicitly be set to this charset and collation by Pomelo as well.

You can override this behavior on any level (database, table, column) for specific objects. If you remove an override later, the default value from the parent level will be applied again explicitly.

// Neither the database nor tables will be explicitly set to a character set or collation.
// But all columns will be set to "latin1".
modelBuilder.HasCharSet("latin1", DelegationModes.ApplyToColumns);

This will effectively ignore any database or table level defaults and just set the charset and collation explicitly for every single column.

Do not apply any charsets and collations

If you don't want Pomelo to interfere with charset and collation handling in any way, because you want or need to handle it manually on the database side, you can use the following code to disable it altogether:

modelBuilder.HasCharSet(null, DelegationModes.ApplyToDatabases);