This is the quarterly roll-up release, containing many new features and bug fixes.

Interfaces will not be stable until 1.0.

Merged PRs

go-mysql-server

1861: chore: remove refs to deprecated io/ioutil
1860: chore: unnecessary use of fmt.Sprintf
1859: chore: use copy(to, from) instead of a loop
1856: Support IPV6 loopback address for looking up user credentials
Map "::1" and "127.0.0.1" to localhost when looking up users.
There don't appear to be tests for this code path. TBD if I'll add some.
Related to: dolthub/dolt#6239
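As a rough illustration of the mapping described above, here is a minimal sketch. The function name `normalizeLoopbackHost` is hypothetical and is not taken from the go-mysql-server code; it only shows the idea of treating both loopback addresses as localhost before a user lookup.

```go
package main

import "fmt"

// normalizeLoopbackHost is a hypothetical helper (not the GMS implementation).
// It maps the IPv6 and IPv4 loopback addresses to "localhost" and leaves
// every other host untouched.
func normalizeLoopbackHost(host string) string {
	switch host {
	case "::1", "127.0.0.1":
		return "localhost"
	default:
		return host
	}
}

func main() {
	for _, h := range []string{"::1", "127.0.0.1", "10.0.0.5"} {
		fmt.Printf("%s -> %s\n", h, normalizeLoopbackHost(h))
	}
}
```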
1854: Prevent loops in stored procedures from returning multiple result sets
The query in dolthub/dolt#6230 was causing rows from many result sets to be returned from a stored procedure. We already have code that limits BEGIN/END blocks to return the last SELECTed result set; this PR extends that logic to loop constructs as well.
Fixes: dolthub/dolt#6230
Dolt CI Checks: dolthub/dolt#6245
1853: chore: slice replace loop
1852: Alter stored procedure execution to deal with statements that commit transactions
This change adds checks to begin a new transaction whenever there isn't one during stored procedure execution. This lets things like dolt_commit() execute correctly in stored procedures.

1851: memo.Literal has different type than lookup
This panics on dolt:

CREATE TABLE tab2(pk INTEGER PRIMARY KEY, col0 INTEGER, col1 FLOAT, col2 TEXT, col3 INTEGER, col4 FLOAT, col5 TEXT);
CREATE UNIQUE INDEX idx_tab2_0 ON tab2 (col1 DESC,col4 DESC);
CREATE INDEX idx_tab2_1 ON tab2 (col1,col0);
CREATE INDEX idx_tab2_2 ON tab2 (col4,col0);
CREATE INDEX idx_tab2_3 ON tab2 (col3 DESC);
INSERT INTO tab2 VALUES(0,344,171.98,'nwowg',833,149.54,'wjiif');
INSERT INTO tab2 VALUES(1,353,589.18,'femmh',44,621.85,'qedct');
SELECT pk FROM tab2 WHERE ((((((col0 IN (SELECT col3 FROM tab2 WHERE ((col1 = 672.71)) AND col4 IN (SELECT col1 FROM tab2 WHERE ((col4 > 169.88 OR col0 > 939 AND ((col3 > 578))))) AND col0 >= 377) AND col4 >= 817.87 AND (col4 > 597.59)) OR col4 >= 434.59 AND ((col4 < 158.43)))))) AND col0 < 303) OR ((col0 > 549)) AND (col4 BETWEEN 816.92 AND 983.96) OR (col3 BETWEEN 421 AND 96);

The PutField function expects the value to match the tuple descriptor exactly, and will panic if it does not.
The section of code in memo that creates a new range uses the type from the expression, but in other places it uses the index column expression types.
An alternative solution would be to have some logic in dolt to convert to the corresponding sql.Type based off the val.Enc

1848: IntDiv.Type() should always return either uint64 or int64
Previously, our IntDiv.convertLeftRight() used IntDiv.Type() to determine the larger type between IntDiv.Left.Type() and IntDiv.Right.Type() to avoid precision loss when doing internal calculations. Now, that logic is moved from IntDiv.Type() to IntDiv.convertLeftRight(), and IntDiv.Type() can only return uint64 or int64.
This should fix the sql correctness regression from #1834
1847: Fix TargetSchema.Resolved() to check targetSchema column default expressions
A couple SchemaTarget implementations weren't checking if the targetSchema was resolved as part of the Resolved() method. Added tests, audited the other implementations, and simplified the logic to use a new method on Schema to check that column default expressions are resolved.
Fixes: dolthub/dolt#6206
Dolt CI Run: dolthub/dolt#6213
1846: update information_schema.processlist to correctly display status of processes and databases
We used to hardcode "Query", now we reference process.Command
Additionally, we now get the database from the current session and use that variable.
fix for: dolthub/dolt#6023
1844: fix panic for group by binary type
We made a bad type assertion for sql.StringType.
Additionally, this fixes an issue where UnaryExpressions with GetFields would incorrectly throw a functional dependency error with ONLY_FULL_GROUP_BY enabled.
Fix for second part of: dolthub/dolt#6179
1843: Improvements to CAST and CONVERT functions
This PR adds support for casting/converting to FLOAT and DOUBLE types with the CAST and CONVERT functions. It also adds support for length (aka precision) and scale type constraints (e.g. CAST(1.2345 AS DECIMAL(3,2))).
Parser support for DOUBLE and FLOAT with CAST and CONVERT: dolthub/vitess#249
Fixes: dolthub/dolt#5835
1841: adding version and version_comment values
@@version now returns 8.0.11
@@version_comment now returns "Dolt"; in mysql, this appears to be dependent on OS / method of install
- Some people get MySQL Community Server - GPL
- Others get Homebrew
  Fix for first part of: dolthub/dolt#6179
1840: deduplicate (hash) intuple for and queries
The original fix, #1677, was supposed to address this, but AND statements weren't covered.
fix for: dolthub/dolt#6189
1839: Slow degenerate semi join, hoist select opt
This enables recursive subquery decorrelations, and adds a hash join execution option for semi joins that is equivalent to cached subquery existence checks.
1838: resolve aliases in subqueries in function arguments
The rule reorderProjection also replaces subqueries with getfields in projections when they are used by subqueries, but it did not check for function expressions.
This meant that aliases in subqueries used as arguments to functions threw an "x" could not be found error.
This PR just has the section of reorderProjection that is supposed to find deferredColumns also look at the arguments of functions recursively (because we can nest functions).
Additionally, there was another schema type bug:
```
tmp> select 0 as foo, if((select foo), 123, 456);
+-----+----------------------------+
| foo | if((select foo), 123, 456) |
+-----+----------------------------+
| 0   | 127                        |
+-----+----------------------------+
1 row in set (0.00 sec)
```
MySQL returns an Integer type for this if statement; if either argument is a String, it always returns a String.
fix for: dolthub/dolt#6174
1836: update cached table count in prepared statements
Prepared statements were caching table counts. We need to update the table count when finalizing prepared statements to bring table count up to date with any intermediate edits.
1834: fix expected schema for sum(literal)
The code path we take when printing rows to the shell is different from the one we take when spooling from the server.
In the sql case, we ignore the schema we get from analysis.
In the server case, we actually read the schema, and ensure that the rows are of that type.
When doing sum(literal), we use the type of the literal. In this issue, the literal was 1, so an INT8, which caps out at 127.
sum() is always supposed to return a float64, so I made a change to do that.
I checked by starting mysql with the --column-type-info option, and it does appear that any column coming from sum() has a DECIMAL type.
Fix for: dolthub/dolt#6120
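To make the 127 cap concrete, here is a small sketch. Both functions are hypothetical and are not GMS code; `clampToInt8` only assumes that a result forced into an INT8-typed schema column saturates at the type's maximum, which is how the note above describes the symptom.

```go
package main

import (
	"fmt"
	"math"
)

// sumLiteral accumulates the same literal once per row, using float64 as the
// accumulator type (the type the note above says sum() should return).
func sumLiteral(literal float64, rows int) float64 {
	total := 0.0
	for i := 0; i < rows; i++ {
		total += literal
	}
	return total
}

// clampToInt8 is a hypothetical stand-in for coercing a value into a result
// column whose schema type was inferred as INT8 from the literal.
func clampToInt8(v float64) int8 {
	if v > math.MaxInt8 {
		return math.MaxInt8
	}
	if v < math.MinInt8 {
		return math.MinInt8
	}
	return int8(v)
}

func main() {
	total := sumLiteral(1, 200)
	fmt.Println("float64 sum:", total)
	fmt.Println("forced into INT8:", clampToInt8(total))
}
```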
1830: Use SO_REUSEADDR and SO_REUSEPORT options when creating the sql server on Unix
This prevents a transient error we've been seeing where the server sometimes fails to start, and the OS claims port already in use, even though we've already confirmed that the port is not in use prior to running dolt sql-server.
1829: plan.TableCountLookup short circuits count()
In many cases it is unnecessary to read an entire table to report count(*). We can use the RowCount() interface to jump to the answer.
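The short circuit can be pictured with the sketch below. The `table` and `rowCounter` interfaces and both table types are hypothetical stand-ins; the actual RowCount() signature in go-mysql-server is not reproduced here.

```go
package main

import "fmt"

// table is a hypothetical minimal table that can be fully scanned.
type table interface {
	Rows() []int
}

// rowCounter is a hypothetical optional interface for tables that already
// know their row count without a scan.
type rowCounter interface {
	RowCount() (uint64, error)
}

type sliceTable struct{ rows []int }

func (s sliceTable) Rows() []int { return s.rows }

type countedTable struct {
	sliceTable
	count uint64
}

func (c countedTable) RowCount() (uint64, error) { return c.count, nil }

// countStar answers count(*) from RowCount() when available and falls back
// to reading every row otherwise.
func countStar(t table) (uint64, error) {
	if rc, ok := t.(rowCounter); ok {
		return rc.RowCount()
	}
	var n uint64
	for range t.Rows() {
		n++
	}
	return n, nil
}

func main() {
	scanned, _ := countStar(sliceTable{rows: []int{1, 2, 3}})
	jumped, _ := countStar(countedTable{count: 42})
	fmt.Println(scanned, jumped)
}
```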

1828: Consolidated collation maps
Main file to check is the generate/main.go file. After running the updated generation program, these are the consolidated files:

common_utf8mb4_es_0900_ai_ci_Weights: [utf8mb4_es_0900_ai_ci_Weights, utf8mb4_es_trad_0900_ai_ci_Weights]
common_utf8mb4_es_0900_as_cs_Weights: [utf8mb4_es_0900_as_cs_Weights, utf8mb4_es_trad_0900_as_cs_Weights]
common_utf_croatian_ci_Weights: [utf16_croatian_ci_Weights, utf32_croatian_ci_Weights, utf8mb3_croatian_ci_Weights, utf8mb4_croatian_ci_Weights]
common_utf_czech_ci_Weights: [utf16_czech_ci_Weights, utf32_czech_ci_Weights, utf8mb3_czech_ci_Weights, utf8mb4_czech_ci_Weights]
common_utf_danish_ci_Weights: [utf16_danish_ci_Weights, utf32_danish_ci_Weights, utf8mb3_danish_ci_Weights, utf8mb4_danish_ci_Weights]
common_utf_esperanto_ci_Weights: [utf16_esperanto_ci_Weights, utf32_esperanto_ci_Weights, utf8mb3_esperanto_ci_Weights, utf8mb4_esperanto_ci_Weights]
common_utf_estonian_ci_Weights: [utf16_estonian_ci_Weights, utf32_estonian_ci_Weights, utf8mb3_estonian_ci_Weights, utf8mb4_estonian_ci_Weights]
common_utf_german2_ci_Weights: [utf16_german2_ci_Weights, utf32_german2_ci_Weights, utf8mb3_german2_ci_Weights, utf8mb4_german2_ci_Weights]
common_utf_hungarian_ci_Weights: [utf16_hungarian_ci_Weights, utf32_hungarian_ci_Weights, utf8mb3_hungarian_ci_Weights, utf8mb4_hungarian_ci_Weights]
common_utf_icelandic_ci_Weights: [utf16_icelandic_ci_Weights, utf32_icelandic_ci_Weights, utf8mb3_icelandic_ci_Weights, utf8mb4_icelandic_ci_Weights]
common_utf_latvian_ci_Weights: [utf16_latvian_ci_Weights, utf32_latvian_ci_Weights, utf8mb3_latvian_ci_Weights, utf8mb4_latvian_ci_Weights]
common_utf_lithuanian_ci_Weights: [utf16_lithuanian_ci_Weights, utf32_lithuanian_ci_Weights, utf8mb3_lithuanian_ci_Weights, utf8mb4_lithuanian_ci_Weights]
common_utf_persian_ci_Weights: [utf16_persian_ci_Weights, utf32_persian_ci_Weights, utf8mb3_persian_ci_Weights, utf8mb4_persian_ci_Weights]
common_utf_polish_ci_Weights: [utf16_polish_ci_Weights, utf32_polish_ci_Weights, utf8mb3_polish_ci_Weights, utf8mb4_polish_ci_Weights]
common_utf_roman_ci_Weights: [utf16_roman_ci_Weights, utf32_roman_ci_Weights, utf8mb3_roman_ci_Weights, utf8mb4_roman_ci_Weights]
common_utf_romanian_ci_Weights: [utf16_romanian_ci_Weights, utf32_romanian_ci_Weights, utf8mb3_romanian_ci_Weights, utf8mb4_romanian_ci_Weights]
common_utf_sinhala_ci_Weights: [utf16_sinhala_ci_Weights, utf32_sinhala_ci_Weights, utf8mb3_sinhala_ci_Weights, utf8mb4_sinhala_ci_Weights]
common_utf_slovak_ci_Weights: [utf16_slovak_ci_Weights, utf32_slovak_ci_Weights, utf8mb3_slovak_ci_Weights, utf8mb4_slovak_ci_Weights]
common_utf_slovenian_ci_Weights: [utf16_slovenian_ci_Weights, utf32_slovenian_ci_Weights, utf8mb3_slovenian_ci_Weights, utf8mb4_slovenian_ci_Weights]
common_utf_spanish2_ci_Weights: [utf16_spanish2_ci_Weights, utf16_spanish_ci_Weights, utf32_spanish2_ci_Weights, utf32_spanish_ci_Weights, utf8mb3_spanish2_ci_Weights, utf8mb3_spanish_ci_Weights, utf8mb4_spanish2_ci_Weights, utf8mb4_spanish_ci_Weights]
common_utf_swedish_ci_Weights: [utf16_swedish_ci_Weights, utf32_swedish_ci_Weights, utf8mb3_swedish_ci_Weights, utf8mb4_swedish_ci_Weights]
common_utf_turkish_ci_Weights: [utf16_turkish_ci_Weights, utf32_turkish_ci_Weights, utf8mb3_turkish_ci_Weights, utf8mb4_turkish_ci_Weights]
common_utf_unicode_520_ci_Weights: [utf16_unicode_520_ci_Weights, utf32_unicode_520_ci_Weights, utf8mb4_unicode_520_ci_Weights]
common_utf_unicode_ci_Weights: [utf16_unicode_ci_Weights, utf32_unicode_ci_Weights, utf8mb3_unicode_ci_Weights, utf8mb4_unicode_ci_Weights]
common_utf_vietnamese_ci_Weights: [utf16_vietnamese_ci_Weights, utf32_vietnamese_ci_Weights, utf8mb3_vietnamese_ci_Weights, utf8mb4_vietnamese_ci_Weights]

1827: Remove db-specific transaction interfaces / logic
1825: implement create spatial ref sys
This implements the create spatial reference system ..., which lets users add custom SRID to the information schema.
MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/create-spatial-reference-system.html
MySQL is much more restrictive when it comes to what is a valid DEFINITION for an entry in this table, and the rules are unclear, so we are much more permissive for now.
Additionally, this information persists in MySQL between server restarts, which we do not do. However, MySQL does throw a warning stating that updating may discard any changes the user makes.
Lastly, the values persist between test runs, and we don't support deleting from information_schema, so some tests are modified.
fix for: dolthub/dolt#6002
1823: Trim spaces and empty statements to the right in planbuilder.Parse
1822: join filter closure and constant join lookups
This PR adds a set of join planning improvements.
1. Table aliases can accept multi column indexes
  We have never been able to choose a multi-expression range scan through table aliases.
  Before:
```
tmp2> explain select * from t alias where a = 1 and b = 1 and c = 1;
+-----------------------------------------------------------+
| plan                                                      |
+-----------------------------------------------------------+
| Filter                                                    |
|  ├─ (((alias.a = 1) AND (alias.b = 1)) AND (alias.c = 1)) |
|  └─ TableAlias(alias)                                     |
|      └─ IndexedTableAccess(t)                             |
|          ├─ index: [t.a]                                  |
|          ├─ filters: [{[1, 1]}]                           |
|          └─ columns: [a b c]                              |
+-----------------------------------------------------------+
```
After:
```
tmp2> explain select * from t alias where a = 1 and b = 1 and c = 1;
+-----------------------------------------------------------+
| plan                                                      |
+-----------------------------------------------------------+
| Filter                                                    |
|  ├─ (((alias.a = 1) AND (alias.b = 1)) AND (alias.c = 1)) |
|  └─ TableAlias(alias)                                     |
|      └─ IndexedTableAccess(t)                             |
|          ├─ index: [t.a,t.b,t.c]                          |
|          ├─ filters: [{[1, 1], [1, 1], [1, 1]}]           |
|          └─ columns: [a b c]                              |
+-----------------------------------------------------------+
```
This has silently been impacting join performance in particular, where table aliases are more common. This is a small change but I'd expect this to have a broad positive impact for customers.
2. Join equivalence closure
A join like select * from xy join uv on x = u join ab on u = a has two initial join edges, x = u and u = a. Those edges create expression groupings xy x uv, uv x ab, xy x uv x ab. This misses a transitive edge, x = a, with a corresponding join group xy x ab. We should generate plans for most transitive edges now (transitive edges in apply joins are harder).
For joins with many tables this will unlock many potential join paths.
3. Use functional dependencies to find more lookup and merge joins
We can use constants and aggregated equivalency sets (equal filters) to be more aggressive with lookup join selection. Previously we only searched the current join ON equal conditions for expressions that match an index prefix for a lookup join, but constants are also valid lookup keys.
Refer to dolthub/dolt#5993 and dolthub/dolt#3797 for in-depth examples.
4. Use functional dependencies to do better lookup join costing
Even though we do not have index statistics, we can still use functional dependencies on indexes to detect whether a lookup will have MAX_1_ROW. Two examples where we can detect MAX_1_ROW: our lookup index is the primary key, and our lookup key provides a constant or equals expression for every pk column; our lookup index is unique, our lookup key has constants or equal expressions for every column, and we can prove that every key expression is non-nullable.
MAX_1_ROW lookups are a rare binary condition; most of the time selectivity is in the continuous range 0-1. When they do occur, they are usually the most efficient access pattern. Many of the test changes from HASH_JOIN or MERGE_JOIN to LOOKUP_JOIN are a result of this improvement. The issues linked above in (3) have practical examples.
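The join equivalence closure from item 2 can be sketched with a small union-find over column names. This is only an illustration under the assumption that columns are identified by plain strings; the `equivSets` type is hypothetical and does not mirror the memo's actual data structures.

```go
package main

import "fmt"

// equivSets is a hypothetical union-find over column names. Columns joined
// by equality edges end up in the same set, which exposes transitive edges.
type equivSets struct {
	parent map[string]string
}

func newEquivSets() *equivSets {
	return &equivSets{parent: make(map[string]string)}
}

// find returns the representative column for c, registering c if unseen.
func (e *equivSets) find(c string) string {
	if _, ok := e.parent[c]; !ok {
		e.parent[c] = c
	}
	for e.parent[c] != c {
		c = e.parent[c]
	}
	return c
}

// addEdge records an equality such as x = u.
func (e *equivSets) addEdge(a, b string) {
	ra, rb := e.find(a), e.find(b)
	if ra != rb {
		e.parent[ra] = rb
	}
}

// equal reports whether a = b follows from the recorded edges.
func (e *equivSets) equal(a, b string) bool {
	return e.find(a) == e.find(b)
}

func main() {
	s := newEquivSets()
	s.addEdge("x", "u") // xy join uv on x = u
	s.addEdge("u", "a") // join ab on u = a
	fmt.Println("x = a implied:", s.equal("x", "a"))
	fmt.Println("x = y implied:", s.equal("x", "y"))
}
```

With the two edges from the example query, the sketch reports the transitive edge x = a, which is the edge that allows a join group xy x ab to be considered.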
1821: Bug fix: The result schema of SELECT INTO should always be an empty schema
SELECT INTO currently returns its child node's schema as its result schema, but it doesn't actually return row data in that schema. This causes a problem over a SQL connection when clients see a result schema and then see row data that doesn't match that schema. This causes clients to freak out and close the connection from their side. Since SELECT INTO always sends its results to a file or SQL vars (and NOT over the SQL connection), its result schema should always be the empty schema.
Fixes: dolthub/dolt#6105
1819: Lazy load large character set encoding weight maps.
Improves dolt startup times substantially.
1818: Throw correct error for non-existent qualified column on table function
Fix for: dolthub/dolt#6101
1817: Ignore FULLTEXT in CREATE TABLE
This change allows us to ignore any FULLTEXT keys when using CREATE TABLE. This should unblock customers who just need their statements to parse, but don't actually need the functionality of FULLTEXT. We still error when trying to manually add a FULLTEXT index using ALTER or CREATE INDEX. Since this isn't really "correct" behavior, I did not add any tests.
1816: Support TableFunction aliasing
Added a string field to expression.UnresolvedTableFunction so an alias can be specified.
Removed the rule disambiguate_table_functions; TableFunctions now default to using the function name as the table name when an alias isn't provided.
As a result, we now throw an ambiguous table name error on
```
select * from dolt_diff(...) join dolt_diff(...);
```
which is also what happens with
```
select * from t join t;
```
Aliases are required for this to execute correctly
```
select * from dolt_diff(...) a join dolt_diff(...) b;
```
Companion PR: dolthub/vitess#244
Fix for: dolthub/dolt#5928
1814: Add filters to memo
Scalar expressions added to memo along with scalar properties, expression ids, filter closures. Goal here is equivalent behavior to before, just with filters represented differently. Filter organization mostly mirrors the plan package, except scalar and relational expressions are both represented as expression groups here. Done in a rush, still back and forth on whether there should be an interface there.
Additionally:
- scalar expressions added to memo along with scalar properties, expression id
- rewrites join planning and costing to use bitset representations of filters
- refactors codegen so definition files are yaml, source is compiled independently from target code
  The organization is a bit wonky b/c this should be using my name resolution symbol tables, and the entire tree should be memoized not just the join tree (used temporary solutions for the problems created by both of these).
Re: dolthub/dolt#5993
1812: Support load data ignore/replace
Vitess PR: dolthub/vitess#243
Here are the docs for load data with ignore/replace modifiers: https://dev.mysql.com/doc/refman/8.0/en/load-data.html#load-data-error-handling
This also changes LOCAL to have the same effect as IGNORE to match mysql
1811: Hoist subquery filters bug
Hoist filters is supposed to move filters that do not reference tables in the current scope upwards. We did not descend subqueries when checking for that condition, mistakenly hoisting filters in some cases.
Re: dolthub/dolt#6089
1810: Added many collations and character sets
Added a lot of missing collations and character sets. I had to manually type in all of the new entries in charactersets.go and collations.go, so it's possible I made a mistake. I double checked them all, but I still might have overlooked one.
1809: Add support for JSON -> and ->> operators
MySQL column->path Documentation
MySQL column->>path Documentation
Fixes: dolthub/dolt#5662
1808: escape special characters in strings
1807: add support for more JSON_TABLE() functionality
This PR adds support for:
- FOR ORDINALITY columns, which is just an auto increment
- DEFAULT <value> ON ERROR/EMPTY , which fills in values when encountering either an error or a missing value
- when this isn't specified, NULL is used
- ERROR <value> on ERROR/EMPTY, which throws an error when encountering either an error or a missing value
- when this isn't specified, we ignore errors and fill in values with NULL
- NESTED columns, which is a way to extract data from objects within objects and so on
- when there are multiple NESTED columns, they are "sibling" nested, they take turns being NULL
  Note: there is a skipped test highlighting a bug in either our jsonpath implementation or it's something here...
  Companion PR: dolthub/vitess#240
  MySQL docs: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html
1806: improve conversion from JsonDocument to string
1804: Reworked implicit PK handling for referenced foreign keys
Builds on #1798, introduces an interface which exposes internal (and implicit) primary keys on indexes. Not all integrators will have the functionality, so it's an optional interface to expand compatibility. Also quite a bit simpler.
1803: Added "utf8mb3_czech_ci" collation, fixed missing collation check for enum/set
Fixes #1801
Adds the requested collation, and fixes the panic. The panic came from an oversight when checking for a collation's implementation. enum and set use the collation during type creation, which occurs before we've verified the collation's implementation. The other string types do not use the collation during type creation, so we return the appropriate error as a result.
1802: Update README.md to sync with _example/main.go
1799: show all indexes, and prevent creating indexes named 'primary'
We had some logic in SHOW CREATE TABLE to prevent PRIMARY KEYs from showing up twice, because they are within the table's IndexCollection. This logic relied on checking if all the columns in the index were part of the Primary Key. MySQL allows and shows SECONDARY INDEXES that are identical to the primary key. It appears to differentiate them by naming the PK index "PRIMARY". Additionally, MySQL prevents users from creating SECONDARY INDEXES named "primary".
MySQL names PK index Primary
fix for: dolthub/dolt#6049
1798: include primary key in index mapping for foreign keys
MySQL does some behind the scenes magic and appends the primary key columns to secondary indexes.
This creates non-obvious prefixes on secondary indexes for foreign keys.
A plus is that we save on creating secondary indexes when we don't need to.
Note: it appears this special prefix matching applies when looking for a secondary index on the referenced table, but not on the child table itself.
Fix for: dolthub/dolt#6038
1797: Add filters to memo
Scalar expressions added to memo along with scalar properties, expression ids, filter closures. Goal here is equivalent behavior to before, just with filters represented differently. Filter organization mostly mirrors the plan package, except scalar and relational expressions are both represented as expression groups here. Done in a rush, still back and forth on whether there should be an interface there.
Additionally:
- scalar expressions added to memo along with scalar properties, expression id
- rewrites join planning and costing to use bitset representations of filters
- refactors codegen so definition files are yaml, source is compiled independently from target code
  The organization is a bit wonky b/c this should be using my name resolution symbol tables, and the entire tree should be memoized not just the join tree (used temporary solutions for the problems created by both of these).
  Re: dolthub/dolt#5993
1796: Added a method to SystemVariable to let them be compared using their underlying types, used this to fix least / greatest not working with system vars
Fixes dolthub/dolt#6022
1795: qualify json_table columns
Likely due to improved aliasing code, it's simple to qualify columns for json_table.
This PR
1. adds skipped tests for currently unsupported functionality for json_tables
- for ordinality
- nested paths
- default
- error
2. adds prepared tests for existing json table script and query tests
3. reorganizes json tests
  MySQL docs for missing functionality: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html
  Fix for: dolthub/dolt#6004
1791: Functional dependencies
Functional dependencies track key uniqueness, constant propagation, column equivalence, and nullability sets.
This information is built bottom-up from table scans through projections, and is used to answer certain questions about relational nodes:
1. What is the equivalence closure for a join condition?
2. Are a set of filters redundant?
3. Do a set of index expressions comprise a strict key for a LOOKUP_JOIN?
4. Does a subquery decorrelation scope have a strict/null-safe key for an ANTI_JOIN?
5. Are the grouping columns a strict key of the table (only_full_group_by is unnecessary)
6. Is the relation sorted on a given column set? (is a Sort already enforced)
7. Is a relation constant? (Max1Row)
  Questions (1) and (3) contribute towards fixing this issue: dolthub/dolt#5993. Question (2) contributes to filter pruning. Question (4) is relevant for this issue: dolthub/dolt#5954.
  This master's thesis explains how to build the derivation graph starting at page 113: https://cs.uwaterloo.ca/research/tr/2000/11/CS-2000-11.thesis.pdf. The graph is composed of (determinant) -> (dependent) relationships on columns to track these properties. They color edges and nodes to differentiate constant, nullability, equivalence attributes. Any set of columns uniquely determines the value of constants, so they have empty determinants: () -> (colSet). We differentiate strict keys (set of columns unique and non-nullable index) from lax keys (index that may be non-unique or nullable).
  Cockroach implemented a version that uses flattened to/from sets rather than individual nodes for determinant/dependents, and makes optimizations for quickly computing candidate keys: https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/props/func_dep.go.
  My encoding is a little different. First, I assume that attributes trickle down from nullability -> constant -> equivalence -> functional dependencies. An FD built in this order simplifies the upstream additions in a way that avoids having to recompute dependency closures (ex: nullability, constant, and equiv columns don't recompute keys). Second, I assume FDs will be limited to primary and secondary key indexes; keys will have either strict or lax determinants, and the dependents are always assumed to be the rest of the table. So far this drops LEFT_JOIN right-equivalence relations that translate to lax-keys after the join, which could opportunistically be converted back to strict keys by downstream operators. If this was a mistake we can undo that, add back in dependent column sets.
  We need to support a handful of operators to use FDs in the join memo:
- Table scan
- Cross join
- Inner join
- Left join
- Project (Distinct)
- Filter
  Missing:
- Full outer join
- Synthesized columns
  Additionally:
- the memo needs to embed equal filters in a format with expression ids
- join reordering should compute equivalence closures for join edges
- join selection should use functional dependencies to check if lookup expressions are valid
  Missing practical considerations:
- when we determine a lookup expression comprises a strict key for a table, we need a way to backfill constants and equivalences used to make that decision
- filters should maybe be represented in memo selection nodes to support redundancy elimination
1790: Reverting @@system_time_zone support for returning system timezone
I'm not sure yet why this change causes Dolt's MySQL Connector J integ test to fail, but after testing commits yesterday and testing this fix this morning, this is definitely causing some problem. In CI, but not locally, the MySQL Connector J integ test is failing with a generic/empty SQLException each time, immediately after executing the first query. Dolt logs look fine, but there's obviously something happening that the MySQL Connector J can't deal with. I have a few more repro ideas that I'll follow up with, but wanted to get the GMS -> Dolt pipeline cleared up first.
Until then, this PR reverts @@system_time_zone from returning the actual timezone in use by the runtime and OS.
This change was able to get the Dolt MySQL Connector integ suite passing again: https://github.com/dolthub/dolt/actions/runs/5059419161/jobs/9081665231
1789: foreign key index match should be on prefix not subset
Turns out the logic for using an existing secondary index for foreign keys was matching on a subset instead of a prefix.
For example:
```
fks> create table parent (fk1 int, fk2 int, primary key (fk1, fk2));
fks> create table child (id int unique, fk1 int, fk2 int, primary key (fk2, fk1, id), foreign key (fk1, fk2) references parent (fk1, fk2));
```
This should produce a child table that creates a new secondary index over the columns (fk1, fk2).
But, we don't do that and instead reference some other index (not sure what), causing errors when calling dolt constraints verify --all
This PR ensures that we only pick an existing secondary index as the underlying index for a foreign key if that index is a prefix for the foreign key.
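The difference between the two matching rules can be sketched as follows. Both helpers are hypothetical and not the GMS implementation; the sketch reads "prefix" as "the foreign key columns appear, in order, at the start of the index's column list".

```go
package main

import "fmt"

// isSubset reports whether every foreign key column appears somewhere in the
// index, regardless of position. This is the too-permissive check.
func isSubset(fkCols, idxCols []string) bool {
	seen := make(map[string]bool, len(idxCols))
	for _, c := range idxCols {
		seen[c] = true
	}
	for _, c := range fkCols {
		if !seen[c] {
			return false
		}
	}
	return true
}

// isPrefix reports whether the index starts with exactly the foreign key
// columns, in the same order.
func isPrefix(fkCols, idxCols []string) bool {
	if len(fkCols) > len(idxCols) {
		return false
	}
	for i, c := range fkCols {
		if idxCols[i] != c {
			return false
		}
	}
	return true
}

func main() {
	fk := []string{"fk1", "fk2"}
	key := []string{"fk2", "fk1", "id"} // column order from the child table above
	fmt.Println(isSubset(fk, key), isPrefix(fk, key)) // prints "true false"
}
```

For the child table in the example, the subset check accepts the (fk2, fk1, id) key while the prefix check rejects it, so a new index over (fk1, fk2) would be created.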
1788: Divide pushdown
Pushdown currently performs two sets of transforms.
1. Push filters as low in the tree as possible:
```
Filter(x=1) -> Join(Tablescan(x), Tablescan(y), x=y)
=>
Join(Filter(x=1)->Tablescan(x), Tablescan(y), x=y)
```
2. Convert filters into static index scans:
```
Filter(x=1)->Tablescan(x)
=>
Indexscan(x, x in [1,1])
```
There are now two rules, pushFilters and generateIndexScans. Running pushFilters
before join planning lets join planning make better use of functional dependencies.
Also:
- Small rule reorganizations
- Small correctness changes exposed by refactor
- Remove pushdown tests
- Filter pushdown appears to work in more cases, see plan changes
- sql.FilteredTable interface is now unused, marked as deprecated but did not delete related functions
1787: Changes to USE and prepared statements
This introduces two changes to how databases are resolved:
1. USE statements now are handled by the Session with a new interface
2. Tables in prepared statements now retain a copy of their Database implementation, rather than re-resolving it by name during execution.
  Both of these changes are to support Dolt's new database name semantics.

1785: Copy MySQL information_schema.st_spatial_reference_systems table
This PR essentially copies the information_schema.st_spatial_reference_systems table from MySQL.
Small refactor to move constants and help function to types package from functions/spatial package.
Additionally, changes show create table statement to print SRIDs inside a MySQL special comment.
Example:

tmp> show create table t2;
+-------+------------------------------------------------------------------+
| Table | Create Table                                                     |
+-------+------------------------------------------------------------------+
| t2    | CREATE TABLE `t2` (                                              |
|       |   `p` point /*!80003 SRID 2000 */,                               |
|       |   `l` linestring /*!80003 SRID 2001 */                           |
|       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin |
+-------+------------------------------------------------------------------+
1 row in set (0.00 sec)

Partially Addresses this: dolthub/dolt#5973

1781: Small aggregation perf improvements
- We used to make a new error object for every cache.Get() miss. Use shared object instead.
- Add reuse pool for collation weight buffers rather than making thousands of [4]byte.
1779: Add extra filters to AntiJoin to guarantee correct behavior around NULLs.
This is a correctness fix: generating AntiJoins is not currently equivalent to MySQL if either column being used in the not in expression contains NULL.
This will break a lot of regression tests. If this doesn't break Turbine's tests, we should submit this while we work on a fix.

1777: Add SRID 3857 to list of "supported" SRIDs
The ogr2ogr tool performs some information gathering queries beforehand, which change the queries it generates later on to populate MySQL tables.
When running ogr2ogr -f MySQL "MYSQL:ogr,user=root,port=3307,host=127.0.0.1" "data.gpkg", ogr2ogr sends these queries

SELECT VERSION();
SHOW TABLES;
SELECT SRS_ID FROM INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS WHERE ORGANIZATION = 'EPSG' AND ORGANIZATION_COORDSYS_ID = 3857;
SELECT SRS_ID FROM INFORMATION_SCHEMA.ST_SPATIAL_REFERENCE_SYSTEMS WHERE DEFINITION = 'PROJCS["WGS 84 / Pseudo-Mercator",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Mercator_1SP"],PARAMETER["central_meridian",0],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],EXTENSION["PROJ4","+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs"],AUTHORITY["EPSG","3857"]]'

However, Dolt's information_schema.st_spatial_reference_systems only contains SRIDs 0 and 4326, so both queries ogr2ogr makes return an empty set.
Consequently, ogr2ogr generates the CREATE TABLE query like so

CREATE TABLE `aeropuertos` ( fid INT UNIQUE NOT NULL AUTO_INCREMENT, geom GEOMETRY NOT NULL)

When it should be

CREATE TABLE `aeropuertos` ( fid INT UNIQUE NOT NULL AUTO_INCREMENT, geom GEOMETRY NOT NULL /*!80003 SRID 3857 */)

On initial insert, this causes ogr2ogr to insert with SRID 0.
On further inserts/appends, this causes ogr2ogr to not fill in the SRID at all, leaving it as -2 (not initialized).
This PR adds SRID 3857 to Dolt's information_schema.st_spatial_reference_systems.
Now, the ogr2ogr command loudly fails with a useful error:

ERROR 1: MySQL error message:unsupported feature: unsupported SRID value Description: CREATE TABLE `aeropuertos` (    fid INT UNIQUE NOT NULL AUTO_INCREMENT,    geom GEOMETRY NOT NULL /*!80003 SRID 3857 */)
ERROR 1: Terminating translation prematurely after failed
translation of layer aeropuertos (use -skipfailures to skip errors)

Addresses these issues:

1776: Returning @@system_time_zone based on OS time zone
The @@system_time_zone global system variable is a read-only system variable that shows the timezone that the server is running in. MySQL and Golang both load the system timezone from the OS, typically using the TZ env var or /etc/timezone. This change exposes that system timezone information through the @@system_time_zone variable, which was previously hardcoded to UTC.
Because timezone settings can change while the system is running (thanks daylight savings time!), we need to check with the runtime for the system timezone whenever it is requested. I gave SystemVariable the ability to include a function that gets executed to return the sys var value and moved the uptime sys var over to that, too.
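A rough sketch of the idea of a system variable whose value is computed on every read. The `systemVar` type and its `ValueFunc` field are hypothetical simplifications, not the actual GMS `SystemVariable` definition; only `time.Now().Zone()` is a real standard-library call.

```go
package main

import (
	"fmt"
	"time"
)

// systemVar is a hypothetical, simplified system variable definition.
type systemVar struct {
	Name    string
	Default any
	// ValueFunc, when set, is called on every read instead of using Default.
	ValueFunc func() (any, error)
}

// Value returns the computed value when ValueFunc is set, else the default.
func (v systemVar) Value() (any, error) {
	if v.ValueFunc != nil {
		return v.ValueFunc()
	}
	return v.Default, nil
}

// systemTimeZone asks the Go runtime for the zone abbreviation in effect
// right now, so a daylight saving transition is reflected on the next read.
func systemTimeZone() (any, error) {
	name, _ := time.Now().Zone()
	return name, nil
}

func main() {
	tz := systemVar{Name: "system_time_zone", Default: "UTC", ValueFunc: systemTimeZone}
	v, err := tz.Value()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("@@system_time_zone =", v)
}
```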
1775: Error for out of range SRIDs
This PR fixes an error inconsistency between Dolt and MySQL.
For SRIDs that are outside the range [0, MAX_UINT32], MySQL throws ERROR 1690: SRID value is out of range in '<func_name>'.
For SRIDs that are in range but don't exist, MySQL throws ERROR 3548: There's no spatial reference system with SRID <invalid srid>.
Dolt used to throw only ERROR 3548, and would incorrectly report negative numbers as overflowed positive uint32s.
partial fix for: dolthub/dolt#5948
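The two-step check described above can be sketched as follows. This is an illustration under assumptions, not the GMS code: `validateSRID` and `knownSRIDs` are hypothetical names, the set of known SRIDs is made up for the example, and the messages are modeled on the MySQL errors quoted above.

```go
package main

import (
	"fmt"
	"math"
)

// knownSRIDs is a stand-in for the spatial reference systems the server
// knows about (assumed contents, for illustration only).
var knownSRIDs = map[uint32]bool{0: true, 4326: true}

// validateSRID checks the range first, so that a negative or oversized
// value is never truncated to a uint32 before being reported.
func validateSRID(srid int64, funcName string) error {
	if srid < 0 || srid > math.MaxUint32 {
		return fmt.Errorf("ERROR 1690: SRID value is out of range in '%s'", funcName)
	}
	if !knownSRIDs[uint32(srid)] {
		return fmt.Errorf("ERROR 3548: There's no spatial reference system with SRID %d", srid)
	}
	return nil
}

func main() {
	for _, srid := range []int64{4326, -1, 1234, math.MaxUint32 + 1} {
		fmt.Println(srid, "->", validateSRID(srid, "st_srid"))
	}
}
```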
1774: Added utf8mb3_tolower_ci
1773: Add plan option for AntiJoins to be LeftJoin
This PR adds a plan option to convert ANTI_JOINs to LEFT_JOINs with an additional filter condition.
In some cases, the LEFT_JOIN performs better than the ANTI_JOIN.
Additionally, this PR adds a new join hint for LEFT_OUTER_LOOKUP_JOINs.
Fix for: dolthub/dolt#5825
1770: Union offset+limit bug
Unions were dropping limit, applying offset as limit.
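As a small illustration of the intended behavior versus the bug, the sketch below applies OFFSET and LIMIT to a slice of rows. The function names are hypothetical and the rows are plain integers for brevity; this is not the GMS union code.

```go
package main

import "fmt"

// applyOffsetLimit skips offset rows, then returns at most limit rows.
// Both arguments are assumed to be non-negative.
func applyOffsetLimit(rows []int, offset, limit int) []int {
	if offset > len(rows) {
		offset = len(rows)
	}
	rows = rows[offset:]
	if limit < len(rows) {
		rows = rows[:limit]
	}
	return rows
}

// offsetAsLimit models the bug described above: the limit is dropped and
// the offset is used as the row cap instead.
func offsetAsLimit(rows []int, offset int) []int {
	if offset > len(rows) {
		offset = len(rows)
	}
	return rows[:offset]
}

func main() {
	union := []int{1, 2, 3, 4, 5, 6} // rows produced by a UNION

	// ... LIMIT 3 OFFSET 2
	fmt.Println(applyOffsetLimit(union, 2, 3)) // [3 4 5]
	fmt.Println(offsetAsLimit(union, 2))       // [1 2]
}
```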
1768: CREATE LIKE statements should preserve check-constraints on the new table.
1767: Timezone improvements
This PR starts handling some timezone differences between GMS and MySQL:
- The NOW() and CURTIME() functions now respect the session's current time zone
- Allows CONVERT_TZ to mix use of timezone names (e.g. UTC) and timezone offset strings (e.g. +00:00)
- Adds some skipped tests for timestamp input/output session timezone conversion
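One way to accept both zone names and offset strings in a CONVERT_TZ-style function is sketched below. This is an assumption-laden illustration rather than the GMS implementation: `parseZone` and `convertTZ` are hypothetical helpers, offsets are not range-checked, and named zones other than UTC depend on the time zone database available to the Go runtime.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

const layout = "2006-01-02 15:04:05"

// parseZone accepts either an offset such as "+05:30" or a zone name such
// as "UTC" and returns the matching location.
func parseZone(s string) (*time.Location, error) {
	if len(s) == 6 && (s[0] == '+' || s[0] == '-') && s[3] == ':' {
		hours, err := strconv.Atoi(s[1:3])
		if err != nil {
			return nil, fmt.Errorf("bad offset %q: %w", s, err)
		}
		minutes, err := strconv.Atoi(s[4:6])
		if err != nil {
			return nil, fmt.Errorf("bad offset %q: %w", s, err)
		}
		secs := hours*3600 + minutes*60
		if s[0] == '-' {
			secs = -secs
		}
		return time.FixedZone(s, secs), nil
	}
	return time.LoadLocation(s)
}

// convertTZ interprets datetime as wall-clock time in the from zone and
// returns the same instant formatted in the to zone.
func convertTZ(datetime, from, to string) (string, error) {
	src, err := parseZone(from)
	if err != nil {
		return "", err
	}
	dst, err := parseZone(to)
	if err != nil {
		return "", err
	}
	t, err := time.ParseInLocation(layout, datetime, src)
	if err != nil {
		return "", err
	}
	return t.In(dst).Format(layout), nil
}

func main() {
	// Mixes a zone name with an offset string.
	out, err := convertTZ("2023-06-01 12:00:00", "UTC", "+05:30")
	fmt.Println(out, err)
}
```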
1763: prevent large varbinary column from being created
We didn't enforce the column size limit for varbinary, which caused panics.
fix for: dolthub/dolt#5059
1762: Add support for ALTER TABLE <table> MODIFY COLUMN <col> <type> UNIQUE
This MySQL syntax is effectively syntactic sugar for
```
ALTER TABLE <table> MODIFY COLUMN <col> <type>;
ALTER TABLE <table> ADD UNIQUE INDEX `<col>` (<col>);
```
1760: sql/core.go: SystemVariable: Add a NotifyChanged field which can be used to be notified when the variable value changes.
1759: server/server.go: Log a message when we start accepting connections.
1756: Allow all built-in functions to be used in column default value expressions
We currently maintain an allow list of functions that can be used in column default value expressions. MySQL has changed what they support in column default value expressions over time and the current support allows all built-in functions to be used:

Literals, built-in functions (both deterministic and nondeterministic), and operators are permitted.

This PR removes our allow list so that any function registered as a built-in function can be used in a default expression. This has a couple of implications, which I think is overall a good tradeoff:
- we don't have to maintain the allow list anymore (nice, but not a huge deal, since this hasn't been updated often)
- if customers typo a function name in a column default expression, they will get a better error message (this is what motivated this change; for more details see: dolthub/dolt#5887)
- any Dolt functions that we register as built-ins will be eligible for column default value expressions. I think this makes sense and might be useful for some customers (e.g. a column with a default value that inserts active_branch() to record the branch a row was created on). Relatedly, any functions that other GMS integrators register will also be eligible for use in column default value expressions.
  My only concern with this change is that we don't currently distinguish between built-in functions and customer functions, which should not be allowed in column default value expressions. However, that seems like a moot point right now: we don't support CREATE FUNCTION in GMS, so there doesn't seem to be a way to trigger it. I also added a TODO to the code noting that once we support user-created functions, the resolve-column-defaults code needs to check for them.
  Fixes dolthub/dolt#5887
1754: support ALTER EVENT statement
Adds support for ALTER EVENT statements, except for moving events across databases using the RENAME TO clause.
Depends on dolthub/vitess#233
1751: Unions finalize subquery expressions
The SUBQUERY -> UNION -> SUBQUERY_EXPR pattern was subject to an edge case where the outer SUBQUERY disabled recursively finalizing the inner SUBQUERY_EXPR. The intervening UNION node blocked the outer reach, but failed to re-enable finalizeSubquery during finalizeUnion. This PR edits the finalizeUnionSelector to explicitly re-enable finalizing nested subqueries.
This PR also touches hoistOutOfScopeFilters for the same edge case.
Issue: dolthub/dolt#5875
1747: Added latin1_spanish_ci and latin1_danish_ci collations
Fixes dolthub/dolt#5866
1746: update json_set support for edge cases
Updated the logic of json_set to support more input edge cases.
1744: Adding tests for altering keyless tables
Added tests to cover basic keyless table column alterations, to test Dolt change in dolthub/dolt#5867
1743: Optimize division with integers by using floats internally instead of decimal.Decimal
Fix for dolthub/dolt#5832
Seems unlikely to break anything in Dolt, but I went ahead and ran Dolt CI tests: dolthub/dolt#5857
1742: convert sort over pks with index
This PR fixes some issues with the implementation of the rule replaceSortPk. Namely, the rule now works with TableAliases, ColumnAliases, and SubqueryAliases.
Additionally, the rule can now be applied when sorting in descending order.
Overall, this means queries like select * from t order by pk desc limit 1 will just be a Limit over an IndexedTableAccess.
The rule is skipped for Distinct and Join nodes, but it might be possible to get it to work for those.
Plans with Distinct cause issues with IndexedTableAccess, but it definitely could work, as PKs are distinct by definition.
It is also possible to get this optimization to work with Joins, but I'm not convinced of its correctness, especially across different kinds of joins, so I decided not to implement it here.
Fix for: dolthub/dolt#5812
1741: allow renaming views with RENAME TABLE statement
This is the same PR as #1712 (it was reverted)
- allows renaming views using RENAME TABLE statement
- fails on renaming views using ALTER TABLE ... RENAME ... statement
1739: Populating the Decimal property in Field response metadata
Fixes dolthub/dolt#5834
In addition to the unit tests here, I'm also working on a Dolt PR to update the mysql connector library integration tests to test the Ruby mysql2 library.
1738: ICU Regex Implementation
To try and prevent memory leaks, I'm having the regex only work under a callback. The idea is that we'll do all of our matches under the callback, using a node placed by an analyzer rule. I think this approach should work, and it'll expand to any other functions within the regex that need to hold memory that will be freed later.
For now though, this portion works, and I have a small test showing such.
1737: Coalesce.Type() needs to handle type types.Null
Coalesce.Type() only checked whether its arguments have type nil. But NULL constants have type types.Null, and we need to check for that too.
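A simplified sketch of the check, using a hypothetical `Type` interface and `nullType` value in place of the GMS `sql.Type` and `types.Null`. Returning the null type when every argument is NULL is an assumption made for the example.

```go
package main

import "fmt"

// Type is a hypothetical stand-in for a SQL type.
type Type interface{ Name() string }

type nullT struct{}

func (nullT) Name() string { return "null" }

type intT struct{}

func (intT) Name() string { return "int" }

// nullType plays the role of types.Null: the type of a NULL constant.
var nullType Type = nullT{}

// coalesceType returns the type of the first argument whose type is neither
// missing (nil) nor the explicit null type.
func coalesceType(argTypes ...Type) Type {
	for _, t := range argTypes {
		if t == nil || t == nullType {
			continue
		}
		return t
	}
	return nullType // assumption: all-NULL arguments yield the null type
}

func main() {
	// COALESCE(NULL, 1): the NULL literal must not decide the result type.
	fmt.Println(coalesceType(nullType, intT{}).Name())
}
```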
1736: order multi-alter statements
When running multi-alter statements, MySQL reorders the alters.
This PR adds a sort function to organize these statements; the precedence order is currently:
1. RENAME COLUMN
2. DROP COLUMN
3. ADD COLUMN
4. ALTER INDEX
  Note: When this is supposed to work, it does, but when it doesn't, our error messages differ from the ones returned by MySQL. There are existing alter tests that also return the incorrect error, but I don't think this is worth fixing right now.
  I also unskipped one BrokenScriptTest and fixed the expected results for the others.
  Companion PR: dolthub/dolt#5831
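The precedence order above can be sketched with a stable sort. The `alter` and `alterKind` types are hypothetical and stand in for the real plan nodes; a stable sort is used here so that statements of the same kind keep their written order.

```go
package main

import (
	"fmt"
	"sort"
)

// alterKind values are declared in precedence order, lowest first.
type alterKind int

const (
	renameColumn alterKind = iota
	dropColumn
	addColumn
	alterIndex
)

// alter is a hypothetical stand-in for one clause of a multi-alter statement.
type alter struct {
	Kind alterKind
	SQL  string
}

// orderAlters returns a copy of alters sorted by precedence.
func orderAlters(alters []alter) []alter {
	out := append([]alter(nil), alters...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Kind < out[j].Kind })
	return out
}

func main() {
	stmts := []alter{
		{addColumn, "ADD COLUMN c int"},
		{alterIndex, "ADD INDEX idx (c)"},
		{renameColumn, "RENAME COLUMN a TO b"},
		{dropColumn, "DROP COLUMN d"},
	}
	for _, a := range orderAlters(stmts) {
		fmt.Println(a.SQL)
	}
}
```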
1735: only hash expression.UnresolvedColumn for OnDuplicateExpressions
Special logic is used to qualify columns for InsertInto.OnDuplicateExprs. Since qualify uses transform.OneNodeExprWithNode it will try to qualify all expressions (Literals, Tuples, etc). This change makes it so that we only try to qualify Columns.
fix for: dolthub/dolt#5799
1731: sql/parse: expose utility functions to convert parsed index and check constraint definitions
1728: hashjoin indexing
When we have a plan of the following format:
```
InSubquery
    ...
    CrossJoin
        Left: Table
        Right: SubqueryAlias
            OuterScopeVisibility: true
            ...
            HashJoin
                HashLookup
                    source: ...
                    target: TableAlias
                    ...
```
The indexes we assign to GetFields during analysis don't align with the indexes of the actual columns in each row during execution time. This is a result of StripNode, PrependNode, and the nested Joins with SubqueryAlias.
This error wasn't caught sooner as the incorrect indexes are too low, so they never threw IndexOutOfBounds errors and just returned potentially incorrect results instead.
The fix was to correct these indexes at analysis time.
Firstly, SubqueryAlias nodes with OuterScopeVisibility = true inside joins need to see the left sibling node (in addition to the parent nodes). So Scope was modified to include some new fields, specifically for sibling nodes. Additionally, the file finalizeSubquery was changed to track the parent as well, so we could detect when we're analyzing a SubqueryAlias on the right side of a join, and add the left child to the scope.
Additionally, pushdownFilters was modified to not undo all the changes to the Analyzer for HashLookups.
At runtime, the PrependRow nodes cache the rows outside the InSubquery, while the buildJoinIter for CrossJoin would include both the outside and the left row. This meant that some parts of the inner HashJoin would receive extra columns while others didn't. The fix here was to alter the scope.InJoin depending on which parts of HashJoin we were building.
Lastly, to have these changes not break for PreparedStatements, we just needed to not redo finalizeUnions in postPrepared, as we don't replan joins in postPrepared, so we don't know if we're in a join or not, and the correct indexes are set in prePrepared.
Along the way, we discovered a query that panics, but the cause is different than the purpose of this fix, and it panicked the same way before these changes, so it is left as a skipped test.
Fix for: dolthub/dolt#5714
1727: implement json_set function
Implements the json_set function, with a few edge cases outstanding.
fixes: dolthub/dolt#5680
1726: subquery indexing tests
1724: Fix README.md
Replace spaces with tabs in code indentation.
1722: go.mod: Use dolthub/flatbuffers/v23 instead of google/flatbuffers.
1721: go.mod: Move oliveagle/jsonpath -> dolthub/jsonpath.
1720: memory: extract rangeFilterExpr into expression package
rangeFilterExpr contains a complex set of logic to build a sql expression given the list of sql ranges and the list of expressions on the index.
Extract the majority of this function into NewRangeFilterExpr in the expression package. Replace the Or and And helper functions with JoinOr and JoinAnd.
Update the call in the memory/ package to call the new expression function.
Fix JoinOr and JoinAnd to check if the expressions are nil.
Remove ineffective nil check in JoinAnd.
Move nil checks into NewOr and NewAnd.
Add comment explaining parameters to NewRangeFilterExpr.
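A minimal sketch of the nil-handling pattern described above, where the nil checks live in the constructor and the join helper simply folds over its arguments. The `Expr`, `lit`, `and`, `newAnd`, and `joinAnd` names are hypothetical simplifications of the GMS expression package, not its actual API.

```go
package main

import "fmt"

// Expr is a hypothetical, minimal expression interface.
type Expr interface{ String() string }

type lit string

func (l lit) String() string { return string(l) }

type and struct{ left, right Expr }

func (a and) String() string {
	return "(" + a.left.String() + " AND " + a.right.String() + ")"
}

// newAnd returns the other operand when one side is nil, so callers never
// build an AND node with a missing child.
func newAnd(left, right Expr) Expr {
	if left == nil {
		return right
	}
	if right == nil {
		return left
	}
	return and{left, right}
}

// joinAnd folds the expressions with AND, skipping nil entries.
// It returns nil when there is nothing to join.
func joinAnd(exprs ...Expr) Expr {
	var out Expr
	for _, e := range exprs {
		out = newAnd(out, e)
	}
	return out
}

func main() {
	fmt.Println(joinAnd(nil, lit("a > 1"), nil, lit("b < 2")))
}
```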
1719: Added serving tray and bowtie

vitess

251: Fixed missing support for collations as strings
Fixes issue dolthub/dolt#6192
We only allowed collations to be declared after the CREATE TABLE portion in their non-string form. This adds support for the string form.
249: Various small parser improvements
Various small parser improvements:
- Allow column definitions to use the MySQL INVISIBLE keyword. The implementation still ignores the INVISIBLE keyword, but it will no longer cause a parser error.
- Support DOUBLE and FLOAT in the CAST and CONVERT functions.
- Allow a trigger body to be a single CALL statement.
248: Support for index hint in foreign key definition
247: Support FK definitions inline in column definitions
Adds support for declaring FK references inline in column definitions. Does not support ON DELETE and ON UPDATE yet. Example: ALTER TABLE t ADD COLUMN col2 int REFERENCES other_table(id);
Also cleaned up a few rules around non-reserved keywords to enable event to be used unquoted in ALTER TABLE statements.
246: support CREATE SPATIAL REFERENCE SYSTEM ... syntax
Syntax for: dolthub/dolt#6002
245: allow event as table and column name
The PR allows EVENT non-reserved keyword to be used as table and column name without quoting.
The remaining unsupported edge case is using EVENT as a user name or host name.
244: parse table_functions with aliases
Syntax support for: dolthub/dolt#5928
243: Add ignore/replace modifiers to load data
242: allow EVENTS to be parsed as non-reserved keyword
Moved the EVENTS keyword into the non_reserved_keyword list, allowing statements that use the information_schema.events table to parse.
For some reason, EVENT cannot be moved into non_reserved_keyword without causing shift/reduce and reduce/reduce conflicts.
241: Walking sub-nodes for SHOW TABLE statements
When preparing a SHOW TABLES statement with a bound variable in the filter clause (e.g. SHOW TABLES FROM mydb WHERE Tables_in_mydb = ?;) GMS and Vitess were identifying the bound variable parameters differently and causing the SQL client on the other end to panic. Vitess code in conn.go walks the parsed tree and looks for SQLVal instances to identify the parameters and then returns that metadata over the SQL connection. The SHOW TABLES statement above fails because the sqlparser AST wasn't including all the members of SHOW TABLES node in the walk. This case is a little tricky to test directly in go-mysql-server, because it only repros in a running sql-server when running over a Vitess conn.
The GMS and Vitess layers are both calculating bind variable metadata, with two different techniques, and whenever they get out of sync, we will see issues like this that only appear when running over a SQL connection. Longer term, if we see more instances of this problem, we may want to consider allowing GMS to return its bind variable metadata so that Vitess doesn't need to recalculate it.
Fixes: #1793

240: Support more JSON_TABLE functionality
Source: https://dev.mysql.com/doc/refman/8.0/en/json-table-functions.html

JSON_TABLE(
    expr,
    path COLUMNS (column_list)
)   [AS] alias

column_list:
    column[, column][, ...]

column:
    name FOR ORDINALITY
    |  name type PATH string path [on_empty] [on_error]
    |  name type EXISTS PATH string path
    |  NESTED [PATH] path COLUMNS (column_list)

on_empty:
    {NULL | DEFAULT json_string | ERROR} ON EMPTY

on_error:
    {NULL | DEFAULT json_string | ERROR} ON ERROR

Note: the MySQL docs indicate that PATH is optional in the NESTED case, but it doesn't seem that way.
I chose to follow what they say rather than what they do.

238: Fix for charset introducers in default values
Fixes dolthub/dolt#5970 by adding an additional default expression rule that handles charset introducers.
237: Small fix for DEFAULT CHARSET when creating tables
Super small fix for dolthub/dolt#5749. We were referencing the wrong variable.
235: Add support for INSERT INTO <table> VALUE ...;
Adds support for VALUE as a synonym of VALUES in INSERT INTO <table> VALUES ..., to match MySQL's syntax.
Fixes #1750
234: go/netutil/conn.go: Avoid panicking when ConnWithTimeouts has a Set{,}Deadline method called.
Clients do not expect setting deadlines on connections to panic. In particular, the standard library's TLS implementation adopts an existing net.Conn and will call SetWriteDeadline on it in certain cases.
It makes more sense to allow the deadlines to be managed by the client when they see fit. This changes the behavior to simply forward the deadlines along as soon as the client code has shown an interest in managing the deadlines.
233: support parsing ALTER EVENT statements
Supports parsing ALTER EVENT statements
231: support 'show events' statement parsing
- Added support for SHOW EVENTS statement parsing
- Added support for SHOW CREATE EVENT statement parsing
- Removed FULL option from SHOW TRIGGERS as it's not supported in MySQL.

Closed Issues

1800: Extracting update query columns
1801: Creating an enum column with collation utf8_czech_ci causes panic
1793: PrepareStatement got incorrect paramsCount in response packet from mock mysql server.
1749: [Feature Request] Support for JSON operator
1771: utf8mb3_tolower_ci not implemented
1750: [Feature Request] Support for INSERT INTO ... VALUE
