The following data types are supported:
BOOLEAN
(BOOL
): logical truth values, i.e. true and false.FLOAT
(DOUBLE
): 64-bit signed floating point numbers, using IEEE 754binary64
encoding. Supports magnitudes of 10⁻³⁰⁷ to 10³⁰⁸ with 53-bit precision (~15 significant figures), as well as the special values infinity and NaN.INTEGER
(INT
): 64-bit signed integer numbers with a range of ±2⁶³-1.STRING
(TEXT
,VARCHAR
): UTF-8 encoded strings.
In addition, the special NULL
value is used for an unknown value, following the rules of three-valued logic.
Numeric types are not interchangable; a float value (even without a fractional part) cannot be stored in an integer column and vice-versa.
Keywords are reserved words with special meaning in SQL statements. They are case-insensitive, and must be quoted with "
to be used as identifiers. The complete list is:
AS
, ASC
, AND
, BEGIN
, BOOL
, BOOLEAN
, BY
, COMMIT
, CREATE
, CROSS
, DEFAULT
,DELETE
, DESC
, DOUBLE
, DROP
, EXISTS
, EXPLAIN
, FALSE
, FLOAT
, FROM
, GROUP
, HAVING
, IF
, INDEX
, INFINITY
, INNER
, INSERT
, INT
, INTEGER
, INTO
, IS
, JOIN
, KEY
, LEFT
, LIKE
, LIMIT
, NAN
, NOT
, NULL
, OF
, OFFSET
, ON
, ONLY
, OR
, ORDER
, OUTER
, PRIMARY
, READ
, REFERENCES
, RIGHT
, ROLLBACK
, SELECT
, SET
, STRING
, SYSTEM
, TABLE
, TEXT
, TIME
, TRANSACTION
, TRUE
, UNIQUE
, UPDATE
, VALUES
, VARCHAR
, WHERE
, WRITE
Identifiers are names for database objects such as tables and columns. Unless quoted with "
, they must begin with a Unicode letter followed by any combination of letters, numbers, and _
, and cannot be reserved keywords. ""
can be used to escape a double quote character. They are always converted to lowercase.
The following keywords evaluate to constants:
FALSE
: the boolean false value.INFINITY
: the floating-point value for infinity.NAN
: the floating-point value for NaN (not a number).NULL
: an unknown value.TRUE
: the boolean true value.
String literals are surrounded by single quotes '
, and can contain any valid UTF-8 character. Single quotes must be escaped by an additional single quote, i.e. ''
, no other escape sequences are supported. For example:
'A string with ''quotes'' and emojis 😀'
Sequences of digits 0-9
are parsed as a 64-bit signed integer. Numbers with decimal points or in scientific notation are parsed as 64-bit floating point numbers. The following pattern is supported:
999[.[999]][e[+-]999]
The -
prefix operator can be used to take negative numbers.
Expressions can be used wherever a value is expected, e.g. as SELECT
columns nd INSERT
values. They are made up of constants, a column references, an operator invocations, and a function calls.
Column references can either be unqualified, e.g. name
, or prefixed with the relation identifier separated by .
, e.g. person.name
. Unqualified identifiers must be unambiguous.
Logical operators apply standard logic operations on boolean operands.
AND
: the logical conjunction, e.g.TRUE AND TRUE
yieldsTRUE
.OR
: the logical disjunction, e.g.TRUE OR FALSE
yieldsTRUE
.NOT
: the logical negation, e.g.NOT TRUE
yieldsFALSE
.
The complete truth tables are:
AND |
TRUE |
FALSE |
NULL |
---|---|---|---|
TRUE |
TRUE |
FALSE |
NULL |
FALSE |
FALSE |
FALSE |
FALSE |
NULL |
NULL |
FALSE |
NULL |
OR |
TRUE |
FALSE |
NULL |
---|---|---|---|
TRUE |
TRUE |
TRUE |
TRUE |
FALSE |
TRUE |
FALSE |
NULL |
NULL |
TRUE |
NULL |
NULL |
NOT |
|
---|---|
TRUE |
FALSE |
FALSE |
TRUE |
NULL |
NULL |
Comparison operators compare values of the same data type, and return TRUE
if the comparison holds or FALSE
otherwise. INTEGER
and FLOAT
values are interchangeable. STRING
comparisons use the string's byte values, i.e. case-sensitive with 'B' < 'a'
due to their UTF-8 code points. FALSE
is considered lesser than TRUE
. Comparison with NULL
always yields NULL
(even NULL = NULL
).
Binary operators:
=
: equality, e.g.1 = 1
yieldsTRUE
.!=
: inequality, e.g.1 != 2
yieldsTRUE
.>
: greater than, e.g.2 > 1
yieldsTRUE
.>=
: greater than or equal, e.g.1 >= 1
yieldsTRUE
.<
: lesser than, e.g.1 < 2
yieldsTRUE
.<=
: lesser than or equal, e.g.1 <= 1
yieldsTRUE
.
Unary operators:
IS NULL
: checks if the value isNULL
, e.g.NULL IS NULL
yieldsTRUE
.IS NOT NULL
: checks if the value is notNULL
, e.g.TRUE IS NOT NULL
yieldsTRUE
.IS NAN
: checks if the value is a floatNAN
, e.g.NAN IS NAN
yieldsTRUE
. Errors on non-float datatypes, exceptNULL
which yieldsNULL
.IS NOT NAN
: checks if the value is not a floatNAN
, e.g.3.14 IS NOT NAN
yieldsTRUE
.
Mathematical operators apply standard math operations on numeric (INTEGER
or FLOAT
) operands. If either operand is a FLOAT
, both operands are converted to FLOAT
and the result is a FLOAT
. If either operand is NULL
, the result is NULL
. The special values INFINITY
and NAN
are handled according to the IEEE 754 spec.
For INTEGER
operands, failure conditions such as overflow and division by zero yield an error. For FLOAT
operands, these return INFINITY
or NAN
as appropriate.
Binary operators:
+
: addition, e.g.1 + 2
yields3
.-
: subtraction, e.g.3 - 2
yields1
.*
: multiplication, e.g.3 * 2
yields6
./
: division, e.g.6 / 2
yields3
.^
: exponentiation, e.g.2 ^ 4
yields16
.%
: remainder, e.g.8 % 3
yields2
. Unlike modulo, the result has the sign of the dividend.
Unary operators:
+
(prefix): identity, e.g.+1
yields1
.-
(prefix): negation, e.g.- -2
yields2
.!
(postfix): factorial, e.g.5!
yields15
.
String operators operate on string operands.
LIKE
: compares a string with the given pattern, using%
as multi-character wildcard and_
as single-character wildcard, returningTRUE
if the string matches the pattern - e.g.'abc' LIKE 'a%'
yieldsTRUE
.
The operator precedence (order of operations) is as follows:
Precedence | Operator | Associativity |
---|---|---|
10 | + , - (prefix) |
Right |
9 | ! (postfix) |
Left |
8 | ^ |
Right |
7 | * , / , % |
Left |
6 | + , - |
Left |
5 | > , >= , < , <= |
Left |
4 | = , != , LIKE , IS |
Left |
3 | NOT |
Right |
2 | AND |
Left |
1 | OR |
Left |
Precedence can be overridden by wrapping an expression in parentheses, e.g. (1 + 2) * 3
.
sqrt(expr)
: returns the square root of a numerical argument.
Aggregate function aggregate an expression across all rows, optionally grouped into buckets given by GROUP BY
, and results can be filtered via HAVING
.
-
AVG(expr)
: returns the average of numerical values. -
COUNT(expr)
: returns the number of rows for whichexpr
evaluates to a non-NULL
value.COUNT(*)
can be used to count all rows. -
MAX(expr)
: returns the maximum value, according to the datatype's ordering. -
MIN(expr)
: returns the minimum value, according to the datatype's ordering. -
SUM(expr)
: returns the sum of numerical values.
Starts a new transaction.
BEGIN [ TRANSACTION ] [ READ ONLY | READ WRITE ] [ AS OF SYSTEM TIME txn_id ]
txn_id
: A past transaction ID to run a read-only transaction for, for time-travel queries.
Commits an active transaction.
Creates a new table.
CREATE TABLE table_name ( [ column_name data_type [ column_constraint [ ... ] ] [ INDEX ] [, ... ] ] ) where column_constraint is: { NOT NULL | NULL | PRIMARY KEY | DEFAULT expr | REFERENCES ref_table | UNIQUE }
-
table_name
: The name of the table. Must be a valid identifier. Errors if a table with this name already exists. -
column_name
: The name of the column. Must be a valid identifier, and unique within the table. -
data_type
: The data type of the column, see data types for valid types. -
NOT NULL
: The column may not containNULL
values. -
NULL
: The column may containNULL
values. This is the default. -
PRIMARY KEY
: The column should act as a primary key, i.e. the main row identifier. A table must have exactly one primary key column, and it must be unique and non-nullable. -
DEFAULT
expr
: Specifies a default value for the column whenINSERT
statements do not give a value.expr
can be any constant expression of an appropriate data type, e.g.'abc'
or1 + 2 * 3
. For nullable columns, the default value isNULL
unless specified otherwise. -
REFERENCES
ref_table
: The column is a foreign key toref_table
's primary key, enforcing referential integrity. -
UNIQUE
: The column may only contain unique (distinct) values.NULL
values are not considered equal, thus aUNIQUE
column which allowsNULL
may contain multipleNULL
values.PRIMARY KEY
columns are implicitlyUNIQUE
. -
INDEX
: Create an index for the column.
CREATE TABLE movie (
id INTEGER PRIMARY KEY,
title STRING NOT NULL,
release_year INTEGER INDEX,
imdb_id STRING INDEX UNIQUE,
bluray BOOLEAN NOT NULL DEFAULT TRUE
)
Deletes rows in a table.
DELETE FROM table_name [ WHERE predicate ]
Deletes rows where predicate
evaluates to TRUE
, or all rows if no WHERE
clause is given.
-
table_name
: the table to delete from. Errors if it does not exist. -
predicate
: an expression which determines which rows to delete by evaluting toTRUE
. Must evaluate to aBOOLEAN
orNULL
, otherwise an error is returned.
DELETE FROM movie
WHERE release_year < 2000 AND bluray = FALSE
Deletes a table and all contained data. Errors if the table does not
exist, unless IF EXISTS
is given.
DROP TABLE [ IF EXISTS ] table_name
table_name
: the table to delete.
Outputs the execution plan for the given statement.
EXPLAIN [ statement ]
Inserts rows into a table.
INSERT INTO table_name [ ( column_name [, ... ] ) ] VALUES ( expression [, ... ] ) [, ... ]
If column names are given, an identical number of values must be given. If no column names are given, values must be given in the table's column order. Omitted columns will get a default value if specified, otherwise an error will be returned.
-
table_name
: the table to insert into. Errors if it does not exist. -
column_name
: a column to insert into in the given table. Errors if it does not exist. -
expression
: an expression to insert into the corresponding column. Must be a constant expression, i.e. it cannot refer to table columns.
INSERT INTO movie
(id, title, release_year)
VALUES
(1, 'Sicario', 2015),
(2, 'Stalker', 1979),
(3, 'Her', 2013)
Rolls back an active transaction.
Selects rows from a table.
SELECT [ * | expression [ [ AS ] output_name [, ...] ] ] [ FROM from_item [, ...] ] [ WHERE predicate ] [ GROUP BY group_expr [, ...] ] [ HAVING having_expr ] [ ORDER BY order_expr [ ASC | DESC ] [, ...] ] [ LIMIT count ] [ OFFSET start ] where from_item is one of: table_name [ [ AS ] alias ] from_item join_type from_item [ ON join_predicate ] where join_type is one of: CROSS JOIN [ INNER ] JOIN LEFT [ OUTER ] JOIN RIGHT [ OUTER ] JOIN
Fetches rows or expressions, either from table table_name
(if given) or generated.
-
expression
: expression to fetch (can be a simple column name). -
output_name
: output column identifier, defaults to column name (if single column) otherwise nothing (displayed as?
). -
table_name
: table to fetch rows from. -
alias
: table alias. -
predicate
: only return rows for which this expression evaluates toTRUE
. -
group_expr
: an expression to group aggregates by. Non-aggregateSELECT
expressions must either reference a column given ingroup_expr
, be idential with agroup_expr
, or have anoutput_name
that is referenced by agroup_expr
column. -
having_expr
: only return aggregate results for which this expression evaluates toTRUE
. -
order_expr
: order rows by this expression (can be a simple column name). -
count
: maximum number of rows to return. Must be a constant integer expression. -
start
: number of rows to skip. Must be a constant integer expression. -
join_predicate
: only return rows for which this expression evaluates toTRUE
.
Join types:
-
CROSS JOIN
: returns the Carthesian product of the joined tables. Does not accept a join predicate (ON
clause). -
INNER JOIN
: returns the rows of the tables' Carthesian product for whichjoin_predicate
evaluates toTRUE
. -
LEFT OUTER JOIN
: returns the rows joined on thejoin_predicate
, or for any rows in the left table that does not have a match in the right table a single row is returned with the right table's columns set toNULL
. -
RIGHT OUTER JOIN
: the same as aLEFT OUTER JOIN
but with the left and right tables switched.
SELECT id, title, 2020 - released AS age
FROM movies
WHERE released >= 2000 AND ultrahd
ORDER BY released DESC, title ASC
LIMIT 10
OFFSET 10
Updates rows in a table.
UPDATE table_name SET column_name = expression | DEFAULT [, ... ] [ WHERE predicate ]
Updates columns given by column_name
to the corresponding expression
for all rows where predicate
evaluates to TRUE
. If no WHERE
clause is given, all rows are updated.
-
table_name
: the table to update. Errors if it does not exist. -
column_name
: a column to update. Errors if it does not exist. -
expression
: an expression whose evaluated value will be set for the corresponding column and row. Expressions can refer to column values, and must evaluate to the same datatype as the updated column. UsingDEFAULT
will set the column's default value, if any. -
predicate
: an expression which determines which rows to update by evaluting toTRUE
. Must evaluate to aBOOLEAN
orNULL
, otherwise an error is returned.
UPDATE movie
SET bluray = TRUE
WHERE release_year >= 2000 AND bluray = FALSE
toyDB supports ACID transactions using MVCC-based snapshot isolation, protecting from the following anomalies: dirty writes, dirty reads, lost updates, fuzzy reads, read skew, and phantom reads. However, write skew anomalies are possible since serializable snapshot isolation is not implemented.
A new transaction is started with BEGIN
, and ended with either COMMIT
(atomically writing all changes) or ROLLBACK
(discarding all changes). If any conflicts occur between concurrent transactions, the lowest transaction ID wins and the others will fail with a serialization error and must retry.
All past data is versioned and retained, and can be queried as of a given transaction ID via BEGIN TRANSACTION READ ONLY AS OF SYSTEM TIME <txn_id>
.
A transaction is still valid for use if a contained statement returns an error. It is up to the client to take appropriate action.