
Update documents for cross-partition scan and import feature #1301

Merged
merged 8 commits into from Nov 30, 2023
Changes from 1 commit
97 changes: 94 additions & 3 deletions docs/api-guide.md
@@ -295,6 +295,25 @@ boolean ifExist = true;
admin.dropCoordinatorTables(ifExist);
```

### Import a table

You can import an existing table to ScalarDB as follows:

```java
// Import the table "ns.tbl". If the table is already managed by ScalarDB, the target table does not
// exist, or the table does not meet the requirements of a ScalarDB table, an exception will be thrown.
admin.importTable("ns", "tbl");
Collaborator

We have added `Map<String, String> options` as the 3rd argument to this method:

Suggested change
admin.importTable("ns", "tbl");
admin.importTable("ns", "tbl", options);

And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change to the 3 branch.

Contributor Author

Good catch, thank you! Fixed in e3dd00d.

> And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change to the 3 branch.

I'm OK to use `admin.importTable("ns", "tbl", options)` even in v3.x, but it might be confusing for users. If we can just apply it without any other concerns, I can handle it since it's not a big divergence. Should I or not?

Collaborator

Actually, the `importTable()` method was already introduced in 3.10, but it was treated as an experimental feature. Therefore, I think we can also add the `options` argument to the 3 branch.

Contributor Author

Ahh, right. Got it. Thank you!

```
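
As noted in the review thread above, this method also takes a `Map<String, String>` of options as a 3rd argument. A minimal sketch of that form, assuming no storage-specific options are needed (an empty map):

```java
import java.util.Collections;
import java.util.Map;

// Import "ns.tbl" using the 3-argument form. An empty map means no
// storage-specific options; available option keys depend on the underlying database.
Map<String, String> options = Collections.emptyMap();
admin.importTable("ns", "tbl", options);
```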

{% capture notice--warning %}
**Attention**

You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and create the ScalarDB metadata tables. There will also be several differences and limitations between your database and ScalarDB. See also the following document.

- [Importing existing tables to ScalarDB using ScalarDB Schema Loader](./schema-loader-import.md)

{{ notice--warning | markdownify }}

## Transactional API

This section explains how to execute transactional operations by using the Transactional API in ScalarDB.
@@ -621,9 +640,21 @@ You can't specify clustering-key boundaries and orderings in `Scan` by using a s

<div class="notice--info">{{ notice--info | markdownify }}</div>

##### Execute `Scan` without specifying a partition key to retrieve all the records of a table
##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table

You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
Contributor

Suggested change
You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
You can execute a `Scan` operation across all partitions, which we call cross-partition scan, without specifying a partition key by enabling the following property in the ScalarDB configuration.

Contributor Author

Fixed in e3dd00d.


```properties
scalar.db.cross_partition_scan.enabled=true
```

{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Contributor

Suggested change
Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

The expression "it could make the isolation level lower" sounds a bit unclear to me.

BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn when users use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

Contributor Author

Thanks for the comment. Fixed in e3dd00d.

> BTW, sorry, I don't fully remember the discussion we had before.
> So, we decided to only warn when users use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

At least, we didn't choose to completely disable the cross-partition scan with serializable isolation in v4.0. Disabling it is one idea, but I think it might be useful in some cases regardless of backward compatibility; e.g., users want to basically run transactions in a serializable manner but sometimes run read-only cross-partition scans without changing the setting.

@brfrn169 Do you have any idea?

Contributor

OK, thank you! So, for now, we need to enable it in 3.x for backward compatibility, and we haven't decided to do so in 4.x (we need to think about what we should do for 4.x). Is my understanding correct?

Contributor Author

Almost, yes. My understanding is that we will keep warning about it, the same as in v3.x, unless we explicitly decide to stop the feature in v4.x.

{% endcapture %}

You can execute a `Scan` operation without specifying a partition key.
<div class="notice--warning">{{ notice--warning | markdownify }}</div>

Instead of calling the `partitionKey()` method in the builder, you can call the `all()` method to scan a table without specifying a partition key as follows:
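
The full example is collapsed in this diff view; the following is a minimal sketch of such a scan, assuming the same `ns.tbl` table and `c1`–`c4` columns used elsewhere in this guide:

```java
// Create a `Scan` operation that scans all partitions by calling `all()`
// instead of `partitionKey(...)`.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .projections("c1", "c2", "c3", "c4")
        .limit(10)
        .build();

// Execute the `Scan` operation.
List<Result> results = transaction.scan(scan);
```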

@@ -645,11 +676,71 @@ List<Result> results = transaction.scan(scan);
{% capture notice--info %}
**Note**

You can't specify clustering-key boundaries and orderings in `Scan` without specifying a partition key.
You can't specify any filtering conditions and orderings in cross-partition `Scan` except for JDBC databases. See the following section to use cross-partition `Scan` with filtering or ordering for JDBC databases.
{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>

##### Execute cross-partition `Scan` with filtering and ordering

By enabling the cross-partition scan options for filtering and ordering for JDBC databases as follows, you can execute a cross-partition `Scan` operation with flexible conditions and orderings.

```properties
scalar.db.cross_partition_scan.enabled=true
scalar.db.cross_partition_scan.filtering.enabled=true
scalar.db.cross_partition_scan.ordering.enabled=true
```

You can call the `where()` and `ordering()` methods after calling the `all()` method to specify arbitrary conditions and orderings as follows:

```java
// Create a `Scan` operation with arbitrary conditions and orderings.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(ConditionBuilder.column("c1").isNotEqualToInt(10))
        .projections("c1", "c2", "c3", "c4")
        .orderings(Scan.Ordering.desc("c3"), Scan.Ordering.asc("c4"))
        .limit(10)
        .build();

// Execute the `Scan` operation.
List<Result> results = transaction.scan(scan);
```

As an argument of the `where()` method, you can specify a condition, an and-wise condition set, or an or-wise condition set. After calling the `where()` method, you can add more conditions or condition sets using the `and()` method or `or()` method as follows:

```java
// Create a `Scan` operation with condition sets.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10))
                .or(ConditionBuilder.column("c1").isGreaterThanInt(20))
                .build())
        .and(
            ConditionSetBuilder.condition(ConditionBuilder.column("c2").isLikeText("a%"))
                .or(ConditionBuilder.column("c2").isLikeText("b%"))
                .build())
        .limit(10)
        .build();
```

{% capture notice--info %}
**Note**

In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
Contributor

Suggested change
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).

Contributor Author

Fixed in 6726530.

{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>
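
For contrast with the conjunctive-normal-form example above, here is a sketch of a disjunctive-normal-form chain (an or-wise junction of and-wise condition sets), assuming the same hypothetical `ns.tbl` columns:

```java
// Create a `Scan` operation whose conditions form an or-wise junction of
// `AndConditionSet`s (disjunctive normal form).
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isGreaterThanInt(20))
                .and(ConditionBuilder.column("c2").isLikeText("a%"))
                .build())
        .or(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10))
                .and(ConditionBuilder.column("c2").isLikeText("b%"))
                .build())
        .limit(10)
        .build();
```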

For more details on available conditions and condition sets, see the `ConditionBuilder` and `ConditionSetBuilder` page in the [Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) of the version of ScalarDB that you're using.

#### `Put` operation

`Put` is an operation to put a record specified by a primary key. The operation behaves as an upsert operation for a record, in which the operation updates the record if the record exists or inserts the record if the record does not exist.
271 changes: 271 additions & 0 deletions docs/schema-loader-import.md
@@ -0,0 +1,271 @@
# Importing existing tables to ScalarDB using ScalarDB Schema Loader

You might want to use ScalarDB (e.g., for database-spanning transactions) with your existing databases. In that case, you can import those databases under ScalarDB control using ScalarDB Schema Loader. ScalarDB Schema Loader automatically adds ScalarDB-internal metadata columns to each existing table and creates metadata tables to enable various ScalarDB functionalities, including transaction management across multiple databases.

## Before you begin

{% capture notice--warning %}
**Attention**

You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and create the ScalarDB metadata tables. There will also be several differences and limitations between your database and ScalarDB.
{% endcapture %}
Contributor Author

After adding the holistic migration guide, I will add a reference for it around here.


<div class="notice--warning">{{ notice--warning | markdownify }}</div>

### What will be added to your databases

- ScalarDB metadata tables: ScalarDB manages namespace names and table metadata in a namespace (a schema or database in the underlying databases) called `scalardb`.
- Transaction metadata columns: the Consensus Commit transaction manager requires metadata (for example, transaction ID, record version, and transaction status) stored along with the actual records to handle transactions properly. Thus, this tool adds the metadata columns if you use the Consensus Commit transaction manager.

### Requirements

- [JDBC databases](./scalardb-supported-databases.md#jdbc-databases) except for SQLite are importable.
- Each table must have primary key column(s) (composite primary keys are supported).
- Target tables must only have columns with supported data types (see [Data-type mapping from JDBC databases to ScalarDB](#data-type-mapping-from-jdbc-databases-to-scalardb)).

### Set up Schema Loader

See the [ScalarDB Schema Loader](./schema-loader.md#set-up-schema-loader) document to set up Schema Loader for importing existing tables.

## Run Schema Loader for importing existing tables

You can import an existing table in a JDBC database to ScalarDB by using the `--import` option and an import-specific schema file. To import tables, run the following command, replacing the contents in the angle brackets as described:

```console
$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> --import
```

- `<VERSION>`: version of ScalarDB Schema Loader that you set up.
- `<PATH_TO_SCALARDB_PROPERTIES_FILE>`: path to a properties file of ScalarDB. For a sample properties file, see [`database.properties`](https://github.com/scalar-labs/scalardb/blob/master/conf/database.properties).
- `<PATH_TO_SCHEMA_FILE>`: path to an import schema file. See also a [sample](#sample-import-schema-file) in the next section.

If you use the Consensus Commit transaction manager after importing existing tables, run the following command separately to create the Coordinator tables:

```console
$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> --coordinator
```

## Sample import schema file

The following is a sample schema for importing tables. For the sample schema file, see [`import_schema_sample.json`](https://github.com/scalar-labs/scalardb/blob/master/schema-loader/sample/import_schema_sample.json).

```json
{
"sample_namespace1.sample_table1": {
"transaction": true
},
"sample_namespace1.sample_table2": {
"transaction": true
},
"sample_namespace2.sample_table3": {
"transaction": false
}
}
```

The import table schema consists of a namespace name, a table name, and a `transaction` field. The `transaction` field indicates whether the table will be imported for transactions or not. If you set the `transaction` field to `true` or don't specify the `transaction` field, this tool creates a table with transaction metadata if needed. If you set the `transaction` field to `false`, this tool imports a table without adding transaction metadata (that is, for use with the [Storage API](storage-abstraction.md)).

## Data-type mapping from JDBC databases to ScalarDB

The following table shows the supported data types in each JDBC database and their mapping to the ScalarDB data types. Select your database, and check whether your existing tables are importable.

<div id="tabset-1">
<div class="tab">
<button class="tablinks" onclick="openTab(event, 'MySQL', 'tabset-1')" id="defaultOpen-1">MySQL</button>
<button class="tablinks" onclick="openTab(event, 'PostgreSQL', 'tabset-1')">PostgreSQL</button>
<button class="tablinks" onclick="openTab(event, 'Oracle', 'tabset-1')">Oracle</button>
<button class="tablinks" onclick="openTab(event, 'SQLServer', 'tabset-1')">SQL Server</button>
</div>

<div id="MySQL" class="tabcontent" markdown="1">

| MySQL | ScalarDB | Notes |
|--------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| binary | BLOB | |
| bit | BOOLEAN | |
| blob | BLOB | [*2](#warn-data-size) |
| char | TEXT | [*2](#warn-data-size) |
| double | DOUBLE | |
| float | FLOAT | |
| int | INT | |
| int unsigned | BIGINT | [*2](#warn-data-size) |
| integer | INT | |
| longblob | BLOB | |
| longtext | TEXT | |
| mediumblob | BLOB | [*2](#warn-data-size) |
| mediumint | INT | [*2](#warn-data-size) |
| mediumtext | TEXT | [*2](#warn-data-size) |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | [*2](#warn-data-size) |
| tinyblob | BLOB | [*2](#warn-data-size) |
| tinyint | INT | [*2](#warn-data-size) |
| tinyint(1) | BOOLEAN | |
| tinytext | TEXT | [*2](#warn-data-size) |
| varbinary | BLOB | [*2](#warn-data-size) |
| varchar | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- bigint unsigned
- bit(n) (n > 1)
- date
- datetime
- decimal
- enum
- geometry
- json
- numeric
- set
- time
- timestamp
- year

</div>

<div id="PostgreSQL" class="tabcontent" markdown="1">

| PostgreSQL | ScalarDB | Notes |
|-------------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| boolean | BOOLEAN | |
| bytea | BLOB | |
| character | TEXT | [*2](#warn-data-size) |
| character varying | TEXT | [*2](#warn-data-size) |
| double precision | DOUBLE | |
| integer | INT | |
| real | FLOAT | |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | |

Data types not listed above are not supported. Typical examples are shown below.

- bigserial
- bit
- box
- cidr
- circle
- date
- inet
- interval
- json
- jsonb
- line
- lseg
- macaddr
- macaddr8
- money
- numeric
- path
- pg_lsn
- pg_snapshot
- point
- polygon
- smallserial
- serial
- time
- timestamp
- tsquery
- tsvector
- txid_snapshot
- uuid
- xml

</div>

<div id="Oracle" class="tabcontent" markdown="1">

| Oracle | ScalarDB | Notes |
|---------------|-----------------|-----------------------|
| binary_double | DOUBLE | |
| binary_float | FLOAT | |
| blob | BLOB | [*3](#warn-data-size) |
| char | TEXT | [*2](#warn-data-size) |
| clob | TEXT | |
| float | DOUBLE | [*4](#warn-data-size) |
| long | TEXT | |
| long raw | BLOB | |
| nchar | TEXT | [*2](#warn-data-size) |
| nclob | TEXT | |
| number | BIGINT / DOUBLE | [*5](#warn-data-size) |
| nvarchar2 | TEXT | [*2](#warn-data-size) |
| raw | BLOB | [*2](#warn-data-size) |
| varchar2 | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- date
- timestamp
- interval
- rowid
- urowid
- bfile
- json

</div>

<div id="SQLServer" class="tabcontent" markdown="1">

| SQL Server | ScalarDB | Notes |
|------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| binary | BLOB | [*2](#warn-data-size) |
| bit | BOOLEAN | |
| char | TEXT | [*2](#warn-data-size) |
| float | DOUBLE | |
| image | BLOB | |
| int | INT | |
| nchar | TEXT | [*2](#warn-data-size) |
| ntext | TEXT | |
| nvarchar | TEXT | [*2](#warn-data-size) |
| real | FLOAT | |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | |
| tinyint | INT | [*2](#warn-data-size) |
| varbinary | BLOB | [*2](#warn-data-size) |
| varchar | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- cursor
- date
- datetime
- datetime2
- datetimeoffset
- decimal
- geography
- geometry
- hierarchyid
- money
- numeric
- rowversion
- smalldatetime
- smallmoney
- sql_variant
- time
- uniqueidentifier
- xml

</div>

</div>

{% capture notice--warning %}
**Attention**

1. The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the size of `bigint` in the underlying database. Thus, if data out of this range exists in the imported table, ScalarDB cannot read it.
2. For certain data types noted above, ScalarDB may map to a data type larger than that of the underlying database. In that case, you will see errors when putting a value whose size exceeds the size specified in the underlying database.
3. The maximum size of `BLOB` in ScalarDB is about 2GB (precisely 2^31-1 bytes). In contrast, an Oracle `blob` can have (4GB-1)*(number of blocks). Thus, if data larger than 2GB exists in the imported table, ScalarDB cannot read it.
Contributor

Ditto. Do we observe null, some value, or an error?

Contributor Author

Good question again. The 2GB limit is due to Java's byte-array limits. I haven't tested it, but I guess the Oracle JDBC driver throws an SQLException. Or, we might see an OOM error if the heap size is not correctly configured. Apart from that, we might be able to handle such large objects better by using JDBC `Blob getBlob(...)` and offset-based access instead of `byte[] getBytes(...)`.

Contributor

Ah, I see. I understand that it depends on the Java side.
Thank you!

4. ScalarDB does not support Oracle `float` columns that have higher precision than ScalarDB's `DOUBLE`.
5. ScalarDB does not support Oracle `numeric(p, s)` columns (`p` is precision and `s` is scale) where `p` is larger than 15, due to the maximum size of the data type in ScalarDB. Note that ScalarDB maps the column to `BIGINT` if `s` is zero; otherwise, it maps the column to `DOUBLE`. In the latter case, be aware that rounding can happen in the underlying database since the floating-point value will be cast to a fixed-point one.

{% endcapture %}

<div class="notice--warning" id="warn-data-size">{{ notice--warning | markdownify }}</div>

## Use the import function in your application

You can use the import function in your application with the following interfaces.

- [ScalarDB Admin API](./api-guide.md#import-a-table)
- [ScalarDB Schema Loader API](./schema-loader.md#use-schema-loader-in-your-application)