
Update documents for cross-partition scan and import feature #1301

Merged
merged 8 commits into from Nov 30, 2023
Changes from 1 commit
97 changes: 94 additions & 3 deletions docs/api-guide.md
@@ -295,6 +295,25 @@ boolean ifExist = true;
admin.dropCoordinatorTables(ifExist);
```

### Import a table

You can import an existing table to ScalarDB as follows:

```java
// Import the table "ns.tbl". If the table is already managed by ScalarDB, the target table does not
// exist, or the table does not meet the requirements of a ScalarDB table, an exception will be thrown.
admin.importTable("ns", "tbl");
Collaborator

We have added `Map<String, String> options` as the 3rd argument to this method:

Suggested change
admin.importTable("ns", "tbl");
admin.importTable("ns", "tbl", options);

And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change to the 3 branch.

Contributor Author

Good catch, thank you! Fixed in e3dd00d.

> And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change to the 3 branch.

I'm OK to use `admin.importTable("ns", "tbl", options)` even in v3.x, but it might be confusing for users. If we can just apply it without any other concerns, I can handle it since it's not a big divergence. Should I or not?

Collaborator

Actually, the `importTable()` method was already introduced in 3.10, but it was treated as an experimental feature. Therefore, I think we can also add the `options` argument to the 3 branch.

Contributor Author

Ahh, right. Got it. Thank you!

```
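
As noted in the review thread above, this method also takes a `Map<String, String>` of options as a 3rd argument. A minimal sketch of that form, assuming no storage-specific options are needed (an empty map):

```java
import java.util.Collections;
import java.util.Map;

// Import "ns.tbl" using the 3-argument form. An empty map means no
// storage-specific options; available option keys depend on the underlying database.
Map<String, String> options = Collections.emptyMap();
admin.importTable("ns", "tbl", options);
```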

{% capture notice--warning %}
**Attention**

You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and create the ScalarDB metadata tables. There will also be several differences and limitations between your database and ScalarDB. See also the following document.

- [Importing existing tables to ScalarDB using ScalarDB Schema Loader](./schema-loader-import.md)

{{ notice--warning | markdownify }}

## Transactional API

This section explains how to execute transactional operations by using the Transactional API in ScalarDB.
@@ -621,9 +640,21 @@ You can't specify clustering-key boundaries and orderings in `Scan` by using a s

<div class="notice--info">{{ notice--info | markdownify }}</div>

##### Execute `Scan` without specifying a partition key to retrieve all the records of a table
##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table

You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
Contributor

Suggested change
You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
You can execute a `Scan` operation across all partitions, which we call cross-partition scan, without specifying a partition key by enabling the following property in the ScalarDB configuration.

Contributor Author

Fixed in e3dd00d.


```properties
scalar.db.cross_partition_scan.enabled=true
```

{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Contributor

Suggested change
Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

The expression "it could make the isolation level lower" sounds a bit unclear to me.

BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn when users use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

Contributor Author

Thanks for the comment. Fixed in e3dd00d.

> BTW, sorry, I don't fully remember the discussion we had before.
> So, we decided to only warn when users use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

At least, we didn't choose to completely disable the cross-partition scan with serializable isolation in v4.0. Disabling it is one idea, but I think it might be useful in some cases regardless of backward compatibility; e.g., users want to basically run transactions in a serializable manner but sometimes run read-only cross-partition scans without changing the setting.

@brfrn169 Do you have any idea?

Contributor

OK, thank you! So, for now, we need to enable it in 3.x for backward compatibility, and we haven't decided to do so in 4.x (we need to think about what we should do for 4.x). Is my understanding correct?

Contributor Author

Almost, yes. My understanding is that we will keep warning about it, the same as in v3.x, unless we explicitly decide to stop the feature in v4.x.

{% endcapture %}

You can execute a `Scan` operation without specifying a partition key.
<div class="notice--warning">{{ notice--warning | markdownify }}</div>

Instead of calling the `partitionKey()` method in the builder, you can call the `all()` method to scan a table without specifying a partition key as follows:
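
The full example is collapsed in this diff view; the following is a minimal sketch of such a scan, assuming the same `ns.tbl` table and `c1`–`c4` columns used elsewhere in this guide:

```java
// Create a `Scan` operation that scans all partitions by calling `all()`
// instead of `partitionKey(...)`.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .projections("c1", "c2", "c3", "c4")
        .limit(10)
        .build();

// Execute the `Scan` operation.
List<Result> results = transaction.scan(scan);
```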

@@ -645,11 +676,71 @@ List<Result> results = transaction.scan(scan);
{% capture notice--info %}
**Note**

You can't specify clustering-key boundaries and orderings in `Scan` without specifying a partition key.
You can't specify any filtering conditions and orderings in cross-partition `Scan` except for JDBC databases. See the following section to use cross-partition `Scan` with filtering or ordering for JDBC databases.
{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>

##### Execute cross-partition `Scan` with filtering and ordering

By enabling the cross-partition scan options for filtering and ordering for JDBC databases as follows, you can execute a cross-partition `Scan` operation with flexible conditions and orderings.

```properties
scalar.db.cross_partition_scan.enabled=true
scalar.db.cross_partition_scan.filtering.enabled=true
scalar.db.cross_partition_scan.ordering.enabled=true
```

You can call the `where()` and `ordering()` methods after calling the `all()` method to specify arbitrary conditions and orderings as follows:

```java
// Create a `Scan` operation with arbitrary conditions and orderings.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(ConditionBuilder.column("c1").isNotEqualToInt(10))
        .projections("c1", "c2", "c3", "c4")
        .orderings(Scan.Ordering.desc("c3"), Scan.Ordering.asc("c4"))
        .limit(10)
        .build();

// Execute the `Scan` operation.
List<Result> results = transaction.scan(scan);
```

As an argument of the `where()` method, you can specify a condition, an and-wise condition set, or an or-wise condition set. After calling the `where()` method, you can add more conditions or condition sets using the `and()` method or `or()` method as follows:

```java
// Create a `Scan` operation with condition sets.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10))
                .or(ConditionBuilder.column("c1").isGreaterThanInt(20))
                .build())
        .and(
            ConditionSetBuilder.condition(ConditionBuilder.column("c2").isLikeText("a%"))
                .or(ConditionBuilder.column("c2").isLikeText("b%"))
                .build())
        .limit(10)
        .build();
```

{% capture notice--info %}
**Note**

In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
Contributor

Suggested change
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).
In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).

Contributor Author

Fixed in 6726530.

{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>
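
For contrast with the conjunctive-normal-form example above, here is a sketch of a disjunctive-normal-form chain (an or-wise junction of and-wise condition sets), assuming the same hypothetical `ns.tbl` columns:

```java
// Create a `Scan` operation whose conditions form an or-wise junction of
// `AndConditionSet`s (disjunctive normal form).
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isGreaterThanInt(20))
                .and(ConditionBuilder.column("c2").isLikeText("a%"))
                .build())
        .or(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10))
                .and(ConditionBuilder.column("c2").isLikeText("b%"))
                .build())
        .limit(10)
        .build();
```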

For more details on available conditions and condition sets, see the `ConditionBuilder` and `ConditionSetBuilder` page in the [Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) of the version of ScalarDB that you're using.

#### `Put` operation

`Put` is an operation to put a record specified by a primary key. The operation behaves as an upsert operation for a record, in which the operation updates the record if the record exists or inserts the record if the record does not exist.
271 changes: 271 additions & 0 deletions docs/schema-loader-import.md
@@ -0,0 +1,271 @@
# Importing existing tables to ScalarDB using ScalarDB Schema Loader

You might want to use ScalarDB (e.g., for database-spanning transactions) with your existing databases. In that case, you can import those databases under ScalarDB control using ScalarDB Schema Loader. ScalarDB Schema Loader automatically adds ScalarDB-internal metadata columns to each existing table and creates metadata tables to enable various ScalarDB functionalities, including transaction management across multiple databases.

## Before you begin

{% capture notice--warning %}
**Attention**

You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and create the ScalarDB metadata tables. There will also be several differences and limitations between your database and ScalarDB.
{% endcapture %}
Contributor Author

After adding the holistic migration guide, I will add a reference for it around here.


<div class="notice--warning">{{ notice--warning | markdownify }}</div>

### What will be added to your databases

- ScalarDB metadata tables: ScalarDB manages namespace names and table metadata in a namespace (a schema or database in the underlying databases) called `scalardb`.
- Transaction metadata columns: the Consensus Commit transaction manager requires metadata (for example, transaction ID, record version, and transaction status) stored along with the actual records to handle transactions properly. Thus, this tool adds the metadata columns if you use the Consensus Commit transaction manager.

### Requirements

- [JDBC databases](./scalardb-supported-databases.md#jdbc-databases) except for SQLite are importable.
- Each table must have primary key column(s) (composite primary keys are supported).
- Target tables must only have columns with supported data types (see [Data-type mapping from JDBC databases to ScalarDB](#data-type-mapping-from-jdbc-databases-to-scalardb)).

### Set up Schema Loader

See the [ScalarDB Schema Loader](./schema-loader.md#set-up-schema-loader) document to set up Schema Loader for importing existing tables.

## Run Schema Loader for importing existing tables

You can import an existing table in a JDBC database to ScalarDB by using the `--import` option and an import-specific schema file. To import tables, run the following command, replacing the contents in the angle brackets as described:

```console
$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> -f <PATH_TO_SCHEMA_FILE> --import
```

- `<VERSION>`: version of ScalarDB Schema Loader that you set up.
- `<PATH_TO_SCALARDB_PROPERTIES_FILE>`: path to a properties file of ScalarDB. For a sample properties file, see [`database.properties`](https://github.com/scalar-labs/scalardb/blob/master/conf/database.properties).
- `<PATH_TO_SCHEMA_FILE>`: path to an import schema file. See also a [sample](#sample-import-schema-file) in the next section.

If you use the Consensus Commit transaction manager after importing existing tables, run the following command separately to create the Coordinator tables:

```console
$ java -jar scalardb-schema-loader-<VERSION>.jar --config <PATH_TO_SCALARDB_PROPERTIES_FILE> --coordinator
```

## Sample import schema file

The following is a sample schema for importing tables. For the sample schema file, see [`import_schema_sample.json`](https://github.com/scalar-labs/scalardb/blob/master/schema-loader/sample/import_schema_sample.json).

```json
{
"sample_namespace1.sample_table1": {
"transaction": true
},
"sample_namespace1.sample_table2": {
"transaction": true
},
"sample_namespace2.sample_table3": {
"transaction": false
}
}
```

The import table schema consists of a namespace name, a table name, and a `transaction` field. The `transaction` field indicates whether the table will be imported for transactions or not. If you set the `transaction` field to `true` or don't specify the `transaction` field, this tool creates a table with transaction metadata if needed. If you set the `transaction` field to `false`, this tool imports a table without adding transaction metadata (that is, for use with the [Storage API](storage-abstraction.md)).

## Data-type mapping from JDBC databases to ScalarDB

The following table shows the supported data types in each JDBC database and their mapping to the ScalarDB data types. Select your database, and check whether your existing tables are importable.

<div id="tabset-1">
<div class="tab">
<button class="tablinks" onclick="openTab(event, 'MySQL', 'tabset-1')" id="defaultOpen-1">MySQL</button>
<button class="tablinks" onclick="openTab(event, 'PostgreSQL', 'tabset-1')">PostgreSQL</button>
<button class="tablinks" onclick="openTab(event, 'Oracle', 'tabset-1')">Oracle</button>
<button class="tablinks" onclick="openTab(event, 'SQLServer', 'tabset-1')">SQL Server</button>
</div>

<div id="MySQL" class="tabcontent" markdown="1">

| MySQL | ScalarDB | Notes |
|--------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| binary | BLOB | |
| bit | BOOLEAN | |
| blob | BLOB | [*2](#warn-data-size) |
| char | TEXT | [*2](#warn-data-size) |
| double | DOUBLE | |
| float | FLOAT | |
| int | INT | |
| int unsigned | BIGINT | [*2](#warn-data-size) |
| integer | INT | |
| longblob | BLOB | |
| longtext | TEXT | |
| mediumblob | BLOB | [*2](#warn-data-size) |
| mediumint | INT | [*2](#warn-data-size) |
| mediumtext | TEXT | [*2](#warn-data-size) |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | [*2](#warn-data-size) |
| tinyblob | BLOB | [*2](#warn-data-size) |
| tinyint | INT | [*2](#warn-data-size) |
| tinyint(1) | BOOLEAN | |
| tinytext | TEXT | [*2](#warn-data-size) |
| varbinary | BLOB | [*2](#warn-data-size) |
| varchar | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- bigint unsigned
- bit(n) (n > 1)
- date
- datetime
- decimal
- enum
- geometry
- json
- numeric
- set
- time
- timestamp
- year

</div>

<div id="PostgreSQL" class="tabcontent" markdown="1">

| PostgreSQL | ScalarDB | Notes |
|-------------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| boolean | BOOLEAN | |
| bytea | BLOB | |
| character | TEXT | [*2](#warn-data-size) |
| character varying | TEXT | [*2](#warn-data-size) |
| double precision | DOUBLE | |
| integer | INT | |
| real | FLOAT | |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | |

Data types not listed above are not supported. Typical examples are shown below.

- bigserial
- bit
- box
- cidr
- circle
- date
- inet
- interval
- json
- jsonb
- line
- lseg
- macaddr
- macaddr8
- money
- numeric
- path
- pg_lsn
- pg_snapshot
- point
- polygon
- smallserial
- serial
- time
- timestamp
- tsquery
- tsvector
- txid_snapshot
- uuid
- xml

</div>

<div id="Oracle" class="tabcontent" markdown="1">

| Oracle | ScalarDB | Notes |
|---------------|-----------------|-----------------------|
| binary_double | DOUBLE | |
| binary_float | FLOAT | |
| blob | BLOB | [*3](#warn-data-size) |
| char | TEXT | [*2](#warn-data-size) |
| clob | TEXT | |
| float | DOUBLE | [*4](#warn-data-size) |
| long | TEXT | |
| long raw | BLOB | |
| nchar | TEXT | [*2](#warn-data-size) |
| nclob | TEXT | |
| number | BIGINT / DOUBLE | [*5](#warn-data-size) |
| nvarchar2 | TEXT | [*2](#warn-data-size) |
| raw | BLOB | [*2](#warn-data-size) |
| varchar2 | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- date
- timestamp
- interval
- rowid
- urowid
- bfile
- json

</div>

<div id="SQLServer" class="tabcontent" markdown="1">

| SQL Server | ScalarDB | Notes |
|------------|----------|-----------------------|
| bigint | BIGINT | [*1](#warn-data-size) |
| binary | BLOB | [*2](#warn-data-size) |
| bit | BOOLEAN | |
| char | TEXT | [*2](#warn-data-size) |
| float | DOUBLE | |
| image | BLOB | |
| int | INT | |
| nchar | TEXT | [*2](#warn-data-size) |
| ntext | TEXT | |
| nvarchar | TEXT | [*2](#warn-data-size) |
| real | FLOAT | |
| smallint | INT | [*2](#warn-data-size) |
| text | TEXT | |
| tinyint | INT | [*2](#warn-data-size) |
| varbinary | BLOB | [*2](#warn-data-size) |
| varchar | TEXT | [*2](#warn-data-size) |

Data types not listed above are not supported. Typical examples are shown below.

- cursor
- date
- datetime
- datetime2
- datetimeoffset
- decimal
- geography
- geometry
- hierarchyid
- money
- numeric
- rowversion
- smalldatetime
- smallmoney
- sql_variant
- time
- uniqueidentifier
- xml

</div>

</div>

{% capture notice--warning %}
**Attention**

1. The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the size of `bigint` in the underlying database. Thus, if data out of this range exists in the imported table, ScalarDB cannot read it.
2. For certain data types noted above, ScalarDB may map to a data type larger than that of the underlying database. In that case, you will see errors when putting a value whose size exceeds the size specified in the underlying database.
3. The maximum size of `BLOB` in ScalarDB is about 2GB (precisely 2^31-1 bytes). In contrast, an Oracle `blob` can have (4GB-1)*(number of blocks). Thus, if data larger than 2GB exists in the imported table, ScalarDB cannot read it.
Contributor

Ditto. Do we observe null, some value, or an error?

Contributor Author

Good question again. The 2GB limit is due to Java's byte-array limits. I haven't tested it, but I guess the Oracle JDBC driver throws an SQLException. Or, we might see an OOM error if the heap size is not correctly configured. Apart from that, we might be able to handle such large objects better by using JDBC `Blob getBlob(...)` and offset-based access instead of `byte[] getBytes(...)`.

Contributor

Ah, I see. I understand that it depends on the Java side.
Thank you!

4. ScalarDB does not support Oracle `float` columns that have higher precision than ScalarDB's `DOUBLE`.
5. ScalarDB does not support Oracle `numeric(p, s)` columns (`p` is precision and `s` is scale) where `p` is larger than 15, due to the maximum size of the data type in ScalarDB. Note that ScalarDB maps the column to `BIGINT` if `s` is zero; otherwise, it maps the column to `DOUBLE`. In the latter case, be aware that rounding can happen in the underlying database since the floating-point value will be cast to a fixed-point one.

{% endcapture %}

<div class="notice--warning" id="warn-data-size">{{ notice--warning | markdownify }}</div>

## Use the import function in your application

You can use the import function in your application with the following interfaces.

- [ScalarDB Admin API](./api-guide.md#import-a-table)
- [ScalarDB Schema Loader API](./schema-loader.md#use-schema-loader-in-your-application)