Update documents for cross-partition scan and import feature #1301

Merged: 8 commits, Nov 30, 2023
Changes from 2 commits
97 changes: 94 additions & 3 deletions docs/api-guide.md
@@ -295,6 +295,25 @@ boolean ifExist = true;
admin.dropCoordinatorTables(ifExist);
```

### Import a table

You can import an existing table to ScalarDB as follows:

```java
// Import the table "ns.tbl". An exception is thrown if the table is already managed by ScalarDB,
// if the target table does not exist, or if the table does not meet the requirements of a ScalarDB table.
admin.importTable("ns", "tbl");
Collaborator

We have added Map<String, String> options as 3rd argument to this method:

Suggested change
admin.importTable("ns", "tbl");
admin.importTable("ns", "tbl", options);

We didn't add the argument in the 3 branch. To keep this doc from diverging between master and 3, we should probably apply the same change to the 3 branch.

Contributor Author

Good catch, thank you! Fixed in e3dd00d.

We didn't add the argument in the 3 branch. To keep this doc from diverging between master and 3, we should probably apply the same change to the 3 branch.

I'm OK with using `admin.importTable("ns", "tbl", options)` even in v3.x, but it might be confusing for users. If we can apply it without any other concerns, I can handle it since it's not a big divergence. Should I?

Collaborator

Actually, the `importTable()` method was already introduced in 3.10, but it was treated as an experimental feature. Therefore, I think we can also add the options argument to the 3 branch.

Contributor Author

Ahh, right. Got it. Thank you!

```

{% capture notice--warning %}
**Attention**

You should carefully plan before importing a table to ScalarDB in production because doing so will add transaction metadata columns to your database tables and the ScalarDB metadata tables. There are also several differences between your database and ScalarDB, as well as some limitations. For details, see the following document.

- [Importing existing tables to ScalarDB using ScalarDB Schema Loader](./schema-loader-import.md)

{{ notice--warning | markdownify }}

## Transactional API

This section explains how to execute transactional operations by using the Transactional API in ScalarDB.
@@ -621,9 +640,21 @@ You can't specify clustering-key boundaries and orderings in `Scan` by using a s

<div class="notice--info">{{ notice--info | markdownify }}</div>

- ##### Execute `Scan` without specifying a partition key to retrieve all the records of a table
+ ##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table

You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
Contributor

Suggested change
You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.
You can execute a `Scan` operation across all partitions, which we call cross-partition scan, without specifying a partition key by enabling the following property in the ScalarDB configuration.

Contributor Author

Fixed in e3dd00d.


```properties
scalar.db.cross_partition_scan.enabled=true
```

{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Contributor

Suggested change
Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
We do not recommend enabling cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

The expression "it could make the isolation level lower" sounds a bit unclear to me.

BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn users who use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

Contributor Author

Thanks for the comment. Fixed in e3dd00d.

BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn users who use the cross-partition scan with serializable isolation, for backward compatibility, instead of throwing a runtime exception, didn't we?

At least, we didn't choose to completely disable the cross-partition scan with serializable isolation in v4.0. Disabling it is one idea, but I think it might be useful in some cases regardless of backward compatibility; e.g., users want to basically run transactions in a serializable manner but sometimes run read-only cross-partition scans without changing the setting.

@brfrn169 Do you have any idea?

Contributor

OK, thank you! So, for now, we need to enable it in 3.x for backward compatibility, and we haven't decided whether to do so in 4.x (we need to think about what we should do for 4.x). Is my understanding correct?

Contributor Author

Almost, yes. My understanding is that we will keep warning about it, the same as in v3.x, unless we explicitly decide to stop the feature in v4.x.

{% endcapture %}

- You can execute a `Scan` operation without specifying a partition key.
+ <div class="notice--warning">{{ notice--warning | markdownify }}</div>

Instead of calling the `partitionKey()` method in the builder, you can call the `all()` method to scan a table without specifying a partition key as follows:

@@ -645,11 +676,71 @@ List<Result> results = transaction.scan(scan);
{% capture notice--info %}
**Note**

- You can't specify clustering-key boundaries and orderings in `Scan` without specifying a partition key.
+ You can't specify any filtering conditions and orderings in cross-partition `Scan` except for JDBC databases. See the following section to use cross-partition `Scan` with filtering or ordering for JDBC databases.
{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>

##### Execute cross-partition `Scan` with filtering and ordering

By enabling the cross-partition scan option with filtering and ordering for JDBC databases as follows, you can execute a cross-partition `Scan` operation with flexible conditions and orderings.

```properties
scalar.db.cross_partition_scan.enabled=true
scalar.db.cross_partition_scan.filtering.enabled=true
scalar.db.cross_partition_scan.ordering.enabled=true
```

You can call the `where()` and `ordering()` methods after calling the `all()` method to specify arbitrary conditions and orderings as follows:

```java
// Create a `Scan` operation with arbitrary conditions and orderings.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(ConditionBuilder.column("c1").isNotEqualToInt(10))
        .projections("c1", "c2", "c3", "c4")
        .orderings(Scan.Ordering.desc("c3"), Scan.Ordering.asc("c4"))
        .limit(10)
        .build();

// Execute the `Scan` operation.
List<Result> results = transaction.scan(scan);
```
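
To make the combined effect of `where()`, `orderings()`, and `limit()` concrete, the following is a minimal in-memory analogy written with plain Java streams. It is only a sketch of the expected result set, not how ScalarDB executes the scan; the `Row` class and the `scan()` helper are hypothetical names introduced for illustration and are not part of the ScalarDB API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class CrossPartitionScanAnalogy {
  // Hypothetical in-memory row type for illustration; not a ScalarDB class.
  static final class Row {
    final int c1;
    final int c3;
    final int c4;

    Row(int c1, int c3, int c4) {
      this.c1 = c1;
      this.c3 = c3;
      this.c4 = c4;
    }
  }

  // Mirrors the builder chain: where(c1 != 10), orderings(c3 desc, c4 asc), limit(n).
  static List<Row> scan(List<Row> rows, int limit) {
    Comparator<Row> byC3 = Comparator.comparingInt(r -> r.c3);
    Comparator<Row> order = byC3.reversed().thenComparingInt(r -> r.c4);
    return rows.stream()
        .filter(r -> r.c1 != 10) // the where() condition
        .sorted(order)           // the orderings() clause
        .limit(limit)            // the limit() clause
        .collect(Collectors.toList());
  }
}
```

In this analogy the filter is applied before the sort and the limit, so the limit counts only rows that satisfy the condition.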

As an argument of the `where()` method, you can specify a condition, an and-wise condition set, or an or-wise condition set. After calling the `where()` method, you can add more conditions or condition sets using the `and()` method or `or()` method as follows:

```java
// Create a `Scan` operation with condition sets.
Scan scan =
    Scan.newBuilder()
        .namespace("ns")
        .table("tbl")
        .all()
        .where(
            ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10))
                .or(ConditionBuilder.column("c1").isGreaterThanInt(20))
                .build())
        .and(
            ConditionSetBuilder.condition(ConditionBuilder.column("c2").isLikeText("a%"))
                .or(ConditionBuilder.column("c2").isLikeText("b%"))
                .build())
        .limit(10)
        .build();
```

{% capture notice--info %}
**Note**

In the `where()` method chain, the conditions must form either an and-wise junction of `ConditionalExpression` or `OrConditionSet` objects (conjunctive normal form), as in the example above, or an or-wise junction of `ConditionalExpression` or `AndConditionSet` objects (disjunctive normal form).
{% endcapture %}

<div class="notice--info">{{ notice--info | markdownify }}</div>
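
As a rough illustration of what conjunctive normal form means for the condition-set example, the sketch below models the same filter, `(c1 < 10 OR c1 > 20) AND (c2 LIKE 'a%' OR c2 LIKE 'b%')`, with plain `java.util.function.Predicate` objects. The `Row` class and the `anyOf()`, `allOf()`, and `exampleFilter()` helpers are hypothetical and are not part of ScalarDB, and `startsWith()` is used here only as an approximation of a `LIKE 'a%'` pattern.

```java
import java.util.List;
import java.util.function.Predicate;

final class CnfSketch {
  // Hypothetical row type for illustration; not a ScalarDB class.
  static final class Row {
    final int c1;
    final String c2;

    Row(int c1, String c2) {
      this.c1 = c1;
      this.c2 = c2;
    }
  }

  // One or-wise condition set: true if any of its conditions holds.
  static Predicate<Row> anyOf(List<Predicate<Row>> conditions) {
    return row -> conditions.stream().anyMatch(c -> c.test(row));
  }

  // And-wise junction of or-wise sets, that is, conjunctive normal form.
  static Predicate<Row> allOf(List<Predicate<Row>> clauses) {
    return row -> clauses.stream().allMatch(c -> c.test(row));
  }

  // (c1 < 10 OR c1 > 20) AND (c2 starts with "a" OR c2 starts with "b")
  static Predicate<Row> exampleFilter() {
    Predicate<Row> c1Clause = anyOf(List.of(r -> r.c1 < 10, r -> r.c1 > 20));
    Predicate<Row> c2Clause =
        anyOf(List.of(r -> r.c2.startsWith("a"), r -> r.c2.startsWith("b")));
    return allOf(List.of(c1Clause, c2Clause));
  }
}
```

A disjunctive normal form filter would swap the roles of the two helpers: `anyOf()` on the outside and `allOf()` for each inner set.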

For more details about available conditions and condition sets, see the `ConditionBuilder` and `ConditionSetBuilder` pages in the [Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) for the version of ScalarDB that you're using.

#### `Put` operation

`Put` is an operation to put a record specified by a primary key. The operation behaves as an upsert operation for a record, in which the operation updates the record if the record exists or inserts the record if the record does not exist.
18 changes: 18 additions & 0 deletions docs/configurations.md
@@ -158,6 +158,24 @@ For details about using multiple storages, see [Multi-Storage Transactions](mult

For details about client configurations, see the ScalarDB Cluster [client configurations (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/developer-guide-for-scalardb-cluster-with-java-api/#client-configurations).

## Cross-partition scan configurations

By enabling the cross-partition scan option below, a `Scan` operation can retrieve all records across partitions. In addition, you can specify arbitrary conditions and orderings in the cross-partition `Scan` operation by enabling `scalar.db.cross_partition_scan.filtering.enabled` and `scalar.db.cross_partition_scan.ordering.enabled`, respectively. Currently, cross-partition scan with filtering and ordering is available only for JDBC databases. Note that `scalar.db.cross_partition_scan.enabled` must be `true` for the filtering and ordering options to take effect. See [Java API Guide - Scan operation](./api-guide.md#scan-operation) for how to use the cross-partition scan.

{% capture notice--warning %}
**Attention**

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.
Contributor

Ditto.

Contributor Author

Fixed in e3dd00d.

{% endcapture %}

<div class="notice--warning">{{ notice--warning | markdownify }}</div>

| Name | Description | Default |
|----------------------------------------------------|-----------------------------------------------|---------|
| `scalar.db.cross_partition_scan.enabled` | Enable the cross partition scan. | `false` |
| `scalar.db.cross_partition_scan.filtering.enabled` | Enable filtering in the cross partition scan. | `false` |
| `scalar.db.cross_partition_scan.ordering.enabled` | Enable ordering in the cross partition scan. | `false` |
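
Because the filtering and ordering options depend on `scalar.db.cross_partition_scan.enabled`, it can help to check a properties file for that dependency before starting an application. The following is a minimal sketch of such a pre-flight check that uses only `java.util.Properties`. The `CrossPartitionScanConfigCheck` class and its methods are hypothetical and are not part of ScalarDB, and ScalarDB's own handling of an inconsistent configuration may differ.

```java
import java.util.Properties;

final class CrossPartitionScanConfigCheck {
  static final String ENABLED = "scalar.db.cross_partition_scan.enabled";
  static final String FILTERING = "scalar.db.cross_partition_scan.filtering.enabled";
  static final String ORDERING = "scalar.db.cross_partition_scan.ordering.enabled";

  // Reads a boolean property, defaulting to false as listed in the table above.
  static boolean flag(Properties props, String key) {
    return Boolean.parseBoolean(props.getProperty(key, "false"));
  }

  // Returns true unless filtering or ordering is requested without the base option.
  static boolean isConsistent(Properties props) {
    boolean needsBase = flag(props, FILTERING) || flag(props, ORDERING);
    return !needsBase || flag(props, ENABLED);
  }
}
```

For example, a configuration that sets only the filtering option to `true` would be reported as inconsistent until `scalar.db.cross_partition_scan.enabled=true` is added.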

## Other ScalarDB configurations

The following are additional configurations available for ScalarDB: