Update documents for cross-partition scan and import feature #1301

jnmt · 2023-11-21T00:03:05Z

Description

This PR adds documents related to cross-partition scan (previously called relational scan) and the table import feature. It depends on #1294, which is still under review, but PTAL in parallel since reviewing and revising the docs might take a long time.

Related issues and/or PRs

Changes made

Add cross-partition scan documents
Add table import documents

Checklist

I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation to reflect the changes.
Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
Tests (unit, integration, etc.) have been added for the changes.
My changes generate no new warnings.
Any dependent changes in other PRs have been merged and published. (Add cross-partition scan options #1294 should be merged together)

Additional notes (optional)

This PR basically focuses on the specification of the functions and how to use each function. We will provide a holistic guide for the migration to ScalarDB and its sample in other PRs.

Release notes

Added documents for cross-partition scan and table import.

jnmt · 2023-11-21T00:05:19Z

docs/schema-loader-import.md

+You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and the ScalarDB metadata tables. There would also be several differences between your database and ScalarDB and limitations.
+{% endcapture %}


After adding the holistic migration guide, I will add a reference for it around here.

kota2and3kan

Thank you for the updates!
I left some suggestions and questions.
Please take a look when you have time!

kota2and3kan · 2023-11-22T04:53:20Z

docs/api-guide.md

+{% capture notice--info %}
+**Note**
+
+In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet`  (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet`  (so-called disjunctive normal form).


Suggested change

In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet`  (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet`  (so-called disjunctive normal form).

In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (so-called conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (so-called disjunctive normal form).

Fixed in 6726530.
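
For readers following this thread, the two accepted shapes can be modeled in plain Java. This is only a sketch of the logic, not the ScalarDB builder API: the class `ConditionShapes`, the methods `cnf` and `dnf`, and the columns `c1`, `c2`, and `c3` are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Plain-Java model of the two accepted condition shapes; this is not the ScalarDB API. */
final class ConditionShapes {

  /** Conjunctive normal form: every OR-group needs at least one matching condition. */
  static <T> Predicate<T> cnf(List<List<Predicate<T>>> orGroups) {
    return row -> orGroups.stream().allMatch(g -> g.stream().anyMatch(p -> p.test(row)));
  }

  /** Disjunctive normal form: at least one AND-group needs all of its conditions to match. */
  static <T> Predicate<T> dnf(List<List<Predicate<T>>> andGroups) {
    return row -> andGroups.stream().anyMatch(g -> g.stream().allMatch(p -> p.test(row)));
  }

  public static void main(String[] args) {
    // Hypothetical columns c1, c2, c3 on a row modeled as a map.
    Predicate<Map<String, Integer>> c1Is10 = r -> r.get("c1") == 10;
    Predicate<Map<String, Integer>> c2Positive = r -> r.get("c2") > 0;
    Predicate<Map<String, Integer>> c3Below5 = r -> r.get("c3") < 5;

    Map<String, Integer> row = Map.of("c1", 10, "c2", -1, "c3", 3);

    // c1 = 10 AND (c2 > 0 OR c3 < 5)
    Predicate<Map<String, Integer>> asCnf =
        cnf(List.of(List.of(c1Is10), List.of(c2Positive, c3Below5)));
    // (c1 = 10 AND c2 > 0) OR c3 < 5
    Predicate<Map<String, Integer>> asDnf =
        dnf(List.of(List.of(c1Is10, c2Positive), List.of(c3Below5)));

    System.out.println(asCnf.test(row));
    System.out.println(asDnf.test(row));
  }
}
```

A chain that mixes both shapes, such as an OR-group nested inside an AND-group that is itself inside an OR, fits neither model, which is the restriction the note describes.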

docs/schema-loader-import.md

kota2and3kan · 2023-11-22T10:01:02Z

docs/schema-loader-import.md

+
+1. The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the size of `bigint` in the underlying database. Thus, if the data out of this range exists in the imported table, ScalarDB cannot read it.
+2. For certain data types noted above, ScalarDB may map a data type larger than that of the underlying database. In that case, You will see errors when putting a value with a size larger than the size specified in the underlying database.
+3. The maximum size of `BLOB` in ScalarDB is about 2GB (precisely 2^31-1 bytes). In contrast, Oracle `blob` can have (4GB-1)*(number of blocks). Thus, if the data larger than 2GB exists in the imported table, ScalarDB cannot read it.


Ditto. Do we observe null, some value, or an error?

Good question again. The 2GB limit is due to Java's byte array limits. I haven't tested it, but I guess the Oracle JDBC driver throws an SQLException. Or, we might see an OOM error if the heap size is not correctly configured. Apart from that, we might be able to handle such large objects better by using JDBC `Blob getBlob(...)` and offset-based access instead of `byte[] getBytes(...)`.

Ah, I see. I understand that it depends on the Java side.
Thank you!
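
The limits quoted in the diff above can be checked before an import with a few lines of plain Java. This is a minimal sketch, assuming both `BIGINT` bounds are inclusive; the class `ImportLimits` and its method names are hypothetical and not part of ScalarDB.

```java
/** Pre-import checks mirroring the limits quoted in the diff; names are hypothetical. */
final class ImportLimits {
  static final long BIGINT_MIN = -(1L << 53); // -2^53
  static final long BIGINT_MAX = 1L << 53; // 2^53
  static final long BLOB_MAX_BYTES = Integer.MAX_VALUE; // 2^31 - 1, the Java byte[] limit

  /** True if the value is inside the BIGINT range stated for ScalarDB (bounds assumed inclusive). */
  static boolean fitsBigint(long value) {
    return value >= BIGINT_MIN && value <= BIGINT_MAX;
  }

  /** True if an object of this length could be held in a single Java byte array. */
  static boolean fitsBlob(long lengthInBytes) {
    return lengthInBytes >= 0 && lengthInBytes <= BLOB_MAX_BYTES;
  }

  public static void main(String[] args) {
    System.out.println(fitsBigint(9_007_199_254_740_992L)); // exactly 2^53
    System.out.println(fitsBigint(Long.MAX_VALUE)); // a full 64-bit value
    System.out.println(fitsBlob(3L * 1024 * 1024 * 1024)); // a 3 GiB object
  }
}
```

Running checks like these against the existing data would show whether any rows fall outside what ScalarDB can read, without relying on the untested driver behavior discussed above.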

docs/api-guide.md

[skip ci]

jnmt · 2023-11-24T05:56:49Z

@kota2and3kan Thanks for the feedback! Fixed based on the feedback (though the exception-related part is left as is for now), so PTAL when you get a chance.

kota2and3kan

LGTM! Thank you!

feeblefakie

Overall, looking good! Thank you!
Left some comments and suggestions. PTAL!

feeblefakie · 2023-11-28T07:02:06Z

docs/api-guide.md

+{% capture notice--warning %}
+**Attention**
+
+Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.


Suggested change

Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

We do not recommend enabling cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

The expression "it could make the isolation level lower" sounds a bit unclear to me.

BTW, sorry, I don't fully remember the discussion we had before.
So, we decided to only warn in case users use the cross-partition scan with serializable isolation for backward compatibility instead of throwing a runtime exception, didn't we?

Thanks for the comment. Fixed in e3dd00d.

> BTW, sorry, I don't fully remember the discussion we had before.
> So, we decided to only warn in case users use the cross-partition scan with serializable isolation for backward compatibility instead of throwing a runtime exception, didn't we?

At least, we didn't choose to completely disable the cross-partition scan with serializable isolation in v4.0. Disabling it is one idea, but I think it might be useful in some cases regardless of backward compatibility; e.g., users want to basically run transactions in a serializable manner but sometimes run read-only cross-partition scans without changing the setting.

@brfrn169 Do you have any idea?

OK, thank you! So, for now, we need to enable it in 3.x for backward-compatibility, and we haven't decided to do so in 4.x (we need to think what we should do for 4.x). Is my understanding correct?

Almost, yes. My understanding is that we will keep warning about it, the same as in v3.x, unless we explicitly decide to stop the feature in v4.x.
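
The warn-only behavior discussed in this thread can be sketched as a check over a `java.util.Properties` object. Only `scalar.db.consensus_commit.isolation_level` is named in this PR; the keys `scalar.db.cross_partition_scan.enabled` and `scalar.db.storage`, the fallback values, and the class `CrossPartitionScanCheck` are assumptions made for illustration, and the real check in ScalarDB may look different.

```java
import java.util.Properties;

/** Hypothetical guard that mirrors the warn-only decision; not ScalarDB's implementation. */
final class CrossPartitionScanCheck {
  // Property name quoted later in this review.
  static final String ISOLATION_LEVEL = "scalar.db.consensus_commit.isolation_level";
  // Assumed property names; the thread does not spell them out.
  static final String CROSS_PARTITION_SCAN = "scalar.db.cross_partition_scan.enabled";
  static final String STORAGE = "scalar.db.storage";

  /** True when cross-partition scan is on, isolation is SERIALIZABLE, and storage is not JDBC. */
  static boolean shouldWarn(Properties props) {
    // Fallback values below are assumptions of this sketch, not documented defaults.
    boolean scanEnabled =
        Boolean.parseBoolean(props.getProperty(CROSS_PARTITION_SCAN, "false"));
    boolean serializable =
        "SERIALIZABLE".equalsIgnoreCase(props.getProperty(ISOLATION_LEVEL, "SNAPSHOT"));
    boolean jdbc = "jdbc".equalsIgnoreCase(props.getProperty(STORAGE, ""));
    return scanEnabled && serializable && !jdbc;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(STORAGE, "cassandra");
    props.setProperty(CROSS_PARTITION_SCAN, "true");
    props.setProperty(ISOLATION_LEVEL, "SERIALIZABLE");
    if (shouldWarn(props)) {
      // Warn instead of throwing, so existing configurations keep working.
      System.err.println("Cross-partition scan may run with lower isolation (snapshot).");
    }
  }
}
```

Returning a flag rather than throwing keeps the backward-compatible behavior described above: the configuration is still accepted and the user only sees a warning.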

feeblefakie · 2023-11-28T07:03:48Z

docs/api-guide.md

-##### Execute `Scan` without specifying a partition key to retrieve all the records of a table
+##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table
+
+You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.


Suggested change

You can execute a `Scan` operation across all partitions without specifying a partition key by enabling the following property in the ScalarDB configuration.

You can execute a `Scan` operation across all partitions, which we call cross-partition scan, without specifying a partition key by enabling the following property in the ScalarDB configuration.

Fixed in e3dd00d.

feeblefakie · 2023-11-28T07:06:52Z

docs/configurations.md

+{% capture notice--warning %}
+**Attention**
+
+Except for JDBC databases, we do not recommend enabling cross-partition scan with serializable isolation because it could make the isolation level lower (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.


Fixed in e3dd00d.

docs/schema-loader-import.md

brfrn169

Overall, LGTM. Left a couple of comments. Please take a look when you have time!

brfrn169 · 2023-11-29T01:32:00Z

docs/api-guide.md

+```java
+// Import the table "ns.tbl". If the table is already managed by ScalarDB, the target table does not
+// exist, or the table does not meet the requirement of ScalarDB table, an exception will be thrown.
+admin.importTable("ns", "tbl");


We have added `Map<String, String> options` as the third argument to this method:

Suggested change

admin.importTable("ns", "tbl");

admin.importTable("ns", "tbl", options);

And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change for the 3 branch.

Good catch, thank you! Fixed in e3dd00d.

> And we didn't add the argument for the 3 branch. To avoid diverging this doc between master and 3, we should probably apply the same change for the 3 branch.

I'm OK with using `admin.importTable("ns", "tbl", options)` even in v3.x, but it might be confusing for users. If we can just apply it without any other concerns, I can handle it since it's not a big divergence. Should I or not?

Actually, the `importTable()` method was already introduced in 3.10, but it was treated as an experimental feature. Therefore, I think we can also add the options argument to the 3 branch.

Ahh, right. Got it. Thank you!
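
The three-argument call shape agreed on here can be exercised without a database by using a stand-in. In the sketch below, `TableImporter` is a hypothetical interface that copies only the signature quoted in this thread, and `tryImport` is an illustrative helper. Neither is part of ScalarDB, and the option keys are left empty because the thread does not list any.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of calling importTable with an options map; all types here are stand-ins. */
final class ImportTableSketch {

  /** Hypothetical stand-in that mirrors only the call shape quoted in the thread. */
  interface TableImporter {
    void importTable(String namespace, String table, Map<String, String> options)
        throws Exception;
  }

  /** Attempts the import and reports failure instead of propagating the exception. */
  static boolean tryImport(
      TableImporter admin, String namespace, String table, Map<String, String> options) {
    try {
      admin.importTable(namespace, table, options);
      return true;
    } catch (Exception e) {
      // Per the diff comment, an exception is thrown if the table is already managed,
      // does not exist, or does not meet the table requirements.
      System.err.println("Import of " + namespace + "." + table + " failed: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    // Left empty on purpose; valid option keys are not listed in this thread.
    Map<String, String> options = new HashMap<>();
    TableImporter fake =
        (ns, tbl, opts) -> System.out.println("importing " + ns + "." + tbl + " with " + opts);
    System.out.println(tryImport(fake, "ns", "tbl", options));
  }
}
```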

brfrn169 · 2023-11-29T01:41:30Z

docs/schema-loader.md

+    // Import tables.
+    // You can also use a Properties object instead of configFilePath and a serialized-schema JSON
+    // string instead of schemaFilePath.
+    SchemaLoader.load(configFilePath, schemaFilePath, tableCreationOptions, createCoordinatorTables);


importTables, right?

Suggested change

SchemaLoader.load(configFilePath, schemaFilePath, tableCreationOptions, createCoordinatorTables);

SchemaLoader.importTables(configFilePath, schemaFilePath, tableCreationOptions);

Thank you! Fixed in e3dd00d.

One more thing, it looks like the importTables method doesn't receive the createCoordinatorTables argument.

And maybe `tableCreationOptions` should be renamed to something like `tableImportOptions`?

Oops, sorry about that. I fixed it throughout the sample in 148d9e2. PTAL!

[skip ci]

brfrn169

LGTM! Thank you!

komamitsu

LGTM! Thank you!

komamitsu · 2023-11-30T00:58:55Z

docs/api-guide.md

+{% capture notice--warning %}
+**Attention**
+
+We do not recommend enabling the cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.


Suggested change

We do not recommend enabling the cross-partition scan with serializable isolation for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

We do not recommend enabling the cross-partition scan with `SERIALIZABLE` isolation level for non-JDBC databases because transactions could be executed with lower isolation (i.e., snapshot). Use it at your own risk only if the consistency does not matter for your transactions.

If you meant `scalar.db.consensus_commit.isolation_level`, I think it should be capitalized: https://scalardb.scalar-labs.com/docs/latest/configurations/#basic-configurations

Fixed in de406a2. Thank you for the feedback!

feeblefakie

LGTM! Thank you!

[skip ci]

josh-wong

I've added some comments and suggestions. PTAL!

core/src/main/java/com/scalar/db/transaction/consensuscommit/ConsensusCommitConfig.java

docs/api-guide.md

docs/schema-loader-import.md

docs/api-guide.md

Co-authored-by: Josh Wong <[email protected]>

josh-wong

LGTM! Thank you!🙇‍♂️

Co-authored-by: Josh Wong <[email protected]>

Update documents for cross-parttion scan and import feature

3bf7549

jnmt commented Nov 21, 2023

jnmt added improvement documentation labels Nov 21, 2023

jnmt self-assigned this Nov 21, 2023

jnmt requested review from brfrn169, feeblefakie, Torch3333, josh-wong and kota2and3kan November 21, 2023 00:07

kota2and3kan reviewed Nov 22, 2023

Fix based on feedback and add config docs

6726530

[skip ci]

jnmt requested a review from kota2and3kan November 24, 2023 05:56

kota2and3kan approved these changes Nov 27, 2023

feeblefakie reviewed Nov 28, 2023

brfrn169 reviewed Nov 29, 2023

jnmt added 2 commits November 29, 2023 11:17

Merge branch 'master' into update-relational-scan-and-import-docs

226b69f

Fix based on feedback

e3dd00d

[skip ci]

jnmt requested review from brfrn169 and feeblefakie November 29, 2023 03:23

Fix schema loader import sample

148d9e2

[skip ci]

brfrn169 approved these changes Nov 29, 2023

brfrn169 requested a review from komamitsu November 29, 2023 16:46

komamitsu approved these changes Nov 30, 2023

feeblefakie approved these changes Nov 30, 2023

Fix based on feedback

de406a2

[skip ci]

josh-wong requested changes Nov 30, 2023

Apply suggestions from code review

67033d9

Co-authored-by: Josh Wong <[email protected]>

jnmt requested a review from josh-wong November 30, 2023 09:32

josh-wong approved these changes Nov 30, 2023

Merge branch 'master' into update-relational-scan-and-import-docs

f026a7e

brfrn169 merged commit 980a0e2 into master Nov 30, 2023
23 checks passed

brfrn169 deleted the update-relational-scan-and-import-docs branch November 30, 2023 10:00

feeblefakie pushed a commit that referenced this pull request Nov 30, 2023

Update documents for cross-partition scan and import feature (#1301)

59fc828

Co-authored-by: Josh Wong <[email protected]>

feeblefakie mentioned this pull request Nov 30, 2023

Backport to branch(3) : Update documents for cross-partition scan and import feature #1338

Merged

brfrn169 removed the improvement label Jan 5, 2024

jnmt mentioned this pull request May 21, 2024

Fix default value of cross-partition scan in docs #1753

Merged

6 tasks
