feat: add HBase SDK for serving #127

Merged: 33 commits, Oct 2, 2024

Conversation

@bayu-aditya (Collaborator) commented Sep 3, 2024

Summary

This PR adds support for using HBase as the online store.

  • Reads data from BigTable and HBase in caraml-serving through the HBase SDK when the configuration caraml.store.bigtable.isUsingHBaseSDK is set to true. When the value is false, data is read through the BigTable SDK.
  • Modifies the Spark job to ingest data into HBase for stream and batch ingestion, while reusing most of the existing logic to pull data from BigTable.
  • Adds unit tests for reading data from BigTable using the HBase SDK and for reading data from HBase directly.
  • Refactors the BigTable online store to share BaseSchemaRegistry.
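
For readers unfamiliar with the toggle, the flag-based selection described in the first bullet can be sketched roughly as follows. Only the property name comes from this PR; the `RetrieverSelector` class, the stand-in `OnlineRetriever` interface, and the assumption that a missing flag falls back to the BigTable SDK are illustrative and not taken from the actual code.

```java
import java.util.Map;

/** Hypothetical sketch: picks a read path from a boolean config flag. */
final class RetrieverSelector {
    // Property name taken from the PR summary; everything else is illustrative.
    static final String FLAG = "caraml.store.bigtable.isUsingHBaseSDK";

    /** Minimal stand-in for the serving retriever interface. */
    interface OnlineRetriever {
        String sdkName();
    }

    static OnlineRetriever select(Map<String, String> props) {
        // Boolean.parseBoolean returns false for null or any value other than "true",
        // so this sketch assumes the BigTable SDK path is the default.
        boolean useHBase = Boolean.parseBoolean(props.get(FLAG));
        return useHBase ? () -> "hbase-sdk" : () -> "bigtable-sdk";
    }

    public static void main(String[] args) {
        System.out.println(select(Map.of(FLAG, "true")).sdkName());
        System.out.println(select(Map.of()).sdkName());
    }
}
```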

@bayu-aditya bayu-aditya changed the title Retrieve HBase data from local feat: add HBase SDK for serving Sep 4, 2024
@bayu-aditya bayu-aditya force-pushed the bayu/hbase branch 2 times, most recently from 4dc8df6 to fd81384 Compare September 4, 2024 08:17
Use of deprecated classes results in an ImmutableHTableDescriptor being returned, which throws a java.lang.UnsupportedOperationException: HTableDescriptor is read-only error
* Use offset and length to get rowCell values because the HBase server
  returns a slightly different response structure than BigTable
* This is also applied when looking up the Avro schema
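
As background for the offset-and-length change: in the HBase client, a `Cell` exposes `getValueArray()`, `getValueOffset()`, and `getValueLength()`, and the returned array may be a shared buffer that holds more than the value itself. The sketch below illustrates the slicing idea with plain Java only; the buffer layout and the `CellValueSlice` class are hypothetical and are not the PR's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch: read a value out of a larger backing array. */
final class CellValueSlice {
    /** Copies only the value bytes, ignoring whatever surrounds them. */
    static byte[] sliceValue(byte[] backing, int offset, int length) {
        return Arrays.copyOfRange(backing, offset, offset + length);
    }

    public static void main(String[] args) {
        // Made-up layout: row key, qualifier, and value packed into one buffer.
        byte[] backing = "row1|f1|hello".getBytes(StandardCharsets.UTF_8);
        int valueOffset = 8; // index where the value starts in this made-up buffer
        int valueLength = 5; // number of value bytes

        byte[] value = sliceValue(backing, valueOffset, valueLength);
        System.out.println(new String(value, StandardCharsets.UTF_8));
    }
}
```

Treating the whole backing array as the value would pick up the surrounding bytes as well. With the real client, `CellUtil.cloneValue(cell)` performs an equivalent copy.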
shydefoo and others added 9 commits September 9, 2024 11:37
[feat] Add support to use hbase online store
@bayu-aditya bayu-aditya marked this pull request as ready for review September 20, 2024 02:31
@bayu-aditya bayu-aditya requested review from tiopramayudi and removed request for tiopramayudi, bthari and deadlycoconuts September 23, 2024 06:27
@tiopramayudi

Can you update the PR description and highlight the primary changes?


@Bean
public OnlineRetriever getRetriever() {
  // Using HBase SDK
  if (isUsingHBaseSDK) {
    org.apache.hadoop.conf.Configuration config =

Should this block be wrapped in a try/catch? I saw that the connect method can throw an IllegalStateException.

Contributor

@shydefoo shydefoo Sep 26, 2024

@bayu-aditya could you take a look at this?

Collaborator Author

Yup sure

Collaborator Author

Done in 455e253. Thank you for catching this.

Should we use a single try/catch and put the condition inside the try block?
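
One way to read this suggestion, as a minimal sketch: branch on the flag inside a single try block so that both connection paths share one error handler. The `RetrieverFactory` class, the connect helpers, the `reachable` parameter, and the choice to rethrow with added context are all assumptions made for illustration; the PR's actual handling is not shown in this thread.

```java
/** Hypothetical sketch: one try/catch with the SDK condition inside it. */
final class RetrieverFactory {
    /** Minimal stand-in for the serving retriever interface. */
    interface OnlineRetriever {
        String name();
    }

    // Stand-in connect helpers; 'reachable' simulates whether the store can be reached.
    static OnlineRetriever connectHBase(boolean reachable) {
        if (!reachable) {
            throw new IllegalStateException("cannot connect to HBase");
        }
        return () -> "hbase";
    }

    static OnlineRetriever connectBigTable(boolean reachable) {
        if (!reachable) {
            throw new IllegalStateException("cannot connect to BigTable");
        }
        return () -> "bigtable";
    }

    static OnlineRetriever getRetriever(boolean isUsingHBaseSDK, boolean reachable) {
        try {
            // The condition lives inside the try, so both paths share one handler.
            return isUsingHBaseSDK ? connectHBase(reachable) : connectBigTable(reachable);
        } catch (IllegalStateException e) {
            // Assumed policy: rethrow with context and keep the original cause.
            throw new IllegalStateException(
                "Failed to create online retriever (isUsingHBaseSDK=" + isUsingHBaseSDK + ")", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getRetriever(true, true).name());
        System.out.println(getRetriever(false, true).name());
    }
}
```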

@Override
public int hashCode() {
  int result = tableName.hashCode();
  result = 31 * result + schemaHash.hashCode();

What does 31 represent?

Contributor

I have no idea 😆
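
For context on the question: 31 is the conventional multiplier for combining field hashes in Java. It is an odd prime, `String.hashCode()` uses the same constant, and `31 * x` can be computed as `(x << 5) - x`. The sketch below mirrors the pattern in the diff; the `SchemaReference` class name and the `String` field types are assumptions, since the real class is not shown in this thread.

```java
/** Hypothetical sketch of the 31-multiplier hash pattern from the diff. */
final class SchemaReference {
    private final String tableName;
    private final String schemaHash; // assumed type; the PR's field type is not shown

    SchemaReference(String tableName, String schemaHash) {
        this.tableName = tableName;
        this.schemaHash = schemaHash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SchemaReference)) {
            return false;
        }
        SchemaReference other = (SchemaReference) o;
        return tableName.equals(other.tableName) && schemaHash.equals(other.schemaHash);
    }

    @Override
    public int hashCode() {
        // Multiplying by 31 before adding the next field makes field order matter,
        // so ("a", "b") and ("b", "a") usually hash differently.
        int result = tableName.hashCode();
        result = 31 * result + schemaHash.hashCode();
        return result;
    }

    /** 31 * x written as a shift and a subtraction; identical in int arithmetic. */
    static int times31(int x) {
        return (x << 5) - x;
    }

    public static void main(String[] args) {
        System.out.println(new SchemaReference("table", "abc").hashCode());
        System.out.println(times31(7) == 31 * 7);
    }
}
```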

caraml-store-spark/docker/Dockerfile (resolved)
@shydefoo shydefoo requested review from tiopramayudi and removed request for shydefoo September 26, 2024 10:26
@shydefoo
Contributor

@tiopramayudi @mbruner poke


@tiopramayudi tiopramayudi left a comment

Added two more comments; the rest LGTM. Thanks @bayu-aditya @shydefoo

  return Collections.<Feature>emptyList();
} else {
  Result row = rows.get(rowKey);
  return featureReferences.stream()

The chain is very long. Is it possible to break it up?

Collaborator Author

@bayu-aditya bayu-aditya Oct 2, 2024

done in 4355a1a thank you 🙏
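
A common way to shorten a long stream chain, shown here only as a generic sketch and not as the change made in 4355a1a, is to move the per-element logic into a named helper so the pipeline itself stays at three short steps. The `ChainSplitExample` class, the string-based row, and the `decode` helper are hypothetical stand-ins for the real feature decoding.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical sketch: extract per-element logic out of a long stream chain. */
final class ChainSplitExample {
    /** Stand-in for decoding one feature from a row; returns null when absent. */
    static String decode(Map<String, String> row, String featureReference) {
        String value = row.get(featureReference);
        return value == null ? null : featureReference + "=" + value;
    }

    /** The pipeline reads as map, filter, collect once decode() carries the detail. */
    static List<String> decodeAll(Map<String, String> row, List<String> featureReferences) {
        return featureReferences.stream()
            .map(ref -> decode(row, ref))
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> row = Map.of("f1", "10", "f2", "20");
        System.out.println(decodeAll(row, List.of("f1", "missing", "f2")));
    }
}
```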


@bayu-aditya bayu-aditya merged commit b343616 into main Oct 2, 2024
1 check passed
3 participants