Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement Cassandra/ScyllaDB Go Online Store #138

Merged
merged 19 commits into from
Dec 2, 2024

Conversation

msistla96
Copy link
Collaborator

@msistla96 msistla96 commented Sep 11, 2024

Cassandra/ScyllaDB

What this PR does / why we need it:

This PR is to implement the ScyllaDB online store which may incur in lower costs when compared to Redis for certain use cases. More details at https://confluence.expedia.biz/display/~msistla/Scylla+DB+Online+Store+Implementation

Change Summary:

  • Added Cassandra/ScyllaDB Go online store implementation.
  • Refactored some common code between online stores into a utils file.

Highlights

Concurrent Queries for Single OnlineRead(...) calls

When a single OnlineRead(...) requests multiple keys, this online store will generate one CQL query per key like this:

SELECT "entity_key", "feature_name", "event_ts", "value" 
FROM "scylladb"."project_fv"
WHERE "entity_key" = ? AND 
"feature_name" IN ('feature1', 'feature2')

and run them in parallel; this is to avoid the potential performance cost of having WHERE "entity_key" in ('key1', 'key2').

DataDog Tracing

By leveraging https://github.com/DataDog/dd-trace-go/blob/main/contrib/gocql/gocql/gocql.go, we're able to have a detailed trace that tells us just how much time we're spending retrieving the features and how much overhead the feature server is actually adding. Below is an image of a trace for retrieval for 3 different keys.
image

Testing

Did a test deployment of this feature server in test us-west-2 for testing retrieval and tracing.

image

Known Limitations/TODOs

  • IN is used for limiting the number of features retrieved, we may hit a query limit when fetching too many features in a single query. We may need to partition queries to retrieve less features at a time if necessary.

@msistla96 msistla96 changed the title feat: Implement scylla db online store feat: Implement ScyllaDB Online Store Sep 13, 2024
@acevedosharp acevedosharp force-pushed the implementScyllaDBOnlineStore branch from ed81825 to 95e0544 Compare November 15, 2024 23:02
Makefile Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
setup.py Outdated Show resolved Hide resolved
@acevedosharp acevedosharp changed the title feat: Implement ScyllaDB Online Store feat: Implement Cassandra/ScyllaDB Go Online Store Nov 26, 2024
@acevedosharp acevedosharp marked this pull request as ready for review November 26, 2024 21:11
// parse protocolVersion
protocolVersion, ok := onlineStoreConfig["protocol_version"]
if !ok {
protocolVersion = 4.0
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing but I remember facing some connection issues with 3.0 or without. Have you come across any issues when using any other Protocol version? Might be better to keep protocolVersion to 4.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't come across such issues

)
}

func TestOnlineRead_RejectsDifferentFeatureViewsInSameRead(t *testing.T) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for OnlineRead with valid inputs?

@msistla96
Copy link
Collaborator Author

LGTM, added some comments to address.

}
} else {
// Return FeatureData with a null value
rowFeatures[featureName] = FeatureData{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this piece of code is repeating, would be nicer to call a helper function instead, something like :

func buildNullFeatureData(featureName, featureViewName string) FeatureData {
return FeatureData{
Reference: serving.FeatureReferenceV2{
FeatureViewName: featureViewName,
FeatureName: featureName,
},
Value: types.Value{
Val: &types.Value_NullVal{NullVal: types.Null_NULL},
},
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this else block entirely, if we don't put this FeatureData in the rowFeatures the final loop of the goroutine will fill the slot with Null

}

if err := scanner.Err(); err != nil {
errorsChannel <- errors.New("failed to scan features: " + err.Error())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we modify the error message to have more context when you look at it?

fmt.Errorf("failed to scan features for entity key '%s' ......)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this part we only have access to a serialized entity key. It's going to look something like this: 0200000064656d6f5f6b657904000000080000000000000000000000, I don't think it's useful to include it as the customer won't know what it is (they input a normal key, for the sample above it was 0).

return results, nil
}

func (c *CassandraOnlineStore) Destruct() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this Destruct() do? Maybe we should close the session here if my understanding is correct?
I see you are opening a session with "CreateTracedSession"...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was following the lead of redisonlinestore.go with empty Destruct(). TBH I guess it won't matter a whole lot, as when Destruct() needs to be called, the container is getting killed anyway, maybe the db will have a dead connection for a while.

Anyway, closing the session there now, thanks.

// parse hosts
cassandraHosts, ok := onlineStoreConfig["hosts"]
if !ok {
cassandraConfig.hosts = []string{"127.0.0.1"}
Copy link
Collaborator

@Manisha4 Manisha4 Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use constants wherever possible?
Example:

const DefaultCassandraHost = "127.0.0.1"

Wherever you're using hostnames, default values, etc.. Makes it easier to modify later instead of changeing it throughout the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would rather not, these values are only used in one place (extractCassandraConfig(...)) I think constants in this case make it harder to read without an IDE as you have to jump around.

}
cassandraConfig.hosts = cassandraHostsStr
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we can condense lines 70 to 90 or wherever you are parsing a string field for the Cassandra config? Something like the following? We could reduce the lines of code. We can even use godoc conventions to explain certain functions if it seems complicated.

func parseStringField(config map[string]any, key, defaultValue string) (string, error) {
rawValue, ok := config[key]
if !ok {
log.Warn().Msgf("%s not defined: Using default value '%s'", key, defaultValue)
return defaultValue, nil
}
value, ok := rawValue.(string)
if !ok {
return "", fmt.Errorf("failed to convert %s to string: %v", key, rawValue)
}
return value, nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@acevedosharp acevedosharp requested a review from Manisha4 December 2, 2024 17:55
@acevedosharp acevedosharp merged commit 1c1d1f0 into master Dec 2, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants