feat: Supports nested struct columns as features, timestamp fields #153

Merged
merged 8 commits into from
Nov 7, 2024

@EXPEbdodla (Collaborator) commented Oct 31, 2024

What this PR does / why we need it:

  1. Supports nested fields as features through field mappings in data sources for the Spark offline store:
  • During Spark materialization, field mappings are applied and nested fields are aliased according to the mappings provided.
  • During Spark Kafka streaming ingestion, field mappings are applied and unused columns are dropped from the DataFrame.
  • An exception is thrown when nested fields are used without field mappings defined for them.
  2. Adds a few test cases.
  3. Prints the time taken by online store ingestion for each batch in streaming ingestion.
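
To illustrate the field-mapping behaviour described in item 1, here is a minimal pure-Python sketch, not the code from this PR. It assumes a mapping from dotted source paths to feature names and uses plain dicts in place of a Spark DataFrame. The helper names (`get_nested`, `check_nested_features_mapped`, `apply_field_mapping`) and the sample record are hypothetical. In PySpark the aliasing step would roughly correspond to selecting `col("profile.address.zip").alias("zip_code")`.

```python
from typing import Any, Dict, Iterable, List


def get_nested(record: Dict[str, Any], path: str) -> Any:
    """Walk a dotted path such as 'profile.address.zip' through nested dicts."""
    value: Any = record
    for part in path.split("."):
        value = value[part]
    return value


def check_nested_features_mapped(
    feature_columns: Iterable[str], field_mapping: Dict[str, str]
) -> None:
    """Raise if a feature refers to a nested field that has no field mapping."""
    unmapped = [c for c in feature_columns if "." in c and c not in field_mapping]
    if unmapped:
        raise ValueError(f"Nested fields need a field mapping: {unmapped}")


def apply_field_mapping(
    record: Dict[str, Any], field_mapping: Dict[str, str], keep: List[str]
) -> Dict[str, Any]:
    """Alias mapped (possibly nested) fields, then drop columns not in `keep`."""
    # Mapped fields are read from their source path and stored under the alias.
    out = {alias: get_nested(record, path) for path, alias in field_mapping.items()}
    # Unmapped top-level columns pass through under their own name.
    for name, value in record.items():
        out.setdefault(name, value)
    # Keep only the columns the feature view needs (assumed to be known upfront).
    return {name: out[name] for name in keep}


# Hypothetical record with a nested feature and a nested timestamp field.
record = {
    "user_id": 7,
    "profile": {"address": {"zip": "98101"}},
    "event": {"ts": "2024-10-31T00:00:00Z"},
    "debug": "unused",
}
field_mapping = {"profile.address.zip": "zip_code", "event.ts": "event_ts"}

check_nested_features_mapped(field_mapping.keys(), field_mapping)
row = apply_field_mapping(record, field_mapping, ["user_id", "zip_code", "event_ts"])
print(row)
```

The check runs before any aliasing so that an unmapped nested field fails fast instead of surfacing later as a missing column.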

Which issue(s) this PR fixes:

Misc

@EXPEbdodla force-pushed the feat_nested_column_support branch from 4372720 to da670e6 on November 4, 2024 23:54

@omirandadev (Collaborator) left a comment

Reviewed for functionality and compatibility with the nested struct streams I have worked with before. LGTM.

@EXPEbdodla merged commit cf0f2f2 into master on Nov 7, 2024
23 checks passed