Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Backport 1.9.latest] Add batch context object to microbatch jinja context #11064

Merged
merged 1 commit into from
Nov 27, 2024

Commits on Nov 27, 2024

  1. Add batch context object to microbatch jinja context (#11031)

    * Add `batch_id` to jinja context of microbatch batches
    
    * Add changie doc
    
    * Update `format_batch_start` to assume `batch_start` is always provided
    
    * Add "runtime only" property `batch_context` to `ModelNode`
    
    By it being "runtime only" we mean that it doesn't exist on the artifact
    and thus won't be written out to the manifest artifact.
    
    * Begin populating `batch_context` during materialization execution for microbatch batches
    
    * Fix circular import
    
    * Fixup MicrobatchBuilder.batch_id property method
    
    * Ensure MicrobatchModelRunner doesn't double compile batches
    
    We were compiling the node for each batch _twice_. Besides making microbatch
    models more expensive than they needed to be, double compiling wasn't
    causing any issue. However the first compilation was happening _before_ we
    had added the batch context information to the model node for the batch. This
    was leading to models which try to access the `batch_context` information on the
    model to blow up, which was undesirable. As such, we've now gone and skipped
    the first compilation. We've done this similar to how SavedQuery nodes skip
    compilation.
    
    * Add `__post_serialize__` method to `BatchContext` to ensure correct dict shape
    
    This is weird, but necessary, I apologize. Mashumaro handles the
    dictification of this class via a compile time generated `to_dict`
    method based off of the _typing_ of th class. By default `datetime`
    types are converted to strings. We don't want that, we want them to
    stay datetimes.
    
    * Update tests to check for `batch_context`
    
    * Update `resolve_event_time_filter` to use new `batch_context`
    
    * Stop testing for batchless compiled code for microbatch models
    
    In 45daec7 we stopped an extra compilation
    that was happening per batch prior to the batch_context being loaded. Stopping
    this extra compilation means that compiled sql for the microbatch model without
    the event time filter / batch context is no longer produced. We have discussed
    this and _believe_ it is okay given that this is a new node type that has not
    hit GA yet.
    
    * Rename `ModelNode.batch_context` to `ModelNode.batch`
    
    * Rename `build_batch_context` to `build_jinja_context_for_batch`
    
    The name `build_batch_context` was confusing as
    1) We have a `BatchContext` object, which the method was not building
    2) The method builds the jinja context for the batch
    As such it felt appropriate to rename the method to more accurately
    communicate what it does.
    
    * Rename test macro `invalid_batch_context_macro_sql` to `invalid_batch_jinja_context_macro_sql`
    
    This rename was to make it more clear that the jinja context for a
    batch was being checked, as a batch_context has a slightly different
    connotation.
    
    * Update changie doc
    
    (cherry picked from commit c3d87b8)
    QMalcolm authored and github-actions[bot] committed Nov 27, 2024
    Configuration menu
    Copy the full SHA
    3519cd8 View commit details
    Browse the repository at this point in the history