Transformer fields set does not match our data #25
Comments
Ah @rolanddb - I read this slightly too fast. Are you trying to parse an extract from Redshift using this SDK? That's not supported - you should be pointing this at your enriched:good:archive instead.
Ah! So it wasn't me :)
See follow-up comment! I misunderstood your situation I think...
@alexanderdean I'm loading the events from S3, main/shredded/good.
Exactly, yes.
Ok, I'll give that a try tomorrow.
Re-opening as @rolanddb seems to be having an ongoing issue here.
Hello @rolanddb, we're about to publish Scala SDK 0.2.0, but we still haven't managed to identify any case where the transformer fails to match enriched TSV data. I see only one possible cause here - possibly you tried to load events enriched with Snowplow pre-R73 (released in December 2015), which produced more columns than it produces now. If I'm wrong here, would it be possible to provide some details (error message, TSV example) that break the transformer?
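One quick way to gather that kind of detail is to count the tab-separated fields in a single line of the data. The snippet below is only a minimal sketch: `TsvFieldCount` and `fieldCount` are hypothetical names, not part of the SDK, and it assumes the input is one raw enriched event line.

```scala
object TsvFieldCount {
  // Hypothetical helper, not part of the SDK.
  // The limit of -1 keeps trailing empty columns; without it,
  // String.split silently drops them and the count comes out too low.
  def fieldCount(line: String): Int = line.split("\t", -1).length
}
```

Comparing this count with the number of fields the transformer expects should show whether the line has extra or missing columns.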
Hi @chuwy,
Okay great, closing and descheduling...
Hi,
I'm trying to load the atomic.events data from Snowplow into Spark.
I'd like to do this using the EventTransformer.transform() method.
We observe a difference in the fields in our data, compared to what the transform method requires. Therefore, all events are marked as failure (unable to parse).
This is the mismatch:
Fields in our data: 128
Fields in SDK transformer: 131
In transformer but not in data: Set(derived_contexts, unstruct_event, contexts, refr_device_tstamp)
In data but not in transformer: Set(refr_dvce_tstamp)
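For reference, a comparison like the one above can be sketched with plain set differences. This is only an illustration: `FieldDiff`, `Mismatch` and `compare` are hypothetical names, not part of the SDK, and the field lists in `main` are abbreviated examples rather than the full 128 and 131 names.

```scala
object FieldDiff {
  // Hypothetical result type: the two one-sided differences between field sets.
  final case class Mismatch(missingInData: Set[String], unknownToTransformer: Set[String])

  // Compare the field names found in the data with those the transformer expects.
  def compare(dataFields: Seq[String], transformerFields: Seq[String]): Mismatch = {
    val data = dataFields.toSet
    val expected = transformerFields.toSet
    Mismatch(
      missingInData = expected.diff(data),        // in transformer but not in data
      unknownToTransformer = data.diff(expected)  // in data but not in transformer
    )
  }

  def main(args: Array[String]): Unit = {
    // Abbreviated, illustrative field lists (assumption: not the complete sets).
    val dataFields = Seq("app_id", "platform", "refr_dvce_tstamp")
    val transformerFields = Seq(
      "app_id", "platform", "contexts", "unstruct_event",
      "derived_contexts", "refr_device_tstamp"
    )
    val result = compare(dataFields, transformerFields)
    println(s"In transformer but not in data: ${result.missingInData}")
    println(s"In data but not in transformer: ${result.unknownToTransformer}")
  }
}
```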
I can make it work by forking the SDK and modifying the transform method, but ideally I'd continue to use the main branch.
Thanks!