PIPELINE-2297 Added process to extract reported vessel info on normalization #23
☁️ Nx Cloud Report
CI is running/has finished running commands for commit 43afc22.
✅ Successfully ran 2 targets
Codecov Report
Coverage Diff

| Metric | develop | #23 | +/- |
| --- | --- | --- | --- |
| Coverage | 76.58% | 75.82% | -0.76% |
| Files | 46 | 50 | +4 |
| Lines | 756 | 844 | +88 |
| Branches | 69 | 78 | +9 |
| Hits | 579 | 640 | +61 |
| Misses | 159 | 186 | +27 |
| Partials | 18 | 18 | |
Approved, but there are a couple of things I think we should change if we have time before merging. I know I'm coming very late to the party, so feel free to ignore this if you don't have time.
Sorry for the delay.
https://globalfishingwatch.atlassian.net/browse/PIPELINE-2297
This PR changes the normalization process for VMS positions so that it also extracts the relevant vessel data that providers report alongside the position records being processed.
New normalization processing option to generate vessel_info
- `--affected_entities` was added to the `normalization` pipeline. It provides the flexibility of stating which entities are produced (output) during the execution of this pipeline. Available options are `positions` and `vessel_info`. Default: `positions,vessel_info` (both entities are produced and stored in the provided BQ tables).
- `vessel_info` is created if it does not exist.
- `vessel_info` records for the given date(s) and `source_tenant`.
- `ssvid`: when multiple records are found, the most recent takes precedence.

Improvements to existing normalization
- Added `flag` and `updated_at` fields to normalized position records.
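
To make the `--affected_entities` option and the "most recent record per `ssvid`" rule concrete, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names (`parse_affected_entities`, `build_parser`, `latest_vessel_info_by_ssvid`), the use of `argparse`, and the `timestamp` field used to decide which record is most recent are all assumptions made for illustration.

```python
import argparse

# Entity names taken from the PR description.
VALID_ENTITIES = ("positions", "vessel_info")


def parse_affected_entities(value):
    """Split a comma-separated value and validate each entity name (hypothetical helper)."""
    entities = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [item for item in entities if item not in VALID_ENTITIES]
    if unknown or not entities:
        raise argparse.ArgumentTypeError(
            f"expected a comma-separated subset of {VALID_ENTITIES}, got {value!r}"
        )
    return entities


def build_parser():
    """Build a parser exposing --affected_entities; both entities are produced by default."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--affected_entities",
        type=parse_affected_entities,
        default=list(VALID_ENTITIES),
    )
    return parser


def latest_vessel_info_by_ssvid(records):
    """Keep one record per ssvid, preferring the most recent.

    Assumption: each record is a dict with an "ssvid" key and a comparable
    "timestamp" key; the real pipeline may use a different field.
    """
    latest = {}
    for record in records:
        current = latest.get(record["ssvid"])
        if current is None or record["timestamp"] > current["timestamp"]:
            latest[record["ssvid"]] = record
    return list(latest.values())
```

Under these assumptions, `build_parser().parse_args(["--affected_entities", "vessel_info"])` would select only the `vessel_info` output, and omitting the option would select both entities.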