Fix flint skipping index syntax issues #1846
- vpc flow - cloud trail Signed-off-by: YANGDB <[email protected]>
- vpc flow - cloud trail - multiple records protocol support Signed-off-by: YANGDB <[email protected]>
},
{
"name": "dashboards-flint-records",
"label": "Dashboards & Visualizations adapted to Flint",
Same label as the entry above. Will this be confusing to the user?
In general, LGTM. Just left some minor thoughts.
"version": "1.0.0",
"extension": "sql",
"type": "query",
"workflows": ["dashboards-flint"]
},
{
-"name": "create_mv_cloud-trail",
+"name": "create_mv_cloud-trail-records",
Just for my knowledge, what does "records" refer to?
It refers to the extended format for multiple records, shown here.
Do I understand right that we only have acceleration for single-record, and multi-record has queries without acceleration? LGTM from a technical standpoint, but I wonder how common multi-record is compared to single-record/what the impact of that is.
Yes, this is due to the existing limitation caused by the skipping index creation statement.
@@ -58,5 +58,5 @@ CREATE EXTERNAL TABLE IF NOT EXISTS {table_name} (
accountid STRING,
eventday STRING
)
-USING json
+USING parquet
Nice. Unrelated to this change, but I do notice that there is an empty file called "create_mv_vpc-1.0.0.sql" in the vpc asset directory. Should we also remove that?
- vpc flow - cloud trail - multiple records protocol support Signed-off-by: YANGDB <[email protected]>
rec.tlsDetails.clientProvidedHostHeader AS `aws.cloudtrail.tlsDetailsclient_provided_host_header`
FROM
{table_name}
LATERAL VIEW explode(Records) myTable AS rec
Are there issues with having the table alias default to `myTable`? Looks like it probably won't cause issues since it's just a virtual table, but I'm not sure where the virtual table actually gets stored or if this should be more descriptively named.
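As a sketch of the alias question above: in Spark SQL, the name that follows `explode(...)` in a `LATERAL VIEW` clause is a table alias scoped to that one query, so renaming it should be a purely local change. The alias `cloudtrail_records`, the two selected fields, and their output names are illustrative assumptions and are not taken from the PR; `Records`, `rec`, and the `{table_name}` placeholder are the names used in the diff.

```sql
-- Hypothetical variant of the multi-record query with a more descriptive
-- LATERAL VIEW alias. Only `Records`, `rec`, and `{table_name}` come from
-- the PR; the alias and the selected columns are assumptions.
SELECT
  rec.eventName   AS `aws.cloudtrail.eventName`,
  rec.eventSource AS `aws.cloudtrail.eventSource`
FROM
  {table_name}
  LATERAL VIEW explode(Records) cloudtrail_records AS rec
```

Because the alias is resolved only while the query is analyzed, nothing named `myTable` or `cloudtrail_records` is expected to persist in the catalog.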
I know we discussed at one point enabling the create table query by default so it doesn't cause an error if running with only queries selected and no other flows -- would now be a good time to implement that? OTOH I don't want to rock the boat since it's not in-scope for the immediate issue.
Aside from that, there's one unescaped field still -- did a full diff with the known-working version we made yesterday and that's the only delta so I can approve after that.
`src_endpoint.ip` BLOOM_FILTER,
`dst_endpoint.ip` BLOOM_FILTER,
`src_endpoint.svc_name` VALUE_SET,
`dst_endpoint.svc_name` VALUE_SET,
traffic.bytes MIN_MAX
`traffic.bytes` ?
good catch - thanks
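For reference, a minimal sketch of the column list with every dotted field escaped consistently, as the review asks. The field names and index types come from the diff; the surrounding `CREATE SKIPPING INDEX ... WITH (...)` shape is assumed from the Flint SQL reference, and the `auto_refresh` option is an assumption rather than the asset file's exact content.

```sql
-- Sketch only: the statement wrapper and the WITH options are assumptions;
-- the backtick-escaped column list mirrors the reviewed diff.
CREATE SKIPPING INDEX ON {table_name} (
  `src_endpoint.ip` BLOOM_FILTER,
  `dst_endpoint.ip` BLOOM_FILTER,
  `src_endpoint.svc_name` VALUE_SET,
  `dst_endpoint.svc_name` VALUE_SET,
  `traffic.bytes` MIN_MAX
)
WITH (
  auto_refresh = true
)
```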
- vpc flow - cloud trail - multiple records protocol support Signed-off-by: YANGDB <[email protected]>
Signed-off-by: YANGDB <[email protected]>
`src_endpoint.ip` BLOOM_FILTER,
`dst_endpoint.ip` BLOOM_FILTER,
`src_endpoint.svc_name` VALUE_SET,
`dst_endpoint.svc_name` VALUE_SET,
`traffic.bytes` MIN_MAX
Are we removing the field `request_processing_time MIN_MAX,`?
It doesn't exist in the data schema, so it wasn't supposed to be there anyway.
OPTIONS (
compression='gzip',
recursivefilelookup='true',
multiline 'true'
Question: should this be `multiLine='true'`?
-recursivefilelookup='true'
+PATH '{s3_bucket_location}',
+recursivefilelookup='true',
+multiline 'true'
Is the `multiline` part case-sensitive? I've usually seen commands with `multiLine`, with the L capitalised.
PATH='{s3_bucket_location}',
recursivefilelookup='true',
multiLine='true'
I think both `multiline` and `multiLine` are accepted.
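A sketch of the table definition using the suggested option spelling. Spark generally treats data source option keys case-insensitively, and its `OPTIONS` clause accepts both `key 'value'` and `key='value'`, so this variant is expected to behave the same as the lowercase form; treat that as an assumption to verify against the Spark version in use. The single `Records` column is a hypothetical, simplified schema and not the one in the asset file.

```sql
-- Sketch only: the schema is a simplified assumption; the placeholders
-- {table_name} and {s3_bucket_location} and the three options come from
-- the review thread.
CREATE EXTERNAL TABLE IF NOT EXISTS {table_name} (
  Records ARRAY<STRUCT<eventName: STRING, eventSource: STRING>>
)
USING json
OPTIONS (
  PATH='{s3_bucket_location}',
  recursivefilelookup='true',
  multiLine='true'
)
```

The `multiLine` option matters here because each multi-record CloudTrail file holds one JSON document spanning several lines, rather than one JSON object per line.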
* update flint related issues for - vpc flow - cloud trail - multiple records protocol support
Signed-off-by: YANGDB <[email protected]>
* update flint vega ip sankey visualization query
Signed-off-by: YANGDB <[email protected]>
* update flint vega ip sankey visualization query
Signed-off-by: YANGDB <[email protected]>
---------
Signed-off-by: YANGDB <[email protected]>
(cherry picked from commit 0d2a1c7)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/dashboards-observability/backport-2.13 2.13
# Navigate to the new working tree
pushd ../.worktrees/dashboards-observability/backport-2.13
# Create a new branch
git switch --create backport/backport-1846-to-2.13
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 0d2a1c7361520e52a4f5a014e19ee3d38bb91eeb
# Push it to GitHub
git push --set-upstream origin backport/backport-1846-to-2.13
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/dashboards-observability/backport-2.13
Then, create a pull request where the
* update flint related issues for - vpc flow - cloud trail - multiple records protocol support
* update flint vega ip sankey visualization query
---------
(cherry picked from commit 0d2a1c7)
Signed-off-by: YANGDB <[email protected]>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Description
Update the Flint-related integration content for VPC flow logs and CloudTrail.
Issues Resolved
VPC
CloudTrail
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.