Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miovision: Should we filter unacceptable_gaps from volumes_15min_mvt via view instead of during insert? #879

Open
gabrielwol opened this issue Feb 20, 2024 · 3 comments · May be fixed by #1024
Assignees

Comments

@gabrielwol
Copy link
Collaborator

Pros:

Cons:

  • slightly slower selects (+3s / +25% for 1 year as per below)
  • more tables/views to confuse people

Example implementation and testing:

CREATE TABLE gwolofs.unacceptable_gaps_indexed AS (
    SELECT * FROM miovision_api.unacceptable_gaps 
)

CREATE INDEX unacceptable_gaps_intersection_datetime_bin_idx
ON gwolofs.unacceptable_gaps_indexed
USING btree (intersection_uid, datetime_bin);

--DROP VIEW gwolofs.volumes_15min_mvt_filtered;
CREATE VIEW gwolofs.volumes_15min_mvt_filtered AS (
    SELECT
        v1m.volume_15min_mvt_uid,
        v1m.intersection_uid,
        v1m.datetime_bin,
        v1m.classification_uid,
        v1m.leg,
        v1m.movement_uid,
        CASE
            WHEN un.datetime_bin IS NULL THEN v1m.volume
            ELSE NULL
        END AS volume
    FROM miovision_api.volumes_15min_mvt AS v1m --this would change to _unfiltered and have zeros instead of nulls
    LEFT JOIN gwolofs.unacceptable_gaps_indexed AS un USING (intersection_uid, datetime_bin)
)

--test view performance:
--15.345 + 15.324s + 15.317s : 15.32 avg
SELECT intersection_uid, classification_uid, AVG(volume)
FROM gwolofs.volumes_15min_mvt_filtered
WHERE datetime_bin >= '2023-01-01' AND datetime_bin < '2024-01-01'
--new index useful on smaller slices
--WHERE datetime_bin >= '2023-01-01' AND datetime_bin < '2024-01-01' AND intersection_uid = 65
GROUP BY intersection_uid, classification_uid

--baseline:
--12.2s / 11.724s / 12.119s: 12.01s avg
SELECT intersection_uid, classification_uid, AVG(volume)
FROM miovision_api.volumes_15min_mvt
WHERE datetime_bin >= '2023-01-01' AND datetime_bin < '2024-01-01'
GROUP BY intersection_uid, classification_uid
@gabrielwol gabrielwol self-assigned this Feb 20, 2024
@chmnata
Copy link
Collaborator

chmnata commented Feb 22, 2024

Is volumes_15min_mvt the only table affected by this issue? Would we have to do this on volumes_15 table as well ?

@gabrielwol
Copy link
Collaborator Author

gabrielwol commented Feb 22, 2024

You're right, we'd need to add the same view/table treatment to volumes_15min which increases the complexity a bit. Also we'd need to edit many views which refer to these:

volumes_15min:

_RETURN ON aduyves.aadt_miovision_avg_daily
_RETURN ON data_requests.i0533_intersec_uoft_atr
_RETURN ON covid.miovision_hourly
_RETURN ON covid.miovision_hourly_new
_RETURN ON covid.miovision_hourly_temp
_RETURN ON rapidto.miovision_segments_comparison_daily
_RETURN ON rapidto.miovision_segments_comparison_hourly
_RETURN ON activeto.miovision_volumes_15min_adj

volumes_15min_mvt:

_RETURN ON data_requests.i0533_intersec_uoft_tmc
_RETURN ON gwolofs.open_leg_issues
_RETURN ON miovision_api.volumes_15min_tmc

plus all the insert and clear functions

@gabrielwol
Copy link
Collaborator Author

@chmnata one more option I thought of is to change the aggregation order. Instead of find_gaps -> volumes_15min* we could do volumes_15min* ->find_gaps and add an update volumes_15min* clause to find_gaps using the new inserts.
This would require much fewer changes to database than above.

@gabrielwol gabrielwol added this to the Miovision pipeline updates milestone Jul 4, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
gabrielwol added a commit that referenced this issue Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment