-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: Ensure that events are aggregated only once #144
Conversation
invenio_stats/aggregations.py
Outdated
@@ -225,10 +225,9 @@ def agg_iter(self, dt, previous_bookmark): | |||
) | |||
if previous_bookmark: | |||
# Let's make sure that both objects are of the same type. This is important for the test | |||
previous = float(previous_bookmark.strftime("%s")) | |||
previous = previous_bookmark.isoformat() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@psaiz this is not now a string, but down there it is compared like a number/date with <
. Are you sure that this is correct?
Are the tests catching this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this. It is a string, isn't it? Note that the comparison is with strings.
I've added these prints to make sure:
).replace(**dict.fromkeys(INTERVAL_ROUNDING[self.interval], 0))
+ print("Comparing")
+ print(aggregation["last_update"]["value_as_string"])
+ print(previous)
if aggregation["last_update"]["value_as_string"] and previous:
if aggregation["last_update"]["value_as_string"] < previous:
+ print("SKIPPING EVENT")
continue
And the test seems to be doing it properly:
Comparing
2017-06-02T11:00:00.000Z
None
Comparing
2017-06-02T11:00:00.000Z
None
Comparing
2017-07-01T05:00:00.000Z
2017-06-02T11:59:00
Comparing
2017-07-01T05:00:00.000Z
2017-06-02T11:59:00
Comparing
2017-07-01T05:00:00.000Z
2017-07-01T05:59:00
SKIPPING EVENT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wanted to write it is now a string
, but I wrote it is not
a string, I am sorry :)
Do we really want to compare strings in this way?
https://docs.python.org/3/tutorial/datastructures.html#comparing-sequences-and-other-types
I would feel much safer if we convert everything first with datetime.fromisoformat()
, and then we use the Python date comparison. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have no strong opinions about comparing strings or dates. The added value of comparing the dates is that there is no need to modify the incoming object.
Do you prefer this approach?
f501160
to
dcc08aa
Compare
❤️ Thank you for your contribution!
Description
Ensures that a second aggregation does not reprocess any aggregation already performed and that has not changed,
It addresses https://github.com/zenodo/rdm-project/issues/523