Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add another illegal id #17325

Merged
merged 9 commits into from
Sep 12, 2023
Merged

fix: add another illegal id #17325

merged 9 commits into from
Sep 12, 2023

Conversation

pauldambra
Copy link
Member

If we query

select team_id, any(distinct_id), any(JSONExtractRaw(properties, 'distinct_id'))
from events 
where timestamp >= now() - interval 4 day
and JSONExtractRaw(properties, 'distinct_id') = '"null"'
group by team_id

(while investigating something else

We note that some teams send "null" as distinct id (i.e. double quote null double quote as a string)

We shouldn't allow this

Comment on lines 40 to 51
const CASE_SENSITIVE_ILLEGAL_IDS = new Set([
'[object Object]',
'NaN',
'None',
'none',
'null',
'"null"',
'0',
'"0"',
'undefined',
'"undefined"',
])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we want to do all of these single or double quoted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered about that... looking in the DB we only see them double quoted, but I guess that doesn't mean we'd never see them single quoted 🤔

@@ -245,7 +259,7 @@ export class PersonState {
this.teamId,
this.timestamp
)
} else if (this.event.event === '$identify' && this.eventProperties['$anon_distinct_id']) {
} else if (this.event.event === '$identify' && '$anon_distinct_id' in this.eventProperties) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testing for every illegal id showed that even though we check for the empty string you couldn't get into this branch of the code if the anon distinct id was the empty string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pauldambra pauldambra requested a review from tiina303 September 5, 2023 20:52
Copy link
Contributor

@tiina303 tiina303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd update the tests & there's a failure to look into

@pauldambra pauldambra requested a review from tiina303 September 12, 2023 10:10
@pauldambra
Copy link
Member Author

@tiina303 I updated the tests... and it turns out we make assumptions about casing that clashed with my change. E.g. we do allow Nan as an illegal id.

Decided to unwind my casing change rather than go deeper into the rabbit hole. If you're happy with that I'll merge this later on today 👍

Copy link
Contributor

@tiina303 tiina303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. we do allow Nan as an illegal id.

why? Should we?

@pauldambra pauldambra merged commit 5003faf into master Sep 12, 2023
59 checks passed
@pauldambra pauldambra deleted the fix/add-illegal-id branch September 12, 2023 17:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants