Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL query insights: LIMIT seems to break beyond some arbitrary/magic number on my query #20255

Closed
1 of 4 tasks
aseemk opened this issue Feb 11, 2024 · 2 comments
Closed
1 of 4 tasks
Labels
bug Something isn't working right

Comments

@aseemk
Copy link

aseemk commented Feb 11, 2024

Bug description

I have a manual SQL query insight (due to limitations like #20251) that produces a table of stats per "question" (we have an interest quiz/questionnaire):

select JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'text') as 'Question'

, sumIf(1, JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer') = '2') as '😍'
, sumIf(1, JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer') = '1') as '🙂'
, sumIf(1, JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer') = '0') as '😐'
, sumIf(1, JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer') = '-1') as '🙁'
, sumIf(1, JSONExtractString(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer') = '-2') as '🤮'

, count() as 'Total'

, round(median(JSONExtractInt(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer')), 1) as 'Median'
, round(avg(JSONExtractInt(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer')), 1) as 'Avg'
, round(stddevSamp(JSONExtractInt(arrayJoin(JSONExtractArrayRaw(properties.quizResults.answers ?? '[]')), 'answer')), 1) as 'Std Dev'

from events

where {filters} -- replaced with global date and property filters
and event = 'Quiz Finished'
and not (
   coalesce(person.properties.`cohort`, '') = 'PORTOLANS'
   or match(coalesce(properties.$host, ''), '^(localhost|127\\.0\\.0\\.1)($|:)')
)

group by 1
order by "Total" desc

(If this query feels messy: #20250, #20252, etc.)

Example output:

Screenshot 2024-02-11 at 9 12 34 AM

If I omit a LIMIT clause in my query (as above), PostHog displays the first 100 results and then this message below it:

Default limit of 100 rows reached. Try adding a LIMIT clause to adjust.

But if I add a LIMIT clause with a higher number, e.g. LIMIT 200, no results get returned at all:

Screenshot 2024-02-11 at 9 02 50 AM

I trial-and-error'ed (binary searched) and found that the exact number where this happens is 135. E.g. LIMIT 134 works and returns 134 results, while LIMIT 135 doesn't work / returns no results at all.

I thought maybe the bug here was LIMIT breaking past the actual total number of results. But what's wild is that if I count(distinct ...) the questions, we actually have 154 total results.

So the threshold here (135) is surprising and seems arbitrary. But the more important bug is that specifying a higher LIMIT is silently breaking the query — returning no results at all, with no error message.

How to reproduce

I thought maybe this was a general bug or limitations (no pun intended), but I can't repro this with a simple one-off query that e.g. just reads/aggregates from the events table (up to LIMIT 1000). Is there something about my query that's triggering a bug?

Here it is on our hosted cloud instance if helpful:

https://us.posthog.com/project/50689/insights/14ybw5Gn

Environment

  • PostHog Cloud US, project ID: 50689
  • PostHog Cloud EU, project ID: [please provide from https://eu.posthog.com/settings/project-details#variables]
  • PostHog Hobby self-hosted with docker compose, version/commit: [please provide]
  • PostHog self-hosted with Kubernetes (deprecated, see "Sunsetting Kubernetes support"), version/commit: [please provide]

Additional context

Thank you for your help!

@aseemk aseemk added the bug Something isn't working right label Feb 11, 2024
@webjunkie
Copy link
Contributor

I was able to check your query. There are actually results, but our UI doesn't display them anymore with the higher limit. One reason I can see is that there are NaN values appearing 🤔

I can check later more, but could you try modifying your query so that these are 0 and see if that helps?

@webjunkie
Copy link
Contributor

We fixed the issue with NaN values in #20302. Let us know in case this issue here still exists for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working right
Projects
None yet
Development

No branches or pull requests

2 participants