It's easy to fetch all search results accidentally with real_search. #929

Open
wants to merge 3 commits into
base: develop

hodgestar
Contributor

Currently passing start=None to real_search causes it to ignore rows and just return all search results (which will often time out).

@@ -293,16 +293,10 @@ def _search_iteration(self, bucket, query, rows, start):

    def real_search(self, modelcls, query, rows=None, start=None):
        rows = 1000 if rows is None else rows
        start = 0 if start is None else start
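
To make the effect of the two defaulting lines concrete, here is a minimal standalone sketch. It does not touch the real search backend: `fake_search` and the in-memory `matches` list are hypothetical stand-ins, and only the two defaulting lines are taken from the diff.

```python
def fake_search(results, rows=None, start=None):
    """Hypothetical stand-in for real_search over an in-memory list."""
    # The two defaulting lines from the diff above.
    rows = 1000 if rows is None else rows
    start = 0 if start is None else start
    # Return a single bounded page rather than every match.
    return results[start:start + rows]


matches = list(range(2500))
default_page = fake_search(matches)                    # no rows, no start
second_page = fake_search(matches, rows=10, start=10)  # explicit paging
```

With these defaults, a call that passes neither argument gets at most one page of 1000 results instead of everything.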
Member

I'd prefer to throw an exception if we get start=None so we can find places that are using this incorrectly and fix them instead of having them silently return fewer results than they did before.

Contributor Author

I think raising an exception for the default value is a bit crazy? Everywhere I can find that uses real_search currently passes in a start value and expects it to be honoured.

Member

If this were a new method, I'd completely agree. What we're doing here is changing the behaviour of the default case, and throwing an error for the now-unsupported behaviour (in any places it's used that we haven't found) is better than silently returning fewer results than we had before. We'd have to document it clearly, of course -- probably in the exception.
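
A rough sketch of what this suggestion could look like, assuming a plain `ValueError` and an illustrative message. The name `strict_search` and the in-memory list are hypothetical stand-ins, not the project's code.

```python
def strict_search(results, rows=None, start=None):
    """Hypothetical variant that rejects the old 'fetch everything' call."""
    if start is None:
        # Document the changed behaviour in the exception itself.
        raise ValueError(
            "start=None is no longer supported: it used to return all "
            "results and ignore rows. Pass an explicit start (e.g. 0).")
    rows = 1000 if rows is None else rows
    return results[start:start + rows]


try:
    strict_search(list(range(5)))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Any caller still relying on the old default would fail loudly at the call site rather than quietly receiving a truncated result set.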

(If start were before rows in the parameter list, I'd suggest removing the default value and making it mandatory.)
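
For illustration only: on Python 3, a keyword-only parameter can be made mandatory regardless of its position, which sidesteps the ordering problem. This assumes Python 3 is available, which may not have been the case for this codebase; `keyword_only_search` is a hypothetical name.

```python
def keyword_only_search(results, rows=None, *, start):
    """Hypothetical signature: start has no default and must be named."""
    rows = 1000 if rows is None else rows
    return results[start:start + rows]


page = keyword_only_search(list(range(50)), rows=5, start=20)

try:
    # Omitting start is rejected by the interpreter at call time.
    keyword_only_search(list(range(50)), 5)
    missing_start_rejected = False
except TypeError:
    missing_start_rejected = True
```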

Contributor Author

I don't think these other places exist? And if there are a couple of places we can fix them when we find them?

The situation in which the current implementation returns more results is a bit of a corner case anyway, since it requires there to be more results than rows (so more than 1000 by default), but few enough to return in a reasonable time frame.

The cost of raising the exception is having yet another ugly function in the code base that we have to clean up later.
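
The corner case can be sketched by comparing how many results each behaviour would return for a given total. The helpers `old_count` and `new_count` are hypothetical and model only the result counts, not the real search call.

```python
DEFAULT_ROWS = 1000  # default taken from the diff in this PR


def old_count(total):
    # Old behaviour with start=None: rows is ignored, everything is returned.
    return total


def new_count(total, rows=DEFAULT_ROWS):
    # New behaviour with start defaulting to 0: at most one page of rows.
    return min(total, rows)


# True where the change would alter how many results a caller receives.
differs = {total: old_count(total) != new_count(total)
           for total in (10, 1000, 1001, 5000)}
```

Under these assumptions the two behaviours only diverge once the total exceeds the default page size.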

Member

> I don't think these other places exist? And if there are a couple of places we can fix them when we find them?

How will we find them if we don't raise an exception?

> The cost of raising the exception is having yet another ugly function in the code base that we have to clean up later.

The cost of not raising the exception is potentially introducing subtle data corruption bugs into existing code. Having been on the receiving end of that kind of change more than once, I'll take the ugly function every time.

@jerith
Member

jerith commented Mar 16, 2015

Anyway, I'll leave this decision to you. 👍 other than that.

@hodgestar
Contributor Author

Tx. While this PR was open @rudigiesler fixed the contacts api in praekeltfoundation/go-contacts-api@cd9b78f but it doesn't appear to have helped in practice. I'd like to understand better what's going on before landing this (in case something more needs to change).
