Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from position() to strpos() for citext support #154

Open
charliefoxtwo opened this issue Apr 1, 2020 · 6 comments
Open

Switch from position() to strpos() for citext support #154

charliefoxtwo opened this issue Apr 1, 2020 · 6 comments
Labels
Milestone

Comments

@charliefoxtwo
Copy link

Tested with Postgres 9.6.17 and EntityFramework6.Npgsql 6.3.0

The query users.Where(u => u.FullName.Contains("aaron")); will generate the SQL "Extent1"."fullName" LIKE E'%aaron%', which functions as expected when fullName is citext.

However, let's say you're looking for any users matching one of many possible names. You might write your query like this

var searchTerms = new[] { "aaron", "andrew" };
users.Where(u => searchTerms.Any(t => u.FullName.Contains(t)));

Now, the generated query looks like this

EXISTS(SELECT 1 AS "C1"
     FROM ((SELECT CAST(E'aaron' AS varchar) AS "C1" FROM (SELECT 1 AS "C") AS "SingleRowTable1")
           UNION ALL
           (SELECT CAST(E'andrew' AS varchar) AS "C1"
            FROM (SELECT 1 AS "C") AS "SingleRowTable2")) AS "UnionAll1"
     WHERE position("UnionAll1"."C1" in "Extent1"."fullName") > 0);

Note the switch to using the position function. If you have a user whose fullName is "Aaron", this query will not return that user, even though fullName is citext.

While the postgres documentation would indicate that position(a in b) is just syntactic sugar for strpos(b, a), it appears that the query parser converts position(a in b) to position(b, a), and as far as I can tell citext doesn't expose a citext overload for position(b, a). This can be done by manually creating a function, however one wouldn't expect to have to do this in order to get citext working with Npgsql.

My proposed solution would be that Npgsql switch from using position(a in b) to strpos(b, a) since it has the proper overloads created when installing citext.

@Emill
Copy link

Emill commented Apr 1, 2020

Hi. Maybe send a bug report to postgresql that position(a in b) is not equal to strpos(b, a) in the case of citext? Anyway if it is as you say it should be easy to change to using strpos in Npgsql. PR?

@charliefoxtwo
Copy link
Author

I opened a bug with postgres, and the conclusion seems to be that it's not a bug but intended behavior. They're correct in that citext doesn't claim to overload the position function.

It appears this repo doesn't currently have any unit tests built around citext at all, so a PR for this may take a bit of time. I looked into it the other day and am in the process of sorting out how to create a citext column in the code-first model that the unit tests in this repo use, since the database needs to be created in order to install the citext extension but you can't create the database if it has a citext column without having the citext extension installed.

@roji
Copy link
Member

roji commented Apr 4, 2020

I opened a bug with postgres, and the conclusion seems to be that it's not a bug but intended behavior. They're correct in that citext doesn't claim to overload the position function.

Thanks for opening that bug. FWIW I tend to agree with what they're saying there... It's not really feasible to duplicate all text-processing functions for citext, and the new non-deterministic collations will likely obsolete citext and will not have this problem (although they're still limited in some ways).

It appears this repo doesn't currently have any unit tests built around citext at all, so a PR for this may take a bit of time. I looked into it the other day and am in the process of sorting out how to create a citext column in the code-first model that the unit tests in this repo use, since the database needs to be created in order to install the citext extension but you can't create the database if it has a citext column without having the citext extension installed.

I think you may have missed CitextqueryTest - it should be pretty straightforward to add a test there (you already have a model there with a citext column).

@roji roji changed the title citext comparison shouldn't use position() function Switch from position() to strpos() for citext support Apr 4, 2020
@roji roji added the bug label Apr 4, 2020
@roji
Copy link
Member

roji commented Apr 4, 2020

Apologies, I got confused and thought we were in EF Core and not in EF6. It's true that we're lacking a comprehensive test suite for EF6, including with regards to citext.

I'd be OK with merging this without a test though, as the change really is quite minimal and safe - let me know how you want to proceed.

@roji roji added this to the 6.4.2 milestone Apr 4, 2020
@charliefoxtwo
Copy link
Author

I'll go ahead and take care of it sans unit tests. It might be worth bringing some parity between the EF Core and EF6 tests - maybe I'll take a look at that separately as well.

charliefoxtwo pushed a commit to charliefoxtwo/EntityFramework6.Npgsql that referenced this issue Apr 5, 2020
Fixes npgsql#154

Both unit tests appear to only call into the `IndexOf` case in SqlBaseGenerator.cs but all three occurrences of `position` were changed just to be safe.
@roji
Copy link
Member

roji commented Apr 6, 2020

Thanks for your contribution.

Re tests, EF Core has a very comprehensive test suite (well over 10k tests) specifically meant for different database providers to use; this makes it very easy to test that a given provider works correctly. EF6 never had this - as a provider you're on your own, which is one reason why the test coverage here is so lacking.

Note also that EF6 in general is "archived" to a large extent - no new development is going to take place on it. The same is true of the PostgreSQL provider - all innovation and effort is going into EF Core. We definitely still accept bugfixes and minor features, but I wouldn't invest time in new test infrastructure etc...

@roji roji modified the milestones: 6.4.2, 6.4.3 May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants