-
Notifications
You must be signed in to change notification settings - Fork 18
Use SQLAlchemy for BigQuery backend #72
Comments
For BigQuery this makes sense, especially now that we've done some pretty extensive testing of the SQLAlchemy connector for BigQuery. My concerns:
|
I guess another concern is UDFs. I don't believe we have support for this in the SQLAlchemy connector (though it would be cool to have!) |
Thanks for all the information, that's very useful to know. We clearly need to plan this well, if it ever happens. OmniSci seems to be happy with the idea, and I think for clickhouse is very doable. Impala is trickier, and we'll have to see if it can be done, or if it makes sense. I'll be soon writing a draft of an Ibis roadmap with future plans, so we can discuss them. But I think after 2.0 we can make the base SQLAlchemy not depend on the Ibis compiler to start with. And maybe we can later move to separate projects / extensions the Ibis compiler, and the backends that use them. Then we'll see whether it's worth to maintain the Ibis compiler, or if backends are eventually migrated to SQLAlchemy. We can discuss further after 2.0, when I have the draft for the roadmap. For now all your comments were very useful to know if researching in this direction made sense or not. Thanks! |
Another thought: I'd still want some custom code to execute the query text once compiled & transform to pandas dataframe. SQLAlchemy -> pandas is possible with just the SQLAlchemy connector, but it's much less performant. |
This make sense. I think the part that really adds value to replace is the generation of the SQL, which if we only use SQLAlchemy it will let us delete few thousands lines of code in Ibis. Somehow unrelated, there were some discussions on fetching data from backends as Arrow (which can easily converted to pandas with a |
Some news: we now have a 1.0 (General Availability) release of the official SQLAlchemy connector for BigQuery (https://github.com/googleapis/python-bigquery-sqlalchemy) After that and my failed attempt at migration in #91, I'm thinking maybe it is time to migrate. I don't want to abandon the Ibis 1.x users though, so perhaps we create a |
@tswast I'd like to know your opinion on this.
Seems to me, that the Ibis SQL compiler (i.e.
ibis/backends/base/sql/compiler
) is basically reinventing SQLAlchemy. SQLAlchemy is great at generating an SQL string given expressions, and that's exactly what the compiler also does. And now we've got our backends fragmented in the ones that use the Ibis compiler, and the ones using SQLAlchemy.Seems to me, that for the future would make much more sense to exclusively use SQLAlchemy for the generation of SQL, and remove from Ibis all the code in
ibis/backends/base/sql/compiler
). That would involve that all backends use their SQLAlchemy engine for Ibis. For BigQuery this seems to be trivial, as there is already a BigQuery SQLAlchemy plugin, and implementing a SQLAlchemy backend is quite straight-forward (and even more after the refactoring I'm finishing).Is this something that makes sense to you? Have you considered this already? Anything am I missing?
The text was updated successfully, but these errors were encountered: