-
Notifications
You must be signed in to change notification settings - Fork 23
CartoDB Oracle FDW
The objective of this document is to give a high level view of the status of the Oracle connector for CartoDB as well as the technical details on how to install it and set it up have a working environment of Oracle tables exposed as tables that can be read by CartoDB.
Most technical contents of this page have been extracted from the CartoDB onpremise FDW.
Exposing information stored on clients databases is a typical request. Oracle is one of the most used databases and having a working procedure to connect CartoDB with an Oracle database is needed.
The use case where the Oracle FDW works best with CartoDB is when joining a CartoDB to an Oracle table that stores additional alphanumeric tables.
Other use cases like showing Oracle Spatial geometries are not that good because:
-
The FDW doesn't expose to PostGIS the geometries and the cartographic projection is not recognized
-
The FDW doesn't pass to Oracle the operations done to the geometry so for example a query that tries to gets the geometries by a bounding box will don't convert the
ST_Intersects
function to the equivalentSDO_RELATE
in Oracle so it will retrieve the full dataset and do the operation locally. On the other hand this works for alphanumeric values with some caveats discussed earlier.
The Oracle FDW is documented and hosted at:
- Website: http://laurenz.github.io/oracle_fdw/
- Repository: https://github.com/laurenz/oracle_fdw
- Mailing list: http://lists.pgfoundry.org/mailman/listinfo/oracle-fdw-general
The maintainer of the project is active and accessible for questions and fixing bugs.
Requirements:
- Support for running
make
(build-essential
package, etc) - Postgres development libraries
- Oracle Instant Client (follow instructions at ubuntu website including the SDK Fix section)
The easiest way to install the extension is to download the GitHub repository and build it locally with the typical make; make install
commands. Once the Oracle FDW shared library is accessible by Postgres it should accept to load the extension following next section instructions.
Before trying to connect using the extension, is advisable to try to connect to the remote Oracle instance using the sqlplus
tool so you can be sure everything is working on the Oracle side. To access an Oracle server using this tool you need to run
sqlplus 'user/pass@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=hostname.network)(Port=1521))(CONNECT_DATA=(SID=remote_SID)))'
There are other ways to connect to an Oracle host, one is using easy connect and the other is using a tsnames.ora
definition file.
Usage example:
CREATE EXTENSION oracle_fdw;
CREATE SERVER oradb
FOREIGN DATA WRAPPER oracle_fdw
OPTIONS (
dbserver '//dbserver.mydomain.com/ORADB'
);
GRANT USAGE ON FOREIGN SERVER oradb TO development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6;
-- Then you can connect to PostgreSQL as
-- "development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6":
CREATE USER MAPPING FOR development_cartodb_user_b7861beb-21d4-46bc-ae00-5f09f77701a6
SERVER oradb
OPTIONS (user 'myremoteuser', password 'mypwd');
CREATE FOREIGN TABLE oratab (
id integer OPTIONS (key 'true') NOT NULL, -- "key" option flags primary key
text character varying(30),
start_time timestamp,
floating double precision NOT NULL
)
SERVER oradb
OPTIONS (
schema 'MYREMOTEUSER',
table 'ORATAB'
);
- Oracle has two time types:
timestamp
anddate
. Timestamps are accurate to fractions of second; dates are accurate to seconds. This is different from PostgreSQL, which uses timestamps that are very accurate and dates that are just calendar dates, to the nearest day. If you have adate
column in Oracle with an index on it, and you attempt to access it via the FDW, the index will not come into play, but the Oracle FDW maps all PostgreSQL time types into timestamps, not dates. To get dates through the FDW, you have to avoid the FDW type mapper, by using theto_date()
function (this works because the FDW function mapper recognizesto_date()
as a function it can pass through to Oracle untouched).
-- Don't use this
SELECT * FROM oratab WHERE timestamp < '2015-05-05 01:02:32';
-- Use this instead
SELECT * FROM oratab WHERE timestamp < to_date('2015-05-05 01:02:32', 'YYYY-MM-DD HH24:MI:SS');
- Because the Oracle Instant Client comes with its own LDAP implementation, it will clash with OpenLDAP library if Postgres is configured to use LDAP. Check the
LDAP Libraries
section of the README file.
Oracle spatial data type (SDO_GEOMETRY
) is supported, but the geometry_columns
metadata view will catalogue foreign tables as SRID=0
no matter they are correctly set up at Oracle using their own SRID cataloguing numbers or using EPSG table.
So querying the Oracle metadata table gives the correct information:
SQL> select table_name, srid from user_sdo_geom_metadata where table_name like 'POINTS';
TABLE_NAME SRID
-------------------------------- ----------
POINTS 4326
But querying the catalogue in the same way geometry_column
view is configured gives a 0 value:
testoracle=# SELECT c.relname::character varying(256) AS f_table_name,
postgis_typmod_srid(a.atttypmod) as srid
FROM pg_class c, pg_attribute a, pg_type t, pg_namespace n
WHERE t.typname = 'geometry'::name AND a.attisdropped = false AND a.atttypid = t.oid AND a.attrelid = c.oid AND c.relnamespace = n.oid AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char"
OR c.relkind = 'm'::"char" OR c.relkind = 'f'::"char") AND NOT pg_is_other_temp_schema(c.relnamespace)
AND NOT (n.nspname = 'public'::name AND c.relname = 'raster_columns'::name) AND
has_table_privilege(c.oid, 'SELECT'::text) and c.relname like 'orapoints';
f_table_name | srid
--------------+------
orapoints | 0
(1 row)
To fix this issue it's possible to define a view that wraps the foreign data table specifying the type and SRID using Postgres typmod notation. So in this case creating the view
create view vorapoints as select id, description, cat, geometry::geometry(Point,4326) from orapoints ;
is then correctly catalogued:
testoracle=# select f_table_name, srid from geometry_columns where f_table_name = 'vorapoints';
f_table_name | srid
--------------+------
vorapoints | 4326
(1 row)
Although the format is supported, functions are not mapped to Oracle so Postgres will end up doing a full select over the table in order to apply any function on the geometry column. So if for example we have a world borders table and we want to count the countries that intersect a bounding box we will have something like this:
database=# explain select name from world where geom && ST_MakeEnvelop(20,20,40,40,4326);
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Foreign Scan on world (cost=10000.00..10000.00 rows=1000 width=178)
Filter: (geom && '0103000020E610000001000000050000000000000000003440000000000000344000000000000034400000000000004440000000000000444000000000000044400000000000004440000000000000344000000000000034400000000000003440'::geometry)
Oracle query: SELECT /*9f30fa765ded1ad160dcd1b55f9e6032*/ "NAME", "GEOM" FROM "CDBTEST"."WORLD_BORDERS"
(3 rows)
on the other hand if we do the filter by a numeric column then the filter is well passed to Oracle:
database=# explain select name from world where pop2005 < 1e7;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Foreign Scan on world (cost=10000.00..10000.00 rows=1000 width=178)
Oracle query: SELECT /*c1f09bfc9f6402f6f945d6b9e8da1ff3*/ "NAME", "POP2005" FROM "CDBTEST"."WORLD_BORDERS" WHERE ("POP2005" < 10000000)
(2 rows)
To invalidate caches on CartoDB platform Postgres tables are watched and some metadata is stored to fire the invalidation process. Because Oracle changes are not exposed to Postgres the cache is not automatically invalidated when changes happen on the Oracle side.
Workarounds to this can be implemented depending on the specifics of every project, like:
- adding a changing parameter on requests to bypass the cache
- creating a crontab that fires a fake update on the Postgres table with the desired periodicity
- replicating the tracking changes metadata table on the Oracle side and check it frequently to fire the fake updates only on those tables that have changes.