Add Exception information
volcan01010 committed May 15, 2024
1 parent aadd1d1 commit b727907
Showing 4 changed files with 93 additions and 7 deletions.
23 changes: 22 additions & 1 deletion docs/api.rst
@@ -9,7 +9,7 @@ etlhelper


etlhelper.row_factories
------------------------
+^^^^^^^^^^^^^^^^^^^^^^^

.. automodule:: etlhelper.row_factories
    :members:
@@ -22,4 +22,25 @@ DB Helpers
^^^^^^^^^^^

.. automodule:: etlhelper.db_helpers
.. autoclass:: DbHelper
.. autoclass:: SQLiteDbHelper
.. autoclass:: PostgresDbHelper
.. autoclass:: OracleDbHelper
.. autoclass:: MSSQLDbHelper


.. _exceptions:

Exceptions
^^^^^^^^^^^

.. automodule:: etlhelper.exceptions
.. autoclass:: etlhelper.exceptions.ETLHelperError
.. autoclass:: etlhelper.exceptions.ETLHelperConnectionError
.. autoclass:: etlhelper.exceptions.ETLHelperQueryError
.. autoclass:: etlhelper.exceptions.ETLHelperDbParamsError
.. autoclass:: etlhelper.exceptions.ETLHelperExtractError
.. autoclass:: etlhelper.exceptions.ETLHelperInsertError
.. autoclass:: etlhelper.exceptions.ETLHelperAbort
.. autoclass:: etlhelper.exceptions.ETLHelperHelperError
.. autoclass:: etlhelper.exceptions.ETLHelperBadIdentifierError
9 changes: 9 additions & 0 deletions docs/demo_error.py
@@ -0,0 +1,9 @@
"""ETL Helper script to demonstrate an extract error."""
import sqlite3
import etlhelper as etl

db_file = "igneous_rocks.db"
select_sql = "SELECT * FROM bad_table"

with sqlite3.connect(db_file) as conn:
    rows = etl.fetchall(select_sql, conn)
17 changes: 14 additions & 3 deletions docs/etl_functions/copy.rst
@@ -75,15 +75,26 @@ auto-generated values via the INSERT query.
 GROUP BY customer_id
 """
+# This insert query uses positional parameters, so a namedtuple_row_factory
+# is used.
 insert_sql = """
-INSERT INTO dest (customer_id, total_amount, loaded_by, load_time)
-VALUES (%s, %s, current_user, now())
+INSERT INTO dest (
+    customer_id,
+    total_amount,
+    loaded_by,
+    load_time)
+VALUES (
+    %s,
+    %s,
+    current_user,
+    now()
+)
 """
 with ORACLEDB.connect("ORA_PASSWORD") as src_conn:
     with POSTGRESDB.connect("PG_PASSWORD") as dest_conn:
         copy_rows(select_sql, src_conn, insert_sql, dest_conn,
-                  row_factory=namedtuple_row_factory)  # insert_sql used positional parameters
+                  row_factory=namedtuple_row_factory)
``parameters`` can be passed to the SELECT query as before and the
``commit_chunks``, ``chunk_size`` and ``on_error`` options can be set.
51 changes: 48 additions & 3 deletions docs/etl_functions/error_handling.rst
@@ -1,9 +1,37 @@
Error Handling
^^^^^^^^^^^^^^

-This section describes exception classes and on_error functions.
+This section describes Exception classes, ``on_error`` functions and error
+handling via SQL.

logged errors

ETLHelperError
--------------

ETL Helper has a :ref:`variety of Exception classes <exceptions>`, all of which are subclasses
of the :class:`ETLHelperError <etlhelper.exceptions.ETLHelperError>` base class.

To aid debugging,
the :class:`ETLHelperQueryError <etlhelper.exceptions.ETLHelperQueryError>`,
:class:`ETLHelperExtractError <etlhelper.exceptions.ETLHelperExtractError>` and
:class:`ETLHelperInsertError <etlhelper.exceptions.ETLHelperInsertError>`
classes print the SQL query and the required paramstyle as well as the error
message returned by the database.

.. literalinclude:: ../demo_error.py
    :language: python

The output is:

.. code:: bash

    etlhelper.exceptions.ETLHelperExtractError: SQL query raised an error.
    SELECT * FROM bad_table
    Required paramstyle: qmark
    no such table: bad_table
also handling errors in SQL e.g. ON CONFLICT

@@ -61,4 +89,21 @@ The IDs of failed rows can be written to a file.
``executemany``, ``load``, ``copy_rows`` and ``copy_table_rows`` can all
take an ``on_error`` parameter. They each return a tuple containing the
number of rows processed and the number of rows that failed.


Error handling via SQL
----------------------

The ``on_error`` functions allow individual failed rows to be processed,
however this flexibility can come at the expense of speed.
Each chunk of data that contains a bad row will be retried on a row-by-row
basis.
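
The retry behaviour described above can be illustrated with a minimal sketch.
This is not ETL Helper's own implementation: it uses only the standard library
``sqlite3`` module, and the function name ``insert_chunk_with_retry`` and the
``dest`` table are hypothetical. It shows why a chunk containing a bad row costs
more: the whole chunk is attempted once, rolled back, then replayed one row at a
time.

```python
import sqlite3


def insert_chunk_with_retry(conn, insert_sql, chunk):
    """Insert a chunk in one call; on failure, retry its rows one by one.

    Hypothetical helper for illustration only. Returns a list of
    (row, exception) pairs for the rows that could not be inserted.
    """
    try:
        with conn:  # commits on success, rolls back the whole chunk on error
            conn.executemany(insert_sql, chunk)
        return []
    except sqlite3.Error:
        failed = []
        for row in chunk:
            try:
                with conn:  # each row gets its own transaction
                    conn.execute(insert_sql, row)
            except sqlite3.Error as exc:
                failed.append((row, exc))
        return failed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The second row violates the NOT NULL constraint
chunk = [(1, "basalt"), (2, None), (3, "granite")]
failed = insert_chunk_with_retry(
    conn, "INSERT INTO dest (id, name) VALUES (?, ?)", chunk
)
inserted = conn.execute("SELECT id, name FROM dest ORDER BY id").fetchall()
```

Here the good rows end up in the table and the bad row is returned to the
caller, at the cost of three extra single-row statements for one failure.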

Databases also have methods for handling errors e.g. duplicate primary keys
using SQL.
By customising an INSERT query (which can be programmatically generated with
:func:`generate_insert_query() <etlhelper.generate_insert_query>`) the database
can be instructed how to process such rows.

TODO: example script of ON CONFLICT ignore
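
As a rough illustration of the idea, the sketch below uses an ``ON CONFLICT``
clause so that the database itself skips rows with a duplicate primary key. It
is an assumption-laden example, not the script planned in the TODO: it uses the
standard library ``sqlite3`` module directly rather than ETL Helper, the ``dest``
table and its rows are invented, and the ``ON CONFLICT ... DO NOTHING`` syntax
requires SQLite 3.24 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dest (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO dest (id, name) VALUES (1, 'basalt')")

# Rows whose id already exists are silently skipped by the database,
# so no exception is raised and no row-by-row retry is needed.
insert_sql = """
    INSERT INTO dest (id, name)
    VALUES (?, ?)
    ON CONFLICT (id) DO NOTHING
"""
rows = [(1, "duplicate basalt"), (2, "granite")]

with conn:  # commit on success, roll back on error
    conn.executemany(insert_sql, rows)

result = conn.execute("SELECT id, name FROM dest ORDER BY id").fetchall()
```

The existing row with ``id = 1`` is left unchanged and only the new row is
added. A query of this shape could be supplied wherever a customised INSERT
query is accepted, although the exact conflict-handling syntax differs between
database engines.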
