[FEATURE] Exists function in DeltaTableStep should not log error (#70)
… run black

<!--- Provide a general summary of your changes in the Title above -->

## Description
<!--- Describe your changes in detail -->
The `exists` property of `DeltaTableStep` should report a missing table
at a log level that depends on the value of the `create_if_not_exists`
boolean flag (currently it always logs at error level).
If the flag is True, the message is logged at info level. If the flag is
False, the message is logged at error level.
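
The rule above can be sketched as follows. This is an illustration only,
not the Koheesio implementation: `log_missing_table` is a hypothetical
helper, and the standard `logging` module stands in for the step's own
logger.

```python
import logging


def log_missing_table(logger: logging.Logger, table: str, create_if_not_exists: bool) -> int:
    """Log that `table` is missing and return the level that was used.

    Hypothetical helper for illustration; not part of Koheesio.
    """
    # INFO when the table is about to be created, ERROR when it is not.
    level = logging.INFO if create_if_not_exists else logging.ERROR
    outcome = "will be created" if create_if_not_exists else "will not be created"
    logger.log(
        level,
        "Table `%s` doesn't exist. The `create_if_not_exists` flag is set to %s. "
        "Therefore the table %s.",
        table,
        create_if_not_exists,
        outcome,
    )
    return level
```

Calling `log_missing_table(logging.getLogger("demo"), "my_table", True)`
emits the message at info level; with `False` the same message is
emitted at error level.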

A possible alternative is to always log at info or warning level, but
the solution above, already proposed by the reporter, is more
informative.

## Related Issue
<!--- This project only accepts pull requests related to open issues -->
<!--- If suggesting a new feature or change, please discuss it in an
issue first -->
<!--- If fixing a bug, there should be an issue describing it with steps
to reproduce -->
<!--- Please link to the issue here: -->
#34 

## Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
Improving logging by making it more informative and adequate to the use
case.

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran
to -->
<!--- see how your change affects other areas of the code, etc. -->
By adding unit tests that verify the log message contains a certain
string.
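
The tests in this commit rely on pytest's `caplog` fixture. As a rough,
framework-free sketch of the same idea (capture what was logged, then
assert on its text), the snippet below uses only the standard library;
`ListHandler` and `capture_messages` are hypothetical names introduced
for this illustration.

```python
import logging
from typing import Callable, List


class ListHandler(logging.Handler):
    """Collect log messages in a list so a test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        # Store the level name together with the fully formatted message.
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


def capture_messages(logger_name: str, emit_logs: Callable[[logging.Logger], None]) -> List[str]:
    """Run `emit_logs(logger)` and return every message it logged."""
    logger = logging.getLogger(logger_name)
    handler = ListHandler()
    previous_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)  # capture every level while the callback runs
    try:
        emit_logs(logger)
    finally:
        # Restore the logger so other tests are not affected.
        logger.removeHandler(handler)
        logger.setLevel(previous_level)
    return handler.messages
```

A test can then check both the level and a substring of the message,
for example
`assert "flag is set to True" in capture_messages("demo", fn)[0]`,
where `fn` is whatever callable triggers the logging.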

## Screenshots (if appropriate):

## Types of changes
<!--- What types of changes does your code introduce? Put an `x` in all
the boxes that apply: -->
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)

## Checklist:
<!--- Go over all the following points, and put an `x` in all the boxes
that apply. -->
<!--- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
- [x] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [x] I have read the **CONTRIBUTING** document.
- [x] I have added tests to cover my changes.
- [ ] All new and existing tests passed.

---------

Co-authored-by: Danny Meijer <[email protected]>
femilian-6582 and dannymeijer authored Oct 4, 2024
1 parent 4a6a1d7 commit ecc8b5b
Showing 2 changed files with 25 additions and 4 deletions.
13 changes: 11 additions & 2 deletions src/koheesio/spark/delta.py
@@ -295,16 +295,25 @@ def has_change_type(self) -> bool:

     @property
     def exists(self) -> bool:
-        """Check if table exists"""
+        """Check if table exists.
+        Depending on the value of the boolean flag `create_if_not_exists` a different logging level is provided."""
         result = False

         try:
             self.spark.table(self.table_name)
             result = True
         except AnalysisException as e:
             err_msg = str(e).lower()
+            common_message = (
+                f"Table `{self.table}` doesn't exist. "
+                f"The `create_if_not_exists` flag is set to {self.create_if_not_exists}."
+            )
+
             if err_msg.startswith("[table_or_view_not_found]") or err_msg.startswith("table or view not found"):
-                self.log.error(f"Table `{self.table}` doesn't exist.")
+                if self.create_if_not_exists:
+                    self.log.info(" ".join((common_message, "Therefore the table will be created.")))
+                else:
+                    self.log.error(" ".join((common_message, "Therefore the table will not be created.")))
             else:
                 raise e

16 changes: 14 additions & 2 deletions tests/spark/test_delta.py
@@ -1,4 +1,5 @@
 import os
+import logging
 from pathlib import Path
 from unittest.mock import patch

@@ -14,7 +15,7 @@

 pytestmark = pytest.mark.spark

-log = LoggingFactory.get_logger(name="test_delta")
+log = LoggingFactory.get_logger(name="test_delta", inherit_from_koheesio=True)


 @pytest.mark.parametrize(
@@ -138,8 +139,19 @@ def test_delta_table_properties_dbx():

 @pytest.mark.parametrize("value,expected", [("too.many.dots.given.to.be.a.real.table", pytest.raises(ValidationError))])
 def test_table_failed(value, expected):
-    with pytest.raises(ValidationError):
+    with expected:
         DeltaTableStep(table=value)

     dt = DeltaTableStep(table="unknown_table")
     assert dt.exists is False
+
+
+@pytest.mark.parametrize(
+    ["table", "create_if_not_exists", "log_level"], [("unknown", False, "DEBUG"), ("unknown", True, "INFO")]
+)
+def test_exists(caplog, table, create_if_not_exists, log_level):
+    with caplog.at_level(log_level):
+        dt = DeltaTableStep(table=table, create_if_not_exists=create_if_not_exists)
+        dt.log.setLevel(log_level)
+        assert dt.exists is False
+        assert f"The `create_if_not_exists` flag is set to {create_if_not_exists}." in caplog.text
