Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Add AzureSQL source and Prefect tasks #1043

Merged
merged 51 commits into from
Oct 4, 2024
Merged

✨ Add AzureSQL source and Prefect tasks #1043

merged 51 commits into from
Oct 4, 2024

Conversation

adrian-wojcik
Copy link
Contributor

@adrian-wojcik adrian-wojcik commented Sep 20, 2024

Summary

Adds SQLServer connector

Checklist

This PR:

  • follows the guidelines laid out in CONTRIBUTING.md
  • links relevant issue(s)
  • adds/updates tests (if appropriate)
  • adds/updates docstrings (if appropriate)
  • adds an entry in CHANGELOG.md

Copy link
Contributor

@trymzet trymzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrian-wojcik I think you've mixed up Aselite stuff with AzureSQL stuff. Could you pls create separate PRs for each?

@adrian-wojcik
Copy link
Contributor Author

@trymzet I did not mixed up. As you can see in viadot 1 Aselite task is using AzureSQL source only.

@trymzet
Copy link
Contributor

trymzet commented Sep 23, 2024

@adrian-wojcik ok then can you rename this accordingly? Ie. the tasks should and flows should have the name of the source in them, eg. azure_sql_to_df(), etc.

@adrian-wojcik
Copy link
Contributor Author

We were discussing it with Rafał and as Aselite is using AzureSQL which inherited from SQLServer, probably good idea is to remove in future this connector and move logic from AzureSQL Source to SQLServer Source, so all aselite deployments would use simply sql_server_to_adls flow. But as for now we believe the name convention can stay as it is, as it is temporary solution.

@trymzet
Copy link
Contributor

trymzet commented Sep 25, 2024

@adrian-wojcik

  1. I don't think this makes sense - Azure SQL is not the same as SQL Server; it's a layer on top of SQL Server. That's why it inherits from it. It won't work the other way around, and neither will adding AzureSQL functionality into SQLServer (since SQL server - the database - doesn't have this functionality).

  2. As for this code, you're adding tasks and flows for Azure SQL, so they should be named accordingly. I don't see how this is related to Aselite - there is no Aselite source in here. Maybe you're confusing the library with how you're using it? I think it's just unnecessarily making things less generic for no reason. These tasks and flows work for any Azure SQL db, not just for Aselite (whatever that is), so there's no reason to use the less generic naming. What if I now want to extract data from Azure SQL? I need to create another task & flow, which will be exactly the same as yours, or use your aselite_to_df() task and aselite_to_adls() flow, which would be quite confusing.

    If you wanted to use the Aselite name in here, you should create an Aselite source - but there's no point since it's literally just Azure SQL. See my point?

@adrian-wojcik
Copy link
Contributor Author

@trymzet Yes, I see your point. Names are changed. I will let Fabio know to change it in flow deploments as well.

Copy link
Contributor

@trymzet trymzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added my comments

tests/unit/test_aselite.py Outdated Show resolved Hide resolved
tests/unit/test_aselite.py Outdated Show resolved Hide resolved
tests/unit/test_aselite.py Outdated Show resolved Hide resolved
tests/unit/test_aselite.py Outdated Show resolved Hide resolved
tests/unit/test_aselite.py Outdated Show resolved Hide resolved
@trymzet trymzet changed the title Aselite 2.0 ✨ Add AzureSQL source and Prefect tasks Sep 27, 2024
@trymzet
Copy link
Contributor

trymzet commented Sep 30, 2024

@adrian-wojcik I think all conversations are resolved, can you rebase, make sure CI checks pass, and re-request review?

@Rafalz13
Copy link
Contributor

There is only 1 pre-commit error left but couldn't fix it. I added noqa and pragma:.... It didn't help.
image

@trymzet
Copy link
Contributor

trymzet commented Sep 30, 2024

@Rafalz13 I fixed the pre-commit checks for you. Pls fix the tests and rebase and it should be good to go.

@Rafalz13 Rafalz13 requested a review from trymzet October 1, 2024 12:55
@Rafalz13
Copy link
Contributor

Rafalz13 commented Oct 1, 2024

Done. There is a risk that bulk_insert will not work because self.credentials['data_source']', is missing in credentials but as I know this option is not used. The task uses only run from the base SQL class.

Copy link
Contributor

@trymzet trymzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some last comments

src/viadot/orchestration/prefect/tasks/azure_sql.py Outdated Show resolved Hide resolved
tests/unit/test_azure_sql.py Outdated Show resolved Hide resolved
@Rafalz13 Rafalz13 requested a review from trymzet October 2, 2024 14:59
src/viadot/utils.py Outdated Show resolved Hide resolved
src/viadot/sources/azure_sql.py Outdated Show resolved Hide resolved
@trymzet trymzet self-requested a review October 4, 2024 09:17
@trymzet trymzet merged commit 4fe8913 into 2.0 Oct 4, 2024
3 checks passed
@trymzet trymzet deleted the aselite_2.0 branch October 4, 2024 09:40
@trymzet trymzet added the 2.0 viadot 2.0 label Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.0 viadot 2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants