SNOW-1726519: Full schema introspection breaks writing if unsupported datatype exists in the schema #534

Open
fmcardoso opened this issue Oct 7, 2024 · 1 comment
Labels
bug Something isn't working status-triage_done Initial triage done, will be further handled by the driver team

Comments

@fmcardoso

Please answer these questions before submitting your issue. Thanks!

  1. What version of Python are you using?

    3.10.9

  2. What operating system and processor architecture are you using?

    macOS-14.6.1-arm64-arm-64bit

  3. What are the component versions in the environment (pip freeze)?

aiobotocore==2.4.2
aiohttp==3.9.5
aioitertools==0.11.0
aiosignal==1.3.1
alabaster==0.7.16
alembic==1.13.1
altair==5.3.0
aniso8601==9.0.1
annotated-types==0.7.0
anyio==4.4.0
appdirs==1.4.4
appnope==0.1.4
argcomplete==3.1.6
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asn1crypto==1.5.1
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.2.0
Babel==2.15.0
backoff==2.2.1
bcrypt==4.1.3
beautifulsoup4==4.12.3
bleach==6.1.0
blinker==1.8.2
bokeh==2.4.3
boto3==1.24.59
botocore==1.27.59
build==1.2.2
CacheControl==0.14.0
cachetools==5.3.3
category-encoders==2.6.3
certifi==2024.8.30
cffi==1.16.0
cfgv==3.4.0
charset-normalizer==3.3.2
cleo==2.1.0
click==8.1.7
cloudpickle==3.0.0
colorama==0.4.6
comm==0.2.2
commitizen==3.9.1
contourpy==1.2.1
coverage==7.5.3
crashtest==0.4.1
cryptography==43.0.1
cycler==0.12.1
debugpy==1.8.1
decli==0.6.2
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
deprecation==2.1.0
dill==0.3.8
distlib==0.3.8
distro==1.9.0
docker==7.1.0
docutils==0.21.2
duckdb==0.10.0
duckdb_engine==0.11.1
dulwich==0.21.7
dynaconf==3.2.6
entrypoints==0.4
evidently==0.4.37
exceptiongroup==1.2.1
execnb==0.1.6
execnet==2.1.1
executing==2.0.1
Faker==28.4.1
fastcore==1.5.46
fastjsonschema==2.20.0
filelock==3.15.1
Flask==3.0.3
fonttools==4.53.0
fqdn==1.5.1
frozenlist==1.4.1
fsspec==2022.11.0
future==1.0.0
ghapi==1.0.5
ghp-import==2.1.0
gitdb==4.0.11
GitPython==3.1.43
graphene==3.3
graphql-core==3.2.3
graphql-relay==3.2.0
griffe==0.46.0
gunicorn==22.0.0
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.17.3
hyperopt==0.2.7
hypothesis==6.103.2
hypothesis-jsonschema==0.23.1
identify==2.5.36
idna==3.7
imagesize==1.4.1
imbalanced-learn==0.12.3
imblearn==0.0
importlib-metadata==6.11.0
iniconfig==2.0.0
installer==0.7.0
ipykernel==6.29.4
ipython==8.25.0
ipython-genutils==0.2.0
ipython-sql==0.4.1
ipywidgets==8.1.3
isoduration==20.11.0
iterative-telemetry==0.0.8
itsdangerous==2.2.0
jaraco.classes==3.4.0
jedi==0.19.1
Jinja2==3.1.4
jmespath==1.0.1
joblib==1.4.2
json5==0.9.25
jsonpointer==3.0.0
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-cache==1.0.0
jupyter-console==6.6.3
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.2
jupyter_core==5.7.2
jupyter_server==2.14.1
jupyter_server_terminals==0.5.3
jupyterlab==4.2.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.2
jupyterlab_widgets==3.0.11
jupytext==1.16.2
keyring==24.3.1
kiwisolver==1.4.5
litestar==2.11.0
llvmlite==0.43.0
loguru==0.7.2
lxml==5.2.2
Mako==1.3.5
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
matplotlib==3.9.0
matplotlib-inline==0.1.7
mdit-py-plugins==0.4.1
mdurl==0.1.2
mergedeep==1.3.4
mistune==3.0.2
mkdocs==1.6.0
mkdocs-autorefs==1.0.1
mkdocs-gen-files==0.5.0
mkdocs-get-deps==0.2.0
mkdocs-git-revision-date-localized-plugin==1.2.6
mkdocs-jupyter==0.24.7
mkdocs-material==9.5.27
mkdocs-material-extensions==1.3.1
mkdocstrings==0.25.1
mkdocstrings-python==1.10.3
mlflow==2.14.0
mlflow-skinny==2.14.0
more-itertools==10.5.0
moto==4.2.14
mpmath==1.3.0
msgpack==1.0.8
msgspec==0.18.6
multidict==6.0.5
multimethod==1.10
mypy-extensions==1.0.0
myst-nb==1.1.0
myst-parser==3.0.1
nbclient==0.10.0
nbconvert==7.16.4
nbdev==2.3.25
nbformat==5.10.4
nest-asyncio==1.6.0
networkx==3.3
nltk==3.9.1
nodeenv==1.9.1
notebook==7.2.1
notebook_shim==0.2.4
numba==0.60.0
numpy==1.26.4
opentelemetry-api==1.25.0
opentelemetry-sdk==1.25.0
opentelemetry-semantic-conventions==0.46b0
overrides==7.7.0
packaging==24.1
paginate==0.5.6
pandarallel==1.6.5
pandas==1.5.3
pandas-datareader==0.10.0
pandas-stubs==2.2.2.240603
pandera==0.19.3
pandocfilters==1.5.1
parso==0.8.4
pathspec==0.12.1
patsy==0.5.6
pexpect==4.9.0
pillow==10.3.0
pkginfo==1.11.1
platformdirs==4.2.2
plotly==5.18.0
pluggy==1.5.0
poetry==1.8.3
poetry-core==1.9.0
poetry-plugin-export==1.8.0
polyfactory==2.16.2
pprintpp==0.4.0
pre-commit==3.7.1
prettytable==0.7.2
prometheus_client==0.20.0
prompt_toolkit==3.0.47
protobuf==4.25.3
psutil==5.9.8
psycopg2-binary==2.9.9
ptyprocess==0.7.0
pure-eval==0.2.2
py-partiql-parser==0.5.0
py4j==0.10.9.7
pyarrow==14.0.2
pycountry==24.6.1
pycountry-convert==0.7.2
pycparser==2.22
pydantic==2.7.4
pydantic-settings==2.3.3
pydantic_core==2.18.4
pydeck==0.9.1
Pygments==2.18.0
PyJWT==2.8.0
pymdown-extensions==10.8.1
pyOpenSSL==24.1.0
pyparsing==3.1.2
pyproject_hooks==1.1.0
pyprojroot==0.3.0
pytest==7.4.4
pytest-cov==4.1.0
pytest-mock==3.14.0
pytest-sugar==1.0.0
pytest-xdist==3.6.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-gitlab==4.6.0
python-json-logger==2.0.7
pytz==2024.1
PyYAML==6.0.1
pyyaml_env_tag==0.1
pyzmq==26.0.3
qtconsole==5.5.2
QtPy==2.4.1
querystring-parser==1.2.4
questionary==1.10.0
rapidfuzz==3.9.7
ray==2.20.0
referencing==0.35.1
regex==2024.5.15
repoze.lru==0.7
requests==2.32.3
requests-toolbelt==1.0.0
responses==0.25.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.1
rich-click==1.8.3
rpds-py==0.18.1
ruff==0.4.9
s3fs==2022.11.0
s3transfer==0.6.2
safetensors==0.4.3
scikit-learn==1.2.2
scipy==1.13.1
seaborn==0.13.2
Send2Trash==1.8.3
sentence-transformers==2.7.0
setuptools-scm==8.1.0
shap==0.42.1
shellingham==1.5.4
six==1.16.0
slicer==0.0.7
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.1
snowballstemmer==2.2.0
snowflake-connector-python==3.10.1
snowflake-sqlalchemy==1.5.3
sortedcontainers==2.4.0
soupsieve==2.5
Sphinx==7.3.7
sphinxcontrib-applehelp==1.0.8
sphinxcontrib-devhelp==1.0.6
sphinxcontrib-htmlhelp==2.0.5
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.7
sphinxcontrib-serializinghtml==1.1.10
SQLAlchemy==1.4.52
sqlparse==0.5.0
stack-data==0.6.3
statsmodels==0.14.2
streamlit==1.35.0
sympy==1.12.1
tabulate==0.9.0
tenacity==8.4.2
tensorboardX==2.6.2.2
termcolor==2.4.0
terminado==0.18.1
testcontainers==3.7.1
threadpoolctl==3.5.0
tinycss2==1.3.0
tokenizers==0.15.2
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.5
toolz==0.12.1
torch==2.3.1
tornado==6.4.1
tqdm==4.66.4
traitlets==5.14.3
transformers==4.35.2
trove-classifiers==2024.7.2
typeguard==4.3.0
typer==0.12.5
types-python-dateutil==2.9.0.20240316
types-pytz==2024.1.0.20240417
types-PyYAML==6.0.12.20240311
types-requests==2.31.0.6
types-setuptools==70.0.0.20240524
types-urllib3==1.26.25.14
typing-inspect==0.9.0
typing_extensions==4.12.2
ujson==5.10.0
ulid-py==1.1.0
uri-template==1.3.0
urllib3==1.26.20
uvicorn==0.30.6
uvloop==0.20.0
virtualenv==20.26.2
watchdog==4.0.1
watchfiles==0.24.0
wcwidth==0.2.13
webcolors==24.6.0
webencodings==0.5.1
websocket-client==1.8.0
websockets==13.0.1
Werkzeug==3.0.3
widgetsnbextension==4.0.11
wrapt==1.16.0
xattr==1.1.0
xgboost==2.0.3
xmltodict==0.13.0
yarl==1.9.4
zipp==3.19.2

  4. What did you do?

I wrote a DataFrame to Snowflake through pandas.DataFrame.to_sql while a table with a column of datatype VECTOR existed in the schema.

  5. What did you expect to see? What should have happened and what happened instead?

A 'NullType' object is not callable error is raised, and it is impossible to write any table through pandas.DataFrame.to_sql while a table with a VECTOR column exists in the schema.

If any table in the Snowflake schema has a column of datatype VECTOR, every write to Snowflake through pandas.DataFrame.to_sql fails during introspection. The get_columns method reads the whole schema, so when _get_schema_columns reaches the column with the unsupported datatype, it raises an error.
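
One plausible reading of the error message, not confirmed in this thread, is that an unrecognised type name falls back to a placeholder instance rather than a class, and the code then tries to call it. The sketch below reproduces that failure mode in plain Python. Every name in it (Integer, Text, NullType, NULLTYPE, TYPE_CLASSES, build_type, reflect_schema) is a hypothetical stand-in and not the actual snowflake-sqlalchemy or SQLAlchemy code.

```python
# Stand-in classes; these are NOT the real SQLAlchemy or Snowflake types.
class Integer:
    pass


class Text:
    pass


class NullType:
    """Placeholder for a column type the lookup does not recognise."""


# A ready-made instance (not a class) used as the fallback value.
NULLTYPE = NullType()

# Hypothetical name-to-class lookup; VECTOR is deliberately absent.
TYPE_CLASSES = {"NUMBER": Integer, "TEXT": Text}


def build_type(type_name):
    """Return a type object for type_name (assumed logic, for illustration)."""
    col_type = TYPE_CLASSES.get(type_name)
    if col_type is None:
        col_type = NULLTYPE  # the fallback is an instance, not a class
    # Calling a class creates an instance; calling the NullType instance fails.
    return col_type()


def reflect_schema(columns):
    """Build a type object for every (table, column, type_name) row."""
    return {
        (table, column): build_type(type_name)
        for table, column, type_name in columns
    }


# One ordinary table plus an unrelated table holding a VECTOR column.
schema = [("orders", "id", "NUMBER"), ("embeddings", "vec", "VECTOR")]

try:
    reflect_schema(schema)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # 'NullType' object is not callable
```

Under these assumptions, reflecting the whole schema fails even when the caller only cares about the orders table, which matches the behaviour described above.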

One possible solution, if introspecting the whole schema is not necessary, would be to pass the table name to the query built in _get_schema_columns, instead of filtering by table only in the method's return statement.
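
The difference between filtering at the end and filtering in the query can be sketched with an in-memory SQLite table standing in for the column catalog. This is only an illustration of the idea: the columns_catalog table, the SUPPORTED mapping, and the functions parse_type, get_columns_schema_wide and get_columns_single_table are all hypothetical and do not reflect the dialect's actual query or code.

```python
import sqlite3

# Hypothetical set of type names the parser understands; VECTOR is missing.
SUPPORTED = {"NUMBER": int, "TEXT": str}


def parse_type(type_name):
    """Map a catalog type name to a Python type, failing on unknown names."""
    if type_name not in SUPPORTED:
        raise TypeError(f"unsupported data type: {type_name}")
    return SUPPORTED[type_name]


# In-memory stand-in for the schema's column catalog.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE columns_catalog "
    "(table_name TEXT, column_name TEXT, data_type TEXT)"
)
conn.executemany(
    "INSERT INTO columns_catalog VALUES (?, ?, ?)",
    [
        ("embeddings", "vec", "VECTOR"),
        ("orders", "id", "NUMBER"),
        ("orders", "note", "TEXT"),
    ],
)


def get_columns_schema_wide(conn, table_name):
    """Parse every column in the schema, then filter by table at the end."""
    rows = conn.execute(
        "SELECT table_name, column_name, data_type "
        "FROM columns_catalog ORDER BY rowid"
    ).fetchall()
    parsed = {}
    for table, column, type_name in rows:
        parsed.setdefault(table, []).append((column, parse_type(type_name)))
    return parsed[table_name]


def get_columns_single_table(conn, table_name):
    """Filter by table inside the query, so other tables are never parsed."""
    rows = conn.execute(
        "SELECT column_name, data_type FROM columns_catalog "
        "WHERE table_name = ? ORDER BY rowid",
        (table_name,),
    ).fetchall()
    return [(column, parse_type(type_name)) for column, type_name in rows]


try:
    get_columns_schema_wide(conn, "orders")
    schema_wide_error = None
except TypeError as exc:
    schema_wide_error = str(exc)

single_table_columns = get_columns_single_table(conn, "orders")
```

In this sketch the schema-wide version fails on the unrelated embeddings table, while the single-table version returns the two orders columns. A table that itself contains a VECTOR column would still fail in both versions, so this narrows the problem rather than removing it.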

  6. Can you set logging to DEBUG and collect the logs?

I believe the log is not necessary for this issue.

@fmcardoso fmcardoso added bug Something isn't working needs triage labels Oct 7, 2024
@github-actions github-actions bot changed the title Full schema introspection breaks writing if unsupported datatype exists in the schema SNOW-1726519: Full schema introspection breaks writing if unsupported datatype exists in the schema Oct 7, 2024
@sfc-gh-dszmolka
Contributor

Hi, thank you for raising this and for the detailed analysis. VECTOR is documented as unsupported (for now), but I fully agree it still shouldn't break anything. We'll take a look.

@sfc-gh-dszmolka sfc-gh-dszmolka added status-triage_done Initial triage done, will be further handled by the driver team and removed needs triage labels Oct 8, 2024