add support for CREATE EXTERNAL TABLE #96

Open
djouallah opened this issue Jul 4, 2023 · 4 comments

Comments

@djouallah

Currently only temporary external tables are supported. It would be nice to remove that limitation, please.

@vogelsgesang
Contributor

From an implementation standpoint, what you are asking for is straightforward. In fact, it has already been discussed internally a couple of times. Unfortunately, there is more to this feature than just the implementation. In particular, usability and security pose challenges.

The main issues currently are:

  1. If you move a .hyper file around, how does Hyper locate the external files? Through paths relative to the Hyper file? Absolute file paths?
  2. Should there be a way to package external files together with the .hyper file? E.g., when uploading it to Tableau Cloud.
  3. If you send a Hyper file via email, and some other person opens it, should Hyper read whichever external files are specified in the .hyper file? What if someone maliciously added an external table which reads /etc/passwd as an external CSV or some other sensitive data?
  4. What if you upload a Hyper file to Tableau Cloud? Should that file be allowed to instruct Hyper to read /etc/passwd and display it as part of some visualization?

While the answer for /etc/passwd is clearly "no, this should not be allowed", it's hard to draw the line here.
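
To make question 1 concrete, here is a minimal sketch of the two resolution strategies. It is not how Hyper works; `resolve_external_path` is a hypothetical helper invented for illustration, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath


def resolve_external_path(hyper_file: str, external_ref: str) -> PurePosixPath:
    """Hypothetical helper: resolve a file reference stored inside a .hyper file.

    Absolute references are returned unchanged. Relative references are
    interpreted relative to the directory that contains the .hyper file.
    """
    ref = PurePosixPath(external_ref)
    if ref.is_absolute():
        return ref
    return PurePosixPath(hyper_file).parent / ref


# A relative reference follows the .hyper file when it is moved ...
before = resolve_external_path("/data/a/sales.hyper", "orders.parquet")
after = resolve_external_path("/data/b/sales.hyper", "orders.parquet")

# ... while an absolute reference keeps pointing at the same location.
pinned = resolve_external_path("/data/b/sales.hyper", "/lake/orders.parquet")
```

With relative paths the external files have to travel together with the .hyper file; with absolute paths the file can be moved freely on one machine, but the reference may point at something else entirely on another machine.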

@djouallah
Author

1- Absolute paths.
2- No, that defeats the purpose of an external table; the data has to be in shared storage.
3- It's not Hyper's fault if someone stores sensitive data without encryption. Moreover, only the user can see it anyway. But I am not a security expert.
4- An option in Tableau Cloud to block reading from internal data.

My use case is reading Parquet files from remote storage, which I think is a very common pattern these days with lakehouses and such :)

Thanks a lot for your reply.

@rferraton

Agree with @djouallah, permanent external tables (and views) are missing in Hyper.
To answer your questions:
1 - Both (relative and absolute).
2 - External means external, so no packing of external data into the Hyper file.
3 - Limit external file extensions to CSV and Parquet.
4 - Limiting external file extensions should handle this problem. Also limit the number of files matched by a glob to 1000. There could also be limits on directories (no /etc, no /usr, no /opt, no c:\windows, no c:\program files, ...).
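
The limits in points 3 and 4 could be sketched roughly as follows. This is only an illustration of the proposal, not an existing Hyper feature: all names (`ALLOWED_EXTENSIONS`, `DENIED_DIRECTORIES`, `is_path_allowed`, `check_glob_result`) are made up here, only POSIX paths are handled, and symlinks are not considered.

```python
from pathlib import PurePosixPath
from typing import List

# Assumed policy values, taken from the suggestions above.
ALLOWED_EXTENSIONS = {".csv", ".parquet"}
DENIED_DIRECTORIES = [PurePosixPath(p) for p in ("/etc", "/usr", "/opt")]
MAX_FILES_PER_GLOB = 1000


def is_path_allowed(path: str) -> bool:
    """Return True if a single external file passes the sketched policy."""
    p = PurePosixPath(path)
    # Only CSV and Parquet files may be read.
    if p.suffix.lower() not in ALLOWED_EXTENSIONS:
        return False
    # Reject ".." so a path cannot climb back into a denied directory.
    if ".." in p.parts:
        return False
    # Reject anything located inside a denied directory.
    return not any(d in p.parents for d in DENIED_DIRECTORIES)


def check_glob_result(paths: List[str]) -> List[str]:
    """Validate the list of files that a glob expanded to."""
    if len(paths) > MAX_FILES_PER_GLOB:
        raise ValueError("glob matched more than %d files" % MAX_FILES_PER_GLOB)
    rejected = [p for p in paths if not is_path_allowed(p)]
    if rejected:
        raise PermissionError("paths not allowed: %s" % rejected)
    return paths
```

Note that the extension filter alone would not stop a file such as /etc/secrets.csv, which is why the directory check is applied as well.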

Couldn't these security concerns also be blocked by a security tool (EDR or antivirus) on the Tableau Cloud clusters?

Object storage is great, but there are also fast parallel remote filesystems like pNFS or Lustre that provide excellent performance for accessing remote data.

@djouallah
Author

Recently DuckDB added an option to turn off reading from the local filesystem. I guess you could do the same for Tableau Cloud and turn local reads off by default for security reasons.
