Implement file_name_mappers in R sparkly API #1601

Open
piotrszul opened this issue Aug 9, 2023 · 0 comments
Labels
new feature New feature or request

Comments

@piotrszul
Collaborator

Functions such as ptl_read_ndjson should support a file_name_mapper argument to allow flexible mapping of file names to resource types.

However, the sparklyr Java interface does not provide any standard mechanism for R callback functions (i.e. calling R code back from the JVM) in the way that Py4J supports callbacks for Python.

It might be possible to adapt the code used in spark_apply(), although this may require digging deep into the sparklyr implementation and would couple the solution tightly to it.

A better approach may be to implement an explicit mapper in Java that maps every file name to its resource types, and then construct it in R by applying an R lambda to each file name. That would also require an interface from R for listing all the files in a directory described by a Spark-supported filesystem URL.
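As a rough illustration of the explicit-mapper idea, the Java side could be as small as the sketch below. Everything in it is an assumption made for illustration: the class name `ExplicitFileNameMapper` is hypothetical, and the `Function<String, Set<String>>` shape (file name in, set of resource type names out) is a guess at what the reader would expect, not something confirmed against the existing Java API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical class, not part of the existing API: a mapper backed by a
// precomputed table instead of a callback into R.
final class ExplicitFileNameMapper implements Function<String, Set<String>> {

  // File name -> resource types, fixed at construction time.
  private final Map<String, Set<String>> mappings;

  ExplicitFileNameMapper(Map<String, Set<String>> mappings) {
    // Defensive copy so later changes by the caller do not affect the mapper.
    Map<String, Set<String>> copy = new HashMap<>();
    for (Map.Entry<String, Set<String>> entry : mappings.entrySet()) {
      copy.put(entry.getKey(), Set.copyOf(entry.getValue()));
    }
    this.mappings = Collections.unmodifiableMap(copy);
  }

  @Override
  public Set<String> apply(String fileName) {
    // File names that were not listed up front map to no resource types.
    return mappings.getOrDefault(fileName, Collections.emptySet());
  }
}
```

On the R side, the table would be built by listing the files first and applying the R lambda to each name, so no JVM-to-R callback is needed. The object could then presumably be created with something like sparklyr's invoke_new(), though how the table is marshalled across that boundary is left open here.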

@github-project-automation github-project-automation bot moved this to Backlog in Pathling Sep 2, 2023
@johngrimes johngrimes added the new feature New feature or request label Sep 2, 2023
Projects
Status: Backlog