For testing and developing functions, it would be nice if local and Spark arrays had identical interfaces. This might be easiest to achieve by faking the SparkContext and RDD, and using BoltArraySpark for everything.
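A minimal sketch of what faking the Spark primitives could look like. All names here (`FakeRDD`, `FakeSparkContext`) are illustrative, not part of bolt or pyspark; the idea is just to mimic the small slice of the RDD API that a distributed ndarray implementation touches, backed by an in-memory list:

```python
class FakeRDD:
    """Mimics a subset of pyspark's RDD API, backed by a plain list
    of (key, value) records held in local memory."""

    def __init__(self, records):
        self.records = list(records)

    def map(self, func):
        # lazy-ish: build a new FakeRDD from the mapped records
        return FakeRDD(func(r) for r in self.records)

    def filter(self, func):
        return FakeRDD(r for r in self.records if func(r))

    def collect(self):
        return list(self.records)


class FakeSparkContext:
    """Stands in for pyspark.SparkContext when pyspark is unavailable."""

    def parallelize(self, data, numSlices=None):
        # numSlices is accepted for signature compatibility but ignored
        return FakeRDD(data)
```

With something like this, BoltArraySpark could be constructed against a `FakeSparkContext` in tests, so operations exercise the exact same code path with or without a cluster.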
We thought about trying something similar a while back. Recently, though, we have been leaning more toward removing the local version of the BoltArray entirely and making the project completely about a distributed ndarray. The thought was that this might be best in terms of modularity: a higher-level project could then wrap the NumPy array, the Bolt array, and maybe even the Dask array.
Would be interested to hear your thoughts on this.
@jwittenbach That seems like it might be a good direction to go in. I started on a localspark implementation that simply fakes the Spark backend, so that commands like chunk and unchunk work the same with or without pyspark (https://github.com/bolt-project/bolt/pull/101/files)
Another approach would be to define a generic ParallelBackend interface for BoltArrays that could be implemented on top of Spark, Flink, multiprocessing, etc., so that all operations are expressed the same way regardless of backend.
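To make the idea concrete, here is a hypothetical sketch of such an interface, with a trivial serial implementation for testing. The class and method names (`ParallelBackend`, `parallelize`, `map`, `collect`) are assumptions for illustration, not an existing bolt API:

```python
from abc import ABC, abstractmethod


class ParallelBackend(ABC):
    """Hypothetical minimal contract a BoltArray would program against.
    A SparkBackend would delegate to an RDD, a MultiprocessingBackend
    to a process pool, and so on."""

    @abstractmethod
    def parallelize(self, records):
        """Distribute an iterable of records, returning a backend handle."""

    @abstractmethod
    def map(self, handle, func):
        """Apply func to every record, returning a new handle."""

    @abstractmethod
    def collect(self, handle):
        """Materialize a handle back into a local list."""


class SerialBackend(ParallelBackend):
    """Reference implementation that runs everything in-process."""

    def parallelize(self, records):
        return list(records)

    def map(self, handle, func):
        return [func(r) for r in handle]

    def collect(self, handle):
        return list(handle)
```

Array operations written against `ParallelBackend` would then be testable with `SerialBackend` and deployable unchanged on a cluster-backed implementation.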