Develop CSV importer #4

alanbernstein · 2017-03-09T21:05:45Z

For the use case work, we put together a CSV import system that is specific to the two use cases, but lays some groundwork for working with more general data sources. The scope is limited to well-formatted, well-defined tabular data, so users will be responsible for providing clean data.

bruth · 2018-08-12T18:11:50Z

Mind sharing those use cases and how a CSV file would map to the structure of an index?

alanbernstein · 2018-08-13T03:10:09Z

The mapping for relational data is outlined in our docs at https://www.pilosa.com/docs/latest/data-model/#relational-analogy, and we have a few use case writeups at https://www.pilosa.com/use-cases/. I believe the two referenced in this ticket are transportation and network traffic. Note that these pages are overdue for some updates; you can see up-to-date PDK use case code in the repo: https://github.com/pilosa/pdk/tree/master/usecase.
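To make the relational analogy concrete, here is a minimal sketch of how one clean CSV file could be turned into bits, assuming each CSV record becomes a column, each remaining CSV header becomes a field, and each integer cell value becomes a row ID within that field. The function name `csv_to_bits`, the sample headers, and the triple layout are hypothetical illustrations, not the PDK importer's actual API.

```python
import csv
import io


def csv_to_bits(text, id_column):
    """Yield (field, row_id, column_id) triples from well-formed CSV text.

    Assumes every cell is already an integer, which matches the
    "users provide clean data" scope described in this issue.
    """
    reader = csv.DictReader(io.StringIO(text))
    for record in reader:
        # One CSV record corresponds to one column in the index.
        column_id = int(record[id_column])
        for field, value in record.items():
            if field == id_column:
                continue
            # The cell value selects which row of the field gets a bit set.
            yield (field, int(value), column_id)


# Hypothetical sample data; the header names are illustrative only.
sample = "ride_id,passenger_count,cab_type\n0,2,1\n1,1,0\n"
bits = list(csv_to_bits(sample, "ride_id"))
```

Each triple says "set the bit at this row and column of this field". Non-integer values (strings, floats, timestamps) would need a binning or key-mapping step first, which this sketch leaves out.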

bruth · 2018-08-13T12:48:29Z

Thanks. I found the table in the Python notebook you put together helpful, as well as the suggestion for binning strategies. Is the general recommendation that row IDs be contiguous, to optimize the bitmap compression (via Roaring)? And is this handled if a field is created that supports keys?

jaffee · 2018-08-24T16:32:15Z

@bruth it isn't as crucial that row IDs be contiguous, but column IDs should be as close to contiguous as possible. This is handled if you use keys.
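As a rough illustration of why keys help with contiguity, the sketch below assigns dense integer IDs to arbitrary string keys in first-seen order, so sparse external identifiers end up packed into a contiguous range. This is an assumption about the general idea only; the class `KeyTranslator` is hypothetical and is not the server's actual key translation code.

```python
class KeyTranslator:
    """Map arbitrary string keys to dense integer IDs (illustrative only)."""

    def __init__(self):
        self._ids = {}

    def translate(self, key):
        # A new key gets the next unused ID, so the IDs stay contiguous
        # no matter how sparse the original keys are.
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]


columns = KeyTranslator()
keys = ["user-9000", "user-17", "user-9000", "user-523"]
ids = [columns.translate(k) for k in keys]
```

With this kind of mapping, an importer never has to choose column IDs itself: repeated keys reuse their ID and new keys extend the range by one.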

alanbernstein mentioned this issue Mar 9, 2017

Develop CSV importer FeatureBaseDB/featurebase#389

Closed
