Background
In a recent infra meeting we were discussing the difficulty of enabling full GUI-based exploration of tables in Superset, as is available in dataset. Instead, the default path for exploring an arbitrary table is SQL, and we know many users don't feel comfortable with SQL. While thinking about how we might work around this, it occurred to me that maybe this isn't actually an important feature.
I realized that users who don't want to use SQL also don't want to explore hundreds of tables with loads of columns. For these users, I believe there's a level of curation necessary that drop-down filters don't satisfy. Even better docs or a data dictionary (while super valuable) won't make our data approachable for them. Instead, I think these users would benefit from a deeper level of curation that goes beyond our existing output tables.
Proposal
Curated tables
Rather than view Superset as a GUI for our data warehouse, I think we should view it as a place to collect data and analyses that provide this deeper curation. We could use SQL within Superset to create tables that users can download as CSVs. These tables would go beyond existing output tables to the extent that we are willing to manipulate the raw data with analyses, cleaning, aggregations, etc. Any table we make available here should be easily usable in Excel or with basic Python skills.
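To make this a bit more concrete, here's a rough sketch of what one curated table could look like. Everything in it is made up for illustration: the `raw_generation` table, its columns, and the sample rows are hypothetical and not real PUDL tables, and an in-memory SQLite database stands in for whatever backend Superset would actually query. The point is just the shape of the idea: a short SQL query that cleans and aggregates raw rows into something small enough to open in Excel.

```python
import csv
import io
import sqlite3

# Hypothetical raw rows: (plant_name, state, report_year, fuel_type, net_generation_mwh).
# These names and values are illustrative only, not real PUDL data.
RAW_ROWS = [
    ("Plant A", "CO", 2022, "gas", 120.0),
    ("Plant A", "CO", 2022, "gas", 80.0),
    ("Plant B", "CO", 2022, "wind", 50.0),
    ("Plant C", "TX", 2022, "gas", None),  # missing value, dropped by the query
    ("Plant D", "TX", 2022, "wind", 200.0),
]

# The curation step: drop missing values, then aggregate to one row
# per state, year, and fuel type.
CURATED_SQL = """
SELECT state,
       report_year,
       fuel_type,
       ROUND(SUM(net_generation_mwh), 1) AS total_generation_mwh
FROM raw_generation
WHERE net_generation_mwh IS NOT NULL
GROUP BY state, report_year, fuel_type
ORDER BY state, fuel_type
"""


def build_curated_csv() -> str:
    """Load the sample rows, run the curation query, and return CSV text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_generation ("
        "plant_name TEXT, state TEXT, report_year INTEGER, "
        "fuel_type TEXT, net_generation_mwh REAL)"
    )
    conn.executemany("INSERT INTO raw_generation VALUES (?, ?, ?, ?, ?)", RAW_ROWS)

    cursor = conn.execute(CURATED_SQL)
    header = [col[0] for col in cursor.description]

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor.fetchall())
    conn.close()
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_curated_csv())
```

In Superset itself only the SQL part would matter, since the CSV download is something the tool already provides. The Python wrapper is here only so the sketch runs on its own.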
Cultivating external contribution
Creating new tables in Superset could be a great path for external contribution. We know many users don't know SQL and don't have much time or energy to learn new technical skills, but users who become contributors are inherently the most highly motivated, and learning a little SQL is easier than learning the Dagster and data engineering concepts necessary to interact with PUDL. The tables we create will also serve as examples that we can pair with tutorials to make getting started manageable.