Data Inspector is an open-source Python library that provides more than 15 functions to make exploratory data analysis (EDA) and data cleaning easier.
These functions automate common EDA and data cleaning steps so that a dataset is easier to understand, which makes the library a good starting point for exploring your data.
Regression plots (regplots) have been added to the library.
pip install data-inspector
The package is available at https://pypi.org/project/data-inspector/
- Line plot:
line_plot(data, x_data, y_data, x_label="", y_label="", title="")
- Plot a skewed feature:
plot_skewed_feature(data, column)
- Show data distribution:
show_distribution(data, column)
- Scatter plot:
plot_scatter(data, x_data, y_data)
- Correlation plot:
plot_correlation(data)
- Create histogram:
histogram(data, column, x_label, y_label, title)
- Create bar plot:
plot_bar(data, column, xlabel, ylabel, title)
- Create boxplots of all features:
box_plot(data)
- Check the dataset's shape:
datasetShape(data)
- Get dataset's diagnostic plots:
diagnostic_plots(data, variable)
- Divide numerical and categorical features:
divideFeatures(data)
- Fill NaN values:
fillNan(data, column, value)
- Get Pearson's correlation between two variables:
get_correlation(column_1, column_2, data)
- Plot KDE plots:
plot_cont_kde(data, var)
- Automatically calculate missing values and their percentages, with a visualization:
calculating_missing_values(data)
- Regression plot with 95% CI:
plot_regplot(data, x_data, y_data)
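To give a feel for what a few of the non-plotting helpers above compute, here is a minimal sketch written directly against pandas. It is not the library's implementation, and it does not import data-inspector. The function names `missing_value_summary`, `split_features`, and `pearson` are hypothetical stand-ins for the ideas behind `calculating_missing_values`, `divideFeatures`, and `get_correlation`, and the sample DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd


def missing_value_summary(data):
    """Hypothetical helper: missing count and percentage per column."""
    missing = data.isna().sum()
    percent = 100 * missing / len(data)
    summary = pd.DataFrame({"missing": missing, "percent": percent})
    # Keep only columns that actually have gaps, largest gap first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


def split_features(data):
    """Hypothetical helper: (numerical, categorical) column names by dtype."""
    numerical = data.select_dtypes(include="number").columns.tolist()
    categorical = data.select_dtypes(exclude="number").columns.tolist()
    return numerical, categorical


def pearson(column_1, column_2, data):
    """Hypothetical helper: Pearson's r between two columns.

    pandas ignores rows where either value is missing.
    """
    return float(data[column_1].corr(data[column_2], method="pearson"))


# Made-up sample data, for illustration only.
df = pd.DataFrame({
    "age": [25, 32, np.nan, 41],
    "income": [30.0, 42.0, 50.0, 66.0],
    "city": ["Dhaka", None, "Khulna", None],
})

print(missing_value_summary(df))
print(split_features(df))
print(pearson("age", "income", df))
```

The library's own functions may return different structures or add plots on top of these numbers, so treat the sketch as a way to check results by hand rather than a description of the package's behavior.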
Example notebook: https://github.com/AmitHasanShuvo/data-inspector/blob/main/notebook/example%20notebook.ipynb
Colab link: https://colab.research.google.com/drive/1mj9gz2XyQprSYdKMUKlKkJ9Qi8XmleHW?usp=sharing
@online{data-inspector,
title={data-inspector},
url={https://pypi.org/project/data-inspector/},
urldate = {2021-08-21},
publisher={Kazi Amit Hasan}
}
- Add some automations for time series data.
Contributions are highly appreciated. Please read the contributing guidelines on GitHub before submitting changes.