
# [0.10] Performance Optimization, Enhanced Preprocessing, and much more!

@ombhojane ombhojane released this 13 Oct 07:16
· 55 commits to main since this release
eaaefb0

ExplainableAI v0.10 introduces significant performance improvements, enhanced data preprocessing capabilities, and a more robust logging system.

New Features

Dask Integration for Large Datasets

  • Added support for Dask DataFrames to handle larger-than-memory datasets efficiently.
  • Implemented _preprocess_data_dask method for parallel data preprocessing.

Enhanced analyze Function

  • Added support for batch processing and parallel execution:
    • batch_size: Allows processing of large datasets in smaller chunks. Default is None (process all data at once).
    • parallel: Enables parallel processing of batches using multiprocessing. Default is False.
    • instance_index: Specifies the index of a particular instance for detailed interpretation. Default is 0.
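
The batching and parallel options described above can be pictured with a minimal, dependency-free sketch. It is not the library's implementation: `iter_batches`, `summarize_batch`, and `analyze_in_batches` are hypothetical names, and the per-batch work is a stand-in for whatever `analyze` actually computes.

```python
from concurrent.futures import ProcessPoolExecutor


def iter_batches(rows, batch_size=None):
    """Yield consecutive slices of `rows`; batch_size=None yields everything at once."""
    if batch_size is None:
        yield rows
        return
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer or None")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def summarize_batch(batch):
    """Hypothetical per-batch work: return (row count, sum of values)."""
    return len(batch), sum(batch)


def analyze_in_batches(rows, batch_size=None, parallel=False):
    """Run summarize_batch over every batch, optionally in worker processes."""
    batches = list(iter_batches(rows, batch_size))
    if parallel:
        # Worker processes need a picklable, module-level function.
        with ProcessPoolExecutor() as pool:
            partials = list(pool.map(summarize_batch, batches))
    else:
        partials = [summarize_batch(batch) for batch in batches]
    # Combine the per-batch partial results into one overall result.
    total_rows = sum(count for count, _ in partials)
    total_sum = sum(value for _, value in partials)
    return total_rows, total_sum
```

On platforms that start worker processes by spawning a fresh interpreter (Windows, and macOS by default), a call with parallel=True should sit under an `if __name__ == "__main__":` guard.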

Enhanced Logging

  • Implemented a more comprehensive logging system using Python's logging module.
  • Added colorized console output for better readability using the colorama library.
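
As a rough illustration of level-based colorized logging, the sketch below uses only the standard library. It writes raw ANSI escape codes instead of importing colorama, and `ColorFormatter` and `build_logger` are hypothetical names, not part of ExplainableAI.

```python
import logging

# Raw ANSI escape codes keep this sketch dependency-free; colorama exposes
# the same sequences as Fore.CYAN, Fore.YELLOW, Fore.RED and Style.RESET_ALL.
RESET = "\033[0m"
LEVEL_COLORS = {
    logging.DEBUG: "\033[36m",     # cyan
    logging.INFO: "\033[32m",      # green
    logging.WARNING: "\033[33m",   # yellow
    logging.ERROR: "\033[31m",     # red
    logging.CRITICAL: "\033[35m",  # magenta
}


class ColorFormatter(logging.Formatter):
    """Wrap each formatted record in the color assigned to its level."""

    def format(self, record):
        message = super().format(record)
        color = LEVEL_COLORS.get(record.levelno, "")
        return f"{color}{message}{RESET}" if color else message


def build_logger(name="xai_demo", level=logging.INFO):
    """Return a logger with one colorized console handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(ColorFormatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```

On older Windows consoles that do not interpret ANSI codes, calling colorama's `init()` once at startup is the usual way to make these sequences render.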

Expanded Documentation

  • Created a new /doc directory for additional documentation:
    • API reference guide
    • User guide with detailed explanations and best practices
    • Installation and setup instructions

Use Cases

  • Added an /examples directory showcasing various use cases:
    • Small code snippets for quick start
    • Comprehensive examples of ExplainableAI in larger projects
    • Jupyter notebooks demonstrating step-by-step analysis

Improvements

Core Functionality

  • Refactored XAIWrapper class for improved performance and modularity.
  • Enhanced error handling and added more informative error messages.

Data Preprocessing

  • Improved categorical and numerical feature handling in the preprocessing pipeline.
  • Added support for handling missing values and outliers.
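
The release notes do not say which strategies the pipeline uses for missing values and outliers. As one common combination, the sketch below assumes median imputation and clipping to the 1.5 × IQR fences; both function names are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest bound."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]
```

Imputing before clipping keeps the quartile computation free of missing entries; whether clipping or dropping outliers is preferable depends on the dataset.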

Model Comparison

  • Enhanced model comparison functionality with more detailed metrics.
  • Improved selection of the best model based on cross-validation scores.
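
Selecting a best model from cross-validation scores can be sketched as picking the highest mean fold score. This is an assumption about the selection rule, not a description of the library's internals, and `select_best_model` is a hypothetical helper.

```python
from statistics import mean


def select_best_model(cv_scores):
    """Return (name, mean score) for the model with the highest mean CV score.

    `cv_scores` maps a model name to its per-fold scores, where higher is better.
    """
    if not cv_scores:
        raise ValueError("cv_scores must contain at least one model")
    averages = {name: mean(scores) for name, scores in cv_scores.items()}
    best_name = max(averages, key=averages.get)
    return best_name, averages[best_name]
```

A rule based only on the mean ignores variance across folds; when two models score closely, comparing their fold-to-fold spread is a reasonable tiebreaker.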

Visualization

  • Added new visualization options, including correlation heatmaps.
  • Improved existing plots for better interpretability.

Report Generation

  • Enhanced PDF report generation with more customizable options.
  • Added ability to selectively include sections in the generated report.

Exploratory Data Analysis (EDA)

  • Implemented a new perform_eda method in XAIWrapper for quick dataset insights.
  • Added correlation analysis and outlier detection to EDA process.
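
For readers unfamiliar with the correlation step of EDA, the sketch below computes pairwise Pearson coefficients in plain Python. It illustrates the quantity behind a correlation heatmap, not how perform_eda is implemented; `pearson` and `correlation_matrix` are hypothetical names.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each pair of column names to its Pearson correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}
```

With a pandas DataFrame the same matrix comes from `df.corr()`, which is the more likely route in practice.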

Bug Fixes

  • Fixed issues related to feature importance calculation and visualization.
  • Resolved compatibility issues with the latest versions of dependencies.

Performance Optimization

  • Implemented more efficient data handling techniques for large datasets.
  • Optimized SHAP value calculations and other computationally intensive operations.

Installation

pip install explainableai==0.10

Usage

from explainableai import XAIWrapper
import pandas as pd

# Load your dataset
df = pd.read_csv('your_dataset.csv')
X = df.drop(columns=['target_column'])
y = df['target_column']

# Initialize XAIWrapper
xai = XAIWrapper()

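# Fit and analyze models
# NOTE: `models` must be defined before this call; this release note does not
# show how to construct it (see the /doc directory for the expected format).
xai.fit(models, X, y)
results = xai.analyze(batch_size=100, parallel=False, instance_index=0)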

# Generate a comprehensive report
xai.generate_report('analysis_report.pdf')

# Make and explain predictions
new_data = {...} # Dictionary of feature values
prediction, probabilities, explanation = xai.explain_prediction(new_data)

Analyze with batch processing and parallel execution

The call below will:

  • Process the data in batches of 1000 samples
  • Use parallel processing for faster computation
  • Provide a detailed interpretation of the instance at index 42 (the 43rd instance, since indexing is 0-based)

xai = XAIWrapper()
xai.fit(models, X, y)

results = xai.analyze(batch_size=1000, parallel=True, instance_index=42)

Breaking Changes

  • The analyze method now supports batch processing and parallel execution options.
  • Some internal method signatures have been updated to accommodate new features.

We encourage users to update to this version for improved performance and new capabilities. As always, please report any issues or suggestions through our GitHub issue tracker.

For more detailed information, refer to the documentation in the /doc directory and explore the ExplainableAI use cases in the /examples directory.