Our J-Divergence test is defined under the following null hypothesis:

H0: The predictive power of the variable is not significant.

The null hypothesis is tested against a two-tailed distribution, which should be taken into account when interpreting the p-value.
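Since the test is two-tailed, a variable is typically kept when its p-value falls below the chosen significance level. A minimal sketch of that decision rule (the threshold and p-value here are made-up illustrative numbers, not output of the package):

```python
# Decision rule for the test: reject H0 (i.e., treat the variable as
# having significant predictive power) when the p-value is below alpha.
alpha = 0.05      # chosen significance level
p_value = 0.03    # illustrative two-tailed p-value for one variable

significant = p_value < alpha
print("significant predictor" if significant else "fail to reject H0")
```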
Optimize your machine learning models with 'Statistical-IV'. Perform automated, statistics-based feature selection with customizable error control.
- Import the package:

  ```python
  from statistical_iv import api
  ```
- Provide a DataFrame as Input:
  - Supply a DataFrame `df` containing your data for IV calculation.
- Specify Predictor Variables:
  - Provide a list of predictor variable names (`variables_names`) to analyze.
- Define the Target Variable:
  - Specify the name of the target variable (`var_y`) in your DataFrame.
- Indicate Variable Types:
  - Define the type of your predictor variables as 'categorical' or 'numerical' using the `type_vars` parameter.
- Optional: Set Maximum Bins:
  - Adjust the maximum number of bins for discretization (optional) using the `max_bins` parameter.
- Call the `statistical_iv` Function:
  - Calculate the Statistical IV information by calling the `statistical_iv` function from `api` with the specified parameters (these are passed on to the OptimalBinning package):

  ```python
  result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)
  ```
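Putting the steps together, the inputs might be assembled as below. This is a toy, made-up frame: the column names, the list form of `type_vars`, and the shape of the final call are illustrative assumptions, so the call itself is left commented for readers without the package installed:

```python
import pandas as pd

# Toy data: two predictors and a binary target (illustrative values only).
df = pd.DataFrame({
    "age":     [25, 40, 33, 52, 29, 61],        # numerical predictor
    "segment": ["a", "b", "a", "b", "a", "b"],  # categorical predictor
    "default": [0, 1, 0, 1, 0, 1],              # binary target
})

variables_names = ["age", "segment"]      # predictors to analyze
var_y = "default"                         # target column name
type_vars = ["numerical", "categorical"]  # assumed: one type per predictor
max_bins = 5                              # cap on discretization bins

# from statistical_iv import api
# result_df = api.statistical_iv(df, variables_names, var_y, type_vars, max_bins)
```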
For a comprehensive exploration of the topic, we recommend reading the article available at this link.