Read my final report here!
The non-profit foundation Alphabet Soup wants to create an algorithm to predict whether or not applicants for funding will be successful. With your knowledge of machine learning and neural networks, you’ll use the features in the provided dataset to create a binary classifier that is capable of predicting whether applicants will be successful if funded by Alphabet Soup.
From Alphabet Soup’s business team, you have received a CSV containing more than 34,000 organizations that have received funding from Alphabet Soup over the years. Within this dataset are a number of columns that capture metadata about each organization, such as the following:
- EIN and NAME—Identification columns
- APPLICATION_TYPE—Alphabet Soup application type
- AFFILIATION—Affiliated sector of industry
- CLASSIFICATION—Government organization classification
- USE_CASE—Use case for funding
- ORGANIZATION—Organization type
- STATUS—Active status
- INCOME_AMT—Income classification
- SPECIAL_CONSIDERATIONS—Special consideration for application
- ASK_AMT—Funding amount requested
- IS_SUCCESSFUL—Was the money used effectively
Using your knowledge of Pandas and Scikit-Learn’s `StandardScaler()`, you’ll need to preprocess the dataset in order to compile, train, and evaluate the neural network model later in Step 2.
Using the information we have provided in the starter code, follow the instructions to complete the preprocessing steps.
- Read in the charity_data.csv to a Pandas DataFrame, and be sure to identify the following in your dataset:
  - What variable(s) are considered the target(s) for your model?
  - What variable(s) are considered the feature(s) for your model?
- Drop the `EIN` and `NAME` columns.
- Determine the number of unique values for each column.
- For those columns that have more than 10 unique values, determine the number of data points for each unique value.
- Use the number of data points for each unique value to pick a cutoff point to bin "rare" categorical variables together in a new value, `Other`, and then check if the binning was successful.
- Use `pd.get_dummies()` to encode categorical variables.
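The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the starter code: the DataFrame is a small hypothetical stand-in for `charity_data.csv` (same column names, made-up values), the cutoff of 2 is chosen only to fit the toy data, and the train/test split before scaling is an assumption about how `StandardScaler()` would typically be applied.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for charity_data.csv.
df = pd.DataFrame({
    "EIN": range(8),
    "NAME": [f"ORG {i}" for i in range(8)],
    "APPLICATION_TYPE": ["T3", "T3", "T3", "T4", "T4", "T4", "T9", "T12"],
    "ASK_AMT": [5000, 5000, 10000, 5000, 25000, 5000, 7500, 100000],
    "IS_SUCCESSFUL": [1, 0, 1, 1, 0, 1, 0, 1],
})

# Drop the identification columns, then inspect unique values per column.
df = df.drop(columns=["EIN", "NAME"])
print(df.nunique())

# Count rows per category and bin categories below the cutoff into "Other".
counts = df["APPLICATION_TYPE"].value_counts()
cutoff = 2  # assumption: tune this against the real value counts
rare = counts[counts < cutoff].index
df["APPLICATION_TYPE"] = df["APPLICATION_TYPE"].where(
    ~df["APPLICATION_TYPE"].isin(rare), "Other"
)
print(df["APPLICATION_TYPE"].value_counts())  # check the binning

# One-hot encode the categorical columns.
encoded = pd.get_dummies(df, dtype=int)

# Separate the target from the features.
y = encoded["IS_SUCCESSFUL"].to_numpy()
X = encoded.drop(columns=["IS_SUCCESSFUL"]).to_numpy()

# Split first, then fit the scaler on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training rows only keeps information from the test rows out of the scaling statistics.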
Using your knowledge of TensorFlow, you’ll design a neural network, or deep learning model, to create a binary classification model that can predict if an Alphabet Soup–funded organization will be successful based on the features in the dataset. You’ll need to think about how many inputs there are before determining the number of neurons and layers in your model. Once you’ve completed that step, you’ll compile, train, and evaluate your binary classification model to calculate the model’s loss and accuracy.
- Continue using the Jupyter Notebook where you’ve already performed the preprocessing steps from Step 1.
- Create a neural network model by assigning the number of input features and nodes for each layer using TensorFlow Keras.
- Create the first hidden layer and choose an appropriate activation function.
- If necessary, add a second hidden layer with an appropriate activation function.
- Create an output layer with an appropriate activation function.
- Check the structure of the model.
- Compile and train the model.
- Create a callback that saves the model's weights every 5 epochs.
- Evaluate the model using the test data to determine the loss and accuracy.
- Save and export your results to an HDF5 file, and name it `AlphabetSoupCharity.h5`.
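A rough sketch of the steps above, assuming TensorFlow 2.x with its bundled Keras. The random arrays are hypothetical stand-ins for the scaled training and test data from Step 1, and the layer sizes, batch size, epoch count, and temporary output directory are illustrative choices rather than recommended values. Because `ModelCheckpoint` counts an integer `save_freq` in batches, "every 5 epochs" is expressed here as five times the number of batches per epoch.

```python
import glob
import math
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in data; replace with the scaled arrays from Step 1.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, 8)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")
X_test = rng.normal(size=(16, 8)).astype("float32")
y_test = (X_test[:, 0] > 0).astype("float32")

# Two hidden layers with ReLU and a sigmoid output for binary classification.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.summary()  # check the structure of the model

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

batch_size = 16
epochs = 10
steps_per_epoch = math.ceil(len(X_train) / batch_size)

# Assumption: write to a temporary directory; use any writable folder you like.
out_dir = tempfile.mkdtemp()
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=os.path.join(out_dir, "weights.{epoch:02d}.weights.h5"),
    save_weights_only=True,
    save_freq=5 * steps_per_epoch,  # integer save_freq is measured in batches
)

model.fit(
    X_train, y_train,
    epochs=epochs, batch_size=batch_size,
    callbacks=[checkpoint], verbose=0,
)

# Evaluate on the held-out data.
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Loss: {loss:.4f}, Accuracy: {accuracy:.4f}")

# Export the full model to HDF5.
final_path = os.path.join(out_dir, "AlphabetSoupCharity.h5")
model.save(final_path)

saved_checkpoints = sorted(glob.glob(os.path.join(out_dir, "weights.*.weights.h5")))
print(saved_checkpoints)
```

With 10 epochs, this configuration should write weight files after epochs 5 and 10. Newer Keras releases may warn that HDF5 is a legacy format when saving the full model.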
Using your knowledge of TensorFlow, optimize your model in order to achieve a target predictive accuracy higher than 75%. If you can't achieve an accuracy higher than 75%, you'll need to make at least three attempts to do so.
Optimize your model in order to achieve a target predictive accuracy higher than 75% by using any or all of the following:
- Adjusting the input data to ensure that there are no variables or outliers that are causing confusion in the model, such as:
  - Dropping more or fewer columns.
  - Creating more bins for rare occurrences in columns.
  - Increasing or decreasing the number of values for each bin.
- Adding more neurons to a hidden layer.
- Adding more hidden layers.
- Using different activation functions for the hidden layers.
- Adding or reducing the number of epochs to the training regimen.
- Create a new Jupyter Notebook file and name it `AlphabetSoupCharity_Optimzation.ipynb`.
- Import your dependencies, and read in the `charity_data.csv` to a Pandas DataFrame.
- Preprocess the dataset like you did in Step 1, taking into account any modifications to optimize the model.
- Design a neural network model, taking into account any modifications that will optimize the model to achieve higher than 75% accuracy.
- Save and export your results to an HDF5 file, and name it `AlphabetSoupCharity_Optimization.h5`.
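One way to organize the optimization attempts is a small helper that builds a model from a configuration, so that neurons, hidden layers, and activation functions can be varied without copying code. The helper name `build_model`, the feature count of 10, and the three configurations below are hypothetical and only illustrate the idea. They are not tuned values and make no claim about reaching 75% accuracy.

```python
import tensorflow as tf


def build_model(n_features, hidden_units=(16, 8), activation="relu"):
    """Build and compile a binary classifier (hypothetical helper).

    hidden_units holds one entry per hidden layer, each giving that
    layer's neuron count; activation is applied to every hidden layer.
    """
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_features,)))
    for units in hidden_units:
        model.add(tf.keras.layers.Dense(units, activation=activation))
    # A single sigmoid unit outputs the probability of IS_SUCCESSFUL = 1.
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model


# Three illustrative attempts: a baseline, more layers and neurons, and a
# different hidden-layer activation.
attempts = [
    {"hidden_units": (16, 8), "activation": "relu"},
    {"hidden_units": (32, 16, 8), "activation": "relu"},
    {"hidden_units": (16, 8), "activation": "tanh"},
]

# Assumption: 10 input features; use X_train_scaled.shape[1] on the real data.
models = [build_model(n_features=10, **config) for config in attempts]
for config, model in zip(attempts, models):
    print(config, "trainable parameters:", model.count_params())
```

Each model would then be trained and evaluated as in Step 2, changing the number of epochs or the preprocessing between attempts as needed.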
For this part of the Challenge, you’ll write a report on the performance of the deep learning model you created for AlphabetSoup.
The report should contain the following:
- Overview of the analysis: Explain the purpose of this analysis.
- Results: Using bulleted lists and images to support your answers, address the following questions.
  - Data Preprocessing
    - What variable(s) are considered the target(s) for your model?
    - What variable(s) are considered to be the features for your model?
    - What variable(s) are neither targets nor features, and should be removed from the input data?
  - Compiling, Training, and Evaluating the Model
    - How many neurons, layers, and activation functions did you select for your neural network model, and why?
    - Were you able to achieve the target model performance?
    - What steps did you take to try and increase model performance?
- Summary: Summarize the overall results of the deep learning model. Include a recommendation for how a different model could solve this classification problem, and explain your recommendation.
© 2021 Trilogy Education Services, a 2U, Inc. brand. All Rights Reserved.