DISCLAIMER: This application is used for demonstrative and illustrative purposes only. There is no representation as to the accuracy of the output of this application and it is presented without warranty.
Click on the link above to try out the sample app for yourself! Note that the application classifies product descriptions based on the input data.
This app was also featured in an article on Medium.com which explains the process used and potential business applications.
This application was built to demonstrate IBM's Watson Natural Language Classifier (NLC). The training data consists of cleaned Flipkart and JCPenney datasets made available through Kaggle. It contains product descriptions and category labels.
One of the biggest advantages of operating a store online rather than as a brick-and-mortar business is the efficiency gained by reducing physical inventory and labor. Watson Natural Language Classifier allows an online marketplace to further optimize its inventory management and cataloging system by classifying products based on their descriptions. Like brick-and-mortar stores, e-commerce retailers need to display products in the correct section of the store to give users a good experience. Watson's Natural Language Classifier API allows this to be done with high confidence and on a continual basis.
For demonstration purposes, the app accepts both raw text descriptions and URLs of products offered by Kohl's. When given a URL, the application extracts the product description from the page and sends it through Watson. Kohl's was chosen because its products are similar in nature to those on Flipkart and JCPenney, which were used in training. Although the products themselves are similar, there are many nuanced differences in word choice and usage. Watson's ability to understand and analyze the meanings of words, as opposed to simply memorizing them, is what sets its capabilities apart from other machine learning tools.
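As a rough illustration of how the two input types might be told apart, here is a minimal sketch. The helper names (`is_product_url`, `route_input`) and the `kohls.com` host check are assumptions made for this example and are not taken from the app's source; extracting the description from the page is not shown.

```python
from urllib.parse import urlparse

# Assumption: product pages live on kohls.com or one of its subdomains.
SUPPORTED_HOST = "kohls.com"


def is_product_url(user_input: str) -> bool:
    """Return True when the input looks like an http(s) URL on the supported host."""
    try:
        parsed = urlparse(user_input.strip())
    except ValueError:
        # Malformed URLs are treated as plain text.
        return False
    host = (parsed.hostname or "").lower()
    on_supported_host = host == SUPPORTED_HOST or host.endswith("." + SUPPORTED_HOST)
    return parsed.scheme in ("http", "https") and on_supported_host


def route_input(user_input: str) -> str:
    """Label the input so the caller knows whether to fetch a page or classify directly."""
    return "url" if is_product_url(user_input) else "text"
```

Anything that is not recognized as a supported URL falls through to the raw-text path, so a plain description is never mistaken for a link.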
This application is a Python web application based on the Flask microframework. It uses the Watson Python SDK to create the classifier, list classifiers, and classify the input text.
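To give a sense of what the app has to do with a classification result, the sketch below picks the best category out of a classify response. It assumes the response has the documented NLC shape (a `classes` list whose entries carry `class_name` and `confidence`); the function name `top_category`, the 0.5 threshold, and the sample values are made up for illustration and do not come from this project.

```python
def top_category(response: dict, min_confidence: float = 0.5):
    """Return (class_name, confidence) for the best class, or None if none is confident enough."""
    classes = response.get("classes", [])
    if not classes:
        return None
    best = max(classes, key=lambda c: c["confidence"])
    if best["confidence"] < min_confidence:
        return None
    return best["class_name"], best["confidence"]


# Made-up response, for illustration only.
sample_response = {
    "classifier_id": "example-classifier-id",
    "text": "Soft cotton crew-neck tee",
    "top_class": "Clothing",
    "classes": [
        {"class_name": "Clothing", "confidence": 0.93},
        {"class_name": "Home", "confidence": 0.07},
    ],
}

print(top_category(sample_response))
```

Returning `None` below the threshold lets the caller show an "uncertain" message instead of a low-confidence category.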
Let's get started! Along with these instructions, a video tutorial is also available.
Here we create the classifier with our product description dataset.
- Download the product description training dataset by right-clicking the link and selecting Save As.
- Create an NLC service in IBM Cloud. Make a note of the service name used in the catalog; we'll need this later.
- Create service credentials by using the menu on the left and selecting the default options.
- Upload the data using the command below. Be sure to substitute the username and password. This will take some time.
curl -i -u {username}:{password} -F training_data=@data/hierarchy_product_description_training.csv -F training_metadata="{\"language\":\"en\",\"name\":\"hierarchy_product_description_training\"}" "https://gateway.watsonplatform.net/natural-language-classifier/api/v1/classifiers"
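Before uploading, it can help to sanity-check the training file and to generate the `training_metadata` JSON rather than escaping it by hand. The sketch below is one way to do that. It assumes each CSV row holds the description text first and a category label second; the function names are hypothetical.

```python
import csv
import io
import json


def build_training_metadata(name: str, language: str = "en") -> str:
    """Build the JSON string passed as training_metadata in the upload request."""
    return json.dumps({"language": language, "name": name})


def check_training_rows(csv_text: str):
    """Return (row_count, problems), where problems lists 1-based rows missing text or a label.

    Assumption: column 1 is the description text and column 2 is the label.
    """
    problems = []
    count = 0
    for count, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            problems.append(count)
    return count, problems
```

For example, `build_training_metadata("hierarchy_product_description_training")` produces the same metadata used in the curl command above, without the backslash escaping.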
This application can be run locally or hosted on IBM Cloud. Follow the steps below for your deployment choice.
- Clone this project:
git clone [email protected]:erichensleyibm/NLC_product_classifier-demo.git
- cd into this project's root directory
- (Optionally) create a virtual environment:
virtualenv my-nlc-demo
- Activate the virtual environment:
source my-nlc-demo/bin/activate
- Run pip install -r requirements.txt to install the app's dependencies
- Add your NLC credentials:
  - Update welcome.py with your NLC credentials hardcoded
  - OR add a file named _config.py with your credentials in the same folder as welcome.py
- Run python welcome.py
- Access the running app in a browser at http://localhost:5000
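The optional `_config.py` file mentioned in the steps above could be read with a small loader like the one sketched here. This is an assumption about how such a file might be consumed, not a description of what welcome.py actually does; the attribute names `NLC_USERNAME` and `NLC_PASSWORD` are hypothetical.

```python
import importlib


def load_nlc_credentials(module_name: str = "_config"):
    """Return credentials from the optional config module, or None if the module is absent.

    Assumption: the module defines NLC_USERNAME and NLC_PASSWORD.
    """
    try:
        config = importlib.import_module(module_name)
    except ImportError:
        # No config file present; the caller can fall back to hardcoded values.
        return None
    return {
        "username": getattr(config, "NLC_USERNAME", None),
        "password": getattr(config, "NLC_PASSWORD", None),
    }
```

Keeping credentials in a separate file makes it easier to exclude them from version control than hardcoding them in the application module.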
- Clone this project:
git clone [email protected]:erichensleyibm/NLC_product_classifier-demo.git
- cd into this project's root directory
- Update manifest.yml with the NLC service name (your_nlc_service_name), a unique application name (your_app_name), and a unique host value (your_app_host):

    applications:
    - path: .
      memory: 256M
      instances: 1
      domain: mybluemix.net
      name: your_app_name
      host: your_app_host
      disk_quota: 1024M
      services:
      - your_nlc_service_name
      buildpack: python_buildpack
- Run bluemix login from the root directory
- Run bluemix target -o <organization> -s <space>
  - IBM or other employees using a federated login will need to use bluemix login --sso and supply the temporary password obtained by following the link given
- Run bluemix app push
- Access the running app by going to: https://<host-value>.mybluemix.net/
If you've never run the bluemix command before, some configuration is required. Refer to the official IBM Cloud CLI docs to get this set up.
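When an app is pushed with a service listed under `services` in manifest.yml, Cloud Foundry exposes the bound service's credentials to the app through the `VCAP_SERVICES` environment variable. The sketch below shows one way to look them up by service instance name. Whether this app reads its credentials this way is an assumption; the function name is hypothetical and the sample values in the usage note are placeholders.

```python
import json
import os


def credentials_from_vcap(service_name, vcap_json=None):
    """Return the credentials dict for the named service instance, or None if not bound.

    Reads VCAP_SERVICES from the environment unless a JSON string is passed in.
    """
    raw = vcap_json if vcap_json is not None else os.environ.get("VCAP_SERVICES", "{}")
    services = json.loads(raw)
    # VCAP_SERVICES maps each service label to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == service_name:
                return instance.get("credentials")
    return None
```

Calling `credentials_from_vcap("your_nlc_service_name")` inside the deployed app would return the credentials for the service named in manifest.yml, and `None` when run locally without that variable set.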