- Create a knowledge base for each person / team based on previous issues
- Suggest / recommend the best person / team when a new request / incident comes in
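
As a rough illustration of these two goals, the sketch below keeps a per-person count of the tags on issues they resolved and ranks people by how well those counts match a new issue. The function names (`build_knowledge_base`, `recommend`), the scoring rule (sum of tag counts) and the sample history are assumptions made for illustration, not the project's actual implementation.

```python
from collections import Counter, defaultdict


def build_knowledge_base(resolved_issues):
    """Map each person to a Counter of the tags on issues they resolved."""
    knowledge_base = defaultdict(Counter)
    for person, tags in resolved_issues:
        knowledge_base[person].update(tags)
    return knowledge_base


def recommend(knowledge_base, new_issue_tags, top_n=1):
    """Rank people by how often they handled the tags of the new issue."""
    scores = {
        person: sum(tag_counts[tag] for tag in new_issue_tags)
        for person, tag_counts in knowledge_base.items()
    }
    # Highest score first; ties are broken alphabetically by name.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    # People with no matching tag at all are never recommended.
    return [person for person, score in ranked[:top_n] if score > 0]


# Hypothetical history: (person, tags of an issue they answered).
history = [
    ("alice", ["python", "pandas"]),
    ("alice", ["python", "csv"]),
    ("bob", ["sql", "database"]),
    ("bob", ["python", "database"]),
]
kb = build_knowledge_base(history)
print(recommend(kb, ["python", "pandas"]))  # ['alice']
```

Summing raw counts favours people who have answered many issues; normalising by each person's total would be one alternative if that bias is unwanted.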
https://www.kaggle.com/datasets/stackoverflow/stacksample
> pip install pipenv
> pipenv install
- spaCy - natural language processing
- Whoosh - full text search
- Pandas - data management
- Dash - visualization
- Load CSV data into a pandas DataFrame
- Preprocessing: remove stop words
- Tokenize
- Get top words
- Create labels from top words
- Tag ticket with multiple labels
- Find patterns in the combinations of labels that occur together
- Create a knowledge base for each person based on the issues they answered and the tags on those issues
- Analytics: get top words over time (per day) for trending
- Admin page: if a ticket has no label, it is a new / emerging issue
- If a label is assigned, check it against the baseline (1.5 * mean) within a day-limit or week-limit, and raise a flag when it crosses the baseline
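
The last two steps can be sketched as a small check. This is one possible reading of the rule, assuming "baseline" means 1.5 times the mean of a label's past daily counts and that a flag is raised when today's count is above it. The names `exceeds_baseline` and `review_ticket`, the `factor` parameter and the sample counts are hypothetical.

```python
from statistics import mean


def exceeds_baseline(history_counts, current_count, factor=1.5):
    """Return True when current_count is above factor * mean(history_counts)."""
    if not history_counts:
        # Assumption: without history there is no baseline to compare against.
        return False
    return current_count > factor * mean(history_counts)


def review_ticket(labels, daily_history, today_counts, factor=1.5):
    """Classify a ticket for the admin page.

    labels        -- labels assigned to the ticket (may be empty)
    daily_history -- dict: label -> list of past per-day counts
    today_counts  -- dict: label -> count seen so far today
    """
    if not labels:
        # No label at all: treat it as a new / emerging issue.
        return {"emerging": True, "flagged": []}
    flagged = [
        label
        for label in labels
        if exceeds_baseline(
            daily_history.get(label, []), today_counts.get(label, 0), factor
        )
    ]
    return {"emerging": False, "flagged": flagged}


# Hypothetical per-day counts for two labels.
daily_history = {"python": [10, 12, 8], "sql": [5, 5, 5]}
today_counts = {"python": 20, "sql": 6}

print(review_ticket(["python", "sql"], daily_history, today_counts))
# {'emerging': False, 'flagged': ['python']}
print(review_ticket([], daily_history, today_counts))
# {'emerging': True, 'flagged': []}
```

A week-limit variant would pass per-week counts in `daily_history` and `today_counts` instead; the comparison itself stays the same.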
Pipfile
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"
[packages]
dash = "*"
spacy = "*"
whoosh = "*"
pandas = "*"
[dev-packages]
[requires]
python_version = "3.9"
main.py
import pandas
from spacy.lang.en import English
# loading dataset
questions_dataframe = pandas.read_csv('dataset/QuestionsHead.csv')
nlp = English()
# Remove stop words and non-alphabetic tokens (e.g. HTML tag fragments) from a text
def remove_stopwords(text):
    if text:
        # Lowercase first so tokens match the lowercase stop-word list
        doc = nlp(text.lower())
        result = [token.text for token in doc if (token.text not in nlp.Defaults.stop_words) and (token.is_alpha or " " in token.lemma_)]
        return " ".join(result)
    else:
        # Empty strings are returned unchanged
        return text
questions_dataframe["processed_title"] = questions_dataframe["Title"].fillna('').apply(remove_stopwords)
print(questions_dataframe[["Title", "processed_title"]].head())
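
main.py stops after preprocessing. The next steps from the list (get top words, create labels, tag each ticket with multiple labels, count labels that occur together) could look roughly like the sketch below. It uses only the standard library and a few hypothetical already-processed titles; in main.py the input would be the values of `questions_dataframe["processed_title"]`. The helper names `top_words`, `tag_title` and `label_pairs` are assumptions, not part of the repository.

```python
from collections import Counter
from itertools import combinations


def top_words(processed_titles, n=3):
    """Return the n most frequent words across all processed titles."""
    counts = Counter(
        word for title in processed_titles for word in title.split()
    )
    return [word for word, _ in counts.most_common(n)]


def tag_title(processed_title, labels):
    """Return every label that appears as a word in the title."""
    words = set(processed_title.split())
    return [label for label in labels if label in words]


def label_pairs(tagged_titles):
    """Count how often each pair of labels occurs on the same title."""
    pairs = Counter()
    for tags in tagged_titles:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        pairs.update(combinations(sorted(tags), 2))
    return pairs


# Hypothetical titles, already lowercased and stripped of stop words.
titles = [
    "python pandas dataframe merge",
    "python pandas read csv",
    "sql join tables",
    "python list sort",
]

labels = top_words(titles, n=2)
tagged = [tag_title(title, labels) for title in titles]
pairs = label_pairs(tagged)

print(labels)                # ['python', 'pandas']
print(tagged[2])             # [] -> no label, so a new / emerging issue
print(pairs.most_common(1))  # [(('pandas', 'python'), 2)]
```

Using the top words directly as labels keeps the sketch short; a real run on the full dataset would likely need a larger `n` and some manual pruning of uninformative words.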