Various tasks and methods used in Information Retrieval

bhshri/Information-Retrieval

Preprocessing Text

Cleaning and preprocessing text is a prerequisite for most IR and NLP tasks. The text was cleaned by removing tags and punctuation, followed by stopword removal, stemming, and lemmatization.
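A minimal sketch of such a cleaning pipeline, using only the standard library. The stopword list and the `naive_stem` helper are illustrative assumptions, not the repository's code; a real pipeline would more likely rely on a full stopword list and an established stemmer or lemmatizer (lemmatization is not shown here because it needs a dictionary resource).

```python
import re
import string

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "and", "in", "to"}

def naive_stem(token):
    """Strip a few common English suffixes. A toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Return a list of cleaned, stemmed tokens."""
    text = re.sub(r"<[^>]+>", " ", text)  # drop HTML/XML tags
    text = text.lower()
    # Replace every punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    tokens = text.translate(table).split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [naive_stem(t) for t in tokens]

print(preprocess("<p>The cats are chasing the dogs!</p>"))
# ['cat', 'chas', 'dog']
```

The truncated stem `chas` shows why crude suffix stripping is only a placeholder for a proper stemmer.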

TF-IDF

Text representation is an important step in IR and NLP tasks. A TF-IDF representation was implemented from scratch on a set of documents and compared with the scikit-learn implementation.
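One way a from-scratch version could look is sketched below. It assumes the smoothed IDF, `ln((1 + n) / (1 + df)) + 1`, and L2 normalisation that scikit-learn's `TfidfVectorizer` uses by default, so that the two outputs are comparable. The function name `tfidf` and the input format (pre-tokenized documents) are assumptions.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists. Returns (vocabulary, L2-normalised vectors)."""
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in docs for t in set(doc))
    # Smoothed IDF, matching scikit-learn's default (smooth_idf=True).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard empty docs
        vectors.append([v / norm for v in vec])
    return vocab, vectors

vocab, vectors = tfidf([["cat", "sat"], ["cat", "ran"]])
print(vocab)       # ['cat', 'ran', 'sat']
print(vectors[0])
```

In the first document, `sat` receives a higher weight than `cat` because `cat` occurs in both documents.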

Word2Vec Representation

Document retrieval using Skip-gram and CBOW word representations, evaluated with precision, recall, and F1 score.
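A rough sketch of the retrieval and evaluation steps, assuming that documents and queries are represented by the average of their word vectors and ranked by cosine similarity. The averaging strategy, the function names, and the `embeddings` dictionary (which in practice would come from a trained Skip-gram or CBOW model) are assumptions made for illustration.

```python
import math

def average_vector(tokens, embeddings):
    """Mean of the word vectors of all in-vocabulary tokens, or None."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query_tokens, docs, embeddings):
    """docs: {doc_id: token list}. Returns doc ids, most similar first."""
    q = average_vector(query_tokens, embeddings)
    scores = {}
    for doc_id, tokens in docs.items():
        d = average_vector(tokens, embeddings)
        scores[doc_id] = cosine(q, d) if q and d else 0.0
    return sorted(scores, key=scores.get, reverse=True)

def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1
```

For example, retrieving `d1`, `d2`, `d3` when `d1` and `d4` are relevant gives a precision of 1/3, a recall of 1/2, and an F1 of 0.4.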

LSI

Implementation of LSI (Latent Semantic Indexing) on a set of documents using SVD, with document retrieval tested using the cosine similarity measure.
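The idea can be sketched with NumPy as follows. This is one common variant, not necessarily the one used in the repository: the term-document matrix is factorised with a truncated SVD, and both documents and the query are projected onto the top `k` left singular vectors before computing cosine similarity. The function names and the toy matrix are assumptions.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """term_doc: (terms x docs) matrix. Returns (U_k, document vectors)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    # Documents in the k-dimensional latent space, one row per document.
    doc_vecs = (np.diag(s[:k]) @ Vt[:k, :]).T
    return U_k, doc_vecs

def lsi_rank(query_vec, U_k, doc_vecs):
    """Project the query into the latent space and rank documents by cosine."""
    q = U_k.T @ query_vec
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q)
    sims = (doc_vecs @ q) / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims

# Toy data. Rows: terms (car, auto, flower). Columns: documents d0, d1, d2.
A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
U_k, doc_vecs = lsi_fit(A, k=2)
order, sims = lsi_rank(np.array([1.0, 0.0, 0.0]), U_k, doc_vecs)  # query: "car"
print(order, sims)
```

In this toy example, document `d1` contains only `auto`, yet it scores highly for the query `car` because the two terms co-occur in `d0`. That is the behaviour LSI is meant to capture.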

YASS Stemmer

Stemming is implemented by agglomerative clustering of words, using various string distance measures. Reference: https://dl.acm.org/doi/10.1145/1281485.1281489
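The sketch below illustrates the approach with one of the distance measures from the linked paper (D3, as I understand its definition) and a naive complete-linkage clustering. With `m` the position of the first mismatch and `n` the last index after padding the shorter string, D3 is `(n - m + 1) / m` times the sum of `1 / 2^(i - m)` for `i` from `m` to `n`. The function names and the threshold value are assumptions chosen for illustration.

```python
import math

def yass_d3(x, y):
    """D3 string distance: small when two words share a long common prefix."""
    n = max(len(x), len(y)) - 1  # last index after padding
    x, y = x.ljust(n + 1, "\0"), y.ljust(n + 1, "\0")
    m = next((i for i in range(n + 1) if x[i] != y[i]), None)
    if m is None:
        return 0.0               # identical strings
    if m == 0:
        return math.inf          # no common prefix at all
    tail = sum(1 / 2 ** (i - m) for i in range(m, n + 1))
    return (n - m + 1) / m * tail

def complete_linkage(words, threshold):
    """Merge the closest clusters until the best distance exceeds threshold."""
    clusters = [[w] for w in words]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(yass_d3(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] += clusters.pop(j)
    return clusters

words = ["astronomer", "astronomy", "astronomical", "cat", "cats"]
print(complete_linkage(words, threshold=1.5))
```

Each resulting cluster is treated as one equivalence class, and a representative of the class (for example, the shared prefix) can then serve as the stem. The pairwise loop is cubic in the number of words, so it is only suitable for small vocabularies.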

Query Expansion and Relevance feedback

Query-based document retrieval was evaluated with query expansion (adding synonyms of the query words) and relevance feedback (the Rocchio algorithm).
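Both steps can be sketched in a few lines. The Rocchio update moves the query vector towards the centroid of the relevant documents and away from the centroid of the non-relevant ones: `q' = alpha * q + beta * mean(relevant) - gamma * mean(non_relevant)`. The weights below (1.0, 0.75, 0.15) are commonly quoted defaults, and the `synonyms` dictionary is a hypothetical stand-in for a thesaurus lookup; neither is taken from the repository.

```python
from collections import defaultdict

def expand_query(tokens, synonyms):
    """Append synonyms of each query token, keeping the original order."""
    expanded = list(tokens)
    for t in tokens:
        for s in synonyms.get(t, []):
            if s not in expanded:
                expanded.append(s)
    return expanded

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Vectors are {term: weight} dicts. Returns the updated query vector."""
    new_q = defaultdict(float)
    for term, w in query.items():
        new_q[term] += alpha * w
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        for doc in docs:
            for term, w in doc.items():
                new_q[term] += coeff * w / len(docs)  # centroid contribution
    # Negative weights are clipped, as is usual in practice.
    return {t: w for t, w in new_q.items() if w > 0}

print(expand_query(["car"], {"car": ["auto", "automobile"]}))
# ['car', 'auto', 'automobile']
print(rocchio({"car": 1.0}, [{"car": 1.0, "auto": 1.0}], [{"flower": 1.0}]))
# {'car': 1.75, 'auto': 0.75}
```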

Question Answering

Question answering with an unsupervised approach based on word2vec representations, evaluated with Exact Match and F1 score.
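The two metrics could be computed as sketched below. The normalisation (lowercasing, removing punctuation and English articles) follows the convention popularised by SQuAD-style evaluation; whether the repository normalises answers the same way is an assumption.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred_tokens)
    recall = common / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower.", "eiffel tower"))  # 1.0
print(token_f1("in Paris France", "Paris"))              # 0.5
```

Exact Match only rewards answers that are identical after normalisation, while token F1 gives partial credit for overlapping words.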

Text Summarization

Extractive text summarization using TextRank and LexRank, evaluated with ROUGE-1 and ROUGE-2 scores.
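A simplified sketch of TextRank sentence scoring and a recall-oriented ROUGE-N follows. Sentences are assumed to be pre-tokenized; the similarity is the word-overlap measure from the original TextRank formulation, and the damping factor and iteration count are conventional choices rather than values from the repository. LexRank follows the same graph-ranking idea but typically uses TF-IDF cosine similarity between sentences instead.

```python
import math
from collections import Counter

def similarity(s1, s2):
    """Shared words divided by the sum of the log sentence lengths."""
    if not s1 or not s2:
        return 0.0
    denom = math.log(len(s1)) + math.log(len(s2))
    return len(set(s1) & set(s2)) / denom if denom > 0 else 0.0

def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence similarity graph."""
    n = len(sentences)
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall: fraction of reference n-grams found in the candidate."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    return sum((cand & ref).values()) / total if total else 0.0

print(rouge_n("the cat sat".split(), "the cat sat down".split(), n=1))  # 0.75
```

The highest-scoring sentences, taken in their original order, form the extractive summary.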

Text Classification

Multiclass text classification with an SVM, using TF-IDF and word2vec representations.
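A minimal sketch of the TF-IDF variant with scikit-learn is shown below. The choice of `LinearSVC` (which handles multiple classes with a one-vs-rest scheme) and the toy training data are assumptions; the repository may use a different SVM variant, kernel, or dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up toy data, for illustration only.
texts = [
    "the team won the football match",
    "the striker scored two goals",
    "the parliament passed a new law",
    "the minister announced an election",
    "the new phone has a faster processor",
    "the laptop ships with more memory",
]
labels = ["sports", "sports", "politics", "politics", "tech", "tech"]

# Vectorise with TF-IDF, then fit a linear SVM on the resulting features.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

prediction = model.predict(["the striker scored a late goal"])[0]
print(prediction)
```

For the word2vec variant, the `TfidfVectorizer` step would be replaced by a feature matrix of document vectors, for example the average of each document's word embeddings.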

Text Classification using an Ensemble-Based Approach

Multiclass text classification using stacking and voting classifiers, built from an ensemble of Multinomial Naive Bayes, Logistic Regression, and Random Forest models.
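The two ensembles could be assembled with scikit-learn roughly as follows. The TF-IDF features, the hard-voting mode, the logistic regression meta-learner, the hyperparameters, and the toy data are all assumptions made to keep the sketch self-contained.

```python
from sklearn.ensemble import (RandomForestClassifier, StackingClassifier,
                              VotingClassifier)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def base_estimators():
    """Fresh copies of the three base models named in the text."""
    return [
        ("nb", MultinomialNB()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ]

# Voting: each base model casts one vote and the majority class wins.
voting = make_pipeline(
    TfidfVectorizer(),
    VotingClassifier(base_estimators(), voting="hard"),
)

# Stacking: a meta-learner is trained on cross-validated base predictions.
# cv=2 only because the toy data set below is tiny.
stacking = make_pipeline(
    TfidfVectorizer(),
    StackingClassifier(base_estimators(),
                       final_estimator=LogisticRegression(max_iter=1000),
                       cv=2),
)

# Made-up toy data, for illustration only.
texts = [
    "the team won the football match", "the striker scored two goals",
    "the coach praised the goalkeeper", "the parliament passed a new law",
    "the minister announced an election", "the senate debated the new bill",
    "the new phone has a faster processor", "the laptop ships with more memory",
    "the update improves battery life",
]
labels = ["sports"] * 3 + ["politics"] * 3 + ["tech"] * 3

for model in (voting, stacking):
    model.fit(texts, labels)
    print(model.predict(["the goalkeeper saved a penalty"])[0])
```

Voting combines the base predictions with a fixed rule, whereas stacking learns how much to trust each base model, at the cost of extra cross-validated training.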
