Assignment 3 - Book Recommendation

The goal of the assignment is to build a book recommendation engine. The input query simply describes the kind of book we are looking for by specifying book title, author, etc., and the search engine returns similar books pulled from the Best Books Ever list of GoodReads.

Data collection

To this end, we have to build our own dataset, and the search engine has to run on text documents. The steps are:
- Get the list of books
- Crawl books
- Parse downloaded pages
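
The parsing step can be illustrated with a minimal sketch that uses only the Python standard library. It is not the code in scripts/: the element ids `bookTitle` and `description`, the helper names, and the sample page are assumptions made for illustration, and the real GoodReads markup may differ.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FieldExtractor(HTMLParser):
    """Collect the text inside elements whose id is in `wanted_ids`."""

    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.fields = {}
        self._current = None  # id of the element currently being read
        self._depth = 0       # nesting depth inside that element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids:
            self._current = element_id
            self._depth = 1
            self.fields[element_id] = ""

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def parse_book(html, wanted_ids=("bookTitle", "description")):
    """Return {id: text} for the wanted elements, with whitespace collapsed."""
    parser = FieldExtractor(wanted_ids)
    parser.feed(html)
    parser.close()
    return {key: " ".join(text.split()) for key, text in parser.fields.items()}


def to_tsv_row(fields, columns):
    """Join the chosen fields into one tab-separated line (missing -> empty)."""
    return "\t".join(fields.get(col, "") for col in columns)


# Hypothetical page, standing in for one downloaded HTML file.
sample_page = (
    '<html><body><h1 id="bookTitle">\n  Example Title\n</h1>'
    '<div id="description"><span>A short</span> <i>plot summary</i>.</div>'
    "</body></html>"
)
row = to_tsv_row(parse_book(sample_page), ["bookTitle", "description"])
```

Writing one TSV row per book keeps each document as plain text, which is what the search engine is then run on.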

Search Engine

Create two different search engines that, given a query, return a list of books that match the query. The nltk library is used for this purpose.

- Conjunctive query
- Conjunctive query & Ranking score
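
The two engines can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in search_engine.py: a regex tokenizer stands in for the nltk preprocessing, the index lives in memory rather than in the JSON files, and all function names and the toy documents are hypothetical. The first engine intersects posting lists; the second ranks the same matches by TF-IDF cosine similarity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics (stand-in for nltk preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Simple inverted index: term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def conjunctive_query(index, query):
    """Return ids of documents that contain ALL query terms."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


def tfidf_vectors(docs):
    """Return ({doc_id: {term: tf-idf weight}}, {term: idf})."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    df = Counter(term for toks in tokens.values() for term in set(toks))
    idf = {term: math.log(len(docs) / count) for term, count in df.items()}
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        vectors[doc_id] = {t: (c / len(toks)) * idf[t] for t, c in tf.items()}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def ranked_query(docs, query, k=5):
    """Conjunctive match first, then rank the matches by cosine similarity."""
    matches = conjunctive_query(build_index(docs), query)
    vectors, idf = tfidf_vectors(docs)
    terms = tokenize(query)
    tf = Counter(terms)
    query_vec = {t: (c / len(terms)) * idf.get(t, 0.0) for t, c in tf.items()}
    scored = [(doc_id, cosine(query_vec, vectors[doc_id])) for doc_id in matches]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy documents, invented for illustration only.
docs = {
    1: "a young wizard attends a school of magic",
    2: "a detective solves a murder in london",
    3: "a wizard detective hunts dark magic in london",
}
hits = conjunctive_query(build_index(docs), "wizard magic")
ranking = ranked_query(docs, "wizard magic")
```

The sketch recomputes the IDF values and document vectors on every call to stay short. Storing them once, as files such as idf.json and doc_magnitude.json suggest, avoids that repeated work.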

Define a new score

Build a new metric to rank books based on the users' queries, using a scoring function.
The output must contain:
- bookTitle
- Plot
- Url
- The similarity score of the documents with respect to the query
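
One possible shape for such a scoring function is sketched below. The formula is an assumption made for illustration, not the metric defined in the notebook: it blends a text similarity in [0, 1] with a normalized rating, and the field `ratingValue`, the weight `alpha`, and the sample records are hypothetical. A heap selects the top k results without sorting the whole list.

```python
import heapq


def new_score(similarity, rating, max_rating=5.0, alpha=0.7):
    """Weighted blend of text similarity and normalized rating (hypothetical)."""
    return alpha * similarity + (1 - alpha) * (rating / max_rating)


def top_k(books, similarities, k=5):
    """Return the k best books with the fields required in the output."""
    scored = (
        (new_score(similarities[i], book["ratingValue"]), i)
        for i, book in enumerate(books)
    )
    best = heapq.nlargest(k, scored)  # highest score first
    return [
        {
            "bookTitle": books[i]["bookTitle"],
            "Plot": books[i]["Plot"],
            "Url": books[i]["Url"],
            "Similarity": round(score, 3),
        }
        for score, i in best
    ]


# Invented records, standing in for rows of the parsed dataset.
books = [
    {"bookTitle": "Book A", "Plot": "plot a", "Url": "https://example.org/a", "ratingValue": 2.0},
    {"bookTitle": "Book B", "Plot": "plot b", "Url": "https://example.org/b", "ratingValue": 5.0},
    {"bookTitle": "Book C", "Plot": "plot c", "Url": "https://example.org/c", "ratingValue": 4.0},
]
results = top_k(books, similarities=[0.9, 0.8, 0.1], k=2)
```

With `alpha = 0.7`, Book B overtakes Book A despite a lower text similarity, because its rating is much higher. That is the kind of reordering a custom score is meant to produce.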

Make a nice visualization (Bonus)

Here the goal is to quantify and visualize the writers' production.

Algorithmic Question

Given a string written in English capital letters, find the maximum length of a subsequence of characters that is in alphabetical order.
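
A minimal dynamic-programming sketch of this problem follows. It assumes that "alphabetical order" means strictly increasing letters, so repeated letters do not extend a subsequence; the function name is hypothetical and the notebook's own solution may differ. For each letter, the code keeps the length of the longest valid subsequence ending with that letter, which gives O(26 n) time for a string of length n.

```python
def longest_alphabetical_subsequence(s):
    """Length of the longest strictly increasing subsequence of capital letters."""
    # best[i] = length of the longest valid subsequence ending with chr(65 + i)
    best = [0] * 26
    for ch in s:
        i = ord(ch) - ord("A")
        if not 0 <= i < 26:
            raise ValueError(f"expected a capital English letter, got {ch!r}")
        # Extend the longest subsequence that ends with a strictly smaller letter.
        best[i] = max(best[i], 1 + max(best[:i], default=0))
    return max(best)


length = longest_alphabetical_subsequence("ZAYBXC")  # e.g. A, B, C
```

If equal consecutive letters should also count as ordered, replacing `best[:i]` with `best[:i + 1]` gives the non-decreasing variant.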

Script descriptions

ADM-HW3.ipynb
- Jupyter notebook that contains the solutions to the given assignment

Content of the repository

data/ :
- vocabulary.json : vocabulary
- inverted_index_2_1_1.json : simple inverted index
- inverted_index_2_2_1.json : TF-IDF inverted index
- url_list.txt
- precomputed/ : doc_magnitude.json, idf.json
scripts/ :
- build_tsv.py
- data_collection.py
- index_creation.py
- search_engine.py
- utilities.py
main_notebook.ipynb
