omidhzr/Dataintensive-processing-Project

Dataintensive-processing-Project

Wikipedia is an essential tool in our society. It might not be the most trustworthy source of information, but it is nonetheless an encyclopedia of most of our knowledge. Most Wikipedia articles have references, in particular to books and their International Standard Book Numbers (ISBN). This project is an attempt to create a relationship graph between articles and books, with Wikipedia articles as the data and Hadoop MapReduce as the computation tool. The goal was to discover the most important books in our society and to identify different domains of knowledge. These goals were reached with satisfactory results.
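The article-to-book graph described above can be illustrated with a small in-memory sketch of the map, shuffle, and reduce steps. This is not the project's Hadoop code: the function names, the ISBN regular expression, and the sample articles are all assumptions made for illustration, and the pattern does not validate ISBN check digits.

```python
import re
from collections import defaultdict

# Assumed pattern (not from the project): "ISBN" followed by a 10- or
# 13-character ISBN whose digits may be separated by hyphens or spaces.
ISBN_PATTERN = re.compile(
    r"ISBN[\s:=]*((?:97[89][\s-]?)?\d(?:[\s-]?\d){8}[\s-]?[\dXx])",
    re.IGNORECASE,
)


def normalize_isbn(raw):
    """Strip hyphens and spaces so equal ISBNs compare equal."""
    return re.sub(r"[\s-]", "", raw).upper()


def map_article(title, text):
    """Map step: emit one (isbn, title) pair per distinct ISBN in an article."""
    isbns = {normalize_isbn(m) for m in ISBN_PATTERN.findall(text)}
    for isbn in sorted(isbns):
        yield isbn, title


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_isbn(isbn, titles):
    """Reduce step: collect the distinct articles that cite one ISBN."""
    return isbn, sorted(set(titles))


def build_graph(articles):
    """Run map, shuffle, and reduce over a {title: text} dictionary."""
    pairs = [p for title, text in articles.items() for p in map_article(title, text)]
    return dict(reduce_isbn(isbn, titles) for isbn, titles in shuffle(pairs).items())


def most_cited(graph, n):
    """Rank ISBNs by the number of citing articles (ties broken by ISBN)."""
    ranked = sorted(graph.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(isbn, len(titles)) for isbn, titles in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical sample input; real input would be Wikipedia article dumps.
    sample = {
        "Article A": "{{cite book |isbn=978-0-13-110362-7}}",
        "Article B": "See ISBN 978-0-13-110362-7 and ISBN 0-201-03801-3.",
        "Article C": "No book references here.",
    }
    graph = build_graph(sample)
    print(graph)
    print(most_cited(graph, 1))
```

In this sketch each ISBN maps to the list of articles citing it, so the number of citing articles serves as a simple importance score, and articles sharing ISBNs hint at a common domain of knowledge. On a real cluster the shuffle step would be handled by the MapReduce framework rather than a dictionary.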

See ID2221_Project_Group_DOST.pdf for more information.
