Wikipedia is an essential tool in our society. It might not be the most trustworthy source of information, but it is nonetheless an encyclopedia covering most of our knowledge. Most articles on Wikipedia cite references, in particular books, which are identified by their International Standard Book Numbers (ISBN). This project is an attempt to build a relationship graph between articles and books, using Wikipedia articles as the data and Hadoop MapReduce as the computation tool. The goal was to discover the most important books in our society and to identify different domains of knowledge. Both goals were reached with satisfactory results.
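The article-to-book relationship described above can be illustrated with a small in-memory sketch of the map and reduce steps. This is not the project's Hadoop code: the function names, the ISBN pattern, and the choice to rank books by the number of citing articles are all assumptions made for illustration, and the actual method is documented in the project report.

```python
import re
from collections import defaultdict
from itertools import combinations

# Assumed pattern: "isbn", an optional "=", then 10 to 18 characters made of
# digits, hyphens, or X. Real wikitext has more variants than this covers.
ISBN_RE = re.compile(r"isbn\s*=?\s*([0-9][0-9Xx-]{8,16}[0-9Xx])", re.IGNORECASE)


def normalize_isbn(raw):
    """Strip hyphens so the same book always yields the same key."""
    return raw.replace("-", "").upper()


def map_article(title, wikitext):
    """Map step: emit one (isbn, title) pair per distinct ISBN in an article."""
    seen = set()
    for match in ISBN_RE.finditer(wikitext):
        isbn = normalize_isbn(match.group(1))
        if len(isbn) in (10, 13) and isbn not in seen:
            seen.add(isbn)
            yield isbn, title


def reduce_by_isbn(pairs):
    """Reduce step: group the citing article titles under each ISBN."""
    grouped = defaultdict(set)
    for isbn, title in pairs:
        grouped[isbn].add(title)
    return dict(grouped)


def rank_books(grouped):
    """Order books by number of citing articles (an assumed importance measure)."""
    return sorted(grouped.items(), key=lambda item: (-len(item[1]), item[0]))


def article_edges(grouped):
    """Link two articles whenever they cite the same book."""
    edges = set()
    for titles in grouped.values():
        edges.update(combinations(sorted(titles), 2))
    return edges
```

Feeding every `(title, wikitext)` pair through `map_article`, collecting the output, and passing it to `reduce_by_isbn` gives the book-to-articles mapping. `rank_books` then orders the books, and `article_edges` yields article pairs that could be clustered into domains of knowledge.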
See ID2221_Project_Group_DOST.pdf for more information.