KMeans-Hadoop

This project analyses the Stack Overflow data, which comes in two sets. A join operation was used to combine each user with their posts.
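The join can be pictured as a reduce-side join: records from both sets are grouped by user id, then merged per user. The sketch below is plain Python rather than the project's Hadoop code, and the field names (`id`, `owner_user_id`, `score`) are assumptions made for illustration, not the actual schema.

```python
from collections import defaultdict


def reduce_side_join(users, posts):
    """Join each user record with that user's posts, keyed by user id.

    Field names ("id", "owner_user_id") are hypothetical.
    """
    grouped = defaultdict(lambda: {"user": None, "posts": []})

    # "Map" step: route every record to the bucket for its join key.
    for user in users:
        grouped[user["id"]]["user"] = user
    for post in posts:
        grouped[post["owner_user_id"]]["posts"].append(post)

    # "Reduce" step: emit one joined record per known user;
    # posts whose owner has no user record are dropped.
    return [
        {**entry["user"], "posts": entry["posts"]}
        for entry in grouped.values()
        if entry["user"] is not None
    ]


if __name__ == "__main__":
    users = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
    posts = [
        {"owner_user_id": 1, "score": 3},
        {"owner_user_id": 1, "score": -1},
        {"owner_user_id": 3, "score": 0},  # no matching user
    ]
    for row in reduce_side_join(users, posts):
        print(row["name"], len(row["posts"]))
```

In an actual MapReduce job the grouping would be done by the shuffle phase, with the mapper tagging each record by its source set.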

The joined data was then processed and cleaned up. Finally, a K-Means clustering algorithm was used to label these users as Good/Bad based on various statistical ratios.
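The clustering step can be sketched as standard K-Means (Lloyd's algorithm) with two clusters. This is a single-machine illustration, not the project's Hadoop implementation, and the two features per user (an accepted-answer ratio and an upvote ratio) are assumptions standing in for the unspecified statistical ratios.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=2, iterations=20, seed=0):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input points.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical (accepted_answer_ratio, upvote_ratio) per user.
    features = [(0.9, 0.8), (0.85, 0.9), (0.1, 0.2), (0.15, 0.1)]
    labels, centroids = kmeans(features, k=2)
    print(labels, centroids)
```

K-Means only returns arbitrary cluster indices, so mapping them to Good/Bad needs an extra rule, for example treating the cluster whose centroid has the higher ratios as Good. Which rule the project used is not stated here.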