Locerature Stages

drdarshan edited this page Jun 25, 2011 · 7 revisions

Stage 1 - Map

  • Map Input: cities1000.txt
  • Map Logic: Keep a city only if its name has at most 2 words
  • Map Output: {City Name, {Population, Lat, Lon, CityId}}
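
A minimal sketch of the Stage 1 map step in plain Python (not Hadoop code). It assumes cities1000.txt is the tab-separated GeoNames dump, with the id in column 0, name in column 1, latitude and longitude in columns 4 and 5, and population in column 14. The function name map_city and the max_words parameter are made up for illustration.

```python
def map_city(line, max_words=2):
    """Return (city_name, (population, lat, lon, city_id)), or None to skip the line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 15:
        return None  # malformed row: too few columns
    name = fields[1]
    # Map logic: keep only names with at most max_words words.
    if not name or len(name.split()) > max_words:
        return None
    try:
        # Column positions are an assumption about the GeoNames layout.
        record = (int(fields[14]), float(fields[4]), float(fields[5]), int(fields[0]))
    except ValueError:
        return None  # non-numeric population, coordinates, or id
    return name, record
```

Rows that fail the word-count check or do not parse are dropped rather than raising, so a single bad line does not stop the job.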

Stage 1 - Reduce

  • Reduce Input: {City Name, {{Population,Lat,Lon, CityId}}}
  • Reduce Logic: Use Largest population
  • Reduce Output: {City Name, {Population, Lat, Lon, CityId}} --> DistributedCache?
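
The Stage 1 reduce step could look roughly like this in plain Python. It assumes each record is a (population, lat, lon, city_id) tuple as in the map output above; reduce_city is a hypothetical name.

```python
def reduce_city(name, records):
    """Among all cities sharing a name, keep the one with the largest population."""
    # A reducer is only called for keys that have at least one value,
    # so records is assumed to be non-empty.
    best = max(records, key=lambda record: record[0])
    return name, best
```

When two cities share both a name and a population, max keeps the first one it sees; the notes above do not say how ties should be broken.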

Stage 2 - Map

  • Map Input: 1-gram Mapper
  • Map Logic: get city from DistributedCache
  • Map Output: {City Name-Year, {NumberOfHits, Population, Lat, Lon, CityId}}
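
A sketch of the Stage 2 map step in plain Python. A dict stands in for the DistributedCache lookup, and each input line is assumed to be tab-separated with the word, the year, and the hit count in the first three columns (an assumption about the 1-gram data, which these notes do not specify). map_ngram and cities are hypothetical names.

```python
def map_ngram(line, cities):
    """Return ("City Name-Year", (hits, population, lat, lon, city_id)), or None."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None  # malformed row
    word, year, hits = fields[0], fields[1], fields[2]
    # cities plays the role of the Stage 1 output loaded from DistributedCache.
    city = cities.get(word)
    if city is None:
        return None  # the 1-gram is not a known city name
    try:
        hits = int(hits)
    except ValueError:
        return None
    population, lat, lon, city_id = city
    return f"{word}-{year}", (hits, population, lat, lon, city_id)
```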

Stage 2 - Reduce

  • Reduce Input: {City Name-Year, {{NumberOfHits, Population, Lat, Lon}}}
  • Reduce Logic: Aggregate NumberOfHits
  • Reduce Output: {Year, {City Name, Population, Lat, Lon, NumberOfHits}}
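
The Stage 2 reduce step, sketched in plain Python. It assumes the key has the form "City Name-Year" and that every value for a key carries the same population and coordinates, so they are taken from the first value. reduce_hits is a hypothetical name.

```python
def reduce_hits(key, values):
    """Sum NumberOfHits for one city and year, then re-key the result by year."""
    # Split on the last hyphen only, since a city name may itself contain hyphens.
    name, year = key.rsplit("-", 1)
    values = list(values)
    total_hits = sum(value[0] for value in values)
    # City attributes are assumed identical across all values of one key.
    population, lat, lon = values[0][1:4]
    return int(year), (name, population, lat, lon, total_hits)
```

Re-keying by year means all cities for a given year end up under one key, which matches the by-year lookup mentioned next.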

The server then queries the Stage 2 output by year.

HBase Example:

http://sujee.net/tech/articles/hbase-map-reduce-freq-counter/