GitHub - marinaolina/SparkBatch: Reads json, persists locally, some analytics

Create a simple Spark batch ETL job that satisfies the following points.
Assume that the actual data volume will be several GB per day.

Reads and parses YouTube trending video data from the provided JSON files
Extracts the most viewed video per category ID and per trending date
Formats the data to have the columns required for analytics: video_id, trending_date, category_id, title, views, likes, dislikes
Saves the data to a partitioned table to be used in further analysis
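
The points above can be sketched as a single Spark job. This is only a minimal illustration under stated assumptions, not the code in this repository: the object name `TrendingVideosJob`, the two command-line arguments (input and output path), the choice of Parquet as the storage format, and partitioning by `trending_date` are all hypothetical. It also assumes each input file holds one JSON object per line and already contains the seven listed columns; multi-line JSON would need the reader's `multiLine` option.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

// Hypothetical job object; the name and layout are assumptions for illustration.
object TrendingVideosJob {

  // Columns required for analytics, as listed in the task description.
  val outputColumns: Seq[String] =
    Seq("video_id", "trending_date", "category_id", "title", "views", "likes", "dislikes")

  /** Keeps the single most viewed video per (category_id, trending_date). */
  def mostViewedPerCategoryAndDate(videos: DataFrame): DataFrame = {
    // Rank rows inside each group by views, highest first.
    // video_id is a tie-breaker so the result is deterministic.
    val byViews = Window
      .partitionBy(col("category_id"), col("trending_date"))
      .orderBy(col("views").cast("long").desc, col("video_id").asc)

    videos
      .withColumn("rank", row_number().over(byViews))
      .filter(col("rank") === 1)
      .select(outputColumns.map(col): _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumed arguments: <inputPath> <outputPath>
    val Array(inputPath, outputPath) = args

    val spark = SparkSession.builder().appName("TrendingVideosJob").getOrCreate()

    // Read and parse the JSON input (assumed to be one object per line).
    val raw = spark.read.json(inputPath)

    // Write the result partitioned by trending date (an assumed partition key).
    mostViewedPerCategoryAndDate(raw)
      .write
      .mode(SaveMode.Overwrite)
      .partitionBy("trending_date")
      .parquet(outputPath)

    spark.stop()
  }
}
```

A window function is used here instead of a `groupBy` followed by a join because it picks the whole winning row in one pass, which keeps the title and like counts attached to the selected video. Partitioning by `trending_date` is one plausible choice for daily batches of several GB, since later queries can then read only the dates they need.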

src/main/scala/spark
.gitignore
README.md
build.sbt
