
A data science project that uses popular libraries such as pandas, NumPy, and Beautiful Soup to scrape data and save it in a DataFrame.


rahulkrishnan221/Billboard-wiki

 
 


Billboard-Wiki

It scrapes song names and artists from Billboard, then creates a valid Wikipedia link for each song from that data. It then extracts the songwriters and genres for all songs, creates a pandas DataFrame, and stores it as a CSV file.
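
The link-building step can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name `wikipedia_url` is hypothetical, and it assumes the article title equals the song title with spaces replaced by underscores. That assumption will not hold for every song, since many articles carry a disambiguating suffix such as "(song)".

```python
from urllib.parse import quote

WIKI_BASE = "https://en.wikipedia.org/wiki/"


def wikipedia_url(song_title: str) -> str:
    """Build a candidate Wikipedia URL for a song title (hypothetical helper)."""
    # Wikipedia article paths use underscores in place of spaces.
    slug = song_title.strip().replace(" ", "_")
    # Percent-encode characters that are not safe in a URL path,
    # leaving a few punctuation marks common in song titles untouched.
    return WIKI_BASE + quote(slug, safe="_()',!")


if __name__ == "__main__":
    print(wikipedia_url("Shape of You"))
```

A real run would still need to check that the generated page exists and actually describes the song before scraping writers and genres from it.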

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

Python or PyCharm

Installing

There are two ways to run the program:

  1. Using PyCharm

1.1 Clone the project and, under the New option in PyCharm, select this folder. You are then good to go.

  2. Using Basic Python

2.1 Install pandas

pip install pandas

2.2 Install Requests

pip install requests

2.3 Install BeautifulSoup4

pip install beautifulsoup4

2.4 Install html5lib

pip install html5lib

Running the tests

  1. First, set the dates in data.py to choose the start and end dates of the range you want Billboard data for.
  2. Change the number by which i is divided. This sets the sampling interval; for example, once every 30 days if the number is 30.
  3. Run main.py; it will generate a CSV file named example.csv.
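
The interval logic in step 2 can be pictured roughly as below. This is only a sketch under assumptions: the contents of data.py are not shown here, so the names `sample_dates`, `start`, `end`, and `step_days` are hypothetical, and the modulo test on a day counter `i` is a guess at what "the number by which i is divided" refers to. The dates in the usage line are arbitrary examples.

```python
from datetime import date, timedelta


def sample_dates(start: date, end: date, step_days: int) -> list[str]:
    """Return ISO dates from start to end (inclusive), one every step_days days."""
    total_days = (end - start).days
    dates = []
    for i in range(total_days + 1):
        # Keep only the days whose offset from start is a multiple of step_days.
        if i % step_days == 0:
            dates.append((start + timedelta(days=i)).isoformat())
    return dates


if __name__ == "__main__":
    # Hypothetical range: one chart date every 30 days in early 2018.
    print(sample_dates(date(2018, 1, 1), date(2018, 3, 31), 30))
```

With a smaller divisor the script would request more chart dates, so the run takes longer and produces more rows in the CSV file.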
