fb_scrape_public

Update 2/22/18: This program is not currently functional, and probably will never be again given Facebook's changes to its API. I'm leaving the repo up in case anyone finds any of the code useful for other applications.

Update 5/17/18: FSP can still be used with tokens generated by the Graph API Explorer here: https://developers.facebook.com/tools/explorer/ . But who knows how long that will last...

Update 5/15/18: Facebook has opened an app review process to grant access to the Pages API. Applicants will "require App Review, Business Verification, and Supplemental Terms." Theoretically any app that successfully navigates this process should be able to use FSP, although I have not tested this myself. See https://developers.facebook.com/docs/apps/review for more details.

fb_scrape_public

This script can download posts and comments from public Facebook pages and groups (but not users). It requires Python 3.

Installation

Running pip3 install fb_scrape_public will work, but you can also simply download the script and place it in a directory on your PYTHONPATH.
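As a quick sanity check after installing, the following minimal sketch reports whether the interpreter is Python 3 and whether the module can be found. It uses only the standard library; the helper name check_environment is hypothetical and is not part of fb_scrape_public.

```python
import importlib.util
import sys


def check_environment(module_name="fb_scrape_public"):
    """Return (python_ok, module_found) for the current interpreter."""
    # The README states that the script requires Python 3.
    python_ok = sys.version_info.major >= 3
    # find_spec returns None when a top-level module cannot be located.
    module_found = importlib.util.find_spec(module_name) is not None
    return python_ok, module_found


if __name__ == "__main__":
    python_ok, module_found = check_environment()
    print("Python 3:", python_ok)
    print("fb_scrape_public importable:", module_found)
```

If the second value is False, the module is probably not installed for this interpreter or is not in a directory on your PYTHONPATH.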

Instructions

1. This script is written for Python 3 and won't work with earlier Python versions.
2. The main function in this module is scrape_fb (see comments on lines 147-148). It is the only function most users will need to run directly.
3. To make this script work, you will need to do one of the following:
    1. Create your own Facebook app, which you can do here: https://developers.facebook.com/apps . It doesn't matter what you call your new app; you just need its unique client ID (app ID) and app secret.
    2. Generate your own FB access token using the Graph API Explorer (https://developers.facebook.com/tools/explorer/) or other means.
4. Next, authenticate using one of the following three methods:
    1. Run the save_creds() function, which will save your FB app credentials to a local file. You will then be able to run scrape_fb from the directory containing the file without including your ID and secret as arguments. Alternatively, you can pass the path to your credentials file in the cred_file parameter.
    2. Include your client ID and secret AS STRINGS in the appropriate scrape_fb parameters.
    3. Include a user-generated token in the token parameter.
5. scrape_fb accepts FB page IDs ('barackobama') and post IDs preceded by the page ID and an underscore (for more details on post ID format, see here). You can load them into the ids field using a comma-delimited string, or by creating a plain text file in the same folder as the script that lists the Facebook pages you want to scrape, one ID per line. For example, if you wanted to scrape Barack Obama's official FB page (http://facebook.com/barackobama/) using the text file method, your first line would simply be 'barackobama' without quotes. I suggest starting with only one ID to make sure it works. You'll only be able to collect data from public pages and groups. For groups, you'll need the group ID number; the string alias won't work.
6. The only required fields for the scrape_fb function are one of the three authentication methods (see step 4 above) and ids. I recommend not changing the other defaults unless you know what you're doing (except for outfile if you want to save your data to disk, and scrape_mode if you want to pull post comments).
7. If you did everything correctly, the command line should show you some informative status messages. If you've set outfile, it will eventually save a CSV full of data to the folder where the script was run. If something went wrong, you'll probably see an error.
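The two ways of supplying IDs described above (a comma-delimited string, or a text file with one ID per line) can be sketched as follows. This is a minimal illustration that does not call fb_scrape_public at all. The helper names is_post_id, write_ids_file and ids_as_string, and the file name ids.txt, are hypothetical. The post ID check assumes the numeric "page ID, underscore, post ID" shape seen in the sample post ID.

```python
from pathlib import Path


def is_post_id(item):
    """True if item looks like '<page ID>_<post ID>' with a numeric post ID."""
    page, sep, post = item.rpartition("_")
    return bool(page) and sep == "_" and post.isdigit()


def write_ids_file(ids, path):
    """Write one ID per line, dropping blanks, stray quotes and whitespace."""
    cleaned = [i.strip().strip("'\"") for i in ids]
    cleaned = [i for i in cleaned if i]
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


def ids_as_string(ids):
    """Join IDs into the comma-delimited form accepted by the ids field."""
    return ",".join(ids)


if __name__ == "__main__":
    ids = ["barackobama", "6815841748_10154508876046749"]
    for item in ids:
        print(item, "-> post ID" if is_post_id(item) else "-> page ID")
    print(ids_as_string(ids))
    write_ids_file(ids, "ids.txt")  # hypothetical file name
```

Starting with a single ID, as suggested above, keeps the first run easy to debug.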

Sample code

import fb_scrape_public as fsp

#below, "YourClientID," "YourClientSecret," and "YourAccessToken" should be your actual client ID, secret, and access token

# to save your Facebook app credentials to disk
fsp.save_creds() 

# if you've run save_creds() once, you can enter the following to get page posts:
obama_posts = fsp.scrape_fb(ids="barackobama") 

# if you haven't run save_creds(), use this (id/secret mode)
obama_posts = fsp.scrape_fb("YourClientID","YourClientSecret",ids="barackobama") 

# or this (access token mode). the outfile attribute is also set, which means the data will be saved to disk
obama_posts = fsp.scrape_fb(token="YourAccessToken",ids="barackobama",outfile='obama_posts.csv') 

# to get comments on a single post (id/secret mode)
comments = fsp.scrape_fb("YourClientID","YourClientSecret",ids="6815841748_10154508876046749",scrape_mode="comments")
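To avoid pasting an access token into a script, one option is to read it from an environment variable and build the keyword arguments for access token mode. This is only a sketch: the helper token_mode_kwargs and the variable name FB_ACCESS_TOKEN are hypothetical, and the only scrape_fb parameters it relies on are the token, ids and outfile names used in the sample code above.

```python
import os


def token_mode_kwargs(ids, outfile=None, env_var="FB_ACCESS_TOKEN"):
    """Build keyword arguments for scrape_fb in access token mode."""
    # env_var is a hypothetical name; use whatever variable you have set.
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    kwargs = {"token": token, "ids": ids}
    # Only pass outfile when the caller wants the data saved to disk.
    if outfile is not None:
        kwargs["outfile"] = outfile
    return kwargs


# Usage (requires fb_scrape_public and a valid token):
# import fb_scrape_public as fsp
# obama_posts = fsp.scrape_fb(**token_mode_kwargs("barackobama", outfile="obama_posts.csv"))
```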

dfreelon/fb_scrape_public

Folders and files

Latest commit

History

Repository files navigation

Update 2/22/18: This program is not currently functional, and probably will never be again given Facebook's changes to its API. I'm leaving the repo up in case anyone finds any of the code useful for other applications.

Update 5/17/18: FSP can still be used with tokens generated by the Graphi API Explorer here: https://developers.facebook.com/tools/explorer/ But who knows how long that will last...

fb_scrape_public

Installation

Instructions

Sample code

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages