Vidscraper is a library for retrieving information about videos from various sources – video feeds, APIs, page scrapes – combining it, and presenting it in a unified manner, all as efficiently as possible.
Vidscraper comes with built-in support for popular video sites like blip, vimeo, ustream, and youtube, as well as support for generic RSS feeds via feedparser.
>>> import vidscraper
>>> video = vidscraper.auto_scrape('http://www.youtube.com/watch?v=PMpu8jH1LE8')
>>> video.title
u"The Magic Roundabout - Ermintrude's Folly"
>>> video.description
u"Ermintrude's been at the poppies again, but it's Dougal who ends up high as a kite!"
>>> video.user
u'nickhirst999'
>>> video.guid
'http://gdata.youtube.com/feeds/api/videos/PMpu8jH1LE8'
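As a small sketch building on the session above, the scraped attributes can be collected into a plain dictionary. The helper names `summarize` and `scrape_summary` and the default field list are illustrative assumptions, not part of vidscraper; only `vidscraper.auto_scrape` and the four attribute names are taken from the session.

```python
def summarize(video, fields=("title", "description", "user", "guid")):
    """Collect selected attributes of a scraped video into a dict.

    `summarize` is a hypothetical helper, not part of vidscraper.
    Attributes the object does not have are reported as None.
    """
    return dict((name, getattr(video, name, None)) for name in fields)


def scrape_summary(url):
    """Scrape `url` with vidscraper.auto_scrape and summarize the result."""
    # Imported lazily so that summarize() stays usable without vidscraper.
    import vidscraper
    return summarize(vidscraper.auto_scrape(url))
```

Calling `scrape_summary` needs vidscraper installed and network access; `summarize` on its own works with any object that exposes the listed attributes.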
Vidscraper also ships with a command-line utility for retrieving video metadata. Scraping the same YouTube URL from the shell could look like this:
$ vidscraper video http://www.youtube.com/watch?v=PMpu8jH1LE8 \
    --fields=title,description,user,guid
Scraping http://www.youtube.com/watch?v=PMpu8jH1LE8...
{
    "description": "Ermintrude's been at the poppies again, but it's Dougal who ends up high as a kite!",
    "fields": [
        "title",
        "description",
        "user",
        "guid"
    ],
    "guid": "http://gdata.youtube.com/feeds/api/videos/PMpu8jH1LE8",
    "title": "The Magic Roundabout - Ermintrude's Folly",
    "url": "http://www.youtube.com/watch?v=PMpu8jH1LE8",
    "user": "nickhirst999"
}
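If the command-line output is captured by another program, it could be turned back into a dictionary along these lines. This is a minimal sketch that assumes the output has the shape shown above: an optional "Scraping ..." status line followed by one JSON object and nothing after it. The function name `parse_cli_output` is hypothetical, and the sample text is abbreviated from the output above.

```python
import json


def parse_cli_output(text):
    """Return the JSON object printed by the command-line tool as a dict.

    Assumes an optional status line followed by a single JSON object,
    as in the sample output; anything else raises ValueError.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        # The JSON object starts at the first line that opens with a brace.
        if line.lstrip().startswith("{"):
            return json.loads("\n".join(lines[index:]))
    raise ValueError("no JSON object found in output")


sample = """Scraping http://www.youtube.com/watch?v=PMpu8jH1LE8...
{
    "guid": "http://gdata.youtube.com/feeds/api/videos/PMpu8jH1LE8",
    "title": "The Magic Roundabout - Ermintrude's Folly",
    "user": "nickhirst999"
}"""

metadata = parse_cli_output(sample)
```

Here `metadata["user"]` holds the uploader name from the sample text.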
code: | https://github.com/pculture/vidscraper/ |
---|---|
docs: | http://vidscraper.readthedocs.org/ |
bugtracker: | http://bugzilla.pculture.org/ |
irc: | #vidscraper on irc.freenode.net |
- Python 2.6+
- BeautifulSoup 4.0.2+
- feedparser 5.1.2+
- python-requests 0.13.0+ (but less than 1.0.0!)
- requests-oauth 0.4.1+ (for some APIs *cough* Vimeo searching *cough* which require authentication)
- lxml 2.3.4+ (recommended for BeautifulSoup; assumed parser for test results.)
- unittest2 0.5.1+ (for tests)
- mock 0.8.0+ (for tests)
- tox 1.4.2+ (for tests)
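The python-requests entry is the only one in the list with an upper bound. A rough sketch of checking a version string against that range is given below; the helper names are hypothetical, and the parsing assumes plain dotted numeric versions such as "0.13.0" (pre-release suffixes are not handled).

```python
def parse_version(version):
    """Turn a dotted version such as "0.13.0" into a tuple of ints.

    Short versions are padded with zeros, so "0.13" becomes (0, 13, 0).
    """
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def requests_version_supported(version):
    """Check the bound from the list above: 0.13.0 or newer, below 1.0.0."""
    return (0, 13, 0) <= parse_version(version) < (1, 0, 0)
```

With requests installed, the check could be run as `requests_version_supported(requests.__version__)`.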