OpenWayback Replay API
Patrick T. Rourke edited this page Feb 3, 2016 · 9 revisions
The OpenWayback URL for a single archived web page for a specific date and time looks like this:
http://webarchive.archivedomain.tld/all/20000101000000/subjectdomain.tld
http://[wayback server hostname]/[access point]/[yyyymmddhhmmss]/[access_url]
Access points are indicated by strings in the first field of the OpenWayback URL after the hostname; this access point name is configured in the OpenWayback configuration file.
Dates are represented in the second field of the OpenWayback URL after the hostname as fourteen-character integers in the format yyyymmddhhmmss; on requests, they may be truncated.
The access URL - the URL of the archived site - is represented in the third and last field of the OpenWayback URL after the hostname. Because an access URL may itself include a path, the fields of the OpenWayback URL should always be counted from the left; everything after the fifth slash is part of the access URL.
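The left-to-right field counting described above can be sketched in Python. This is only an illustration, not part of OpenWayback: `WaybackRequest` and `parse_wayback_url` are hypothetical names, and the sketch assumes the date field is present in the URL.

```python
from typing import NamedTuple


class WaybackRequest(NamedTuple):
    """Hypothetical container for the fields of an OpenWayback URL."""
    host: str
    access_point: str
    date_field: str
    access_url: str


def parse_wayback_url(url: str) -> WaybackRequest:
    """Split an OpenWayback URL by counting slashes from the left."""
    # maxsplit=5 leaves everything after the fifth slash untouched, so
    # slashes inside the access URL are never treated as field separators.
    parts = url.split("/", 5)
    if len(parts) < 6 or parts[1] != "":
        raise ValueError(f"not a full OpenWayback URL: {url!r}")
    _scheme, _empty, host, access_point, date_field, access_url = parts
    return WaybackRequest(host, access_point, date_field, access_url)
```

For example, `parse_wayback_url("http://webarchive.archivedomain.tld/all/20000101000000/subjectdomain.tld/a/b.html").access_url` keeps the path and yields `subjectdomain.tld/a/b.html`.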
The simplest requests are for a specific access URL for a specific date.
If there is an archive of the requested access URL for the requested date, OpenWayback returns that archive to the browser with an HTTP 200 response:
http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
Archived page displayed
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 200
If there is no archive of the requested access URL for the requested date, OpenWayback redirects the browser with an HTTP 302 response to the archive whose date is closest to the requested date (whether earlier or later):
http://webarchive.archivedomain.tld/all/200101081200/subjectdomain.tld
Archived page displayed: returns the page whose date most closely matches 2001-01-08 12:00
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
If there is no archive of the requested access URL for any date, OpenWayback will return an HTTP 404 response with a page indicating that the site is not found in the archive:
http://webarchive.archivedomain.tld/all/200101081200/nonexistentdomain.tld
Error page displayed
URL in location bar: http://webarchive.archivedomain.tld/all/200101081200/nonexistentdomain.tld
HTTP response: 404
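The three outcomes above (200, 302, 404) can be modelled with a small Python function. This is a sketch of the documented behaviour, not OpenWayback's actual lookup code. The function name `resolve_capture` is hypothetical, and the sketch assumes that all timestamps are full fourteen-digit strings.

```python
from datetime import datetime
from typing import Optional, Sequence, Tuple

TS_FORMAT = "%Y%m%d%H%M%S"  # yyyymmddhhmmss


def resolve_capture(captures: Sequence[str], requested: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, capture timestamp) for a full 14-digit request."""
    if not captures:
        return 404, None          # no archive of this access URL for any date
    if requested in captures:
        return 200, requested     # exact match is served directly
    target = datetime.strptime(requested, TS_FORMAT)
    # Closest capture, earlier or later; on a tie min() keeps the first one.
    closest = min(
        captures,
        key=lambda ts: abs(datetime.strptime(ts, TS_FORMAT) - target),
    )
    return 302, closest           # redirect to the closest capture
```

With captures `["20010109214000", "20030506000000"]`, a request for `20010108120000` resolves to a 302 pointing at `20010109214000`.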
If the date part of the URL is truncated, OpenWayback matches the capture whose date is closest to the middle of the range implied by the request and redirects the request to the matched page with an HTTP 302 response. If the date part is omitted entirely, the request is redirected to the most recent capture.
http://webarchive.archivedomain.tld/all/2000/subjectdomain.tld
Archived page displayed: Returns the capture of the page whose archival date most closely matches 2000-07-01
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
http://webarchive.archivedomain.tld/all/200010/subjectdomain.tld
Archived page displayed: Returns the capture of the page whose archival date most closely matches 2000-10-15
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
http://webarchive.archivedomain.tld/all/subjectdomain.tld
Archived page displayed: Returns the most recent capture of the page
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
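The "range implied by the request" for a truncated date can be made concrete by padding the timestamp to its earliest and latest possible values. The following Python sketch does only that padding; `implied_range` is a hypothetical helper, and it assumes truncation happens at field boundaries (4, 6, 8, 10, 12 or 14 digits). How OpenWayback itself picks the middle of the range is not shown here.

```python
import calendar
from typing import Tuple


def implied_range(partial: str) -> Tuple[str, str]:
    """Pad a truncated timestamp to the earliest and latest full timestamps."""
    if not partial.isdigit() or len(partial) not in (4, 6, 8, 10, 12, 14):
        raise ValueError(f"unsupported timestamp: {partial!r}")
    n = len(partial)
    # Earliest: missing month/day default to 01, missing time to 000000.
    earliest = partial + "00000101000000"[n:]
    # Latest: missing month defaults to 12, missing day to the last day of
    # that month (leap years included), missing time to 235959.
    year = int(partial[0:4])
    month = int(partial[4:6]) if n >= 6 else 12
    last_day = calendar.monthrange(year, month)[1]
    latest = partial + f"{year:04d}{month:02d}{last_day:02d}235959"[n:]
    return earliest, latest
```

For instance, `implied_range("200010")` spans `20001001000000` to `20001031235959`.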
Two special requests return either the first or the most recent capture in the archive: a date part of 1 requests the first capture, and a date part of 2 requests the most recent one.
http://webarchive.archivedomain.tld/all/1/subjectdomain.tld
Archived page displayed: Returns the first capture of the requested page
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
http://webarchive.archivedomain.tld/all/2/subjectdomain.tld
Archived page displayed: Returns the most recent capture of the page
URL in location bar: http://webarchive.archivedomain.tld/all/200101092140/subjectdomain.tld
HTTP response: 302
If there is no archive of the page for any date, the response to a request with a truncated date will be the same as one for a specific date: an error page and an HTTP 404 response.
Requests for Date Ranges for Specific Access URLs
Requesting a specific access URL with an asterisk as the sole character in the date part of the OpenWayback URL returns a page showing the capture dates for the requested URL. Depending on its configuration, OpenWayback returns a calendar page for a single year (the year of the latest capture), calendar pages for multiple years, or a table of capture dates:
http://webarchive.archivedomain.tld/all/*/subjectdomain.tld
Capture date page displayed: Returns list of capture dates
URL in location bar: http://webarchive.archivedomain.tld/all/*/subjectdomain.tld
HTTP response: 200
Adding the asterisk wildcard character after a year will return a list of capture dates for that year:
http://webarchive.archivedomain.tld/all/2000*/subjectdomain.tld
Capture date page displayed: Returns list of capture dates in the year 2000
URL in location bar: http://webarchive.archivedomain.tld/all/2000*/subjectdomain.tld
HTTP response: 200
An ordered pair of dates separated by a hyphen and followed by an asterisk represents a date range; a request with a date range will return a list of capture dates for that range:
http://webarchive.archivedomain.tld/all/2000-2012*/subjectdomain.tld
Capture date page displayed: Returns list of capture dates in the years 2000 to 2012
URL in location bar: http://webarchive.archivedomain.tld/all/2000-2012*/subjectdomain.tld
HTTP response: 200
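The three date-part forms shown above (`*`, `2000*` and `2000-2012*`) can be told apart with a few lines of Python. This is a minimal sketch under the assumption that these are the only listing forms; `parse_listing_date_field` is a hypothetical name, and the sketch does not check that the pair is in order.

```python
from typing import Optional, Tuple


def parse_listing_date_field(field: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (start, end) for '*', 'yyyy*' or 'yyyy-yyyy*' style date parts."""
    if not field.endswith("*"):
        raise ValueError(f"not a capture-list request: {field!r}")
    body = field[:-1]
    if body == "":
        return None, None           # bare asterisk: no date limit
    start, sep, end = body.partition("-")
    if not sep:
        end = start                 # single prefix such as '2000*'
    if not (start.isdigit() and end.isdigit()):
        raise ValueError(f"malformed date range: {field!r}")
    return start, end
```

The returned values are still truncated dates, so `("2000", "2012")` means "from the start of 2000 to the end of 2012".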
Captured Page List Requests
If a wildcard is appended to the access URL, the access URL is treated as a prefix: all captured URLs whose original URL begins with the string in the access URL field will be listed, with the number of capture dates for each URL, the total count of captured pages, and the date range of captures.
http://webarchive.archivedomain.tld/all/*/subjectdomain.tld*
List page displayed: Returns a list of all captures of pages with the prefix `subjectdomain.tld` for all dates.
Showing 1 to 6,609 of 6,609 results for subjectdomain.tld
subjectdomain.tld/ 475 versions
2,961 pages between Jun 16, 1997 and Jun 5, 2013
subjectdomain.tld/%22 3 versions
7 pages between Mar 23, 2003 and Jan 22, 2009
subjectdomain.tld/2009/12/07/today_in_history 1 version
3 pages between Aug 27, 2010 and Nov 27, 2010
URL in location bar: http://webarchive.archivedomain.tld/all/*/subjectdomain.tld*
HTTP response: 200
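The prefix matching behind such a page list can be illustrated in Python. This is a simplified sketch over an in-memory list of `(url, timestamp)` pairs, which is an assumed data shape; `list_by_prefix` is a hypothetical name. It only counts captures per URL and does not try to reproduce the separate "versions" and "pages" figures shown in the listing.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def list_by_prefix(
    captures: Iterable[Tuple[str, str]], prefix: str
) -> Dict[str, Tuple[int, str, str]]:
    """Map each URL starting with prefix to (capture count, first, last)."""
    by_url: Dict[str, List[str]] = defaultdict(list)
    for url, timestamp in captures:
        if url.startswith(prefix):
            by_url[url].append(timestamp)
    # Full 14-digit timestamps sort chronologically as plain strings.
    return {url: (len(ts), min(ts), max(ts)) for url, ts in by_url.items()}
```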
As with capture list responses, page list responses can be limited by date ranges:
http://webarchive.archivedomain.tld/all/2008*/subjectdomain.tld*
List page displayed: Returns a list of all captures of pages with the prefix `subjectdomain.tld` captured in 2008.
Showing 1 to 2,012 of 2,012 results for subjectdomain.tld
subjectdomain.tld/ 80 versions
500 pages between Jan 1, 2008 and Dec 31, 2008
URL in location bar: http://webarchive.archivedomain.tld/all/2008*/subjectdomain.tld*
HTTP response: 200
A wildcarded access URL request with a specific date will fail and return an error page with an HTTP response code of 400:
http://webarchive.archivedomain.tld/all/200804051200/subjectdomain.tld*
Error page displayed: `The request is missing information, or is not understood by this server. Bad URL(subjectdomain.tld*)`
URL in location bar: http://webarchive.archivedomain.tld/all/200804051200/subjectdomain.tld*
HTTP response: 400
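The request types covered on this page can be summarised by looking only at where the asterisks appear. The Python sketch below is an illustrative summary, not OpenWayback's request parser; `classify_request` and its return labels are hypothetical.

```python
def classify_request(date_field: str, access_url: str) -> str:
    """Classify a request by the wildcards in its date part and access URL."""
    url_wildcard = access_url.endswith("*")
    date_wildcard = date_field.endswith("*")
    if url_wildcard and not date_wildcard:
        return "bad request"   # wildcarded access URL with a specific date: 400
    if url_wildcard:
        return "page list"     # e.g. /all/*/subjectdomain.tld*
    if date_wildcard:
        return "capture list"  # e.g. /all/2000*/subjectdomain.tld
    return "replay"            # e.g. /all/200101092140/subjectdomain.tld
```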
Copyright © 2005-2022 [tonazol](http://netpreserve.org/). CC-BY. https://github.com/iipc/openwayback.wiki.git