Waybackpack + matchType #57

Open
uy5cu71 opened this issue Sep 1, 2022 · 1 comment

Comments


uy5cu71 commented Sep 1, 2022

The Wayback CDX API has a matchType option, for example:
https://web.archive.org/cdx/search/cdx?url=https://twitter.com/jack/statuses&matchType=prefix

Which returns:

com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20121223123338 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 text/html 404 VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 5296
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130203195805 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 1042
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130312144230 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 1035
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130326132131 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 text/html 404 BMAXRTF3OVX3HL22WUMYLBYT2UJV3HT3 9317
com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" 20130402123359 https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 warc/revisit - BMAXRTF3OVX3HL22WUMYLBYT2UJV3HT3 1030

Is it possible to download all of these URLs? Waybackpack trims the URL based on the CLI input, so it only fetches the exact URL I pass in.

I tried adding a new matchType parameter to the cdx file. I get a valid response, but waybackpack still trims the URL based on the CLI input.
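To make the question concrete, here is a rough sketch of what "downloading all of these URLs" would involve for the rows shown above. It assumes the default CDX field order (urlkey, timestamp, original, mimetype, statuscode, digest, length) and the usual `https://web.archive.org/web/<timestamp>/<url>` replay URL form. `CdxRow`, `parse_cdx_line`, and `snapshot_url` are hypothetical names used only for illustration; they are not part of waybackpack.

```python
from typing import NamedTuple


class CdxRow(NamedTuple):
    """One row of CDX output, assuming the default seven-field layout."""
    urlkey: str
    timestamp: str
    original: str
    mimetype: str
    statuscode: str
    digest: str
    length: str


def parse_cdx_line(line: str) -> CdxRow:
    # Split from the right so that only the last six separators are used;
    # whatever remains on the left is kept whole as the urlkey.
    parts = line.strip().rsplit(" ", 6)
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}: {line!r}")
    return CdxRow(*parts)


def snapshot_url(row: CdxRow) -> str:
    # Replay URL for one capture: the timestamp followed by the original URL
    # exactly as the CDX row reports it (not the URL typed on the CLI).
    return f"https://web.archive.org/web/{row.timestamp}/{row.original}"


# One of the rows from the response above.
example = (
    'com,twitter)/jack/statuses/"/antarnisti/status/245078986827386880" '
    "20130203195805 "
    "https://twitter.com/jack/statuses/%22/Antarnisti/status/245078986827386880%22 "
    "warc/revisit - VNL4UHLBLX2UYNDIOZZ7ZR3CFYURIVND 1042"
)
row = parse_cdx_line(example)
print(snapshot_url(row))
```

The point is that the download URL would need to come from the `original` field of each row rather than from the CLI argument.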

Owner

jsvine commented Sep 6, 2022

Hi @uy5cu71, and thanks for your interest in this library. Unfortunately, I'm not sure I 100% understand your inquiry. But if it helps: waybackpack does not currently support the matchType parameter.
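Until something like that exists in waybackpack, one possible workaround is to query the CDX endpoint directly, outside the library. The following is only a minimal sketch using the Python standard library; `build_cdx_query` and `fetch_cdx_lines` are hypothetical helper names, not waybackpack functions, and the accepted `matchType` values listed in the code are an assumption based on the CDX server documentation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

# Assumption: these are the matchType values the CDX server accepts.
MATCH_TYPES = ("exact", "prefix", "host", "domain")


def build_cdx_query(url: str, match_type: str = "prefix") -> str:
    """Build a CDX query URL like the one in the original post."""
    if match_type not in MATCH_TYPES:
        raise ValueError(f"unknown matchType: {match_type!r}")
    # urlencode percent-encodes the target URL so it is safe as a parameter.
    return f"{CDX_ENDPOINT}?{urlencode({'url': url, 'matchType': match_type})}"


def fetch_cdx_lines(query_url: str, timeout: float = 30.0) -> list[str]:
    """Fetch the plain-text CDX response and return its non-empty rows."""
    with urlopen(query_url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return [line for line in body.splitlines() if line.strip()]


query = build_cdx_query("https://twitter.com/jack/statuses")
print(query)
# To actually hit the network:
# for line in fetch_cdx_lines(query):
#     print(line)
```

Each returned row carries a timestamp and the original URL in its second and third fields, which is what a downloader would need in order to fetch every matching capture.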
