Better deck scraper #9

Closed
Everlag opened this issue Dec 21, 2015 · 2 comments

Comments

@Everlag
Owner

Everlag commented Dec 21, 2015

The current scraper generates a static cache file that we cannot update. As a result, we cannot provide any historical data on an archetype, which limits our utility.

A better deck scraper would acquire decks from mtgtop8, massage the data, and put everything into a series of sane tables in postgres.

Additionally, a video scraper from mtgcoverage looks possible and would provide a lot of value.

@Everlag
Owner Author

Everlag commented Dec 21, 2015

Some notes

  • Structure the scraper à la priceWriter, with two asymmetric sources that each handle database access separately. Make sure that any panics are recovered inside the scrapers to avoid full crashes on markup changes.
  • Deck names need to be normalized between mtgtop8 and mtgcoverage.
  • mtgtop8 provides extremely broad categories, e.g. UrzaTron for U Tron. Granular naming by looking for specific cards, e.g. snapcaster or sad bot in U Tron, could address this.
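
A minimal sketch of the panic-recovery note, assuming the scrapers are written in Go. The `Deck` type, the `Source` signature, and `safeRun` are hypothetical names for illustration, not the project's actual API. The idea is that a parser which panics after a markup change is turned into an ordinary error, so the other source keeps running.

```go
package main

import "fmt"

// Deck is a hypothetical minimal record emitted by a source.
type Deck struct {
	Name  string
	Cards []string
}

// Source is a hypothetical scraper signature: fetch and parse one site.
type Source func() ([]Deck, error)

// safeRun invokes a source and converts any panic into an error,
// so a single broken parser cannot crash the whole process.
func safeRun(name string, src Source) (decks []Deck, err error) {
	defer func() {
		if r := recover(); r != nil {
			decks = nil
			err = fmt.Errorf("source %s panicked: %v", name, r)
		}
	}()
	return src()
}

// brittleSource simulates a parser that assumes two table cells
// and indexes out of range once the markup changes.
func brittleSource() ([]Deck, error) {
	cells := []string{"UrzaTron"} // the parser expected a second cell
	return []Deck{{Name: cells[1]}}, nil
}

// healthySource simulates a parser whose markup assumptions still hold.
func healthySource() ([]Deck, error) {
	return []Deck{{Name: "UrzaTron"}}, nil
}

func main() {
	sources := []struct {
		name string
		run  Source
	}{
		{"brittle", brittleSource},
		{"healthy", healthySource},
	}
	for _, s := range sources {
		decks, err := safeRun(s.name, s.run)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Printf("%s returned %d deck(s)\n", s.name, len(decks))
	}
}
```

Using named return values lets the deferred function overwrite the result after a panic; without them the caller would see zero values and no error.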
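
The granular naming idea could look roughly like the sketch below, again assuming Go. The `rule` type, the `refine` function, and the rule table are hypothetical. The full card names are my reading of "snapcaster" and "sad bot" (Snapcaster Mage and Solemn Simulacrum), so treat them as assumptions rather than the project's real rules.

```go
package main

import (
	"fmt"
	"strings"
)

// rule maps a broad mtgtop8 category plus one signature card
// to a more granular archetype name.
type rule struct {
	Broad     string
	Signature string
	Granular  string
}

// rules is illustrative only; the cards and names are assumptions.
var rules = []rule{
	{"UrzaTron", "Snapcaster Mage", "U Tron"},
	{"UrzaTron", "Solemn Simulacrum", "U Tron"},
}

// key lowercases and trims a name so comparisons ignore case and padding.
func key(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// refine returns the granular name of the first rule whose broad
// category matches and whose signature card appears in the deck.
// It falls back to the broad name when no rule applies.
func refine(broad string, cards []string) string {
	have := make(map[string]bool, len(cards))
	for _, c := range cards {
		have[key(c)] = true
	}
	for _, r := range rules {
		if key(r.Broad) == key(broad) && have[key(r.Signature)] {
			return r.Granular
		}
	}
	return broad
}

func main() {
	fmt.Println(refine("UrzaTron", []string{"Urza's Tower", "snapcaster mage"}))
	fmt.Println(refine("UrzaTron", []string{"Urza's Tower", "Karn Liberated"}))
}
```

Falling back to the broad name keeps unknown lists usable and makes it obvious which categories still need rules.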

@Everlag Everlag mentioned this issue Jan 3, 2016
@Everlag
Owner Author

Everlag commented Jan 3, 2016

82664ae handles modern decks from mtgtop8 with normalization to our names on egress and blurring on ingress.

Opening #10 to handle mtgcoverage.

@Everlag Everlag closed this as completed Jan 3, 2016