API requests are being blocked by Cloudflare verification/captcha, leaving zero ability to use the service; I want programmatic access to train a LoRA without having to rename every downloaded icon #1523

Open
kernkraft235 opened this issue Dec 26, 2024 · 0 comments
Labels
website issues about macosicons.com

Comments


kernkraft235 commented Dec 26, 2024

Request and Response

API calls do not work anywhere except the demo on the documentation page; Cloudflare is getting in the way. I tested from clean IPs and from several random public IPs. Even including cookies from the browser session in my curl requests didn't work. Cloudflare is too 💪

Request
curl -X POST 'https://api.macosicons.com/api/search' \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "x-api-key: $ak_macosicons" \
  -d '{"query": "Chrome"}'
Response
<!DOCTYPE html>
<html lang="en-US">

<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <style>
    
        #challenge-error-text {
            background-image: url(data:image/svg+xml;base64,...);
            background-repeat: no-repeat;
            background-size: contain;
            padding-left: 34px
        }
        
    </style>
    <meta http-equiv="refresh" content="390">
</head>

<body class="no-js">
    <div class="main-wrapper" role="main">
        <div class="main-content"><noscript>
                <div class="h2"><span id="challenge-error-text">Enable JavaScript and cookies to continue</span></div>
            </noscript></div>
    </div>
    <script>(function () { window._cf_chl_opt = { cvId: '3', cZone: "api.macosicons.com", cType: 'managed', cRay: 'REDACTED', cH: 'REDACTED', cUPMDTk: "\/api\/search?__cf_chl_tk=REDACTED", cFPWv: 'b', cITimeS: 'REDACTED', cTTimeMs: '1000', cMTimeMs: '390000', cTplC: 0, cTplV: 5, cTplB: 'cf', cK: "", fa: "\/api\/search?REDACTED", md: "REDACTED", mdrd: "REDACTED" }; var cpo = document.createElement('script'); cpo.src = '/cdn-cgi/challenge-platform/h/b/orchestrate/chl_page/v1?ray=REDACTED'; window._cf_chl_opt.cOgUHash = location.hash === '' && location.href.indexOf('#') !== -1 ? '#' : location.hash; window._cf_chl_opt.cOgUQuery = location.search === '' && location.href.slice(0, location.href.length - window._cf_chl_opt.cOgUHash.length).indexOf('?') !== -1 ? '?' : location.search; if (window.history && window.history.replaceState) { var ogU = location.pathname + window._cf_chl_opt.cOgUQuery + window._cf_chl_opt.cOgUHash; history.replaceState(null, null, "\/api\/search?__cf_chl_rt_tk=REDACTED" + window._cf_chl_opt.cOgUHash); cpo.onload = function () { history.replaceState(null, null, ogU); } } document.getElementsByTagName('head')[0].appendChild(cpo); }());</script>
</body>

</html>
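
For anyone scripting against the API, a response like the one above can be caught before it reaches a JSON parser. Below is a minimal sketch in Python, assuming the markers visible in this particular response (the `Just a moment...` title, the `_cf_chl_opt` object, and the `/cdn-cgi/challenge-platform/` script path) are representative. Cloudflare may change them, and `looks_like_cloudflare_challenge` is a hypothetical helper name, not part of any library.

```python
# Markers copied from the challenge page in the response above.
# Assumption: they are representative; Cloudflare can change them at any time.
CHALLENGE_MARKERS = (
    "<title>Just a moment...</title>",
    "window._cf_chl_opt",
    "/cdn-cgi/challenge-platform/",
)


def looks_like_cloudflare_challenge(body: str, content_type: str = "") -> bool:
    """Heuristically decide whether a response body is a challenge page, not API data."""
    # A JSON content type means the request most likely reached the API itself.
    if "application/json" in content_type.lower():
        return False
    # Otherwise, any one of the known markers is treated as a challenge page.
    return any(marker in body for marker in CHALLENGE_MARKERS)
```

A client could run this on the body and `Content-Type` header of each response and stop with a clear error instead of failing on JSON decoding.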

I was originally writing this issue to explain how this would be an amazing dataset for training a diffusion model LoRA, but then I discovered the API, and then it didn't work. If you end up abandoning the API and giving up on monetizing your collection, can you consider doing the things suggested below? I don't really care WHO does it; I just want the result, and I'm willing and able to do it myself.

Old issue I was writing right before I discovered the API

Idea

If there were a way to access all the icons with some correlation to their real names (so that I could build training data programmatically), it would be possible to train a world-class Low-Rank Adaptation (LoRA) for a diffusion model (probably SDXL and Flux).

This would be a boon for developers. I see a ton of clearly AI-generated icons on the App Store, and I'm confident that with your dataset we could create something next level.
Example of a lower-quality icon checkpoint
Example of a slightly better SDXL LoRA

What is stopping me currently?

  • 54277953d8d1f4b64225a97d623f6ab7_1734802713676.icns <- that is how the downloads are named
  • The number of icons I would have to rename manually is way beyond my dedication and attention span.
  • I've already tried to train a few SDXL LoRAs using about 1000 downloaded icons, but as with most diffusion models, the quality of the tags/captions makes all the difference.
  • My 1000 icons made a biased dataset because I downloaded only the ones I cared about.
  • I have manually named/tagged about 400 just so I could find them locally for my own UI. These were better than the GPT vision tags, but inferior to simply having each icon named as it is on the website (not as the downloaded file is named!!!).
  • I used a Python script I wrote to get GPT-4o to rename them. It did OK, but it lost all brand/logo meaning and had a poor grasp of UI symbolism.

Possible Solutions

  1. Make the website name the downloaded files as they appear in the UI. You could still append that long funky string if it's there for a reason, because I can just regex it out.
  2. MOST IDEAL: A way to download all the icons in bulk. I don't know how your backend works, but I assume there is some SQLite DB or JSON that correlates the files to their "friendly name". I can work with any method as long as I can do it programmatically.
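
Either option would make captioning scriptable. Here is a rough sketch in Python of both, under stated assumptions. For option 1, it assumes a download would be named `<friendly name>_<32 hex characters>_<13 digits>.icns`; the friendly-name prefix is hypothetical, and the suffix shape is taken from the single file name quoted above. For option 2, it assumes the site could export a JSON object mapping each downloaded file name to its display name, which is a hypothetical format and not something the site is known to provide. `friendly_name` and `captions_from_mapping` are illustrative names.

```python
import json
import re
from pathlib import Path

# Suffix shape taken from the single example above: 32 hex chars, "_", 13 digits.
# Assumption: every download follows this shape.
ID_SUFFIX = re.compile(r"_?[0-9a-f]{32}_\d{13}$")


def friendly_name(filename: str) -> str:
    """Option 1: strip the extension and the opaque ID suffix from a file name."""
    stem = Path(filename).stem
    return ID_SUFFIX.sub("", stem)


def captions_from_mapping(mapping_json: str) -> dict[str, str]:
    """Option 2: turn a {file name: display name} JSON object into captions."""
    mapping = json.loads(mapping_json)
    # The caption wording is a placeholder; adjust it to the training setup.
    return {name: f"macOS app icon, {title}" for name, title in mapping.items()}
```

Run on a file named the way downloads are named today, `friendly_name` returns an empty string, which is exactly the problem: the current names carry no usable label.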
@kernkraft235 kernkraft235 added the website issues about macosicons.com label Dec 26, 2024