IP ban investigation #318

Open
sigaloid opened this issue Nov 12, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@sigaloid
Member

There are a number of reports of continued JSON errors, even though most instance operators running on clean IPs do not get these. If you are impacted, please confirm that the problem is IP-specific: check that you can visit the Redlib home page from one IP (try your home IP, a VPN to another location, etc.) but cannot from the IP in question. Then comment or email (ipban @ my domain, linked in profile) the following info:

  • Confirm you're running the latest commit (you should be able to see "✅ Instance is up to date" on the error page)
  • Confirm impact (can you view any pages?)
  • IPv4/IPv6
  • Your IP's ASN and ISP (and IP if you're comfortable sharing privately)
@ggtylerr

  • Confirmed.
  • Occasional blockage, ~75-90% of runtime is unblocked.
  • NYC-1: 45.137.206.17, CAL-1: 71.19.146.127
    • (IPv6 ranges can be disclosed in PGP email.)
  • NYC-1: RoyaleHosting BV, ASN 212477. CAL-1: PRGMR.com Inc, ASN 47066.

It should be noted that both are currently fully operational but have previously experienced #301.

@sigaloid
Member Author

Can you set the environment variable RUST_LOG=redlib=trace? Then, when you encounter the error, can you post an excerpt from the latest logs? Thanks.

@hyperreal64

  • Confirmed.
  • I cannot view any pages.
  • IPv4: 152.53.37.179
  • IPv6: 2a0a:4cc0:2000:2a:1416:76ff:fe0c:d737
  • netcup GmbH ASN: 214996

@ggtylerr

ggtylerr commented Nov 14, 2024

Can you set the environment variable RUST_LOG=redlib=trace? Then, when you encounter the error, can you post an excerpt from the latest logs? Thanks.

Just got the error on both servers. It looks like NYC-1 just became 3 commits too old due to unrelated issues with quay.io (their IPv6 connection appears to be down and docker compose pull is forcing connections over it; for some reason quay.io can't be reached at all right now even though the site is still up). However, CAL-1 is on latest. The logs, unfortunately, don't seem to be of much use:

redlib  |  WARN  redlib::client > Rate limit 9 is low. Spawning force_refresh_token()
redlib  |  TRACE redlib::oauth  > Rolling over refresh token. Current rate limit: 8
redlib  |  INFO  redlib::oauth  > [🔄] Spoofing Android client with headers: {"Client-Vendor-Id": "d5bc0918-e77d-40f6-9ef0-69274c59539a", "X-Reddit-Device-Id": "d5bc0918-e77d-40f6-9ef0-69274c59539a", "User-Agent": "Reddit/Version 2024.04.0/Build 1391236/Android 9"}, uuid: "d5bc0918-e77d-40f6-9ef0-69274c59539a", and OAuth ID "ohXpoqrZYub1kg"
redlib  |  TRACE redlib::oauth  > Sending token request...
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  TRACE redlib::oauth  > Received response with status 200 OK and length Some("1308")
redlib  |  TRACE redlib::oauth  > Serializing response...
redlib  |  TRACE redlib::oauth  > Accessing relevant fields...
redlib  |  INFO  redlib::oauth  > [✅] Success - Retrieved token "eyJhbGciOiJSUzI1NiIsImtpZCI6IlNI...", expires in 86399
redlib  |  INFO  redlib::oauth  > [✅] Successfully created OAuth client
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
redlib  |  ERROR redlib::utils  > Error page rendered: Failed to parse page JSON data: expected value at line 1 column 1
redlib  |  ERROR redlib::client > Got an invalid response from reddit expected value at line 1 column 1. Status code: 403 Forbidden
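
The "expected value at line 1 column 1" message is what a JSON parser typically reports when the body does not start with a JSON value, so together with the 403 it suggests that the response body was not JSON at all (the logs do not show the body, so this is an inference). A minimal sketch of checking the status code before attempting to parse is below. The `Upstream` enum and `classify` function are hypothetical names for illustration, not Redlib's actual code.

```rust
/// Outcome of inspecting an upstream response before JSON parsing.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Upstream {
    /// 2xx status and a body that at least starts like a JSON object or array.
    LooksLikeJson,
    /// Non-success status; the body is probably an error or block page.
    HttpError(u16),
    /// 2xx status, but the body does not start with `{` or `[`.
    NotJson,
}

/// Decide how to treat a response from its status code and the first
/// non-whitespace character of its body. This does not parse the JSON;
/// it only avoids handing an obvious non-JSON body to the parser.
fn classify(status: u16, body: &str) -> Upstream {
    if !(200..300).contains(&status) {
        return Upstream::HttpError(status);
    }
    match body.trim_start().chars().next() {
        Some('{') | Some('[') => Upstream::LooksLikeJson,
        _ => Upstream::NotJson,
    }
}
```

With a check like this, a 403 would surface as an HTTP error rather than as a JSON parse failure, which would make log excerpts like the one above easier to interpret.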

@sigaloid
Member Author

Hm, the fact that it's close to a token refresh makes me wonder whether the token is reaching its rate limit too quickly. It currently starts a background task when there are 9 requests left, which is working:

image

But... if 9 concurrent requests come in simultaneously, it's possible they exhaust the remaining rate limit count. Is this a high-traffic instance, specifically?
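
The scenario described here can be sketched with a small deterministic model: the refresh is only started when the budget drops to the threshold, so any requests that arrive before it completes still draw from the old budget. The `Token` struct and its methods are hypothetical and are not Redlib's implementation; the threshold of 9 is taken from the "Rate limit 9 is low" log line.

```rust
/// Refresh is triggered once the remaining budget is at or below this value
/// (taken from the log line "Rate limit 9 is low").
const REFRESH_THRESHOLD: u32 = 9;

/// Hypothetical model of one token's remaining request budget.
struct Token {
    remaining: u32,
    refresh_in_flight: bool,
}

impl Token {
    /// Try to spend one request. Returns false once the budget is exhausted.
    fn try_request(&mut self) -> bool {
        if self.remaining <= REFRESH_THRESHOLD {
            // The background refresh is only *started* here; it finishes later.
            self.refresh_in_flight = true;
        }
        if self.remaining == 0 {
            return false;
        }
        self.remaining -= 1;
        true
    }

    /// Called when the background refresh completes with a fresh budget.
    fn finish_refresh(&mut self, new_budget: u32) {
        self.remaining = new_budget;
        self.refresh_in_flight = false;
    }
}

/// Send `n` requests that all arrive before the refresh completes and
/// count how many of them are rejected.
fn rejected_during_refresh(start_remaining: u32, n: u32) -> u32 {
    let mut token = Token { remaining: start_remaining, refresh_in_flight: false };
    (0..n).filter(|_| !token.try_request()).count() as u32
}
```

In this model, a burst of 9 requests starting from a budget of 9 is served but leaves nothing in reserve, and every further request that arrives before `finish_refresh` runs is rejected.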

@ggtylerr

ggtylerr commented Nov 15, 2024

Is this a high-traffic instance, specifically?

Definitely not. CAL-1 is one of two secondary servers, only listed on my site. It's primarily used for Invidious.

This doesn't appear to be a deliberate DDoS either, as my other server, POL-1, has been completely unaffected so far.

EDIT: With that being said, Invidious does have a feature to fetch comments from Reddit. But this hasn't been functional since Reddit's API fiasco, and it only attempts to fetch when the user explicitly clicks the "View Reddit comments" button on a video. Plus, it wouldn't make sense for that to become a problem only now, many months after I started hosting both.

@sigaloid
Member Author

Hm, yeah, that checks out. So far every IP has belonged to a commercial ASN, and I'm hoping that's not the common factor, i.e. that they have all tripped some heuristic for being suspicious because they're hosted.

@ggtylerr

I am noting an unusual number of requests on NYC-1 (the public server on Redlib's list). I've heavily tightened the nginx rate limits to only 5 requests per second (10 burst), but without any further logging it's difficult to say whether this is regular user activity or possibly scrapers.
https://github.com/user-attachments/assets/8ed8eb93-d0f5-430d-89bb-93bc830ab8e2
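
To make "5 requests per second (10 burst)" concrete, here is a simplified leaky-bucket model of that kind of limit. It is a sketch written for illustration, not nginx's code or configuration: the `LeakyBucket` type is hypothetical, and it only decides accept or reject, whereas nginx by default also delays requests that fall within the burst.

```rust
/// Simplified leaky-bucket limiter. Amounts are tracked in thousandths of a
/// request so that the arithmetic stays in integers.
struct LeakyBucket {
    rate_milli_per_sec: i64, // 5 r/s is stored as 5000
    burst_milli: i64,        // a burst of 10 is stored as 10000
    excess_milli: i64,       // how far the client is currently over the rate
    last_ms: Option<i64>,    // time of the last accepted request
}

impl LeakyBucket {
    fn new(rate_per_sec: i64, burst: i64) -> Self {
        LeakyBucket {
            rate_milli_per_sec: rate_per_sec * 1000,
            burst_milli: burst * 1000,
            excess_milli: 0,
            last_ms: None,
        }
    }

    /// Returns true if a request arriving at `now_ms` is accepted.
    fn allow(&mut self, now_ms: i64) -> bool {
        let last = match self.last_ms {
            // The first request is always accepted and starts the clock.
            None => {
                self.last_ms = Some(now_ms);
                return true;
            }
            Some(t) => t,
        };
        let elapsed_ms = (now_ms - last).max(0);
        // The bucket drains at the configured rate, then this request adds 1.
        let drained = self.rate_milli_per_sec * elapsed_ms / 1000;
        let excess = (self.excess_milli - drained + 1000).max(0);
        if excess > self.burst_milli {
            return false; // over the burst allowance: reject, state unchanged
        }
        self.excess_milli = excess;
        self.last_ms = Some(now_ms);
        true
    }
}
```

In this model, a client that sends 15 requests at the same instant gets 11 accepted (the first one plus a burst of 10) and the remaining 4 rejected, which is the kind of behaviour that would cap a scraper without affecting a user loading a single page.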

@np22-jpg
Contributor

I was running the same commit and couldn’t view any pages, either, on a residential IP (which is why I’m not comfortable with sharing it). I was able to get around it by taking my instance offline, requesting an IP unban from Reddit, and bringing it back up.

I'm not sure if this is at all helpful information.

@sigaloid
Member Author

That's very helpful. How did you request an unban? And did you have the same IP the whole time (during the ban and after it worked again)? Sometimes residential IPs cycle and change normally.

@np22-jpg
Contributor

When I got IP banned, I received a page that looked like this or this when trying to access from my browser. Based on my previous emails, the link to contact support led me to [email protected], who then pointed me to fill out this form. I have had the same IP the entire time.

@hyperreal64

Hm, yeah, that checks out. So far every IP has belonged to a commercial ASN, and I'm hoping that's not the common factor, i.e. that they have all tripped some heuristic for being suspicious because they're hosted.

So I set up Redlib on one of my homelab machines which uses my residential IP address. It worked for a few hours, then I noticed I get the same error. My residential IP ASN is COMCAST-7922. I'm not sure if this is classified as a "commercial ASN", but if it's not, then hopefully this quells your concern.

@sigaloid
Member Author

Weird. After only a few hours, it should have requested only one token in total and presumably not had enough traffic (more than 99 reqs in a 5 minute period) to require a token rollover. It's really surprising that that's all it took to get your IP under their watchful eye. I'm going to think a bit on how to proceed here. I don't know if it makes sense to attempt to identify a single factor that causes this policy ban, at least via trial and error, for a few reasons:

  • it could be any number of factors
  • it could be an ML-type ban based on many factors, in which case we really couldn't guess and check over and over
  • I don't have unlimited clean IPs that I can test with
  • I already put a lot of work into copying the exact behavior of the mobile apps in terms of new tokens; it's unclear whether it's a single thing I missed or some lower-level thing like TLS signatures, etc.

Maybe I need to take a look at the auth flow on a current app since it's been a few months. Not sure if anything that has changed should really cause this, since people who haven't used the app in a few months should still be able to use the app without being IP banned.
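
The "more than 99 reqs in a 5 minute period" figure mentioned above can be checked against an instance's own request log. The sketch below assumes a simple sliding 5-minute window, which may not match how Reddit actually counts requests. The function name and constants are hypothetical.

```rust
/// Window length and budget taken from the figure quoted in this comment.
const WINDOW_SECS: u64 = 5 * 60;
const BUDGET: usize = 99;

/// Returns true if any sliding 5-minute window contains more than `BUDGET`
/// requests. `timestamps` are request times in seconds, sorted ascending.
fn would_need_rollover(timestamps: &[u64]) -> bool {
    let mut start = 0;
    for end in 0..timestamps.len() {
        // Drop requests that are 5 minutes or more older than the current one.
        while timestamps[end] - timestamps[start] >= WINDOW_SECS {
            start += 1;
        }
        // Requests start..=end all fall inside one window.
        if end - start + 1 > BUDGET {
            return true;
        }
    }
    false
}
```

For example, one request every 10 seconds for three hours never puts more than 30 requests into a single window, so under this assumption a lightly used instance would not need a rollover.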

@hyperreal64

OK, right now I can access Redlib from my home IP instance. My public-facing instance is still showing the error.


4 participants