

feat: add rate limit and faster default speed #35

Draft: wants to merge 10 commits into base: main
Conversation

wakamex

@wakamex wakamex commented Oct 28, 2022

What I did

Add a rate limit, implementing logic similar to alchemy-web3 but with exponential backoff, which the Alchemy docs tell me is preferred.

Fixes: #60
Also fixes ape issue #613, which suggests adding a rate limit as another mix-in for ProviderAPI. I'm not sure what that means, but this modifies a sub-class of a sub-class of ProviderAPI.

How I did it

A for loop wraps the _make_request call: if a call succeeds, its result is returned; otherwise the loop keeps retrying. If the loop finishes without a success, an error is raised saying the provider is rate limited.
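The loop described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function name, the rate-limit detection string, and the parameter defaults are assumptions drawn from the review discussion.

```python
import random
import time

def make_request_with_retry(
    make_request,                   # underlying request callable: (endpoint, parameters) -> Any
    endpoint: str,
    parameters: list,
    min_retry_delay: int = 1_000,   # milliseconds before the first retry
    retry_backoff_factor: int = 2,  # delay multiplier per attempt
    max_retry_delay: int = 30_000,  # cap on the backoff delay, in milliseconds
    max_retries: int = 3,
    retry_jitter: int = 250,        # random extra milliseconds added to each retry
):
    """Retry a request with exponential backoff (illustrative sketch only)."""
    for attempt in range(max_retries):
        try:
            return make_request(endpoint, parameters)
        except Exception as err:
            # Only retry errors that look like rate limiting; re-raise everything else.
            if "exceeded its compute units" not in str(err):
                raise
            delay_ms = min(
                min_retry_delay * retry_backoff_factor**attempt
                + random.randint(0, retry_jitter),
                max_retry_delay,
            )
            time.sleep(delay_ms / 1000)
    raise RuntimeError("Rate limit exceeded after all retries.")
```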

How to verify it

I only checked this against an intro notebook called hello_ape.ipynb, but I don't see why it shouldn't work generally.
In the two timed queries this is 6x (1.98s/0.335s) and 13x (202s/15.9s) faster.

Checklist

  • Passes all linting checks (pre-commit and CI jobs)
  • New test cases have been added and are passing
  • Documentation has been updated
  • PR title follows Conventional Commit standard (will be automatically included in the changelog)

fubuloubu previously approved these changes Oct 28, 2022
Member

@fubuloubu fubuloubu left a comment

absolutely sexy!

@unparalleled-js would we want to upstream this as a mix-in in ape core per ApeWorX/ape#613?

@fubuloubu fubuloubu requested a review from antazoey October 28, 2022 13:51
Member

@antazoey antazoey left a comment

Amazing! I want, I want.

Regarding the mixin: I don't think making a mixin out of this would be an easy task because:

  1. The error message for exceeding computational units may differ from provider to provider
  2. We need to retain the Alchemy-specific error handling we do here already, such as the 403-like handling
  3. It increases the complexity of super() and base-classes, which makes me think some sort of middleware design pattern is better suited, though I'd also like to avoid that.

At least for the time being, I think handling this on a provider-by-provider basis is okay!

See feedback though :)

fubuloubu previously approved these changes Nov 5, 2022
Member

@fubuloubu fubuloubu left a comment

One comment, otherwise it looks good

@wakamex
Author

wakamex commented Nov 7, 2022

incorporated the remaining comments

fubuloubu previously approved these changes Nov 9, 2022
fubuloubu previously approved these changes Nov 9, 2022
Member

@fubuloubu fubuloubu left a comment

I think it might be best to get some of these parameters from the config object instead

perhaps @unparalleled-js can make a suggestion there

Comment on lines 142 to 143
def _make_request(self, endpoint: str, parameters: list,min_retry_delay:int = 1000, retry_backoff_factor:int = 2,
max_retry_delay:int = 30000, max_retries:int = 3, retry_jitter:int = 250) -> Any:
Member

Suggested change
def _make_request(self, endpoint: str, parameters: list,min_retry_delay:int = 1000, retry_backoff_factor:int = 2,
max_retry_delay:int = 30000, max_retries:int = 3, retry_jitter:int = 250) -> Any:
def _make_request(
    self,
    endpoint: str,
    parameters: list,
    min_retry_delay: int = 1000,
    retry_backoff_factor: int = 2,
    max_retry_delay: int = 30000,
    max_retries: int = 3,
    retry_jitter: int = 250,
) -> Any:

@wakamex
Author

wakamex commented Nov 28, 2022

checked that it still works with latest ape version 0.5.7.dev5+gf8d4ab07
using this test notebook https://github.com/wakamex/ape-stuff/blob/main/hello_ape.ipynb
you can see the alchemy provider config is basically empty right now (4th cell):

provider.config of type=<class 'ape.api.config.PluginConfig'>
 env_prefix = 
 env_file = None
 env_file_encoding = None
 env_nested_delimiter = None
 secrets_dir = None
 validate_all = True
 extra = forbid
 arbitrary_types_allowed = True
 case_sensitive = False
 prepare_field = <classmethod(<cyfunction BaseSettings.Config.prepare_field at 0x0000015911EE97D0>)>
 customise_sources = <classmethod(<cyfunction BaseSettings.Config.customise_sources at 0x0000015911EE98A0>)>
 parse_env_var = <classmethod(<cyfunction BaseSettings.Config.parse_env_var at 0x0000015911EE9970>)>

@@ -1,8 +1,10 @@
-import os
+import os, time
+import numpy.random as random
Member

Is there any way we can use random instead? It is a built-in Python module. numpy is probably an indirect dependency of Ape and this plugin, but it is still protocol to include it in setup.py... Thus, I think if we could use the built-in random module, it would be better. It has the same method you use here, randint.

Suggested change
import numpy.random as random
import random

@fubuloubu
Member

Found this cool package today: https://pyratelimiter.readthedocs.io/en/latest/

@wakamex wakamex mentioned this pull request Jan 14, 2023
27 tasks
@gosuto-inzasheru
Contributor

love this! what is status? would hate to see this stale...

@fubuloubu
Member

love this! what is status? would hate to see this stale...

Quite stale, needs another champion unless @wakamex comes back and implements the feedback. Just a few things to add, and we can also implement this for other providers.

@wakamex
Author

wakamex commented Apr 18, 2023

love this! what is status? would hate to see this stale...

Quite stale, needs another champion unless @wakamex comes back and implements the feedback. Just a few things to add, and we can also implement this for other providers.

i have been summoned! been pretty heads down on other stuff since December.. but I learned a lot about incorporating PR comments since then 😅

lemme look at this again, I'll ping in discord with any questions

@wakamex
Author

wakamex commented Jan 25, 2024

I learned how to use custom configs!

added concurrency and block_page_size to the custom config, so we can see nice outputs like this:

print(f"provider.config of type={type(context.provider.config)}")
for k,v in context.provider.config.__dict__.items():
    if not k.startswith('__'):
        print(f" {k} = {v}")
provider.config of type=<class 'ape_alchemy.provider.AlchemyConfig'>
 concurrency = 1
 block_page_size = 250000
 min_retry_delay = 1000
 retry_backoff_factor = 2
 max_retry_delay = 30000
 max_retries = 3
 retry_jitter = 250
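For reference, fields like these would presumably be set from the project's ape-config.yaml. The `alchemy:` key and field names below mirror the config dump above, but treat this as a hypothetical sketch rather than documented plugin config:

```yaml
# Hypothetical ape-config.yaml fragment mirroring the AlchemyConfig fields above
alchemy:
  concurrency: 1            # single thread, since the backoff assumes serial requests
  block_page_size: 250000   # blocks fetched per query page
  min_retry_delay: 1000     # ms before the first retry
  retry_backoff_factor: 2   # delay multiplier per attempt
  max_retry_delay: 30000    # ms cap on the backoff delay
  max_retries: 3
  retry_jitter: 250         # ms of random jitter added to each retry
```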

I set block_page_size to a really big number, though I don't see any reason it shouldn't be infinite, since it seems to act as an upper limit. Here are my test results:

default:
    alchemy through ape took 78.44 seconds for 1291 events (events per second: 16.46)
block_page_size = 50_000:
    alchemy through ape took 0.85 seconds for 35 events (events per second: 41.09)
    alchemy through ape took 2.37 seconds for 1290 events (events per second: 544.34)
    alchemy through ape took 61.28 seconds for 40281 events (events per second: 657.27)
block_page_size = 10_000:
    alchemy through ape took 0.84 seconds for 35 events (events per second: 41.64)
    alchemy through ape took 1.72 seconds for 1290 events (events per second: 750.73)
    alchemy through ape took 29.22 seconds for 40281 events (events per second: 1378.50)
block_page_size = 50_000:
    alchemy through ape took 0.88 seconds for 35 events (events per second: 39.91)
    alchemy through ape took 1.09 seconds for 1290 events (events per second: 1179.53)
    alchemy through ape took 13.13 seconds for 40281 events (events per second: 3067.61)
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    INFO (ape-alchemy): Alchemy compute units exceeded, retrying, attempt 1/3 in 1000 ms
    alchemy through ape took 57.42 seconds for 123991 events (events per second: 2159.28)
block_page_size = 250_000:
    alchemy through ape took 0.64 seconds for 35 events (events per second: 54.67)
    alchemy through ape took 1.18 seconds for 1290 events (events per second: 1092.30)
    alchemy through ape took 12.72 seconds for 40281 events (events per second: 3165.94)
    alchemy through ape took 39.44 seconds for 123991 events (events per second: 3143.90)
infura default:
    infura through ape took 0.46 seconds for 35 events (events per second: 76.43)
    infura through ape took 1.01 seconds for 1290 events (events per second: 1276.06)
    infura through ape took 13.15 seconds for 40281 events (events per second: 3062.54)
    infura through ape took 41.21 seconds for 123991 events (events per second: 3008.86)

Member

@antazoey antazoey left a comment

The config is awesome and well thought out, but as-is I don't think it applies to all requests made to Alchemy with this provider, because not all requests use the internal _make_request() method: some use higher-level web3.py methods which have their own make_request within web3.py.

"""Configuration for Alchemy.

Fields
------
Member

we use Google-style docs!

else AlchemyProviderError
)
raise cls(message) from err
def _make_request(
Member

this is awesome but I don't think it'll apply to all the requests. Maybe that is ok for a first pass..

But we need to use something like web3.py middleware so that all requests have rate-limiting handling logic.

Member

@wakamex This is still the main reason we haven't hammered this through the gate... .make_request is not called for every request, and it is not guaranteed to be called at all unless we refactor Web3Provider from core or come up with a plan.

Right now, make_request serves as a blocker-prevention... Meaning, if an API is not available at the Ape level, you can make a raw request to the provider (whether that be HTTP, WS, or under-the-hood is up to the plugin's implementation) and still script out whatever you need. It is not a function called for every request. So we need a way to ensure every request made to Alchemy is rate-limited, regardless of whether it uses .make_request (as a work-around or optimized call) or some other method such as web3.eth.get_balance. This make_request is not the same as web3.provider.make_request.
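One way to cover every request, along the lines suggested here, is a web3.py-style middleware. A rough sketch follows; the factory shape (taking `make_request` and `w3`) follows web3.py's middleware convention, but the error string and retry schedule are placeholders, not the plugin's actual logic:

```python
import time

def rate_limit_retry_middleware(make_request, w3):
    """web3.py-style middleware factory that retries rate-limited calls.

    Sketch only: the rate-limit detection string and backoff schedule
    are illustrative placeholders.
    """
    def middleware(method, params):
        for attempt in range(3):
            try:
                return make_request(method, params)
            except Exception as err:
                # Only retry rate-limit-style errors; re-raise everything else.
                if "exceeded its compute units" not in str(err):
                    raise
                time.sleep(2**attempt)  # exponential backoff: 1s, 2s, 4s
        raise RuntimeError("rate limited after all retries")

    return middleware

# Registering it (sketch): w3.middleware_onion.add(rate_limit_retry_middleware)
```

Because every web3.py call flows through the middleware onion, this would also cover higher-level helpers like web3.eth.get_balance, which is the gap discussed above.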

else AlchemyProviderError
)
raise cls(message) from err
def _make_request(
Member

The base class has this same, internal method with different kwargs now, I wonder if that will cause problems?

Author

if they're strictly after the existing (unchanged) signature, then I think it's probably fine, since they're strictly optional. Baking them into the parameters parameter would clean this up, but I have no idea where that's created or what it's used for.

@wakamex
Author

wakamex commented Jan 25, 2024

removed class fields since every Alchemy class should have access to the AlchemyConfig I believe. Also removed network_uris since mypy didn't complain.

@wakamex
Author

wakamex commented Jan 25, 2024

not all requests use the internal _make_request() method: some use higher-level web3.py methods which have their own make_request within web3.py

which other methods need testing? I only tested this with query:

curve_steth = Contract("0xDC24316b9AE028F1497c275EB9192a3Ea0f67022")
events = curve_steth.TokenExchange.query("*", start_block=chain.blocks[-1].number - int(86400*365/12))  # last 1 year

@wakamex
Author

wakamex commented Jan 26, 2024

did you just merge your main into my main?

@wakamex
Author

wakamex commented Jan 26, 2024

try me now
[screenshot]

fubuloubu previously approved these changes Jan 26, 2024
@fubuloubu
Member

Note: rate limit test would be hard and exhaust our already limited requests

@antazoey
Member

Note: rate limit test would be hard and exhaust our already limited requests

this implies you can't test Python logic without making live network requests.
I am not saying I am requiring tests, but I am calling this out as a false excuse.

@wakamex
Author

wakamex commented Jan 26, 2024 via email

@fubuloubu
Member

Note: rate limit test would be hard and exhaust our already limited requests

this implies you can't test Python logic without making live network requests.

That's fair, can unit test it with mocks and that works

@fubuloubu
Member

with big block_page_size we can rest with 1 request that gets lots of
data. but since Alchemy uses "compute units" it'll still count lots toward
your rate limit. but I doubt you use your Alchemy API key a ton?

We do use it quite a bit for some other plugins too, plus our own private testing use, so it's been a problem in the past

@wakamex
Author

wakamex commented Feb 7, 2024

found a benchmark to justify infinite block page size

connection = "alchemy through ape"
context = networks.parse_network_choice('ethereum:mainnet:alchemy')
context.__enter__();
legacy_abi = '[{"anonymous": false,"inputs": [{ "indexed": false, "name": "postTotalPooledEther", "type": "uint256" },{ "indexed": false, "name": "preTotalPooledEther", "type": "uint256" },{ "indexed": false, "name": "timeElapsed", "type": "uint256" },{ "indexed": false, "name": "totalShares", "type": "uint256" }],"name": "PostTotalShares","type": "event"}]'
oracle1 = Contract("0x442af784A788A5bd6F42A01Ebe9F287a871243fb", abi=legacy_abi)  # steth legacy oracle
v2_abi = '[{"anonymous": false,"inputs": [{ "indexed": true, "name": "reportTimestamp", "type": "uint256" },{ "indexed": false, "name": "timeElapsed", "type": "uint256" },{ "indexed": false, "name": "preTotalShares", "type": "uint256" },{ "indexed": false, "name": "preTotalEther", "type": "uint256" },{ "indexed": false, "name": "postTotalShares", "type": "uint256" },{ "indexed": false, "name": "postTotalEther", "type": "uint256" },{ "indexed": false, "name": "sharesMintedAsFees", "type": "uint256" }],"name": "TokenRebased","type": "event"}]'
steth = Contract("0xae7ab96520de3a18e5e111b5eaab095312d7fe84",abi=v2_abi)  # Lido v2 main contract

# %%
# local node through ape took 41.28 seconds for 1011 events (events per second: 24.49)
# 250k block page size: alchemy through ape took 10.33 seconds for 1012 events (events per second: 97.95)
# 2.5m block page size: alchemy through ape took 1.60 seconds for 1012 events (events per second: 631.47)
# 25m block page size: alchemy through ape took 0.90 seconds for 1012 events (events per second: 1128.32)
# infura through ape took 9.62 seconds for 1012 events (events per second: 105.17)
start_time = time.time()
events = oracle1.PostTotalShares.query("*")
print(f"{connection} took {time.time()-start_time:.2f} seconds for {len(events)} events (events per second: {len(events)/(time.time()-start_time):.2f})")

# %%
# local node through ape took 69.69 seconds for 267 events (events per second: 3.83)
# 250k block page size: alchemy through ape took 11.92 seconds for 268 events (events per second: 22.48)
# 2.5m block page size: alchemy through ape took 1.34 seconds for 268 events (events per second: 200.64)
# 25m block page size: alchemy through ape took 0.90 seconds for 1012 events (events per second: 1128.32)
# infura through ape took 9.39 seconds for 268 events (events per second: 28.55)
start_time = time.time()
events2 = steth.TokenRebased.query("*")
print(f"{connection} took {time.time()-start_time:.2f} seconds for {len(events2)} events (events per second: {len(events2)/(time.time()-start_time):.2f})")

@wakamex
Author

wakamex commented Feb 7, 2024

how would I mock up some tests? with like a MockLaggyProvider to connect to?

@antazoey
Member

antazoey commented Feb 7, 2024

how would I mock up some tests? with like a MockLaggyProvider to connect to?

Using pytest-mock, you can basically mock anything in Python land.
Sometimes I mock web3 and then configure some side-effect responses which can simulate waiting.

If it is too complex, I'd be happy to take a swing at it and share how I did it
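A mock-based test along those lines might look like the sketch below. `retry_request` is a hypothetical stand-in for the PR's retry wrapper, written inline so the example is self-contained; the real test would patch the plugin's provider instead:

```python
import time
from unittest.mock import MagicMock, patch

def retry_request(request, retries=3):
    """Minimal stand-in for the PR's retry logic (illustrative only)."""
    for attempt in range(retries):
        try:
            return request()
        except Exception as err:
            if "exceeded its compute units" not in str(err):
                raise
            time.sleep(2**attempt)  # backoff placeholder
    raise RuntimeError("rate limited after all retries")

def test_retries_on_rate_limit():
    # side_effect fails twice with a rate-limit-style error, then succeeds
    mock = MagicMock(
        side_effect=[
            Exception("exceeded its compute units"),
            Exception("exceeded its compute units"),
            "ok",
        ]
    )
    with patch("time.sleep") as sleep:  # no real waiting during the test
        assert retry_request(mock) == "ok"
    assert mock.call_count == 3
    assert sleep.call_count == 2
```

The side_effect list is what makes this work without any live network: each call to the mock pops the next item, raising it if it's an exception, so the retry path is exercised deterministically.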

antazoey previously approved these changes Feb 7, 2024
@wakamex
Author

wakamex commented Feb 7, 2024

I can take a look; pytest-mock sounds like it should be easy.

message = str(err)
if any(
error in message
for error in ["exceeded its compute units", "Too Many Requests for url"]
Member

I think this is the main part that changes per provider (the specific error string to look for). I am thinking we may do what you suggested and move some of this to core so that, when using default node connections (ape-geth), we can still receive this benefit.

Maybe we can add a method to the base class that can be overridden, like:

def is_rate_limit_error(self, err):
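Fleshed out, that hook might look roughly like this. Both class bodies are sketches: the base-class default in particular is invented for illustration, not actual ape core code.

```python
class Web3Provider:
    """Stand-in for the core provider base class (sketch only)."""

    def is_rate_limit_error(self, err: Exception) -> bool:
        # Generic default; a rate-limiting provider typically replies HTTP 429.
        return "Too Many Requests" in str(err)


class AlchemyProvider(Web3Provider):
    """Override using the Alchemy-specific strings checked in this PR."""

    def is_rate_limit_error(self, err: Exception) -> bool:
        message = str(err)
        return any(
            error in message
            for error in ("exceeded its compute units", "Too Many Requests for url")
        )
```

The retry loop in core would then call self.is_rate_limit_error(err) instead of hard-coding provider-specific strings, and each plugin only overrides the one predicate.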

delay. Defaults to 250 milliseconds.
"""

concurrency: int = 1 # can't do exponential backoff with multiple threads
Member

Does this make sense to expose then?

@wakamex
Author

wakamex commented May 13, 2024

started mocking up a test for the exponential backoff

oh, and I noticed a drawback of setting the page size really high (when there are more than X results, Alchemy throws an error)


Successfully merging this pull request may close these issues.

bug: rate limiting not being handled
add rate limit logic
4 participants