Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize getting Price data #16

Open
wants to merge 60 commits into
base: main
Choose a base branch
from
Open

Conversation

scientes
Copy link
Contributor

@scientes scientes commented Mar 5, 2021

Closes #14

This tries to fetch the price data in batches from an supported exchange (must be supported by ccxt)
if this fails for certain datapoints it does nothing and thus lets the current implementation try to fetch the price data.
The current state works, but is not very well documented and may leave room for improvement in certain areas.

This was only tested using my personal dataset of transactions with Binance.
Fetching Batches from multiple exchanges is not Supported yet.

Also produces very much output in the logs due to ccxt on logging.DEBUG. This should maybe be fixed.

Suggestions Welcome

Copy link
Owner

@provinzio provinzio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @scientes,

thank you for your hard work. I really appreaciate the time you took to develop the grouped price data import.

I took quite some time, too, to think about your implementation. Some of my ideas might be overkill, bullshit or need some further tuning. I hope, that you don't feel overrun by all of this and that we can stay in a dialog to get this into CoinTaxman. Let me know, how you feel about my review.

Some of my comments are my personal recommandation, you don't have to bend for them. Feel free to stay with your coding style.

As you already indicated, I would like to see some documentation, docstrings on important functions and type annotations.

Lets rock this together :)

src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
src/price_data.py Outdated Show resolved Hide resolved
@scientes
Copy link
Contributor Author

scientes commented Mar 10, 2021

I sadly do not have the time to work on this further this month, but ill try and finish it to the beginning of the next month.
But if anybody wants to make changes that move this PR forward go ahead

@provinzio provinzio marked this pull request as draft April 1, 2021 06:51
@scientes
Copy link
Contributor Author

scientes commented Apr 4, 2021

@provinzio i just pushed a PoC for a graph based approach. this approach is atm a standalone and not yet integrated into the actual program. Also it does not look pretty and could have been coded better at some places. But i wanted to share my current progress so you could have a first look at it and find stuff i missed.

Things i know that need improvement:

  • there is not that much in the way of configuration possible
  • if a exchange does not support 1Week candles the active timeframe of a pair can not be determined and it is ignored a fallback to 1d candles is needed here. Also if the timeframe is larger than 1000 candles (19 years in 1w candles 2.7 years in 1d candles) errors will be thrown
  • I wanted to sort the paths by volume but that is not possible without requesting the data for all paths wich takes a long time 1-2s per path and there are about 5-100 paths depending on max path length and amount of pairs available for a pair
  • the pathfinding algo is a little wasteful due to it calculating all possible paths and can be slow with many pairs (with 2800 pairs, 7 exchanges,it took about 3-4s to find all paths)

But on the upside

This should make the price calculation multi exchange ready and when implemented properly should provide ample fallback paths so that the price calculation should always work, even with every fiat currency.

src/price_data.py Outdated Show resolved Hide resolved
@provinzio
Copy link
Owner

I like the general idea and would love to see the end result. I wonder why cctx does not support anything like this. Looks like a basic feature for me.

I wanted to sort the paths by volume but that is not possible without requesting the data for all paths wich takes a long time 1-2s per path and there are about 5-100 paths depending on max path length and amount of pairs available for a pair

Are we able, to get all pairs with one request or something similar? The algorithm might profit from fetching a lot of data and preprocessing it.

@scientes
Copy link
Contributor Author

scientes commented Apr 6, 2021

I like the general idea and would love to see the end result. I wonder why cctx does not support anything like this. Looks like a basic feature for me.

I wanted to sort the paths by volume but that is not possible without requesting the data for all paths wich takes a long time 1-2s per path and there are about 5-100 paths depending on max path length and amount of pairs available for a pair

Are we able, to get all pairs with one request or something similar? The algorithm might profit from fetching a lot of data and preprocessing it.

well yes in a way i can fetch information per exchange on which pairs it offers but i have not found a way to determine per pair from when to when it was active, and because of that i currently request the weekly candles per pair to determine the timeframe in which the pair was traded as a side bonus i get the volume information per week

@scientes
Copy link
Contributor Author

scientes commented Apr 6, 2021

I wonder why cctx does not support anything like this. Looks like a basic feature for me.

well cctx is supposed to be a libary which standartises the apis for the exchanges for the end user nothing more nothing less and i suppose that most exchanges do not provide the data which describes from when to when a pair was active.

i could however sort the pairs/paths by current volume but that does not really help. also the volume feature is not really needed and is a small side bonus to the actual purpose which is to find a price for every trade, which i think that my current implementation can achieve

@scientes scientes marked this pull request as ready for review April 8, 2021 08:04
@scientes scientes requested a review from provinzio April 8, 2021 08:04
@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

The current implementation is tested with my data and works for me. the intial pair takes some time to find the right path when we have old data (2017) but due to caching after the initial pair it should be pretty fast

@provinzio
Copy link
Owner

@scientes could you rebase your branch before I review?

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

i hope i didn't mess anything up. i do not rebase that often^^

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

after rebase i get this error:

  File "/home/bastian/Documents/git/CoinTaxman/src/main.py", line 20, in <module>
    from book import Book
  File "/home/bastian/Documents/git/CoinTaxman/src/book.py", line 25, in <module>
    import misc
  File "/home/bastian/Documents/git/CoinTaxman/src/misc.py", line 40, in <module>
    L = TypeVar("L", bound=list[Any])
TypeError: 'type' object is not subscriptable

@provinzio
Copy link
Owner

after rebase i get this error:

  File "/home/bastian/Documents/git/CoinTaxman/src/main.py", line 20, in <module>
    from book import Book
  File "/home/bastian/Documents/git/CoinTaxman/src/book.py", line 25, in <module>
    import misc
  File "/home/bastian/Documents/git/CoinTaxman/src/misc.py", line 40, in <module>
    L = TypeVar("L", bound=list[Any])
TypeError: 'type' object is not subscriptable

What is your python version? This shouldn't be a problem with 3.9

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

well yeah code switched to 3.8.5 without me noticing

@provinzio
Copy link
Owner

provinzio commented Apr 8, 2021

I am surprised that I can not see the workflow checks directly in this PR..
whatever :D could you fix the checks? You can have a look into the Makefile to see how you can run the formatter/linter.

You should run at least black and isort.

It would be awesome if you can fix the mypy errors, which help incredibly to find bugs related to wrong types.

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

Well i get Emails that the Pipelines are running and failing (flake8 and mypy)

@provinzio
Copy link
Owner

i hope i didn't mess anything up. i do not rebase that often^^

I don't really get how you combined the branches.
It looks like a rebase with an additional merge. The commit history tree looks really fucked up :D You might want to look into it to avoid unnoticed merging errors.
It could be the case, that your push after the rebase wasn't a git push --force, but a merge onto the server... but for some weird reason, the merging commit shows differences, which shouldn't be possible?

Rebasing should work with git rebase origin/main and git push --force to overwrite the server commit history.

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

Well i just used the rebase function in vs Code. And after that it showed 11 pulls from origin dunno why

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

Ah i missed the force and thats why the merge 😄

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

The differences i Do Not understand

@provinzio
Copy link
Owner

provinzio commented Apr 8, 2021

Well i just used the rebase function in vs Code. And after that it showed 11 pulls from origin dunno why

I'd really would like to keep a clean commit history. But I guess it started already when you merged the newest changes into your main instead of rebasing (a40e2ef).

But I don't want you to have a lot of work because of my stupid ideals. While fixing I even found an unwanted merging error. I fixed the history in my Branch @scientes. It has the same data as 5278f0d, which is the current head of your branch ohlcv-batch.

You could do (from your branch ohlcv-batch) git reset --hard origin/@scientes and git push --force to overwrite your branch with mine.

If you have more data right now: there are different ways how we could bring them togther. For example: Commit all your changes to ohlcv-batch. Create temporary branch to save this point git checkout -b tmp. Do the reset. Cherry-pick the additional commits from your tmp branch onto the new ohlcv-batch.

Please excuse my sturdiness. I hope I dont scare you off, but we might profit from a clean commit history later on.


Edit: Btw you can compare the branches with the excellent VS Code Extension Git Tree compare to see for yourself.

@scientes
Copy link
Contributor Author

scientes commented Apr 8, 2021

i think git doesnt like the branch name @scientes eToro and main work on reset but @scientes fails with

git reset --hard  origin/@scientes
fatal: ambiguous argument 'origin/@scientes': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

edit: forgot to fetch from all remotes (not only my fork)

@Griffsano
Copy link
Contributor

Hey @scientes,

for the ohlcv-batch branch, I'm getting inaccurate price data for Kraken exports (see table below). The fetched prices are constant over multiple timestamps. I'm working on commit bbd8bb8, and added "kraken" to the EXCHANGES list in config.py.

Console output (for each entry):

2021-12-28 17:58:15,444 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for BTC
2021-12-28 17:58:17,073 price_data   DEBUG    found path over BTC/EUR (kraken)

However, if you replace BTC by e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):

2021-12-28 17:54:11,570 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for XBT
...
2021-12-28 17:54:11,598 price_data   DEBUG    Querying trades for XXBTZEUR at 2021-11-02 12:00:00+00:00 (offset=10m): Calling https://api.kraken.com/0/public/Trades?pair=XXBTZEUR&since=1635853800000000000

Have you experienced something similar? Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)?

Inaccurate price data in kraken.db, table BTC/EUR:

utc_time price
2021-11-01 12:00:00+00:00 43383.85
2021-11-02 12:00:00+00:00 43383.85
2021-11-03 12:00:00+00:00 43383.85
2021-11-04 12:00:00+00:00 43383.85
2021-11-05 12:00:00+00:00 43383.85
2021-11-06 12:00:00+00:00 43383.85
2021-11-07 12:00:00+00:00 43383.85
2021-11-08 12:00:00+00:00 43356.85
2021-11-09 12:00:00+00:00 43356.85
2021-11-10 12:00:00+00:00 43356.85
2021-11-11 12:00:00+00:00 43356.85
2021-11-12 12:00:00+00:00 43356.85
2021-11-13 12:00:00+00:00 43356.85
2021-11-14 12:00:00+00:00 43356.85
2021-11-15 12:00:00+00:00 43356.85
2021-11-16 12:00:00+00:00 43356.85
2021-11-17 12:00:00+00:00 43356.85
2021-11-18 12:00:00+00:00 43356.85
2021-11-19 12:00:00+00:00 43397.3
2021-11-20 12:00:00+00:00 43397.3
2021-11-21 12:00:00+00:00 43397.3
2021-11-22 12:00:00+00:00 43397.3
2021-11-23 12:00:00+00:00 43397.3
2021-11-24 12:00:00+00:00 43397.3
2021-11-25 12:00:00+00:00 43397.3
2021-11-26 12:00:00+00:00 43397.3
2021-11-27 12:00:00+00:00 43397.3
2021-11-28 12:00:00+00:00 43397.3
2021-11-29 12:00:00+00:00 43397.3
2021-11-30 12:00:00+00:00 43394.15

Dummy CSV export file:

"txid","refid","time","type","subtype","aclass","asset","amount","fee","balance"
"","","2021-11-01 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-01 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-03 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-03 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-05 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-05 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-07 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-07 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-09 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-09 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-11 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-11 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-13 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-13 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-15 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-15 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-17 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-17 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-19 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-19 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-21 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-21 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-23 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-23 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-25 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-25 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-27 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-27 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-29 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-29 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0

@scientes
Copy link
Contributor Author

scientes commented Dec 28, 2021

Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)?

Shouldn't be. The weekly calls are only for looking for the time period in which the pair Was traded on the platform. For the actual data 1m candles are used.

However, if you replace BTC by e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):

Weird normally ccxt should convert the Symbols automatically to the exchange Variant. But maybe theres some Bug or wrong Implementation on Our side there

Ill Look into it tomorrow

@scientes
Copy link
Contributor Author

Hey @scientes,

for the ohlcv-batch branch, I'm getting inaccurate price data for Kraken exports (see table below). The fetched prices are constant over multiple timestamps. I'm working on commit bbd8bb8, and added "kraken" to the EXCHANGES list in config.py.

Console output (for each entry):

2021-12-28 17:58:15,444 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for BTC
2021-12-28 17:58:17,073 price_data   DEBUG    found path over BTC/EUR (kraken)

However, if you replace BTC by e.g. XXBT (the "dirty" Kraken coin name), the "old" API call is made and the prices look more realistic (they vary for each timestamp):

2021-12-28 17:54:11,570 price_data   INFO     getting data from 01-Nov-2021 (12:00) to 01-Nov-2021 (12:00) for XBT
...
2021-12-28 17:54:11,598 price_data   DEBUG    Querying trades for XXBTZEUR at 2021-11-02 12:00:00+00:00 (offset=10m): Calling https://api.kraken.com/0/public/Trades?pair=XXBTZEUR&since=1635853800000000000

Have you experienced something similar? Could it be that the resolution of the OHLCV is too low (e.g. only weekly data)?

Inaccurate price data in kraken.db, table BTC/EUR:
utc_time price
2021-11-01 12:00:00+00:00 43383.85
2021-11-02 12:00:00+00:00 43383.85
2021-11-03 12:00:00+00:00 43383.85
2021-11-04 12:00:00+00:00 43383.85
2021-11-05 12:00:00+00:00 43383.85
2021-11-06 12:00:00+00:00 43383.85
2021-11-07 12:00:00+00:00 43383.85
2021-11-08 12:00:00+00:00 43356.85
2021-11-09 12:00:00+00:00 43356.85
2021-11-10 12:00:00+00:00 43356.85
2021-11-11 12:00:00+00:00 43356.85
2021-11-12 12:00:00+00:00 43356.85
2021-11-13 12:00:00+00:00 43356.85
2021-11-14 12:00:00+00:00 43356.85
2021-11-15 12:00:00+00:00 43356.85
2021-11-16 12:00:00+00:00 43356.85
2021-11-17 12:00:00+00:00 43356.85
2021-11-18 12:00:00+00:00 43356.85
2021-11-19 12:00:00+00:00 43397.3
2021-11-20 12:00:00+00:00 43397.3
2021-11-21 12:00:00+00:00 43397.3
2021-11-22 12:00:00+00:00 43397.3
2021-11-23 12:00:00+00:00 43397.3
2021-11-24 12:00:00+00:00 43397.3
2021-11-25 12:00:00+00:00 43397.3
2021-11-26 12:00:00+00:00 43397.3
2021-11-27 12:00:00+00:00 43397.3
2021-11-28 12:00:00+00:00 43397.3
2021-11-29 12:00:00+00:00 43397.3
2021-11-30 12:00:00+00:00 43394.15

Dummy CSV export file:

"txid","refid","time","type","subtype","aclass","asset","amount","fee","balance"
"","","2021-11-01 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-01 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-02 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-03 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-03 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-04 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-05 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-05 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-06 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-07 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-07 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-08 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-09 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-09 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-10 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-11 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-11 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-12 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-13 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-13 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-14 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-15 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-15 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-16 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-17 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-17 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-18 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-19 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-19 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-20 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-21 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-21 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-22 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-23 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-23 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-24 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-25 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-25 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-26 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-27 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-27 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-28 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0
"","","2021-11-29 12:00:00","receive","","currency","BTC",1.0,0.0,0.0
"","","2021-11-29 12:00:00","spend","","currency","ZEUR",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","spend","","currency","BTC",-1.0,0.0,0.0
"","","2021-11-30 12:00:00","receive","","currency","ZEUR",1.0,0.0,0.0

this error is extremely weird and kraken specific i think:
the data reported by kraken through ccxt is off

start,stop,candle_ts,symbol,open,high,low,close,avg
1635768000000,1635768000000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635768000000,1635768000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1635854400000,1635854400000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635854400000,1635854400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1635940800000,1635940800000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1635940800000,1635940800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636027200000,1636027200000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636027200000,1636027200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636113600000,1636113600000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636113600000,1636113600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636200000000,1636200000000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636200000000,1636200000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636286400000,1636286400000,1640923200000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1636286400000,1636286400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636372800000,1636372800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636372800000,1636372800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636459200000,1636459200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636459200000,1636459200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636545600000,1636545600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636545600000,1636545600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636632000000,1636632000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636632000000,1636632000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636718400000,1636718400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636718400000,1636718400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636804800000,1636804800000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636804800000,1636804800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636891200000,1636891200000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636891200000,1636891200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1636977600000,1636977600000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1636977600000,1636977600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637064000000,1637064000000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1637064000000,1637064000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637150400000,1637150400000,1640923260000,BTC/EUR,41742.1,41742.1,41741.0,41741.1,41741.6
1637150400000,1637150400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637236800000,1637236800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637236800000,1637236800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637323200000,1637323200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637323200000,1637323200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637409600000,1637409600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637409600000,1637409600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637496000000,1637496000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637496000000,1637496000000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637582400000,1637582400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637582400000,1637582400000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637668800000,1637668800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637668800000,1637668800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637755200000,1637755200000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637755200000,1637755200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637841600000,1637841600000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637841600000,1637841600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1637928000000,1637928000000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1637928000000,1637928000000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638014400000,1638014400000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1638014400000,1638014400000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638100800000,1638100800000,1640923320000,BTC/EUR,41741.0,41741.0,41741.0,41741.0,41741.0
1638100800000,1638100800000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638187200000,1638187200000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638187200000,1638187200000,1640923440000,BTC/EUR,41741.1,41741.1,41692.4,41720.3,41730.7
1638273600000,1638273600000,1640923380000,BTC/EUR,41741.1,41741.1,41741.0,41741.0,41741.05
1638273600000,1638273600000,1640923440000,BTC/EUR,41741.1,41741.1,41692.4,41720.3,41730.7

as you can see the time frame requested and the timestamp of the candle which is reported back vary significantly. more interesting is that the timestamp reported back is not much older than 12 hours to execution time
the best solution until i fugure out how to get older data from kraken would be to not use kraken for price data at all for now

@scientes
Copy link
Contributor Author

https://www.freqtrade.io/en/stable/exchanges/#historic-kraken-data

The Kraken API does only provide 720 historic candles, which is sufficient for Freqtrade dry-run and live trade modes, but is a problem for backtesting. To download data for the Kraken exchange, using --dl-trades is mandatory, otherwise the bot will download the same 720 candles over and over, and you'll not have enough backtest data.

using trades instead of candles was another approach i wanted to implement in the future as an alternative anyways but I'd rather not do this in this PR. it has taken to long anyways

@scientes
Copy link
Contributor Author

@provinzio i think the current state is merge ready. what do you think?

@provinzio provinzio marked this pull request as draft January 2, 2022 21:20
@provinzio provinzio marked this pull request as ready for review January 2, 2022 21:22
@oldgitdaddy
Copy link

oldgitdaddy commented Jan 3, 2022

Would be great to have this available in a merged fashion together with the latest fix for kraken (#100) on main. Currently, I get also stuck on my analysis of binance trades (recursion problem and some buy/sell issue).

@provinzio
Copy link
Owner

Would be great to have this available in a merged fashion together with the latest fix for kraken (#100) on main. Currently, I get also stuck on my analysis of binance trades (recursion problem and some buy/sell issue).

I try to work on it as soon as possible, this is definitly an awesome PR. I appreciate everybody who helps reviewing the code. Feel free to give your feedback as well.

@scientes scientes mentioned this pull request Jan 3, 2022
* progress bar output: Reduced to one line

* flake8

* variable and file naming

* detailed warning if price already exists in database

* do not preload prices for Kraken
@scientes
Copy link
Contributor Author

scientes commented Jan 21, 2022

Just noticed that ccxt has a very fast release cycle and has almost hourly package updates. I'd suggest unpinning the ccxt version in the requirements.txt package so that when Cointaxman is installed, the newest version of ccxt is always installed.

https://github.com/ccxt/ccxt/tags

@Griffsano
Copy link
Contributor

Just to let you know, I've tested this PR with my Binance/Coinbase logs and it worked flawlessly :)
Awesome graph-theory-based path finding approach, thanks for implementing!

Copy link
Owner

@provinzio provinzio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @scientes,

I am very sorry that my review is taking so long.
Your work on this branch is impressive, but also comprehensive.
I'll try my best, to give your work a good review.

I merged origin/main into this branch so that we keep everything up to date. :)

Please have a look at the left over TODOs in the code and the open conversations.

@@ -516,6 +522,70 @@ def get_price(

return price

def get_missing_price_operations(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need to query the prices for all operation types, e.g. CoinLend and CoinLendEnd.


Although I have a local database with all necessary prices setup, the tool is fetching the same prices on every run.

2022-02-06 12:32:52,967 price_data   DEBUG    found path over BTC/EUR (binance)
2022-02-06 12:32:51,186 price_data   INFO     getting data from 16-May-2021 (02:42) to 16-May-2021 (02:42) for BTC
2022-02-06 12:32:51,526 price_data   DEBUG    found path over BTC/EUR (binance)
2022-02-06 12:32:52,625 price_data   INFO     getting data from 17-May-2021 (02:46) to 17-May-2021 (02:46) for BTC

On every run, this function returns, that the prices for the operation CoinLend is missing, the log output states, that the prices are fetched.. but on the next run, the prices are fetched again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be sufficient to drop all elements of operations that are coin lend, coin lend end, staking, staking end etc. at the beginning of the function? Theoretically this could be different for each country, so it might make sense to differentiate here before dropping the elements.

@@ -60,3 +60,4 @@ def IS_LONG_TERM(buy: datetime, sell: datetime) -> bool:
DATA_PATH = Path(BASE_PATH, "data")
EXPORT_PATH = Path(BASE_PATH, "export")
FIAT = FIAT_CLASS.name # Convert to string.
EXCHANGES = ["binance", "coinbasepro"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment for the EXCHANGES variable. Whats the benefit of the variable? Why do we need it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OHLCV feature is not supported for all exchanges (e.g. Kraken, it works without error, but the returned data is inaccurate and not useable). So I see two options: Define all supported (and tested) exchanges in a list, or define exchanges that should not be used for this feature. I would go with the first option (which is how it is implemented now).

@provinzio
Copy link
Owner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize getting Price data
4 participants