
Question #1

Open

joaosalvado10 opened this issue Dec 15, 2017 · 17 comments

Comments

@joaosalvado10

Hello, this seems like nice work.
I have some questions, though.
When you're comparing on the test data, what does the market value refer to?
In your view, is this a good approach?

@vermouth1992
Owner

vermouth1992 commented Dec 24, 2017

The market value refers to the return of distributing your current investment equally across all stocks. I think it would be better to incorporate news data into the model, as it is very useful for indicating sharp transitions in the market.
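
For reference, a minimal sketch of that equal-distribution baseline, assuming a `close_open_ratio` array of shape (days, stocks); the names are illustrative and not from the repo:

```python
import numpy as np

# Equal-weight "market value" baseline: each day the investment is split
# equally across all stocks, so the day's growth is the average close/open ratio.
def market_value_curve(close_open_ratio, initial_value=1.0):
    daily_growth = close_open_ratio.mean(axis=1)      # equal weights
    return initial_value * np.cumprod(daily_growth)   # compound over days

# Random data standing in for real prices (100 days, 16 stocks).
ratios = 1.0 + 0.01 * np.random.randn(100, 16)
print(market_value_curve(ratios)[-1])
```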

@joaosalvado10
Author

How did you include news? And did you include news for each stock separately?

@vermouth1992
Owner

We didn't include the news in this course project due to time limits. The general idea is to predict a sentiment label (positive, negative, neutral) for each stock at each timestamp and use it to guide the choice made by the price-based prediction. The key assumption is that the stock market follows the statistics of historical prices until a certain turning point happens, which can be reflected in the news.
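
One possible reading of that idea, as a rough sketch: a hypothetical per-stock sentiment label in {-1, 0, +1} vetoes the price-based weights when the news signals a turning point. None of these names or the gating rule come from the project.

```python
import numpy as np

# Use sentiment (-1 negative, 0 neutral, +1 positive) to override price-based weights.
def combine_with_sentiment(price_weights, sentiment):
    weights = price_weights.copy()
    weights[sentiment < 0] = 0.0        # drop stocks with negative news
    total = weights.sum()
    if total == 0:                      # all news negative -> stay in cash
        return np.zeros_like(weights)
    return weights / total              # re-normalize to sum to 1

print(combine_with_sentiment(np.array([0.5, 0.3, 0.2]), np.array([1, -1, 0])))
```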

@joaosalvado10
Author

joaosalvado10 commented Jan 3, 2018

So in order to include the news, it would be necessary to change the model and train it again, right?
At the moment the model can only receive the open, high, low, and close of the 16 stocks, right?
How can I retrain the model?

Also, I found that the model always tries to buy only one stock instead of distributing the money. This could be a nice approach, but isn't it a bit risky?

How would you ensemble the imitation learning and the DDPG?
What kind of improvements do you think should be made, besides the ensemble and using the Sharpe ratio instead of the realized profit?

@vermouth1992
Owner

Yes, the result of imitation learning is to just buy one stock. We tried to optimize the Sharpe ratio directly, but it turned out to be a very difficult problem since it's not a standard MDP. To include news, you have to build another separate model. It would take a considerable amount of time to implement, because collecting and processing the dataset is not easy in the first place.
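
For context, the Sharpe ratio is computed from the whole return series rather than accumulated step by step, which is part of why it is awkward as a per-step RL reward. A small sketch, with an assumed annualization of 252 trading days:

```python
import numpy as np

# Sharpe ratio of a sequence of daily returns: a trajectory-level statistic,
# not a sum of per-step rewards.
def sharpe_ratio(daily_returns, risk_free=0.0, periods_per_year=252):
    excess = np.asarray(daily_returns) - risk_free
    return np.sqrt(periods_per_year) * excess.mean() / (excess.std() + 1e-8)

print(sharpe_ratio(0.001 + 0.01 * np.random.randn(252)))
```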

@joaosalvado10
Author

Yes, but assuming I already had, say, the sentiment of the news, I think it would not take that much effort.
Also, I found that this model is only capable of buying (going long); it is not capable of selling (going short). Wouldn't it make sense to include that in the model?

@vermouth1992
Owner

There is an assumption about the trading rule: buy at the open price, sell all holdings at the close price on the same day, and repeat.
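
Under that rule, with weights summing to 1, the day's portfolio growth is just the weighted average of close/open ratios. A small sketch with made-up numbers (names are illustrative):

```python
import numpy as np

# Buy at the open, sell everything at the close of the same day:
# the day's growth factor is the weighted sum of close/open ratios.
def daily_growth(weights, open_prices, close_prices):
    return float(np.dot(weights, close_prices / open_prices))

w = np.array([0.25, 0.25, 0.25, 0.25])
print(daily_growth(w, np.array([10., 20., 30., 40.]),
                   np.array([10.1, 19.8, 30.3, 40.4])))
```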

@joaosalvado10
Author

joaosalvado10 commented Jan 5, 2018

Yes, I know, but besides buying a stock it is also possible to short one, which is the opposite of buying it. To do this, the output layer of the agent would have to output values between -1 and 1 instead of 0 and 1, and the reward function would also need to be changed to account for it. It is good to include this because, if there is a crash in the stock market, it is important that the agent stop buying and start shorting instead.
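
An illustrative sketch of that change (not the project's actual network or reward): squash the actor output with tanh so weights can be negative, and pay the agent the weighted price change so a short position earns when the price falls.

```python
import numpy as np

# Actor output squashed to [-1, 1]; a negative weight is a short position.
def actor_output(logits):
    return np.tanh(logits)

# One possible reward under the same buy-at-open / sell-at-close rule:
# weight times (close/open - 1), so shorts profit when the ratio is below 1.
def reward_with_shorts(weights, close_open_ratio):
    return float(np.dot(weights, close_open_ratio - 1.0))

print(actor_output(np.array([2.0, -2.0])))                               # ~[0.96, -0.96]
print(reward_with_shorts(np.array([-0.5, 0.5]), np.array([0.98, 1.02])))  # 0.02
```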

When I said that the agent always outputs 1 for one stock, I was referring to the DDPG agent. I mean the agent does not output exactly 1, but it outputs a value really close to 1 for one stock while outputting values really close to zero for the other stocks.

I found this, have a look:
https://github.com/ZhengyaoJiang/PGPortfolio

@joaosalvado10
Author

I changed the code so that instead of feeding the network only the close/open ratio, it is fed the open, high, low, close and stock news. However, when I do this the network does not learn anymore. Is there any reason why you feed only the close/open ratio, and how would you change it?

@vermouth1992
Owner

You need to normalize the data first.

@joaosalvado10
Author

I did normalize it.
How would you give the network the possibility to short a stock? I think that is also important.

@joaosalvado10
Author

Also, I found that during the test the output of the network is almost always 1 for one stock and zero for the others. This cannot be the best choice, so do you know how to solve this?

@vermouth1992
Owner

The best choice, if we knew the future, would actually be to invest everything in the single stock with the highest return. The reason we split our investment is to reduce the risk of a false prediction. If you are running imitation learning, then this behavior is actually expected. For DDPG, it's very hard to tell; you need to look at the actual trading performance.
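
In other words, the oracle (and hence the imitation-learning target) is a one-hot allocation on the best stock of the day. A small sketch with made-up ratios, names illustrative:

```python
import numpy as np

# With perfect foresight, put everything in the stock with the highest
# close/open ratio for the coming day.
def oracle_weights(next_close_open_ratio):
    w = np.zeros_like(next_close_open_ratio)
    w[np.argmax(next_close_open_ratio)] = 1.0
    return w

ratios = np.array([1.01, 0.99, 1.03, 1.00])
print(oracle_weights(ratios))                                       # [0. 0. 1. 0.]
print(float(np.dot(oracle_weights(ratios), ratios)), float(ratios.mean()))
```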

@joaosalvado10
Author

In fact that is right; however, when I looked at the output produced by the DDPG, it seems like it always makes the same decision. (I am using the pretrained algorithm with different stocks, trying to see if it is capable of generalizing.)

@vermouth1992
Owner

Maybe you want to take a look at the input scale. The pretrained model assumes the input is the close/open ratio, normalized as (x - 1) / 100.
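
That normalization, written out as a small sketch (the function name is illustrative):

```python
import numpy as np

# The pretrained model's expected input: close/open ratio x, scaled as (x - 1) / 100.
def normalize_close_open(open_prices, close_prices):
    x = np.asarray(close_prices) / np.asarray(open_prices)
    return (x - 1.0) / 100.0

print(normalize_close_open([100.0, 50.0], [101.0, 49.5]))   # [ 0.0001 -0.0001]
```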

@joaosalvado10
Author

Yeah, I realized that, but when predict single is called it normalizes the observation.
I also realized that the trained DDPG model has quite a hard time generalizing; I cannot replicate your results when using my own dataset.

@yanxurui

Hello, I could not find how the pretrained model is generated. Is something missing?

Thanks.
