Question in getting the previous weights #113

Open
BruceDai003 opened this issue Nov 27, 2018 · 0 comments

Thanks for your great work; it's very interesting. While walking through your code, I found a __pack_samples() method in the DataMatrices class in datamatrices.py that confuses me.

First, let me paste it all here for reference:

    def __pack_samples(self, indexs):
        indexs = np.array(indexs)
        last_w = self.__PVM.values[indexs-1, :]

        def setw(w):
            self.__PVM.iloc[indexs, :] = w
        M = [self.get_submatrix(index) for index in indexs]
        M = np.array(M)
        X = M[:, :, :, :-1]
        y = M[:, :, :, -1] / M[:, 0, None, :, -2]
        return {"X": X, "y": y, "last_w": last_w, "setw": setw}

    # volume in y is the volume in next access period
    def get_submatrix(self, ind):
        return self.__global_data.values[:, :, ind:ind+self._window_size+1]

When the test set is initialized, test_indices is passed in, which is np.array([32281, ..., 35056]) by default. If I understand correctly, X has shape (2776, 3, 11, 31) and y has shape (2776, 3, 11).
y is the next day's 'close', 'high', and 'low' price array, normalized by the last day's close price in X for each sample.
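To double-check the shapes and the normalization described above, here is a minimal sketch that runs the same slicing on fake data. The dimensions (3 features, 11 assets, window size 31) are taken from the shapes I quoted; the random prices, the number of periods, the sample indices, and the assumption that feature 0 is 'close' are mine and are not from the repository.

```python
import numpy as np

# Assumed dimensions: 3 features, 11 assets, window_size = 31.
# n_periods and the sample indices are made up to keep the example small.
n_features, n_assets, window_size, n_periods = 3, 11, 31, 100

rng = np.random.default_rng(0)
# Stand-in for self.__global_data.values: strictly positive fake prices.
global_data = rng.uniform(1.0, 2.0, size=(n_features, n_assets, n_periods))

def get_submatrix(ind):
    # Same slice as in the issue: window_size input periods plus one target period.
    return global_data[:, :, ind:ind + window_size + 1]

indexs = np.array([10, 11, 12])
M = np.array([get_submatrix(i) for i in indexs])

X = M[:, :, :, :-1]                        # the first window_size periods
# Last period of each feature divided by feature 0 (assumed to be 'close')
# at the second-to-last period, i.e. the last period contained in X.
y = M[:, :, :, -1] / M[:, 0, None, :, -2]

print(M.shape, X.shape, y.shape)
```

With 2776 indices instead of 3, the leading dimension becomes 2776, which matches the shapes above.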
Now the confusing part is the weights. Let t denote the time index. For t = 32281, X covers t = 32281 through 32311, which is 31 days, and y is at t = 32312. So the index of each sample marks the start of a forward-looking window: you are effectively sitting at t = 32311, looking back over 31 days including t = 32311, and trying to predict the weights for t = 32312. I believe this is your intention. In that case, the other input should be the weights at t = 32311, and the weights produced by the action belong at t = 32312. Instead, the code uses the weights at t = 32280 as input, which is 31 days before the last day in X and 32 days before the target day.
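The index arithmetic can be spelled out as a small sketch. The helper sample_periods is hypothetical (it is not in the repository), and window_size = 31 is assumed from the shape of X; the function only lists which time indices one sample touches under the current indexing and under the indexing I am proposing.

```python
import numpy as np

window_size = 31  # assumed, matches the last dimension of X
t0 = 32281        # first test index

def sample_periods(ind, window_size):
    """List the time indices one sample touches (hypothetical helper)."""
    x_periods = np.arange(ind, ind + window_size)  # periods covered by X
    return {
        "x_first": int(x_periods[0]),
        "x_last": int(x_periods[-1]),
        "y": ind + window_size,                    # target period
        "last_w_current": ind - 1,                 # PVM row read by the current code
        "last_w_proposed": ind + window_size - 1,  # PVM row read by my suggestion
        "setw_current": ind,                       # PVM row written by the current code
        "setw_proposed": ind + window_size,        # PVM row written by my suggestion
    }

p = sample_periods(t0, window_size)
for name, value in p.items():
    print(f"{name}: {value}")

# Distance between the weights that are read and the last period in X.
print("gap (current):", p["x_last"] - p["last_w_current"])
print("gap (proposed):", p["x_last"] - p["last_w_proposed"])
```

Under the proposed indexing, last_w lines up with the last day in X and setw lines up with the day of y.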

Thus, my suggested correction would look like this:

    def __pack_samples(self, indexs):
        indexs = np.array(indexs)
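        # changed: read the weights at the last period covered by X
        last_w = self.__PVM.values[indexs+self._window_size-1, :]

        def setw(w):
            # changed: write the new weights at the target period of y
            self.__PVM.iloc[indexs+self._window_size, :] = w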
        M = [self.get_submatrix(index) for index in indexs]
        M = np.array(M)
        X = M[:, :, :, :-1]
        y = M[:, :, :, -1] / M[:, 0, None, :, -2]
        return {"X": X, "y": y, "last_w": last_w, "setw": setw}

    # volume in y is the volume in next access period
    def get_submatrix(self, ind):
        return self.__global_data.values[:, :, ind:ind+self._window_size+1]

I might be wrong here. Please help me understand this.
