This repository has been archived by the owner on Oct 15, 2019. It is now read-only.

Can I create a variable shared by forward and back propagation in customop('numpy')? #173

Open
rguo12 opened this issue Apr 19, 2017 · 1 comment

Comments


rguo12 commented Apr 19, 2017

I do not quite understand the mechanism behind @customop('numpy').
I find that there's an intermediate variable 'Q' which is expensive to compute and appears in computing both the output and gradient.
By the way, I also wonder if I can create gradient function for multiple parameters (e.g. w1 and w2 as in the code below).
e.g.

@customop('numpy')
def my_operator(X,w1,w2):
    Q = f(X,w1,w2)
    H = g1(Q)
    return H
def my_operator_grad1(ans,X,w1,w2):
    def grad1(g):
        Q = f(X,w1,w2)
        R = g2(Q)
        return R
    return grad1
def my_operator_grad2(ans,X,w1,w2):
    def grad2(g):
        Q = f(X,w1,w2)
        R = g3(Q)
        return R
    return grad2
my_operator.def_grad(my_operator_grad1,argnum=1)
my_operator.def_grad(my_operator_grad2,argnum=2)

Thanks!

Taco-W (Member) commented May 25, 2017

@swanderingf One of the primary reasons for the customop wrapper is that some operations are not defined on the GPU.

Say I have a function that uses operations defined only on the CPU. Without the customop hint, the input data could already be stored on the GPU before the function is invoked. During execution, the intermediate data then has to be copied from the GPU to the CPU to run the CPU-only op, which hurts performance and defeats the purpose of placing the input data on the GPU.

customop lets the user tell the system where to store the input data, so that some of the slow copies between devices can be avoided.

minpy supports sharing the common computation across gradients through def_multiple_grad. You can rewrite the code as:

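@customop('numpy')
def my_operator(X, w1, w2):
    Q = f(X, w1, w2)
    H = g1(Q)
    return H
def my_operator_grads(ans, X, w1, w2):
    def grad(g):
        # Q is computed once here and reused for both gradients.
        Q = f(X, w1, w2)
        return (g2(Q), g3(Q))
    return grad
# argnums 1 and 2 correspond to w1 and w2, matching the def_grad calls in the question.
my_operator.def_multiple_grad(my_operator_grads, (1, 2))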
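To make the sharing pattern concrete, here is a minimal plain-NumPy sketch that does not use minpy at all. The functions forward and make_grads are hypothetical stand-ins, and the concrete choices Q = w1 * X + w2 and H = Q ** 2 are assumptions made only so the gradients can be checked by hand. It shows a single closure that computes the intermediate once and returns one gradient per parameter as a tuple.

```python
import numpy as np

def forward(X, w1, w2):
    # Hypothetical stand-in for the expensive f(X, w1, w2) followed by g1(Q).
    Q = w1 * X + w2
    return Q ** 2

def make_grads(ans, X, w1, w2):
    # Same shape as the gradient factory above: it returns a closure that
    # takes the upstream gradient g and returns a tuple (d_w1, d_w2).
    def grad(g):
        Q = w1 * X + w2        # intermediate computed once
        shared = g * 2.0 * Q   # dH/dQ scaled by g, reused by both gradients
        return (np.sum(shared * X), np.sum(shared))
    return grad

X = np.array([1.0, 2.0, 3.0])
w1, w2 = 0.5, -1.0
g = np.ones_like(X)            # upstream gradient of sum(H) with respect to H

d_w1, d_w2 = make_grads(forward(X, w1, w2), X, w1, w2)(g)
print(d_w1, d_w2)
```

Note that in this pattern the intermediate is shared between the two gradients, but it is still recomputed once in the backward pass, because the gradient factory only receives the output and the original inputs.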
