Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify BiRNNBuilder to get easy access to the final hidden states of the component RNNs #263

Open
wants to merge 1,205 commits into
base: master
Choose a base branch
from

Conversation

bskaggs
Copy link
Contributor

@bskaggs bskaggs commented Jan 21, 2017

In situations where you are using BiRNNs to output a fixed sized state for a variable-length sequence, it is convenient to build it from the hidden states of the two constituent RNNs at the final layer.

This request refactors BiRNNBuilder to unify transduce and add_inputs, and adds a transduce_final function that produces an expression that is the concatenation of the last element of h() for the last states of the two RNNs at the last layer. (I think this the state I want, but please correct me if that doesn't make sense. Also, I'll caveat that Python isn't my strongest language, so I'd appreciate any style and naming comments!)

Also, I was confused by the near duplicate existing definition of BiRNNBuilder in python/dynet_viz.py and python/_dynet.pyx so I made them the same. ¯\_(ツ)_/¯

hayounav and others added 30 commits October 27, 2016 17:33
Python works in Windows, including MKL and/or GPU support
# Conflicts:
#	examples/CMakeLists.txt
Removing multiprocess examples from Windows build and simplifying windows build instructions to reflect latest Eigen version
add the -O1/-O2 swith for the gcc4.9 or later versions.
@yoavg
Copy link
Contributor

yoavg commented Jan 21, 2017

If I understand correctly what you are trying to achieve with transduce_final, can't it be implemented as:

x = builder.add_inputs(seq) 
return concatenate([ x[-1][0].output(), x[0][-1].output() ])

?
(this takes the (f) from the last element and the (b) from the first element, and then concats their outputs)

also, I find the name transduce_final confusing, perhaps we can find a better name for this?

@bskaggs
Copy link
Contributor Author

bskaggs commented Jan 21, 2017

Yes, I believe it is equivalent, but add_inputs constructs and collects all the intermediate states as data structures in python list, but you only care about the last one of each of the two RNNs. I'll benchmark if there is a performance difference, especially on longer sequences.

And, yes, it definitely needs a better name.

@neubig
Copy link
Contributor

neubig commented Mar 31, 2017

Did anything happen on this pull request? I think it'd be nice to have if the various issues can be resolved.

@bskaggs
Copy link
Contributor Author

bskaggs commented Apr 6, 2017

First, I'd like to say I support the BiLSTM hegemony.

However, I haven't done the benchmarking I've promised. If someone can think of a better name than "transduce_final", that would motivate me to do it!

@neubig
Copy link
Contributor

neubig commented Apr 7, 2017

What about transduce_final_states?

@danielhers
Copy link
Collaborator

Why just BiRNNs? Shouldn't any RNN allow efficiently getting just the final state?
In Keras, for example, this is the default, and you need to set return_sequences=True to get all the outputs from the intermediate time points too.
Another option is to keep transduce for both, but add a final_state boolean argument, with default value False (to keep backward compatibility). This naming is analogous to the initial_state() function.

@neubig
Copy link
Contributor

neubig commented May 29, 2017

@danielhers Yeah, actually this seems like a really nice solution. @yoavg might have opinions, but I'd be happy to have a pull request to this effect.

@yoavg
Copy link
Contributor

yoavg commented May 29, 2017

for the one-directional RNN, I think .transduce(seq)[-1] will be as efficient. But I am ok with transduce(seq,final_only=True).

for the bi-RNNs, note that the semantics of final_state is different than in the uni-RNNs, because it concats the final states of both, and is something that you don't have access to in the "regular" biRNN interface.

I don't care that much about the name that will finally be chosen, as long as its properly documented.

@neubig
Copy link
Contributor

neubig commented Jun 30, 2017

@bskaggs What do you think about the final_only option? If that sounds reasonable, then perhaps we can go with that.

@bskaggs
Copy link
Contributor Author

bskaggs commented Jul 2, 2017

@neubig: sounds like a good option! Should I take crack at implementing it?

@neubig
Copy link
Contributor

neubig commented Sep 22, 2017

@bskaggs Sorry, I totally missed your response. Yes, I think this would be worth doing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.