forked from xixiaoyao/CS224n-winter-together
Showing 9 changed files with 1,496 additions and 0 deletions.
@@ -0,0 +1,2 @@
After working through this assignment, I found that it is crucial to fully understand the input and output dimensions of every part of the model.
![Word-level NMT model](./word_nmt.jpg)
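To illustrate that point, here is a minimal shape-tracing sketch of the first two stages of a word-level NMT model (embedding lookup, then a bidirectional encoder LSTM). The sentence length, batch size, vocabulary size, and layer widths below are made-up values for illustration only, not taken from this commit:

import torch
import torch.nn as nn

# Hypothetical sizes, chosen only to make the shapes concrete.
src_len, batch_size = 20, 32
vocab_size, embed_size, hidden_size = 5000, 256, 512

embed = nn.Embedding(vocab_size, embed_size, padding_idx=0)
encoder = nn.LSTM(embed_size, hidden_size, bidirectional=True)

# A padded batch of source sentences as word indices: (src_len, batch)
source_padded = torch.randint(0, vocab_size, (src_len, batch_size))
X = embed(source_padded)              # (src_len, batch, embed_size)
enc_hiddens, (h_n, c_n) = encoder(X)  # enc_hiddens: (src_len, batch, 2 * hidden_size)
print(X.shape, enc_hiddens.shape, h_n.shape)  # h_n: (2, batch, hidden_size)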
Empty file.
@@ -0,0 +1,60 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
CS224N 2019-20: Homework 4
model_embeddings.py: Embeddings for the NMT model
Pencheng Yin <[email protected]>
Sahil Chopra <[email protected]>
Anand Dhoot <[email protected]>
Vera Lin <[email protected]>
"""

import torch.nn as nn


class ModelEmbeddings(nn.Module):
    """
    Class that converts input words to their embeddings.
    """
    def __init__(self, embed_size, vocab):
        """
        Init the Embedding layers.
        @param embed_size (int): Embedding size (dimensionality)
        @param vocab (Vocab): Vocabulary object containing src and tgt languages
                              See vocab.py for documentation.
        """
        super(ModelEmbeddings, self).__init__()
        self.embed_size = embed_size

        # default values
        self.source = None
        self.target = None

        src_pad_token_idx = vocab.src['<pad>']
        tgt_pad_token_idx = vocab.tgt['<pad>']

        ### YOUR CODE HERE (~2 Lines)
        ### TODO - Initialize the following variables:
        ###     self.source (Embedding Layer for source language)
        ###     self.target (Embedding Layer for target language)
        ###
        ### Note:
        ###     1. `vocab` object contains two vocabularies:
        ###            `vocab.src` for source
        ###            `vocab.tgt` for target
        ###     2. You can get the length of a specific vocabulary by running:
        ###            `len(vocab.<specific_vocabulary>)`
        ###     3. Remember to include the padding token for the specific vocabulary
        ###        when creating your Embedding.
        ###
        ### Use the following docs to properly initialize these variables:
        ###     Embedding Layer:
        ###         https://pytorch.org/docs/stable/nn.html#torch.nn.Embedding

        self.source = nn.Embedding(len(vocab.src), self.embed_size, padding_idx=src_pad_token_idx)
        self.target = nn.Embedding(len(vocab.tgt), self.embed_size, padding_idx=tgt_pad_token_idx)

        ### END YOUR CODE
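As a quick sanity check of the two layers initialized above, the following minimal usage sketch instantiates ModelEmbeddings and embeds a small padded batch. The tiny vocabulary class below is a hypothetical stand-in, not the real Vocab from vocab.py; it only mimics the two things ModelEmbeddings needs here (len() and '<pad>' lookup):

import torch

class _FakeVocab:
    # Hypothetical stand-in for vocab.py's Vocab, for illustration only.
    def __init__(self):
        words = ['<pad>', '<s>', '</s>', 'hello', 'world']
        self.src = {w: i for i, w in enumerate(words)}
        self.tgt = {w: i for i, w in enumerate(words)}

embeddings = ModelEmbeddings(embed_size=256, vocab=_FakeVocab())

# A (src_len=4, batch=2) tensor of word indices, padded with index 0 ('<pad>'),
# embeds to a (4, 2, 256) tensor; rows at the padding index are initialized to zeros.
src_indices = torch.tensor([[3, 4], [4, 3], [0, 0], [0, 0]])
print(embeddings.source(src_indices).shape)  # torch.Size([4, 2, 256])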