- MAML-based
- (MAML) Chelsea Finn, Pieter Abbeel, et al. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks." (2017). [arXiv:1703.03400] ⭐
- (Reptile) Alex Nichol, Joshua Achiam, et al. "On First-Order Meta-Learning Algorithms." (2018). [arXiv:1803.02999] ⭐
- Metric-based
- (Siamese Network) Gregory Koch, et al. "Siamese Neural Networks for One-shot Image Recognition." (2015). [arXiv]
- (Prototypical Networks) Jake Snell, Kevin Swersky, et al. "Prototypical Networks for Few-shot Learning." (2017). [arXiv:1703.05175]
- (Matching Networks) Oriol Vinyals, Charles Blundell, et al. "Matching Networks for One Shot Learning." (2016). [arXiv:1606.04080]
- (Relation Network) Flood Sung, Yongxin Yang, et al. "Learning to Compare: Relation Network for Few-Shot Learning." (CVPR 2018). [arXiv:1711.06025]
- Gradient Descent as LSTM
- Sachin Ravi, et al. "Optimization as a Model for Few-Shot Learning." (2016). [openreview]
- Marcin Andrychowicz, Misha Denil, et al. "Learning to learn by gradient descent by gradient descent." (2016). [NIPS 2016]
- (L2X) Jianbo Chen, Le Song, et al. "Learning to Explain: An Information-Theoretic Perspective on Model Interpretation." (ICML 2018). [arXiv:1802.07814]
- (VIBI) Seojin Bang, Pengtao Xie, et al. "Explaining a black-box using Deep Variational Information Bottleneck Approach." (2019). [arXiv:1902.06918]
- Text Generation (tips about text generation): "Neural Text Generation: A Practical Guide." (2018). [zxie]
- Neural Dialogue Generation: "Rethinking Neural Dialogue Generation: A Practical Guide." (2018). [dialogue generation]
- (LeNet) LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998).
- (AlexNet) Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems. (2012).
- (ZFNet) Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision. Springer, Cham, (2014).
- (NIN) Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." (2013). [arXiv:1312.4400]
- (VGGNet) Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." (2014). [arXiv:1409.1556]
- (GoogLeNet) Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. (2015).
- (BN) Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International Conference on Machine Learning. (2015). [arXiv:1502.03167]
- (ResNet) He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. (2016). [arXiv:1512.03385] [CVPR 2016 Best Paper] ⭐
- (Pre-activation ResNet) He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer International Publishing. (2016). [arXiv:1603.05027]
- Huang, Gao, et al. "Deep networks with stochastic depth." European Conference on Computer Vision. Springer, Cham, (2016). [arXiv:1603.09382]
- (Wide ResNet) Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." (2016). [arXiv:1605.07146]
- (ResNeXt) Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, (2017). [arXiv:1611.05431]
- (DenseNet) Huang, Gao, et al. "Densely connected convolutional networks." (2016). [arXiv:1608.06993]
- Pleiss, Geoff, et al. "Memory-efficient implementation of DenseNets." arXiv preprint (2017). [arXiv:1707.06990]
- (DPN) Chen, Yunpeng, et al. "Dual path networks." Advances in Neural Information Processing Systems. (2017). [arXiv:1707.01629]
- (SENet) Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." (2017). [arXiv:1709.01507]
- (CondenseNet) Huang, Gao, et al. "CondenseNet: An Efficient DenseNet using Learned Group Convolutions." (2017). [arXiv:1711.09224]
- (GN) Yuxin Wu, Kaiming He. "Group Normalization." (2018). [arXiv:1803.08494]