diff --git a/_requests_for_research/im2latex.html b/_requests_for_research/im2latex.html
index 5ba4b1a..c3a49e6 100644
--- a/_requests_for_research/im2latex.html
+++ b/_requests_for_research/im2latex.html
@@ -34,5 +34,10 @@

Notes

While this is a very non-trivial project, we've marked it with a one-star difficulty rating because we know it's solvable using current methods. It is still very challenging to do in practice, as it requires getting several ML components to work together correctly.

Solutions


Solution 1

Results, data set, code, and a write-up are available at http://lstm.seas.harvard.edu/latex/. The model is trained on the above data sets and extends the model from the Show, Attend and Tell paper with a multi-row LSTM encoder. The code is written in Torch (based on the seq2seq-attn system), and the model is optimized using SGD. Additional experiments use the model to generate HTML from small webpages.
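To make the multi-row encoder idea concrete, here is a minimal sketch: each row of the CNN feature grid is encoded by its own recurrent pass, producing position-aware annotations for the attention decoder to attend over. This is an illustration, not the authors' code; a plain tanh RNN stands in for the LSTM, and all names, shapes, and sizes are assumptions.

```python
# Illustrative sketch of a "multi-row" encoder pass (hypothetical shapes/names).
import numpy as np

rng = np.random.default_rng(0)

H, W, D, hidden = 4, 10, 32, 64            # grid height/width, CNN depth, RNN size
features = rng.standard_normal((H, W, D))  # CNN feature grid for one image

Wx = rng.standard_normal((D, hidden)) * 0.01
Wh = rng.standard_normal((hidden, hidden)) * 0.01

def encode_row(row):
    """Run a simple recurrent pass left-to-right over one row of CNN features."""
    h = np.zeros(hidden)
    states = []
    for x in row:                          # x: (D,) feature column
        h = np.tanh(x @ Wx + h @ Wh)
        states.append(h)
    return np.stack(states)                # (W, hidden)

# Each row is encoded independently, giving annotations that carry
# horizontal position information for the attention decoder.
annotations = np.stack([encode_row(features[r]) for r in range(H)])  # (H, W, hidden)
print(annotations.shape)                   # (4, 10, 64)
```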

Solution 2


The paper is available on arXiv. Datasets, visualizations, and ancillary material (including a hardware parts list) are available at https://untrix.github.io/i2l/. This model is based on the Show, Attend and Tell paper; however, significant changes had to be made to that model to reach a BLEU score of 89%, the highest reported so far. We detail those changes, why they were needed, and their effect on performance. We also provide visuals demonstrating that the model focuses its attention on small regions of the image and scans it left to right, top to bottom, as it generates the corresponding LaTeX. The implementation, written from scratch in Python, is available under an open-source license at https://github.com/untrix/im2latex. The model was implemented using TensorFlow, and pre/post-processing is presented via Jupyter notebooks.
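For readers unfamiliar with the soft-attention mechanism behind those visuals, the sketch below shows one attention step in plain NumPy: the decoder state is scored against every image location, and the resulting weights form the heat map over the grid. The bilinear scoring function and all shapes here are assumptions for illustration; the actual TensorFlow implementation lives in the linked repository.

```python
# Minimal sketch of one soft-attention step (assumed scoring and shapes).
import numpy as np

rng = np.random.default_rng(1)

H, W, enc_dim, dec_dim = 4, 10, 64, 128
annotations = rng.standard_normal((H * W, enc_dim))  # flattened encoder grid
state = rng.standard_normal(dec_dim)                  # current decoder LSTM state

Wa = rng.standard_normal((enc_dim, dec_dim)) * 0.01

# Alignment scores between the decoder state and every image location.
scores = annotations @ Wa @ state                     # (H*W,)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                                  # softmax attention weights

# Context vector: attention-weighted sum of annotations, fed to the
# decoder when predicting the next LaTeX token.
context = alpha @ annotations                         # (enc_dim,)

# Reshaping alpha back to the grid gives the attention heat map; in a
# trained model its mass concentrates on a small region per time step
# (with these random weights it is near-uniform).
print(alpha.reshape(H, W).round(3))
```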