diff --git a/content/articles/016-pygrunn14.rst b/content/articles/016-pygrunn14.rst
new file mode 100644
index 0000000..ec16354
--- /dev/null
+++ b/content/articles/016-pygrunn14.rst
@@ -0,0 +1,249 @@
+Pygrunn 2014
+============
+
+:date: 2014-05-10 15:56
+:tags: conference, talk, nlp, pygrunn
+:category: life
+:author: Dmitrijs Milajevs
+:template: article_cover
+:cover: 016-pygrunn14.jpg
+
+`Pygrunn `_ is an awesome conference for Python
+developers and friends, which takes place in
+`Groningen `_.
+
+As usual, the conference was perfectly organized. This is one of the most
+stylish conferences I've ever attended. It keeps growing, and next year it
+moves to a bigger venue, so keep the beginning of May 2015 free and attend the
+event.
+
+Another positive trend is the growing proportion of science-related talks. One
+of the conference themes was (scientific) code quality and collaboration
+between professional developers and scientists.
+
+Check out the awesome summaries of the talks by
+`Reinout van Rees `_
+and
+`Maurits van Rees `_. Get the
+`#pygrunn `_ tweets and follow
+`@pygrunn `_.
+
+
+Computational linguistics 101
+-----------------------------
+
+`My presentation`__ started as a demonstration of the modern pythonic scientific
+tools (my subjective classification):
+
+__ http://nbviewer.ipython.org/urls/bitbucket.org/dimazest/phd-buildout/raw/tip/notebooks/pygrunn14.ipynb
+
+1. Data structures: NumPy_, SciPy_, Pandas_
+2. Algorithms: scikit-learn_, NLTK_, TextBlob_, gensim_
+3. Reporting: IPython_, matplotlib_, seaborn_
+
+.. _NumPy: http://www.numpy.org/
+.. _SciPy: http://www.scipy.org/scipylib/index.html
+.. _Pandas: http://pandas.pydata.org/
+.. _scikit-learn: http://scikit-learn.org/
+.. _NLTK: http://www.nltk.org/
+.. _TextBlob: http://textblob.readthedocs.org
+.. _gensim: http://radimrehurek.com/gensim/
+.. _IPython: http://ipython.org/
+.. _matplotlib: http://matplotlib.org/
+.. _seaborn: http://www.stanford.edu/~mwaskom/software/seaborn/
+
+
+However, I find technical talks with a lot of code rather boring, so I
+decided to show how these libraries can be used to solve simple computational
+linguistics (CL) tasks.
+
+A universal pattern behind natural languages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+First, I `covered`__ `Zipf's law `_,
+which states that the frequency of any word in a corpus of texts is inversely
+proportional to its rank in the frequency table. To show that the law holds for
+an English text, I loaded `the BNC frequency list`__ provided by `Adam
+Kilgarriff`__ into a `Pandas `_ `DataFrame`__ and
+plotted the sorted frequencies on a log-log scale.
+
+__ http://nbviewer.ipython.org/urls/bitbucket.org/dimazest/phd-buildout/raw/tip/notebooks/pygrunn14.ipynb#english-word-frequencies
+__ http://www.kilgarriff.co.uk/BNClists/lemma.num
+__ http://www.kilgarriff.co.uk/bnc-readme.html
+__ http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.html
+
+.. image:: {filename}/static/images/016-bnc_freq.png
+ :align: center
+ :alt: English word frequency counts extracted from the British National Corpus on the log-log scale.
+ :target: {filename}/static/images/016-bnc_freq.png
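+
+The rank/frequency table behind such a plot can be sketched in a few lines of
+plain Python. This is only an illustration, not the notebook code: the function
+name ``rank_frequency`` and the toy sentence are made up for the example, and a
+real experiment would read a large corpus such as the BNC.
+
```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency) triples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


# A toy text (hypothetical); it is far too small to show Zipf's law itself.
text = "the cat and the dog and the bird saw the cat"
table = rank_frequency(text.split())

for rank, word, freq in table:
    # Under an ideal Zipf distribution, rank * freq stays roughly constant.
    print(rank, word, freq, rank * freq)
```
+
+Plotting the frequency column against the rank column on logarithmic axes gives
+the kind of picture shown above.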
+
+As homework, I asked whether the same behavior is observed in
+other languages and what the differences are.
+
+Distributional semantics
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+I could not resist and `presented`__ my `research area`__ :) by extracting word
+co-occurrence counts and projecting the word vectors to 2 dimensions using the
+`scikit-learn`__ implementation of `manifold learning`__.
+
+__ http://nbviewer.ipython.org/urls/bitbucket.org/dimazest/phd-buildout/raw/tip/notebooks/pygrunn14.ipynb#distributional-semantics
+__ http://www.eecs.qmul.ac.uk/~dm303/
+__ http://scikit-learn.org/stable/
+__ http://scikit-learn.org/stable/modules/manifold.html
+
+In distributional semantics, words are represented as rows in a matrix. The
+columns correspond to the other words a given word co-occurs with. Each value
+in the matrix is the number of times the two words co-occurred. For example,
+here are the vectors for the words ``idea``, ``notion``, ``boy`` and ``girl``.
+
+======= ========== ==== ======
+\       philosophy book school
+======= ========== ==== ======
+idea    10         47   39
+notion  7          3    15
+boy     0          12   146
+girl    0          19   93
+======= ========== ==== ======
+
+So, ``idea`` was seen with ``philosophy`` 10 times in the corpus I used. A
+co-occurrence in this case means that ``philosophy`` appeared no more than 5
+words away from ``idea``.
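+
+A minimal sketch of how such counts could be collected is given below. It is
+not the code used for the talk: the function name ``cooccurrence_counts``, the
+toy sentence and the choice of a symmetric window (5 tokens to the left and 5
+to the right) are assumptions made for the example.
+
```python
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, targets, contexts, window=5):
    """Count context words seen within `window` tokens of each target word."""
    counts = defaultdict(Counter)
    for i, token in enumerate(tokens):
        if token not in targets:
            continue
        # Assumption: the window is symmetric around the target word.
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        for neighbour in neighbours:
            if neighbour in contexts:
                counts[token][neighbour] += 1
    return counts


# A made-up sentence standing in for a real corpus.
tokens = (
    "the boy took a book to school and the girl read a book at school"
).split()

counts = cooccurrence_counts(
    tokens, targets={"boy", "girl"}, contexts={"book", "school"}
)

for target in ("boy", "girl"):
    print(target, dict(counts[target]))
```
+
+Each inner counter corresponds to one row of the matrix, with the context words
+as column labels.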
+
+The number patterns for ``boy`` and ``girl`` are much more similar than for
+``boy`` and ``notion``, suggesting that ``boy`` is much more similar to ``girl``
+than to ``notion``. Clearly, we can select many more context words to label the
+columns, making the similarity reasoning more precise.
+
+We can reason about word semantic similarity from a geometrical point of view
+using a distance measure (for example, Euclidean distance). The closer two
+vectors are to each other in the vector space, the closer the words are
+semantically.
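+
+As a small illustration, the Euclidean distance can be computed directly on the
+four vectors from the table above. The helper name ``euclidean`` is chosen for
+this example only, and raw counts are used without any weighting or
+normalization, which a real experiment would probably add.
+
```python
import math

# Co-occurrence vectors copied from the table above
# (columns: philosophy, book, school).
vectors = {
    "idea": (10, 47, 39),
    "notion": (7, 3, 15),
    "boy": (0, 12, 146),
    "girl": (0, 19, 93),
}


def euclidean(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


boy_girl = euclidean(vectors["boy"], vectors["girl"])
boy_notion = euclidean(vectors["boy"], vectors["notion"])

# The smaller the distance, the more similar the co-occurrence patterns.
print("boy-girl:", round(boy_girl, 2))
print("boy-notion:", round(boy_notion, 2))
```
+
+With these numbers ``boy`` ends up closer to ``girl`` than to ``notion``, in
+line with the intuition above.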
+
+Unfortunately, it's difficult for humans to reason in more than 3 dimensions.
+While a multidimensional space is useful for computations, it's of little use
+for presenting the patterns that words share. If we could imagine the space, we
+would discover areas (or directions) that correspond to the girlish/boyish words
+and to the more abstract idea/notion.
+
+To overcome the issue, we can reduce the dimensionality of the space in such a
+way that the distance between the elements is respected. Clearly, we can't
+completely preserve the distances, but it's possible to respect the distances to
+some degree.
+
+Manifold learning is one of many techniques to perform dimensionality reduction.
+If we apply it to the extracted co-occurrence counts for some of the words and
+reduce to two dimensions (so we can plot it), we will notice that related words
+cluster around each other.
+
+.. image:: {filename}/static/images/016-ds.png
+ :align: center
+ :alt: Word semantic relatedness.
+ :target: {filename}/static/images/016-ds.png
+
+
+Sprint
+------
+
+`Spyros Ioakeimidis `_ and
+`Sjoerd de Haan `_ liked the
+idea of counting word frequencies across various languages and seeing how they
+compare in relation to Zipf's law.
+
+Initially, we wanted to take EU directives and compare the official EU
+languages. However, the website was down, and we were kindly redirected to
+`this page `_ every time we tried to get a legal
+document.
+
+Luckily, we found already prepared `word frequencies for many languages
+`_ and reused them. We
+wrote a simple function to plot the frequency of the words against the rank of
+the words in the frequency table. Here are the 10 most frequently used words
+in English, Dutch and Latvian:
+
+==== ======== ========= ======== ========= ======== =========
+\    English            Dutch              Latvian
+---- ------------------ ------------------ ------------------
+Rank Word     Frequency Word     Frequency Word     Frequency
+==== ======== ========= ======== ========= ======== =========
+1    you      6281002   ik       2091479   ir       20182
+2    i        5685306   je       1995150   es       19042
+3    the      4768490   het      1428477   un       12737
+4    to       3453407   de       1399236   tu       12141
+5    a        3048287   is       1202489   tas      8601
+6    it       2879962   dat      1188131   ka       7964
+7    and      2127187   een      1011496   man      7725
+8    that     2030642   niet     997681    to       7535
+9    of       1847884   en       774098    vai      7527
+10   in       1554103   wat      618627    ko       6906
+==== ======== ========= ======== ========= ======== =========
+
+If you plot the word rank on the x axis and the word frequency on the y axis on
+a log-log scale, you should see a straight line. A straight line on a log-log
+plot implies that the quantities on the two axes are related through a power
+law. Thus, if our data fit a straight line perfectly, the frequency of a word
+would be exactly proportional to a power of its rank in the frequency table.
+This is the content of Zipf's law, but of course, such laws are never exact.
+
+.. image:: {filename}/static/images/016-en_zipf.png
+ :align: center
+ :alt: English word frequency counts on the log-log scale.
+ :target: {filename}/static/images/016-en_zipf.png
+
+The blue line shows the provided frequencies; the green line is a regression line.
+
+One thing we can compare among languages is how well this plot follows a
+straight line. The slope of the line is also informative: it gives the exponent
+of the power law we are dealing with.
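+
+To make the link explicit: if frequency = C * rank^slope, then
+log(frequency) = log(C) + slope * log(rank), so the slope and the intercept can
+be estimated with ordinary least squares on the logarithms. The sketch below
+does this in plain Python for the ten English frequencies listed earlier. The
+helper ``fit_line`` is written for this example and the natural logarithm is an
+assumption; this is not the notebook code. Ten words are far too few for a
+meaningful estimate, so the result is not expected to match the values reported
+in this article.
+
```python
import math

# Top-10 English frequencies from the table above, in rank order.
frequencies = [
    6281002, 5685306, 4768490, 3453407, 3048287,
    2879962, 2127187, 2030642, 1847884, 1554103,
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Work in log space: a power law becomes a straight line there.
log_ranks = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
log_freqs = [math.log(freq) for freq in frequencies]

slope, intercept = fit_line(log_ranks, log_freqs)
print("slope:", slope)
print("intercept:", intercept)
```
+
+The slope does not depend on the base of the logarithm, while the intercept
+does.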
+
+The slope is related to the morphology of a language. For example, in Latvian,
+which has quite rich morphology, the word `"city"` is `"pilsēta"`, but the
+English phrase `"in a city"` is `"pilsētā"`. All the occurrences of `"pilsēta"`
+in a Latvian text will be distributed over several morphological forms, lowering
+the counts. As a result, the slope for a Latvian text will be less steep
+than for English.
+
+We `tried`__ English, Ukrainian, Dutch, Russian, Latvian, Spanish and Italian. All
+languages obey Zipf's law, at least visually.
+
+__ http://nbviewer.ipython.org/urls/bitbucket.org/dimazest/phd-buildout/raw/tip/notebooks/Word%20frequencies.ipynb
+
+========= ========= ===========
+Language  Slope     Intercept
+========= ========= ===========
+en        -1.717729 21.934904
+uk        -1.044263 11.212273
+nl        -1.566664 19.635268
+ru        -1.395736 17.781756
+lv        -1.055992 11.541761
+es        -1.707326 22.161790
+it        -1.601567 20.000540
+========= ========= ===========
+
+Theory [Li1992]_ says that the slope coefficient should be close to -1. As the
+table above shows, the values deviate from -1 quite drastically (-1.57 for
+Dutch, for example). Also, the `slope estimate`__ for English from the `British
+National Corpus`__ is -1.18, in contrast to the -1.72 obtained here. Here is the
+Zipf's law visualization for English extracted from the BNC.
+
+__ http://nbviewer.ipython.org/urls/bitbucket.org/dimazest/phd-buildout/raw/tip/notebooks/pygrunn14.ipynb#estimating-the-slope
+__ http://www.natcorp.ox.ac.uk/
+
+.. image:: {filename}/static/images/016-en_bnc_zipf.png
+ :align: center
+ :alt: Actual and estimated English word frequencies from the BNC.
+ :target: {filename}/static/images/016-en_bnc_zipf.png
+
+Conclusion
+----------
+
+Pygrunn is a great conference that has started attracting not only
+(professional web) developers, but also scientists. I was really surprised that
+my talk got a bit of attention and that people were willing to hack around a
+linguistic phenomenon. I hope that this trend continues next year and that the
+two communities grow closer to each other.
+
+.. [Li1992] Li, Wentian.
+ `Random texts exhibit Zipf's-law-like word frequency distribution.`__
+ Information Theory, IEEE Transactions on 38.6 (1992): 1842-1845.
+
+__ http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.8422&rep=rep1&type=pdf
\ No newline at end of file
diff --git a/content/static/article_covers/016-pygrunn14.jpg b/content/static/article_covers/016-pygrunn14.jpg
new file mode 100644
index 0000000..845b48b
Binary files /dev/null and b/content/static/article_covers/016-pygrunn14.jpg differ
diff --git a/content/static/images/016-bnc_freq.png b/content/static/images/016-bnc_freq.png
new file mode 100644
index 0000000..e57ea54
Binary files /dev/null and b/content/static/images/016-bnc_freq.png differ
diff --git a/content/static/images/016-ds.png b/content/static/images/016-ds.png
new file mode 100644
index 0000000..0756264
Binary files /dev/null and b/content/static/images/016-ds.png differ
diff --git a/content/static/images/016-en_bnc_zipf.png b/content/static/images/016-en_bnc_zipf.png
new file mode 100644
index 0000000..454151c
Binary files /dev/null and b/content/static/images/016-en_bnc_zipf.png differ
diff --git a/content/static/images/016-en_zipf.png b/content/static/images/016-en_zipf.png
new file mode 100644
index 0000000..820cf5c
Binary files /dev/null and b/content/static/images/016-en_zipf.png differ