Hello! I thought I'd give this a try. I already had the idea of turning PHP code into plain natural language.
The idea is to take PHP code and read it aloud, in English: not read the tokens one by one, but turn an expression like 'class x extends y {}' into a full, meaningful sentence: 'the definition of the class x, which extends the class y'.
This relies on the Abstract Syntax Tree of PHP 7.0+, so the internal tokens are unambiguous. Then, the native PHP functions and operators are organized in a dictionary, to be able to turn 'trim($a)' into 'the trimming of the spaces at the beginning and the end of $a'.
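To make the idea concrete, here is a minimal sketch in Python rather than PHP. It is not the actual php2natural code: the tuple-shaped nodes, the `describe` function and the `FUNCTION_PHRASES` dictionary are all hypothetical names made up for this illustration, and the real PHP AST looks different. Only the `trim` phrase and the class sentence come from the description above; the `strlen` entry is an invented extra.

```python
# Toy stand-in for an AST: nested tuples such as ("call", "trim", [("var", "a")]).
# These node shapes are assumptions for illustration, not PHP's real AST.

# Dictionary of native functions -> English templates.
# "trim" is taken from the description above; "strlen" is a hypothetical entry.
FUNCTION_PHRASES = {
    "trim": "the trimming of the spaces at the beginning and the end of {0}",
    "strlen": "the length of {0}",
}


def describe(node):
    """Recursively turn a toy AST node into an English phrase."""
    kind = node[0]
    if kind == "var":
        return "$" + node[1]
    if kind == "call":
        _, name, args = node
        spoken_args = [describe(arg) for arg in args]
        template = FUNCTION_PHRASES.get(name)
        if template is None:
            # Fallback for functions missing from the dictionary.
            phrase = "the call of the function " + name
            if spoken_args:
                phrase += " with " + " and ".join(spoken_args)
            return phrase
        return template.format(*spoken_args)
    if kind == "class":
        _, name, parent = node
        sentence = "the definition of the class " + name
        if parent is not None:
            sentence += ", which extends the class " + parent
        return sentence
    raise ValueError("unsupported node kind: " + repr(kind))


print(describe(("class", "x", "y")))
# the definition of the class x, which extends the class y
print(describe(("call", "trim", [("var", "a")])))
# the trimming of the spaces at the beginning and the end of $a
```

Because each node is described from its children, nested expressions compose into longer phrases for free, which is probably also where a lot of the repetition in the output comes from.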
While preparing this entry, I realized that it is close to #80 (made in Python) and dariusk/NaNoGenMo-2015#173 (with the phone line twist).
The result is an ugly text file, with a lot of repetition.
The code: https://github.com/dseguy/php2natural
An example of output: https://github.com/dseguy/php2natural/blob/master/out.txt
It is about 83k words: it was actually difficult to go beyond the threshold of 50k words in one file. Even the self-reading (the tool run on its own source) is only about 20k words.
The code won't work on all PHP code at the moment: some tokens are not processed yet, and some features are silently ignored. It should still run on a fair number of codebases.
I might keep working on this some more, as this PHP -> NL process is very interesting in terms of code understanding.