This program was written by Jonathon Galsurkar and Meredith Lancaster under the supervision of Prof. William Sakas at Hunter College, Computer Science and the Graduate Center, Linguistics and Computer Science of the City University of New York. It is written in Python 3.5.1.
The program implements a learner that learns multiple abstract, human-like languages grounded in Chomsky's principles and parameters framework. The learning model is one of first language acquisition, i.e., acquisition by a child of approximately 2 years of age.
The learner and the abstract domain over which it operates are described in detail in:
Sakas, W.G. & Fodor, J.D. (2012) Disambiguating Syntactic Triggers, Language Acquisition (19) pp 83-143.
The paper and domain and other relevant information are downloadable here:
http://www.colag.cs.hunter.cuny.edu/downloadables.html
8/9/2016: The program is currently being maintained by Meredith Lancaster.
The most recent data generated by the program are currently available in the results folder.
3/17/17: The repository now also contains an implementation of Charles Yang's weighted variational model. Yang's model can be found in the Yang-weighted-model subdirectory and is written in Scala 2.12.1.
The program must be run with a Python interpreter that supports Python 3.5. It can be run with:
python main.py <number of learners> <number of sentences to be processed> <language code> [-c, --convergence] [-p, --plots]
The file EngFrJapGerm.txt contains 3522 sentences from fake languages that represent (and mimic the grammatical structure of) English, French, Japanese, or German. Each language has a code associated with it, which lets the user specify the language the learners should be trained on: French=584, English=611, German=2253, and Japanese=3856.
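As an illustration of the command line above, here is a minimal sketch of how the three positional arguments and two optional flags could be parsed with argparse. It is not taken from main.py: the function name build_parser, the argument names, and the sample values 100 and 50000 are assumptions made for this example. Only the language codes and the flag names come from this README.

```python
import argparse

# Language codes as listed in this README.
LANGUAGE_CODES = {584: "French", 611: "English", 2253: "German", 3856: "Japanese"}


def build_parser():
    """Hypothetical parser mirroring the documented usage line."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("num_learners", type=int,
                        help="number of learners")
    parser.add_argument("num_sentences", type=int,
                        help="number of sentences to be processed")
    # Restricting choices rejects codes that this README does not list.
    parser.add_argument("language_code", type=int,
                        choices=sorted(LANGUAGE_CODES),
                        help="584, 611, 2253, or 3856")
    parser.add_argument("-c", "--convergence", action="store_true",
                        help="record convergence patterns")
    parser.add_argument("-p", "--plots", action="store_true",
                        help="produce plots")
    return parser


# Example: 100 learners, 50000 sentences, English, convergence output on.
args = build_parser().parse_args(["100", "50000", "611", "-c"])
print(LANGUAGE_CODES[args.language_code], args.num_learners,
      args.convergence, args.plots)
```

Passing an explicit list to parse_args keeps the sketch runnable without a real command line; a script would normally call parse_args() with no arguments.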
There are two optional arguments, represented by the -c/--convergence and -p/--plots flags. Using these flags will cause the program to produce additional results.
The -p flag will cause the program to produce two plots, one recording the psets of each parameter (the order in which each converged) and the other showing at what time (measured in sentences) the parameters converged.
The -c flag will cause the program to find convergence patterns between parameters. For example, it will record the number of times parameter 4 converged before parameter 1.
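The pairwise counting described for the convergence option can be sketched as follows. This is an illustration only, not the program's own code: the function name count_convergence_orders, the input layout (one dict per learner, mapping a parameter number to the sentence count at which it converged), and the sample data are all assumptions.

```python
from collections import Counter
from itertools import permutations


def count_convergence_orders(convergence_times):
    """Count how often parameter a converged before parameter b.

    convergence_times holds one dict per learner, mapping a parameter
    number to the number of sentences seen when it converged.
    """
    counts = Counter()
    for times in convergence_times:
        # Visit every ordered pair (a, b) of distinct parameters.
        for a, b in permutations(times, 2):
            if times[a] < times[b]:
                counts[(a, b)] += 1
    return counts


# Made-up data: three learners, parameters 1 and 4.
learners = [
    {1: 120, 4: 80},
    {1: 60, 4: 95},
    {1: 200, 4: 150},
]
counts = count_convergence_orders(learners)
print(counts[(4, 1)])  # 2: parameter 4 converged before parameter 1 twice
print(counts[(1, 4)])  # 1
```

Ties (two parameters converging on the same sentence) are counted for neither ordering in this sketch.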
The program writes its simulation results as several csv and png files in a folder whose name starts with the language used, followed by the number of learners and a timestamp. For example, English_100_32016-06-09T14.17.54.747187 will contain the results of a simulation using 100 learners with English.
I wrote a port of this program in Clojure as an introduction to functional programming and concurrency, which can be found here
If SBT is installed on your machine, then do cd Yang-weighted-model
followed by sbt
This will start the sbt prompt: >
Do > compile
followed by > run <number of sentences to be processed> <grammar code>
Like the Fodor-Sakas program, this program is configured through command line arguments, in this case two: the number of sentences to be processed and the grammar code. It uses the same language codes as the other program: French=584, English=611, German=2253, and Japanese=3856.
Again, after starting the sbt prompt, do > package
This will create a JAR file in the target subdirectory, which can be run with java -jar test.jar
Currently, the program only prints the final grammar to the terminal.