-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.txt
executable file
·95 lines (70 loc) · 2.83 KB
/
README.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
CDACM-KBCS n-fold-cross-validation - 2014-11-05
----------------------------------------------
This package provides a fram-work for automatic evaluation of Statistical Machine Translation System with n-folds.
(c) 2014-2024. Centre for Development of Advanced Computing, Mumbai
Knowledge Based Computer Systems Division. All Rights Reserved.
Original code by Raj Nath Patel.
The software is licensed under the GPL V3. Please see the file LICENSE.txt
For more information, bug reports, and fixes, contact:
Raj Nath Patel / Prakash Pimpale
CDAC, Rain tree marg, sector 7, CBD Belapur Navi Mumbai - 400614
India
http://kbcs.in/
CONTACT
For questions about this distribution, please contact KBCS, CDAC Mumbai at [email protected]. We provide assistance on a best-effort basis.
PRE-REQUISIT
- Linux operating system
- 'GIZA++' should be installed
- 'Moses decoder' (version 1.9 or above) should be installed
- 'gunzip' should be installed(install using 'yum' or 'apt-get' command: #yum install zip unzip)
- 'sed' should be installed(install using 'yum' or 'apt-get' command: #yum install sed)
DEPENDENCY INSTALLATION
-GIZA++ installation
-Download system(http://code.google.com/p/giza-pp/downloads/list)
-Extract and change directory to "giza-pp"
$cd giza-pp
-Compile the code as instructed in README
or
For basic installation run
$make
-Create directory "bin"
-Copy GIZA++, snt2cooc.out and mkcl to bin directory
$cp GIZA++-v2/GIZA++ ./bin
$cp GIZA++-v2/snt2cooc.out ./bin
$cp mkcls-v2/mkcls ./bin
-MOSES installation
-Download MOSES(https://github.com/moses-smt/mosesdecoder/tree/RELEASE-2.1.1)
-Extract the download
-Follow the instruction in BUILD-INSTRUCTIONS.txt
or
For basic installation run
$./bjam -jN 4
//4 is number of CPUs, change acoordingly
SYSTEM INSTALLATION-
#Change whare you want to install systems
In my case its-
"WORKDIR=/some path/current dir/s1"
#Change bin directory where GIZA++ is installed
In my case its-
BIN=/home/anuvad/smt/decoder/bin
#Change "MOSES_HOME" according moses installation dir w.r.t. your system
In my case its-
"MOSES_HOME=/home/raj/smt/decoder/mosesdecoder-RELEASE-2.1.1/"
USE
Usage:
$bash <SYSTEM-SCRIPT> -s <SOURCE-FILE> -t <TARGET-FILE> -r <ROOT-DIRECTORY> -f <FOLDS>
SAMPLE Usage 1:
To train 3-fold validation test
$bash n-fold-validation.sh -s sample-corpus/corpus.eng -t sample-corpus/corpus.hin -r s1 -f 3
RESULTS
Individual system testing is done in "eval" dir of respective systems. (Eg: for system "1" it will be at "s1/1/eval"). Consolidated results will be in "s1/results"
HELP
#To know about 'n-fold-validation.sh', run
$bash n-fold-validation.sh
or
$bash n-fold-validation.sh -h
--------------------
CHANGES
--------------------
2014-11-26 1.0 Initial release