IOError: [Errno 2] No such file or directory: 'data/demo/infovis/corpus/corpus.txt' #32

Open
dbl001 opened this issue Jul 26, 2017 · 1 comment


dbl001 commented Jul 26, 2017

I'm trying to install termite-data-server on OS X 10.11.6, and the infovis demo fails:

./demo.py infovis
--------------------------------------------------------------------------------
Build a topic model (mallet) using a demo dataset (infovis)
  database = data/demo/infovis/corpus
    corpus = data/demo/infovis/corpus
     model = data/demo/infovis/model-mallet
       app = infovis_mallet
--------------------------------------------------------------------------------
# Setting up the infovis dataset...
    Creating folder 'data/demo/infovis'...
    Downloading...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   154    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to localhost port 80: Connection refused
    Uncompressing...
[data/demo/infovis/download/infovis.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of data/demo/infovis/download/infovis.zip or
        data/demo/infovis/download/infovis.zip.zip, and cannot find data/demo/infovis/download/infovis.zip.ZIP, period.
    Extracting corpus.txt from corpus.db...
Exporting database [data/demo/infovis/corpus/corpus.db] to file [data/demo/infovis/corpus/corpus.txt]
Traceback (most recent call last):
  File "bin/export_corpus.py", line 25, in <module>
    main()
  File "bin/export_corpus.py", line 22, in main
    ExportCorpus(args.database, args.corpus)
  File "bin/export_corpus.py", line 14, in ExportCorpus
    with Corpus_DB(database_path) as corpusDB:
  File "/Users/davidlaxer/termite-data-server/bin/db/Corpus_DB.py", line 33, in __enter__
    self.DefineOptionsTable()
  File "/Users/davidlaxer/termite-data-server/bin/db/Corpus_DB.py", line 52, in DefineOptionsTable
    self.SetOption( key, value, overwrite = self.isInit )
  File "/Users/davidlaxer/termite-data-server/bin/db/Corpus_DB.py", line 57, in SetOption
    if self.db( where ).count() > 0:
  File "web2py/gluon/dal.py", line 10515, in count
    return db._adapter.count(self.query,distinct)
  File "web2py/gluon/dal.py", line 1902, in count
    self.execute(self._count(query, distinct))
  File "web2py/gluon/dal.py", line 1969, in execute
    return self.log_execute(*a, **b)
  File "web2py/gluon/dal.py", line 1963, in log_execute
    ret = self.cursor.execute(command, *a[1:], **b)
sqlite3.OperationalError: no such table: options
    Corpus available: data/demo/infovis/corpus
# Setting up MALLET (MAchine Learning for LanguagE Toolkit)...
    Creating folder 'externals/mallet-2.0.7'...
    Downloading...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11.8M  100 11.8M    0     0  1624k      0  0:00:07  0:00:07 --:--:-- 2011k
    Extracting license...
    Creating folder 'tools/mallet-2.0.7'...
    Uncompressing...
    Available: tools/mallet-2.0.7
    Available: tools/mallet-2.0.7
# Setting up Stanford CoreNLP tools...
    Creating folder 'externals/corenlp-3.3.1'...
    Downloading...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0   336    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  214M  100  214M    0     0  1211k      0  0:03:01  0:03:01 --:--:-- 1429k
    Extracting license...
    Creating folder 'tools/corenlp-3.3.1'...
    Uncompressing...
    Available: tools/corenlp-3.3.1
--------------------------------------------------------------------------------
Training an LDA topic model using MALLET...
       corpus = data/demo/infovis/corpus/corpus.txt
        model = data/demo/infovis/model-mallet
  token_regex = \w{3,}
       topics = 20
        iters = 1000
--------------------------------------------------------------------------------
Importing a file into MALLET: [data/demo/infovis/corpus/corpus.txt] --> [data/demo/infovis/model-mallet/corpus.mallet]
Exception in thread "main" java.io.FileNotFoundException: data/demo/infovis/corpus/corpus.txt (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:131)
	at cc.mallet.classify.tui.Csv2Vectors.main(Csv2Vectors.java:260)
Training an LDA model in MALLET: [data/demo/infovis/model-mallet/corpus.mallet] --> [data/demo/infovis/model-mallet/lda.mallet]
java.io.FileNotFoundException: data/demo/infovis/model-mallet/corpus.mallet (No such file or directory)
	at java.io.FileInputStream.open(Native Method)
	at java.io.FileInputStream.<init>(FileInputStream.java:131)
	at cc.mallet.types.InstanceList.load(InstanceList.java:787)
	at cc.mallet.topics.tui.Vectors2Topics.main(Vectors2Topics.java:378)
Exception in thread "main" java.lang.IllegalArgumentException: Couldn't read InstanceList from file data/demo/infovis/model-mallet/corpus.mallet
	at cc.mallet.types.InstanceList.load(InstanceList.java:794)
	at cc.mallet.topics.tui.Vectors2Topics.main(Vectors2Topics.java:378)
--------------------------------------------------------------------------------
Import a MALLET LDA topic model as a web2py application...
           app_name = infovis_mallet
           app_path = apps/infovis_mallet
         model_path = data/demo/infovis/model-mallet
    corpus_filename = data/demo/infovis/corpus/corpus.txt
  database_filename = data/demo/infovis/corpus/corpus.db
--------------------------------------------------------------------------------
Creating app: infovis_mallet [apps/temp_20170725_221922_651726_0528]
Creating folder: [apps/temp_20170725_221922_651726_0528/data]
Creating folder: [apps/temp_20170725_221922_651726_0528/databases]
Linking folder: [apps/temp_20170725_221922_651726_0528/models]
Linking folder: [apps/temp_20170725_221922_651726_0528/views]
Linking folder: [apps/temp_20170725_221922_651726_0528/controllers]
Linking folder: [apps/temp_20170725_221922_651726_0528/static]
Linking folder: [apps/temp_20170725_221922_651726_0528/modules]
Creating file: [apps/temp_20170725_221922_651726_0528/__init__.py]
David-Laxers-MacBook-Pro:termite-data-server davidlaxer$ ./demo.py infovis
--------------------------------------------------------------------------------
Build a topic model (mallet) using a demo dataset (infovis)
  database = data/demo/infovis/corpus
    corpus = data/demo/infovis/corpus
     model = data/demo/infovis/model-mallet
       app = infovis_mallet
--------------------------------------------------------------------------------
    Available: tools/corenlp-3.3.1
--------------------------------------------------------------------------------
Training an LDA topic model using MALLET...
       corpus = data/demo/infovis/corpus/corpus.txt
        model = data/demo/infovis/model-mallet
  token_regex = \w{3,}
       topics = 20
        iters = 1000
--------------------------------------------------------------------------------
    Already exists: data/demo/infovis/model-mallet
--------------------------------------------------------------------------------
Import a MALLET LDA topic model as a web2py application...
           app_name = infovis_mallet
           app_path = apps/infovis_mallet
         model_path = data/demo/infovis/model-mallet
    corpus_filename = data/demo/infovis/corpus/corpus.txt
  database_filename = data/demo/infovis/corpus/corpus.db
--------------------------------------------------------------------------------
Creating app: infovis_mallet [apps/temp_20170725_222309_156011_4992]
Creating folder: [apps/temp_20170725_222309_156011_4992/data]
Creating folder: [apps/temp_20170725_222309_156011_4992/databases]
Linking folder: [apps/temp_20170725_222309_156011_4992/models]
Linking folder: [apps/temp_20170725_222309_156011_4992/views]
Linking folder: [apps/temp_20170725_222309_156011_4992/controllers]
Linking folder: [apps/temp_20170725_222309_156011_4992/static]
Linking folder: [apps/temp_20170725_222309_156011_4992/modules]
Creating file: [apps/temp_20170725_222309_156011_4992/__init__.py]
Copying [data/demo/infovis/corpus/corpus.db] --> [apps/temp_20170725_222309_156011_4992/databases/corpus.db]
Copying [data/demo/infovis/corpus/corpus.txt] --> [apps/temp_20170725_222309_156011_4992/data/corpus.txt]
An error occured while creating app: infovis_mallet [apps/infovis_mallet]
Traceback (most recent call last):
  File "bin/read_mallet.py", line 85, in <module>
    main()
  File "bin/read_mallet.py", line 82, in main
    ImportMalletLDA( args.app_name, args.model_path, args.corpus_path, args.database_path, args.quiet, args.overwrite )
  File "bin/read_mallet.py", line 47, in ImportMalletLDA
    shutil.copy( corpus_filename, app_corpus_filename )
  File "/Users/davidlaxer/anaconda/lib/python2.7/shutil.py", line 119, in copy
    copyfile(src, dst)
  File "/Users/davidlaxer/anaconda/lib/python2.7/shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: 'data/demo/infovis/corpus/corpus.txt'

Any ideas about the missing file: corpus.txt?
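
Reading the log above, the failures appear to cascade: curl could not connect, so infovis.zip is not a valid archive, so corpus.db was never populated, so corpus.txt was never exported. Below is a minimal sketch for checking where that chain first breaks. The paths come from the log; the function name `diagnose_demo_data` and the status strings are made up for illustration and are not part of termite-data-server.

```python
import os
import zipfile


def diagnose_demo_data(root="data/demo/infovis"):
    """Return (path, status) pairs for the files the demo run expects.

    Hypothetical helper, not part of termite-data-server. The relative
    paths below are the ones printed in the demo.py log.
    """
    zip_path = os.path.join(root, "download", "infovis.zip")
    db_path = os.path.join(root, "corpus", "corpus.db")
    txt_path = os.path.join(root, "corpus", "corpus.txt")

    report = []

    # A failed download can leave a missing file or a non-zip payload
    # (for example an error page), which unzip then rejects.
    if not os.path.exists(zip_path):
        report.append((zip_path, "missing"))
    elif not zipfile.is_zipfile(zip_path):
        report.append((zip_path, "not a valid zip (download probably failed)"))
    else:
        report.append((zip_path, "ok"))

    # The database and the exported text file only exist (and are
    # non-empty) if the earlier steps succeeded.
    for path in (db_path, txt_path):
        if not os.path.exists(path):
            report.append((path, "missing"))
        elif os.path.getsize(path) == 0:
            report.append((path, "empty"))
        else:
            report.append((path, "ok"))

    return report


if __name__ == "__main__":
    for path, status in diagnose_demo_data():
        print("%-45s %s" % (path, status))
```

Run it from the termite-data-server checkout. Note that the second demo.py run in the log skips training because data/demo/infovis/model-mallet already exists, so it may be worth removing data/demo/infovis before retrying once the download is fixed.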

@xinnyuann

I got the same error when trying the demo with 20newsgroups. I checked the data-fetch.sh file, and my guess is that the download link no longer exists. Could anyone point me in the right direction? Thanks so much.
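
To test the dead-link guess, one option is to list the URLs that the fetch script references and try each one by hand (for example with curl). A rough sketch, assuming the script's contents are read into a string; the `find_urls` helper and the regex are illustrative only and do not reflect how data-fetch.sh is actually written:

```python
import re

# Matches http/https URLs up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s'\"<>]+")


def find_urls(script_text):
    """Return the unique URLs found in script_text, in order of first appearance.

    Hypothetical helper for inspecting a shell script by hand.
    """
    seen = []
    for url in URL_PATTERN.findall(script_text):
        if url not in seen:
            seen.append(url)
    return seen
```

Pass it the text of the fetch script and check each returned URL manually. URLs that the script assembles from shell variables would not be caught by this simple pattern.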
