How to add more fulltext sources #62

talentoscope · 2016-09-20T21:21:20Z

This software is simply astounding.

I'd love to know, as would many others, exactly how to add multiple sources of information, ideally by adding more documents to Solr for indexing, such as other wikis, Project Gutenberg texts, etc. I assume these would all be processed by the Solr fulltext search whether or not there is a DBpedia clue?

Please help.

pasky · 2016-09-21T19:11:09Z

Documenting this is the subject of #17, but it is not possible out of the box, only in principle. (The architecture allows it, but there is no explicit code support for querying multiple Solr indices. Sure, you could just make sure the IDs are non-duplicate and index everything in a single Solr collection as a starting point...)
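A minimal sketch of the single-collection idea, in Python: namespace each document ID with the name of its source so that IDs from different corpora cannot collide. The field names (`id`, `titleText`, `text`), the source names, and the sample records are assumptions for illustration, not taken from the YodaQA schema. Whether YodaQA accepts non-numeric IDs is not confirmed here.

```python
import json


def make_solr_docs(source, records):
    """Build Solr documents whose IDs are namespaced by source name.

    `records` is an iterable of (local_id, title, text) tuples.
    The field names are hypothetical; adjust them to the actual schema.
    """
    docs = []
    for local_id, title, text in records:
        docs.append({
            "id": f"{source}:{local_id}",  # e.g. "gutenberg:42"
            "titleText": title,
            "text": text,
        })
    return docs


# Two sources that happen to reuse the same local ID.
wiki = make_solr_docs(
    "enwiki", [("42", "Douglas Adams", "Douglas Adams was an English author.")]
)
gutenberg = make_solr_docs(
    "gutenberg", [("42", "Moby Dick", "Call me Ishmael.")]
)

all_docs = wiki + gutenberg
ids = [d["id"] for d in all_docs]

# Fail early if any two documents would overwrite each other in the index.
assert len(ids) == len(set(ids)), "duplicate Solr document IDs"

# A JSON array of documents is what Solr's update handler accepts.
payload = json.dumps(all_docs)
```

The resulting `payload` could then be POSTed to the collection's `/update` handler with `Content-Type: application/json`. The collection name and URL depend on the local Solr setup.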

talentoscope · 2016-09-21T19:27:20Z

Thanks for the reply. Will look at doing that, and maybe play with the code to add multiple instances of Solr (or multiple collections).

pasky · 2016-09-21T19:33:57Z

Any contributions to the code or just to the documentation will be welcome!

talentoscope · 2016-09-21T19:36:05Z

Will definitely be contributing code back if I come up with anything, just getting up to speed with the code. Not used to Java.

talentoscope · 2016-09-22T02:01:06Z

Having looked at the code, I really don't think I'm going to be much use there. Instead, I will create a good dataset of questions sourced from many places and curate it with the question, answer, LAT, and anything else you feel is necessary to help train the system.

pasky · 2016-09-22T09:10:30Z

That would also be really cool! :)

talentoscope · 2016-09-22T09:15:00Z

Made a start on this last night and am up to about 250 questions. There are a few inference-based ones in there too, but it shouldn't be too hard to find the answers using the current system. I'm also benchmarking with base YodaQA, recording the position of the correct answer and its confidence. Hopefully that extra info will help with diagnosis, and it is easily removable. :)
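To make the "easily removable" part concrete, here is a rough Python sketch of one way such a record could be laid out: the core fields (question, answer, LAT) are always written, and the benchmark diagnostics are optional trailing columns. The column order, the field names, and the sample values are all hypothetical placeholders, not the actual dataset or real YodaQA output.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuestionRecord:
    qid: str
    question: str
    answer: str
    lat: str  # lexical answer type
    # Diagnostics from a baseline run; optional so they can be dropped.
    baseline_rank: Optional[int] = None
    baseline_confidence: Optional[float] = None


def to_tsv(records, include_diagnostics=True):
    """Serialize records as TSV, optionally with the diagnostic columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for r in records:
        row = [r.qid, r.question, r.answer, r.lat]
        if include_diagnostics:
            rank = "" if r.baseline_rank is None else str(r.baseline_rank)
            conf = (
                ""
                if r.baseline_confidence is None
                else f"{r.baseline_confidence:.3f}"
            )
            row += [rank, conf]
        writer.writerow(row)
    return buf.getvalue()


# Placeholder values for illustration only, not real benchmark results.
records = [
    QuestionRecord("q1", "Who wrote Moby-Dick?", "Herman Melville", "author", 3, 0.12)
]

with_diag = to_tsv(records)
without_diag = to_tsv(records, include_diagnostics=False)
```

Keeping the diagnostics as trailing columns means the plain question/answer/LAT file is just the same export with `include_diagnostics=False`.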

pasky closed this as completed Sep 21, 2016
