Currently, the indexer does not use the backend interface; instead, it is handed the ISO/TEI files going into the index directly. I don't see why this has to be so, although I know that AGD indexing is based not on the transcripts linked to the backend, but on a transformed / enriched version of them.
Running the indexer via the backend would be more consistent and transparent (for other users, for documentation). It will certainly do no harm to have an additional indexer which uses methods from BackendInterface to iterate over transcripts. The requirement that ISO/TEI transcripts be pre-processed before being handed to the indexer could be handled via an abstract method:
public abstract Transcript preProcess(Transcript transcriptFromCorpus);
There may be performance issues, but they will be much less pronounced for anything smaller than FOLK or ZW (i.e. for almost all corpora). For the challenging cases, the current indexer would still be there (but maybe in the specific application, not in the "general API"?).
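To make the proposal a bit more concrete, here is a rough sketch of what such a backend-driven indexer might look like. Only `BackendInterface`, `Transcript` and the `preProcess` signature come from the text above; the nested types are simplified stand-ins, and `getTranscriptIDs`, `getTranscript`, `addToIndex`, `index` and `demo` are hypothetical names chosen for illustration, so the real API will probably differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BackendIndexerSketch {

    /** Simplified stand-in for the real Transcript type (assumption). */
    static final class Transcript {
        final String id;
        final String text;
        Transcript(String id, String text) { this.id = id; this.text = text; }
    }

    /** Hypothetical subset of BackendInterface; the real method names may differ. */
    interface BackendInterface {
        List<String> getTranscriptIDs(String corpusID);
        Transcript getTranscript(String transcriptID);
    }

    /** Indexer that iterates over transcripts via the backend instead of being handed files. */
    abstract static class BackendIndexer {
        private final BackendInterface backend;

        BackendIndexer(BackendInterface backend) { this.backend = backend; }

        /** Application-specific transformation / enrichment before indexing. */
        public abstract Transcript preProcess(Transcript transcriptFromCorpus);

        /** Hypothetical hook that writes one pre-processed transcript to the index. */
        protected abstract void addToIndex(Transcript transcript);

        /** Indexes every transcript of the corpus and returns how many were indexed. */
        public int index(String corpusID) {
            int count = 0;
            for (String id : backend.getTranscriptIDs(corpusID)) {
                addToIndex(preProcess(backend.getTranscript(id)));
                count++;
            }
            return count;
        }
    }

    /** Tiny in-memory demo: pre-processing upper-cases the text before "indexing". */
    static List<String> demo() {
        final Map<String, Transcript> store = new LinkedHashMap<>();
        store.put("t1", new Transcript("t1", "hello"));
        store.put("t2", new Transcript("t2", "world"));

        BackendInterface backend = new BackendInterface() {
            public List<String> getTranscriptIDs(String corpusID) {
                return new ArrayList<>(store.keySet());
            }
            public Transcript getTranscript(String transcriptID) {
                return store.get(transcriptID);
            }
        };

        final List<String> indexed = new ArrayList<>();
        BackendIndexer indexer = new BackendIndexer(backend) {
            public Transcript preProcess(Transcript t) {
                return new Transcript(t.id, t.text.toUpperCase());
            }
            protected void addToIndex(Transcript t) {
                indexed.add(t.id + ":" + t.text);
            }
        };
        indexer.index("demo-corpus");
        return indexed;
    }
}
```

With this shape, the AGD-specific enrichment would live entirely in the `preProcess` implementation of a subclass, while the iteration logic stays generic and could sit in the "general API".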
Wondering what @EleFri thinks :-)