Correcting AS-MOSES outputs: Approach Suggestion #115

Open
Eman22S opened this issue Dec 10, 2019 · 4 comments

@Eman22S

Eman22S commented Dec 10, 2019

Overview

This addresses the issue raised in the report Optimizing as-moses: Reports #109. To summarize, asmoses does not currently behave as required: its predicted programs do not match those of moses. When the problem is passed as a dataset of contin-only or boolean-only values, however small, asmoses fails to produce the correct program. The slowdown discussed in the report is very likely related, since running demo problems with and without atomspace-port=1 does not cause nearly as significant a slowdown as passing a dataset does.

Observation

My observation on the behavior of asmoses is as follows:

Running a demo problem with atomspace-port=1 produces the correct output, with no discrepancy between moses and asmoses:
[Image: no-discrepency]

Running the same problem (a disjunction) from a CSV dataset, however, produces incorrect results:
[Image: dataset-or-weired]

dataset-test2.csv contains:
[Image: datset]

The following is the result of solving a conjunction problem using a CSV dataset:
[Image: dataset-and-weired result]

dataset-test1.csv contains:
[Image: datset-and]

Suggestions

Looking at these outputs, we might conclude that a logic error is introduced somewhere in the workflow of asmoses -i dataset.csv. The same cannot be said of the demo problems, however, as they work as expected.

I have looked into some of the code and have a few speculations. Ctable population is one area I have doubts about:

The dataset is compressed, and each of its features is populated into the atomspace in a condensed, non-redundant structure. For instance, a table with two columns and 9 rows may condense to, say, 2 or 3 rows before being populated into the atomspace. The target column, however, is not populated in a way that preserves the compression structure for Atomese to understand. In other words, we don't have a compressed table representation for Atomese, and that might be what is causing all of this (see Compressed Table Representation #19). Hence asmoses does not really treat the ctable as a ctable when populating it, but turns it into a dataset in the form it prefers.
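To make the compression described above concrete, here is a minimal standalone sketch in C++. It is not the as-moses CTable implementation: the names `Row`, `TargetCounts`, `CompressedTable`, and `compress` are hypothetical, and boolean features are assumed. It only illustrates the idea that duplicate input rows collapse into one entry while the target outcomes are kept as counts, which is the structure that would need to survive population into the atomspace.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; these are not the as-moses classes.
using Row = std::vector<bool>;   // input feature values of one row

struct TargetCounts {            // how often each target value was observed
    std::size_t n_false = 0;
    std::size_t n_true = 0;
};

using CompressedTable = std::map<Row, TargetCounts>;

// Collapse duplicate input rows, keeping a count of the targets seen for each.
CompressedTable compress(const std::vector<std::pair<Row, bool>>& table) {
    CompressedTable out;
    for (const auto& [inputs, target] : table) {
        // Every row is expected to have the same number of features.
        assert(inputs.size() == table.front().first.size());
        TargetCounts& counts = out[inputs];
        if (target) {
            ++counts.n_true;
        } else {
            ++counts.n_false;
        }
    }
    return out;
}
```

With this shape, a table whose rows repeat the same inputs shrinks to one entry per distinct input combination, and the per-entry counts are what a scorer would need in order to weigh each compressed row correctly.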

Conclusion

I would like to hear your thoughts, @ngeiswei @kasimebrahim, on the above and on what testing strategy I should follow. My initial approach is black-box testing of each module in the asmoses workflow. But I am quite sure that, given the above problem description, there is a systematic way of tracking it down.

@ngeiswei
Member

Thanks @Eman22S, that's very useful info.

Have you tried enabling the fine log level and comparing the logs with and without the port?

If you use diff, you can remove the timestamps beforehand with

https://github.com/singnet/cogutil/blob/master/scripts/util/rm-timestamps.sh

@Eman22S
Author

Eman22S commented Dec 12, 2019

@ngeiswei, that sounds plausible. How do I use that when running asmoses?

@ngeiswei
Member

For the log level, just use asmoses -l fine .... For rm-timestamps.sh, you need to call it on the log file afterwards; I think calling rm-timestamps without arguments provides some help.
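The comparison described above (strip timestamps, then look for where the two logs part ways) can be sketched as a small standalone C++ helper. This is only an illustration and not a replacement for rm-timestamps.sh or diff: the function names are hypothetical, and it assumes each log line starts with a single bracketed timestamp field such as `[...]`, which may not match the actual log format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Drop one leading "[...]" field plus following spaces.
// Assumption: that field is the timestamp; lines without it are left untouched.
std::string strip_leading_bracket(const std::string& line) {
    if (line.empty() || line.front() != '[') return line;
    const std::size_t close = line.find(']');
    if (close == std::string::npos) return line;
    std::size_t start = close + 1;
    while (start < line.size() && line[start] == ' ') ++start;
    assert(start <= line.size());
    return line.substr(start);
}

// Index of the first line where the two logs differ once the leading field
// is stripped, or std::nullopt if they match line for line.
std::optional<std::size_t> first_divergence(const std::vector<std::string>& a,
                                            const std::vector<std::string>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (strip_leading_bracket(a[i]) != strip_leading_bracket(b[i])) return i;
    }
    // One log is a strict prefix of the other: they diverge where it ends.
    if (a.size() != b.size()) return n;
    return std::nullopt;
}
```

Feeding it the lines of a log produced with the port and one produced without it would point at the first step where the two runs stop agreeing, which is usually the most informative place to start reading.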

@kasimebrahim
Collaborator

@Eman22S, @ngeiswei I think the problems we are facing right now with as-moses arise from the same root. The data population is not working properly, specifically the population from the input table.

The first problem is the Not a link! error at Interpreter.cc line:66. The only way that could happen is if a node [Predicate or Schema] is given to the interpreter but its value is not set. This problem arises only when working on table-based problems, and it has to do with populating the dataset. It also explains the unexpected programs above.

I recommend two things:
1. Add at least an OC_ASSERT(program->is_node(), "...") just before line:65 in the Interpreter.
2. Make sure the input tables are populated properly into the atomspace in table-problems.cc line:167 and populate_atomspace.cc, and make sure the candidate programs in instance_scorer.cc, composite_score atomese_based_scorer::operator()(const instance &inst), are in sync with the atomspace instance. You can do that simply by checking whether the arguments [Predicates and Schemas] are populated after running Handle prog = _as.add_atom(prog);.
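The check suggested above, that every Predicate or Schema argument of a candidate program has a value set, can be sketched as follows. This is a standalone mock and does not use the AtomSpace API: `MockAtom`, `ValueStore`, and `collect_unpopulated` are hypothetical stand-ins for the real Handles and Values, chosen only to show the shape of the traversal.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real code works with AtomSpace Handles and Values.
struct MockAtom {
    std::string name;                                 // e.g. "$f1" or "AndLink"
    std::vector<std::shared_ptr<MockAtom>> outgoing;  // empty for nodes
    bool is_node() const { return outgoing.empty(); }
};

using AtomPtr = std::shared_ptr<MockAtom>;
// Maps a node name to the column of values populated for it (mock of the atomspace).
using ValueStore = std::map<std::string, std::vector<bool>>;

// Walk the program and record every node that has no populated value.
void collect_unpopulated(const AtomPtr& program,
                         const ValueStore& values,
                         std::vector<std::string>& missing) {
    assert(program != nullptr);
    if (program->is_node()) {
        const auto it = values.find(program->name);
        if (it == values.end() || it->second.empty()) {
            missing.push_back(program->name);
        }
        return;
    }
    for (const AtomPtr& child : program->outgoing) {
        collect_unpopulated(child, values, missing);
    }
}
```

An empty `missing` list after the walk would mean the candidate and the populated table agree; any name it contains is a node that would reach the interpreter without a value.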
