Cannot create and train a new trait entity using HTTP API #811

Boyko-Karadzhov · 2017-10-05T12:49:29Z

Do you want to request a feature, report a bug, or ask a question about wit?
bug

What is the current behavior?

The entity is not recognized for expressions that were previously submitted as samples.
For some time after submitting the samples, the values do not appear in the entity, even though the training status remains clean.
After the values appear, there are duplicate values (probably the same problem as "Duplicate trait values for intent", #743).

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.
Create an entity:

curl -X POST \
  https://api.wit.ai/entities \
  -H 'authorization: Bearer xxxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
   "id":"Conversation"
}'

Add samples for two values:

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"I would like to book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I book a doctor for this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Is doctor Burke available this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I check doctor Burke'\''s schedule for this week?","entities":[{"entity":"Conversation","value":"bookDoctor"}]}]'

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Contact support","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can I talk to an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Would you get me in touch with a human?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"It will be great if I can talk to a person.","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can you get me in touch with an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Forward me to an operator","entities":[{"entity":"Conversation","value":"contactOperator"}]}]'

Wait for the training status to become clean, then test the application's understanding of one of the expressions, for example: Book a doctor.

What is the expected behavior?
When training status is clean:

the entity should have only 2 values: bookDoctor and contactOperator
messages that match expressions (or are similar) should be recognized as one of the entity's values. In this case: Book a doctor should be recognized as Conversation entity with bookDoctor value
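The expected behavior above can be checked from a script rather than by eye. The following is a minimal sketch, not part of the original report: `has_trait_value` is a hypothetical helper, and it assumes the GET /message response for this API version looks roughly like the JSON shown in the comment. It does a crude text match instead of real JSON parsing.

```shell
#!/usr/bin/env bash
# Sketch: check whether a /message response body contains the expected trait value.
# Assumption (not confirmed in this thread): the response looks roughly like
#   {"_text":"Book a doctor","entities":{"Conversation":[{"confidence":0.98,"value":"bookDoctor"}]}}
# This is a crude text match, not a JSON parser.

has_trait_value() {
  local entity="$1" value="$2" body
  body="$(cat)"   # read the response body from stdin
  # Both the entity name and the expected value must appear in the body
  printf '%s' "$body" | grep -q "\"$entity\"" &&
    printf '%s' "$body" | grep -q "\"value\" *: *\"$value\""
}

# Usage against the live API (WIT_TOKEN is a placeholder for your token):
#   curl -s -G 'https://api.wit.ai/message' \
#     -H "Authorization: Bearer $WIT_TOKEN" \
#     --data-urlencode 'v=20170307' \
#     --data-urlencode 'q=Book a doctor' \
#   | has_trait_value Conversation bookDoctor && echo recognized
```

The function exits with status 0 only when both the entity name and the value are present, so it can be used directly in an `if` or with `&&`.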

What is the App ID where you are experiencing this issue (if applicable)?
59d6256d-faae-4935-a4ba-7ff546707d4d (easily reproduced in a new app)


l5t · 2017-10-05T22:53:49Z

Thanks for reporting. We identified the bug and are working on it

patapizza · 2017-10-11T17:32:10Z

@Boyko-Karadzhov I removed the duplicates for Conversation. We are still working on a fix.

blandinw · 2017-10-12T22:38:23Z

@Boyko-Karadzhov this is now fixed. We had a bug in our normalization code, causing the uppercase letter in your value to not be properly handled. Apologies for the inconvenience and thanks for your patience.

Boyko-Karadzhov · 2017-10-16T08:58:06Z

Hello again,

The values are no longer duplicated, but the recognition issue remains. I have made a new app to demonstrate: 59e47043-64f9-4daf-acf2-5c2710a999dc

It has the entity, values and samples persisted. I have trained it using this bash script (same as before):

curl -X POST \
  https://api.wit.ai/entities \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
   "id":"Conversation"
}'

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"I would like to book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I book a doctor for this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Is doctor Burke available this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I check doctor Burke'\''s schedule for this week?","entities":[{"entity":"Conversation","value":"bookDoctor"}]}]'

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Contact support","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can I talk to an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Would you get me in touch with a human?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"It will be great if I can talk to a person.","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can you get me in touch with an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Forward me to an operator","entities":[{"entity":"Conversation","value":"contactOperator"}]}]'

After waiting for the status to become "clean" I have tried recognition of a message "Book a doctor" and it is not recognized.

I don't think it is a matter of waiting and status. At this point, it is stuck and it will not learn to recognize Conversation entities until I add another value (I have experimented).

Is it possible that the actual training is queued for the first request (create entity, no samples), and the subsequent requests don't trigger a new training run because one is already queued, but with an outdated model? I'm trying to explain why adding a new value after a while results in correct training of all values.

If you make the requests manually, with a significant delay between them, training works. The problem is that I'm doing it programmatically, and adding constant delays still leaves it to chance.
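Instead of a constant delay, a script can poll a recognition check with a growing delay and give up after a bounded number of attempts. This is only a sketch under stated assumptions: `retry_with_backoff` is a hypothetical helper, and the check command passed to it (for example a function that queries GET /message and inspects the result) is left to the caller. It does not address the stuck-training behavior described above; it only turns "leave it to chance" into an explicit success or timeout.

```shell
#!/usr/bin/env bash
# Sketch: run a check command until it succeeds, doubling the delay each time.
# Arguments: max attempts, initial delay in seconds, then the command to run.

retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0                      # check passed
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"                # wait before the next attempt
      delay=$((delay * 2))          # exponential backoff
    fi
    attempt=$((attempt + 1))
  done
  return 1                          # gave up: report failure to the caller
}

# Usage (check_recognized is a hypothetical function you would write):
#   retry_with_backoff 6 5 check_recognized || echo "still not recognized" >&2
```

With 6 attempts and an initial delay of 5 seconds, the total wait before giving up is 5 + 10 + 20 + 40 + 80 = 155 seconds.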

Boyko-Karadzhov · 2017-10-17T07:18:22Z

@blandinw Would you reopen the issue?
I'm not sure if it is reaching you since it is already closed.

blandinw · 2017-10-17T18:55:46Z

@Boyko-Karadzhov I'm looking into it.
Also, as the creator of the issue, are you not able to reopen it yourself?

blandinw · 2017-10-18T23:46:16Z

@Boyko-Karadzhov I used your script and was not able to reproduce your issue.

Please keep in mind that POST /samples trains your app asynchronously: getting a 200 back does not mean your app is trained. During normal operations, your app should be trained within a few seconds of the POST /samples request. However, at the moment, during peak traffic (like we had a few times these past few weeks), it may take a few minutes or even a few hours. We are working on a new dataset infra that should make the worst-case scenario much faster (max 1 min), but this has not been released yet.

I'm going to close this, please comment back if it still does not work.

Boyko-Karadzhov · 2017-10-19T07:53:27Z

@blandinw The app, 59e47043-64f9-4daf-acf2-5c2710a999dc that I used to demonstrate, was created 3 days ago and it is still not trained. I doubt it will start recognizing entities no matter how much we wait. I have just tried it again with a new app - 59e854b0-264f-4825-87bc-de62686bd9a6. Again - no recognition. Asynchronicity aside, I think there is a bug because the apps will be trained if I force them with one extra sample after a while. They just don't train on the first run. I reproduce it consistently.
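The workaround mentioned above (forcing training with one extra sample) can be scripted. The sketch below is an assumption-laden illustration, not an official recipe: `sample_payload` is a hypothetical helper that builds a one-element body in the same shape as the POST /samples requests earlier in this thread. It does no JSON escaping, so the arguments must not contain double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Sketch: build the JSON body for a single extra sample.
# Caveat: no JSON escaping is performed; keep quotes and backslashes out of the arguments.

sample_payload() {
  local text="$1" entity="$2" value="$3"
  printf '[{"text":"%s","entities":[{"entity":"%s","value":"%s"}]}]' \
    "$text" "$entity" "$value"
}

# Usage against the live API (WIT_TOKEN is a placeholder for your token):
#   curl -X POST 'https://api.wit.ai/samples?v=20170307' \
#     -H "Authorization: Bearer $WIT_TOKEN" \
#     -H 'Content-Type: application/json' \
#     -d "$(sample_payload 'Please book a doctor' Conversation bookDoctor)"
```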

There is also the issue of the uncertainty of the outcome. There is no way to know whether training is done, whether we should wait longer, or whether something went wrong along the way and it will never finish.

Can we get a status endpoint to tell us if training is queued, in progress, ready or failed? I'm using the /status which you use in the wit.ai UI but it seems to return clean before processing the samples and cannot be used reliably as an indicator.

grinono · 2017-10-23T09:19:14Z

We have the same issue. Pushing keywords to an entity via the HTTP REST API works, but once they are there, they are never recognized.

hristoborisov · 2017-10-25T13:19:38Z

@blandinw we are continuing to experience this problem. Can you please help us resolve it before you close the issue? We are relying on your API to train chatbots on the go, and it simply doesn't work.

stopachka · 2017-10-25T19:23:28Z

Hey Hristo,

There are quite a few different issues here:

1. Training not stopping
2. Training never starting
3. Duplicate keywords

Are you experiencing issues with 1, 2, or 3?

If you could tell me your app ID and give me a repro of your issue, I'm happy to look into it.

(Also, 3 does not relate to the initial issue, as trait entities do not have keywords. If that is your issue, we will consolidate it into a different item so we can be on the same page.)

hristoborisov · 2017-10-26T12:27:19Z

Hey @stopachka,

I am referring to the only issue that wasn't resolved in this thread: training is done (clean status), but there is no understanding. Let me copy/paste @Boyko-Karadzhov's steps to reproduce here again.

I have made a new app to demonstrate: 59e47043-64f9-4daf-acf2-5c2710a999dc. It has the entity, values and samples persisted. I have trained it using this bash script (same as before):

curl -X POST \
  https://api.wit.ai/entities \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
   "id":"Conversation"
}'

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"I would like to book a doctor","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I book a doctor for this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Is doctor Burke available this Tuesday?","entities":[{"entity":"Conversation","value":"bookDoctor"}]},{"text":"Can I check doctor Burke'\''s schedule for this week?","entities":[{"entity":"Conversation","value":"bookDoctor"}]}]'

curl -X POST \
  'https://api.wit.ai/samples?v=20170307' \
  -H 'authorization: Bearer xxxxxx' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '[{"text":"Contact support","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can I talk to an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Would you get me in touch with a human?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"It will be great if I can talk to a person.","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Can you get me in touch with an operator?","entities":[{"entity":"Conversation","value":"contactOperator"}]},{"text":"Forward me to an operator","entities":[{"entity":"Conversation","value":"contactOperator"}]}]'

After waiting for the status to become "clean" I have tried recognition of a message "Book a doctor" and it is not recognized.

I don't think it is a matter of waiting and status. At this point, it is stuck and it will not learn to recognize Conversation entities until I add another value (I have experimented).

Is it possible that the actual training is queued for the first request (create entity, no samples), and the subsequent requests don't trigger a new training run because one is already queued, but with an outdated model? I'm trying to explain why adding a new value after a while results in correct training of all values.

If you make the requests manually, with a significant delay between them, training works. The problem is that I'm doing it programmatically, and adding constant delays still leaves it to chance.

patapizza · 2017-10-30T16:32:54Z

Hey @hristoborisov, just to make sure, does it work after a while without adding in a new value?

darvinai · 2017-11-02T14:32:13Z

@patapizza no, it doesn't work. We have projects created a month ago that we haven't touched, and they are still not working. If you touch them with new values, they start to work. I am writing from our system GitHub account.

-Hristo Borisov

blandinw · 2017-11-06T17:25:56Z

We'll look into it again, thanks for your patience

bpleao · 2017-11-29T11:54:32Z

Any news on this topic? I'm facing similar issues. Thanks

stopachka · 2017-12-01T19:58:46Z

Update on this: #876
tl;dr: reproducing this is quite hard, but we have 2 action items to get to a solution. Moving the conversation to that thread.

mohit2494 · 2019-02-19T11:45:19Z

Any update on the above points?
Is anybody still facing the issue?

I wanted to know how to create a spanless entity (trait), which was formerly called intent (now deprecated).
Can anyone help me out with an example where I can create a spanless entity and train it using some expressions? Thanks for your help.

patapizza · 2019-02-25T21:10:42Z

@mohit2494 Seems like you got your answers in #231? Please create a new issue instead of bubbling up old ones, it makes it easier to track.

Thanks.

patapizza added the bug label Oct 5, 2017

patapizza self-assigned this Oct 5, 2017

blandinw closed this as completed Oct 12, 2017

blandinw reopened this Oct 17, 2017

blandinw closed this as completed Oct 18, 2017

stopachka reopened this Oct 25, 2017

stopachka mentioned this issue Oct 25, 2017

PUT a bulk list of Keywords to a Entity #830

Closed

blandinw added the investigating label Nov 6, 2017

stopachka self-assigned this Nov 9, 2017

stopachka mentioned this issue Nov 11, 2017

BUG, Build Entities with API - Samples don't work #848

Closed

stopachka closed this as completed Dec 1, 2017
