Glebashnik/feed field generator #32842

Draft
wants to merge 4 commits into
base: master

glebashnik
Contributor

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

import java.util.List;
import java.util.Map;

public class GenerateExpression extends Expression {
Member

Please add @author and description.

Comment on lines +58 to +67
@Override
public DataType setInputType(DataType type, VerificationContext context) {
    // TODO: Not sure if this implementation of the methods is correct, needs careful review.
    super.setInputType(type, context);

    if (type == DataType.STRING)
        throw new IllegalArgumentException("generate requires a string input type, but got " + type);

    return DataType.STRING;
}
Member

Suggested change
- @Override
- public DataType setInputType(DataType type, VerificationContext context) {
-     // TODO: Not sure if this implementation of the methods is correct, needs careful review.
-     super.setInputType(type, context);
-     if (type == DataType.STRING)
-         throw new IllegalArgumentException("generate requires a string input type, but got " + type);
-     return DataType.STRING;
- }
+ @Override
+ public DataType setInputType(DataType inputType, VerificationContext context) {
+     return super.setInputType(inputType, DataType.STRING, context);
+ }

Comment on lines +69 to +78
@Override
public DataType setOutputType(DataType type, VerificationContext context) {
    // TODO: Not sure if this implementation of the methods is correct, needs careful review.
    super.setOutputType(type, type, context);

    if (type != DataType.STRING)
        throw new IllegalArgumentException("generate requires a string input type, but got " + type);

    return DataType.STRING;
}
Member

Suggested change
- @Override
- public DataType setOutputType(DataType type, VerificationContext context) {
-     // TODO: Not sure if this implementation of the methods is correct, needs careful review.
-     super.setOutputType(type, type, context);
-     if (type != DataType.STRING)
-         throw new IllegalArgumentException("generate requires a string input type, but got " + type);
-     return DataType.STRING;
- }
+ @Override
+ public DataType setOutputType(DataType outputType, VerificationContext context) {
+     return super.setOutputType(DataType.STRING, outputType, null, context);
+ }

String generatorId,
List<String> generatorArguments
) {
super(null);
Member

Suggested change
- super(null);
+ super(DataType.STRING);

Comment on lines +89 to +92

if (!validTarget(targetType))
    throw new VerificationException(this, "The generate target field must be a String");

Member

Suggested change
- if (!validTarget(targetType))
-     throw new VerificationException(this, "The generate target field must be a String");

@@ -0,0 +1,29 @@
package ai.vespa.generative;
Member

Same, I think we could put this in llm.generation

And maybe, since what separates it from Generator isn't that it's generating from a language model, but that it's taking an additional config specifying a prompt, we could call it something like PromptedGenerator, or ConfiguredGenerator?
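To make the naming idea concrete, here is a rough sketch of a generator whose only extra ingredient is a configured prompt. Everything in it is an assumption for illustration: the simplified `Generator` interface, the `PromptedGenerator` class, and the `{input}` placeholder syntax are hypothetical and not taken from this PR.

```java
import java.util.Objects;

// Hypothetical, simplified generator interface: turns a prompt into generated text.
interface Generator {
    String generate(String prompt);
}

// Hypothetical wrapper: holds a prompt template that would come from config
// and fills in the input before delegating to the underlying generator.
final class PromptedGenerator implements Generator {

    // Assumed placeholder syntax; the real config format may look different.
    static final String INPUT_PLACEHOLDER = "{input}";

    private final Generator delegate;
    private final String promptTemplate;

    PromptedGenerator(Generator delegate, String promptTemplate) {
        this.delegate = Objects.requireNonNull(delegate);
        this.promptTemplate = Objects.requireNonNull(promptTemplate);
    }

    @Override
    public String generate(String input) {
        // Substitute the input if the template has a placeholder, otherwise append it.
        String prompt = promptTemplate.contains(INPUT_PLACEHOLDER)
                ? promptTemplate.replace(INPUT_PLACEHOLDER, input)
                : promptTemplate + " " + input;
        return delegate.generate(prompt);
    }
}
```

With this shape, the name describes what the class adds (a configured prompt) rather than what it delegates to.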

return Map.of(name, this);
}

String generate(String prompt, Context context, DataType dataType);
Member

Please remove the DataType argument. I don't think you need it, and since it is a document-side concept, it makes it weird to use this interface in other contexts.
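A minimal sketch of how that split could look, assuming the document-side caller owns the type check. `Context`, `Generator`, and `GenerateFieldStep` here are hypothetical stand-ins, not the actual Vespa classes.

```java
// Hypothetical stand-in for the invocation context; not the real Vespa class.
final class Context {
    private final String destination;

    Context(String destination) { this.destination = destination; }

    String destination() { return destination; }
}

// The generator only sees a prompt and a context, nothing document-specific.
interface Generator {
    String generate(String prompt, Context context);
}

// Hypothetical document-side caller: it checks the target type itself,
// so the generator never needs a DataType argument.
final class GenerateFieldStep {
    private final Generator generator;
    private final boolean targetIsString;

    GenerateFieldStep(Generator generator, boolean targetIsString) {
        this.generator = generator;
        this.targetIsString = targetIsString;
    }

    String execute(String input, Context context) {
        if (!targetIsString)
            throw new IllegalArgumentException("The generate target field must be a string");
        return generator.generate(input, context);
    }
}
```

The generator stays usable outside document processing because nothing in its signature refers to document types.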

return Map.of(name, this);
}

String generate(String prompt, Context context, DataType dataType);
Member

Why not use the Prompt class here, so we have the option of passing through prompts that aren't just strings?
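Roughly what I have in mind, as a sketch only. The `Prompt`, `StringPrompt`, `InstructionPrompt`, `Context`, and `Generator` types below are hypothetical stand-ins written for this example; the existing Prompt class may have a different shape.

```java
// Hypothetical prompt abstraction; the real Prompt class may differ.
interface Prompt {
    String asString();
}

// The plain string case keeps working as before.
final class StringPrompt implements Prompt {
    private final String text;

    StringPrompt(String text) { this.text = text; }

    @Override
    public String asString() { return text; }
}

// A prompt that is not just a string: an instruction kept separate from the input.
final class InstructionPrompt implements Prompt {
    private final String instruction;
    private final String input;

    InstructionPrompt(String instruction, String input) {
        this.instruction = instruction;
        this.input = input;
    }

    String instruction() { return instruction; }

    String input() { return input; }

    @Override
    public String asString() { return instruction + "\n\n" + input; }
}

// Hypothetical, empty stand-in for the invocation context.
final class Context {}

// The generator accepts any Prompt, so callers are not limited to strings.
interface Generator {
    String generate(Prompt prompt, Context context);
}
```

A generator that only needs text calls `asString()`, while one that understands structured prompts can inspect the concrete type.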

@@ -31,7 +34,7 @@
*
* @author lesters
*/
- public class LocalLLM extends AbstractComponent implements LanguageModel {
+ public class LocalLLM extends AbstractComponent implements LanguageModel, Generator {
Member

The role of Generator is to apply a LanguageModel to a generation task specified by a context. This is a more general API so it's a bit messy for it to know about that context (and also to call itself through a utility defined elsewhere). I see the purpose though ... maybe there's a better solution, but I need to get some breakfast now.

@@ -33,6 +33,7 @@ import com.yahoo.text.StringUtilities;
import com.yahoo.vespa.indexinglanguage.expressions.*;
Member

Unfortunately, atm we have two parsers. You need to also make these same additions in IndexingParser.ccc in schema-language-server.

2 participants