Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a bool fold_lowercase to whisper_context_params #2005

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ulatekh
Copy link
Contributor

@ulatekh ulatekh commented Mar 27, 2024

If true, it folds language-model tokens to lowercase. By default, it's false.
This is intended to make grammar matching more predictable, e.g. no need to account for case in the grammar.

If true, it folds language-model tokens to lowercase.
By default, it's false.
This is intended to make grammar matching more predictable,
e.g. no need to account for case in the grammar.
@@ -44,6 +44,7 @@ struct whisper_params {
bool print_energy = false;
bool no_timestamps = true;
bool use_gpu = true;
bool model_fold_lc = false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change name to vocab_lc

whisper.h Outdated Show resolved Hide resolved
whisper.cpp Outdated Show resolved Hide resolved
Comment on lines 209 to +210
fprintf(stderr, " -nt, --no-timestamps [%-7s] do not print timestamps\n", params.no_timestamps ? "true" : "false");
fprintf(stderr, " --model-fold-lc [%-7s] fold all model tokens to lowercase\n", params.model_fold_lc ? "true" : "false");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fprintf(stderr, " -nt, --no-timestamps [%-7s] do not print timestamps\n", params.no_timestamps ? "true" : "false");
fprintf(stderr, " --model-fold-lc [%-7s] fold all model tokens to lowercase\n", params.model_fold_lc ? "true" : "false");
fprintf(stderr, " -nt, --no-timestamps [%-7s] do not print timestamps\n", params.no_timestamps ? "true" : "false");
fprintf(stderr, " --vocab-lc [%-7s] fold all vocab tokens to lowercase\n", params.vocab_lc ? "true" : "false");

@ulatekh
Copy link
Contributor Author

ulatekh commented Apr 11, 2024

I have no idea what's wrong with the Java bindings. I loaded them all into Visual Studio Code and fixed all the errors it reported (which didn't seem related to my changes), but still the Java-related tests fail. FYI, I haven't programmed in Java in over 10 years.

@ggerganov
Copy link
Owner

I'm also not good with Java, but I think we are probably observing an issue similar to this one: ggerganov/llama.cpp#1902 (comment)

In short, even though the two structs whisper_context_params (C) and WhisperContextParams (Java) have the same members, there are likely different paddings between the members which is causing UB in the following code:

/**
* Provides default params which can be used with `whisper_init_from_file_with_params()` etc.
* Because this function allocates memory for the params, the caller must call either:
* - call `whisper_free_context_params()`
* - `Native.free(Pointer.nativeValue(pointer));`
*/
public WhisperContextParams getContextDefaultParams() {
paramsPointer = lib.whisper_context_default_params_by_ref();
WhisperContextParams params = new WhisperContextParams(paramsPointer);
params.read();
return params;
}

The proper solution is to order the members in decreasing size (i.e. keep the bools at the end of the struct). Or maybe avoid bool and simply replace them with int - this seems much easier change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants