[BUG] Cannot generate structured LLM inference output with exllamav2 version >0.2.0 #709
Closed
Labels
bug
OS
Windows
GPU Library
CUDA 12.x
Python version
3.11
Pytorch version
2.5.1
Model
llama3.1
Describe the bug
I am getting the following error when generating with an `ExLlamaV2TokenEnforcerFilter` attached to the generator:

`AttributeError: 'ExLlamaV2TokenEnforcerFilter' object has no attribute 'background_drop'`
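A minimal sketch of the kind of setup that raises this, assuming lm-format-enforcer's exllamav2 integration and the dynamic generator API (the model path, schema, and prompt are placeholders, not the original snippet):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.exllamav2 import (
    ExLlamaV2TokenEnforcerFilter,
    build_token_enforcer_tokenizer_data,
)

# Load the model (placeholder path)
config = ExLlamaV2Config("/models/llama-3.1-8b-instruct-exl2")
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)

# Constrain output to a JSON schema via lm-format-enforcer
schema = {"type": "object", "properties": {"answer": {"type": "string"}}}
tokenizer_data = build_token_enforcer_tokenizer_data(tokenizer)
json_filter = ExLlamaV2TokenEnforcerFilter(JsonSchemaParser(schema), tokenizer_data)

# On exllamav2 > 0.2.0 this call fails while the generator inspects the
# filter: AttributeError: ... object has no attribute 'background_drop'
output = generator.generate(
    prompt="Answer as JSON: what is the capital of France?",
    max_new_tokens=200,
    filters=[json_filter],
    add_bos=True,
)
print(output)
```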
As a bonus question: I am trying to run inference with the phi-4 model, but I could not locate any example for it. Can you check my three functions for correctness?
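For reference, a minimal phi-4 generation sketch, assuming an EXL2 quant of phi-4 and its published chat format (the path and template are assumptions; verify the template against the model's tokenizer_config.json):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2DynamicGenerator

# Load a phi-4 EXL2 quant (placeholder path)
config = ExLlamaV2Config("/models/phi-4-exl2")
model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2DynamicGenerator(model=model, cache=cache, tokenizer=tokenizer)

# phi-4 chat-style prompt; the <|im_start|>/<|im_sep|>/<|im_end|> markers are
# an assumption based on phi-4's published template -- confirm them against
# the model's tokenizer_config.json
prompt = (
    "<|im_start|>system<|im_sep|>You are a helpful assistant.<|im_end|>"
    "<|im_start|>user<|im_sep|>Summarize Hamlet in two sentences.<|im_end|>"
    "<|im_start|>assistant<|im_sep|>"
)

output = generator.generate(
    prompt=prompt,
    max_new_tokens=300,
    stop_conditions=[tokenizer.single_id("<|im_end|>")],
    add_bos=False,
    completion_only=True,  # return only the generated text, not the prompt
)
print(output)
```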
Reproduction steps
N/A
Expected behavior
N/A
Logs
N/A
Additional context
N/A