TypeError: 'NoneType' object is not iterable #35719

Open
0xD4rky opened this issue Jan 15, 2025 · 1 comment
0xD4rky commented Jan 15, 2025

I was trying to fine-tune the Qwen2-VL-7B-Instruct model on a custom dataset. For that, I wrote the following collate function:

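from dataclasses import dataclass
from typing import Any, Dict, List

import torch
from transformers import ProcessorMixin

@dataclass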
class Collator:
    processor: ProcessorMixin

    def __call__(self, batch: List[Dict[str, Any]]) -> Dict[str, torch.Tensor]:

        images = [item["image"] for item in batch]
        texts = [item["text"] for item in batch]
        
        conversations = []
        for txt in texts:
            conversation = [
                {
                    "role": "user",
                    "content": [
                        {
                            "type": "image",
                        },
                        {
                            "type": "text",
                            "text": "Given this pallet load manifest image, provide the following information as JSON: " + txt
                        }
                    ]
                }
            ]
            conversations.append(conversation)
        
        text_prompts = [
            self.processor.apply_chat_template(conv, add_generation_prompt=True)
            for conv in conversations
        ]
        vision_inputs = self.processor.image_processor(
            images=images,
            return_tensors="pt",
            do_resize=True,
            size={'height': 448, 'width': 448}
        )

        inputs = self.processor(
            text=text_prompts,
            padding=True,
            return_tensors="pt"
        )
        
        processed_inputs = {
            "input_ids": inputs.input_ids,
            "attention_mask": inputs.attention_mask,
            "pixel_values": vision_inputs.pixel_values,
            "labels": inputs.input_ids.clone() 
        }
        
        return processed_inputs

When I run the training script, this error appears no matter how I handle image processing:

/usr/local/lib/python3.11/dist-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py in rot_pos_emb(self, grid_thw)
    994     def rot_pos_emb(self, grid_thw):
    995         pos_ids = []
--> 996         for t, h, w in grid_thw:
    997             hpos_ids = torch.arange(h).unsqueeze(1).expand(-1, w)
    998             hpos_ids = hpos_ids.reshape(

TypeError: 'NoneType' object is not iterable

Please help me out on this!

0xD4rky changed the title from "#TypeError: 'NoneType' object is not iterable" to "TypeError: 'NoneType' object is not iterable" on Jan 16, 2025
zucchini-nlp (Member) commented

Hey @0xD4rky!

I'd recommend passing the images and text together, as in inputs = self.processor(images, text), instead of making a separate image_processor call. If you need to override any image processor attributes, you can pass them directly to the processor, e.g. processor(do_resize=True, **other_kwargs_for_image).
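
For illustration, here is a minimal sketch of how the collator might look with that suggestion applied. It rests on one assumption that is not stated in the thread: that the combined processor call returns an image_grid_thw entry alongside pixel_values, and that its absence from the batch is why grid_thw was None in rot_pos_emb. The class name SingleCallCollator and the PROMPT_PREFIX constant are hypothetical names introduced here. The processor is typed as Any so the sketch does not depend on a particular transformers version.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Prompt prefix copied from the original collator.
PROMPT_PREFIX = (
    "Given this pallet load manifest image, "
    "provide the following information as JSON: "
)


@dataclass
class SingleCallCollator:
    """Hypothetical rewrite of the collator using one processor call."""

    # Assumed to be the object the original code stored in `processor`.
    processor: Any

    def __call__(self, batch: List[Dict[str, Any]]) -> Dict[str, Any]:
        images = [item["image"] for item in batch]

        # One single-turn conversation per sample: an image slot plus the text.
        conversations = [
            [
                {
                    "role": "user",
                    "content": [
                        {"type": "image"},
                        {"type": "text", "text": PROMPT_PREFIX + item["text"]},
                    ],
                }
            ]
            for item in batch
        ]
        text_prompts = [
            self.processor.apply_chat_template(conv, add_generation_prompt=True)
            for conv in conversations
        ]

        # Single call with images and text together, as suggested above.
        inputs = self.processor(
            images=images,
            text=text_prompts,
            padding=True,
            return_tensors="pt",
        )

        # Keep every key the processor produced (assumed to include
        # image_grid_thw) rather than hand-picking a subset.
        model_inputs = dict(inputs)
        model_inputs["labels"] = inputs["input_ids"].clone()
        return model_inputs
```

Copying all processor outputs into the batch, instead of selecting input_ids, attention_mask, and pixel_values by hand, avoids silently dropping any extra tensors the model's forward pass expects.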
