The current languageModel context/prefix is too specific and doesn't account for future AI capabilities such as image, voice, or video interactions. I'd suggest looking at how OpenAI (and others) avoid tying their APIs strictly to a "language model".
OpenAI
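    import OpenAI from "openai";

    // Reads the API key from the OPENAI_API_KEY environment variable.
    const openai = new OpenAI();

    async function main() {
      const response = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: [
          {
            role: "user",
            // One message can mix typed content parts (text and image here).
            content: [
              { type: "text", text: "What’s in this image?" },
              {
                type: "image_url",
                image_url: {
                  url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
              },
            ],
          },
        ],
      });
      console.log(response.choices[0]);
    }

    main();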
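What makes that request shape modality-neutral is that each content part carries its own type tag, so supporting another modality means adding a part type rather than a new top-level API. Here is a minimal sketch of that idea. The helper names (textPart, imagePart, userMessage) and the example.com URL are hypothetical and only for illustration; they are not part of the OpenAI SDK or of any Chrome API, and they only build plain objects.

```javascript
// Hypothetical helpers: they build plain objects in the typed-content-part
// shape used in the OpenAI request above. No network calls are made.
function textPart(text) {
  return { type: "text", text };
}

function imagePart(url) {
  return { type: "image_url", image_url: { url } };
}

// A user message is just a list of typed parts, in the order given.
function userMessage(...parts) {
  return { role: "user", content: parts };
}

// Placeholder URL, assumed for illustration only.
const message = userMessage(
  textPart("What's in this image?"),
  imagePart("https://example.com/boardwalk.jpg")
);

console.log(JSON.stringify(message, null, 2));
```

An API that accepts a list like this would not need to be named after a single modality.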
Gemini Nano XS is described as multimodal, but I could not find any corresponding API in Chrome on desktop. Could you add such APIs? Thank you.