Friday, July 25, 2025

Release Notes: Gemini's multimodality

 


"Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where 'everything is vision.' Learn about the differences between video and image understanding and token representations, higher FPS video sampling, and more."


Large Language Models explained briefly

A great video that explains LLMs simply, from the 3Blue1Brown YouTube channel, which explains maths using animation.