Friday, July 25, 2025

Release Notes: Gemini's multimodality

 


"Ani Baddepudi, Gemini Model Behavior Product Lead, joins host Logan Kilpatrick for a deep dive into Gemini's multimodal capabilities. Their conversation explores why Gemini was built as a natively multimodal model from day one, the future of proactive AI assistants, and how we are moving towards a world where 'everything is vision.' Learn about the differences between video and image understanding and token representations, higher FPS video sampling, and more."

