
Gemma 4 AI: The Future of On-Device Intelligence is Here
Google DeepMind has officially launched Gemma 4 AI, a groundbreaking family of state-of-the-art open models poised to redefine the possibilities of artificial intelligence directly on your devices. Available under the permissive Apache 2.0 license, Gemma 4 empowers developers with a robust toolkit for creating innovative on-device AI applications.
Beyond Chatbots: Unleashing Agentic AI
Gemma 4 isn’t just about chatbots. It unlocks the potential to build sophisticated agents and autonomous AI use cases that operate seamlessly on-device. This means enhanced privacy, reduced latency, and the ability to function even without a constant internet connection. Imagine applications capable of multi-step planning, autonomous action, offline code generation, and even advanced audio-visual processing – all without requiring specialized fine-tuning.
Global Reach with Multilingual Support
Built with a global audience in mind, Gemma 4 boasts support for over 140 languages, making it a truly versatile solution for developers worldwide. This broad language support ensures accessibility and inclusivity in your AI-powered applications.
Experience Gemma 4 on the Edge
You can start experiencing the expansive capabilities of Gemma 4 on the edge today! Access the built-in Gemma 4 model on Android through the new AICore Developer Preview, or leverage Google AI Edge to build agentic, in-app experiences across mobile, desktop, and edge devices.
Google AI Edge Gallery & Agent Skills
The Google AI Edge Gallery, available on iOS and Android, provides a platform to build and experiment with AI experiences that run entirely on-device. A key highlight is the launch of Agent Skills, one of the first applications to run multi-step, autonomous agentic workflows entirely on-device, powered by Gemma 4. Agent Skills can:
- Automate complex tasks.
- Provide personalized recommendations.
- Adapt to user needs in real-time.
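The multi-step, autonomous workflow Agent Skills runs can be pictured as a plan-act-observe loop: the model proposes a tool call, the runtime executes it, and the observation feeds back into the next planning step until the goal is met. The sketch below is a minimal, self-contained illustration of that loop; every name in it (`plan_step`, `TOOLS`, `run_agent`) is illustrative, not the actual Agent Skills API.

```python
# Conceptual sketch of a multi-step agent loop, not the Agent Skills API.
# The "model" here is a hard-coded planner standing in for on-device Gemma 4.

TOOLS = {
    "get_battery": lambda: "87%",
    "set_reminder": lambda text="": f"reminder set: {text}",
}

def plan_step(goal, history):
    # Stand-in for the model: choose the next tool call, or finish.
    if not history:
        return ("call", "get_battery", {})
    if len(history) == 1:
        return ("call", "set_reminder", {"text": "charge phone"})
    return ("finish", f"Done after {len(history)} steps", None)

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):
        kind, name, args = plan_step(goal, history)
        if kind == "finish":
            return name, history
        observation = TOOLS[name](**args)  # execute the tool call
        history.append((name, observation))  # feed the result back to the planner
    return "step budget exhausted", history

answer, trace = run_agent("check battery, then remind me to charge")
```

In the real system the planner is the on-device model and the tools are device capabilities, but the control flow is the same: each observation is appended to the context before the next step is planned.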
Explore the Gemma 4 E2B and E4B models in action within the Google AI Edge Gallery app and start creating your own skills with the provided guide. Share your creations and collaborate with the community on GitHub Discussions!
LiteRT-LM: Performance Across the Hardware Spectrum
For developers seeking to deploy Gemma 4 in-app or across a wider range of devices, LiteRT-LM delivers stellar performance. Building on the trusted LiteRT framework, LiteRT-LM adds GenAI-specific libraries and introduces new features:
- Extended Context Lengths: Optimized GPU processing handles 4,000 input tokens across distinct skills in under 3 seconds.
- IoT & Edge Device Support: Smaller Gemma 4 models run efficiently on platforms like the Raspberry Pi 5 (133 prefill / 7.6 decode tokens per second on CPU; 3,700 prefill / 31 decode tokens per second with Qualcomm Dragonwing IQ8 NPU acceleration).
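As a rough sanity check, prefill and decode rates translate into end-to-end latency as prompt_tokens / prefill_rate + output_tokens / decode_rate. The sketch below applies that back-of-the-envelope formula to the Raspberry Pi 5 figures above; the 512-token prompt and 64-token response are illustrative choices, not benchmark settings.

```python
def generation_time(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate latency: prefill the prompt, then decode tokens one at a time."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# Raspberry Pi 5 figures from the bullet above, 512-token prompt + 64-token reply.
cpu_s = generation_time(512, 64, prefill_tps=133, decode_tps=7.6)   # ~12.3 s
npu_s = generation_time(512, 64, prefill_tps=3700, decode_tps=31)   # ~2.2 s
```

Note that at these rates the decode phase dominates on CPU, which is why NPU acceleration's larger gain on prefill still leaves decode throughput as the figure to watch for interactive use.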
Dive deeper with the LiteRT-LM documentation for a complete guide and device-specific performance metrics. You can also review the individual model cards for Gemma 4 E2B and Gemma 4 E4B.
Platform Support & Developer Tools
Gemma 4 is available today with support across an unprecedented range of platforms. To further simplify development, Google has launched a new Python package and CLI tool for easy experimentation and integration into Python pipelines for IoT devices. The litert-lm CLI, available on Linux, macOS, and Raspberry Pi, lets you explore Gemma 4's capabilities without writing any code, and it now supports tool calling powered by Agent Skills in the Google AI Edge Gallery.
The era of agentic experiences on-device is here. Start building on the edge with Agent Skills examples in the Google AI Edge Gallery and the LiteRT-LM getting started guide. We can’t wait to see what you create!
Acknowledgements
A huge thank you to the dedicated contributors who made this project possible: Advait Jain, Alice Zheng, Amber Heinbockel, Andrew Zhang, Byungchul Kim, Cormac Brick, Daniel Ho, Derek Bekebrede, Dillon Sharlet, Eric Yang, Fengwu Yao, Frank Barchard, Grant Jensen, Hriday Chhabria, Jae Yoo, Jenn Lee, Jing Jin, Jingxiao Zheng, Juhyun Lee, Lu Wang, Lin Chen, Majid Dadashi, Marissa Ikonomidis, Matthew Chan, Matthew Soulanille, Matthias Grundmann, Milen Ferev, Misha Gutman, Mohammadreza Heydary, Pradeep Kuppala, Qidong Zhao, Quentin Khan, Ram Iyengar, Raman Sarokin, Renjie Wu, Rishika Sinha, Rodney Witcher, Ronghui Zhu, Sachin Kotwani, Suleman Shahid, Tenghui Zhu, Terry Heo, Tiffany Hsiao, Wai Hon Law, Weiyi Wang, Xiaoming Hu, Xu Chen, Yishuang Pang, Yi-Chun Kuo, Yu-Hui Chen, Zichuan Wei, and the gTech team.
