Google launches Gemma 4 open-source LLM family
Google launched Gemma 4, an open model family built for agentic AI workflows that comes in four sizes.
The move comes as the US lags China in open large language models; China counts DeepSeek and Qwen as just two of its entrants. Nvidia has pushed its Nemotron models to build out the open source AI ecosystem, and Google's Gemma models have been downloaded more than 400 million times. Simply put, there's demand for open source LLMs.
For enterprises, the open source model space is worth watching because these models can be tailored to business use cases; Google said Gemma already has more than 100,000 variants.
Gemma 4 is licensed under Apache 2.0 and includes technology from Gemini 3. Gemma 4 will come in four sizes:
- Effective 2B (E2B).
- Effective 4B (E4B).
- 26B Mixture of Experts (MoE).
- 31B Dense.
Google said its largest Gemma 4 model, the 31B Dense, would rank No. 3 on the Arena AI text leaderboard. In Arena AI's open source category, the top spots are dominated by Chinese models.
According to Google, Gemma 4's 26B MoE and 31B Dense models provide more intelligence per parameter and outcompete much larger models to achieve "frontier-level capabilities with significantly less hardware overhead."
Indeed, Google said the 26B and 31B models are designed for offline use, including on consumer GPUs; state-of-the-art capabilities can run on a single 80GB Nvidia H100 GPU. The E2B and E4B models are designed for mobile, IoT and edge devices, including the Raspberry Pi and Jetson Nano.
Here's a look at what you need to know about Gemma 4:
- Gemma 4 models are designed to run on everything from Android devices to laptop GPUs to workstations and accelerators.
- Google cited customizations of Gemma 4 that include a Bulgarian-first language model and Yale University's Cell2Sentence-Scale model for cancer research.
- The models support agentic workflows and capabilities such as function calling and native system instructions.
- Offline code generation, native video and image processing, and native audio input.
- Longer context windows and training on more than 140 languages.
- Gemma 4 models are optimized for Nvidia GPUs, AMD GPUs and Google Cloud TPUs.
Gemma 4 is available in Google AI Studio (31B and 26B) and Google AI Edge Gallery (E4B and E2B), as well as through multiple tools including Hugging Face, Nvidia NIM and NeMo, Ollama and Docker.