Google brings Gemma AI models to Mac: AI Edge Gallery and Gemma 4 12B run locally without internet

June 5, 2026 Daniel Cesak

AI article illustration for ai-jarvis.eu

Google has released the AI Edge Gallery app for macOS and simultaneously introduced the new Gemma 4 12B model, which runs directly on your computer without an internet connection. What does this mean for everyday users, developers, and why is it another step toward local artificial intelligence that doesn't need the cloud.

Listen to this article:

AI Edge Gallery finally on Mac: Google's Gemma models now run locally

On June 4, 2026, Google officially expanded its AI Edge Gallery app to the macOS platform. Previously available only for iPhone, Mac owners can now download it and run language models from the Gemma family directly on their device — without needing a cloud connection.

AI Edge Gallery is a free tool that serves as a local player for Google's language models. After downloading from Google's official website, you can choose from several Gemma models and run them immediately. The app is designed so that even less technically savvy users can try artificial intelligence running on their own hardware.

Gemma 4 12B: A new model that fits in a laptop

Along with the expansion of AI Edge Gallery to Mac, Google is also introducing the new Gemma 4 12B model. It's a mid-sized model with 12 billion parameters, filling the gap between the lightweight E4B version designed for mobile devices and the more powerful 26B model with a Mixture of Experts (MoE) architecture.

The key innovation is a unified architecture without external encoders. Traditional multimodal models use separate encoders for processing images and audio — but Gemma 4 12B processes them directly within the language core. For image input, a single matrix operation is enough; for audio, a direct projection of the raw signal into the text token space. The result is lower latency and reduced memory requirements.

The model handles agentic multimodal intelligence — it understands text, images, and audio and can perform more complex multi-step tasks. In benchmarks, its performance approaches the larger 26B model, despite taking up less than half the memory.

Hardware requirements: 16 GB of RAM is enough

Gemma 4 12B requires at least 16 GB of VRAM or unified memory, which means it works on all modern Macs with Apple Silicon chips (M1 and newer). The only exception is the MacBook Neo, which doesn't have this capacity. For comparison: you don't need a dedicated graphics card costing tens of thousands — a regular MacBook Air will do.

Google also confirmed that models from the Gemma 4 family have already surpassed 150 million downloads across platforms like Hugging Face, Kaggle, LM Studio, and Ollama. The community is building everything from wearable robotic arms to enterprise security systems with them.

AI Edge Eloquent: A smart dictation tool built into the system

In addition to AI Edge Gallery, Google also introduced AI Edge Eloquent — a dictation and text editing app that also runs entirely on-device. Eloquent works across all macOS applications and launches with a keyboard shortcut. Users can set their preferred writing style, add custom words to the dictionary, and use AI for corrections and text reformulation.

At launch, however, Eloquent is available only in English. Google promises to add more languages but hasn't provided specific timelines yet. For Czech users, this means they'll have to wait for full functionality — unless they write in English. The app is available for free on Mac and iPhone.

Privacy and speed: Why run AI locally

Running language models locally brings three key advantages. Offline mode — the model works without internet, which you'll appreciate when traveling or in places with poor connectivity. Privacy — your data never leaves the device; nothing is sent to Google's or any other company's servers. Speed — response is often faster than communicating with a cloud API because network latency is eliminated.

For Czech users and businesses that work with sensitive data (such as law firms, healthcare, or the financial sector), a local AI model represents a way to leverage advanced language capabilities without the risk of information leaks.

Gemma vs. Gemini: What's the difference

Following the release of AI Edge Gallery, there has been considerable confusion about the terminology. It's important to distinguish: Gemma is a family of open, lightweight models built on the same technology as Gemini, but they are separate products. Gemini remains proprietary and runs exclusively on Google's infrastructure. Gemma is open-source (Apache 2.0 license), portable, and optimized for running on edge devices — from phones to laptops.

Thanks to the open license, you can not only download and use the model but also modify it, fine-tune it on your own data, and integrate it into your own applications. Support is available for Hugging Face Transformers, llama.cpp, MLX (the framework for Apple Silicon), SGLang, and vLLM.

What it means for Czechia

While Gemma models don't support Czech at the level of specialized models, their open license allows anyone to fine-tune the model on Czech data. For Czech developers and startups, this means the ability to build their own language applications without depending on cloud APIs from OpenAI or Google itself. With a 12-billion-parameter model that runs on a MacBook, the path opens to local AI assistants, translators, or analytics tools — without monthly API fees.

Google also released the official Gemma Skills repository — a skills library that makes it easier for developers to build agents powered by Gemma models.

Are AI Edge Gallery and Gemma 4 12B really free?

Yes, both AI Edge Gallery and the Gemma 4 12B model are completely free. The model is also released under the open-source Apache 2.0 license, so you can freely use it even for commercial purposes. You don't need any Google subscription or Gemini API.

Does Gemma 4 12B support Czech?

Gemma 4 12B is primarily trained on English and does not officially support Czech. However, thanks to the open license, it can be fine-tuned on Czech data. It will handle basic Czech comprehension, but for professional Czech texts, we recommend models like Llama or specialized European language models instead.

Does AI Edge Gallery work on Windows or Linux?

Currently, AI Edge Gallery is officially available only for macOS and iOS. However, Windows and Linux users can run Gemma models through alternative tools like LM Studio, Ollama, or directly via llama.cpp. The model itself is platform-independent.