Gemma: Google's Open Models Based on Gemini Technology

Gábor Bíró February 26, 2024
4 min read

Google has unveiled Gemma, a new family of open models for artificial intelligence. The Gemma models are built upon the same research and technology used to create Google's flagship Gemini models, offering a state-of-the-art, accessible alternative specifically aimed at developers and researchers looking to work directly with AI models.


Key Features of the Gemma Models

  • Model Variants: Gemma models are available in two sizes: Gemma 2B and Gemma 7B, both offered in pre-trained and instruction-tuned versions. These models are designed to be lightweight enough to potentially run on a developer's laptop or desktop computer, making them accessible for a wide range of applications and significantly lowering the barrier to entry compared to larger models.

The terms "2B" and "7B" indicate the size of the model, specifically the number of parameters it contains. "B" stands for billion, so a "7B" model has approximately 7 billion parameters, while a "2B" model has around 2 billion. These parameters are the weights within the model that are optimized during the training process and determine how the model performs tasks like language processing or image generation. Generally, a higher number of parameters correlates with better performance across various tasks, but also requires more computational resources.

  • Cross-Platform and Framework Compatibility: Gemma ships with multi-framework tooling (reference implementations for JAX, PyTorch, and TensorFlow via Keras 3.0) and runs across a variety of devices, from laptops and desktops to IoT devices, mobile, and cloud platforms. The models are optimized for NVIDIA GPUs and Google Cloud TPUs, ensuring broad accessibility and industry-leading performance for their size class (a brief multi-backend sketch with Keras follows below).
  • Responsible AI Toolkit: Alongside the Gemma models, Google has released a Responsible Generative AI Toolkit. This toolkit provides guidance and tools for developers to create safer AI applications, helping to filter harmful inputs/outputs and encouraging responsible use and innovation, aligning with Google's AI Principles.
  • Open Model Philosophy: Unlike some traditional open-source models, Gemma models come with usage terms that permit responsible commercial use and distribution. While offering broad access, this approach uses a custom license rather than a standard OSI-approved one (like Apache 2.0). Google aims to strike a balance between the benefits of open access and the need to mitigate the risks of misuse, promoting responsible innovation within the AI community.
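To make the parameter counts above concrete, here is a rough, illustrative estimate of how much memory the weights alone would occupy at different numeric precisions. The figures are approximations and ignore activations, the KV cache, and framework overhead.

```python
# Back-of-the-envelope estimate of the memory needed just to hold Gemma's
# weights. Illustrative only: real usage also includes activations, the KV
# cache, and framework overhead, and the parameter counts are approximate.

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in gigabytes for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

for name, params in [("Gemma 2B", 2e9), ("Gemma 7B", 7e9)]:
    for dtype in ("float32", "bfloat16", "int4"):
        print(f"{name} @ {dtype}: ~{weight_memory_gb(params, dtype):.1f} GB")
```

At 16-bit precision the 7B model's weights alone come to roughly 14 GB, which is why reduced-precision or quantized variants matter for running it on a typical laptop.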

The term "state-of-the-art" signifies the most advanced technology, method, or product currently available in a particular field, representing the highest level of development achieved to date.

Applications and Accessibility

Gemma models are designed for various language-based tasks such as text generation, summarization, question answering, and powering chatbots. They are particularly suitable for developers seeking high performance in smaller, more cost-effective models that can be fine-tuned for specific needs. Google claims that Gemma models, despite their relatively small size, significantly outperform *some* larger models on key benchmarks while requiring fewer resources.
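As a sketch of the chatbot use case, the snippet below loads the instruction-tuned 2B variant through the Hugging Face Transformers library and formats a single-turn conversation with the tokenizer's chat template. The model id and generation settings are illustrative, and access to the gated weights requires accepting the Gemma terms on Hugging Face.

```python
# Illustrative sketch: single-turn chat with the instruction-tuned Gemma 2B
# model via Hugging Face Transformers. Settings are assumptions, not a
# recommended configuration; device_map="auto" needs the accelerate package.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned 2B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize the benefits of small open models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```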

Developers and researchers can access Gemma models through platforms like Kaggle, Hugging Face, NVIDIA NeMo, and Google's Vertex AI. Google offers free access to Gemma on Kaggle and a free tier for Colab notebooks, gives $300 in credits to first-time Google Cloud users, and researchers may apply for up to $500,000 in Google Cloud credits.

Comparing Gemma and Gemini Models

  1. Accessibility and Usage:

    • Gemini: Primarily accessed by end users through web and mobile apps, APIs, and Google Vertex AI, with the underlying models remaining closed. Optimized for ease of use without direct model manipulation.
    • Gemma: Designed for developers, researchers, and businesses for experimentation, fine-tuning, and integration into applications; openly accessible for download and modification under specific terms.
  2. Model Size and Capabilities:

    • Gemini: A family of larger, highly capable closed AI models (such as Ultra and Pro) suitable for complex, general-purpose tasks, competing directly with models like GPT-4.
    • Gemma: Lightweight open models (2B and 7B parameters) optimized for specific tasks like chatbots, summarization, or retrieval-augmented generation (RAG), delivering strong performance *for their size* on key benchmarks.
  3. Deployment and Compatibility:

    • Gemini: Typically accessed via API, requiring no local deployment by the end-user; backend runs on Google's specialized data center hardware.
    • Gemma: Can run locally on a laptop or workstation, or be deployed easily on Google Cloud (e.g., Vertex AI, Google Kubernetes Engine); optimized for various hardware, including NVIDIA GPUs and Google Cloud TPUs.
  4. Licensing and Philosophy:

    • Gemini: Closed models with restricted access via APIs and Google products.
    • Gemma: "Open models" with usage terms allowing responsible commercial use and distribution, emphasizing a balance between open access and risk mitigation, rather than a fully permissive open-source license.
  5. Use Cases:

    • Gemini: Best for the most demanding needs: complex reasoning, multi-turn conversation, and general-knowledge tasks, with easy access via an API and no custom infrastructure to manage.
    • Gemma: Ideal for tasks that require model customization or fine-tuning, lower cost, lower latency, on-device or local deployment (for privacy or offline requirements), research, and education; a minimal fine-tuning sketch follows this list.
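To illustrate the customization point in the last item, here is a minimal parameter-efficient fine-tuning sketch using LoRA via the Hugging Face peft library. The hyperparameters and target module names are illustrative assumptions rather than an official recipe.

```python
# Minimal sketch: attaching LoRA adapters to Gemma 2B for parameter-efficient
# fine-tuning with the Hugging Face peft library. Hyperparameters and target
# module names are illustrative assumptions, not an official recipe.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("google/gemma-2b")  # gated weights

lora_config = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights train

# From here, the adapted model can be trained with a standard Trainer loop on
# a task-specific dataset while the base weights stay frozen.
```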

Google's Strategic Pivot

The release of Gemma marks a significant strategic pivot for Google towards embracing open models for AI. This move is widely seen as a response to the growing demand in the developer and research communities for accessible, high-quality AI models, fueled partly by the success of open models from competitors like Meta (Llama) and Mistral AI. It's a way for Google to foster innovation, collaboration, and capture developer mindshare within the broader AI ecosystem. By offering Gemma as open models, Google aims to empower developers and researchers to build upon its technology while maintaining its commitment to responsible AI development.
