
#gemma News & Analysis

18 articles tagged with #gemma. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 27 · 7/10

How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models

Researchers conducted the first systematic study of how weight pruning affects language model representations using Sparse Autoencoders across multiple models and pruning methods. The study reveals that rare features survive pruning better than common ones, suggesting pruning acts as implicit feature selection that preserves specialized capabilities while removing generic features.

🧠 Llama
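
A minimal sketch of the kind of measurement such a study rests on, assuming you already have an SAE trained on the model's residual stream: compare how often each SAE feature fires before and after pruning, then split survival by how rare a feature was in the dense model. Function names, shapes, and thresholds below are our own illustration, not the paper's code.

```python
# Illustration only: the paper's exact pipeline is not reproduced here.
import torch

def feature_firing_rates(sae_encoder: torch.nn.Linear,
                         activations: torch.Tensor,
                         threshold: float = 0.0) -> torch.Tensor:
    """Fraction of tokens on which each SAE feature is active."""
    # activations: (num_tokens, d_model) residual-stream vectors
    features = torch.relu(sae_encoder(activations))    # (num_tokens, n_features)
    return (features > threshold).float().mean(dim=0)  # (n_features,)

def survival_by_rarity(rates_dense, rates_pruned, rare_cutoff=0.01):
    """How well do rare vs. common features survive pruning?"""
    survived = rates_pruned > 0  # feature still fires at all after pruning
    rare = (rates_dense > 0) & (rates_dense < rare_cutoff)
    common = rates_dense >= rare_cutoff
    return survived[rare].float().mean(), survived[common].float().mean()
```
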
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

FlashHead: Efficient Drop-In Replacement for the Classification Head in Language Model Inference

Researchers introduce FlashHead, a training-free replacement for classification heads in language models that delivers up to 1.75x inference speedup while maintaining accuracy. The innovation addresses a critical bottleneck where classification heads consume up to 60% of model parameters and 50% of inference compute in modern language models.

🧠 Llama
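
The bottleneck claim is easy to sanity-check on an open model. The sketch below, with a checkpoint of our choosing, computes what share of a small LM's parameters sit in its vocabulary projection; it does not reproduce FlashHead itself.

```python
# Sanity check of the bottleneck FlashHead targets: the share of parameters in
# a small LM's vocabulary projection (its classification head).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b",
                                             torch_dtype=torch.bfloat16)
total = sum(p.numel() for p in model.parameters())
# For tied-embedding models the head shares its weight matrix with the input
# embedding, so we size it directly from the output projection.
head = model.get_output_embeddings().weight.numel()
print(f"head: {head / 1e6:.0f}M params ({100 * head / total:.1f}% of total)")
```
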
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10

Narrow Fine-Tuning Erodes Safety Alignment in Vision-Language Agents

Research reveals that fine-tuning aligned vision-language AI models on narrow harmful datasets causes severe safety degradation that generalizes across unrelated tasks. The study shows multimodal models exhibit 70% higher misalignment than text-only evaluation suggests, with even 10% harmful training data causing substantial alignment loss.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

Researchers propose Many-Shot In-Context Fine-tuning (ManyICL), a novel approach that significantly improves large language model performance by treating multiple in-context examples as supervised training targets rather than just prompts. The method narrows the performance gap between in-context learning and dedicated fine-tuning while reducing catastrophic forgetting issues.
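
A minimal sketch of the core idea as summarized above: place the training loss on the answer tokens of every in-context example rather than only the last one. The prompt format and helper are placeholders, not the paper's exact recipe.

```python
# Supervise the answer span of *every* packed in-context example.
import torch

IGNORE = -100  # Hugging Face convention: label -100 contributes no loss

def build_manyshot_labels(tokenizer, examples):
    """examples: list of (question, answer) pairs packed into one sequence."""
    input_ids, labels = [], []
    for q, a in examples:
        q_ids = tokenizer(f"Q: {q}\nA: ", add_special_tokens=False).input_ids
        a_ids = tokenizer(a + "\n", add_special_tokens=False).input_ids
        input_ids += q_ids + a_ids
        labels += [IGNORE] * len(q_ids) + a_ids  # loss on every answer span
    return torch.tensor([input_ids]), torch.tensor([labels])
```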

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10

Reward Models Inherit Value Biases from Pretraining

A comprehensive study of 10 leading reward models reveals they inherit significant value biases from their base language models, with Llama-based models preferring 'agency' values while Gemma-based models favor 'communion' values. This bias persists even when using identical preference data and training processes, suggesting that the choice of base model fundamentally shapes AI alignment outcomes.
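
An illustrative probe in the spirit of the finding, not the study's protocol: score matched responses that differ only in the value they express and compare the means. The checkpoint name is a hypothetical placeholder.

```python
# Substitute any scalar-output reward model for the placeholder checkpoint.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ckpt = "your-org/your-reward-model"  # hypothetical placeholder
tok = AutoTokenizer.from_pretrained(ckpt)
rm = AutoModelForSequenceClassification.from_pretrained(ckpt)

def score(prompt: str, response: str) -> float:
    inputs = tok(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return rm(**inputs).logits[0, 0].item()

prompt = "What should I prioritize this year?"
agency = "Set ambitious personal goals and push hard to achieve them."
communion = "Invest in your relationships and support the people around you."
print("agency:", score(prompt, agency), "communion:", score(prompt, communion))
```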

AI · Neutral · Google DeepMind Blog · Oct 25 · 7/10

T5Gemma: A new collection of encoder-decoder Gemma models

Google introduces T5Gemma, a new collection of encoder-decoder large language models (LLMs) based on the Gemma architecture. This represents an expansion of Google's Gemma model family to include encoder-decoder capabilities alongside the existing decoder-only models.
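
Practically, encoder-decoder models load through the seq2seq interface rather than the causal-LM one used for standard Gemma. A sketch, with the checkpoint ID assumed rather than confirmed:

```python
# The checkpoint ID is an assumption; check the Hugging Face hub for the
# exact released names.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

ckpt = "google/t5gemma-2b-2b-ul2"  # assumed ID, verify on the hub
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

inputs = tok("Summarize: encoder-decoder models read the whole input before "
             "writing any output.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```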

AI · Bullish · Google DeepMind Blog · Oct 23 · 7/10

How a Gemma model helped discover a new potential cancer therapy pathway

Google has launched a new 27 billion parameter foundation model for single-cell analysis, built on the Gemma family of open models. The model has reportedly helped discover a new potential cancer therapy pathway, demonstrating practical medical applications of AI technology.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

LangFIR: Discovering Sparse Language-Specific Features from Monolingual Data for Language Steering

Researchers introduce LangFIR, a method that enables better language control in multilingual AI models using only monolingual data instead of expensive parallel datasets. The technique identifies sparse language-specific features and achieves superior performance in controlling language output across multiple models including Gemma and Llama.

🧠 Llama
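
A generic activation-steering sketch showing the mechanism methods like this build on: add a direction vector to a hidden layer during generation. The direction here is random noise standing in for a discovered language-specific feature; LangFIR's actual feature-finding step is not reproduced.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "google/gemma-2-2b"  # any decoder-only model exposes the same mechanism
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16)

direction = torch.randn(model.config.hidden_size, dtype=torch.bfloat16)
direction = direction / direction.norm()  # placeholder steering vector

def steer(module, inputs, output, alpha=4.0):
    # Shift the layer's hidden states along the steering direction.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + alpha * direction
    return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

handle = model.model.layers[12].register_forward_hook(steer)  # mid-depth layer
out = model.generate(**tok("The weather today is", return_tensors="pt"),
                     max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```
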
AI · Neutral · arXiv – CS AI · Mar 27 · 6/10

Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory

Researchers introduce a new framework to evaluate how well large language models understand their own knowledge limitations, finding that traditional confidence metrics miss key differences between models. The study reveals that models showing similar accuracy can have vastly different metacognitive abilities: their capacity to know what they don't know.

🧠 Llama
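
A toy version of the signal-detection quantity such a framework starts from: treat a high-confidence answer as a yes/no detection of the model's own correctness and compute d' = z(hit rate) - z(false-alarm rate). The counts are invented for illustration.

```python
from statistics import NormalDist

def dprime(hits, misses, false_alarms, correct_rejections):
    z = NormalDist().inv_cdf
    # Half-count correction keeps z finite when a cell is empty.
    hr = (hits + 0.5) / (hits + misses + 1.0)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hr) - z(far)

# confident & correct, unconfident & correct, confident & wrong, unconfident & wrong
print(dprime(70, 30, 20, 80))  # larger d' = sharper self-knowledge
```
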
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10

Graph-theoretic Agreement Framework for Multi-agent LLM Systems

Researchers propose a graph-theoretic framework for securing multi-agent LLM systems by analyzing consensus in signed, directed interaction networks. The study addresses vulnerabilities in distributed AI architectures where hidden system prompts can act as 'topological Trojan horses' that destabilize cooperative consensus among AI agents.
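
One classical check an analysis like this leans on: by Harary's structural balance theorem, a signed network is balanced exactly when its agents split into two camps with positive edges inside camps and negative edges between them, a standard prerequisite for bipartite consensus. The graph encoding below is our own.

```python
def is_structurally_balanced(edges):
    """edges: iterable of (u, v, sign) with sign in {+1, -1}; direction ignored."""
    adj = {}
    for u, v, s in edges:
        adj.setdefault(u, []).append((v, s))
        adj.setdefault(v, []).append((u, s))
    camp = {}
    for start in adj:
        if start in camp:
            continue
        camp[start] = 1
        stack = [start]
        while stack:
            u = stack.pop()
            for v, s in adj[u]:
                want = camp[u] * s  # +edge: same camp, -edge: opposite camp
                if v not in camp:
                    camp[v] = want
                    stack.append(v)
                elif camp[v] != want:
                    return False  # an unbalanced cycle exists
    return True

print(is_structurally_balanced([("a", "b", 1), ("b", "c", -1), ("a", "c", -1)]))
```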

AI · Bullish · Google DeepMind Blog · Oct 25 · 6/10

Introducing Gemma 3n: The developer guide

Gemma 3n is a new release built specifically for the developer community that helped shape the Gemma AI model. It continues Google's open-source model family with enhanced developer-focused features.

AI · Bullish · Hugging Face Blog · Jun 26 · 6/10

Gemma 3n fully available in the open-source ecosystem!

Google has made Gemma 3n fully available in the open-source ecosystem. This release expands access to Google's AI model capabilities for developers and researchers in the open-source community.

AI · Bullish · Hugging Face Blog · Jul 31 · 6/10

Google releases Gemma 2 2B, ShieldGemma and Gemma Scope

Google has released Gemma 2 2B, a smaller 2-billion parameter version of its open-source AI model, alongside ShieldGemma for safety filtering and Gemma Scope for model interpretability. These releases expand Google's Gemma family with more accessible and transparent AI tools for developers and researchers.
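
A quick-start for the smallest release in the batch; the instruction-tuned checkpoint ID matches the Hugging Face release (hub license acceptance is required before download):

```python
from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-2-2b-it")
print(generator("Explain model interpretability in one sentence.",
                max_new_tokens=40)[0]["generated_text"])
```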

AI · Neutral · TechCrunch – AI · Apr 6 · 4/10

Google quietly releases an offline-first AI dictation app on iOS

Google has quietly launched a new offline-first AI dictation app for iOS that utilizes Gemma AI models. The app appears to be positioning itself as a competitor to existing dictation solutions like Wispr Flow by offering offline functionality.

AI · Neutral · Hugging Face Blog · Sep 4 · 5/10

Welcome EmbeddingGemma, Google's new efficient embedding model

Google has released EmbeddingGemma, a new efficient embedding model designed to improve text representation and semantic understanding tasks. This release continues Google's expansion of its Gemma model family, focusing on computational efficiency while maintaining performance quality.
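
Typical usage goes through sentence-transformers; the checkpoint ID below is our best guess at the release name, so verify it on the Hugging Face hub.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("google/embeddinggemma-300m")  # assumed ID
docs = ["Gemma is a family of open models.",
        "Dolphins communicate with clicks and whistles."]
embeddings = model.encode(docs)                  # one vector per document
print(model.similarity(embeddings, embeddings))  # cosine-similarity matrix
```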

AI · Neutral · Google DeepMind Blog · Apr 14 · 4/10

DolphinGemma: How Google AI is helping decode dolphin communication

Google has developed DolphinGemma, a large language model designed to help scientists decode dolphin communication patterns. The AI tool aims to analyze and potentially understand what dolphins are saying through their vocalizations.

AI · Neutral · Hugging Face Blog · Feb 21 · 5/10

Welcome Gemma - Google’s new open LLM

The article title suggests Google has released a new open-source large language model called Gemma. However, the article body is empty, preventing detailed analysis of the announcement's specifics or implications.

AI · Neutral · Hugging Face Blog · Feb 23 · 1/10

Fine-Tuning Gemma Models in Hugging Face

The article title indicates a guide to fine-tuning Gemma models on the Hugging Face platform, but no article body was provided, so the technical details, implications, and market impact cannot be assessed.
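
With the body missing, here is only the generic recipe such guides usually cover, sketched with placeholder hyperparameters: attach LoRA adapters with peft, then train as usual.

```python
# Dataset, target modules, and hyperparameters are placeholders.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "google/gemma-2b"  # assumed: the original Gemma release this guide targeted
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)

lora = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM",
                  target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a tiny fraction is trainable
# ...then train with transformers.Trainer or trl.SFTTrainer on your dataset.
```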