y0news

#on-device-ai News & Analysis

12 articles tagged with #on-device-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠

RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair

Researchers introduce RePAIR, a framework enabling users to instruct large language models to forget harmful knowledge, misinformation, and personal data through natural language prompts at inference time. The system uses a training-free method called STAMP that manipulates model activations to achieve selective unlearning with minimal computational overhead, outperforming existing approaches while preserving model utility.
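The core idea of inference-time activation editing can be sketched in a few lines. This is only an illustration of the general technique (removing a "forget" direction from hidden states), not the paper's actual STAMP implementation; the function names and the direction vector are hypothetical.

```python
# Illustrative sketch of training-free unlearning via activation editing:
# project the component along a hypothetical "forget" direction out of a
# hidden state, leaving the rest of the activation intact.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def suppress_direction(hidden, direction):
    """Remove the component of `hidden` that lies along `direction`."""
    norm_sq = dot(direction, direction)
    if norm_sq == 0.0:
        return list(hidden)
    scale = dot(hidden, direction) / norm_sq
    return [h - scale * d for h, d in zip(hidden, direction)]

# After editing, the hidden state is orthogonal to the forget direction.
h = [2.0, 1.0, 0.0]
d = [1.0, 0.0, 0.0]          # hypothetical "forget" direction
edited = suppress_direction(h, d)
```

Because the edit happens per forward pass, no weights change and the computational overhead stays minimal, which is what makes training-free approaches attractive at inference time.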

AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠

Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices

Researchers introduce Vec-LUT, a vector-based lookup table technique that dramatically improves ultra-low-bit LLM inference on edge devices by addressing memory bandwidth underutilization. The method achieves up to 4.2x speedups over existing approaches, enabling CPUs to run LLMs faster than specialized NPUs.
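The table-lookup idea behind ultra-low-bit inference can be sketched as follows: instead of dequantizing each 2-bit weight and multiplying, precompute a small table of partial dot products for every possible weight-code combination in a group, then accumulate with lookups. The group size, codebook, and names here are illustrative assumptions, not Vec-LUT's actual design.

```python
# Hypothetical 2-bit codebook mapping weight codes to float values.
CODEBOOK = {0: -2.0, 1: -1.0, 2: 1.0, 3: 2.0}

def build_lut(x_pair):
    """Partial dot products of one activation pair against all 16
    possible (code_a, code_b) weight-code combinations."""
    return {
        (a, b): CODEBOOK[a] * x_pair[0] + CODEBOOK[b] * x_pair[1]
        for a in CODEBOOK for b in CODEBOOK
    }

def lut_dot(weight_codes, x):
    """Dot product of 2-bit-coded weights with activations x,
    computed group-by-group via table lookups."""
    total = 0.0
    for i in range(0, len(x), 2):
        lut = build_lut(x[i:i + 2])
        total += lut[(weight_codes[i], weight_codes[i + 1])]
    return total

# Matches the naive dequantize-then-multiply result:
# 2*1 + (-2)*2 + 1*3 + (-1)*4 = -3
codes = [3, 0, 2, 1]
x = [1.0, 2.0, 3.0, 4.0]
```

In a real kernel the tables would be built once per activation vector and reused across every weight row, which is where the bandwidth savings come from; this sketch rebuilds them per group only for clarity.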

AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠

An Alternative Trajectory for Generative AI

Researchers propose shifting from large monolithic AI models to domain-specific superintelligence (DSS) societies due to unsustainable energy costs and physical constraints of current generative AI scaling approaches. The alternative involves smaller, specialized models working together through orchestration agents, potentially enabling on-device deployment while maintaining reasoning capabilities.
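The orchestration pattern can be illustrated with a toy router that dispatches each query to a small domain-specific model instead of one monolithic one. The keyword routing and specialist names below are purely illustrative, not the paper's proposal.

```python
# Toy orchestration agent: route each query to a small specialist model.
# The lambdas stand in for small on-device models.
SPECIALISTS = {
    "code": lambda q: f"[code model] {q}",
    "math": lambda q: f"[math model] {q}",
    "general": lambda q: f"[general model] {q}",
}

KEYWORDS = {
    "code": ["python", "compile", "bug"],
    "math": ["integral", "prove", "equation"],
}

def route(query):
    """Pick a specialist by simple keyword match, defaulting to general."""
    q = query.lower()
    for domain, words in KEYWORDS.items():
        if any(w in q for w in words):
            return domain
    return "general"

def answer(query):
    return SPECIALISTS[route(query)](query)
```

A real orchestration agent would itself be a learned model, but the structural point survives the toy version: each specialist stays small enough for on-device deployment, and only the router needs a global view.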

AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠

LiteVLA-Edge: Quantized On-Device Multimodal Control for Embedded Robotics

Researchers developed LiteVLA-Edge, a deployment-oriented Vision-Language-Action model pipeline that enables fully on-device inference on embedded robotics hardware like Jetson Orin. The system achieves 150.5ms latency (6.6Hz) through FP32 fine-tuning combined with 4-bit quantization and GPU-accelerated inference, operating entirely offline within a ROS 2 framework.
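The 4-bit quantization step such pipelines rely on reduces, at its core, to mapping floats onto a small integer range plus a scale. This is a minimal symmetric per-tensor sketch; real deployments typically quantize per-group with calibrated scales.

```python
# Minimal symmetric 4-bit quantization: integers in [-8, 7] plus a scale.

def quantize_4bit(weights):
    """Map float weights to 4-bit integers and a per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7.0 if max_abs else 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights at inference time."""
    return [v * scale for v in q]
```

Storing 4-bit codes instead of 32-bit floats cuts weight memory by roughly 8x, which is what makes models fit within the memory and bandwidth budget of boards like Jetson Orin.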

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM

Researchers propose ROMA, a new hardware accelerator for running large language models on edge devices using QLoRA. The system uses ROM storage for quantized base models and SRAM for LoRA weights, achieving over 20,000 tokens/s generation speed without external memory.
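The memory split can be sketched arithmetically: the frozen quantized base matrix is read from one store while only the small low-rank LoRA factors live in fast, writable memory. Shapes and names below are illustrative, not ROMA's actual datapath.

```python
# QLoRA-style forward pass mirroring the ROM/SRAM split described above:
# y = dequant(W_q) @ x + B @ (A @ x)
# The base term reads only frozen quantized weights (ROM-resident);
# the low-rank correction uses the small adapters (SRAM-resident).

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def qlora_forward(base_q, base_scale, lora_A, lora_B, x):
    base = [base_scale * s for s in matvec(base_q, x)]   # frozen base
    lora = matvec(lora_B, matvec(lora_A, x))             # rank-r update
    return [b + l for b, l in zip(base, lora)]
```

Because the base weights are never written after fabrication, they can live in dense ROM, and only the tiny A and B matrices need writable storage, which is what removes the external-memory bottleneck.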

AI · Bullish · Hugging Face Blog · Aug 8 · 7/10 · 8
🧠

Releasing Swift Transformers: Run On-Device LLMs in Apple Devices

Hugging Face has released Swift Transformers, a Swift package for running large language models locally on Apple devices. It enables on-device AI inference without requiring cloud connectivity, potentially improving privacy and performance for iOS and macOS applications.

AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis

Researchers introduce RooflineBench, a framework for measuring performance capabilities of Small Language Models on edge devices using operational intensity analysis. The study reveals that sequence length significantly impacts performance, model depth causes efficiency regression, and structural improvements like Multi-head Latent Attention can unlock better hardware utilization.
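The roofline model the framework is named after is a one-line formula: attainable throughput is capped by either peak compute or memory bandwidth times operational intensity (FLOPs per byte moved). The device numbers below are made up for illustration.

```python
def roofline(peak_flops, bandwidth_bytes, operational_intensity):
    """Attainable FLOP/s for a kernel with the given FLOPs-per-byte."""
    return min(peak_flops, bandwidth_bytes * operational_intensity)

# Hypothetical edge SoC: 1 TFLOP/s peak, 50 GB/s memory bandwidth.
peak = 1e12
bw = 50e9

# Token-by-token LLM decoding has low operational intensity, so it sits
# on the slanted bandwidth roof, far below peak compute (memory-bound).
decode = roofline(peak, bw, 2.0)

# Prefill batches many tokens, raising intensity until the flat compute
# roof becomes the limit (compute-bound).
prefill = roofline(peak, bw, 200.0)
```

This is why sequence length and attention structure matter so much on edge hardware: they shift a kernel's operational intensity, moving it between the bandwidth-bound and compute-bound regimes.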

AI · Bullish · Google DeepMind Blog · Jun 24 · 6/10 · 3
🧠

Gemini Robotics On-Device brings AI to local robotic devices

Google DeepMind has announced Gemini Robotics On-Device, an AI model designed to run locally on robots, featuring general-purpose dexterity and rapid task adaptation. The release marks a move toward decentralized AI processing in robotics applications.

AI · Bullish · Hugging Face Blog · Jul 22 · 6/10 · 4
🧠

WWDC 24: Running Mistral 7B with Core ML

The article discusses running Mistral 7B, a large language model, using Apple's Core ML framework as presented at WWDC 24. This demonstrates Apple's continued focus on bringing AI capabilities to their hardware ecosystem through optimized inference tools.

AI · Bullish · Hugging Face Blog · Jun 15 · 6/10 · 5
🧠

Faster Stable Diffusion with Core ML on iPhone, iPad, and Mac

Stable Diffusion now runs faster with Apple's Core ML framework on iPhone, iPad, and Mac. The improvements enable on-device AI image generation with better performance and efficiency across Apple's ecosystem.

AI · Bullish · Google Research Blog · Oct 1 · 4/10 · 5
🧠

Introducing interactive on-device segmentation in Snapseed

Google's Snapseed photo editing app introduces interactive on-device segmentation technology, allowing users to select and edit specific objects in photos directly on their device. This represents an advancement in mobile AI-powered image processing capabilities without requiring cloud connectivity.