12 articles tagged with #on-device-ai. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠Researchers introduce RePAIR, a framework enabling users to instruct large language models to forget harmful knowledge, misinformation, and personal data through natural language prompts at inference time. The system uses a training-free method called STAMP that manipulates model activations to achieve selective unlearning with minimal computational overhead, outperforming existing approaches while preserving model utility.
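The summary does not detail how STAMP manipulates activations; training-free interventions of this family are often built on projection ablation, which removes a learned "concept direction" from a hidden state. A toy sketch of that generic primitive (the function and the direction vector are illustrative, not from the paper):

```python
def ablate_direction(h, v):
    """Remove the component of hidden state h along direction v
    (projection ablation, a common activation-intervention primitive)."""
    norm_sq = sum(vi * vi for vi in v)
    coef = sum(hi * vi for hi, vi in zip(h, v)) / norm_sq
    return [hi - coef * vi for hi, vi in zip(h, v)]

# If v encodes the knowledge to forget, the projected-out state
# no longer carries any component along that direction:
h_clean = ablate_direction([3.0, 4.0], [1.0, 0.0])   # -> [0.0, 4.0]
```

Applied at inference time to each layer's residual stream, an intervention like this adds only a vector projection per token, which is why such methods carry minimal computational overhead.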
AI · Bullish · arXiv – CS AI · 1d ago · 7/10
🧠Researchers introduce Vec-LUT, a novel vector-based lookup table technique that dramatically improves ultra-low-bit LLM inference on edge devices by addressing memory bandwidth underutilization. The method achieves up to 4.2x performance improvements over existing approaches, enabling LLM execution on CPUs that outpaces specialized NPUs.
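The core trick in lookup-table kernels for ultra-low-bit inference is to replace per-weight dequantization arithmetic with table lookups: precompute, for each small activation group, the partial dot product of every possible weight bit pattern. A minimal sketch of that idea (illustrative only, not Vec-LUT's actual vectorized kernel):

```python
from itertools import product

def build_lut(act_group, levels):
    """Precompute partial dot products for every possible pattern of
    quantized weight codes over one activation group of length g.
    Table size is len(levels) ** g entries."""
    return {
        pattern: sum(levels[p] * a for p, a in zip(pattern, act_group))
        for pattern in product(range(len(levels)), repeat=len(act_group))
    }

# 2-bit weights (4 levels), activation group of length 2 -> 16-entry table
levels = [-1.0, -0.5, 0.5, 1.0]
lut = build_lut([2.0, 3.0], levels)

# A row of packed weight codes now costs one lookup + add per group:
y = lut[(3, 0)]   # = 1.0*2.0 + (-1.0)*3.0 = -1.0
```

Because the table is built once per activation vector and reused across every weight row, the inner loop becomes pure loads and adds, which is what lets such kernels saturate CPU memory bandwidth.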
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers propose shifting from large monolithic AI models to domain-specific superintelligence (DSS) societies due to unsustainable energy costs and physical constraints of current generative AI scaling approaches. The alternative involves smaller, specialized models working together through orchestration agents, potentially enabling on-device deployment while maintaining reasoning capabilities.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers developed LiteVLA-Edge, a deployment-oriented Vision-Language-Action model pipeline that enables fully on-device inference on embedded robotics hardware like Jetson Orin. The system achieves 150.5ms latency (6.6Hz) through FP32 fine-tuning combined with 4-bit quantization and GPU-accelerated inference, operating entirely offline within a ROS 2 framework.
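The 4-bit side of such a pipeline can be illustrated with plain symmetric quantization, mapping each weight to an integer in [-8, 7] via a per-group scale (a generic sketch, not LiteVLA-Edge's specific scheme):

```python
def quantize_sym4(weights):
    """Symmetric 4-bit quantization: map floats to ints in [-8, 7]
    using a single per-group scale (falls back to 1.0 for all-zero input)."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from 4-bit codes."""
    return [scale * qi for qi in q]

q, s = quantize_sym4([0.7, -0.3, 0.1])   # q = [7, -3, 1]
approx = dequantize(q, s)                # close to the originals, within one step
```

Storing only the int4 codes plus one scale per group cuts weight memory roughly 8x versus FP32, which is what makes fully on-device inference feasible on boards like Jetson Orin.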
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers propose ROMA, a new hardware accelerator for running large language models on edge devices using QLoRA. The system uses ROM storage for quantized base models and SRAM for LoRA weights, achieving over 20,000 tokens/s generation speed without external memory.
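ROMA's storage split (quantized base model in ROM, LoRA weights in SRAM) mirrors how a QLoRA-style forward pass decomposes: the output is the dequantized base matmul plus a low-rank correction B(Ax). A pure-Python sketch with illustrative shapes and a single per-tensor scale (the accelerator's actual datapath differs):

```python
def qlora_forward(w_q, scale, lora_A, lora_B, x):
    """y = dequant(W_q) @ x + B @ (A @ x).

    w_q    : integer base weights (ROM-resident, read-only in ROMA's design)
    scale  : dequantization scale for w_q
    lora_A : r x d low-rank down-projection (small, SRAM-resident)
    lora_B : d_out x r low-rank up-projection
    """
    base = [sum(scale * wij * xj for wij, xj in zip(row, x)) for row in w_q]
    ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in lora_A]
    delta = [sum(bij * aj for bij, aj in zip(row, ax)) for row in lora_B]
    return [b + d for b, d in zip(base, delta)]

y = qlora_forward([[1, -1], [2, 0]], 0.5,          # int base weights + scale
                  [[1.0, 1.0]], [[0.5], [1.0]],    # rank-1 LoRA factors
                  [2.0, 4.0])                      # -> [2.0, 8.0]
```

Keeping the large base matrix read-only is what makes ROM storage viable: only the tiny A and B factors ever change per task, so they alone need writable memory.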
AI · Bullish · Hugging Face Blog · Aug 8 · 7/10
🧠Hugging Face has released Swift Transformers, a library for running large language models locally on Apple devices. This enables on-device AI inference without requiring cloud connectivity, potentially improving privacy and performance for iOS/macOS applications.
AI · Neutral · Blockonomi · Mar 25 · 6/10
🧠HP stock has declined 34% annually as the company announces HP IQ, a privacy-focused on-device AI platform set to launch in Spring 2026. The platform positions HP as a direct competitor to Apple's AI approach ahead of Apple's WWDC event.
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10
🧠Researchers introduce RooflineBench, a framework for measuring performance capabilities of Small Language Models on edge devices using operational intensity analysis. The study reveals that sequence length significantly impacts performance, model depth causes efficiency regression, and structural improvements like Multi-head Latent Attention can unlock better hardware utilization.
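Operational intensity (FLOPs per byte of memory traffic) determines whether a kernel is compute- or bandwidth-bound: the roofline model caps attainable throughput at min(peak compute, OI × bandwidth). A sketch with illustrative edge-SoC numbers (not figures from the paper):

```python
def attainable_gflops(oi_flops_per_byte, peak_gflops, bandwidth_gb_s):
    """Roofline model: throughput is limited either by peak compute
    or by memory bandwidth times operational intensity."""
    return min(peak_gflops, oi_flops_per_byte * bandwidth_gb_s)

peak, bw = 4000.0, 100.0                          # GFLOP/s, GB/s (illustrative)
decode = attainable_gflops(1.0, peak, bw)         # matvec-heavy decode: memory-bound
prefill = attainable_gflops(100.0, peak, bw)      # long-sequence prefill: compute-bound
```

This is why sequence length matters so much on edge hardware: short-sequence decoding sits far left of the ridge point (peak/bandwidth), leaving compute units idle, while longer sequences raise operational intensity toward the compute roof.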
AI · Bullish · Google DeepMind Blog · Jun 24 · 6/10
🧠Google DeepMind has announced Gemini Robotics On-Device, an AI model designed to run locally on robotic devices, featuring general-purpose dexterity and rapid task adaptation capabilities. This development represents a move toward decentralized AI processing in robotics applications.
AI · Bullish · Hugging Face Blog · Jul 22 · 6/10
🧠The article discusses running Mistral 7B, a large language model, using Apple's Core ML framework as presented at WWDC 24. This demonstrates Apple's continued focus on bringing AI capabilities to their hardware ecosystem through optimized inference tools.
AI · Bullish · Hugging Face Blog · Jun 15 · 6/10
🧠Apple has announced faster Stable Diffusion implementation using Core ML framework for iPhone, iPad, and Mac devices. This development enables on-device AI image generation with improved performance and efficiency across Apple's ecosystem.
AI · Bullish · Google Research Blog · Oct 1 · 4/10
🧠Google's Snapseed photo editing app introduces interactive on-device segmentation technology, allowing users to select and edit specific objects in photos directly on their device. This represents an advancement in mobile AI-powered image processing capabilities without requiring cloud connectivity.