#edge-computing News & Analysis

214 articles tagged with #edge-computing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Delay-Adaptive Speculation Control for Low-Latency Edge-Cloud LLM Inference

Researchers develop a delay-adaptive algorithm for optimizing speculative decoding in distributed LLM inference across edge-cloud systems. The study proves optimal draft length follows a finite threshold policy and introduces UCB-SpecStop, an online control algorithm that reduces per-token latency by up to 22.4% compared to existing methods while adapting to varying network conditions.

🧠 Llama

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

GRINQH: Graded Input-based Quantization Hierarchy for Efficient LLM Generation

GRINQH introduces a weight-only quantization framework that optimizes large language model inference by dynamically assigning different precision levels to weight channels based on activation magnitudes. The approach achieves state-of-the-art performance on Llama3 and Qwen3 models at 2-4 bit settings, addressing the GPU memory bandwidth bottleneck that constrains decoding speed in edge-computing environments.

🧠 Llama

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

FleetAgent: Teleoperation Assistant for Autonomous Fleets via Vectorized V2N Messages

FleetAgent is a cloud-based AI system that uses compact vectorized vehicle-to-network messages to assist remote operators in managing autonomous vehicle fleets. The system reduces data transmission costs by up to 625x compared to raw images while improving teleoperation monitoring accuracy and decision-making efficiency.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

Less is More: Lightweight Prompt Compression for Question Answering Applications on Edge Devices

Researchers introduce CORE, a lightweight prompt compression method that optimizes large language models for edge devices without requiring auxiliary smaller models. The approach achieves 30% accuracy improvements while reducing memory usage by 50% and cutting energy consumption by 95% on smartphones compared to existing methods.

🏢 Nvidia

AI · Bullish · arXiv – CS AI · Jun 11 · 7/10

Physics-Distilled Neural Network enabled by Large Language Models for Manufacturing Process-Property Predictive Modeling

Researchers have developed a physics-informed neural network framework that uses Large Language Models to extract scientific knowledge from literature, enabling accurate manufacturing predictions with minimal data. The lightweight student model achieves real-time inference speeds exceeding 6000 Hz while maintaining robust performance even when LLM-derived physics priors are incomplete.

AI · Bullish · Ars Technica – AI · Jun 10 · 7/10

Google DeepMind releases DiffusionGemma, a model that runs local AI 4x faster

Google DeepMind released DiffusionGemma, a new AI model that leverages diffusion techniques to accelerate local text generation by 4x compared to traditional approaches. The breakthrough applies diffusion methods—commonly used in image generation—to language tasks, enabling faster inference speeds for on-device AI applications.

🏢 Google

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

From Volume to Value: Preference-Aligned Memory Construction for On-Device RAG

Researchers introduce EPIC, a novel approach to on-device Retrieval-Augmented Generation (RAG) that prioritizes user preferences as compact personal context while operating under strict memory constraints. The method achieves dramatic efficiency gains—reducing memory usage by 2,404x and latency by 32x—while improving preference-following accuracy by 18.79 percentage points across multiple benchmarks.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

Achieving Cloud-Grade SLOs for Local Mixture-of-Experts Inference through CPU-GPU Hybrid Design

Researchers present a CPU-GPU hybrid system enabling local deployment of large Mixture-of-Experts models with cloud-level performance, achieving 1,800 tokens/s throughput and supporting 45K-token prompts within 30 seconds using consumer hardware. The breakthrough addresses critical gaps in local inference including latency, throughput, and concurrent workload handling without requiring quantization or model distillation.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

Sigma-Branch: Hierarchical Single-Path Network Reconstruction for Dynamic Inference with Reduced Active Parameters

Researchers introduce Sigma-Branch, a neural network restructuring framework that reduces per-inference active parameters by 58-60% while maintaining full model capacity in memory. The approach uses hierarchical routing and binary tree architecture to enable efficient edge deployment without permanent model compression trade-offs.

AI · Bullish · Crypto Briefing · Jun 9 · 7/10

Apple unveils AFM 3 Core Advanced with 20 billion parameters for on-device AI at WWDC26

Apple announced the AFM 3 Core Advanced, a 20 billion parameter on-device AI model at WWDC26, marking a significant step in bringing advanced AI capabilities directly to consumer devices. The move underscores the industry's shift toward specialized hardware designed to support sophisticated AI processing without relying on cloud infrastructure.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

I-Segmenter: Integer-Only Vision Transformer for Efficient Semantic Segmentation

Researchers introduce I-Segmenter, the first fully integer-only Vision Transformer framework for semantic segmentation that eliminates floating-point operations to enable efficient deployment on resource-constrained devices. The model achieves only 5.1% accuracy loss compared to standard floating-point versions while reducing model size by 3.8x and improving inference speed by 1.2x, with a novel activation function addressing quantization challenges.

AI · Bullish · arXiv – CS AI · Jun 9 · 7/10

CT-VAM: A Cerebello-Thalamic-Inspired Vision-Action Model for Efficient Visuomotor Control

Researchers introduce CT-VAM, a compact 68M-parameter neural network inspired by cerebellar-thalamic brain architecture for robotic manipulation tasks. The model processes visual inputs and proprioception to predict action sequences efficiently on edge devices, matching larger vision-language-action models while reducing latency and enabling practical deployment on resource-constrained robots.

AI · Bullish · Crypto Briefing · Jun 4 · 7/10

Nvidia takes AI battle from data center to laptop with new RTX Spark superchip

Nvidia is expanding its AI chip portfolio beyond data centers by launching the RTX Spark superchip for consumer laptops. This move threatens to disrupt the traditional PC market by enabling on-device AI capabilities and challenging incumbents like Intel and AMD in the consumer segment.

🏢 Nvidia

AI · Bullish · Ars Technica – AI · Jun 3 · 7/10

Google's new Gemma 4 open AI model is sized for your laptop

Google has released Gemma 4 12B, a lightweight open-source AI model designed to run efficiently on consumer laptops using a new encoding scheme and token prediction capabilities. The model represents a significant step toward democratizing access to advanced AI technology by reducing computational barriers for developers and individual users.

🏢 Google

AI · Bullish · The Verge – AI · Jun 2 · 7/10

Microsoft Build 2026: the 7 biggest announcements

Microsoft Build 2026 featured major announcements including the Surface RTX Spark Dev Box, an AI-optimized mini PC for developers powered by Nvidia's Arm-based RTX Spark chip, alongside updates to AI models and a new always-on personal assistant. The event signals Microsoft's continued push to democratize local AI development and compete in the accelerating AI hardware market.

🏢 Nvidia

AI × Crypto · Bullish · arXiv – CS AI · Jun 2 · 7/10

GRANITE: a Byzantine-Resilient Dynamic Gossip Learning Framework

GRANITE is a new Byzantine-resilient framework for decentralized gossip learning that addresses vulnerabilities in dynamic peer sampling protocols used in distributed machine learning. The system demonstrates resilience against coordinated attacks where malicious nodes both poison models and manipulate network topology, achieving near-optimal accuracy with up to 30% Byzantine nodes while reducing communication costs by 9x.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

AI-IoT-Robotics Integration: Survey of Frameworks, Emerging Trends, and the Path Toward Connected Robotics

A comprehensive survey examines the convergence of AI, IoT, and robotics, identifying Small Language Models (SLMs) and Large Language Models (LLMs) as critical components for distributed cognition in edge and cloud environments. The research proposes unified design frameworks and modular architectures to address interoperability gaps, advancing the emerging field of Connected Robotics and Physical AI.

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

Network Distributed Multi-Agent Reinforcement Learning for Consensus Control of Quadcopters

Researchers propose Network Distributed Multi-Agent Reinforcement Learning (ND-MARL), a framework that enables quadcopter swarms to achieve consensus control using only local 2-neighbor communication. The approach demonstrates zero-shot scalability, with policies trained on 3 agents successfully deployed to swarms of up to 250 agents without retraining, marking a significant advancement in distributed autonomous systems.

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

FreqLite: A Lightweight Frequency-Decomposed Linear Model with Adaptive Reversible Normalization for Robust Long-Term Time-Series Forecasting

FreqLite is a new lightweight linear model for long-term time-series forecasting that uses frequency decomposition and adaptive normalization to achieve better accuracy than larger transformer models while requiring 4x fewer parameters and significantly less computational resources. The method introduces Adaptive Reversible Instance Normalization (A-RevIN) to handle non-stationary data more effectively than existing approaches.

AI × Crypto · Bullish · Crypto Briefing · Jun 1 · 7/10

Tether AI open-sources TurboQuant, reducing LLM KV cache memory use by 5x

Tether AI has open-sourced TurboQuant, a technology that reduces large language model KV cache memory consumption by 5x. The release aims to democratize AI development by enabling efficient local deployment and reducing dependence on centralized cloud infrastructure.

AI · Bullish · Crypto Briefing · Jun 1 · 7/10

Nvidia enters the personal computer market with a new AI chip, challenging Intel and AMD

Nvidia is entering the personal computer market with a new AI chip designed to compete against Intel and AMD's processors. This move could shift AI processing from cloud servers to individual devices, potentially improving user data privacy and creating a significant competitive challenge for established PC chipmakers.

🏢 Nvidia

AI · Bullish · Crypto Briefing · Jun 1 · 7/10

Nvidia enters personal computer market with new AI chip that can run 120 billion parameter models locally

Nvidia has launched a new AI chip designed for personal computers that can run 120 billion parameter models locally, marking the company's strategic entry into the consumer PC market. This development prioritizes on-device AI processing, potentially shifting how users interact with AI applications while addressing data privacy concerns by reducing reliance on cloud computing.

🏢 Nvidia

AI × Crypto · Bullish · Crypto Briefing · Jun 1 · 7/10

Tether releases open source version of Google’s TurboQuant to cut AI memory use

Tether has released an open-source version of Google's TurboQuant, a technology designed to reduce AI memory consumption. This move aims to decentralize AI development by enabling local devices to run sophisticated AI models without relying on centralized cloud infrastructure.

AI · Bullish · Crypto Briefing · Jun 1 · 7/10

Nvidia unveils first laptops designed for AI agents with RTX Spark

Nvidia has launched its first laptops specifically designed for AI agents, featuring the RTX Spark technology. This move represents Nvidia's expansion beyond GPUs into consumer hardware, potentially disrupting the PC market and establishing new standards for local AI processing capabilities.

🏢 Nvidia

AI · Bullish · Crypto Briefing · Jun 1 · 7/10

Nvidia unveils GB10 Grace Blackwell Superchip to challenge Apple and Intel in personal AI computing

Nvidia has unveiled the GB10 Grace Blackwell Superchip, a new processor designed to democratize AI computing by reducing costs and enabling broader access to powerful AI capabilities. The chip positions Nvidia to compete directly with Apple and Intel in the personal AI computing market, representing a significant shift toward making advanced AI technology more accessible to businesses and developers.

🏢 Nvidia
