67 articles tagged with #edge-computing. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBullisharXiv – CS AI · 3d ago7/10
🧠Researchers introduce Ge²mS-T, a novel Spiking Vision Transformer architecture that optimizes energy efficiency while maintaining training and inference performance through multi-dimensional grouped computation. The approach addresses fundamental limitations in existing SNN paradigms by balancing memory overhead, learning capability, and energy consumption simultaneously.
AIBullisharXiv – CS AI · Mar 277/10
🧠Researchers have published a comprehensive review of Large Language Models for Autonomous Driving (LLM4AD), introducing new benchmarks and conducting real-world experiments on autonomous vehicle platforms. The paper explores how LLMs can enhance perception, decision-making, and motion control in self-driving cars, while identifying key challenges including latency, security, and safety concerns.
AIBullishTechCrunch – AI · Mar 267/10
🧠Mistral has released a new open-source speech generation model that is lightweight enough to run on mobile devices including smartwatches and smartphones. This represents a significant advancement in making AI speech capabilities more accessible and portable for edge computing applications.
AIBullisharXiv – CS AI · Mar 267/10
🧠Researchers developed the Cognitive Firewall, a hybrid edge-cloud defense system that protects browser-based AI agents from indirect prompt injection attacks. The three-stage architecture reduces attack success rates to below 1% while maintaining 17,000x faster response times compared to cloud-only solutions by processing simple attacks locally and complex threats in the cloud.
AIBullisharXiv – CS AI · Mar 177/10
🧠SPARQ introduces a unified framework combining spiking neural networks, quantization-aware training, and reinforcement learning-guided early exits for energy-efficient edge AI. The system achieves up to 5.15% higher accuracy than conventional quantized SNNs while reducing system energy consumption by over 330 times and cutting synaptic operations by over 90%.
AIBullisharXiv – CS AI · Mar 177/10
🧠PrototypeNAS is a new zero-shot neural architecture search method that rapidly designs and optimizes deep neural networks for microcontroller units without requiring extensive training. The system uses a three-step approach combining structural optimization, ensemble zero-shot proxies, and Hypervolume subset selection to identify efficient models within minutes that can run on resource-constrained edge devices.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers propose HO-SFL (Hybrid-Order Split Federated Learning), a new framework that enables memory-efficient fine-tuning of large AI models on edge devices by eliminating backpropagation on client devices while maintaining convergence speed comparable to traditional methods. The approach significantly reduces communication costs and memory requirements for distributed AI training.
AIBullisharXiv – CS AI · Mar 127/10
🧠Researchers introduce MoE-SpAc, a new framework for efficient Mixture-of-Experts model inference on edge devices that achieves 42% improvement over existing baselines. The system uses speculative decoding as a memory management tool and demonstrates 4.04x average speedup across benchmarks.
AIBullisharXiv – CS AI · Mar 117/10
🧠Researchers have developed Meissa, a lightweight 4B-parameter medical AI model that brings advanced agentic capabilities offline for healthcare applications. The system matches frontier models like GPT in medical benchmarks while operating with 25x fewer parameters and 22x lower latency, addressing privacy and cost concerns in clinical settings.
🧠 Gemini
AINeutralarXiv – CS AI · Mar 97/10
🧠Researchers propose a framework for decentralized resource allocation in real-time AI services across device-edge-cloud infrastructure. The study shows that dependency graph topology determines whether price-based allocation can work at scale, with hierarchical structures enabling stable pricing while complex dependencies cause instability.
AIBullisharXiv – CS AI · Mar 67/10
🧠A research paper presents a 10-year roadmap for coordinated AI and hardware co-development, targeting 1000x efficiency improvements in AI training and inference by 2035. The vision emphasizes energy efficiency over raw compute scaling, proposing integrated solutions across algorithms, architectures, and systems to enable sustainable AI deployment from cloud to edge environments.
AIBullisharXiv – CS AI · Mar 67/10
🧠Researchers developed a memory management system for multi-agent AI systems on edge devices that reduces memory requirements by 4x through 4-bit quantization and eliminates redundant computation by persisting KV caches to disk. The solution reduces time-to-first-token by up to 136x while maintaining minimal impact on model quality across three major language model architectures.
🏢 Perplexity🧠 Llama
AIBullisharXiv – CS AI · Mar 56/10
🧠Researchers developed LiteVLA-Edge, a deployment-oriented Vision-Language-Action model pipeline that enables fully on-device inference on embedded robotics hardware like Jetson Orin. The system achieves 150.5ms latency (6.6Hz) through FP32 fine-tuning combined with 4-bit quantization and GPU-accelerated inference, operating entirely offline within a ROS 2 framework.
AINeutralarXiv – CS AI · Mar 57/10
🧠Researchers propose a new framework for Agentic Peer-to-Peer Networks where AI agents on edge devices can collaborate by sharing capabilities and actions rather than static files. The system introduces tiered verification methods to ensure security and reliability when AI agents delegate tasks to untrusted peers in decentralized networks.
AINeutralarXiv – CS AI · Mar 57/10
🧠Researchers propose an Adaptive Quantized Planetary Crater Detection System (AQ-PCDSys) that uses quantized neural networks and multi-sensor fusion to enable real-time AI-powered crater detection on resource-constrained space exploration hardware. The system addresses the critical bottleneck of deploying sophisticated deep learning models on power-limited, radiation-hardened space computers.
AIBullisharXiv – CS AI · Mar 46/102
🧠Researchers developed TinyIceNet, a compact AI model for real-time sea ice mapping using satellite SAR imagery, designed specifically for on-board FPGA processing in space. The system achieves 75.216% F1 score while consuming 50% less energy than GPU baselines, demonstrating practical AI deployment for maritime navigation in polar regions.
$NEAR
AIBullisharXiv – CS AI · Mar 47/102
🧠NeuroSkill is a new open-source AI system that models human mental states in real-time using brain-computer interfaces and biophysical signals. The system runs offline on edge devices and can engage with humans on cognitive and emotional levels through its NeuroLoop harness technology.
AIBullisharXiv – CS AI · Mar 37/105
🧠Researchers developed HierarchicalPrune, a compression framework that reduces large-scale text-to-image diffusion models' memory footprint by 77.5-80.4% and latency by 27.9-38.0% while maintaining image quality. The technique enables billion-parameter AI models to run efficiently on resource-constrained devices through hierarchical pruning and knowledge distillation.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers developed NANOMIND, a software-hardware framework that optimizes Large Multimodal Models for battery-powered devices by breaking them into modular components and mapping each to optimal accelerators. The system achieves 42.3% energy reduction and enables 20.8 hours of operation running LLaVA-OneVision on a compact device without network connectivity.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers propose ROMA, a new hardware accelerator for running large language models on edge devices using QLoRA. The system uses ROM storage for quantized base models and SRAM for LoRA weights, achieving over 20,000 tokens/s generation speed without external memory.
AIBullisharXiv – CS AI · Mar 37/102
🧠ButterflyMoE introduces a breakthrough approach to reduce memory requirements for AI expert models by 150× through geometric parameterization instead of storing independent weight matrices. The method uses shared ternary prototypes with learned rotations to achieve sub-linear memory scaling, enabling deployment of multiple experts on edge devices.
AIBullisharXiv – CS AI · Mar 37/103
🧠CSRv2 introduces a new training approach for ultra-sparse embeddings that reduces inactive neurons from 80% to 20% while delivering 14% accuracy gains. The method achieves 7x speedup over existing approaches and up to 300x improvements in compute and memory efficiency compared to dense embeddings.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers propose 'Intelligence per Watt' (IPW) as a metric to measure AI efficiency, finding that local AI models can handle 71.3% of queries while being 1.4x more energy efficient than cloud alternatives. The study demonstrates that smaller local language models (≤20B parameters) can redistribute computational demand from centralized cloud infrastructure.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers developed TT-SEAL, a selective encryption framework for compressed AI models using Tensor-Train Decomposition that maintains security while encrypting only 4.89-15.92% of parameters. The system achieves the same robustness as full encryption while reducing AES decryption overhead in end-to-end latency from 58% to as low as 2.76%.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers developed a runtime-reconfigurable bitwise systolic array architecture for multi-precision quantized neural networks on FPGA hardware accelerators. The system achieves 1.3-3.6x speedup on mixed-precision models while supporting higher clock frequencies up to 250MHz, addressing the trade-off between hardware efficiency and inference accuracy.