AI · Bearish · arXiv – CS AI · May 11 · 7/10
🧠Researchers introduce CloudWeb, an adversarial attack that manipulates remote sensing images with realistic cloud and haze patterns to hijack vision-language retrieval systems in multimodal RAG pipelines. On benchmark tests, the attack raises weather-related evidence injection from 0.71% to 43.29%, demonstrating that input-space threats to retrieval stages remain largely undefended in production systems.
🏢 OpenAI
AI · Bullish · arXiv – CS AI · Apr 10 · 7/10
🧠Researchers introduce RS-EoT (Remote Sensing Evidence-of-Thought), a framework that enables vision-language models to reason more effectively about satellite imagery by iteratively seeking visual evidence rather than relying on linguistic patterns. The approach uses a self-play multi-agent system called SocraticAgent and reinforcement learning to address the 'Glance Effect,' where models analyze large-scale remote sensing images only superficially. It achieves state-of-the-art performance on multiple benchmarks.
AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers introduce GeoSeg, a zero-shot, training-free framework for AI-driven segmentation of remote sensing imagery that uses multimodal language models for reasoning without requiring specialized training data. The system addresses domain-specific challenges in satellite and aerial image analysis through bias-aware coordinate refinement and dual-route prompting mechanisms.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce MineC2FNet, a deep learning framework that leverages abundant coarse-grained remote sensing data to improve fine-grained mining footprint segmentation in multispectral imagery. The approach uses domain incremental learning with attentive distillation to bridge the gap between coarse and fine datasets, addressing a shortfall in environmental monitoring of global mining operations.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠FLORO is a multimodal geospatial foundation model that learns from diverse remote sensing data across multiple sensor types and resolutions with minimal pretraining data. Despite using significantly smaller datasets than competing models, FLORO demonstrates strong transfer learning performance on ecological and environmental applications, achieving competitive results on scene classification, segmentation, and regression tasks.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce WATCH, a satellite-based framework using foundation models to detect disturbances at archaeological sites across months and years. The system combines three approaches (temporal embedding distance, self-supervised change detection, and weakly supervised learning) and achieves up to 92.5% accuracy within three-month tolerance windows when monitoring 1,943 Afghan sites, with cross-validation in Syria, Turkey, Pakistan, and Egypt.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers are using large language models combined with remote sensing imagery to analyze built environments for smart city applications, evaluating models like InternVL and Qwen for tasks including design suggestions, constructability assessment, and risk identification. The study demonstrates that multimodal AI systems can effectively process satellite imagery at multiple scales to support urban planning and infrastructure decision-making.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers introduce DPG-CD, a deep learning framework that detects both 2D semantic and 3D structural changes in urban environments by fusing multi-temporal satellite imagery with Digital Surface Model data. The method addresses the challenge of combining different data modalities to enable high-frequency urban monitoring and disaster assessment without requiring expensive frequent 3D data collection.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠LithoBench introduces a comprehensive benchmark dataset for evaluating large multimodal models on remote-sensing lithology interpretation, containing 10,000 expert-annotated instances across cognitive levels from identification to reasoning. The research reveals significant gaps in current vision-language models' ability to handle knowledge-intensive geological tasks, highlighting the challenges of applying general-purpose AI to specialized domain expertise.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠Researchers introduce ARAS400k, a large-scale remote sensing dataset containing 400k images (100k real, 300k synthetic) with segmentation maps and descriptions. The study demonstrates that combining real and synthetic data consistently outperforms training on real data alone for semantic segmentation and image captioning tasks.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 8
🧠Researchers introduce GRAD-Former, a novel AI framework for detecting changes in satellite imagery that outperforms existing methods while using fewer computational resources. The system uses gated attention mechanisms and differential transformers to more efficiently identify semantic differences in very high-resolution satellite images.
AI · Neutral · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠Researchers developed a lightweight AI model using unsupervised deep learning to detect conflict-related fires in Sudan within 24-30 hours using commercially available satellite imagery. The Variational Auto-Encoder (VAE) approach outperformed traditional methods in identifying burn signatures from 4-band Planet Labs satellite data at 3-meter resolution.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 7
🧠Researchers developed FUSAR-GPT, a specialized Visual Language Model for Synthetic Aperture Radar (SAR) imagery that significantly outperforms existing models. The system introduces spatiotemporal feature embedding and a two-stage training strategy, achieving over 12% improvement on remote sensing benchmarks.
AI · Neutral · IEEE Spectrum – AI · Jan 12 · 4/10 · 7
🧠Researchers developed a contactless machine-learning system that monitors patient pain during surgery by analyzing facial expressions and heart rate data via remote photoplethysmogram (rPPG). The system achieved 45% accuracy when tested on realistic surgical footage, offering a non-invasive alternative to traditional pain monitoring methods that require wired sensors.
AI · Neutral · Hugging Face Blog · Oct 13 · 4/10 · 5
🧠The article appears to discuss fine-tuning CLIP (Contrastive Language-Image Pre-training) models using satellite imagery and corresponding captions. However, the article body is empty, preventing detailed analysis of the methodology, results, or implications of this remote sensing AI application.