AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠Researchers introduce Logit-aware Final-block Quantization (LFQ), a technique that improves low-bit quantization of large language models by optimizing the final transformer block to preserve token probability distributions. This advancement addresses quality degradation in generative tasks while maintaining efficiency gains critical for deploying scaled LLMs.
AI · Bullish · arXiv – CS AI · 5d ago · 7/10
🧠Researchers develop a systematic approach to quantization-aware training for large language models using 8-bit floating-point formats, identifying and solving two critical failure modes—amax saturation and catastrophic forgetting—that don't surface in standard training metrics. Their solution achieves near-lossless performance with only 0.43% degradation on benchmark tasks, advancing practical LLM deployment efficiency.
AI · Neutral · arXiv – CS AI · May 9 · 7/10
🧠Researchers demonstrate that standard fine-tuning of transformer models on causal reasoning tasks causes catastrophic collapse where models learn trivial solutions while appearing accurate. They propose a semantic loss function with graph-based constraints that prevents collapse and achieves stable, context-dependent causal reasoning with 42.7% improvement over baseline models.
AI · Bullish · arXiv – CS AI · May 4 · 7/10
🧠Researchers introduce Sentra-Guard, a real-time defense system that detects and mitigates jailbreak and prompt injection attacks on large language models with 99.96% accuracy. The multilingual framework combines FAISS-indexed semantic embeddings with fine-tuned transformers and human-in-the-loop feedback, significantly outperforming existing defenses like LlamaGuard-2 and OpenAI Moderation.
🏢 OpenAI
AI · Bearish · arXiv – CS AI · May 1 · 7/10
🧠Researchers challenge the assumption that multi-agent AI systems benefit from the 'Wisdom of the Crowd' by demonstrating the Inverse-Wisdom Law: adding more logical agents to swarms can paradoxically increase the stability of errors rather than improve accuracy. Through 36 experiments across major benchmarks, the study reveals that architectural tribalism causes agents to prioritize internal agreement over external truth, with system integrity ultimately determined by the synthesizer's logic rather than individual agent quality.
🧠 GPT-5 🧠 Claude 🧠 Sonnet
AI · Bearish · arXiv – CS AI · Apr 7 · 7/10
🧠Researchers present a new framework for AI safety that identifies a 57-token predictive window for detecting potential failures in large language models. The study found that only one out of seven tested models showed predictive signals before committing to problematic outputs, while factual hallucinations produced no detectable warning signs.
AI · Neutral · arXiv – CS AI · Apr 6 · 7/10
🧠Researchers studied weight-space model merging for multilingual machine translation and found it significantly degrades performance when target languages differ. Analysis reveals that fine-tuning redistributes rather than sharpens language selectivity in neural networks, increasing representational divergence in higher layers that govern text generation.
AI · Bearish · arXiv – CS AI · Mar 12 · 7/10
🧠Researchers have developed 'Amnesia,' a lightweight adversarial attack that bypasses safety mechanisms in open-weight Large Language Models by manipulating internal transformer states. The attack enables generation of harmful content without requiring fine-tuning or additional training, highlighting vulnerabilities in current LLM safety measures.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers introduce ZipMap, a new AI model for 3D reconstruction that achieves linear-time processing while maintaining accuracy comparable to slower quadratic-time methods. The system can reconstruct over 700 frames in under 10 seconds on a single H100 GPU, making it more than 20x faster than current state-of-the-art approaches like VGGT.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10
🧠New research reveals that large language models use a "Guess-then-Refine" framework, starting with high-frequency token predictions in early layers and refining them with contextual information in deeper layers. The study provides detailed insights into layer-wise computation dynamics through multiple-choice tasks, fact recall analysis, and part-of-speech predictions.
AI · Neutral · arXiv – CS AI · Feb 27 · 7/10
🧠Researchers developed Compositional-ARC, a dataset that tests AI models' ability to generalize systematically on abstract spatial reasoning tasks. A small 5.7M-parameter transformer trained with meta-learning outperformed large language models such as GPT-4o and Gemini 2.0 Flash on novel combinations of geometric transformations.
AI · Neutral · arXiv – CS AI · 13h ago · 6/10
🧠Researchers demonstrate that GPT-4o-generated paraphrases can improve sign language translation by augmenting training data while keeping video inputs unchanged. Testing across three sign language datasets reveals modest gains on PHOENIX14T (9.56 to 10.33 BLEU-4) but exposes fundamental limitations when data is sparse or highly controlled.
🧠 GPT-4
AI · Neutral · arXiv – CS AI · 13h ago · 6/10
🧠Researchers propose a novel framework for controlling symbolic music generation in Transformer models through activation steering, enabling fine-grained control over musical attributes like pitch and duration without retraining. The approach uses latent space analysis and orthogonalization techniques to independently manipulate multiple attributes while reducing interference and maintaining generation quality.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠A research study comparing seven transformer-based language models of varying sizes (22M to 13B parameters) in topic modeling tasks found that model size has negligible impact on topic quality. This suggests smaller, more efficient models can match larger models' performance for topic coherence applications, potentially reducing computational costs without sacrificing output quality.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers developed a specialized Named Entity Recognition model for identifying disease-related clinical entities in immunology and infectious disease texts, achieving an F1 score of 0.89 with a transformer-based architecture and clinical embeddings. The model outperforms general-purpose NLP systems and LLMs at extracting granular biomedical concepts from unstructured medical narratives, enabling improved cohort identification and clinical decision support.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers propose FHRFormer, a masked transformer-based autoencoder that reconstructs missing fetal heart rate data from wearable monitors using self-supervised learning. The method addresses signal dropout caused by sensor displacement and positional changes, preserving spectral characteristics better than traditional interpolation while enabling both data inpainting and forecasting for improved fetal risk assessment.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers propose a new interpretation method for Transformer models with heterogeneous attention structures, which process information from multiple sources. The work addresses the growing need to understand complex AI systems, particularly as they integrate diverse data modalities and support increasingly sophisticated agent applications.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠LinkedIn researchers introduced LiDDA, a transformer-based machine learning approach for data-driven attribution that assigns conversion credits to marketing interactions across member-level data, aggregate data, and external macro factors. The framework has been implemented at scale at LinkedIn and demonstrates significant business impact, with methodologies applicable to the broader marketing and ad tech industries.
AI · Neutral · arXiv – CS AI · 4d ago · 6/10
🧠Researchers introduce EvaluatorDPT, a decision-control model that predicts YES, NO, or TBD (to-be-determined) for high-stakes AI applications where uncertainty exists. The system learns deferral as an explicit outcome rather than hiding uncertainty in forced predictions, achieving 82.6% accuracy with auditable, policy-governed decision routing that can be inspected and controlled at inference time.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠Researchers introduce CmIVTP, a cross-modal AI framework that combines AIS and CCTV data to improve maritime vessel trajectory prediction. The system uses transformer-based architecture with attention mechanisms to model vessel-environment interactions, addressing limitations of single-source data in maritime navigation systems.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠PathISE is a novel framework that enables knowledge graph question-answering systems to learn effective supervision signals from answer-level labels alone, eliminating the need for expensive intermediate annotations. By using a transformer-based estimator to identify informative relation paths and distilling them into LLM path generators, the approach achieves performance competitive with the state of the art while reducing the resources required for training.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce CLEF, a foundation model for clinical EEG interpretation that processes full-length brain signal sessions alongside patient records and neurologist reports. The model achieves 74% mean AUROC across 234 clinical tasks, substantially outperforming prior EEG foundation models by integrating long-context signal analysis with clinically grounded embeddings.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce IRIS-14B, a 14-billion-parameter LLM fine-tuned to translate compiler intermediate representations between GCC's GIMPLE and LLVM IR, achieving up to 44 percentage points higher accuracy than existing state-of-the-art models. The approach demonstrates how LLMs can function as interoperability layers in hybrid compiler architectures, enabling cross-toolchain workflows without modifying existing compiler infrastructure.
AI · Neutral · arXiv – CS AI · May 12 · 5/10
🧠Researchers introduce KANMultiSign, a neural network framework that converts sign language notation into pose animations using Kolmogorov-Arnold Networks integrated with Transformers. The system achieves improved accuracy with fewer parameters across multiple sign languages, demonstrating that multi-scale supervision is the key driver of performance gains.
AI · Neutral · arXiv – CS AI · May 11 · 6/10
🧠Researchers present PAMPOS, a causal transformer-based system that detects misbehavior in Vehicle-to-Everything (V2X) networks by identifying deviations from learned normal driving patterns, achieving up to 98% AUC without requiring labeled attack data during training. This unsupervised approach addresses a critical security gap where cryptographic mechanisms alone cannot prevent insider falsification attacks in connected vehicle systems.