AI · Bullish · arXiv – CS AI · 2d ago · 7/10
🧠Eureka is an LLM-driven framework that automates feature engineering for machine learning by treating feature design as a code generation problem. The system combines expert agents, chain-of-thought reasoning, and reinforcement learning to generate and refine features iteratively, demonstrating a 16% improvement in cloud resource prediction at Alibaba Cloud.
AI · Neutral · arXiv – CS AI · May 11 · 7/10
🧠Researchers demonstrate that neural networks fail at out-of-distribution (OOD) generalization not because of insufficient training data, but because the choice of feature representation determines which extrapolation patterns a model can learn. The same architecture, at identical in-distribution loss, can differ by 520x out of distribution depending on how features are encoded. The authors conclude that correct feature engineering is necessary but not sufficient: it must be paired with appropriate model class constraints.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠Researchers introduce MedFeat, a framework that uses large language models for feature engineering in clinical tabular prediction. The system incorporates model awareness and domain knowledge to discover clinically meaningful features that outperform traditional approaches and remain robust across different hospital settings.
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose using genetic programming to evolve interpretable feature sets and tree structures for survival analysis models, demonstrating improved predictive performance while maintaining shallow, explainable decision trees. The approach addresses the fundamental trade-off between accuracy and interpretability in medical survival prediction by optimizing both feature construction and tree logic simultaneously.
AI · Neutral · arXiv – CS AI · 3d ago · 5/10
🧠Researchers propose REED, a post-training representation editing method that improves linguistic steganalysis across domains without modifying the model architecture or updating parameters. The technique uses domain-offset vectors and source-domain cover-to-stego directions to adapt detectors to unseen domains with different vocabularies and writing styles.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠Researchers demonstrate that large language models can extract predictive features from financial news with valid intermediate signals (Information Coefficient >0.15), yet these features fail to improve reinforcement learning trading agents during macroeconomic shocks. The findings reveal a critical gap between feature-level validity and downstream policy robustness, suggesting that valid signals alone cannot guarantee trading performance under distribution shifts.
AI · Neutral · arXiv – CS AI · Mar 2 · 7/10 · 17
🧠Researchers conducted a benchmark study on IoT botnet intrusion detection systems, finding that models trained on one network domain suffer significant performance degradation when applied to different environments. The study evaluated three feature sets across four IoT datasets and provided guidelines for improving cross-domain robustness through better feature engineering and algorithm selection.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
🧠Researchers propose a new multi-agent reinforcement learning framework that uses three cooperative agents with attention mechanisms to automate feature transformation for machine learning models. The approach addresses key limitations in existing automated feature engineering methods, including dynamic feature expansion instability and insufficient agent cooperation.
AI · Neutral · arXiv – CS AI · Feb 27 · 3/10 · 6
🧠Researchers developed a machine learning method to predict professional tennis players' first serve directions, achieving 49% accuracy for male players and 44% for female players. The study provides evidence that top players use mixed-strategy serving decisions and suggests contextual information plays a larger role in tennis strategy than previously understood.