y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#hyperparameter-optimization News & Analysis

4 articles tagged with #hyperparameter-optimization. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

4 articles
AINeutralarXiv โ€“ CS AI ยท 6d ago6/10
๐Ÿง 

Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models

Researchers systematically evaluated how sampling temperature and prompting strategies affect extended reasoning performance in large language models, finding that zero-shot prompting peaks at moderate temperatures (T=0.4-0.7) while chain-of-thought performs better at extremes. The study reveals that extended reasoning benefits grow substantially with higher temperatures, suggesting that T=0 is suboptimal for reasoning tasks.

๐Ÿง  Grok
AIBullisharXiv โ€“ CS AI ยท Mar 36/103
๐Ÿง 

Hyperparameter Trajectory Inference with Conditional Lagrangian Optimal Transport

Researchers introduce Hyperparameter Trajectory Inference (HTI), a method to predict how neural networks behave with different hyperparameter settings without expensive retraining. The approach uses conditional Lagrangian optimal transport to create surrogate models that approximate neural network outputs across various hyperparameter configurations.

AINeutralarXiv โ€“ CS AI ยท Mar 115/10
๐Ÿง 

When Learning Rates Go Wrong: Early Structural Signals in PPO Actor-Critic

Researchers introduce the Overfitting-Underfitting Indicator (OUI) to analyze learning rate sensitivity in PPO reinforcement learning systems. The metric can identify problematic learning rates early in training by measuring neural activation patterns, enabling more efficient hyperparameter screening without full training runs.

AINeutralHugging Face Blog ยท Nov 24/106
๐Ÿง 

Hyperparameter Search with Transformers and Ray Tune

The article discusses hyperparameter optimization techniques for transformer models using Ray Tune, a distributed hyperparameter tuning library. This approach enables efficient scaling of machine learning model training and optimization across multiple computing resources.