AI · Bearish · OpenAI News · Aug 5
Estimating worst case frontier risks of open weight LLMs
Researchers studied worst-case risks of releasing open-weight large language models by conducting malicious fine-tuning (MFT) experiments on gpt-oss. The study specifically examined how fine-tuning could maximize dangerous capabilities in biology and cybersecurity domains.