y0news

#gpt-2 News & Analysis

12 articles tagged with #gpt-2. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · 4d ago · 7/10

Powerful Training-Free Membership Inference Against Autoregressive Language Models

Researchers have developed EZ-MIA, a training-free membership inference attack that dramatically improves detection of memorized data in fine-tuned language models by analyzing probability shifts at error positions. The method achieves 3.8x higher detection rates than previous approaches on GPT-2 and demonstrates that privacy risks in fine-tuned models are substantially greater than previously understood.

🧠 Llama
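The core idea in the summary above can be sketched in a few lines: score a candidate sequence by how much its per-token log-probabilities shift between the base and fine-tuned models at positions where the model's greedy prediction misses the target. This is an illustrative reading of the abstract, not EZ-MIA's exact statistic; the function name and threshold-based decision are assumptions.

```python
import numpy as np

def mia_score(logp_ft, logp_base, preds, targets):
    """Illustrative membership score: mean log-prob shift (fine-tuned
    minus base) at 'error positions', where the greedy prediction
    differs from the target token. Large positive shifts suggest the
    sequence was memorized during fine-tuning."""
    err = preds != targets                   # boolean mask of error positions
    if not err.any():
        return 0.0
    return float((logp_ft[err] - logp_base[err]).mean())

# Toy example: log-probs assigned to the target tokens by each model.
logp_base = np.log(np.array([0.2, 0.1, 0.4, 0.05]))
logp_ft   = np.log(np.array([0.6, 0.5, 0.45, 0.3]))   # large gains post-FT
preds     = np.array([1, 7, 3, 9])
targets   = np.array([1, 2, 3, 4])                    # errors at indices 1, 3

score = mia_score(logp_ft, logp_base, preds, targets)
# A threshold calibrated on known non-members would turn this into a decision.
```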
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10

Generalized Discrete Diffusion with Self-Correction

Researchers propose Self-Correcting Discrete Diffusion (SCDD), a new AI model that improves upon existing discrete diffusion models by reformulating self-correction with explicit state transitions. The method enables more efficient parallel decoding while maintaining generation quality, demonstrating improvements at GPT-2 scale.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10

GPUTOK: GPU Accelerated Byte Level BPE Tokenization

Researchers developed GPUTOK, a GPU-accelerated tokenizer for large language models that processes text significantly faster than existing CPU-based solutions. The optimized version shows 1.7x speed improvement over tiktoken and 7.6x over HuggingFace's GPT-2 tokenizer while maintaining output quality.
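For context on what is being accelerated, here is the sequential baseline: byte-level BPE starts from the UTF-8 bytes of the text and repeatedly applies the highest-priority merge rule. GPUTOK's contribution is parallelizing this on GPU; the reference loop below is only the CPU algorithm it speeds up, with an invented toy merge table.

```python
def bpe_encode(text, merges):
    """Reference byte-level BPE. `merges` maps a concatenated byte pair
    to its merge rank (lower rank = higher priority)."""
    toks = [bytes([b]) for b in text.encode("utf-8")]
    while True:
        # Find the adjacent pair with the best (lowest) merge rank.
        best, best_i = None, -1
        for i in range(len(toks) - 1):
            rank = merges.get(toks[i] + toks[i + 1])
            if rank is not None and (best is None or rank < best):
                best, best_i = rank, i
        if best is None:
            return toks                      # no applicable merges remain
        toks[best_i:best_i + 2] = [toks[best_i] + toks[best_i + 1]]

merges = {b"lo": 0, b"low": 1}               # toy vocabulary of two merges
tokens = bpe_encode("lowlow", merges)
```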

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Polynomial, trigonometric, and tropical activations

Researchers developed new activation functions for deep neural networks based on polynomial and trigonometric orthonormal bases that can successfully train models like GPT-2 and ConvNeXt. The work addresses gradient problems common with polynomial activations and shows these networks can be interpreted as multivariate polynomial mappings.
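A minimal sketch of the orthonormal-basis idea: build the activation as a weighted sum of Chebyshev polynomials via the stable three-term recurrence, which keeps each basis term bounded on [-1, 1] instead of blowing up like raw monomials. The basis choice, clipping, and coefficient handling here are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def chebyshev_activation(x, coeffs):
    """Activation f(x) = sum_k coeffs[k] * T_k(x) using Chebyshev
    polynomials of the first kind. Each T_k stays in [-1, 1], which
    avoids the exploding values (and gradients) of plain monomials."""
    x = np.clip(x, -1.0, 1.0)                # Chebyshev basis lives on [-1, 1]
    t_prev, t_cur = np.ones_like(x), x       # T_0 = 1, T_1 = x
    out = coeffs[0] * t_prev + (coeffs[1] * t_cur if len(coeffs) > 1 else 0.0)
    for c in coeffs[2:]:
        t_prev, t_cur = t_cur, 2 * x * t_cur - t_prev   # T_{k+1} recurrence
        out = out + c * t_cur
    return out

y = chebyshev_activation(np.array([0.5]), [0.0, 0.0, 1.0])  # pure T_2 term
```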

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10

Transformers converge to invariant algorithmic cores

Researchers have discovered that transformer models, despite different training runs producing different weights, converge to the same compact "algorithmic cores": low-dimensional subspaces essential for task performance. The study shows these invariant structures persist across different scales and training runs, suggesting transformer computations are organized around shared algorithmic patterns rather than implementation-specific details.

AI · Bullish · OpenAI News · May 9 · 7/10

Language models can explain neurons in language models

Researchers used GPT-4 to automatically generate explanations for how individual neurons behave in large language models and to evaluate the quality of those explanations. They have released a comprehensive dataset containing explanations and quality scores for every neuron in GPT-2, advancing AI interpretability research.

AI · Neutral · OpenAI News · Nov 5 · 7/10

GPT-2: 1.5B release

OpenAI has released the largest version of GPT-2 with 1.5 billion parameters, completing their staged release process. The release includes code and model weights to help detect GPT-2 outputs and serves as a test case for responsible AI model publication.

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

MUXQ: Mixed-to-Uniform Precision MatriX Quantization via Low-Rank Outlier Decomposition

Researchers propose MUXQ, a new quantization technique for large language models that addresses activation outliers through low-rank decomposition. The method enables efficient INT8 quantization while maintaining accuracy close to FP16, making it suitable for edge device deployment with NPU-based hardware.

๐Ÿข Perplexity
AI · Neutral · OpenAI News · Sep 19 · 6/10

Fine-tuning GPT-2 from human preferences

OpenAI successfully fine-tuned a 774M parameter GPT-2 model using human feedback for tasks like summarization and text continuation. The research revealed challenges where human labelers' preferences didn't align with developers' intentions, with summarization models learning to copy text wholesale rather than generate original summaries.
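The reward model at the heart of this approach is fit to human comparisons with a standard pairwise (Bradley–Terry) log-loss; the policy is then fine-tuned to maximize the learned reward with a KL penalty back to the original GPT-2. The one-liner below shows that comparison loss; the function name is ours.

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise log-loss for a reward model trained on human
    comparisons: -log sigmoid(r_chosen - r_rejected). Minimizing it
    pushes the reward of the human-preferred sample above the other."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))
```

When the two rewards are equal the loss is ln 2; it shrinks as the chosen sample's reward pulls ahead and grows when the model ranks the rejected sample higher.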

AI · Neutral · OpenAI News · Aug 20 · 6/10

GPT-2: 6-month follow-up

OpenAI released the 774 million parameter GPT-2 language model, continuing the staged release approach that began with smaller models earlier in the year. The release includes an open-source legal agreement for model-sharing partnerships and a technical report on coordinating AI research publication norms.

AI · Bullish · OpenAI News · Apr 25 · 6/10

MuseNet

OpenAI has created MuseNet, a deep neural network capable of generating 4-minute musical compositions using 10 different instruments and combining various musical styles from country to classical to rock. The system uses the same transformer technology as GPT-2, learning musical patterns through unsupervised training on hundreds of thousands of MIDI files rather than explicit musical programming.