🧠 AI | Neutral | Importance 6/10

Target-Side Paraphrase Augmentation for Sign Language Translation with Large Language Models

arXiv – CS AI | Pedro Dal Bianco, Jean Paul Nunes Reinhold, Oscar Stanchi, Facundo Quiroga, Franco Ronchetti, Ulisses Brisolara Corrêa
🤖AI Summary

Researchers demonstrate that GPT-4o-generated paraphrases can improve sign language translation by augmenting training data while keeping video inputs unchanged. Testing across three sign language datasets reveals modest gains on PHOENIX14T (9.56 to 10.33 BLEU-4) but exposes fundamental limitations when data is sparse or highly controlled.
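To put the reported PHOENIX14T numbers in perspective, the short calculation below derives the absolute and relative BLEU-4 change from the two scores quoted above. The variable names are illustrative only; the relative figure is a derived quantity, not one stated in the summary.

```python
# BLEU-4 scores on PHOENIX14T as quoted in the summary.
baseline_bleu4 = 9.56
augmented_bleu4 = 10.33

# Absolute gain in BLEU-4 points.
absolute_gain = round(augmented_bleu4 - baseline_bleu4, 2)

# Relative gain as a percentage of the baseline (derived here, not reported).
relative_gain_pct = round(100 * (augmented_bleu4 - baseline_bleu4) / baseline_bleu4, 1)

print(f"absolute gain: {absolute_gain} BLEU-4 points")   # 0.77
print(f"relative gain: {relative_gain_pct}% of baseline")  # 8.1
```

A gain of under one BLEU-4 point is roughly 8% of a low baseline, which is consistent with the summary's description of the improvement as modest.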

Analysis

This research tackles a critical accessibility challenge: translating sign languages with machine learning when paired training data is scarce and target vocabularies follow heavy-tailed distributions. The approach uses a large language model to expand the training corpus through controlled paraphrasing of the target sentences, so a Transformer-based pose recognition system can learn from richer textual variation without new video annotations. This is a practical response to the data scarcity that plagues low-resource translation tasks.

Sign language translation has historically lagged behind spoken language translation because datasets are limited and meaningful features are hard to extract from video. By decoupling target-side augmentation from the visual input, the researchers sidestep the far more difficult problem of generating synthetic sign language videos. The two-stage training pipeline mirrors successful transfer learning patterns in NLP: pretraining on diverse paraphrases, then fine-tuning on the original references to anchor semantic fidelity.
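The two-stage data setup described above can be sketched as follows. This is a minimal illustration, not the authors' code: `Sample`, `build_stage_datasets`, and the `include_original_in_stage1` flag are hypothetical names, and whether the first stage also sees the original references is an assumption the summary does not settle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    video_id: str   # identifies the unchanged video (or pose) input
    reference: str  # original target-language sentence


def build_stage_datasets(samples, paraphrases, include_original_in_stage1=True):
    """Return (stage1, stage2) lists of (video_id, target_text) pairs.

    stage1: each video paired with every paraphrase of its reference
            (plus the reference itself if the flag is set; an assumption).
    stage2: each video paired only with its original reference.
    """
    stage1 = []
    for sample in samples:
        targets = list(paraphrases.get(sample.video_id, []))
        if include_original_in_stage1:
            targets.insert(0, sample.reference)
        seen = set()
        for text in targets:
            if text not in seen:  # drop paraphrases identical to an earlier target
                seen.add(text)
                stage1.append((sample.video_id, text))
    stage2 = [(sample.video_id, sample.reference) for sample in samples]
    return stage1, stage2


# Hypothetical toy data: two videos, paraphrases available for only one.
toy_samples = [
    Sample("v1", "tomorrow it will rain in the north"),
    Sample("v2", "the night stays dry"),
]
toy_paraphrases = {"v1": ["rain is expected in the north tomorrow"]}

stage1_pairs, stage2_pairs = build_stage_datasets(toy_samples, toy_paraphrases)
print(len(stage1_pairs), len(stage2_pairs))
```

The video side is never touched: only the text paired with each `video_id` changes between stages, which is what makes the augmentation target-side.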

The results reveal important nuances: PHOENIX14T showed measurable improvements because its moderate lexical diversity benefits from paraphrase exposure. However, GSL's controlled, repetitive nature and LSA-T's extreme sparsity exposed the approach's boundaries. These limitations suggest that augmentation works best in the middle ground: datasets large enough to show signal but constrained enough that diversity helps generalization. The introduction of LLM-as-a-Judge evaluation is noteworthy, because semantic correctness matters more than surface-level word overlap in accessibility applications, where meaning preservation is critical.
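To illustrate why surface overlap can understate semantic correctness, the sketch below computes clipped n-gram precision, the building block of BLEU, for a hypothesis that rewords its reference. The sentences are invented for illustration and do not come from any of the datasets; `clipped_precision` is a simplified helper, not a full BLEU implementation (no brevity penalty, no smoothing).

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams that also occur in the reference,
    with each n-gram's credit capped at its count in the reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


# Invented example: same meaning, different word order and phrasing.
reference = "tomorrow it will rain in the north".split()
hypothesis = "rain is expected in the north tomorrow".split()

for n in (1, 2, 4):
    print(f"{n}-gram precision: {clipped_precision(hypothesis, reference, n):.2f}")
# 1-gram precision: 0.71
# 2-gram precision: 0.33
# 4-gram precision: 0.00
```

With no matching 4-grams, an unsmoothed BLEU-4 for this pair would be zero even though the meaning is preserved, which is the kind of gap a semantic judge is meant to catch.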

Key Takeaways
  • LLM-generated paraphrases improved German Sign Language translation on PHOENIX14T by 0.77 BLEU-4 points (9.56 to 10.33) through target-side augmentation, without synthetic video generation.
  • The approach shows clear limits on highly controlled or extremely sparse datasets, suggesting that augmentation helps mainly in moderately constrained language domains.
  • Semantic evaluation using LLM judges revealed translation quality gains that traditional lexical metrics underestimate, highlighting a mismatch between lexical metrics and semantic quality.
  • This is the first documented application of LLM paraphrasing specifically for sign language translation, establishing a new baseline approach.
  • Two-stage training combining diverse augmented data with original references preserved semantic fidelity better than augmentation alone.