y0news

#text-analysis News & Analysis

14 articles tagged with #text-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 2

Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text

Researchers developed Learn-to-Distance (L2D), an algorithm that detects AI-generated text from models such as GPT, Claude, and Gemini. The method learns an adaptive distance between original and rewritten text and, in extensive testing, achieves relative improvements of 54.3% to 75.4% over existing detection methods.

AI · Neutral · arXiv – CS AI · 3d ago · 6/10

Show, Don't TELL: Explainable AI-Generated Text Detection

Researchers have developed TELL, an AI-generated text detector that prioritizes explainability by showing users the specific linguistic markers indicating AI or human authorship rather than just providing an opaque numerical score. The system achieves competitive detection performance (AUROC 0.927) while generating human-evaluated explanations with a 72.3% mean win-rate across quality metrics, fundamentally reframing detection as a human-centric interpretability problem.

AI · Neutral · arXiv – CS AI · May 12 · 5/10

Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

Researchers evaluate semantic search as a tool for analyzing 18th-century intellectual history, specifically tracking how John Locke's ideas circulated through paraphrases and implicit references. While semantic search substantially outperforms traditional lexical methods at capturing meaning-level correspondences, linguistic analysis reveals that retrieval remains constrained by surface-level vocabulary overlap, suggesting both promise and limitations for historical corpus analysis.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

The TEA Nets framework combines AI and cognitive network science to model targets, events and actors in text

Researchers introduce TEA Nets (Target-Event-Agent Networks), an open-source AI framework that extracts subjects, verbs, and objects from text to analyze emotional and semantic patterns. Testing across conspiracy narratives and psychotherapy transcripts reveals that highly conspiratorial texts link personal pronouns to actions twice as frequently as low-conspiracy texts, while LLMs express emotions with measurably lower intensity than humans.

🧠 Claude
AI · Neutral · arXiv – CS AI · Apr 6 · 6/10

DocShield: Towards AI Document Safety via Evidence-Grounded Agentic Reasoning

Researchers introduce DocShield, a new AI framework that uses evidence-based reasoning to detect text-based image forgeries in documents. The system combines visual and logical analysis to identify, locate, and explain document manipulations, showing significant improvements over existing detection methods.

🧠 GPT-4
AI · Neutral · arXiv – CS AI · Mar 17 · 4/10

Interpretable Predictability-Based AI Text Detection: A Replication Study

Researchers replicated and improved upon an AI text detection system from the AuTexTification 2023 shared task, adding stylometric features and newer language models like Qwen and mGPT. The study achieved comparable or better performance than language-specific models while emphasizing the importance of clear documentation for reliable AI research replication.

🏢 Meta
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding

TopicENA is a new framework that combines BERTopic with Epistemic Network Analysis to automatically analyze concept relationships in large text datasets without manual coding. The research demonstrates that automated topic modeling can replace expert manual coding while maintaining analytical quality, making network analysis scalable for large corpora.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

Researchers developed a novel approach using instruction-tuned Large Language Models to improve argumentative component detection in text analysis. The method reframes the task as language generation rather than traditional sequence labeling, achieving superior performance on standard benchmarks compared to existing state-of-the-art systems.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 4

Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum

Researchers analyzed over 3.5 million posts from a major cybercrime forum, finding that 25% of initial posts contain explicit crime-related content and over one-third of users disclose criminal activity. The study used large language models to classify content and revealed that most users show restraint by gradually escalating disclosure through ambiguous 'grey' content before explicit criminal posts.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3

Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts

Researchers tested GPT-5's ability to perform citation context analysis by examining how different prompt designs affect the model's interpretative readings of academic citations. The study found that while GPT-5 produces consistent surface classifications, prompt scaffolding significantly influences which interpretative frameworks and vocabularies the model emphasizes in deeper analysis.

AI · Neutral · Google DeepMind Blog · Oct 24 · 4/10 · 8

Aeneas transforms how historians connect the past

Aeneas is a new AI model designed to help historians contextualize and interpret ancient inscriptions by assisting with attribution and restoration of fragmentary historical texts. This represents a specialized application of AI technology for academic research in historical studies.

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6

Rough Sets for Explainability of Spectral Graph Clustering

Researchers propose an enhanced methodology using rough set theory to improve explainability of Graph Spectral Clustering (GSC) algorithms. The approach addresses challenges in explaining clustering results, particularly when applied to text documents where spectral space embeddings lack clear relation to content.