#text-analysis News & Analysis

17 articles tagged with #text-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 2

Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text

Researchers developed Learn-to-Distance (L2D), an algorithm for detecting text generated by models such as GPT, Claude, and Gemini. The method adaptively learns a distance between the original text and a rewritten version of it, and reports relative improvements of 54.3% to 75.4% over existing detection methods across extensive testing.
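For readers unfamiliar with the term, a relative improvement measures the gain against the baseline's own score rather than as a raw difference. The sketch below assumes the common definition (improved − baseline) / baseline; it is a generic illustration, not the paper's evaluation code, and the function name and example scores are hypothetical rather than taken from the L2D results.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


# Hypothetical scores, chosen only to illustrate the arithmetic.
baseline_score = 0.40
improved_score = 0.70
print(f"{relative_improvement(baseline_score, improved_score):.1%}")
```

Under this definition, the same absolute gain yields a larger relative improvement when the baseline score is low, which is worth keeping in mind when reading headline percentages.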

AI · Neutral · arXiv – CS AI · Jun 23 · 5/10

The Model as One Rater Among Several: Measuring Political Positions in Data-Sparse Regions with a Language-Model Panel

Researchers propose measuring political positions in data-sparse regions by treating large language models as fallible raters within a panel rather than as standalone measurement devices. The approach achieves a Krippendorff's alpha of 0.86 across nine models and shows that written axis definitions improve inter-rater agreement, though the method still requires human validation.
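As background on the reliability figure, Krippendorff's alpha is defined as 1 − D_o / D_e, where D_o is the observed disagreement among raters within each rated item and D_e is the disagreement expected if ratings were assigned to items at random. The sketch below is a minimal pure-Python version that assumes interval-scale ratings (squared-difference distance); the summary does not say which level of measurement the paper uses, and the function name and toy ratings are hypothetical.

```python
from itertools import combinations


def krippendorff_alpha_interval(units):
    """Krippendorff's alpha for interval data.

    `units` is a list of lists; each inner list holds the ratings that
    different raters gave to one item (missing ratings are simply omitted).
    """
    # Items with fewer than two ratings carry no agreement information.
    units = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in units)
    if n < 2:
        raise ValueError("need at least one item with two or more ratings")

    # Observed disagreement: squared differences within each item,
    # weighted by 1 / (m_u - 1). The factor 2 counts ordered pairs.
    d_o = 0.0
    for u in units:
        pair_sum = sum((a - b) ** 2 for a, b in combinations(u, 2))
        d_o += 2 * pair_sum / (len(u) - 1)
    d_o /= n

    # Expected disagreement: squared differences over all pairs of ratings,
    # ignoring which item each rating belongs to.
    values = [v for u in units for v in u]
    d_e = 2 * sum((a - b) ** 2 for a, b in combinations(values, 2)) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("all ratings are identical; alpha is undefined")
    return 1 - d_o / d_e


# Hypothetical panel: three items, each scored by three raters.
toy_ratings = [[1, 1, 2], [4, 5, 5], [8, 8, 9]]
print(round(krippendorff_alpha_interval(toy_ratings), 3))
```

An alpha of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so a value such as 0.86 sits toward the reliable end of the scale.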

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

UnBias-Plus: Detect, Explain, and Rewrite Bias

Researchers have released UnBias-Plus, an open-source toolkit designed to detect, explain, and rewrite bias in natural language across human-written and AI-generated content. The platform offers multi-class bias classification, span localization, neutral text rewriting, and interpretable reasoning, addressing a significant gap in bias mitigation tools with publicly available models and multiple interface options.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Implicit Causal Graph Construction in Text via Chain Discovery

Researchers develop a novel method for constructing implicit causal graphs from text by using large language models to infer intermediate causal events between observed cause-effect pairs. The study compares multiple approaches including chain discovery and iterative search processes, validated against a curated database of 1,560 scientifically verified causal relationships.

AI · Neutral · arXiv – CS AI · May 28 · 6/10

Show, Don't TELL: Explainable AI-Generated Text Detection

Researchers have developed TELL, an AI-generated text detector that prioritizes explainability: it shows users the specific linguistic markers that indicate AI or human authorship instead of returning only an opaque numerical score. The system achieves competitive detection performance (AUROC 0.927), and its explanations reach a 72.3% mean win-rate across quality metrics in human evaluation, framing detection as a human-centric interpretability problem.
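To make the AUROC figure concrete: AUROC equals the probability that a randomly chosen positive example (here, an AI-written text) receives a higher detector score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It is illustrative only and unrelated to TELL's implementation; the function name, labels, and scores are hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for positive examples and 0 for negative ones;
    `scores` holds the detector's score for each example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical detector scores: 1 = AI-written, 0 = human-written.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(example_labels, example_scores))  # 0.75
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; rank-based formulations give the same value more efficiently on large evaluation sets.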

AI · Neutral · arXiv – CS AI · May 12 · 5/10

Matching Meaning at Scale: Evaluating Semantic Search for 18th-Century Intellectual History through the Case of Locke

Researchers evaluate semantic search as a tool for analyzing 18th-century intellectual history, specifically tracking how John Locke's ideas circulated through paraphrases and implicit references. While semantic search substantially outperforms traditional lexical methods at capturing meaning-level correspondences, linguistic analysis reveals that retrieval remains constrained by surface-level vocabulary overlap, suggesting both promise and limitations for historical corpus analysis.

AI · Neutral · arXiv – CS AI · May 1 · 6/10

The TEA Nets framework combines AI and cognitive network science to model targets, events and actors in text

Researchers introduce TEA Nets (Target-Event-Agent Networks), an open-source AI framework that extracts subjects, verbs, and objects from text to analyze emotional and semantic patterns. Testing across conspiracy narratives and psychotherapy transcripts reveals that highly conspiratorial texts link personal pronouns to actions twice as frequently as low-conspiracy texts, while LLMs express emotions with measurably lower intensity than humans.

🧠 Claude

AI · Neutral · arXiv – CS AI · Apr 6 · 6/10

DocShield: Towards AI Document Safety via Evidence-Grounded Agentic Reasoning

Researchers introduce DocShield, a new AI framework that uses evidence-based reasoning to detect text-based image forgeries in documents. The system combines visual and logical analysis to identify, locate, and explain document manipulations, showing significant improvements over existing detection methods.

🧠 GPT-4

AI · Neutral · arXiv – CS AI · Mar 17 · 4/10

Interpretable Predictability-Based AI Text Detection: A Replication Study

Researchers replicated and improved upon an AI text detection system from the AuTexTification 2023 shared task, adding stylometric features and newer language models like Qwen and mGPT. The study achieved comparable or better performance than language-specific models while emphasizing the importance of clear documentation for reliable AI research replication.

🏢 Meta

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding

TopicENA is a new framework that combines BERTopic with Epistemic Network Analysis to automatically analyze concept relationships in large text datasets without manual coding. The research demonstrates that automated topic modeling can replace expert manual coding while maintaining analytical quality, making network analysis scalable for large corpora.

AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3

Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

Researchers developed a novel approach using instruction-tuned Large Language Models to improve argumentative component detection in text analysis. The method reframes the task as language generation rather than traditional sequence labeling, achieving superior performance on standard benchmarks compared to existing state-of-the-art systems.

AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 4

Assessing Crime Disclosure Patterns in a Large-Scale Cybercrime Forum

Researchers analyzed over 3.5 million posts from a major cybercrime forum, finding that 25% of initial posts contain explicit crime-related content and over one-third of users disclose criminal activity. The study used large language models to classify content and revealed that most users show restraint by gradually escalating disclosure through ambiguous 'grey' content before explicit criminal posts.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3

When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation

Researchers introduce Topic Word Mixing (TWM), a new human evaluation method for assessing topic models in specialized domains. The study reveals misalignment between automated metrics and human judgment, particularly in domain-specific corpora like philosophy of science publications.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3

Scaling In, Not Up? Testing Thick Citation Context Analysis with GPT-5 and Fragile Prompts

Researchers tested GPT-5's ability to perform citation context analysis by examining how different prompt designs affect the model's interpretative readings of academic citations. The study found that while GPT-5 produces consistent surface classifications, prompt scaffolding significantly influences which interpretative frameworks and vocabularies the model emphasizes in deeper analysis.

AI · Neutral · Google DeepMind Blog · Oct 24 · 4/10 · 8

Aeneas transforms how historians connect the past

Aeneas is a new AI model designed to help historians contextualize and interpret ancient inscriptions by assisting with attribution and restoration of fragmentary historical texts. This represents a specialized application of AI technology for academic research in historical studies.

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 5

ARGUS: Seeing the Influence of Narrative Features on Persuasion in Argumentative Texts

Researchers introduce ARGUS, a framework for studying how narrative features influence persuasion in online arguments. The study analyzes a ChangeMyView corpus using both traditional classifiers and large language models to identify which storytelling elements make arguments more convincing.

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6

Rough Sets for Explainability of Spectral Graph Clustering

Researchers propose an enhanced methodology using rough set theory to improve explainability of Graph Spectral Clustering (GSC) algorithms. The approach addresses challenges in explaining clustering results, particularly when applied to text documents where spectral space embeddings lack clear relation to content.