11 articles tagged with #text-analysis. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 2
Researchers developed a new algorithm called Learn-to-Distance (L2D) that can detect AI-generated text from models like GPT, Claude, and Gemini with significantly improved accuracy. The method uses adaptive distance learning between original and rewritten text, achieving 54.3% to 75.4% relative improvements over existing detection methods across extensive testing.
AI · Neutral · arXiv – CS AI · Apr 6 · 6/10
Researchers introduce DocShield, a new AI framework that uses evidence-based reasoning to detect text-based image forgeries in documents. The system combines visual and logical analysis to identify, locate, and explain document manipulations, showing significant improvements over existing detection methods.
GPT-4
AI · Neutral · arXiv – CS AI · Mar 17 · 4/10
Researchers replicated and improved upon an AI text detection system from the AuTexTification 2023 shared task, adding stylometric features and newer language models like Qwen and mGPT. The study achieved comparable or better performance than language-specific models while emphasizing the importance of clear documentation for reliable AI research replication.
Meta
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
TopicENA is a new framework that combines BERTopic with Epistemic Network Analysis to automatically analyze concept relationships in large text datasets without manual coding. The research demonstrates that automated topic modeling can replace expert manual coding while maintaining analytical quality, making network analysis scalable for large corpora.
AI · Neutral · arXiv – CS AI · Mar 4 · 4/10 · 3
Researchers developed a novel approach using instruction-tuned Large Language Models to improve argumentative component detection in text analysis. The method reframes the task as language generation rather than traditional sequence labeling, achieving superior performance on standard benchmarks compared to existing state-of-the-art systems.
AI · Neutral · arXiv – CS AI · Mar 3 · 5/10 · 4
Researchers analyzed over 3.5 million posts from a major cybercrime forum, finding that 25% of initial posts contain explicit crime-related content and over one-third of users disclose criminal activity. The study used large language models to classify content and revealed that most users show restraint by gradually escalating disclosure through ambiguous 'grey' content before explicit criminal posts.
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 3
Researchers introduce Topic Word Mixing (TWM), a new human evaluation method for assessing topic models in specialized domains. The study reveals misalignment between automated metrics and human judgment, particularly in domain-specific corpora like philosophy of science publications.
AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3
Researchers tested GPT-5's ability to perform citation context analysis by examining how different prompt designs affect the model's interpretative readings of academic citations. The study found that while GPT-5 produces consistent surface classifications, prompt scaffolding significantly influences which interpretative frameworks and vocabularies the model emphasizes in deeper analysis.
AI · Neutral · Google DeepMind Blog · Oct 24 · 4/10 · 8
Aeneas is a new AI model designed to help historians contextualize and interpret ancient inscriptions by assisting with attribution and restoration of fragmentary historical texts. This represents a specialized application of AI technology for academic research in historical studies.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 5
Researchers introduce ARGUS, a framework for studying how narrative features influence persuasion in online arguments. The study analyzes a ChangeMyView corpus using both traditional classifiers and large language models to identify which storytelling elements make arguments more convincing.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6
Researchers propose an enhanced methodology using rough set theory to improve explainability of Graph Spectral Clustering (GSC) algorithms. The approach addresses challenges in explaining clustering results, particularly when applied to text documents where spectral space embeddings lack clear relation to content.