TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval Paper • 2511.16528 • Published 20 days ago • 17
Parrot: Persuasion and Agreement Robustness Rating of Output Truth -- A Sycophancy Robustness Benchmark for LLMs Paper • 2511.17220 • Published 20 days ago • 17