TRLawBench
A Turkish law benchmark for LLMs that measures large language models' Turkish legal knowledge and reasoning across three stages. It comprises 297 questions covering ÖSYM exam items, advanced reasoning, and open-ended analysis.
Overview
TRLawBench is an open three-stage benchmark that measures how well large language models (GPT, Claude, Gemini, Gemma, Qwen, Grok, DeepSeek, Mistral, Llama and others) handle Turkish law. Stages 2 and 3, co-authored with attorney Esat Erbil Tavus, were compiled from print sources and never published online, eliminating training-data contamination.
Why TRLawBench?
- Turkish legal reasoning focus: Answers are evaluated against the Turkish Code of Obligations (TBK), the Turkish Penal Code (TCK), the Constitution, and related legislation, rather than for general language fluency.
- Contamination-free: 200 questions sourced from print, never indexed online, so they cannot appear in training data.
- Transparent methodology: Correct-answer detection is deterministic; the LLM judge scores only reasoning quality and is blind to the model identity.
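One way to keep a judge blind to model identity is to replace model names with opaque IDs and shuffle the responses before they are scored. The sketch below is illustrative only: the field names (`model`, `question_id`, `answer`), the `resp-0000` ID format, and the function `blind_submissions` are assumptions, not part of the TRLawBench pipeline described here.

```python
import random

def blind_submissions(submissions, seed=0):
    """Strip model names and shuffle responses before judging.

    `submissions` is a list of dicts with the hypothetical keys
    'model', 'question_id', and 'answer'. Returns (judge_items, key),
    where `key` maps each opaque ID back to the model name and is
    kept away from the judge.
    """
    rng = random.Random(seed)          # seeded so a run is reproducible
    order = list(range(len(submissions)))
    rng.shuffle(order)                 # judge should not infer identity from order

    judge_items, key = [], {}
    for n, idx in enumerate(order):
        sub = submissions[idx]
        blind_id = f"resp-{n:04d}"     # opaque ID, carries no model information
        key[blind_id] = sub["model"]
        judge_items.append({
            "blind_id": blind_id,
            "question_id": sub["question_id"],
            "answer": sub["answer"],
        })
    return judge_items, key
```

Only `judge_items` would be sent to the judge; scores are joined back to models through `key` afterwards. This does not help if the answer text itself names the model, so a real pipeline may also need to filter self-references.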
Evaluation Methodology
- Stage 1: Foundational Legal Knowledge. 97 multiple-choice questions from ÖSYM legal exams. Correct answers derived from the official ÖSYM answer key.
- Stage 2: Advanced Turkish Legal Reasoning. 100 advanced multiple-choice questions across 14 categories. Print-sourced, never online.
- Stage 3: Open-Ended Questions. 100 open-ended questions spanning 14 branches of law. Deterministic letter-match plus Gemini 3.1 Pro blind-judge scoring on a 0 to 10 scale.
Per-stage leaderboards, accuracy rates, and sample Q&A are available in the interactive panel above.
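The deterministic letter-match step for multiple-choice items can be sketched as follows. This is a minimal illustration under stated assumptions: the source does not describe the prompt or parsing rules, so the expected response format (a final "Answer: C" or "Cevap: C" marker) and the names `extract_letter` and `accuracy` are hypothetical.

```python
import re
from typing import Optional

# Assumption: each response contains a marker such as "Answer: C" or "Cevap: (C)".
# The keyword is matched case-insensitively; the letter itself must be uppercase A-E.
ANSWER_PATTERN = re.compile(r"(?i:\b(?:answer|cevap))\s*:\s*\(?([A-E])\b")

def extract_letter(response: str) -> Optional[str]:
    """Return the last answer letter found in the response, or None."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[-1] if matches else None

def accuracy(responses: list[str], answer_key: list[str]) -> float:
    """Fraction of responses whose extracted letter equals the key."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    correct = sum(extract_letter(r) == k for r, k in zip(responses, answer_key))
    return correct / len(answer_key)
```

Because the comparison is a plain string match against the answer key, the same responses always yield the same accuracy. A response with no parseable letter counts as incorrect in this sketch.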
ÖSYM Legal Questions
A 97-question evaluation set drawn from ÖSYM legal qualification exams. It measures models' foundational knowledge of Turkish law and their exam-style problem-solving.
Evaluation run in April 2025. Gemma 4 31B IT (reasoning on/off) results added in April 2026.
Score Comparison
Sample Question
- A. General terms and conditions that are foreign to the nature of the contract and the characteristics of the transaction are deemed unwritten.
- B. The provisions of the contract other than the general terms and conditions deemed unwritten remain valid.
- C. General terms and conditions may not contain provisions that, contrary to the rules of good faith, are to the detriment of the other party or aggravate that party's position.
- D. The fact that the texts of contracts drawn up for the same purpose are not identical does not prevent the provisions they contain from being regarded as general terms and conditions.
- E. If a provision in the general terms and conditions is not clear and understandable, or admits more than one meaning, it is interpreted against the drafter and in favor of the other party.