[1] Prof. Tareq N. Hashem, “Reasoning capabilities of large language models: a systematic review and meta-analysis of benchmarks, methods, and emergent behaviours (2018–2025)”, JAIMLNN, vol. 5, no. 1, pp. 63–75, Mar. 2025.