Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... arXiv preprint arXiv:2206.04615, 2022 | 744 | 2022 |
LIMA: Less is more for alignment C Zhou, P Liu, P Xu, S Iyer, J Sun, Y Mao, X Ma, A Efrat, P Yu, L Yu, ... Advances in Neural Information Processing Systems 36, 2024 | 406 | 2024 |
A simple and effective model for answering multi-span questions E Segal, A Efrat, M Shoham, A Globerson, J Berant EMNLP 2020, 2019 | 84* | 2019 |
SCROLLS: Standardized comparison over long language sequences U Shaham, E Segal, M Ivgi, A Efrat, O Yoran, A Haviv, A Gupta, W Xiong, ... EMNLP 2022, 2022 | 76 | 2022 |
The turking test: Can language models understand instructions? A Efrat, O Levy arXiv preprint arXiv:2010.11982, 2020 | 76 | 2020 |
ZeroSCROLLS: A zero-shot benchmark for long text understanding U Shaham, M Ivgi, A Efrat, J Berant, O Levy arXiv preprint arXiv:2305.14196, 2023 | 33 | 2023 |
LMentry: A language model benchmark of elementary language tasks A Efrat, O Honovich, O Levy arXiv preprint arXiv:2211.02069, 2022 | 16 | 2022 |
Cryptonite: A cryptic crossword benchmark for extreme ambiguity in language A Efrat, U Shaham, D Kilman, O Levy EMNLP 2021, 2021 | 8 | 2021 |
How Optimal is Greedy Decoding for Extractive Question Answering? O Castel, O Ram, A Efrat, O Levy AKBC 2022, 2021 | 1 | 2021 |