Naoyuki Kanda

Cited by

	All	Since 2019
Citations	4001	3592
h-index	29	27
i10-index	56	49

Citations per year
Year	Citations
2006	15
2007	20
2008	15
2009	15
2010	14
2011	18
2012	28
2013	34
2014	43
2015	31
2016	41
2017	61
2018	64
2019	107
2020	199
2021	489
2022	872
2023	1363
2024	553

Co-authors

Xiaofei Wang	Microsoft	Verified email at jhu.edu
Yusuke Fujita	LY Corp.	Verified email at linecorp.com
Shota Horiguchi	NTT Corporation	Verified email at ntt.com
Shinji Watanabe	Carnegie Mellon University	Verified email at cmu.edu
Hiroshi G Okuno	Professor Emeritus, Kyoto University, Adjunct Researcher, Waseda University	Verified email at nue.org
Kazunori Komatani	Professor, Osaka University	Verified email at sanken.osaka-u.ac.jp
Hiroshi Tsujino	Honda R&D Co., Ltd.	Verified email at jp.honda
Kazuhiro Nakadai	Tokyo Institute of Technology	Verified email at ra.sc.e.titech.ac.jp
Christoph Boeddeker	Paderborn University	Verified email at mail.upb.de
Aswin Shanmugam Subramanian	Microsoft	Verified email at microsoft.com
Vimal Manohar	Meta Platforms Inc.	Verified email at meta.com
Mikio Nakano	C4A Research Institute, Inc.	Verified email at c4a.jp
Tetsuya Ogata	Professor, Waseda University / Joint-appointed Fellow, AIST	Verified email at waseda.jp
Reinhold Haeb-Umbach	Professor of Communications Engineering, University of Paderborn	Verified email at nt.uni-paderborn.de
Jens Heitkaemper	Research Scientist, Google Inc	Verified email at google.com
Matthew Maciejewski	Johns Hopkins University	Verified email at mmaciejewski.com
Szu-Jui Chen	University of Texas at Dallas	Verified email at UTDallas.edu
Ruizhi Li	Microsoft	Verified email at microsoft.com
Leibny Paola Garcia	Johns Hopkins University	Verified email at jhu.edu
Chiori Hori	MERL	Verified email at merl.com

Naoyuki Kanda

Microsoft

Verified email at microsoft.com

Speech Recognition, Speech and Language Processing, Machine Learning


Title	Authors	Publication	Cited by	Year
WavLM: Large-scale self-supervised pre-training for full stack speech processing	S Chen, C Wang, Z Chen, Y Wu, S Liu, Z Chen, J Li, N Kanda, T Yoshioka, ...	IEEE Journal of Selected Topics in Signal Processing 16 (6), 1505-1518, 2022	976	2022
A review of speaker diarization: Recent advances with deep learning	TJ Park, N Kanda, D Dimitriadis, KJ Han, S Watanabe, S Narayanan	Computer Speech & Language 72, 101317, 2022	287	2022
CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings	S Watanabe, M Mandel, J Barker, E Vincent, A Arora, X Chang, ...	arXiv preprint arXiv:2004.09249, 2020	287	2020
End-to-end neural speaker diarization with self-attention	Y Fujita, N Kanda, S Horiguchi, Y Xue, K Nagamatsu, S Watanabe	2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019	230	2019
End-to-end neural speaker diarization with permutation-free objectives	Y Fujita, N Kanda, S Horiguchi, K Nagamatsu, S Watanabe	arXiv preprint arXiv:1909.05952, 2019	223	2019
Elastic spectral distortion for low resource speech recognition with deep neural networks	N Kanda, R Takeda, Y Obuchi	Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on …, 2013	143	2013
Internal language model estimation for domain-adaptive end-to-end speech recognition	Z Meng, S Parthasarathy, E Sun, Y Gaur, N Kanda, L Lu, X Chen, R Zhao, ...	2021 IEEE Spoken Language Technology Workshop (SLT), 243-250, 2021	95	2021
Serialized output training for end-to-end overlapped speech recognition	N Kanda, Y Gaur, X Wang, Z Meng, T Yoshioka	arXiv preprint arXiv:2003.12687, 2020	94	2020
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis	D Raj, P Denisov, Z Chen, H Erdogan, Z Huang, M He, S Watanabe, J Du, ...	2021 IEEE Spoken Language Technology Workshop (SLT), 897-904, 2021	83	2021
Guided source separation meets a strong ASR backend: Hitachi/Paderborn University joint investigation for dinner party ASR	N Kanda, C Boeddeker, J Heitkaemper, Y Fujita, S Horiguchi, ...	arXiv preprint arXiv:1905.12230, 2019	71	2019
Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers	N Kanda, Y Gaur, X Wang, Z Meng, Z Chen, T Zhou, T Yoshioka	arXiv preprint arXiv:2006.10930, 2020	70	2020
Microsoft speaker diarization system for the VoxCeleb Speaker Recognition Challenge 2020	X Xiao, N Kanda, Z Chen, T Zhou, T Yoshioka, S Chen, Y Zhao, G Liu, ...	ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	68	2021
A two-layer model for behavior and dialogue planning in conversational service robots	M Nakano, Y Hasegawa, K Nakadai, T Nakamura, J Takeuchi, T Torii, ...	2005 IEEE/RSJ International Conference on Intelligent Robots and Systems …, 2005	68	2005
Maximum a posteriori Based Decoding for CTC Acoustic Models	N Kanda, X Lu, H Kawai	Interspeech 2016, 1868-1872, 2016	54	2016
Multi-domain spoken dialogue system with extensibility and robustness against speech recognition errors	K Komatani, N Kanda, M Nakano, K Nakadai, H Tsujino, T Ogata, ...	Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue, 9-17, 2006	54	2006
The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays	N Kanda, R Ikeshita, S Horiguchi, Y Fujita, K Nagamatsu, X Wang, ...	Proc. CHiME-5, 6-10, 2018	53	2018
Internal language model training for domain-adaptive end-to-end speech recognition	Z Meng, N Kanda, Y Gaur, S Parthasarathy, E Sun, L Lu, X Chen, J Li, ...	ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021	47	2021
A multi-expert model for dialogue and behavior control of conversational robots and agents	M Nakano, Y Hasegawa, K Funakoshi, J Takeuchi, T Torii, K Nakadai, ...	Knowledge-Based Systems 24 (2), 248-256, 2011	47	2011
Face-voice matching using cross-modal embeddings	S Horiguchi, N Kanda, K Nagamatsu	Proceedings of the 26th ACM International Conference on Multimedia, 1011-1019, 2018	44	2018
Acoustic modeling for distant multi-talker speech recognition with single-and multi-channel branches	N Kanda, Y Fujita, S Horiguchi, R Ikeshita, K Nagamatsu, S Watanabe	ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019	43	2019
