|Efficient embedded computing|
WJ Dally, J Balfour, D Black-Schaffer, J Chen, RC Harting, V Parikh, J Park, ...
Computer 41 (7), 27-32, 2008
|Navigating the maze of graph analytics frameworks using massive graph datasets|
N Satish, N Sundaram, MMA Patwary, J Seo, J Park, MA Hassaan, ...
Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014
|Faster CNNs with direct sparse convolutions and guided pruning|
J Park, S Li, W Wen, PTP Tang, H Li, Y Chen, P Dubey
arXiv preprint arXiv:1608.01409, 2016
|Distributed SociaLite: A Datalog-based language for large-scale graph analysis|
J Seo, J Park, J Shin, MS Lam
Proceedings of the VLDB Endowment 6 (14), 1906-1917, 2013
|An energy-efficient processor architecture for embedded systems|
J Balfour, W Dally, D Black-Schaffer, V Parikh, JS Park
IEEE Computer Architecture Letters 7 (1), 29-32, 2008
|Glow: Graph lowering compiler techniques for neural networks|
N Rotem, J Fix, S Abdulrasool, G Catron, S Deng, R Dzhabarov, N Gibson, ...
arXiv preprint arXiv:1805.00907, 2018
|Deep learning recommendation model for personalization and recommendation systems|
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
|Sparsifying synchronization for high-performance shared-memory sparse triangular solver|
J Park, M Smelyanskiy, N Sundaram, P Dubey
International Supercomputing Conference, 124-140, 2014
|Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications|
J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ...
arXiv preprint arXiv:1811.09886, 2018
|FROSTT: The formidable repository of open sparse tensors and tools|
S Smith, JW Choi, J Li, R Vuduc, J Park, X Liu, G Karypis
|Efficient shared-memory implementation of high-performance conjugate gradient benchmark and its application to unstructured matrices|
J Park, M Smelyanskiy, K Vaidyanathan, A Heinecke, DD Kalamkar, X Liu, ...
SC'14: Proceedings of the International Conference for High Performance …, 2014
|CloudRAMSort: fast and efficient large-scale distributed RAM sort on shared-nothing cluster|
C Kim, J Park, N Satish, H Lee, P Dubey, J Chhugani
Proceedings of the 2012 ACM SIGMOD International Conference on Management of …, 2012
|Parallel efficient sparse matrix-matrix multiplication on multicore platforms|
MMA Patwary, NR Satish, N Sundaram, J Park, MJ Anderson, ...
International Conference on High Performance Computing, 48-57, 2015
|Tera-scale 1D FFT with low-communication algorithm and Intel® Xeon Phi™ coprocessors|
J Park, G Bikshandi, K Vaidyanathan, PTP Tang, P Dubey, D Kim
Proceedings of the International Conference on High Performance Computing …, 2013
|Hardware/software co-optimization to improve performance and energy for inter-VM communication for NFVs and other producer-consumer workloads|
R Wang, AJ Herdrich, YC Liu, HH Hum, JS Park, CJ Hughes, ...
US Patent App. 14/583,389, 2016
|Automating wavefront parallelization for sparse matrix computations|
A Venkat, MS Mohammadi, J Park, H Rong, R Barik, MM Strout, M Hall
SC'16: Proceedings of the International Conference for High Performance …, 2016
|Efficient backprojection-based synthetic aperture radar computation with many-core processors|
J Park, PTP Tang, M Smelyanskiy, D Kim, T Benson
SC'12: Proceedings of the International Conference on High Performance …, 2012
|A study of bfloat16 for deep learning training|
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
|Two-step approach to scheduling quantum circuits|
GG Guerreschi, J Park
Quantum Science and Technology 3 (4), 045003, 2018
|Improving concurrency and asynchrony in multithreaded MPI applications using software offloading|
K Vaidyanathan, DD Kalamkar, K Pamnany, JR Hammond, P Balaji, ...
SC'15: Proceedings of the International Conference for High Performance …, 2015