- Title
- Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) Publishes Two Papers at CGO and Chairs a Sess
- Date
- 2025.03.25
- Writer
- School of Advanced Computing (첨단컴퓨팅학부)
- Post Content
-
Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) Publishes Two Papers at CGO and Chairs a Session
Students from the Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) presented two papers at the International Symposium on Code Generation and Optimization 2025 (CGO ’25), one of the top international conferences in the field of compilers. In addition, Prof. Park, Yongjun (박영준) chaired the Architectures and Code Generation session at the conference.
The first paper, “CUrator: An Efficient LLM Execution Engine with Optimized Integration of CUDA Libraries,” proposes a technique for executing large language model (LLM) inference efficiently by leveraging the cuBLAS and CUTLASS libraries on various modern GPUs. The research demonstrates peak inference performance for LLMs across multiple GPUs and is expected to help guide the direction of next-generation optimization frameworks.
The second paper, “Accelerating LLMs using an Efficient GEMM Library and Target-Aware Optimizations on Real-world PIM Devices,” introduces an optimized GEMM library designed for Processing-in-Memory (PIM) architectures and proposes additional optimization techniques to accelerate LLM inference. The study uses PIM architectures to address the inference slowdown caused by the high data requirements of large language models and is expected to play an important role in overcoming this challenge.
- Attachments
- 250325-박영준교수님-썸네일.jpg