Announcements

Title
Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) Publishes Two Papers at CGO and Chairs a Session
Date
2025.03.25
Writer
School of Advanced Computing (첨단컴퓨팅학부)
Content

Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) Publishes Two Papers at CGO and Chairs a Session


Students from the Application-aware System Optimization Lab (Professor: Park, Yongjun (박영준)) presented two papers at the International Symposium on Code Generation and Optimization 2025 (CGO '25), one of the top international conferences in the field of compilers. Prof. Park, Yongjun (박영준) also chaired the Architectures and Code Generation session at the conference.


The first paper, “CUrator: An Efficient LLM Execution Engine with Optimized Integration of CUDA Libraries,” proposes a technique for efficient large language model (LLM) inference that leverages the cuBLAS and CUTLASS libraries on a range of modern GPUs. The research demonstrates peak LLM inference performance across multiple GPUs and is expected to guide the direction of next-generation optimization frameworks.


Paper link



The second paper, “Accelerating LLMs using an Efficient GEMM Library and Target-Aware Optimizations on Real-world PIM Devices,” introduces an optimized GEMM library designed for Processing-in-Memory (PIM) architectures and proposes additional optimization techniques to accelerate LLM inference. The study uses PIM architectures to address the inference slowdown caused by the high data requirements of LLMs.


Paper link
