October 28-29, 2024 | Tokyo, Japan
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit + AI_dev Japan 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is displayed in Japan Standard Time (UTC+9).
Monday October 28, 2024 14:50 - 15:30 JST
This talk will share the story of quantization, its rise in popularity, and its current status in the open-source community. We'll begin by reviewing key quantization papers, such as QLoRA by Tim Dettmers and GPTQ by Elias Frantar. Next, we'll demonstrate how quantization can be applied at various stages of model development, including pre-training, fine-tuning, and inference. Specifically, we'll share our experience in pre-training a 1.58-bit model, show how fine-tuning is achievable using PEFT + QLoRA, and discuss optimizing inference performance with torch.compile or custom kernels. Finally, we'll highlight efforts within the community to make quantized models more accessible, including how transformers incorporate state-of-the-art quantization schemes and how to run GGUF models from llama.cpp.
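As a toy illustration of the core idea behind the quantization methods the talk surveys (this is not code from the session, and the function names are illustrative), here is a minimal symmetric absmax int8 quantizer in NumPy: each weight is divided by a per-tensor scale, rounded to an 8-bit integer, and later multiplied back to approximate the original float.

```python
import numpy as np

def quantize_int8(x):
    # Symmetric absmax quantization: pick one scale so the largest
    # magnitude in the tensor maps to the int8 limit (+/-127).
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover an approximation of the original floats.
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize_int8(q, scale)
# Rounding error is bounded by half the scale per element.
max_err = float(np.abs(x - x_hat).max())
```

Real schemes such as GPTQ or QLoRA's NF4 are more sophisticated (per-group scales, non-uniform codebooks, error compensation), but they share this quantize/dequantize structure.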
Speakers

Marc Sun

Machine Learning Engineer, HuggingFace
Marc is an ML Engineer on the Open Source team at Hugging Face, where he collaborates with researchers and developers to add exciting new features and has contributed to various libraries in the HF ecosystem (transformers, accelerate, PEFT). Marc is also deeply... Read More →
Hall B (4)
