October 28-29, 2024 | Tokyo, Japan
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit + AI_dev Japan 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Japan Standard Time (UTC +9). To see the schedule in your preferred timezone, please select from the drop-down located at the bottom of the menu to the right.
Monday, October 28
 

11:15 JST

Democratizing Diffusion Models with Diffusers - Sayak Paul, Hugging Face
Monday October 28, 2024 11:15 - 11:55 JST
The talk “Democratizing Diffusion Models with Diffusers” will explore the diverse applications of the open-source Python library Diffusers in the image and video generation space. It will showcase how Diffusers, built on diffusion models, enables fast, high-quality image and video generation, making it accessible to a wide range of users. The presentation will cover various use cases, including image inpainting, image editing, and scene composition, demonstrating how Diffusers enables users to create and edit photo-realistic images with minimal effort. The audience will gain insights into the potential of Diffusers to revolutionize the way images and videos are generated and edited, making it a must-attend session for anyone interested in the latest advancements in this field.
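As a taste of what the library enables, here is a minimal sketch of text-to-image generation with diffusers; the checkpoint and prompt are illustrative choices, not taken from the talk.

```python
# A minimal text-to-image sketch with the diffusers library.
# The checkpoint ID and prompt below are example choices.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # example checkpoint
    torch_dtype=torch.float16,
)
pipe.to("cuda")

image = pipe(
    "a watercolor painting of Tokyo at dusk",  # example prompt
    num_inference_steps=30,
).images[0]
image.save("tokyo.png")
```

The same pipeline family also exposes inpainting and image-to-image variants, which is the breadth of use cases the session covers.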
Speakers

Sayak Paul

Research Engineer, Hugging Face
Sayak works on diffusion models at Hugging Face, focusing on training them, maintaining the diffusers library, and leading some applied research efforts. Outside of work, he likes to binge-watch ICML tutorials and Suits.
Monday October 28, 2024 11:15 - 11:55 JST
Hall B (4)

12:05 JST

Data Prep Kit: A Comprehensive Cloud-Native Toolkit for Scalable Data Preparation in GenAI App - Daiki Tsuzuku & Takuya Goto, IBM
Monday October 28, 2024 12:05 - 12:45 JST
Every conversation on AI starts with models and ends with data. Data preparation is emerging as a very important phase of the GenAI journey, as high-quantity, high-quality text and code corpora for GenAI model training have been shown to play a crucial role in producing high-performing Large Language Models (LLMs). The data preparation phase in the Generative AI lifecycle aims to clean, filter, and transform the datasets of text and code that are acquired from various sources into a tokenized form suitable for the training of LLMs, be it pre-training or constructing LLM apps via fine-tuning or instruct tuning. The latter poses unique challenges, as each use case may necessitate tailored data preparation approaches. Given the enduring and evolving demand for data preparation techniques in LLM applications, we are introducing Data Prep Kit (DPK) as an open-source software asset. This endeavour is geared towards fostering collaborative efforts within the community, enabling collective development and utilization, and ultimately reducing time to value. DPK has been instrumental in powering the IBM open-source Granite models.
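To make the kind of step such a toolkit chains together concrete, here is a hedged sketch of two common cleaning transforms, exact deduplication and length filtering, in plain Python. This is not the Data Prep Kit API; the function names and document format are hypothetical.

```python
# Illustrative data-prep transforms; NOT the Data Prep Kit API.
# Documents are assumed to be dicts with a "text" field.
import hashlib

def exact_dedup(docs):
    """Drop documents whose text is byte-identical to one already seen."""
    seen = set()
    for doc in docs:
        digest = hashlib.sha256(doc["text"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield doc

def length_filter(docs, min_chars=200):
    """Drop documents too short to be useful as training text."""
    for doc in docs:
        if len(doc["text"]) >= min_chars:
            yield doc

corpus = [{"text": "example document " * 20}, {"text": "too short"}]
cleaned = list(length_filter(exact_dedup(corpus)))
print(len(cleaned))  # 1: the short document is filtered out
```

Real pipelines chain dozens of such transforms (language ID, PII removal, fuzzy dedup, tokenization) at scale, which is where a cloud-native toolkit earns its keep.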
Speakers

Takuya Goto

Software Engineer, IBM
Takuya is a software engineer at IBM, where he works on software product development and open-source development. Takuya specializes in NLP, ML, and text-based data processing. In his free time, Takuya likes running and traveling with his wife and son.

Daiki Tsuzuku

Software Developer, IBM
I have been working at IBM as a software developer for about 7 years. I have been the backend developer, and sometimes the frontend developer, of Watson Explorer, Watson Discovery, and watsonx Orchestrate. My focus is developing applications that process a wide variety and large volume…
Monday October 28, 2024 12:05 - 12:45 JST
Hall B (4)

14:00 JST

Optimize Your AI Cloud Infrastructure: A Hardware Perspective - Liang Yan, CoreWeave
Monday October 28, 2024 14:00 - 14:40 JST
The GPU cloud has become a ubiquitous component of contemporary AI infrastructure, especially for distributed machine learning scenarios. While conversations around AI infrastructure optimization typically revolve around the application layer, such as machine learning tasks and distributed job schedulers, looking under the hood of the GPU cloud is essential. Numerous factors, including the pod scheduler, device plugins, GPU/NUMA topology, the RoCE/NCCL stack, and more, can significantly impact performance.

This session will thoroughly explore the tuning of various machine learning models (CNN/RNN/Transformer) from MLPerf, using an H100 cluster as a reference. We will analyze the correlation between model performance and device and operator configuration on nodes, presenting first-hand experimental results to unveil the hidden potential within a Kubernetes GPU cloud.
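As one example of the kind of measurement that exposes these hardware effects, here is a hedged sketch of an NCCL all-reduce micro-benchmark using torch.distributed; it is not the speaker's tooling, and the buffer size and iteration counts are arbitrary.

```python
# All-reduce bandwidth micro-benchmark; launch with, e.g.:
#   torchrun --nproc_per_node=8 bench.py
# Topology effects (NUMA placement, NIC affinity, NVLink vs. PCIe)
# show up directly in the measured time per iteration.
import os
import time
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")
local_rank = int(os.environ.get("LOCAL_RANK", "0"))
torch.cuda.set_device(local_rank)

tensor = torch.ones(64 * 1024 * 1024, device="cuda")  # 256 MiB of fp32

for _ in range(5):  # warm-up iterations
    dist.all_reduce(tensor)
torch.cuda.synchronize()

iters = 20
start = time.time()
for _ in range(iters):
    dist.all_reduce(tensor)
torch.cuda.synchronize()
elapsed_ms = (time.time() - start) / iters * 1000

if dist.get_rank() == 0:
    size_gib = tensor.numel() * tensor.element_size() / 2**30
    print(f"all-reduce of {size_gib:.2f} GiB: {elapsed_ms:.1f} ms/iter")
dist.destroy_process_group()
```

Re-running the same script under different pod placements or NCCL environment settings (e.g. NCCL_IB_HCA pinning) is a quick way to quantify the factors listed above.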
Speakers

Liang Yan

Sr. Software Engineer, CoreWeave
Liang Yan is a senior software engineer at CoreWeave, specializing in AI infrastructure, heterogeneous architecture acceleration, and distributed machine learning systems in the cloud. He collaborates closely with upstream communities and leading vendors like NVIDIA, AMD, and Arm, delivering…
Monday October 28, 2024 14:00 - 14:40 JST
Hall B (4)

14:50 JST

A Next-generation IoT Platform for Edge AI Apps Leveraging Sensors and Wasm - Munehiro Shimomura, Sony Semiconductor Solutions Corporation & Kenji Shimizu, Midokura
Monday October 28, 2024 14:50 - 15:30 JST
In this session, we will introduce the construction of a comprehensive platform that uses Edge AI and sensors to cover everything from devices to the cloud. The platform enables advanced cooperation between sensors and AI control, and emphasizes seamless and dynamic replacement of AI models by using WebAssembly (Wasm). Furthermore, through open sourcing, we aim to expand the ecosystem and form a technical community. Through technical details and real-world scenarios, we will provide insights that participants can apply to their own projects.
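To illustrate the dynamic-replacement idea at the heart of the platform, here is a minimal sketch using the wasmtime Python runtime; the tiny WAT module stands in for a real AI post-processing component, and nothing here is the platform's actual API.

```python
# Hot-loading a Wasm module without restarting the host process.
# The exported "scale" function is a stand-in for real model logic.
from wasmtime import Engine, Store, Module, Instance

WAT = """
(module
  (func (export "scale") (param i32) (result i32)
    local.get 0
    i32.const 2
    i32.mul))
"""

engine = Engine()
store = Store(engine)

def load(wat_text):
    """Compile and instantiate a module; call again with new text to swap it."""
    module = Module(engine, wat_text)
    return Instance(store, module, [])

instance = load(WAT)
scale = instance.exports(store)["scale"]
print(scale(store, 21))  # -> 42
```

Because Wasm modules are sandboxed and portable, the same swap can happen on a constrained edge device, which is what makes the seamless model replacement described above feasible.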
Speakers

Kenji Shimizu

Manager, Midokura
After spending more than 20 years at a Japanese telecom and mobile company as an R&D engineering researcher and manager, he was inspired by Midokura's vision and joined to play a role in expanding the open source ecosystem for an edge distributed AI sensing infrastructure, which…

Munehiro Shimomura

Open Source Program Manager, Sony Semiconductor Solutions Corporation
Munehiro is the Open Source Program Manager of the division's OSPO, where he leads open source strategy development and execution. He believes it is important to create a culture in which organizations can strategically and proactively utilize open source, and is working hard to achieve…
Monday October 28, 2024 14:50 - 15:30 JST
Meeting Room 1

14:50 JST

Unlocking Local LLMs with Quantization - Marc Sun, Hugging Face
Monday October 28, 2024 14:50 - 15:30 JST
This talk will share the story of quantization, its rise in popularity, and its current status in the open-source community. We'll begin by reviewing key quantization papers, such as QLoRA by Tim Dettmers and GPTQ by Elias Frantar. Next, we'll demonstrate how quantization can be applied at various stages of model development, including pre-training, fine-tuning, and inference. Specifically, we'll share our experience in pre-training a 1.58-bit model, show how fine-tuning is achievable using PEFT + QLoRA, and discuss optimizing inference performance with torch.compile or custom kernels. Finally, we'll highlight efforts within the community to make quantized models more accessible, including how transformers incorporate state-of-the-art quantization schemes and how to run GGUF models from llama.cpp.
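As a concrete taste of inference-time quantization, here is a hedged sketch of loading a causal LM in 4-bit NF4 via transformers and bitsandbytes; the model ID is an example, not one named in the talk.

```python
# 4-bit NF4 loading with transformers + bitsandbytes (QLoRA-style config).
# The checkpoint below is an example; any causal LM on the Hub works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # NF4 data type from the QLoRA paper
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16 for speed/stability
)

model_id = "mistralai/Mistral-7B-v0.1"       # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Quantization lets local hardware", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```

A 7B model that needs roughly 14 GB in fp16 fits in about 4-5 GB this way, which is what puts local LLMs within reach of consumer GPUs.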
Speakers

Marc Sun

Machine Learning Engineer, Hugging Face
Marc is an ML Engineer on the Open Source team at Hugging Face, where he collaborates with researchers and developers to add exciting new features to the HF ecosystem and has contributed to various libraries in it (transformers, accelerate, PEFT). Marc is also deeply…
Monday October 28, 2024 14:50 - 15:30 JST
Hall B (4)

15:40 JST

From Design to Launch: Implementing AI Governance at Scale - Nathália Kuromiya & Martin Winkler, Google
Monday October 28, 2024 15:40 - 16:20 JST
What does it take to design, implement, and roll out a comprehensive AI governance program? What challenges and opportunities do companies encounter when scaling AI governance across diverse products and AI applications, and how can AI governance programs be designed to stay agile in a dynamic technological and regulatory environment? This session offers insights on scaling AI governance programs progressively across multiple jurisdictions while keeping them agile, along with practical learnings, effective solutions, and challenges to watch out for. Topics discussed:
- What are the building blocks of your company’s AI Governance program?
- What challenges did we face when building an AI Governance program?
- The AI Governance landscape is evolving rapidly, through both technical innovation and regulatory advances. How can you keep an agile approach to AI governance?
- What kinds of risks are front and center when building an AI Governance program? What kinds of opportunities?
Speakers

Martin Winkler

Software Engineer, Google
Martin Winkler is a Software Engineer at Google on the Privacy, Safety and Security team. He works as a lead for privacy and governance tooling and has tackled cross-company privacy challenges to ensure the safety of users and their data. Additionally, he is establishing company…

Nathália Kuromiya

Software Engineer, Google
Nathália Harumi Kuromiya is a Software Engineer at Google on the Privacy, Safety and Security team. She works as a lead for privacy and governance tooling and has served as a privacy reviewer. For the past year, she has been working in the AI safety space as one of the technical contributors…
Monday October 28, 2024 15:40 - 16:20 JST
Hall B (4)

16:40 JST

Leveraging Zephyr and ML to Bring Smart Devices to Market, Faster - Kate Stewart, The Linux Foundation
Monday October 28, 2024 16:40 - 17:20 JST
Endpoint devices are resource constrained, in terms of power, memory, or communication capabilities - sometimes all three. However, applying machine learning on these endpoint devices is possible, and when done strategically it enables system-wide efficiencies to be realized. This talk will explore the requirements and tradeoffs to consider for such systems when using the Zephyr RTOS and TensorFlow Lite for Microcontrollers projects.
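For context on the workflow such a system implies, here is a hedged sketch of the host-side step: converting a small Keras model into a quantized TensorFlow Lite flatbuffer of the kind TensorFlow Lite for Microcontrollers executes on a Zephyr target. The model architecture is a placeholder; a real deployment would train on sensor data first.

```python
# Convert a tiny Keras model to a quantized .tflite flatbuffer.
# The resulting file is compiled into the Zephyr firmware image and
# interpreted on-device by TensorFlow Lite for Microcontrollers.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),              # e.g. a window of sensor readings
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),  # e.g. 4 gesture classes
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
print(f"flatbuffer size: {len(tflite_model)} bytes")
```

Keeping the flatbuffer small matters because it typically lives in on-chip flash alongside the RTOS image.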
Speakers

Kate Stewart

VP Dependable Embedded Systems, Linux Foundation
Kate Stewart works with the safety, security and license compliance communities to advance the adoption of best practices into embedded open source projects. Since joining The Linux Foundation, she has launched the ELISA and Zephyr Projects and supported other embedded projects…
Monday October 28, 2024 16:40 - 17:20 JST
Hall B (4)

17:30 JST

GPU Distributed Caching for PyTorch Leveraging NVMe, GDS, and RDMA - Hope Wang, Alluxio
Monday October 28, 2024 17:30 - 18:10 JST
As GPUs become increasingly powerful, the separation between compute and storage often results in underutilized GPUs waiting for data. Meanwhile, high-performance components on GPU machines, such as NVMe storage and fast networks leveraging InfiniBand or special NICs, remain idle. Effectively leveraging these hardware resources to address GPU underutilization is a critical challenge. In this talk, we introduce a Kubernetes-native distributed caching layer that leverages NVMe disks and fast networks to optimize PyTorch training data access. Utilizing stateless workers for scalability and ETCD for membership services, this caching layer efficiently manages and serves data. Cached data is rapidly and efficiently fed into GPU memory using NVIDIA's DALI data loader, GPUDirect Storage (GDS), and Remote Direct Memory Access (RDMA), significantly reducing data transfer bottlenecks and improving overall training performance.
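To ground how training jobs consume such a cache, here is a hedged sketch of a DALI pipeline reading files from a locally mounted cache path and decoding on the GPU; the mount path is hypothetical, and the GDS/RDMA plumbing happens below this API, so it is not shown.

```python
# DALI pipeline reading training images from a (hypothetical) cache mount
# and decoding them on the GPU, feeding batches straight to PyTorch.
from nvidia.dali import pipeline_def, fn
from nvidia.dali.plugin.pytorch import DALIGenericIterator

@pipeline_def(batch_size=64, num_threads=4, device_id=0)
def cached_pipe(data_dir):
    jpegs, labels = fn.readers.file(
        file_root=data_dir, random_shuffle=True, name="Reader"
    )
    images = fn.decoders.image(jpegs, device="mixed")   # decode on the GPU
    images = fn.resize(images, resize_x=224, resize_y=224)
    return images, labels

pipe = cached_pipe(data_dir="/mnt/cache/train")  # hypothetical mount point
pipe.build()
loader = DALIGenericIterator([pipe], ["data", "label"], reader_name="Reader")

for batch in loader:
    images = batch[0]["data"]   # already resident in GPU memory
    break                       # hand off to the training step from here
```

The caching layer's job is to make reads from that mount hit local NVMe (or a peer's NVMe over RDMA) instead of remote object storage.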
Speakers

Hope Wang

Developer Advocate, Alluxio
Hope Wang is a Developer Advocate at Alluxio and a Presto contributor, with a decade of experience in data, AI, and cloud. An open-source contributor to PrestoDB, Trino, and Alluxio, she previously worked in venture capital…
Monday October 28, 2024 17:30 - 18:10 JST
Hall B (4)
 