Minor update the github io page, for KDD'24 tutorial placeholder (#310)
* init for KDD'24 tutorial placeholder
Showing 5 changed files with 122 additions and 17 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
<!DOCTYPE html> | ||
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]--> | ||
<!--[if gt IE 8]><!--> <html class="no-js" lang="en" > <!--<![endif]--> | ||
<head> | ||
<meta charset="utf-8"> | ||
<meta http-equiv="X-UA-Compatible" content="IE=edge"> | ||
<meta name="viewport" content="width=device-width, initial-scale=1.0"> | ||
<title>Multi-modal Data Processing for Foundation Models: Practical Guidances and Use Cases</title> | ||
<!-- Bootstrap --> | ||
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous"> | ||
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js" integrity="sha384-JZR6Spejh4U02d8jOt6vLEHfe/JQGiRRSQQxSfFWpi1MquVdAyjUar5+76PVCmYl" crossorigin="anonymous"></script> | ||
</head> | ||
|
||
<body> | ||
<nav class="navbar navbar-expand-lg navbar-light bg-light"> | ||
<a class="navbar-brand" href="#">KDD 2024 Hands-on Tutorial</a> | ||
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation"> <span class="navbar-toggler-icon"></span> </button> | ||
<div class="collapse navbar-collapse" id="navbarSupportedContent"> | ||
<ul class="navbar-nav mr-auto"> | ||
<li class="nav-item active"> <a class="nav-link" href="#">Home <span class="sr-only">(current)</span></a> </li> | ||
<li class="nav-item"> <a class="nav-link" href="#Schedule">Schedule</a> </li> | ||
<li class="nav-item"> <a class="nav-link" href="#Organizers">Organizers</a> </li> | ||
</ul> | ||
</div> | ||
</nav> | ||
<header> | ||
<div class="jumbotron"> | ||
<div class="container"> | ||
<div class="row"> | ||
<div class="col-10 col-lg-12"> | ||
<h2 class="text-center">KDD 2024 Hands-on Tutorial</h2> | ||
<h1 class="text-center"><strong>Multi-modal Data Processing for Foundation Models: Practical Guidances and Use Cases</strong></h1> | ||
<h4 class="text-center"><em><strong>Date &amp; Time</strong></em>: X:XX pm - Y:YY pm, August XX, 2024</h4> | ||
<h4 class="text-center"><em><b>Location</b></em>: To be updated</h4> | ||
</div> | ||
</div> | ||
</div>In the era of foundation models, processing multi-modal data efficiently and effectively has become essential. | ||
In this tutorial, participants will learn the essential techniques for processing multi-modal data. We will explore how large-scale, high-quality data improves model performance and introduce the open-source Data-Juicer system, designed to handle the variety, quality, and scale of such data. | ||
Attendees will gain practical experience with Data-Juicer's operators, covering data formatting, mapping, filtering, deduplication, and selection. | ||
A significant portion of the tutorial is dedicated to the Data-Juicer Sandbox Lab and to typical use cases for static and dynamic data, including text, image, audio, and video. The lab is a playground integrated with unified models and evaluators. It supports experiments with data recipes, which are methodical sequences of operators that streamline the creation of scalable data processing pipelines. This experience is designed not only to solidify the concepts discussed but also to provide a space for innovation and exploration, highlighting how data recipes can be optimized and deployed in high-performance distributed environments. | ||
<p></p>By the end of this tutorial, attendees will have the practical knowledge and skills to navigate the complexities of multi-modal data processing. They will leave with hands-on experience with an industrial open-source system and a deeper appreciation of the importance of high-quality data in AI, ready to implement sustainable and scalable solutions in their projects. | ||
</div> | ||
<div class="container"> | ||
<div class="row"> | ||
<div class="col-sm-6 col-lg-12"> | ||
<p class="text-justify"> | ||
</p> | ||
<h5>Tutorial Slides</h5> <p><a href="to_be_uploaded" target="_blank"><b>slides.pdf</b></a></p> | ||
</div> | ||
</div> | ||
</div> | ||
|
||
</header> | ||
<section> | ||
|
||
<div class="container"> | ||
<div class="row"> | ||
<div class="col-12 mb-2 text-center"> | ||
<h2><a id="Schedule">Schedule</a></h2> | ||
</div> | ||
</div> | ||
<div class="row"> | ||
<div class="col-sm-6 col-lg-12" style="margin-bottom: 3em;"> | ||
<h6 class="text-left"><b>Date</b>: August XX, 2024</h6> | ||
<h6 class="text-left"><b>Location</b>: To be updated.</h6> | ||
<h6 class="text-left">(xx min) | Introduction and Overview: Multi-modal Data Processing and the | ||
Data-Juicer System</h6> | ||
<h6 class="text-left">(xx min) | Building Blocks of Data Processing: Data-Juicer’s Operators</h6> | ||
<h6 class="text-left">(xx min) | Composing Atomic Capabilities: Data-Juicer’s Data Recipes</h6> | ||
<h6 class="text-left">(xx min) | Exploring Data Recipes: The Data-Juicer Sandbox Lab</h6> | ||
<h6 class="text-left">(xx min) | From Exploration to Production: High-Performance Data Factory</h6> | ||
<h6 class="text-left">(xx min) | Static Data Use Cases: Text and Image Data Processing</h6> | ||
<h6 class="text-left">(xx min) | Dynamic Data Use Cases: Video and Audio Data Processing</h6> | ||
<h6 class="text-left">(xx min) | Conclusion and Resources</h6> | ||
<p> </p> | ||
</div> | ||
</div> | ||
</div> | ||
<div class="container"> | ||
|
||
<div class="row"> | ||
<div class="col-lg-12 mb-4 mt-2 text-center"> | ||
<h2><a id="Organizers">Organizers</a></h2> | ||
<h5>We are the <a href="https://github.com/modelscope/data-juicer" target="_blank">Data-Juicer</a> team from Alibaba Tongyi</h5> | ||
<img src="https://img.alicdn.com/imgextra/i3/O1CN017Eq5kf27AlA2NUKef_!!6000000007757-0-tps-1280-720.jpg" width="640" height="360" alt="Data-Juicer"/> | ||
|
||
</div> | ||
</div> | ||
</div> | ||
|
||
|
||
</section> | ||
<div class="container"> </div> | ||
<footer class="text-center"> | ||
<div class="container"> | ||
<div class="row"> | ||
<div class="col-12"> </div> | ||
</div> | ||
</div> | ||
</footer> | ||
<!-- jQuery (necessary for Bootstrap's JavaScript plugins) --> | ||
<script src="js/jquery-3.2.1.min.js"></script> | ||
<!-- Include all compiled plugins (below), or include individual files as needed --> | ||
<script src="js/popper.min.js"></script> | ||
<script src="js/bootstrap-4.0.0.js"></script> | ||
</body> | ||
</html> |