From abe1e352a1c1941db8fc9b9e16a770412ace10e2 Mon Sep 17 00:00:00 2001
From: Yunfei Bai
Date: Sun, 18 Aug 2024 22:23:10 -0700
Subject: [PATCH] Add files via upload
---
 index.html | 87 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 30 deletions(-)

diff --git a/index.html b/index.html
index 4502f1e..8370035 100644
--- a/index.html
+++ b/index.html
@@ -7,40 +7,67 @@
-

<!DOCTYPE html>

-

<html>

-

<head>

-

  <title>Domain-Driven LLM Development: Insights into RAG and Fine-Tuning Practices</title>

-

</head>

+

Domain-Driven LLM Development: Insights into RAG and Fine-Tuning Practices


-

<body>

-

<h1>Domain-Driven LLM Development: Insights into RAG and Fine-Tuning Practices</h1>

-


-

<h2>Abstract</h2>

-

To improve Large Language Model (LLM) performance on domain-specific applications, ML developers often leverage Retrieval-Augmented Generation (RAG) and LLM Fine-Tuning. RAG extends the capabilities of LLMs to specific domains or an organization's internal knowledge base without the need to retrain the model. Fine-Tuning, on the other hand, updates LLM weights with domain-specific data to improve performance on specific tasks. A fine-tuned model is particularly effective at systematically learning comprehensive new knowledge in a specific domain that is not covered by LLM pre-training. This tutorial walks through the RAG and Fine-Tuning techniques, discusses their advantages and limitations, and provides best practices for adopting these methodologies for LLM tasks and use cases. The hands-on labs demonstrate advanced techniques for optimizing RAG and fine-tuned LLM architectures that handle domain-specific LLM tasks. The labs use a set of open-source Python libraries to implement the RAG and fine-tuned LLM architectures.

-


-

<h2>Agenda</h2>

-

Lab 1: Advanced Techniques in RAG.

-

Lab 2: LLM Fine-Tuning

-

Lab 3: RAG and Fine-Tuned Model Benchmarking.

-


-

<h2>Presenters</h2>

-

José Cassio dos Santos Junior 

-

A member of the Amazon Machine Learning University team, where he is responsible for curriculum development for advanced modules. Previously, as a Senior Data Scientist on the AWS LATAM Professional Services Data Science team, he was responsible for experiments and MVP engagements with important clients in the Brazilian market. Cassio has over 20 years of experience working as a software engineer. As a business process management expert, he participated in BPO projects for more than 7 years. Cassio also has more than 10 years of teaching experience at colleges and as an instructor for Linux certification preparation and Microsoft Innovation Center bootcamps. He holds a Master’s degree in Computer Engineering, a Bachelor’s degree in Physics, and a Bachelor’s degree in Business Administration, specialized in IT Quantitative Methods.

-

Rachel Hu 

-

The Co-founder & CEO of CambioML, which builds open-source libraries to prepare high-quality data for RAG and LLM fine-tuning. Previously she was an Applied Scientist at AWS AI, an ML instructor at Amazon Machine Learning University, and a speaker at top conferences including KDD, NVIDIA GTC, AWS re:Invent, and MLOps Summit. Rachel co-authored Dive into Deep Learning (D2L.ai), an open-source interactive textbook adopted by over 500 universities around the world. Rachel received her Master’s degree in Statistics from the University of California, Berkeley, and her Bachelor’s degree in Mathematics from the University of Waterloo, Canada.

-

Richard Song 

-

The Co-Founder and CEO of Epsilla Inc, a one-stop RAGaaS platform for building production-ready LLM applications. With a background in big data, vector graph databases, and high-performance computing, Richard helps customers build production-ready RAG systems connected with large-scale proprietary data. Richard holds a Master’s degree in Computer Science from Cornell University.

-

Yunfei Bai 

-

A Senior Solutions Architect at Amazon Web Services. With over 15 years of experience in AI/ML, data science, and analytics, Yunfei helps AWS customers adopt AI/ML and Generative AI services to deliver business results. Prior to AWS, he worked in various roles, including product manager and solution consultant, across multiple industries, designing and delivering AI/ML and data analytics solutions that overcame complex technical challenges and drove strategic objectives. Yunfei has a PhD in Electronic and Electrical Engineering. He has published research papers and blog posts, and serves as a journal reviewer.

-


-

</body>

-

</html>  

+

Hands-On Tutorials at the 2024 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Barcelona, Spain

+

Schedule: 14:00 – 17:00, August 25, 2024

+


+

Abstract

+


+

To improve Large Language Model (LLM) performance on domain-specific applications, ML developers often leverage Retrieval-Augmented Generation (RAG) and LLM Fine-Tuning. RAG extends the capabilities of LLMs to specific domains or an organization's internal knowledge base without the need to retrain the model. Fine-Tuning, on the other hand, updates LLM weights with domain-specific data to improve performance on specific tasks. A fine-tuned model is particularly effective at systematically learning comprehensive new knowledge in a specific domain that is not covered by LLM pre-training. This tutorial walks through the RAG and Fine-Tuning techniques, discusses their advantages and limitations, and provides best practices for adopting these methodologies for LLM tasks and use cases. The hands-on labs demonstrate advanced techniques for optimizing RAG and fine-tuned LLM architectures that handle domain-specific LLM tasks. The labs use a set of open-source Python libraries to implement the RAG and fine-tuned LLM architectures.

+


+

Agenda

+


+

Section 1: Introduction to RAG and LLM Fine-Tuning (20 mins)  

+

Section 2: Lab setup (10 mins)

+

Section 3: Lab 1: Advanced Techniques in RAG (40 mins) - Richard Song

+

Break (10 mins)

+

Section 4: Lab 2: LLM Fine-Tuning (40 mins) - Yunfei Bai, Rachel Hu 

+

Break (10 mins)

+

Section 5: Lab 3: RAG and Fine-Tuned Model Benchmarking (30 mins) - José Cassio dos Santos Junior 

+

Section 6: Conclusion and Q&A (20 mins) 

+


+

Presenters 

+


+

José Cassio dos Santos Junior 

+

A member of the Amazon Machine Learning University team, where he is responsible for curriculum development for advanced modules. Previously, as a Senior Data Scientist on the AWS LATAM Professional Services Data Science team, he was responsible for experiments and MVP engagements with important clients in the Brazilian market. Cassio has over 20 years of experience working as a software engineer. As a business process management expert, he participated in BPO projects for more than 7 years. Cassio also has more than 10 years of teaching experience at colleges and as an instructor for Linux certification preparation and Microsoft Innovation Center bootcamps. He holds a Master’s degree in Computer Engineering, a Bachelor’s degree in Physics, and a Bachelor’s degree in Business Administration, specialized in IT Quantitative Methods.

+


+

Rachel Hu 

+

The Co-founder & CEO of CambioML, which builds open-source libraries to prepare high-quality data for RAG and LLM fine-tuning. Previously she was an Applied Scientist at AWS AI, an ML instructor at Amazon Machine Learning University, and a speaker at top conferences including KDD, NVIDIA GTC, AWS re:Invent, and MLOps Summit. Rachel co-authored Dive into Deep Learning (D2L.ai), an open-source interactive textbook adopted by over 500 universities around the world. Rachel received her Master’s degree in Statistics from the University of California, Berkeley, and her Bachelor’s degree in Mathematics from the University of Waterloo, Canada.

+


+

Richard Song 

+

The Co-Founder and CEO of Epsilla Inc, a one-stop RAGaaS platform for building production-ready LLM applications. With a background in big data, vector graph databases, and high-performance computing, Richard helps customers build production-ready RAG systems connected with large-scale proprietary data. Richard holds a Master’s degree in Computer Science from Cornell University.

+


+

Yunfei Bai 

+

A Senior Solutions Architect at Amazon Web Services. With over 15 years of experience in AI/ML, data science, and analytics, Yunfei helps AWS customers adopt AI/ML and Generative AI services to deliver business results. Prior to AWS, he worked in various roles, including product manager and solution consultant, across multiple industries, designing and delivering AI/ML and data analytics solutions that overcame complex technical challenges and drove strategic objectives. Yunfei has a PhD in Electronic and Electrical Engineering. He has published research papers and blog posts, and serves as a journal reviewer.

+


+

Materials

+


+

Slides: coming soon

+

GitHub: coming soon

+


+


+