diff --git a/.DS_Store b/.DS_Store new file mode 100644 index 0000000..96d1a61 Binary files /dev/null and b/.DS_Store differ diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..4c27d80 --- /dev/null +++ b/LICENSE @@ -0,0 +1,29 @@ +BSD 3-Clause License + +Copyright (c) 2021, BioNLP Lab +All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + +3. Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived from + this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE +FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER +CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, +OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
diff --git a/README.md b/README.md new file mode 100644 index 0000000..17d16f4 --- /dev/null +++ b/README.md @@ -0,0 +1,3 @@ +## AMIA 2023 Annual Symposium Tutorial on Development and Evaluation of Large Language Models in Healthcare Applications + +https://bionlplab.github.io/2024_AMIA_LLM_Tutorial/ diff --git a/css/style.css b/css/style.css new file mode 100644 index 0000000..5434683 --- /dev/null +++ b/css/style.css @@ -0,0 +1,195 @@ +/* CSS Document */ + + +body { + /*background: #f7f7f7;*/ + background: #e3e5e8; + color: #f7f7f7; + font-family: 'Lato', Verdana, Helvetica, sans-serif;; + font-weight: 300; + font-size:16px; +} + +/* Headings */ + +h1 { + font-size:30pt; +} + +h2 { + font-size:22pt; +} + +h3 { + font-size:14pt; +} + + +/* Hyperlinks */ + +a:link { + color: #1772d0; + text-decoration: none; +} + +a:visited { + color: #1772d0; + text-decoration: none; +} + +a:active { + color: red; + text-decoration: none; +} + +a:hover { + color: #f09228; + text-decoration: none; +} + + +/* Main page container */ + + +.container { + width: 1024px; + min-height: 200px; + margin: 0 auto; /* top and bottom, right and left */ + border: 1px hidden #000; + /* border: none; */ + text-align: center; + padding: 1em 1em 1em 1em; /* top, right, bottom, left */ + color: #4d4b59; + background: #f7f7f7; +} + +.overview { + text-align: left; +} + + +.containersmall { + width: 1024px; + min-height: 10px; + margin: 0 auto; /* top and bottom, right and left */ + border: 1px hidden #000; + /* border: none; */ + text-align: left; + padding: 1em 1em 1em 1em; /* top, right, bottom, left */ + color: #4d4b59; + background: #f7f7f7; +} + +.schedule { + width: 900px; + min-height: 200px; + margin: 0 auto; /* top and bottom, right and left */ + /*border: 1px solid #000;*/ + border: none; + text-align: left; + padding: 1em 1em 1em 1em; /* top, right, bottom, left */ + color: #4d4b59; + background: #f7f7f7; +} + +/* Title and menu */ + +.title{ + font-size: 22pt; + margin: 1px; +} + +.menubar 
{ + white-space: nowrap; + margin-bottom: 0em; + text-align:center; + font-size:16px; +} + + +/* Announcements */ + +.announce_date { + font-size: .875em; + font-style: italic; +} +.announce { + font-size: inherit; +} +.schedule_week { + font-size: small; + background-color: #CCF; +} + + +/* Schedule */ + +table.schedule { + border-width: 1px; + border-spacing: 2px; + border-style: none; + border-color: #000; + border-collapse: collapse; + background-color: white; +} + +p.subtitle { + text-indent: -5em; + margin-left: 5em; +} + +/* Notes */ + +table.notes { + border: none; + border-collapse: collapse; +} + +.notes td { + border-bottom: 1px solid; + padding-bottom: 5px; + padding-top: 5px; +} + + +/* Problem sets */ + +table.psets { +/* border: none;*/ + border-collapse: collapse; +} + +.psets td { + border-bottom: 1px solid; + padding-bottom: 5px; + padding-top: 5px; +} + + +.acknowledgement +{ + font-size: .875em; +} + +.code { + font-family: "Courier New", Courier, monospace +} + +.instructorphoto img { + width: 120px; + border-radius: 120px; + margin-bottom: 10px; +} + +.instructorphotosmall img { + width: 60px; + border-radius: 60px; + margin-bottom: 10px; +} + +.instructor { + display: inline-block; + width: 200px; + text-align: center; + margin-right: 20px; +} diff --git a/figures/.DS_Store b/figures/.DS_Store new file mode 100644 index 0000000..5008ddf Binary files /dev/null and b/figures/.DS_Store differ diff --git a/figures/hua.jpg b/figures/hua.jpg new file mode 100644 index 0000000..4053cf9 Binary files /dev/null and b/figures/hua.jpg differ diff --git a/figures/user.png b/figures/user.png new file mode 100644 index 0000000..307e2d7 Binary files /dev/null and b/figures/user.png differ diff --git a/figures/yanshan.jpg b/figures/yanshan.jpg new file mode 100644 index 0000000..ab07f57 Binary files /dev/null and b/figures/yanshan.jpg differ diff --git a/figures/yifan_peng.jpg b/figures/yifan_peng.jpg new file mode 100644 index 0000000..a303972 Binary files 
/dev/null and b/figures/yifan_peng.jpg differ diff --git a/index.html b/index.html new file mode 100644 index 0000000..de41f54 --- /dev/null +++ b/index.html @@ -0,0 +1,111 @@ + + + + + +AMIA 2024 Annual Symposium Tutorial on Development and Evaluation of Large Language Models in Healthcare Applications + + + + + + + + +
+ + + + + +

+ + +

AMIA 2024 Annual Symposium Tutorial on

+ Development and Evaluation of Large Language Models in Healthcare Applications

+ Location: San Francisco, CA, USA
+ Time: November 9 - 13, 2024 +
+ +
+ +
+ +
+

Panelists

+
+ + + + + +
+
+ +
+ +
+

Overview

+
+

Language models are increasingly used in natural language processing (NLP) applications because they require neither the development of a task-specific architecture nor customized training on large datasets. In particular, large language models (LLMs) such as GPT [1], PaLM [2], and Llama-2 [3] have demonstrated significant advances in NLP tasks [4–6]. On the other hand, concerns have also been raised about the impact of these tools in health care, education, research, and beyond. One notable concern is the potential for LLMs to reinforce disparities in healthcare, as these models are typically trained on data that is historically biased against certain disadvantaged groups. Another concern is the potential for LLMs to be applied for malicious purposes. Although it is widely accepted that LLMs should be used with integrity, transparency, and honesty, how to do so appropriately and, if needed, how to regulate the development and use of this technology need further discussion.

+ +

This course provides students with an understanding of LLMs, using ChatGPT, Llama-2, and other models as examples, and their applications in health. Students will acquire knowledge of natural language processing, large language models, chain-of-thought prompting, Retrieval-Augmented Generation (RAG), and the range of prompting methods available for processing clinical text. Hands-on experience and a toolkit will provide useful skills for managing text data to solve a variety of problems in the health domain.

+ +

We believe that the proposed tutorial is timely and urgently needed for AMIA stakeholders, including informaticists from a broad array of disciplines, clinicians, software developers, and IT professionals, to learn how to develop and use these models to ensure that their potential benefits are realized while any potential risks and negative consequences are minimized. This tutorial will also likely be one of many conversations at AMIA 2024 about this issue as we learn more about LLMs, their capacity, and their potential impact on healthcare.

+
+
+ +
+ +
+

Tentative Schedule

+
+

45 min. Topic 1: An introduction to LLMs and their development in the medical domain (Hua Xu)

+

45 min. Topic 2: Integration of LLMs into NLP and other clinical decision-making tasks (Yanshan Wang)

+

45 min. Topic 3: Multimodal LLMs and their applications (Yifan Peng)

+

30 min. Open discussion

+
+
+ +
+ +
+

About the speakers

+
+

Hua Xu, Ph.D., FACMI, is the Robert T. McCluskey Professor and Vice Chair for Research and Development at the Section of Biomedical Informatics and Data Science of Yale School of Medicine. He also serves as Assistant Dean for Biomedical Informatics at Yale School of Medicine. He has worked on a range of clinical NLP topics and has built multiple clinical NLP systems. Dr. Xu served as the Chair of the AMIA NLP working group from 2014 to 2015 and currently leads the OHDSI NLP working group. He has taught NLP tutorials at various conferences, including AMIA, Medinfo, and AIME. Recently, Dr. Xu has worked on building foundation medical LLMs, including the Me LLaMA models based on the open LLaMA2 model. He will provide a general introduction to LLMs and hands-on experience in developing medical LLMs and their applications in clinical NLP tasks such as information extraction. +

+ +

Yanshan Wang, Ph.D., FAMIA, is vice chair of Research and assistant professor within the Department of Health Information Management at the University of Pittsburgh. His research interests focus on artificial intelligence (AI), natural language processing (NLP), and machine/deep learning methodologies and applications in health care. Dr. Wang has led several NIH-funded projects that aimed to develop NLP and AI algorithms to automatically extract information from free-text electronic health records (EHRs). He has over 60 peer-reviewed publications. Dr. Wang has been actively serving the informatics and NLP communities. He has served on a Student Paper Competition Committee for the AMIA Annual Symposium and was an associate editor for the MedInfo conference. He is also a regular reviewer for a dozen prestigious journals, such as Nature Communications, JAMIA, and JBI. Dr. Wang also organized several shared tasks, including the first BioCreative/OHNLP challenge in 2018 and the second n2c2/OHNLP challenge in 2019, to encourage the informatics and NLP communities to tackle NLP problems in the clinical domain. He is also a steering committee member for the HealthNLP workshop. In 2020, he was inducted into the Fellows of AMIA (FAMIA). Dr. Wang serves as the Chair of the AMIA NLP working group for 2023-2024. +

+ +

Yifan Peng (Moderator), Ph.D., is an Assistant Professor in the Division of Health Sciences, Department of Population Health Sciences at Weill Cornell Medicine. + Dr. Peng's main research interests include BioNLP and medical image analysis. To facilitate research on language representations in the biomedical domain, one of his studies presents the Biomedical Language Understanding Evaluation (BLUE) benchmark, a collection of resources for evaluating and analyzing biomedical natural language representation models. Detailed analysis shows that BLUE can be used to evaluate the capacity of models to understand biomedical text and, moreover, to shed light on future directions for developing biomedical language representations. As the panel moderator, Dr. Peng will describe the current state of LLMs and list their unique opportunities and challenges compared to other language models. +

+
+
+ +
+ +
+

Please contact Yifan Peng if you have questions. The webpage template is courtesy of awesome Georgia.

+
+ + + + +