Add and update initial data on research projects
Showing 5 changed files with 58 additions and 5 deletions.
@@ -0,0 +1,12 @@
---
title: Designing and Optimizing Cache Systems for New Hardware and Infrastructure
tags: Cache, New-hardware, ZNS, LSM-KVS, Disaggregation, CXL, RDMA
---

Cache systems are core components of today's data infrastructure. Our group focuses on designing and optimizing various types of cache systems for new hardware and infrastructure. Our goals are to improve the cache hit ratio, reduce cost, and improve manageability.

Specifically, ASU-IDI is working on the following three sub-areas:

- Persistent cache optimization for Zoned Namespace (ZNS) SSDs: Traditionally, persistent caches have been built on regular SSDs, which suffer from high write amplification (and thus shorter lifespan) and performance penalties. We aim to replace these SSDs with emerging ZNS SSDs, which offer larger capacity, better management capability, and almost no write amplification.
- Improving cache system performance with application hints: Historically, caches have been isolated from upper-layer applications. We are exploring various ways to use application-level information to improve the cache hit ratio and cost-effectiveness (a small sketch of the idea follows this list). We have investigated this with graph databases, LSM-KVS, and file systems.
- Caching in disaggregated infrastructure: With the development of memory and storage disaggregation, cache systems need to be redesigned for the new disaggregated memory (CXL-based or RDMA-based) and used to speed up disaggregated storage access. Our focus is on failure recovery, cache sharing, and cache management.
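As a concrete illustration of the application-hint sub-area above, below is a minimal, illustrative Python sketch of a cache whose eviction decisions respect an application-supplied priority hint. The hint interface and the policy shown here are assumptions made for illustration only, not our actual design.

```python
# Illustrative sketch: an LRU cache that honors an application-provided
# priority hint. Low-priority entries are evicted before pinned ones.
from collections import OrderedDict

class HintedCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()      # key -> (value, high_priority)

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)     # refresh recency
        return self.items[key][0]

    def put(self, key, value, high_priority=False):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = (value, high_priority)
        while len(self.items) > self.capacity:
            self._evict()

    def _evict(self):
        # Prefer the least-recently-used low-priority entry as the victim.
        for key, (_, hp) in self.items.items():
            if not hp:
                del self.items[key]
                return
        self.items.popitem(last=False)  # all entries pinned: fall back to plain LRU

if __name__ == "__main__":
    c = HintedCache(2)
    c.put("hot-index-page", 1, high_priority=True)  # hint from the application
    c.put("scan-block-1", 2)
    c.put("scan-block-2", 3)                        # evicts scan-block-1, keeps the index page
    print(list(c.items))
```

Here the application pins a hot index page so that one-off scan blocks are evicted first, one simple way application knowledge can raise the hit ratio.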
@@ -1,11 +1,13 @@
---
title: LLM-Assisted Configuration Tuning for Log-Structured Merge-tree-based Key-Value Stores
tags: LSM-KVS, Tuning, LLM
---

<!-- A single-line explanation of the project -->
Design and develop an LLM-assisted auto-tuning framework for Log-Structured Merge-tree-based Key-Value Stores (LSM-KVS) to achieve better performance.

Storage and memory systems have undergone a variety of modifications and transformations and are widely used in today's IT infrastructure. These systems usually have over 100 options (e.g., in HBase and RocksDB) for tuning performance on particular hardware (e.g., CPU, memory, and storage), software, and workloads (e.g., random, skewed, and read/write intensive). ASU-IDI focuses on developing an LLM-assisted auto-tuning framework for storage and memory systems to enhance performance.

Tuning storage and memory systems such as LSM-KVS (e.g., RocksDB and HBase) with appropriate configurations is challenging, usually requiring IT professionals with the relevant expertise to run hundreds of benchmarking evaluations. Existing studies on tuning solutions are still limited, lacking generality and adaptiveness to different versions and deployments. We believe the recent advances in Large Language Models (LLMs), such as OpenAI's GPT-4, can be a promising path to auto-tuning:

1. LLMs are trained on collections of LSM-KVS-related blogs, publications, and almost all of the open-source code, which makes them a real "expert";
2. LLMs have strong inferential capability to analyze benchmarking results and make automatic, interactive adjustments for particular hardware and workloads.

However, how to design the auto-tuning framework around LLMs and benchmarking tools, how to generate appropriate prompts for the LLM, and how to calibrate unexpected errors and wrong configurations are the three main challenges to be addressed.

We propose to design and develop an LLM-assisted auto-tuning framework with the following workflow: 1) use the default options file and a collection of system and hardware information as the initial input; 2) run a feedback loop around the LLM API, creating new prompts from the option changes and the processed benchmarking results of previous iterations; 3) calibrate (clean and correct) the newly generated options from the LLM for a new round of benchmarking; and 4) after several iterations, once the benchmarking results have converged, emit the final optimized configuration. The whole process is deployed and executed automatically, without human intervention. We implemented a prototype on RocksDB v8.8.1 with OpenAI's GPT-4-1106 model and open-sourced it. Our preliminary evaluations show that with 5 iterations of auto-tuning, the framework achieves up to 20% higher throughput than the default configuration.
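To make the workflow above concrete, the following is a minimal Python sketch of the feedback loop under stated assumptions: the option whitelist, the `ask_llm` and `run_benchmark` helpers, and the JSON exchange format are illustrative placeholders, not the interfaces of our actual framework or of RocksDB.

```python
import json
import random

# Hypothetical whitelist used for calibration: known option names and legal ranges.
KNOWN_OPTIONS = {
    "write_buffer_size": (1 << 20, 1 << 30),
    "max_background_jobs": (1, 64),
}

def ask_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call (e.g., to GPT-4) that returns
    a JSON object of suggested option values."""
    return json.dumps({"write_buffer_size": 256 << 20, "max_background_jobs": 8})

def run_benchmark(options: dict) -> float:
    """Placeholder for applying the options and running a workload
    (e.g., a benchmarking harness); returns observed throughput in ops/sec."""
    return 100_000 + random.random() * 10_000

def calibrate(raw: str, fallback: dict) -> dict:
    """Step 3: drop unknown options and clamp out-of-range values."""
    try:
        proposed = json.loads(raw)
    except json.JSONDecodeError:
        return dict(fallback)
    clean = {}
    for name, value in proposed.items():
        if name in KNOWN_OPTIONS and isinstance(value, int):
            lo, hi = KNOWN_OPTIONS[name]
            clean[name] = min(max(value, lo), hi)
    return clean or dict(fallback)

def auto_tune(default_options: dict, hw_info: str, iterations: int = 5) -> dict:
    """Steps 1-4: benchmark, prompt the LLM with the results, calibrate, repeat."""
    options, best = dict(default_options), None
    for _ in range(iterations):
        throughput = run_benchmark(options)             # measure current options
        if best is None or throughput > best[1]:
            best = (dict(options), throughput)
        prompt = (
            f"Hardware: {hw_info}\n"
            f"Current options: {json.dumps(options)}\n"
            f"Observed throughput: {throughput:.0f} ops/sec\n"
            "Suggest improved options as a JSON object."
        )
        options = calibrate(ask_llm(prompt), fallback=options)
    return best[0]

if __name__ == "__main__":
    defaults = {"write_buffer_size": 64 << 20, "max_background_jobs": 2}
    print(auto_tune(defaults, hw_info="8 cores, 32 GB RAM, NVMe SSD"))
```

In the real framework the two placeholders would be replaced by the LLM API and a benchmarking harness, and convergence of the measured throughput decides when the loop stops.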
@@ -0,0 +1,13 @@
---
title: System Optimizations for LLM Inferencing
tags: LLM, Cache, Memory, Storage
---

Fast LLM inferencing requires enough GPU memory to hold the entire model, and many users find low-end GPUs inadequate or even unusable for this purpose. Our focus is on developing system methodologies and solutions to speed up LLM inferencing, such as offloading, data traffic optimization, neuron and weight redistribution, and token batching and rescheduling.

To address the challenges of LLM inferencing on low-end GPUs, we propose a series of system optimizations to enhance performance and efficiency:

- Offloading: Delegating parts of the model state and computation to external devices or the host to alleviate GPU memory constraints (a small sketch follows this list).
- Data traffic optimization: Reducing latency and improving data transfer efficiency to accelerate inferencing.
- Neuron and weight redistribution: Balancing the computational load across different hardware components for better resource utilization.
- Token batching and rescheduling: Streamlining the processing flow to improve performance and throughput.
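As one example of the offloading direction above, here is a minimal, illustrative PyTorch sketch of layer-wise weight offloading: layers live in host memory and are streamed to the GPU one at a time, so peak GPU memory is roughly one layer rather than the whole model. The toy model and sizes are assumptions for illustration, not a real LLM or our actual system.

```python
# Layer-wise weight offloading sketch: keep layers on the CPU and move each
# one to the GPU only for the duration of its forward pass.
import torch
import torch.nn as nn

class OffloadedMLP(nn.Module):
    """Toy stand-in for a large model; not a real LLM."""
    def __init__(self, num_layers=8, hidden=4096):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(num_layers))

    @torch.no_grad()
    def forward(self, x, device):
        for layer in self.layers:
            layer.to(device)          # stream weights in just before use
            x = torch.relu(layer(x.to(device)))
            layer.to("cpu")           # evict weights to free GPU memory
        return x

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = OffloadedMLP()
    out = model(torch.randn(2, 4096), device)
    print(out.shape)
```

The trade-off is extra host-to-GPU traffic on every layer, which is exactly where the data traffic optimizations listed above come into play.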
@@ -0,0 +1,14 @@
---
title: LSM-based Key-Value Store Redesign for Disaggregated Infrastructure
tags: LSM-KVS, Disaggregation
---

LSM-KVS (e.g., RocksDB) is a key component of today's data infrastructure for storing unstructured data. Initially designed for legacy monolithic servers, it now faces issues such as wasted resources and poor scalability as workloads grow. Deploying LSM-KVS in a disaggregated infrastructure is essential for better performance and resource management. Our vision is to fully restructure LSM-KVS, decouple its components, and execute them on different disaggregated compute, memory, and storage nodes for better performance, resource utilization, and manageability. Remote compaction, remote flush, and offloading the block cache are just the beginning.

Transitioning LSM-KVS-based systems to a disaggregated infrastructure involves significant changes. By decoupling the components of LSM-KVS and running them on separate compute, memory, and storage nodes, we can improve performance and scalability, enable better resource allocation and load distribution, and simplify overall system management. Remote compaction, remote flush, and offloading the block cache are the initial steps in this restructuring:

- Remote compaction: Moving the compaction process to a remote compute node frees up local resources, reducing bottlenecks and improving performance (a small sketch follows at the end of this description).
- Remote flush: Performing flush operations remotely improves data management and the efficiency of storage utilization.
- Offloading the block cache: Moving the block cache to a remote memory node optimizes memory usage and improves data retrieval times.

We envision a future in which LSM-KVS is fully restructured and optimized for disaggregated infrastructure, enabling more efficient and scalable data storage. Decoupling the components of LSM-KVS and distributing them across different nodes addresses the limitations of traditional monolithic deployments and opens new possibilities for data management and analysis.
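To make the remote-compaction idea concrete, here is a highly simplified, illustrative Python sketch: the LSM-KVS node packages a compaction job and hands it to a stand-in for a remote compute pool instead of merging locally. The job format and the `RemoteCompactionService` class are assumptions for illustration, not the actual RocksDB interfaces or our implementation.

```python
# Simplified sketch of decoupling compaction from the LSM-KVS node.
from concurrent.futures import ThreadPoolExecutor  # stands in for a remote RPC pool

def merge_sorted_runs(runs):
    """The CPU-heavy part of compaction: merge runs, newest value wins."""
    merged = {}
    for run in runs:                       # runs ordered oldest -> newest
        merged.update(run)
    return sorted(merged.items())

class RemoteCompactionService:
    """Stand-in for a pool of disaggregated compute nodes."""
    def __init__(self, workers=4):
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def submit(self, runs):
        # A real system would serialize references to input files on shared
        # storage and send them over the network; here the merge simply runs
        # off the main serving path.
        return self.pool.submit(merge_sorted_runs, runs)

if __name__ == "__main__":
    service = RemoteCompactionService()
    l0_runs = [{"a": 1, "b": 2}, {"b": 20, "c": 3}]   # toy sorted runs
    future = service.submit(l0_runs)                   # local node keeps serving reads/writes
    print(future.result())                             # [('a', 1), ('b', 20), ('c', 3)]
```

In a real disaggregated deployment the submission would carry references to input files on shared storage, and the compacted output would be installed back into the local LSM tree's metadata once the remote node finishes.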
@@ -0,0 +1,12 @@
---
title: Novel Indexing to Optimize Database and Data Analytic Queries
tags: Indexing, Database, Query
---

In the big data era, speeding up query performance is crucial. We focus on designing novel indexing structures and storage-memory co-designs to improve query speed, particularly in scientific computing, big data analytics, and ML systems.

To achieve these goals, we are focusing on several key strategies:

- Novel indexing structures: Developing advanced indexing methods to organize and access data more efficiently (an illustrative example follows this list).
- Storage-memory co-designs: Integrating storage and memory optimizations to reduce latency and speed up data retrieval.
- Application focus: Improving query performance specifically in scientific computing, big data analytics, and ML systems.
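As one illustrative example of the kind of indexing that helps analytic queries, the Python sketch below builds a per-block min/max "zone map" so that a range scan can skip blocks that cannot contain matching values. This is a generic, textbook-style example with an assumed block size, not our proposed index design.

```python
# Zone-map sketch: record min/max per block, skip blocks during range scans.
BLOCK_SIZE = 4

def build_zone_map(column):
    """Split a column into fixed-size blocks and record each block's min/max."""
    zones = []
    for start in range(0, len(column), BLOCK_SIZE):
        block = column[start:start + BLOCK_SIZE]
        zones.append((start, min(block), max(block)))
    return zones

def range_query(column, zones, lo, hi):
    """Return values in [lo, hi], scanning only blocks whose range overlaps."""
    out = []
    for start, bmin, bmax in zones:
        if bmax < lo or bmin > hi:
            continue                      # the whole block is skipped
        for v in column[start:start + BLOCK_SIZE]:
            if lo <= v <= hi:
                out.append(v)
    return out

if __name__ == "__main__":
    col = [3, 7, 2, 9, 41, 45, 43, 48, 12, 11, 15, 10]
    zones = build_zone_map(col)
    print(range_query(col, zones, 40, 50))   # only the middle block is scanned
```

Skipping whole blocks this way trades a small amount of index metadata for a large reduction in the data actually scanned.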