
LASER‐GPT: Early Discoveries

Jonathan Bloedow edited this page Dec 17, 2024 · 8 revisions

Introduction

We have been experimenting with a true Custom GPT for two to three weeks now, and early results are promising. We are also learning a lot along the way; this page captures some of those findings.

Quick Overview

  • Link is here: https://chatgpt.com/g/g-674f5fd33aec8191bcdc1a2736fb7c8d-laser-gpt
  • We are using the Enterprise account, normally accessed via myapplications.microsoft.com with your BMGF credentials.
  • The primary input to the Custom GPT is the laser-core PDF: all of our docs, generated by Sphinx as a single PDF file.
  • There are a few additional instructions given to the custom GPT, such as this introduction:

This GPT is a software co-pilot designed to guide users in creating agent-based infectious disease models using the LASER disease modeling framework and the laser-core library. It helps users translate disease modeling ideas into Python code by guiding them through a structured, step-by-step process. First, the GPT helps users define model properties, such as infection_state, timers (e.g., incubation_timer, immunity_timer), and demographic attributes (e.g., date_of_birth, date_of_death). These properties are added using add_scalar_property(), and when creating a PropertySet, the GPT ensures that all keys are treated as strings, similar to a dictionary structure, according to the documented API and design document. Components should be as granular as possible. For example, if someone wants to model a disease which has an intrahost progression and an inter-host transmission, the GPT defaults to creating separate components for each. The design assumes a nearly 1:1 ratio of properties to components, ensuring modularity, unless the modeler specifies otherwise. If the user doesn't specify the details of transmission, assume transmission rates are a function of the infectious fraction and susceptible fraction in the node (or patch). The GPT ensures the model structure aligns with LASER best practices, outputs useful CSV reports, and provides troubleshooting support.

and this recommended starter prompt:

Build an SI/SIS/SIR/SEIR/SEIRW agent-based spatiotemporal disease model using LASER, including laser-core functions whenever possible and adhering strictly to LASER design principles.
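The default transmission rule in the instructions quoted above (rates as a function of the infectious and susceptible fractions in a node) can be sketched as follows. This is a minimal NumPy-only illustration, not laser-core code and not output from the GPT; the function names, the exponential form of the per-susceptible probability, and the binomial draw are assumptions made for the example.

```python
import numpy as np


def infection_probability(infectious, population, beta):
    """Per-susceptible infection probability in each node for one time step.

    Assumed form: force of infection = beta * (I / N), converted to a
    probability with 1 - exp(-force). Empty nodes get probability 0.
    """
    infectious = np.asarray(infectious, dtype=float)
    population = np.asarray(population, dtype=float)
    frac_i = np.divide(
        infectious,
        population,
        out=np.zeros_like(population),
        where=population > 0,
    )
    return 1.0 - np.exp(-beta * frac_i)


def draw_new_infections(susceptible, infectious, population, beta, rng):
    """Draw the number of new infections per node from a binomial distribution."""
    p = infection_probability(infectious, population, beta)
    return rng.binomial(np.asarray(susceptible, dtype=np.int64), p)


# Three hypothetical nodes: no infections, half infected, and empty.
rng = np.random.default_rng(0)
new_cases = draw_new_infections([90, 50, 0], [0, 50, 0], [100, 100, 0], beta=0.5, rng=rng)
```

Expected new infections per node then scale with both the susceptible count and the infectious fraction, which is one common reading of the rule; a modeler may well want a different functional form.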

Process

  • We have been iterating continuously: whenever we find that the GPT makes mistakes, is missing knowledge, or needs extra prompting, we improve the documentation, rebuild the Custom GPT with the new laser-core.pdf, and retest. In particular, we have added an "Architecture and Design" section and, recently, a "Principles" section to that chapter. We have also added one complete working example (SIR, single node, no vital dynamics), and we plan to add a couple more, including spatial and vital dynamics.

Results

Early results are very promising. We can get working code for simple models within 1 or 2 prompts, and working spatial models with vital dynamics within half a dozen prompts.

Examples

Prompt 1:

Using any reference examples, give me a single-node SIR model with 100,000 agents with the output tracked, saved, and plotted. Adhere to LASER design principles and use laser-core code wherever possible.

Response 1 (code omitted):

Here is a complete implementation of a single-node SIR model with 100,000 agents, following LASER design principles and using the laser-core library. The simulation will initialize a population, simulate disease transmission and recovery over time, save the results to a CSV file, and visualize the outputs.

Prompt 2:

We want to break our steps into individual components wherever possible. And we want to avoid the use of append() wherever possible and instead use LaserFrames, including for reporting.

Response 2 (code omitted):

To align with LASER design principles, I have modularized the steps into individual components and eliminated the use of append(). Instead of dictionaries, LaserFrames are used for both simulation state tracking and reporting. Each component is responsible for a single step, making the model more granular and modular.
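Since the generated code is omitted here, the following sketch only illustrates the two ideas requested in Prompt 2: one component per step, and a preallocated report array instead of append(). It uses plain NumPy arrays as a stand-in for LaserFrames and is not the GPT's response; every class name, parameter, and default value below is hypothetical.

```python
import numpy as np

S, I, R = 0, 1, 2  # hypothetical state codes


class Model:
    """Holds agent state in preallocated arrays (a stand-in for a LaserFrame)."""

    def __init__(self, n_agents, n_seed, infectious_days, seed=0):
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(n_agents, dtype=np.uint8)
        self.timer = np.zeros(n_agents, dtype=np.int16)
        self.infectious_days = infectious_days
        self.state[:n_seed] = I
        self.timer[:n_seed] = infectious_days
        self.components = []

    def run(self, nticks):
        for tick in range(nticks):
            for component in self.components:
                component.step(self, tick)


class Recovery:
    """Counts down infection timers and moves expired infections to R."""

    def step(self, model, tick):
        infected = model.state == I
        model.timer[infected] -= 1
        model.state[infected & (model.timer <= 0)] = R


class Transmission:
    """Infects susceptibles with a probability based on the infectious fraction."""

    def __init__(self, beta):
        self.beta = beta

    def step(self, model, tick):
        frac_i = np.count_nonzero(model.state == I) / model.state.size
        p = 1.0 - np.exp(-self.beta * frac_i)
        susceptible = np.flatnonzero(model.state == S)
        newly = susceptible[model.rng.random(susceptible.size) < p]
        model.state[newly] = I
        model.timer[newly] = model.infectious_days


class Reporter:
    """Writes S, I, R counts into a preallocated array: one row per tick, no append()."""

    def __init__(self, nticks):
        self.counts = np.zeros((nticks, 3), dtype=np.int64)

    def step(self, model, tick):
        self.counts[tick] = np.bincount(model.state, minlength=3)


def run_sir(n_agents=100_000, n_seed=10, nticks=100, beta=0.3, infectious_days=7, seed=0):
    model = Model(n_agents, n_seed, infectious_days, seed)
    reporter = Reporter(nticks)
    model.components = [Recovery(), Transmission(beta), reporter]
    model.run(nticks)
    return reporter.counts


counts = run_sir()
```

The resulting array can be written out with, for example, np.savetxt("sir_counts.csv", counts, fmt="%d", delimiter=",", header="S,I,R", comments=""). Because the report is sized up front from the number of ticks, each component touches only its own properties and nothing grows during the run.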

Miscellaneous

  • Reproducibility is a big issue. It is not clear that the same input always produces exactly the same output, which makes testing, validation, and bug regression checks difficult. Slight changes in a prompt can also produce significantly different code. And since we don't really want LASER-GPT to have just one correct answer to a given prompt, what exactly does it mean to say that the GPT "is working"? What would it look like to have a standard set of prompts and the ability to test the outputs?
  • Relatedly, should we require that our AI Assistant produce working code after one (valid) prompt for any valid set of model features? What if it takes a few prompts and bug reports to get to working code? Is that good enough? What if a complex model requires a few stepping stones along the way, but the process gets you there? Is that OK?
  • VaccinationComponent (TBD).
  • Starting with long, ambitious prompts seems to lead the GPT to convince itself that laser-core contains code it simply doesn't.
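One possible shape for the "standard set of prompts" idea raised above is to judge generated code by behavior rather than by exact text: does every import resolve, does the script run, and does it produce a CSV report? The sketch below uses only the Python standard library; the function names and the pass criteria are assumptions, and how the generated code is obtained from the GPT is deliberately left out.

```python
import ast
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path


def missing_imports(source):
    """Return imported names in the generated code that cannot be resolved.

    This catches the case where the GPT imports something a library
    does not actually contain.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                try:
                    importlib.import_module(alias.name)
                except ImportError:
                    missing.append(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            try:
                module = importlib.import_module(node.module)
            except ImportError:
                missing.append(node.module)
                continue
            for alias in node.names:
                if alias.name == "*" or hasattr(module, alias.name):
                    continue
                try:  # the name may be a submodule that is not loaded yet
                    importlib.import_module(f"{node.module}.{alias.name}")
                except ImportError:
                    missing.append(f"{node.module}.{alias.name}")
    return missing


def run_generated(source, timeout=60):
    """Run generated code in a scratch directory.

    Returns (exited_cleanly, names_of_csv_files_written).
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated_model.py"
        script.write_text(source)
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        csv_files = sorted(p.name for p in Path(workdir).glob("*.csv"))
    return result.returncode == 0, csv_files


# Stand-in for a GPT response; a real check would use the generated model code.
sample = "from math import sqrt, not_a_real_function\n"
unresolved = missing_imports(sample)
```

Checks like these tolerate run-to-run variation in the generated text, so the GPT can give different answers to the same prompt and still pass. Generated code is untrusted, so a real harness would want to run it in a sandbox or container rather than directly on a developer machine.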

Way Ahead
