Status
- KR approved for oral presentation on her abstract
- CL approved for poster presentation on LASER
- JB: no obvious performance issues to investigate → put further performance tuning on the back burner
- JB: started looking at model outputs for England + Wales w.r.t. CCS expectations and outbreak periodicity resembling historical data
- JB: used Optuna for high-level calibration to the following targets (sketch after this list):
- Mean New Infections Per Year (total incidence across all communities): 666,200
- Mean New Infections Per Year (London only): 52,000
- Mean fraction of time with zero incidence in the five largest communities (London, Liverpool, Manchester, …): 0.01
- Median fraction of time with zero incidence: 0.6444
- Sigmoid slope for log mean population vs. fraction of zero case time: -1.78
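A minimal sketch of how such an Optuna study might be wired up (assumed, not JB's actual code; `run_simulation` and its return fields are hypothetical stand-ins for the real model):

```python
# Hypothetical sketch of the Optuna calibration loop described above;
# run_simulation stands in for the actual model entry point.
import optuna

TARGET_TOTAL = 666_200   # mean new infections/year, all communities
TARGET_LONDON = 52_000   # mean new infections/year, London only

def objective(trial: optuna.Trial) -> float:
    infectivity = trial.suggest_float("infectivity", 0.5, 5.0)
    migration = trial.suggest_float("migration_fraction", 1e-4, 1e-2, log=True)
    seasonality = trial.suggest_float("seasonal_multiplier", 0.0, 0.5)
    results = run_simulation(infectivity, migration, seasonality)  # hypothetical
    # Squared relative error against the two incidence targets.
    return ((results.total_incidence / TARGET_TOTAL - 1) ** 2
            + (results.london_incidence / TARGET_LONDON - 1) ** 2)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100)
print(study.best_params)
```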
- JB: wrestling with matching outbreak periodicity from simulation with historical data and theoretical results
- JB: experimented with Docker-izing model code and running multi-core simulations on the IDM Tanzu Kubernetes cluster (the model runs as a web service on the cluster, accepting parameters [infectivity, migration fraction, and seasonal multiplier] and returning S, I, R channels)
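A minimal sketch of what such a model-as-a-service endpoint could look like (assumed, not the actual cluster code; Flask is used for brevity and `run_model` is a hypothetical entry point):

```python
# Minimal sketch (assumed) of a model-as-web-service endpoint like the one
# described above; run_model is a hypothetical model entry point.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/simulate")
def simulate():
    params = request.get_json()
    s, i, r = run_model(                      # hypothetical model entry point
        infectivity=params["infectivity"],
        migration_fraction=params["migration_fraction"],
        seasonal_multiplier=params["seasonal_multiplier"],
    )
    # Return the S, I, R channels as plain lists for the client to plot.
    return jsonify({"S": s.tolist(), "I": i.tolist(), "R": r.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```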
- CL: coming to the conclusion that USEIRD grouping (with memory moves) is not performant - needs wrap-up and write-up
- CL: working on demographics classes to address different types of input data (a hypothetical sketch follows this list), e.g.
- known historical data (E+W)
- point in time census data + CBR + mortality (e.g. Nigeria)
- synthetic populations
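A hypothetical sketch of how those demographics classes might be shaped; class names and signatures are illustrative, not the actual LASER API:

```python
# Hypothetical sketch of demographics classes for the three input-data cases;
# names and methods are illustrative, not the actual LASER API.
from abc import ABC, abstractmethod
import numpy as np

class Demographics(ABC):
    @abstractmethod
    def population(self, year: int) -> np.ndarray:
        """Population per community for the given year."""

class HistoricalDemographics(Demographics):
    """Known historical data, e.g. England & Wales census/registration series."""
    def __init__(self, table: dict[int, np.ndarray]):
        self.table = table
    def population(self, year: int) -> np.ndarray:
        return self.table[year]

class ProjectedDemographics(Demographics):
    """Point-in-time census + crude birth rate + mortality, e.g. Nigeria."""
    def __init__(self, census: np.ndarray, cbr: float, mortality: float, base_year: int):
        self.census, self.cbr, self.mortality, self.base_year = census, cbr, mortality, base_year
    def population(self, year: int) -> np.ndarray:
        growth = (1 + (self.cbr - self.mortality) / 1000) ** (year - self.base_year)
        return self.census * growth

class SyntheticDemographics(Demographics):
    """Synthetic populations, e.g. log-normally distributed community sizes."""
    def __init__(self, n_communities: int, mean_log_size: float = 9.0, seed: int = 42):
        rng = np.random.default_rng(seed)
        self.sizes = rng.lognormal(mean_log_size, 1.0, n_communities)
    def population(self, year: int) -> np.ndarray:
        return self.sizes
```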
- CL: working on consolidating the "front-end" for the NumPy+Numba, GPU w/Taichi, and USEIRD model implementations to select E+W, Nigeria, or synthetic scenarios
- CL: looked at periodicity in the E+W simulation after implementing births and immigration based on historical data
- KM: writeup on vital dynamics/demographics here
- KR: EMOD England and Wales example
- CL: England + Wales Measles with "grouped communities": https://github.com/InstituteforDiseaseModeling/laser/tree/clorton/gc-eng-wal-model/proto
- CL: MESA review
- JB: RI implementation via index management
- KM: Required Model Features table update
- KR: Sandbox ABM Microbenchmarks
- Agents.jl ABM comparison
- Using NetLogo for country-scale disease modeling
- CL: Initial FLAMEGPU investigation
- all: short discussion on file formats; suggestion to use GDAL-supported formats; questions about GDAL installation and utility; possibly build on RasterTools, which is currently private to IDM
- JB: EULA mortality
- all: distribution/installation/maintenance issues with Python+ (whatever the '+' is)
- Halide
- Tips and Tricks from Game Programming for Performance Code
- Probability Generating Function resources
- "The Hunt for the Missing Data Type" (a commentary on the tradeoffs between representation and performance and the multitude of user scenarios/requirements)
- KM: "Fundamental entities of a LASER model"
- KR: Composable examples based on this blog
- Kevin and Jonathan discussed a new architecture for a service that runs on GPU hardware. They also talked about the importance of researchers feeling that their files are accessible and not just stored remotely.
- Kevin introduced Catherine's research project in EMOD, which models measles with seasonality, aging, and age-dependent vaccine take.
- Kevin and Christopher discussed testing and validating an England and Wales measles model. Christopher reported a bug in his code and suspected that the `k` parameter was too small, leading to local elimination in some places. Kevin suggested constructing a gravity matrix and normalizing it (sketch below).
- Christopher and Kevin discussed the best way to burn in the simulation. Christopher suggested starting with total initial population over R0 as susceptible and the remainder recovered at the outset, so he's not burning in from an entirely naïve population.
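A sketch of the gravity-matrix construction and row normalization Kevin suggested; the parameter names, populations, and distances below are placeholders, not values from the model:

```python
# Sketch of a gravity-model migration matrix with row normalization; k, a, b, c
# are the usual gravity parameters and the arrays are placeholder data.
import numpy as np

def gravity_matrix(pops: np.ndarray, dist: np.ndarray,
                   k: float = 1e-6, a: float = 1.0, b: float = 1.0, c: float = 2.0) -> np.ndarray:
    """Migration rate from node i to node j: k * p_i^a * p_j^b / d_ij^c."""
    g = k * np.outer(pops ** a, pops ** b) / np.maximum(dist, 1.0) ** c
    np.fill_diagonal(g, 0.0)            # no self-migration
    # Normalize each row so outbound migration is a fraction of the source node.
    row_sums = g.sum(axis=1, keepdims=True)
    return np.where(row_sums > 0, g / row_sums, 0.0)

pops = np.array([8_000_000, 750_000, 700_000], dtype=float)  # e.g. London, Liverpool, Manchester
dist = np.array([[0, 287, 262],                              # pairwise distances in km, approximate
                 [287, 0, 50],
                 [262, 50, 0]], dtype=float)
print(gravity_matrix(pops, dist))
```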
- Kevin discussed the different types of graphs and data types that would be suitable for the project. He mentioned that an edge list would be more appropriate for larger datasets. He also talked about the importance of having a graph with a giant component or a small diameter.
- Kevin and Jonathan discussed the difficulty of traveling to remote locations such as North Korea and South Sudan. They debated the number of hops required to reach these destinations and the challenges of obtaining necessary travel documents.
- Kevin and Jonathan discussed the characteristics of a transmission network and the importance of concurrency in HIV. They also talked about the book "Networks, an Introduction" and the potential benefits of representing groups in a graph way.
- Kevin and Christopher discussed the possibility of conducting a research study on when COVID was first detected. They talked about the challenges of testing in rural areas and the potential for selection bias.
- Kevin and Jonathan discussed Joel's work on epidemics on networks and how it could inform their work. Kevin suggested that Joel's work is more of a person-to-person network approach, and that there might be some stuff there that is worth looking at and taking.
- Christopher and Jonathan discussed the use of Docker in environments. They also discussed the possibility of having a simplistic browser client as the interface.
- Christopher and Kevin discussed the differences in overhead between running Docker on Linux, Windows with WSL, and Macs. Christopher mentioned that Docker is heavier weight on Macs because it requires running a full Linux VM and then running Docker inside that VM.
Spatial model for Nigeria:
- Design towards a functional equivalent of the EMOD model and compare performance and ease of use (Katherine)
Gravity model:
- Sweep over the `k` parameter and fix the bug in the code (Christopher)
Network science:
- Dive into the network science literature and find useful metrics and algorithms for graph representation (Kevin)
Docker experience:
- Try the Docker image and run the model locally or on the cloud (Jonathan)
- Katherine and Room attendee(s) discussed the performance of the simulation and the importance of having control over its resolution, including the different time steps and the need for a simulation step that defines the smallest step in the simulation. Room attendee(s) and Jonathan discussed the need to optimize and subsample in time and space for speed, but also the importance of running everything at full resolution to ensure that nothing is lost. Katherine noted the potential speedup that comes from shaping agents and memory, and emphasized the need for a full-resolution view to avoid adding forcing on other time scales that could mess up the simulation.
- Katherine discussed the technology and language choices being made: GPU versus CPU; C, Python, and R. Room attendee(s) discussed the importance of making sure that the work done is communicated and moves in the right direction. They also talked about the reasons why one would choose R over Python and vice versa. Jonathan mentioned PyCharm and GitHub Copilot.
- Room attendee(s) discussed the importance of measuring performance and allowing researchers to make decisions about their comfort level.
- Room attendee(s) discussed the low-level implementers and the people who use LASER to develop disease models, including expectations of the low-level implementers and what they need to know to add new features. They identified three groups of users: low-level implementers, disease model developers, and black-box users.
- Room attendee(s) discussed the possibility of using an AI copilot and a corpus of some existing LASER models to build a transmission function that takes advantage of LASER performance optimizations.
- Katherine and Room attendee(s) discussed the use of R over Python for inference and stats. They also talked about the calibration and uncertainty quantification processes in modeling software. No specific decisions were made.
- Room attendee(s) and Katherine discussed the challenges of understanding and mapping the architecture of a model. They also talked about the importance of ensuring that the internal state is efficiently accessible.
- Room attendee(s) discussed the difference between reactive interventions and reactive campaigns. They also talked about the usefulness of particle filtering in calibration and how it can help simulate real-world scenarios.
- Room attendee(s) discussed the challenges of particle filtering and the need for a sim to show transmission from point A to point B regardless of its probability.
- Room attendee(s) and Katherine discussed the upcoming data submission deadline and what is required in an abstract. They talked about the possibility of submitting a research topic that leverages a model and simulations for Northern Nigeria.
Technology document:
- Update the document to reflect the discussion on the purpose, the technology choices, and the user scenarios (Katherine)
GEOMED 2024:
- Submit abstracts for oral or poster presentations on measles elimination scenarios and LASER framework (Katherine and Chris)
User scenarios:
- Noodle on the different levels of users (low-level implementers, disease model developers, black box users) and their needs and expectations (All)
Discussed Docker and LASER including Docker environments for the technology assessment activities. Discussed visualization options inspired by 3rd party tools (loosely coupled architecture between simulation engine and visualization).
- JB: MESA (about a month ago), all Python
- Python installation hassles
- CL: Docker could be our friend here
- asked ChatGPT to provide an SIR agent-based spatial model "Hello, World"
- got something simple quickly, but hit a wall with ChatGPT-guided work - grid classes ... hit pause
- JB asked CL for his take; CL: performance didn't look good enough for our 200M+ scale scenarios
- opens a port or something which can send status information - Jupyter like?
- KM: ¿You run the model in IE or Firefox?
- KR: Worth it to do some research into probability generating functions?
- KM: Talked to Nikket about this, he seemed reticent. Nikket's work hasn't been spatial. Never needed to represent susceptibles, just infected to infected generation. Seemed like it might make certain things analytic. Avoid looping over all agents. I infected ... sample next generation infected ... connecting PGF not just locally but also across the network.
- JB: NetLogo
- Coursera course on NetLogo, YouTube presence, etc. - established
- Java (Scala?)
- JB: ¿How much is Java a non-starter for us? Netbeans, Eclipse, ..., organizational comfort with Java ecosystem?
- KM: If it actually does all the things, I would rather learn a language rather than build a new tool from scratch. If the thing exists, awesome, we can move on to using it for research. Otherwise, still worth a dive into to understand what has been done.
- found a country scale disease model with NetLogo: 50K agents w/MCW 100, 2 hours wallclock time, ...
- KM: doesn't seem useful from a performance perspective, EMOD could run 50k agents for 1k years?
- CL: Waxes eloquent (?) on GPUs...
- KM: 4 steps or layers on this path
- KM: more/bigger/faster is the new part, enabling new approaches - true of 10x speedups; 2x doesn't let me qualitatively change my approach. A 10x speedup means a month-long run now finishes over the weekend, and that's transformative
- Numba to C or Rust
- Two 10x steps perhaps: 1) NumPy → NumPy + Numba|C|Rust and 2) → GPU
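A toy illustration of the first 10x step, moving a per-agent transmission loop from NumPy arrays into Numba; this is the pattern under discussion, not LASER code:

```python
# Toy illustration of the NumPy -> NumPy+Numba step on a per-agent update;
# not LASER code, just the pattern being discussed.
import numpy as np
import numba as nb

@nb.njit(parallel=True)
def transmit(susceptible: np.ndarray, foi: np.ndarray, draws: np.ndarray) -> int:
    """Infect susceptible agents given per-agent force of infection; returns count."""
    new_infections = 0
    for i in nb.prange(susceptible.shape[0]):
        if susceptible[i] and draws[i] < foi[i]:
            susceptible[i] = False
            new_infections += 1
    return new_infections

n = 10_000_000
susceptible = np.ones(n, dtype=np.bool_)
foi = np.full(n, 0.001)          # constant force of infection for the toy example
draws = np.random.random(n)
print(transmit(susceptible, foi, draws))
```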
- KR: Agents.jl
- array or dictionary of agents (more similar to EMOD structure - doesn't really map onto what we have been talking about)
- "pretty cool": does libraries, like graph library, to build up spatial model
- add on agents, different states, different states at different times, or an agent could be a node
- not sure that you can have different types of agents, maybe one agent type with flags for behavior
- haven't seen any GPU integration - was going to take a look at that - could handle it automatically? potentially really easy but lost access to GPU on the VM in the self-service portal
- Decision Dependencies Graph
- Required Features Table
- LASER Language/Technology Decision Criteria Document
- CL: agent consolidation by groups PowerPoint code
- (performance expectations/implementation algorithms)
- JB: moving EULA populations to a separate table/data structure
- (performance expectations/implementation algorithms)
- KR: looked into 8-bit vs. 64-bit operations after the previous bit packing conversation. Write-up here.
- (performance expectations)
- CL: started on a more accessible version of England & Wales measles data along with pre-calculated distance between population centers (the expensive part of gravity model calculations)
- (other ABM tools research/technology decisions)
- JB: started looking into the GAMA Platform
- (other ABM tools)
- KM: spoke with Alex about agricultural modeling and required model features:
- (desired scenarios)
The smaller scale model is agent based:
- currently I’m simulating a field worth of agents (about 6k), but with a more efficient model it might be nice to simulate a village worth of plants
- Agent properties - viral titre, level of symptom expression. They both increase at set rates and are represented/updated in a similar way to what you mention.
- spatial homogeneity in terms of agent spacing and variety (which would impact things like susceptibility)
- The important longer term temporal dynamics are changes in infection level and the spatial distribution of infection between years after replanting
- Additionally: there’s an explicit vector layer, explicit simulation of surveys, simulation of several management interventions which involve removing agents based on symptom expression/replacing random agents with uninfected agents/selecting agents based on symptom expression to use for replanting the field
- There is a dispersal kernel for vector movement, and when vectors move from infected to uninfected plants, the probability of a new infection is dependent on the viral titre of the infected plant.
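An illustrative sketch of the mechanism described above (not Alex's code): vectors disperse via a kernel, and transmission probability saturates with the source plant's viral titre. All parameter values here are invented:

```python
# Illustrative sketch (not Alex's code) of the mechanism described above:
# vectors disperse by a kernel, and infection probability scales with the
# source plant's viral titre. All parameters are invented.
import numpy as np

rng = np.random.default_rng(0)

def vector_move(positions: np.ndarray, scale: float = 2.0) -> np.ndarray:
    """Exponential dispersal kernel: each vector jumps a random distance and direction."""
    n = positions.shape[0]
    angles = rng.uniform(0.0, 2 * np.pi, n)
    dists = rng.exponential(scale, n)
    return positions + np.column_stack((dists * np.cos(angles), dists * np.sin(angles)))

def infection_prob(titre: float, half_sat: float = 5.0) -> float:
    """Saturating probability of transmission given the source plant's viral titre."""
    return titre / (titre + half_sat)

positions = rng.uniform(0, 80, size=(100, 2))   # 100 vectors scattered over a field
positions = vector_move(positions)
print(infection_prob(titre=10.0))               # ~0.67 chance of infecting an uninfected plant
```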
The larger scale model is not agent based:
- The scale is all of SSA divided into roughly 1km cells
- The only agent properties tracked are the proportion of a cell that is susceptible/infected/removed at each time step
- Spatial heterogeneity comes from a raster for host density and a separate raster for vector density
- The interventions included are trade restrictions, clean seed introduction, and mass culling
- Infection spread is controlled by a dispersal kernel that is dependent on host and vector density
- Kevin McCarthy emphasized the need for a decision criteria document to guide technology selection and its timeline.
- The team discussed the balance between exploring new technologies and the necessity to make timely decisions.
- There was a concern about the potential risk of limiting technology options prematurely.
- Consideration of whether to attend an upcoming conference as learners or presenters, with a near-term deadline for presentation submissions.
- Stressed the significance of properly scoping problems before coding and the potential requirement to present prototypes to management.
- Plans were made to continue discussions on the conference participation and presentation opportunities through the team's communication channel.
The transcript outlines a detailed discussion on modeling strategies for a population, focusing on handling epidemiologically uninteresting individuals (EULAs) in simulations. Key points include:
- Experimentation with downsampling EULAs to manage computational load, including various methods for representing these individuals in the model.
- Consideration of maintaining accurate demographic information, such as age and mortality rates, despite aggregation techniques.
- Discussion on the pros and cons of separating EULA individuals into different data structures or tables for more efficient computation and analysis.
- Acknowledgment of the complexity added by having multiple ways to handle population data, with a focus on finding a balance between model accuracy and performance.
This conversation highlights the team's efforts to optimize their modeling approach while preserving the integrity and usability of their data.
The transcript from the LASER Core team's weekly meeting on February 15, 2024, revolves around discussions on the integration of artificial intelligence and machine learning technologies into their software and workflows. The conversation includes perspectives on enhancing software packages with AI tutors to improve usability, exploring the potential of AI in debugging, executing models, and guiding users through modifying underlying models. Participants express optimism about removing barriers for model installation and execution, emphasizing the importance of user-friendly interfaces and the role of AI in making model building and execution more intuitive. The dialogue touches on the need for comprehensive documentation and FAQs to support users, with a focus on leveraging AI to streamline these processes. There's an underlying theme of balancing technical ambitions with practical user support and ensuring the tools developed remain accessible and useful to a broad audience.
- all: iterating on LASER language/technology decision criteria document
- scenario proposals:
- Measles in England and Wales 1944-1964
- Multi-Node SEIR with a Part II of adding a re-infected individuals feature
- Single Node SIR
- still iterating on how to assess "approachability" of various languages and implementations
- some additional considerations, among others:
- availability of or ease of setting up development and modeling environments, e.g. notebook environments or GitHub codespaces
- AI assistance for model implementation (new features)
- AI assistance for model development (composing features into a model and parameterizing simulations)
- Jonathan: I'm wrapping up an exploration of various ways of doing the EULA modeling, including downsampling with weights, and moving them into a separate fully sampled population, including putting that population into an on-disk database. Also exploring doing this in a way that is abstracted enough to be reusable across (Python) modeling design choices. Added a make-based pipeline for splitting the population in two (modeled pop and EULA pop) as a pre-modeling step. These EULA design details impact births and deaths and can't really be evaluated without fertility and mortality being in place.
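A sketch of the split-then-downsample idea described above (illustrative only; the column names and the 1%/weight-100 choice are invented):

```python
# Sketch of the split-then-downsample idea: recovered/immune agents go to a
# separate EULA population, downsampled with weights so aggregate demography
# is preserved. Illustrative data, not the actual pipeline.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
pop = pd.DataFrame({
    "age": rng.integers(0, 80, 1_000_000),
    "immune": rng.random(1_000_000) < 0.9,   # most agents are post-immunity EULAs
})

modeled = pop[~pop.immune].copy()            # epidemiologically active agents, full resolution
eula = pop[pop.immune].sample(frac=0.01, random_state=0).copy()
eula["weight"] = 100                         # each sampled EULA stands in for 100 agents

# Weighted aggregates still recover the original demography (for deaths, aging, ...).
print(len(modeled), eula["weight"].sum())    # ~100k modeled, ~900k represented EULAs
```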
The meeting transcript from January 24, 2024, primarily involved discussions among the LASER Core team members, Kevin McCarthy, Katherine Rosenfeld, and others (Jonathan Bloedow and Christopher Lorton), focusing on various technical aspects of disease modeling and simulations.
Key points discussed include:
Collaboration and Feedback from Other Teams: Kevin McCarthy mentioned reaching out to Jillian from the Cholera team and attending a malaria meeting to gather additional features for simulations. He also discussed getting feedback from the polio team, specifically from Steve [Kroiss], who emphasized the need for incorporating features like seasonality in transmission, migration, and a method for handling viral evolution.
Cohort Modeling and Distribution: There was a detailed discussion on cohort modeling and distributions, particularly in the context of epidemiologically "uninteresting" agents. The idea was to simplify simulations by grouping these agents and potentially reactivating them if they become relevant again.
Use of Different Technologies and Languages: The team discussed the importance of using idiomatic code in different programming languages (like Julia, MATLAB, Lisp) to implement common algorithms for the simulations.
Scenario Definition and Performance Benchmarking: Katherine Rosenfeld emphasized the need for defining scenarios for disease modeling, which would help in benchmarking performance across different technologies. She suggested creating detailed scenarios that include reactive interventions and other specific agent behaviors.
Handling of Epidemiologically Interesting Agents: The discussion delved into the technicalities of managing agents in simulations, particularly those that might become interesting again due to factors like waning immunity. Various approaches to modeling these agents were discussed.
Next Steps and Asynchronous Work: The team concluded with a plan for asynchronous follow-up, particularly on defining scenarios for simulations. Katherine Rosenfeld highlighted the importance of considering emergent phenomena and how they might be incorporated into the models.
In summary, the meeting focused on technical discussions about disease modeling, with a particular emphasis on how to efficiently manage different types of agents in simulations, the use of various programming languages and technologies, and the need for defining clear scenarios for benchmarking performance. The next steps involve further asynchronous collaboration to refine these ideas and integrate them into their modeling platform.
- bit packing - Katherine
- manipulating packed bits - hassle vs. payoff
- can we get all an agent's data into a tractable number of bits (32? 64?)
- expectations for performance gains
- link to Katherine's experiment showing a nice way to interact with packed data
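A minimal sketch of the packing idea just discussed, fitting several agent fields into one 32-bit word; the field widths are arbitrary examples, not a proposed LASER layout:

```python
# Minimal sketch of packing several agent properties into a single 32-bit word;
# field widths here are arbitrary examples, not a LASER layout.
import numpy as np

# Layout (low bits first): state:2 | age_in_months:11 | node:12 | accessible:1
def pack(state, age_months, node, accessible):
    return (state | (age_months << 2) | (node << 13) | (accessible << 25)).astype(np.uint32)

def unpack_state(packed):
    return packed & 0b11

def unpack_age(packed):
    return (packed >> 2) & 0x7FF             # 11 bits -> ages up to ~170 years

n = 1_000_000
rng = np.random.default_rng(1)
agents = pack(rng.integers(0, 4, n, dtype=np.uint32),
              rng.integers(0, 1200, n, dtype=np.uint32),
              rng.integers(0, 4000, n, dtype=np.uint32),
              rng.integers(0, 2, n, dtype=np.uint32))
print(agents.nbytes / 1e6, "MB for", n, "agents")   # 4 MB instead of one array per field
print(np.bincount(unpack_state(agents)))            # state counts recovered from packed words
```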
- Jillian's comments on modeling cholera
- age and immune state
- lots of subpopulations
- cholera outbreaks look different - sporadic and localized
- surveillance and reactive interventions are necessary features for a model
- ¿Do we plan to integrate calibration into LASER or use an existing tool?
- feedback from the Malaria team
- not many scenarios that don't need heavy agent/heavy pathogen
- EMOD vector is their light model (includes genetics)
- maybe some of Prashanth's scenarios with vector migration and pathogen genetics could be LASER scenarios
- feedback from Steve on polio:
- need seasonally adjusted R0
- need migration
- need some viral evolution with changing R0/evolution tracking (more than Kevin's age adjusted transmissibility?)
- PolioSim has no spatial component. Which is easier - making PolioSim spatial or implementing polio in LASER?
- feedback from Hil on polio:
- preferential mixing
- multiple strain transmission
- more complex immuno-infection model (not just SIR but an immune state that maps onto your infectiousness)
- CL: Created issues in the LASER GitHub repository for investigating existing, alternative ABM tools. Addressing those issues is dependent on finishing the technology decision criteria document.
- CL & JB: worked on adding births and deaths to prototype models to support endemicity and start seeing periodic propagation beyond CCS nodes
- JB: diving into efficient EULA handling
- CL: took a quick, initial look at using OpenMP in JB's C extensions. First results were unimpressive.
- CL/JB/KM: nothing formalized yet but looking at SQL -> Pandas/Polars -> NumPy -> NumPy + Numba | NumPy + C implementations with the expectation that SQL and Pandas/Polars models are more approachable and easier to validate. Other implementations would build on the validated code to improve performance and still pass tests.
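A toy example of the same computation expressed at two levels of that stack (illustrative data, not a model): the Polars form stays close to the SQL formulation, while the NumPy form trades readability for speed, and both should agree exactly:

```python
# Toy illustration of one aggregation (infectious count per node) written at
# two levels of the proposed stack: Polars for clarity, NumPy for speed.
import numpy as np
import polars as pl

rng = np.random.default_rng(0)
node = rng.integers(0, 100, 1_000_000)
infectious = rng.random(1_000_000) < 0.01

# Polars version: readable, close to the SQL formulation.
df = pl.DataFrame({"node": node, "infectious": infectious})
by_node_pl = df.group_by("node").agg(pl.col("infectious").sum()).sort("node")

# NumPy version: same answer, typically much faster for this shape of data.
by_node_np = np.bincount(node, weights=infectious, minlength=100)

assert (by_node_pl["infectious"].to_numpy() == by_node_np).all()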
- KR: Wiki post on LLMs for learning and using software packages
- all: testing philosophy and the pros and cons of bit-identical test outputs vs. statistical tests
- CL: working on a vectorized implementation with vital dynamics and keeping agents in the same state, e.g., susceptible or infectious, in contiguous entries of the vectorized data structures. code Vector Consolidation.pptx
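A sketch of the consolidation idea (illustrative; the actual implementation is in the PowerPoint/code above): keep agents sorted by state so each state occupies a contiguous slice of the arrays and per-state updates become simple slices:

```python
# Sketch of the consolidation idea: keep agents sorted by state so each state
# occupies a contiguous slice and per-state updates become simple slices.
import numpy as np

S, I, R = 0, 1, 2
rng = np.random.default_rng(0)
state = rng.integers(0, 3, 1_000_000).astype(np.uint8)
age = rng.integers(0, 80 * 365, 1_000_000)

order = np.argsort(state, kind="stable")           # group agents by state
state, age = state[order], age[order]
bounds = np.searchsorted(state, [S, I, R, R + 1])  # slice boundaries per state

infectious_ages = age[bounds[1]:bounds[2]]         # all infectious agents, one contiguous slice
print(len(infectious_ages), "infectious agents in a contiguous block")
```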
- CL: learned a little about the required S/N fraction for endemic transmission
- all: GEOMED 2024 "GEOMED is an international, interdisciplinary conference on spatial statistics, geographical epidemiology and public health. The GEOMED conference is held every 2 years since 1997, and brings together researchers from different disciplines: statisticians, geographers, epidemiologists, computer scientists and public health professionals."
- JB: exploring how to address non-disease mortality in EULAs if EULAs are consolidated into a few or just one agent with large MCW.
- JB: exploring KM's idea about putting EULAs in a separate table (SQL)
- all: ¿What should the LASER team bring to the IDM symposium in October? ¿What should the team try to get from participants about their approaches and challenges?
- KM to CL: comments about wavelets and spatial modeling are more about analysis tools than features expected in a spatial model. Waves/periodicity/etc. should be emergent properties of a sufficiently accurate model (features/population/connectivity/etc.).
- all: discussing approach to initial/input population
- pre-processing tools
- challenges here with joint distributions etc.
- burn-in
- other?
- pre-processing tools
- JB: propose a test scenario of NxM independent communities with size varying along one dimension and birthrate varying along the other. Our models should show the relationship between those two values and CCS (endemicity).
- KR: Sent around the E&W data. We need to coordinate on her EMOD model and expectations. Is the EMOD model the standard for our technology decision scenario?
- KR & CL: some low-level experiments with bit packing and processing different sized data 8-bit/16-bit/32-bit/64-bit (simple NumPy
sum()
of array of 8-bit values is slower thansum()
of array of same count of 64-bit values). - KR: Microbenchmarking examples on tasks like array manipulation and priority queues.
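A quick way to reproduce the int8 vs. int64 observation (results vary by machine and NumPy version; the effect comes from small-integer sums accumulating into a wider type):

```python
# Quick reproduction of the observation above: summing int8 can be slower than
# summing the same count of int64, because small-int sums accumulate into a
# wider type. Results vary by machine and NumPy version.
import timeit
import numpy as np

a8 = np.ones(10_000_000, dtype=np.int8)
a64 = np.ones(10_000_000, dtype=np.int64)

print("int8 :", timeit.timeit(a8.sum, number=100))
print("int64:", timeit.timeit(a64.sum, number=100))
```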
- KR: Setup an example GitHub codespace for Julia
- JB: SQL with Polars model implementation
- JB: Investigated EULAs (Epidemiologically Uninteresting Light Agents) in his models.
- JB: Implemented equivalent models in SQL/Polars/NumPy
- JB: Looked at Python implementation with performance sensitive functions calling out to compiled C.
- JB: Working on models with vital dynamics.
- JB: Looking at priority queues for scheduled events, e.g. RI for individuals.
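A minimal sketch of priority-queue event scheduling with `heapq` (toy data; the 9-month RI delay and 30-day birth cadence are just examples):

```python
# Sketch of scheduled-event handling with a priority queue keyed on timestep,
# e.g. queuing each newborn's routine immunization (RI) at ~9 months of age.
import heapq

events: list[tuple[int, str, int]] = []     # (timestep, event_type, agent_id)

def schedule_ri(birth_step: int, agent_id: int) -> None:
    heapq.heappush(events, (birth_step + 270, "RI", agent_id))  # ~9 months later

for t in range(365):
    if t % 30 == 0:
        schedule_ri(t, agent_id=t)          # a birth every 30 days (toy data)
    while events and events[0][0] <= t:     # pop everything due at this timestep
        _, kind, agent = heapq.heappop(events)
        print(f"t={t}: apply {kind} to agent {agent}")
```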
- KR: Implemented Julia versions of SIR and SEIR models
- KR: Identified bug in CL's SIR model. CL fixed and verified fix.
- KR: Started LASER technology comparison repository
- KM: Started document on standard modeling scenario for comparison purposes (implementation and performance)
- KM: Reviewed earlier work on potential for measles eradication in Nigeria
- CL: Wrote up notes on Performance
- CL: Got spatial SEIR (Nigeria) working on Taichi. Taichi implementation worked unchanged on Windows w/CUDA and with one minor change on Ubuntu w/CUDA.
- CL: Took a stab at a Julia implementation of SEIR - didn't get impressive performance. Need to look at multi-threading and GPU computing with Julia.
- scientific validation tests
- Clinton's list of ABM tools
- technology assessment axes:
- performance on one or more common scenarios
- ease of development
- ease of use/ramp-up
- other?
- comprehensibility
- scalability
- AI readiness?
The meeting transcript is from a weekly team meeting of the LASER Core team, dated January 17, 2024. Here is a high-level summary:
- Technology Evaluation and Decision-Making: The team discusses evaluating and choosing technologies for their projects. There is an emphasis on conducting direct comparisons and performance tests between different technologies. Kevin McCarthy highlights the need for careful decision-making and suggests assembling a set of performance tests for direct comparison between technologies.
- Experiences with Various Technologies: Team members share their experiences with different technologies, particularly Python and its various libraries like NumPy and Numba. There's a discussion about the feasibility of running models on CPUs and the successful use of Apple Silicon for specific tasks.
- Consideration of Julia Language: Julia is considered as a potential technology choice. Katherine Rosenfeld expresses optimism about Julia, citing its speed and the evolving ecosystem. However, there are concerns about the learning curve associated with new languages and technologies, especially given the team's existing expertise in Python.
- Debate Over Technology Adoption: The discussion includes the challenges of convincing people to adopt new technologies or programming languages. They discuss the difficulty in shifting from established methods to new ones, even when the new methods might be more efficient or powerful.
- Integration of AI and Machine Learning: There's a segment where the role of AI and machine learning in their work is discussed. They consider how AI might assist in model development and configurations, potentially bridging skill gaps.
- Practical Applications and Future Plans: The team discusses practical applications of their work, including running different models and simulations. They explore the idea of integrating AI more deeply into their toolset and consider how to make their technology choices more data-driven.
- Collaboration and Sharing of Expertise: Throughout the meeting, there is a strong emphasis on collaboration, sharing of expertise, and learning from each other. The team members are encouraged to contribute their knowledge and perspectives to the decision-making process.
The meeting reflects a thoughtful and collaborative approach to technology decision-making, with a focus on performance, user experience, scalability, and the potential for integrating advanced technologies like AI.
- Several meetings between various team members, Kevin, Katherine, Jonathan, and Christopher since the IDM retreat and foundation week.
- GitHub repository seeded with template for a Python library (no code at this time on the main branch): InstituteforDiseaseModeling/laser: Light Agent Spatial modeling for ERadication (github.com)
- Christopher testing out some NumPy+Numba technology to see what is possible on reasonably current laptops. SIR, SEIR, and spatial SEIR models in the `well-mixed-abc` branch: laser/tests at clorton/well-mixed-abc · InstituteforDiseaseModeling/laser (github.com)
- Jonathan has been doing some detailed explorations and experiments on "dataframe-based" light-agent spatial models, comparing and contrasting functionally equivalent implementations in SQL(ite), Polars, and NumPy. Getting performance metrics at various scales between populations of 10^6 and 10^8. Everything spatial, where spatial always means "agent has a node attribute". Also been ramping up on Numba, including numba.cuda: laser/sql_modeling at sql_experiments · jonathanhhb/laser (github.com)
- As a first test for scientific validation, Katherine took a look at reproducing the Kermack-McKendrick result for the size of an outbreak in a fully susceptible population. In the SIR model, the final attack fraction doesn’t look quite right, see below. This is still under investigation. SEIR model appears correct.
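For reference, the Kermack-McKendrick final-size relation being checked: the attack fraction z in a fully susceptible population solves z = 1 - exp(-R0 * z). A quick numerical check of the expected values (a sketch, assuming SciPy is available):

```python
# The Kermack-McKendrick final-size check mentioned above: the attack fraction
# z in a fully susceptible population solves z = 1 - exp(-R0 * z).
import numpy as np
from scipy.optimize import brentq

def final_size(r0: float) -> float:
    # Root of f(z) = z - (1 - exp(-R0 * z)) on (0, 1]; sign change exists for R0 > 1.
    return brentq(lambda z: z - (1 - np.exp(-r0 * z)), 1e-9, 1.0)

for r0 in (1.5, 2.0, 3.0):
    print(f"R0={r0}: attack fraction = {final_size(r0):.3f}")
```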
- Over the Christmas break, Christopher looked at implementing the SEIR model with PyCUDA†. Initial assessment is that it is too technical and finicky – both setup and implementation – and would set the bar for user customization too high. However, Numba supports CUDA and might be useful for some targeted performance improvements on particular models.
- Kevin set some targets for measles eradication modeling. Specific scenarios for performance testing potentially to be based on replicating Kevin’s past Nigeria elimination scenarios from this paper:
- 100M-1B agents distributed into 1k+ subpopulations ranging from 10M person megacities to <1k person villages
- Agents need, at a minimum, age, accessibility label, and maybe an SES or nutrition status label
- Spatial heterogeneity in demography, RI, campaign schedules, and performance.
- Vaccine effect on individuals is age-dependent
- Reactive OBR necessary
- Temporal dynamics in demography and RI performance
- Christopher would like to survey other future users of LASER to get similar requirements to prevent going down an implementation path that inhibits or outright prevents use for additional scenarios
- Kevin to follow up with additional researchers for these requirements (polio, typhoid/cholera, light-agent malaria, et al.)
†Very, very, very rough repository. No readme, inconsistent setup requirements (some code plays with Numba but Numba isn’t in the requirements.txt). PyCUDA requires access to both the CUDA SDK and the Microsoft Compiler. It’s not clear to me how to get the Python process to run in an environment with all the correct settings for that, so the code brute-forces the working environment variables into the local environment. Almost certainly will not work on your machine.
- Design philosophy: composition over configuration, i.e., modelers should build a model with only the components they need (minimize cognitive load) rather than selectively enabling/disabling features from an array of features for a variety of diseases (larger cognitive load with extraneous information).
- If composition is the way to go, what infrastructure does LASER provide?
- What is the upper limit on size required for IDM needs? ~200M agents (Nigeria, Pakistan) ~1.4B agents (Africa, India, China)?
- Most scenarios have large “epidemiologically uninteresting” cohorts (post-exposure or vaccination, pre-debut for STIs, inactive-TB, etc.). What’s the design of cohorts to enable quicker compute and lower memory requirements?
- If the community or node implementation is pluggable, would it be useful to have compartmental model nodes or something like a neuronal network (not neural network) with activation (infection) thresholds and refractory periods (delay until subsequent re-infection is possible)?
- Kevin starting a document for us to collectively lay out technology decision-making process - defining specific test scenarios for performance comparisons across technologies, plus other non-perf considerations (ease of use, ease of dev, community, ...)
- Visualization is still an open question. Some options:
- Bryan Ressler’s vis-tools
- 3rd party tools
- LASER bespoke tools
LASER Teams channel is here: Light-Agent Spatial model for ERadication | General | Microsoft Teams