deploy: d3d5a15

alan-turing-institute · Dec 18, 2024 · fdb293f
commit fdb293f

Showing 250 changed files with 49,416 additions and 0 deletions.
diff --git a/.buildinfo b/.buildinfo
@@ -0,0 +1,4 @@
+# Sphinx build info version 1
+# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
+config: a5c126a7fa5a1a1c62cb1ed71d6991d4
+tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/.nojekyll b/.nojekyll
diff --git a/_downloads/3b423627fe21976c3c06e60f78e14ae1/LICENSE b/_downloads/3b423627fe21976c3c06e60f78e14ae1/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2024 The Alan Turing Institute
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/_images/009cb793045c08bddb95c2cea8541d0a115ef04d78fe426d61a2c6367667776e.png b/_images/009cb793045c08bddb95c2cea8541d0a115ef04d78fe426d61a2c6367667776e.png
diff --git a/_images/5829ac17b74e7de7d90eae86886e4ef97baaf20c37df8ea6228e0cf843085d8d.png b/_images/5829ac17b74e7de7d90eae86886e4ef97baaf20c37df8ea6228e0cf843085d8d.png
diff --git a/_images/5a8a6d61f6031a034b50272e3bc26a67bfc89e48b9cc22beda2371d4b7fc3daf.png b/_images/5a8a6d61f6031a034b50272e3bc26a67bfc89e48b9cc22beda2371d4b7fc3daf.png
diff --git a/_images/6a7b4b8340a9c96c76c609aa6bebf7cf2e2d807a94cc3c3afb8fcd415f00a960.png b/_images/6a7b4b8340a9c96c76c609aa6bebf7cf2e2d807a94cc3c3afb8fcd415f00a960.png
diff --git a/_images/9bae5145b448923873d25c906db9ec746d98cac73c41f8306295e379289e77b3.png b/_images/9bae5145b448923873d25c906db9ec746d98cac73c41f8306295e379289e77b3.png
diff --git a/_images/ade792a2f2d2ffdbb8d4ce77b28eec380016e662ec264e87bbe8fe999e034e79.png b/_images/ade792a2f2d2ffdbb8d4ce77b28eec380016e662ec264e87bbe8fe999e034e79.png
diff --git a/_images/bbce3c65d52231aa62ac3f54723fc9ae062977f70913e31afeb667be82d1696f.png b/_images/bbce3c65d52231aa62ac3f54723fc9ae062977f70913e31afeb667be82d1696f.png
diff --git a/_images/ca9604378c733976654e4ac7d612b2ae67482f5418cd9b8c1695bebe2b253a52.png b/_images/ca9604378c733976654e4ac7d612b2ae67482f5418cd9b8c1695bebe2b253a52.png
diff --git a/_images/d2a10660c6fcb9ba9d34b1bb41ca71027c2b13dfea493ff6e18c54df55110eda.png b/_images/d2a10660c6fcb9ba9d34b1bb41ca71027c2b13dfea493ff6e18c54df55110eda.png
diff --git a/_modules/autoemulate/compare.html b/_modules/autoemulate/compare.html
diff --git a/_modules/autoemulate/datasets.html b/_modules/autoemulate/datasets.html
diff --git a/_modules/autoemulate/emulators/conditional_neural_process.html b/_modules/autoemulate/emulators/conditional_neural_process.html
diff --git a/_modules/autoemulate/emulators/conditional_neural_process_attn.html b/_modules/autoemulate/emulators/conditional_neural_process_attn.html
diff --git a/_modules/autoemulate/emulators/gaussian_process.html b/_modules/autoemulate/emulators/gaussian_process.html
diff --git a/_modules/autoemulate/emulators/gaussian_process_mt.html b/_modules/autoemulate/emulators/gaussian_process_mt.html
diff --git a/_modules/autoemulate/emulators/gaussian_process_sklearn.html b/_modules/autoemulate/emulators/gaussian_process_sklearn.html
diff --git a/_modules/autoemulate/emulators/gradient_boosting.html b/_modules/autoemulate/emulators/gradient_boosting.html
diff --git a/_modules/autoemulate/emulators/light_gbm.html b/_modules/autoemulate/emulators/light_gbm.html
diff --git a/_modules/autoemulate/emulators/neural_net_sk.html b/_modules/autoemulate/emulators/neural_net_sk.html
diff --git a/_modules/autoemulate/emulators/polynomials.html b/_modules/autoemulate/emulators/polynomials.html
diff --git a/_modules/autoemulate/emulators/radial_basis_functions.html b/_modules/autoemulate/emulators/radial_basis_functions.html
diff --git a/_modules/autoemulate/emulators/random_forest.html b/_modules/autoemulate/emulators/random_forest.html
diff --git a/_modules/autoemulate/emulators/support_vector_machines.html b/_modules/autoemulate/emulators/support_vector_machines.html
diff --git a/_modules/autoemulate/simulations/epidemic.html b/_modules/autoemulate/simulations/epidemic.html
diff --git a/_modules/autoemulate/simulations/projectile.html b/_modules/autoemulate/simulations/projectile.html
diff --git a/_modules/index.html b/_modules/index.html
diff --git a/_sources/community/code-of-conduct.md b/_sources/community/code-of-conduct.md
@@ -0,0 +1,2 @@
+```{include} ../../CODE_OF_CONDUCT.md
+```
diff --git a/_sources/community/contributing-docs.md b/_sources/community/contributing-docs.md
@@ -0,0 +1,52 @@
+# Contributing to the docs
+
+We welcome all documentation contributions, from fixing small typos to adding comprehensive tutorials. This guide will help you get started.
+
+## Prerequisites
+
+Before contributing, please read our [contributing guide](contributing.md) to set up your development environment and understand our workflow.
+
+## Types of Documentation Contributions
+
+### 1. Fixing typos and small changes
+
+1. Navigate to the relevant file in the `docs/` directory
+2. Make your changes
+3. Build the docs locally to verify your changes:
+
+   ```bash
+   jupyter-book build docs --all
+   ```
+
+4. Open the generated file `docs/_build/html/index.html` in your browser to preview.
+
+### 2. Adding tutorials
+
+1. Create a new Jupyter notebook in `docs/tutorials/`
+2. Include:
+   - Clear introduction and objectives
+   - Step-by-step instructions
+   - Code examples
+3. Add your tutorial to the table of contents:
+   - Open `_toc.yml` in the `docs/` directory
+   - Add an entry for your new tutorial
+4. Build and verify the docs as described above
+
+### 3. Updating API documentation
+
+The API documentation is generated from source code docstrings. There are two scenarios:
+
+#### Modifying existing API docs
+
+Simply update the docstring in the source code and rebuild:
+
+```bash
+jupyter-book build docs --all
+```
+
+#### Adding new API docs
+
+1. Create a new `.rst` file in `docs/community/reference/`
+2. Add the file to `_toc.yml`
+3. Ensure your source code has comprehensive docstrings
+4. Build the documentation
diff --git a/_sources/community/contributing-emulators.md b/_sources/community/contributing-emulators.md
@@ -0,0 +1,71 @@
+# Contributing emulators
+
+This guide explains how to contribute new emulator models to `AutoEmulate`.
+
+## Emulator structure
+
+All emulators in AutoEmulate are implemented as `scikit-learn` estimators, making them compatible with scikit-learn's cross-validation, grid-search, and pipeline functionality. Have a look at the [scikit-learn estimator developer guide](https://scikit-learn.org/1.5/developers/develop.html#rolling-your-own-estimator) for more details on how to implement a new emulator.
+
+**Note**: When contributing emulators, keep in mind that AutoEmulate doesn't currently support time-series or spatial data.
+
+### Core Requirements
+
+Each emulator class must:
+
+1. Live in `autoemulate/emulators/`
+2. Inherit from `sklearn.base`'s `BaseEstimator` and `RegressorMixin`
+3. Implement the `fit` and `predict` methods
+4. Include these additional methods/properties:
+
+   - `get_grid_params()`: Returns a dictionary of parameter values for grid search over hyperparameters
+   - `model_name`: Property that returns the emulator name (usually `self.__class__.__name__`)
+   - `_more_tags()`: Defines emulator properties like multioutput support
+
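The requirements above can be sketched as a minimal skeleton. This is an illustration under assumptions, not an emulator from the package: `RidgeSketch` and its `alpha` parameter are hypothetical, and the `get_grid_params()` signature shown here is a guess, so check an existing emulator for the exact form. Note also that newer scikit-learn releases replace `_more_tags()` with `__sklearn_tags__()`; the version below follows the 1.5 guide linked above.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class RidgeSketch(RegressorMixin, BaseEstimator):
    """Hypothetical example emulator: linear least squares with an L2 penalty."""

    def __init__(self, alpha=0.0):
        # scikit-learn convention: store constructor arguments unchanged.
        self.alpha = alpha

    def fit(self, X, y):
        X, y = check_X_y(X, y, multi_output=True, y_numeric=True)
        self.n_features_in_ = X.shape[1]
        A = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend intercept column
        penalty = self.alpha * np.eye(A.shape[1])
        penalty[0, 0] = 0.0  # leave the intercept unpenalised
        self.coef_ = np.linalg.pinv(A.T @ A + penalty) @ A.T @ y
        return self

    def predict(self, X):
        check_is_fitted(self, "coef_")
        X = check_array(X)
        A = np.hstack([np.ones((X.shape[0], 1)), X])
        return A @ self.coef_

    def get_grid_params(self):
        # Assumed signature: a plain dict mapping parameter names to candidates.
        return {"alpha": [0.0, 0.1, 1.0, 10.0]}

    @property
    def model_name(self):
        return self.__class__.__name__

    def _more_tags(self):
        # Declare multi-output support to the scikit-learn test suite.
        return {"multioutput": True}


# Tiny usage example on y = 1 + 2x.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 1.0 + 2.0 * X[:, 0]
emulator = RidgeSketch().fit(X, y)
predictions = emulator.predict(np.array([[4.0]]))
```

Listing the mixin before `BaseEstimator` follows the ordering scikit-learn uses for its own estimators.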
+### Getting Started
+
+The easiest way to create a new emulator is to:
+
+1. Look at existing emulators in `autoemulate/emulators/` as templates
+2. Run the scikit-learn estimator tests in `tests/test_estimators.py` early to catch any implementation issues
+3. Add your own tests in `tests/models/`
+
+### Naming Conventions
+
+The `model_name` property allows the emulator to be accessed with both long and short names:
+
+- Long name: The class name (e.g., "RadialBasisFunctions")
+- Short name: The uppercase letters of the long name, in lowercase (e.g., "rbf")
+
+Make sure your chosen class name:
+
+- Doesn't conflict with existing emulators
+- Contains some uppercase letters for the short name
+- Is descriptive of the emulation technique
+
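As a rough illustration of the naming rule, the sketch below derives a short name from a class name and flags class names that would collide. The helpers `short_name` and `find_conflicts` are hypothetical and are not AutoEmulate functions; the package's own logic may differ in detail. `RidgeFit` is an invented class name used only to trigger a conflict.

```python
from collections import defaultdict


def short_name(class_name: str) -> str:
    """Keep the uppercase letters of a class name and lowercase them."""
    return "".join(ch for ch in class_name if ch.isupper()).lower()


def find_conflicts(class_names):
    """Return short names that are shared by more than one class name."""
    groups = defaultdict(list)
    for name in class_names:
        groups[short_name(name)].append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}


names = ["RadialBasisFunctions", "RandomForest", "RidgeFit"]
conflicts = find_conflicts(names)
```

Here `short_name("RadialBasisFunctions")` gives `"rbf"`, while `RandomForest` and the invented `RidgeFit` both map to `"rf"` and are reported as a conflict.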
+## Testing emulators
+
+We use two types of tests:
+
+1. **Scikit-learn Test Suite**: Add your emulator to `tests/test_estimators.py` to verify scikit-learn compatibility. Not all tests need to pass - use `_more_tags()` to skip incompatible tests. See the [estimator tags overview](https://scikit-learn.org/1.5/developers/develop.html#estimator-tags) for details.
+
+2. **Custom Tests**: Add specific tests for your emulator in `tests/models/` to verify its core functionality (e.g., that components such as parameter search work end to end).
+
+## Registering an emulator
+
+After your emulator passes tests:
+
+1. Add it to `model_registry` in `autoemulate/emulators/__init__.py`
+2. Set `is_core=False` to make it available but not a default model
+
+## PyTorch emulators
+
+PyTorch emulators require special handling:
+
+1. Put the model architecture in `autoemulate/emulators/neural_networks/`
+2. Put the main emulator class in `autoemulate/emulators/`
+3. Use [skorch](https://skorch.readthedocs.io/) for scikit-learn compatibility:
+   - Create `self.model_` as a `NeuralNetRegressor` instance
+   - Pass the model architecture as its first argument
+   - Use `self.model_` in `fit` and `predict` methods
+
+See existing PyTorch emulators like `conditional_neural_process.py` for examples.
diff --git a/_sources/community/contributing.md b/_sources/community/contributing.md
@@ -0,0 +1,2 @@
+```{include} ../../CONTRIBUTING.md
+```
diff --git a/_sources/community/faq/faq-contributors.md b/_sources/community/faq/faq-contributors.md
@@ -0,0 +1,67 @@
+# First-Time Contributors' Frequently Asked Questions
+
+**TODO**
+
+## Getting Started
+
+1. How can I contribute to AutoEmulate?
+   <!-- Overview of the ways to contribute, from code to documentation, and how to get started. -->
+
+2. What are the guidelines for contributing code?
+   <!-- Information on coding standards, the pull request process, and how contributions are reviewed. -->
+
+3. How do I choose what to work on for my first contribution?
+   <!-- Guidance on identifying beginner-friendly issues, selecting tasks based on personal expertise, or areas of the project that need the most help. -->
+
+4. What coding standards and practices does AutoEmulate follow?
+   <!-- Information on coding conventions, documentation standards, and testing practices contributors should adhere to. -->
+
+5. Are there any specific development tools or environments recommended for working on AutoEmulate?
+   <!-- Suggestions for IDEs, code editors, version control systems, or other tools that facilitate development and contribute to the project. -->
+
+## Making Contributions
+
+1. How do I submit a contribution, and what is the review process?
+   <!-- Step-by-step guide on creating pull requests, what happens after submission, how contributions are reviewed, and typical timelines for feedback. -->
+
+2. Can I contribute by writing documentation or tutorials, and how?
+   <!-- Details on how to contribute to the project's documentation, tutorial creation, or translation efforts, including style guides or templates to follow. -->
+
+3. What should I do if my pull request gets rejected or needs revision?
+   <!-- Advice on how to handle feedback on contributions, including how to make requested changes and resubmit for review. -->
+
+## Technical Questions
+
+1. How is the AutoEmulate project structured?
+   <!-- An introduction to the project's architecture and where contributors can find key components. -->
+
+2. How do I set up my development environment for AutoEmulate?
+   <!-- Steps to configure a local development environment, including any necessary tools or dependencies. -->
+
+3. How do I run tests for AutoEmulate?
+   <!-- Instructions on how to execute the project's test suite to ensure changes do not introduce regressions. -->
+
+## Community and Support
+
+1. Where can I ask questions if I'm stuck?
+   <!-- Information on where to find support, such as community forums, chat channels, or mailing lists. -->
+
+2. How does AutoEmulate handle contributions related to security issues?
+   <!-- Guidelines on reporting security vulnerabilities and how they are addressed by the project. -->
+
+3. Is there a code of conduct for contributors?
+   <!-- Details on the project's code of conduct, expectations for respectful and constructive interaction, and how to report violations. -->
+
+4. How can I get involved in decision-making or project planning as a contributor?
+   <!-- Explanation of how the project governance works, ways to participate in project roadmap discussions, and opportunities for contributors to influence development priorities. -->
+
+## Beyond Code Contributions
+
+1. Can I contribute without coding, for example, through design, marketing, or community management?
+   <!-- Overview of non-code contribution opportunities, including outreach efforts, event organisation, or community moderation. -->
+
+2. How does the project recognise or reward contributions?
+   <!-- Information on acknowledgment of contributions through all-contributors. -->
+
+3. Are there regular meetings or forums where contributors can discuss the project?
+   <!-- Schedule and formats of any regular contributor meetings, forums for discussion, or channels for real-time communication among contributors. -->
diff --git a/_sources/community/faq/faq-users.md b/_sources/community/faq/faq-users.md
@@ -0,0 +1,86 @@
+# First-Time Users' Frequently Asked Questions
+
+## General Questions
+
+1. What is `AutoEmulate`?
+   <!-- A brief description of what the package does, its main features, and its intended use case. -->
+   - A Python package that makes it easy to build emulators for complex simulations. It takes a set of simulation inputs `X` and outputs `y`, and automatically fits, optimises and evaluates various machine learning models to find the best emulator model. The emulator model can then be used as a drop-in replacement for the simulation, but will be much faster and computationally cheaper to evaluate. 
+
+2. How do I install `AutoEmulate`?
+   <!-- Step-by-step instructions on installing the package, including any dependencies that might be required. -->
+   - See the [installation guide](../../getting-started/installation.md) for detailed instructions.
+
+3. What are the prerequisites for using `AutoEmulate`?
+   <!-- Information on the knowledge or data required to effectively use AutoEmulate, such as familiarity with Python, machine learning concepts, or specific data formats. -->
+   - `AutoEmulate` is designed to be easy to use. You first need to generate a dataset of simulation inputs `X` and outputs `y`; ideally, you also have a basic understanding of Python and machine learning concepts.
+
+## Usage Questions
+
+1. How do I start using `AutoEmulate` with my simulation?
+   <!-- A simple example to get a new user started, possibly pointing to more detailed tutorials or documentation. -->
+   - See the [getting started guide](../../getting-started/quickstart.ipynb) or a more [in-depth tutorial](../../tutorials/01_start.ipynb).
+
+2. What kind of data does `AutoEmulate` need to build an emulator?
+   <!-- Clarification on the types of datasets suitable for analysis, including data formats and recommended data sizes. -->
+
+   - `AutoEmulate` takes simulation inputs `X` and simulation outputs `y` to build an emulator. `X` is an ndarray of shape `(n_samples, n_parameters)` and `y` is an ndarray of shape `(n_samples, n_outputs)`. Each sample is one simulation run, so each row of `X` holds a set of input parameters and the same row of `y` holds the resulting simulation output. Currently, all inputs and outputs must be numeric, and missing data is not supported.
+
+   - All models work with multi-output data. We have optimised `AutoEmulate` to work with smaller datasets (in the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended.
+
+3. How do I interpret the results from `AutoEmulate`?
+   <!-- Guidance on understanding the output of the software, including any metrics or visualizations it produces. -->
+   - See the [tutorial](../../tutorials/01_start.ipynb) for an example of how to interpret the results from `AutoEmulate`. Briefly, `X` and `y` are first split into training and test sets. Cross-validation and/or hyperparameter optimisation are performed on the training data. After comparing the results from different emulators, the user can evaluate the chosen emulator on the test set with `AutoEmulate.evaluate_model()` and plot test set predictions with `AutoEmulate.plot_model()`. See the [autoemulate.compare](../../reference/compare.rst) module for details.
+
+   - An important thing to note is that the emulator can only be as good as the data it was trained on. Therefore, the experimental design (on which points the simulation was evaluated) is key to obtaining a good emulator.
+
+4. Can I use `AutoEmulate` for commercial purposes?
+   <!-- Information on licensing and any restrictions on use. -->
+   - Yes. It's licensed under the MIT license, which allows for commercial use. See the [license](../../../LICENSE) for more information.
+
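The data requirements described in this section (2-D `X` and `y` with matching rows, numeric values, no missing data) can be checked before building an emulator. The helper below is a hypothetical sketch using only NumPy; `check_emulator_data` is not part of AutoEmulate, and reshaping a 1-D `y` into a single output column is a convenience assumed here.

```python
import numpy as np


def check_emulator_data(X, y):
    """Validate simulation inputs/outputs and return them as 2-D arrays."""
    X = np.asarray(X)
    y = np.asarray(y)
    if y.ndim == 1:
        # Assumption: treat a 1-D y as a single-output problem.
        y = y.reshape(-1, 1)
    if X.ndim != 2 or y.ndim != 2:
        raise ValueError("X and y must be 2-D: (n_samples, n_parameters/n_outputs)")
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must have one row per simulation run")
    for name, arr in (("X", X), ("y", y)):
        if not np.issubdtype(arr.dtype, np.number):
            raise ValueError(f"{name} must be numeric")
        if np.isnan(arr).any():
            raise ValueError(f"{name} contains missing values")
    return X, y


rng = np.random.default_rng(0)
X_ok, y_ok = check_emulator_data(rng.uniform(size=(5, 2)), rng.uniform(size=5))
```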
+## Advanced Usage
+
+1. Does AutoEmulate support parallel processing or high-performance computing (HPC) environments?
+   <!-- Details on the software's capabilities to leverage multi-threading, distributed computing, or HPC resources to speed up computations. -->
+   - Yes, [AutoEmulate.setup()](../../reference/compare.rst) has an `n_jobs` parameter that lets you parallelise cross-validation and hyperparameter optimisation.
+
+2. Can AutoEmulate be integrated with other data analysis or simulation tools?
+   <!-- Information on APIs, file formats, or protocols that facilitate the integration of AutoEmulate with other software ecosystems. -->
+   - `AutoEmulate` takes simple `X` and `y` ndarrays as input, and returns emulator models that can be saved and loaded with `joblib`. All emulators are written as scikit-learn estimators, so they can be used like any other scikit-learn model in a pipeline.
+
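To illustrate the save/load round trip, the sketch below fits a scikit-learn pipeline, writes it with `joblib.dump`, and reads it back with `joblib.load`. A `RandomForestRegressor` stands in for a fitted emulator here; this is an assumption for the sake of a self-contained example, and a model returned by AutoEmulate would take its place in practice.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 50 runs, 2 parameters, 2 outputs.
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
y = np.column_stack([X.sum(axis=1), X[:, 0] * X[:, 1]])

# Stand-in model: scaling followed by a multi-output regressor.
model = make_pipeline(
    StandardScaler(),
    RandomForestRegressor(n_estimators=20, random_state=0),
)
model.fit(X, y)

# Save to a temporary file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emulator.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

same_predictions = np.allclose(model.predict(X), restored.predict(X))
```

As with any pickled scikit-learn object, the file is best loaded with the same library versions that wrote it.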
+## Data Handling
+
+1. What are the best practices for data preprocessing before using `AutoEmulate`?
+   <!-- Tips and recommendations on preparing data, including normalisation, dealing with missing values, or data segmentation. -->
+   - The user will typically run their simulation on a selected set of input parameters (the experimental design), chosen with a Latin hypercube or another sampling method. `AutoEmulate` currently needs all inputs to be numeric and doesn't support missing data. By default, `AutoEmulate` scales the input data to zero mean and unit variance, and `setup()` has an option for dimensionality reduction.
+
+2. How does AutoEmulate handle large datasets?
+   <!-- Advice on managing large-scale data analyses, potential memory management features, or ways to streamline processing. -->
+   - `AutoEmulate` is optimised to work with smaller datasets (in the order of hundreds to thousands of samples). Training emulators with large datasets (hundreds of thousands of samples) may currently require a long time and is not recommended. Emulators are created because it's expensive to evaluate the simulation, so we expect most users to have a relatively small dataset.
+
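The two preprocessing ideas mentioned above, a Latin hypercube design and scaling to zero mean and unit variance, can be sketched in plain NumPy. Both functions are hypothetical illustrations and not AutoEmulate utilities; the parameter bounds are made up, and the scaling is already applied by the package by default, so it is shown only to make the operation concrete.

```python
import numpy as np


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw a Latin hypercube design; bounds is a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    n_parameters = bounds.shape[0]
    # One point per stratum: stratum index plus uniform jitter, scaled to [0, 1).
    jitter = rng.uniform(size=(n_samples, n_parameters))
    unit = (np.arange(n_samples)[:, None] + jitter) / n_samples
    # Shuffle the strata independently for each parameter.
    for j in range(n_parameters):
        unit[:, j] = rng.permutation(unit[:, j])
    low, high = bounds[:, 0], bounds[:, 1]
    return low + unit * (high - low)


def standardise(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


# Hypothetical bounds for two simulation parameters.
design = latin_hypercube(20, [(0.1, 0.9), (10.0, 50.0)], seed=0)
scaled = standardise(design)
```

Each column of `design` contains exactly one point in each of the 20 equal-width strata of its range, which is the property that distinguishes a Latin hypercube from plain uniform sampling.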
+## Troubleshooting
+
+1. What common issues might I encounter when using `AutoEmulate`, and how can I solve them?
+   <!-- A list of frequently encountered problems with suggested solutions, possibly linked to a more extensive troubleshooting guide. -->
+   - `AutoEmulate.setup()` has a `log_to_file` option to log all warnings/errors to a file. It also has a `verbose` option to print more information to the console. If you encounter an error, please open an issue (see below).
+
+2. How can I report a bug or request a feature in `AutoEmulate`?
+   <!-- Instructions on the proper channels for reporting issues or suggesting enhancements, including any templates or information to include. -->
+   - You can report a bug or request a new feature through the [issue templates](https://github.com/alan-turing-institute/autoemulate/issues/new/choose) in our GitHub repository. Head on over there and choose one of the templates for your purpose and get started.
+
+## Community and Learning Resources
+
+1. Are there any community projects or collaborations using `AutoEmulate` I can join or learn from?
+   <!-- Information on community-led projects, study groups, or collaborative research initiatives involving AutoEmulate. -->
+   - Reach out to Martin ([email](mailto:[email protected])) or Kalle ([email](mailto:[email protected])) for more information.
+
+2. Where can I find tutorials or case studies on using `AutoEmulate`?
+   <!-- Directions to comprehensive learning materials, such as video tutorials (if we want to record that), written guides, or published research papers using AutoEmulate. -->
+   - See the [tutorial](../../tutorials/01_start.ipynb) for a comprehensive guide on using the package.
+
+3. How can I stay updated on new releases or updates to AutoEmulate?
+   <!-- Guidance on subscribing to newsletters when/if we will have that, community calls if we start that, following the project on social media if we want to create those platforms, or joining community forums/Slack once we have that ready... -->
+   - Watch the [AutoEmulate repository](https://github.com/alan-turing-institute/autoemulate).
+
+4. What support options are available if I need help with AutoEmulate?
+   <!-- Overview of support resources, including documentation, community forums/Slack when we have that ready... -->
+   - Please open an issue or contact the maintainer ([email](mailto:[email protected])) directly.
diff --git a/_sources/community/faq/index.md b/_sources/community/faq/index.md
@@ -0,0 +1,30 @@
+# Frequently Asked Questions
+
+Welcome to the AutoEmulate Frequently Asked Questions (FAQ) page! This document is designed to assist both first-time users who are diving into AutoEmulate for their projects and new contributors looking to support the development of this innovative software. Whether you are exploring how to set up and run your first simulation, troubleshoot common issues, or you're interested in enhancing AutoEmulate with your contributions, this FAQ is here to provide you with quick and clear answers.
+
+Our goal is to make your experience with AutoEmulate as smooth and rewarding as possible. To facilitate this, we've compiled a list of questions and answers based on common queries we receive from our community. These are divided into sections tailored to users and contributors, alongside general information about AutoEmulate, to help you find the answers you need efficiently.
+
+If you're unsure where to start, we recommend using the search function (Ctrl+F or Command+F on most browsers) to quickly navigate to relevant topics. Additionally, each section is designed to be self-contained, allowing you to jump directly to the information that interests you the most.
+
+Remember, this FAQ is a collaborative effort, reflecting the inquiries and collective knowledge of our vibrant community. We welcome feedback and suggestions for new questions that you believe should be included. Together, we can make this resource even more valuable for everyone involved with AutoEmulate.
+
+::::{grid} 1 1 2 2
+:gutter: 2
+
+:::{grid-item-card}
+:link: faq-users
+:link-type: doc
+**Users**.
+
+Are you a new user of AutoEmulate? Check out the frequently asked questions here.
+:::
+
+:::{grid-item-card}
+:link: faq-contributors
+:link-type: doc
+**Contributors**.
+
+Are you a new contributor to AutoEmulate's codebase? Check out the frequently asked questions here.
+:::
+
+::::