# Significant and Breaking Changes

This document describes significant changes to ActivitySim. This includes
major new features that may require modifications to existing model configurations
or code to utilize, as well as breaking changes that may cause existing model
configurations or code to fail to run correctly.

## v1.3

### Logging

The reading of YAML configuration files has been modified to use the "safe" reader,
which prohibits the use of arbitrary Python code in configuration files. This is a
security enhancement, but it requires some changes to the way logging is configured.

In previous versions, the logging configuration file could contain Python code to
place log files in various subdirectories of the output directory, which might
vary for different subprocesses of the model, like this:

```yaml
logging:
  handlers:
    logfile:
      class: logging.FileHandler
      filename: !!python/object/apply:activitysim.core.config.log_file_path ['activitysim.log']
      mode: w
      formatter: fileFormatter
      level: NOTSET
```

In the new version, the use of `!!python/object/apply` is prohibited. Instead of using
this directive, the `log_file_path` function can be invoked in the configuration file
by using the `get_log_file_path` key, like this:

```yaml
logging:
  handlers:
    logfile:
      class: logging.FileHandler
      filename:
        get_log_file_path: activitysim.log
      mode: w
      formatter: fileFormatter
      level: NOTSET
```

Similarly, previous use of the `if_sub_task` directive in the logging level
configuration like this:

```yaml
logging:
  handlers:
    console:
      class: logging.StreamHandler
      stream: ext://sys.stdout
      level: !!python/object/apply:activitysim.core.mp_tasks.if_sub_task [WARNING, NOTSET]
      formatter: elapsedFormatter
```

can be replaced with the `if_sub_task` and `if_not_sub_task` keys, like this:

```yaml
logging:
  handlers:
    console:
      class: logging.StreamHandler
      stream: ext://sys.stdout
      level:
        if_sub_task: WARNING
        if_not_sub_task: NOTSET
      formatter: elapsedFormatter
```

For more details, see [logging](Logging).

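To illustrate how key-based directives such as `get_log_file_path`, `if_sub_task`, and `if_not_sub_task` can stand in for executable YAML tags, here is a minimal sketch. It is not ActivitySim's actual implementation: the function `resolve_directives` and its `output_dir` and `is_sub_task` parameters are hypothetical names introduced here, and joining the file name onto a single output directory is a simplifying assumption.

```python
from pathlib import Path


def resolve_directives(node, output_dir, is_sub_task):
    """Replace directive mappings in a parsed config with plain values.

    Hypothetical helper for illustration; ActivitySim's own resolution
    logic may differ (for example, in how subprocess log paths are built).
    """
    if isinstance(node, dict):
        keys = set(node)
        if keys == {"get_log_file_path"}:
            # Assumption: log files live directly under the output directory.
            return str(Path(output_dir) / node["get_log_file_path"])
        if keys and keys <= {"if_sub_task", "if_not_sub_task"}:
            # Pick the branch that matches the kind of process we are in.
            chosen = "if_sub_task" if is_sub_task else "if_not_sub_task"
            return node.get(chosen)
        # Ordinary mapping: resolve each value recursively.
        return {
            key: resolve_directives(value, output_dir, is_sub_task)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [resolve_directives(item, output_dir, is_sub_task) for item in node]
    return node


# The kind of dictionary a safe YAML reader produces for the examples above.
parsed = {
    "logging": {
        "handlers": {
            "logfile": {
                "class": "logging.FileHandler",
                "filename": {"get_log_file_path": "activitysim.log"},
            },
            "console": {
                "class": "logging.StreamHandler",
                "level": {"if_sub_task": "WARNING", "if_not_sub_task": "NOTSET"},
            },
        }
    }
}

resolved = resolve_directives(parsed, output_dir="output", is_sub_task=True)
```

Because the directives are ordinary mappings, the configuration file contains only data, and the decision about what each key means stays in Python code.
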
### Chunking

Version 1.3 introduces a new "[explicit](Explicit-Chunking)" chunking mechanism.

Explicit chunking is simpler to use and understand than dynamic chunking, and in
practice has been found to be more robust and reliable. It requires no "training"
and is activated in the top level model configuration file (typically `settings.yaml`):

```yaml
chunk_training_mode: explicit
```

Then, for model components that may stress the memory limits of the machine,
the user can specify the number of choosers in each chunk explicitly, either as an integer
number of choosers per chunk, or as a fraction of the overall number of choosers.
This is done by setting the `explicit_chunk` configuration setting in the model
component's settings. If the `explicit_chunk` value is 0 or missing, then no
chunking is applied for that component. The `explicit_chunk` values are ignored
if the `chunk_training_mode` is not set to `explicit`.
Refer to each model component's configuration documentation for details.

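The effect of an `explicit_chunk` value can be sketched in a few lines of Python. This is an illustration only: `chunk_bounds` is a hypothetical helper, and treating values below 1 as fractions and values of 1 or more as row counts is an assumption about how the two forms are told apart.

```python
import math


def chunk_bounds(n_choosers, explicit_chunk=0):
    """Return (start, stop) row ranges for each chunk of choosers."""
    if n_choosers <= 0:
        return []
    if not explicit_chunk or explicit_chunk <= 0:
        # 0 or missing: process all choosers in a single chunk.
        size = n_choosers
    elif explicit_chunk < 1:
        # Assumed: a fraction of the overall number of choosers.
        size = max(1, math.ceil(n_choosers * explicit_chunk))
    else:
        # Assumed: an absolute number of choosers per chunk.
        size = int(explicit_chunk)
    return [
        (start, min(start + size, n_choosers))
        for start in range(0, n_choosers, size)
    ]
```

Under these assumptions, `chunk_bounds(10, 4)` yields three chunks (two of four rows and one of two), while `chunk_bounds(10)` yields a single chunk covering all ten rows.
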
Explicit chunking is implemented for accessibility in
[PR #759](https://github.com/ActivitySim/activitysim/pull/759); for
vehicle type choice, non-mandatory tour frequency, school escorting, and
joint tour frequency in [PR #804](https://github.com/ActivitySim/activitysim/pull/804);
and for all remaining interaction-simulate components in
[PR #870](https://github.com/ActivitySim/activitysim/pull/870).

### Automatic dropping of unused columns

Variables that are not used in a model component are now automatically dropped
from the chooser table before the component is run. Whether a variable is
"used" is determined by a text search of the model component code and specification
files for the variable name. Dropping unused columns can be disabled by setting
[`drop_unused_columns`](activitysim.core.configuration.base.ComputeSettings.drop_unused_columns)
to `False` in the [`compute_settings`](activitysim.core.configuration.base.ComputeSettings)
for any model component, but by default this setting is `True`, as it can result in a
significant reduction in memory usage for large models.

Dropping columns may also cause problems if the model is not correctly configured.
If some required columns are being dropped incorrectly, the user can keep this
feature enabled and list those columns in the
[`protect_columns`](activitysim.core.configuration.base.ComputeSettings.protect_columns)
setting under [`compute_settings`](activitysim.core.configuration.base.ComputeSettings).
Protected columns are never dropped, even if they are not apparently used in the
model component.

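The text-search idea, and the reason a protection list is sometimes needed, can be sketched as follows. This is a simplified illustration rather than ActivitySim's code: `columns_to_keep` is a hypothetical function, the whole-identifier matching rule is an assumption of the sketch, and the column names and specification text are made up.

```python
import re


def columns_to_keep(columns, search_text, protect_columns=()):
    """Return the columns that appear in `search_text` or are protected."""
    protected = set(protect_columns)
    kept = []
    for column in columns:
        # Match the name only as a whole identifier, so that `age` is not
        # "found" inside `util_age` (an assumption of this sketch).
        pattern = rf"(?<![A-Za-z0-9_]){re.escape(column)}(?![A-Za-z0-9_])"
        if column in protected or re.search(pattern, search_text):
            kept.append(column)
    return kept


# Made-up specification text and chooser columns, for illustration only.
spec_text = """
util_age,@df.age > 65,coef_age
util_income,df.income / 1000,coef_income
"""
columns = ["age", "income", "auto_ownership", "home_zone_id"]

kept_default = columns_to_keep(columns, spec_text)
kept_protected = columns_to_keep(columns, spec_text, protect_columns=["home_zone_id"])
```

Here `home_zone_id` never appears in the searched text, so a pure text search would drop it; naming it in the protection list keeps it available to code that refers to it indirectly.
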
Code updates to drop unused columns are in
[PR #833](https://github.com/ActivitySim/activitysim/pull/833) and to protect
columns in [PR #871](https://github.com/ActivitySim/activitysim/pull/871).

### Automatic conversion of string data to categorical

Version 1.3 introduces a new feature that automatically converts string data
to categorical data. This reduces memory usage and speeds up processing for
large models. The conversion is done automatically for string columns
in most chooser tables.

See code updates in [PR #782](https://github.com/ActivitySim/activitysim/pull/782)
and [PR #863](https://github.com/ActivitySim/activitysim/pull/863).

### Alternatives preprocessors for trip destination

An alternatives preprocessor for trip destination was added in
[PR #865](https://github.com/ActivitySim/activitysim/pull/865),
and then split into separate preprocessors for sample (at the TAZ level) and
simulate (at the MAZ level for 2 zone systems) in
[PR #869](https://github.com/ActivitySim/activitysim/pull/869).

### Per-component sharrow controls

This version adds a uniform interface for controlling sharrow optimizations
at the component level. This allows users to disable sharrow entirely,
or to disable the "fastmath" optimization for individual components.
Controls for sharrow are set in each component's settings under `compute_settings`.
For example, to disable sharrow entirely for a component, use:

```yaml
compute_settings:
  sharrow_skip: true
```

This overrides the global sharrow setting, and is useful if you want to skip
sharrow for particular components, either because their specifications are
not compatible with sharrow or because sharrow is known to perform poorly
on them.

When a component has multiple subcomponents, the `sharrow_skip` setting can be
a dictionary that maps the names of the subcomponents to boolean values.
For example, in the school escorting component, to skip sharrow for the
OUTBOUND and OUTBOUND_COND subcomponents but not the INBOUND subcomponent,
use the following settings:

```yaml
compute_settings:
  sharrow_skip:
    OUTBOUND: true
    INBOUND: false
    OUTBOUND_COND: true
```

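Since `sharrow_skip` may be either a single boolean or a mapping keyed by subcomponent name, code that consumes it has to handle both shapes. A minimal sketch of that lookup is shown here; `should_skip_sharrow` is a hypothetical function, and treating subcomponents that are missing from the mapping as "not skipped" is an assumption.

```python
def should_skip_sharrow(sharrow_skip, subcomponent=None):
    """Decide whether sharrow is skipped for a component or subcomponent."""
    if isinstance(sharrow_skip, dict):
        # Assumption: subcomponents that are not listed are not skipped.
        return bool(sharrow_skip.get(subcomponent, False))
    # A plain boolean applies to the whole component.
    return bool(sharrow_skip)


# The school escorting mapping from the example above.
school_escorting = {"OUTBOUND": True, "INBOUND": False, "OUTBOUND_COND": True}
```

With this mapping, the lookup skips sharrow for `OUTBOUND` and `OUTBOUND_COND` and leaves it enabled for `INBOUND`.
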
The `compute_settings` can also be used to disable the "fastmath" optimization.
This is useful if the component is known to have numerical stability issues
with the fastmath optimization enabled, usually when the component potentially
works with data that includes `NaN` or `Inf` values. To disable fastmath for
a component, use:

```yaml
compute_settings:
  fastmath: false
```

Code updates that apply these settings are in
[PR #824](https://github.com/ActivitySim/activitysim/pull/824).

### Configuration validation

Version 1.3 adds a configuration validation system using the Pydantic library.
Previously, the YAML-based configuration files were allowed to contain arbitrary
keys and values, which could lead to errors if the configuration was not correctly
specified. The new validation system checks the configuration files for correctness,
and provides useful error messages if the configuration is invalid. Invalid
conditions include missing required keys, incorrect data types, and the presence
of unexpected keys. Existing models may need to be cleaned up (i.e. extraneous settings
in config files removed) to conform to the new validation system.

See [PR #758](https://github.com/ActivitySim/activitysim/pull/758) for code updates.

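The three kinds of invalid conditions named above can be illustrated with a dependency-free sketch. This is not how ActivitySim validates settings (that is done with Pydantic models); `check_settings`, the layout of `schema`, and the setting names in the example are all hypothetical.

```python
def check_settings(raw, schema):
    """Return a list of problems found in a parsed settings mapping.

    `schema` maps each allowed key to a (type, required) pair.
    """
    problems = []
    for key, (expected_type, required) in schema.items():
        if key not in raw:
            if required:
                problems.append(f"missing required key: {key}")
        elif not isinstance(raw[key], expected_type):
            problems.append(
                f"wrong type for {key}: expected {expected_type.__name__}, "
                f"got {type(raw[key]).__name__}"
            )
    # Anything not named in the schema is reported, not silently ignored.
    for key in raw:
        if key not in schema:
            problems.append(f"unexpected key: {key}")
    return problems


# Hypothetical schema and settings, for illustration only.
schema = {"spec": (str, True), "sample_size": (int, False)}
settings = {"sample_size": "100", "sampel_rate": 0.5}
problems = check_settings(settings, schema)
```

The misspelled `sampel_rate` key is the kind of extraneous entry that was tolerated when arbitrary keys were allowed and that strict validation now reports.
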
### Input checker

Version 1.3 adds an input checker that verifies that the input data is consistent
with expectations. This tool can help identify problems with the input data before
the model is run, and can be used to ensure that the input data is correctly
formatted and complete.

See [PR #753](https://github.com/ActivitySim/activitysim/pull/753) for code updates.