SoundSpaces 2.0 builds on the RLRAudioPropagation library and Habitat-Sim to provide audio-visual renderings. RLRAudioPropagation is a bidirectional ray-tracing-based audio simulator. Given a source location, a listener location, the scene geometry (3D mesh), audio materials, and a set of parameters (described below), it simulates how sound waves travel from the source to the listener. The output of this simulation is an impulse response at the listener.
The C++ implementation is exposed to Python users via pybind11. This document explains the various Python APIs, structs, and enums. Also see the relevant Habitat-Sim Python API doc pages.
Please refer to the installation doc for information on how to build SoundSpaces 2.0.
If you find this work useful in your research, please cite the following paper:
@article{chen22soundspaces2,
title = {SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning},
author = {Changan Chen and Carl Schissler and Sanchit Garg and Philip Kobernik and Alexander Clegg and Paul Calamia and Dhruv Batra and Philip W Robinson and Kristen Grauman},
journal = {NeurIPS 2022 Datasets and Benchmarks Track},
year = {2022}
}
- Acoustics configuration - RLRAudioPropagationConfiguration()
- Channel layout - RLRAudioPropagationChannelLayout()
- Audio sensor specs - AudioSensorSpec()
- APIs
- Steps to run audio simulation in python
RLRAudioPropagationConfiguration() exposes the configuration options that can be used to customize the audio simulation. This section describes the available settings, including their data types and default values.
Config name | Data Type | Default Value | Usage |
---|---|---|---|
sampleRate | int | 44100 | Sample rate for the simulated audio |
frequencyBands | int | 4 | Number of frequency bands in the audio simulation |
directSHOrder | int | 3 | The spherical harmonic order used for calculating direct sound spatialization for non-point sources (those with non-zero radii). It is not recommended to go above order 9. |
indirectSHOrder | int | 1 | The spherical harmonic order used for calculating the spatialization of indirect sound (reflections, reverb). It is not recommended to go above order 5. Increasing this value requires more rays to be traced for the results to converge properly, and uses substantially more memory (scales quadratically). |
threadCount | int | 1 | Number of CPU threads the simulation will use |
updateDt | float | 0.02f | Simulation time step |
irTime | float | 4.f | Maximum render time budget for the audio simulation |
unitScale | float | 1.f | Unit scale for the scene. Mesh and positions are multiplied by this factor |
globalVolume | float | 0.25f | Total initial pressure value |
indirectRayCount | int | 5000 | Number of indirect rays that the ray tracer will use |
indirectRayDepth | int | 200 | Maximum depth of each indirect ray cast by the ray tracer |
sourceRayCount | int | 200 | Number of direct rays that the ray tracer will use |
sourceRayDepth | int | 10 | Maximum depth of direct rays cast by the ray tracer |
maxDiffractionOrder | int | 10 | The maximum number of edge diffraction events that can occur between a source and listener. This value cannot exceed 10 (compile-time limit). |
direct | bool | true | Enable contribution from the direct rays |
indirect | bool | true | Enable contribution from the indirect rays |
diffraction | bool | true | Enable diffraction for the simulation |
transmission | bool | true | Enable transmission of rays |
meshSimplification | bool | false | Uses a series of mesh simplification operations to reduce the mesh complexity for ray tracing. Vertex welding is applied, followed by simplification using the edge collapse algorithm. |
temporalCoherence | bool | false | Turn on/off temporal smoothing of the impulse response. This uses the impulse response from the previous simulation time step as a starting point for the next time step. This reduces the number of rays required by about a factor of 10, resulting in faster simulations, but should not be used if the motion of sources/listeners is not continuous. |
dumpWaveFiles | bool | false | Write the wave files for the different frequency bands. Files are written to the AudioSensorSpec's outputDirectory |
enableMaterials | bool | true | Enable audio materials |
writeIrToFile | bool | false | Write the final impulse response to a file. The file is written to the AudioSensorSpec's outputDirectory |
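As a rough illustration, the snippet below constructs a configuration object and overrides a few of the defaults listed above. This is a minimal sketch: the `habitat_sim.sensor` namespace for the binding is an assumption, so adjust the import path to match your build; the field names follow the table above.

```python
import habitat_sim.sensor  # assumed location of the RLRAudioPropagation bindings

# Minimal sketch: build an acoustics configuration and override a few defaults.
acoustics_config = habitat_sim.sensor.RLRAudioPropagationConfiguration()
acoustics_config.sampleRate = 48000        # render IRs at 48 kHz instead of 44.1 kHz
acoustics_config.threadCount = 4           # use 4 CPU threads for ray tracing
acoustics_config.indirectRayCount = 10000  # more rays -> less Monte-Carlo noise, slower simulation
acoustics_config.enableMaterials = True    # use per-surface audio material properties
```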
This section describes the channel layout struct, which defines what the output will look like.
Config name | Data Type | Default Value | Usage |
---|---|---|---|
channelType | enum | RLRAudioPropagationChannelLayoutType.Binaural | Channel type for the simulated audio |
channelCount | int | 2 | Number of output channels in simulated audio |
The channel layout describes how the audio output will be experienced by the listener. Let's look at the channel layout types that are currently supported.
Enum | Usage |
---|---|
Unknown | Unknown channel layout type |
Mono | Monaural channel layout that does not have any spatial information. This layout usually has 1 channel |
Binaural | Channel layout with 2 channels that spatializes audio using an HRTF |
Ambisonics | Channel layout that encodes fully spherical spatial audio as a set of spherical harmonic basis function coefficients |
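For example, a binaural layout can be requested as sketched below. As before, the `habitat_sim.sensor` namespace is an assumption; the field names follow the channel layout table above.

```python
import habitat_sim.sensor  # assumed location of the RLRAudioPropagation bindings

# Minimal sketch: request a 2-channel binaural output spatialized with an HRTF.
channel_layout = habitat_sim.sensor.RLRAudioPropagationChannelLayout()
channel_layout.channelType = habitat_sim.sensor.RLRAudioPropagationChannelLayoutType.Binaural
channel_layout.channelCount = 2  # one channel per ear
```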
The audio sensor is implemented in C++ and exposed to Python via pybind11. Import the following to get access to the audio sensor:
python imports |
---|
import habitat_sim |
import habitat_sim.sim |
The acoustic sensor spec is part of habitat_sim.
struct/enum in habitat_sim | notes |
---|---|
habitat_sim.AudioSensorSpec() | acoustic sensor spec |
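A sketch of creating the spec and attaching the acoustics configuration and channel layout built in the previous sections. The `acousticsConfig` and `channelLayout` attribute names, and the assumption that they can be assigned directly, are inferred from the steps later in this document; if direct assignment is not supported by the bindings, set the nested fields in place instead (as in the condensed sketch at the end).

```python
import habitat_sim

# Minimal sketch: create the audio sensor spec and wire in the config objects built above.
# The acousticsConfig / channelLayout attribute names are assumptions.
audio_sensor_spec = habitat_sim.AudioSensorSpec()
audio_sensor_spec.uuid = "audio_sensor"               # identifier used to look up the sensor later
audio_sensor_spec.acousticsConfig = acoustics_config  # RLRAudioPropagationConfiguration from above
audio_sensor_spec.channelLayout = channel_layout      # RLRAudioPropagationChannelLayout from above
```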
To call APIs on the audio sensor, get access to the audio sensor object using the uuid.
APIs for audio_sensor | notes |
---|---|
audio_sensor = sim.get_agent(0)._sensors["audio_sensor"] | get the audio sensor object from the habitat sim object |
audio_sensor.setAudioSourceTransform(np.array([x, y, z])) | set the audio source location where x,y,z are floats |
audio_sensor.reset() | Reset the simulation object to restart from a fresh context. This is the same as deleting the audio sensor and re-creating it. |
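For example, once the audio sensor spec has been added with sim.add_sensor (see the Simulator APIs below) and `sim` is an existing Simulator instance, the sensor can be fetched and the source placed as in this sketch (the coordinates are purely illustrative):

```python
import numpy as np

# Minimal sketch: fetch the audio sensor by its uuid and place the sound source.
audio_sensor = sim.get_agent(0)._sensors["audio_sensor"]
audio_sensor.setAudioSourceTransform(np.array([2.0, 1.5, -1.0]))  # example x, y, z position
```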
Relevant APIs on the Simulator object
APIs for the Habitat Simulator object | notes |
---|---|
sim.add_sensor(audio_sensor_spec) | Add the audio sensor. This is similar to adding any other sensors |
obs = sim.get_sensor_observations()["audio_sensor"] | Get the impulse response. obs is an n-d array where n is the channel count |
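Putting the two calls together, a sketch of adding the sensor and rendering one impulse response (`sim` and `audio_sensor_spec` are assumed to exist from the earlier sections):

```python
# Minimal sketch: add the audio sensor and render one impulse response.
sim.add_sensor(audio_sensor_spec)
ir = sim.get_sensor_observations()["audio_sensor"]
# For a Binaural layout, ir has one row per channel, e.g. shape (2, num_samples).
print(ir.shape)
```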
Please see the Jupyter notebook for an example of how to use the Python audio sensor. Follow these steps and refer to the Python script for the code; a condensed sketch also follows the list below.
- Create the habitat sim object and configuration.
- Create the AudioSensorSpec().
- Set the acoustic configuration (RLRAudioPropagationConfiguration) object. Set the various simulation parameters.
- Set the channel layout (RLRAudioPropagationChannelLayout).
- Add the audio sensor spec to the simulation. This will create the C++ AudioSensor object.
- Get the audio_sensor object from the list of sensors on the agent. The identifier is set under AudioSensorSpec -> uuid config.
- Set the location of the audio source by calling audio_sensor.setAudioSourceTransform.
- Run the simulation step and get the audio sensor output via sim.get_sensor_observations()["audio_sensor"], using the uuid defined earlier. The output is an n-d array of floats where n is the channel count defined in RLRAudioPropagationChannelLayout.
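The following is a condensed sketch of these steps, not a verbatim copy of the reference script. The scene path is hypothetical, and the `acousticsConfig`/`channelLayout` attribute names, the `channelType` field, and the `habitat_sim.sensor` namespace are assumptions; adapt them to your installation.

```python
import numpy as np
import habitat_sim
import habitat_sim.sim

# 1. Create the simulator configuration and the Simulator (standard Habitat-Sim setup).
backend_cfg = habitat_sim.SimulatorConfiguration()
backend_cfg.scene_id = "path/to/scene.glb"  # hypothetical scene path
agent_cfg = habitat_sim.agent.AgentConfiguration()
sim = habitat_sim.Simulator(habitat_sim.Configuration(backend_cfg, [agent_cfg]))

# 2-4. Create the audio sensor spec and set the acoustics config and channel layout.
audio_sensor_spec = habitat_sim.AudioSensorSpec()
audio_sensor_spec.uuid = "audio_sensor"
audio_sensor_spec.acousticsConfig.sampleRate = 48000
audio_sensor_spec.channelLayout.channelType = (
    habitat_sim.sensor.RLRAudioPropagationChannelLayoutType.Binaural
)
audio_sensor_spec.channelLayout.channelCount = 2

# 5. Add the audio sensor spec to the simulation (creates the C++ AudioSensor object).
sim.add_sensor(audio_sensor_spec)

# 6-7. Get the sensor object by uuid and place the sound source (example coordinates).
audio_sensor = sim.get_agent(0)._sensors["audio_sensor"]
audio_sensor.setAudioSourceTransform(np.array([2.0, 1.5, -1.0]))

# 8. Render and fetch the impulse response: an n-d array with one row per channel.
ir = sim.get_sensor_observations()["audio_sensor"]
```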
- The acoustic simulation is based on Monte-Carlo path tracing, so it is stochastic and the results vary slightly between runs. Accuracy also depends on the number of rays traced: increasing the ray count improves accuracy but increases the simulation time.
- By default, the sensor sits at ground height. If you want to place the sound source at a certain height, you'll need to set the height on the Y axis, as sketched below.
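A minimal sketch of lifting a floor-level source position (the coordinates and the 1.5 m offset are purely illustrative):

```python
import numpy as np

# Minimal sketch: offset a floor-level point (y = 0) up the Y axis so the
# source sits roughly at head height before passing it to the sensor.
source_pos = np.array([2.0, 0.0, -1.0])  # example point on the floor
audio_sensor.setAudioSourceTransform(source_pos + np.array([0.0, 1.5, 0.0]))
```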