Single environment performance on GPU worse than on CPU #280

nico-bohlinger · 2022-10-26T16:32:41Z

nico-bohlinger
Oct 26, 2022

I want to use a single(!) ant environment for my project and checked the steps per seconds with this script:

import time
import functools
import gym
from brax import envs
import jax

jax.config.update("jax_platform_name", "cpu")

entry_point = functools.partial(envs.create_gym_env, env_name='ant')
gym.register('brax-ant-v0', entry_point=entry_point)


nr_envs = 1
nr_steps = 1000
total_steps = nr_envs * nr_steps
gym_env = gym.make("brax-ant-v0", batch_size=nr_envs)

obs = gym_env.reset()

key = jax.random.PRNGKey(42)

key, subkey = jax.random.split(key)
jax.random.uniform(subkey, gym_env.action_space.shape)

jit_env_step = jax.jit(gym_env.step)
obs, reward, done, info = jit_env_step(gym_env.action_space.sample())

before = time.time()

for _ in range(nr_steps):
    key, subkey = jax.random.split(key)
    action = jax.random.uniform(subkey, gym_env.action_space.shape)
    obs, rewards, done, info = jit_env_step(action)

duration = time.time() - before
print(f'time for {total_steps} steps: {duration:.2f}s ({int(total_steps / duration)} steps/sec)')

Using my CPU for jitting I get around 2800 steps per second, which is better than what I would get with a MuJoCo ant environment. But if I comment out the line for putting jax in CPU mode and use my GPU I only get 700 steps per second.
Jitting with the GPU is amazing for many environments but seems to be not that great with only one environment. On the other hand using the GPU is far better for the RL algorithm I want to use later

Is there a way of improving single environment performance on GPU?
Or is there maybe a way of efficiently jitting the environment with the CPU and still be using the GPU for my RL algorithm in Jax?

btaba · 2022-11-09T14:50:29Z

btaba
Nov 9, 2022
Maintainer

Hi @nico-bohlinger , I think the lower steps-per-sec is somewhat expected for a single environment, but you'd be able to batch the envs on GPU and get a higher overall steps-per-sec. You will want to ask on the gym repo how to batch the envs, but in brax it would look something like:

env = VmapWrapper(env)
rngs = jax.random.split(jax.random.PRNGKey(seed), batch_size)
env.reset(rng)
jit_env_step = jax.jit(env.step)
jit_env_step(action)  # action is [batch_size, n_action]

You likely want to keep the RL algo running on the same device as the env.step to avoid data transfers btwn the host and the gpu/tpu device

The agents in https://github.com/google/brax/tree/main/brax/training/agents are a great reference

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Single environment performance on GPU worse than on CPU #280

{{title}}

Replies: 1 comment

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Single environment performance on GPU worse than on CPU #280

nico-bohlinger Oct 26, 2022

Replies: 1 comment

btaba Nov 9, 2022 Maintainer

nico-bohlinger
Oct 26, 2022

btaba
Nov 9, 2022
Maintainer