skip looking up cache
tohtana committed May 31, 2024
1 parent 0c8e0e6 commit ea50fb5
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions deepspeed/inference/v2/engine_v2.py
@@ -260,6 +260,7 @@ def setup_cached_sequence(self, uid: int, cached_length: int, block_ids: torch.T
seq.pre_forward(cached_length)
seq.post_forward()
seq.extend_kv_cache(block_ids)
seq.num_prefix_cache_blocks = len(block_ids)
self._state_manager.increment_ref_count(block_ids)
self._state_manager._kv_cache.allocate_blocks(block_ids)
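
Note: the bookkeeping in this hunk can be illustrated with a small toy model. This is only a sketch under stated assumptions, not DeepSpeed's implementation: `ToySequence`, `ToyStateManager`, and the free function `setup_cached_sequence` are hypothetical stand-ins, and block IDs are plain Python ints rather than a tensor. Only the names `extend_kv_cache`, `num_prefix_cache_blocks`, and `increment_ref_count` are taken from the diff.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySequence:
    """Hypothetical stand-in for the engine's per-sequence descriptor."""
    block_ids: List[int] = field(default_factory=list)
    num_prefix_cache_blocks: int = 0

    def extend_kv_cache(self, new_block_ids: List[int]) -> None:
        # Attach already-populated KV blocks to this sequence.
        self.block_ids.extend(new_block_ids)


class ToyStateManager:
    """Hypothetical reference counter for shared KV blocks."""

    def __init__(self) -> None:
        self.ref_counts: Counter = Counter()

    def increment_ref_count(self, block_ids: List[int]) -> None:
        # Each sequence that reuses a block adds one reference to it.
        for block_id in block_ids:
            self.ref_counts[block_id] += 1


def setup_cached_sequence(seq: ToySequence, manager: ToyStateManager,
                          block_ids: List[int]) -> None:
    # Mirrors the order of calls in the hunk, minus the forward bookkeeping.
    seq.extend_kv_cache(block_ids)
    seq.num_prefix_cache_blocks = len(block_ids)  # blocks that came from the cache
    manager.increment_ref_count(block_ids)


manager = ToyStateManager()
seq_a, seq_b = ToySequence(), ToySequence()
setup_cached_sequence(seq_a, manager, [3, 7])
setup_cached_sequence(seq_b, manager, [3, 7, 9])
```

In this toy model each sequence records how many of its blocks were supplied by the cache, while the manager tracks how many sequences share each block.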
