Introduce TreapList (#12)
`TreapList` is a Treap-based `PersistentList` implementation which
significantly outperforms both `ArrayList` and the reference
`PersistentList` implementation from `kotlinx.collections.immutable` for
most operations. Nearly every `PersistentList` operation has a log-time
implementation, including things like inserting one list in the middle
of another list.
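
As a rough illustration of how one list can be inserted into the middle of another in log time, here is a minimal sketch of the classic treap `split`/`merge` technique. The names `Node`, `copyWith`, `split`, `merge`, `insertAll`, `treapOf`, and `toList` are hypothetical and chosen for this example only; the actual `TreapList` code in this commit may be organized differently.

```kotlin
import kotlin.random.Random

// Hypothetical node type for illustration; not the actual TreapList internals.
class Node<T>(val value: T, val priority: Int, val left: Node<T>?, val right: Node<T>?) {
    // Number of elements in the sublist rooted at this node.
    val size: Int = 1 + (left?.size ?: 0) + (right?.size ?: 0)
}

// Copies a node with new children; the original node is never mutated.
fun <T> Node<T>.copyWith(left: Node<T>?, right: Node<T>?) = Node(value, priority, left, right)

// Concatenates two lists: every element of `a` precedes every element of `b`.
fun <T> merge(a: Node<T>?, b: Node<T>?): Node<T>? = when {
    a == null -> b
    b == null -> a
    a.priority >= b.priority -> a.copyWith(a.left, merge(a.right, b))
    else -> b.copyWith(merge(a, b.left), b.right)
}

// Splits into (first `count` elements, remaining elements), copying one search path.
fun <T> split(node: Node<T>?, count: Int): Pair<Node<T>?, Node<T>?> {
    if (node == null) return null to null
    val leftSize = node.left?.size ?: 0
    return if (count <= leftSize) {
        val (l, r) = split(node.left, count)
        l to node.copyWith(r, node.right)
    } else {
        val (l, r) = split(node.right, count - leftSize - 1)
        node.copyWith(node.left, l) to r
    }
}

// Inserts all of `other` into `list` at `index`: one split and two merges.
fun <T> insertAll(list: Node<T>?, index: Int, other: Node<T>?): Node<T>? {
    val (front, back) = split(list, index)
    return merge(merge(front, other), back)
}

// Builds a treap by appending items one at a time with random priorities.
fun <T> treapOf(items: List<T>, random: Random): Node<T>? {
    var root: Node<T>? = null
    for (item in items) root = merge(root, Node(item, random.nextInt(), null, null))
    return root
}

// In-order traversal, used here only to inspect the result.
fun <T> toList(node: Node<T>?, out: MutableList<T> = mutableListOf()): List<T> {
    if (node != null) {
        toList(node.left, out)
        out.add(node.value)
        toList(node.right, out)
    }
    return out
}

fun main() {
    val random = Random(1)
    val host = treapOf(listOf(1, 2, 3, 4, 5, 6), random)
    val guest = treapOf(listOf(100, 200), random)
    println(toList(insertAll(host, 3, guest))) // [1, 2, 3, 100, 200, 4, 5, 6]
    println(toList(host))                      // [1, 2, 3, 4, 5, 6]
}
```

Each `split` and `merge` copies only the nodes along the paths it descends, so the input lists stay valid and share their remaining nodes with the result.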

`TreapList` is notably a rather efficient persistent Deque, with
O(log(N)) insertion and removal at both ends of the list.

## Design

`TreapList` is organized as a Treap, with left/right being determined by
the order of the elements in the list, and with randomly-assigned node
priorities to balance the tree. This is different from how we assign
priorities in `TreapMap`/`TreapSet`, where we hash the element values to
get the priority; because `TreapList` can contain multiple elements with
the same value, we must randomly assign priorities to ensure that, say,
a list of 1,000,000 entries with the same value won't just end up being
a linked list. If we randomly assigned priorities in the set/map
implementations, it would complicate unions, intersections, merges, etc.,
but those operations do not apply to Lists, so random priorities are
fine here.
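
To make the duplicate-value problem concrete, here is a small sketch that builds a treap by appending equal values, once with hash-derived priorities and once with random ones, and compares the resulting heights. `PNode`, `append`, `height`, and `buildWith` are hypothetical helpers written for this example (they are not the `TreapList` implementation), and the list is kept to 1,000 entries so the simple recursive code stays within the default stack.

```kotlin
import kotlin.random.Random

// Hypothetical minimal treap node: ordered by list position, heap-ordered by priority.
class PNode<T>(val value: T, val priority: Int, val left: PNode<T>?, val right: PNode<T>?)

// Appends `node` as the last element while keeping the max-heap property on priorities.
fun <T> append(root: PNode<T>?, node: PNode<T>): PNode<T> = when {
    root == null -> node
    // The existing root wins (ties included): the new element goes into the right subtree.
    root.priority >= node.priority ->
        PNode(root.value, root.priority, root.left, append(root.right, node))
    // The new node wins: everything so far becomes its left subtree.
    else -> PNode(node.value, node.priority, root, null)
}

fun <T> height(node: PNode<T>?): Int =
    if (node == null) 0 else 1 + maxOf(height(node.left), height(node.right))

// Builds a treap from `items`, taking each node's priority from `priorityOf`.
fun <T> buildWith(items: List<T>, priorityOf: (T) -> Int): PNode<T>? {
    var root: PNode<T>? = null
    for (item in items) root = append(root, PNode(item, priorityOf(item), null, null))
    return root
}

fun main() {
    val items = List(1_000) { "same value" }
    val random = Random(42)

    // Equal values hash to equal priorities, so the tree degenerates into a chain.
    println("hashed height: ${height(buildWith(items) { it.hashCode() })}") // hashed height: 1000

    // Random priorities keep the expected height logarithmic in the list size.
    println("random height: ${height(buildWith(items) { random.nextInt() })}")
}
```

With identical priorities every append walks the full right spine, which is exactly the linked-list shape described above; changing how ties are broken only changes which side the chain grows on.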

Also unlike the set/map implementations, each `TreapList` node tracks
the size of the sublist represented by that node, enabling log-time
indexing into the list.
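
The size-based indexing can be sketched as follows. This is only an illustration under assumptions: `SizedNode`, `balanced`, and `getAt` are hypothetical names, and the demo tree is built perfectly balanced for simplicity, whereas a real treap gets its shape from node priorities.

```kotlin
// Hypothetical node that caches the size of the sublist it represents.
class SizedNode<T>(val value: T, val left: SizedNode<T>?, val right: SizedNode<T>?) {
    val size: Int = 1 + (left?.size ?: 0) + (right?.size ?: 0)
}

// Demo-only builder: produces a balanced tree whose in-order traversal is `items`.
fun <T> balanced(items: List<T>, from: Int = 0, to: Int = items.size): SizedNode<T>? {
    if (from >= to) return null
    val mid = (from + to) ushr 1
    return SizedNode(items[mid], balanced(items, from, mid), balanced(items, mid + 1, to))
}

// Finds the element at `index` by comparing the index with the left subtree's size.
fun <T> getAt(root: SizedNode<T>?, index: Int): T {
    var node = root
    var i = index
    while (node != null) {
        val leftSize = node.left?.size ?: 0
        when {
            i < leftSize -> node = node.left      // target lies in the left sublist
            i == leftSize -> return node.value    // this node is the target
            else -> {                             // skip the left sublist and this node
                i -= leftSize + 1
                node = node.right
            }
        }
    }
    throw IndexOutOfBoundsException("index $index")
}

fun main() {
    val tree = balanced(listOf("a", "b", "c", "d", "e"))
    println(getAt(tree, 3)) // d
}
```

Each step descends one level, so a lookup costs time proportional to the tree height, which is logarithmic in expectation for a treap.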

`TreapList` is a little like a probabilistically balanced
[Rope](https://en.wikipedia.org/wiki/Rope_(data_structure)), but holding
arbitrary elements rather than characters.

## Performance

For most operations, `TreapList` outperforms `ArrayList` (when used as
an immutable list), as well as the reference `PersistentList`
implementation from `kotlinx.collections.immutable`. The most notable
exception is the `get` operation, which for `ArrayList` is a simple
array indexing operation:


![image](https://github.com/Certora/collections/assets/7407587/b6b1d85d-a2f7-44c5-b403-6495e816f87f)

`TreapList` generally matches or beats `ArrayList` when adding a single
element to the end of the list (producing a new list), but is beaten by
`kotlinx.collections.immutable`:


![image](https://github.com/Certora/collections/assets/7407587/8f073ef4-fb96-4472-8b02-f0a73929930a)

However, for most other list operations, `TreapList` wins handily. For
example, insertion of a single item at the front of the list:


![image](https://github.com/Certora/collections/assets/7407587/b28fd2ac-8cfc-466e-9d21-1733746e93dd)

Replacing the value at a given index:


![image](https://github.com/Certora/collections/assets/7407587/6020ae6a-d93b-4516-ac50-273dc3b40208)

Or appending one list to another list:


![image](https://github.com/Certora/collections/assets/7407587/b0d66ed5-09d3-4781-97d4-c579b4ffea09)

More benchmarks are available; you can run `gradlew listBenchmark` to
run them all.

## Memory Usage

One disadvantage of `TreapList` vs the alternatives is that a single
`TreapList` of a given size is quite a lot larger than the equivalent
`ArrayList` or `kotlinx.collections.immutable` list:


![image](https://github.com/Certora/collections/assets/7407587/f8b6f575-b237-4913-b46b-ce29909e0aea)

However, `TreapList` is better able to re-use allocations between
"versions" of a list. For example, consider the scenario where we add
one item at a time, in the middle of the list, retaining all
intermediate results, and adding up the total size:


![image](https://github.com/Certora/collections/assets/7407587/42cd6dd4-5433-44fe-a196-86bce863eb42)

We can see that in this extreme case, `TreapList` uses much less memory
than the other alternatives.

Real heap usage will of course depend on how much the specific use case
is able to take advantage of the increased allocation sharing.
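
The allocation sharing can be illustrated with a path-copying update on a size-tracked tree. Note that this sketch swaps in a simpler operation (`set` at an index on a pre-balanced tree) for the insert-in-the-middle scenario measured above, and that `NodeCounter`, `CountedNode`, `balancedCounted`, and `setAt` are hypothetical names introduced for the example.

```kotlin
// Demo-only allocation counter; hypothetical, not part of any library.
object NodeCounter {
    var created = 0
}

class CountedNode<T>(val value: T, val left: CountedNode<T>?, val right: CountedNode<T>?) {
    val size: Int = 1 + (left?.size ?: 0) + (right?.size ?: 0)

    init {
        NodeCounter.created++
    }
}

// Demo-only builder: a balanced tree whose in-order traversal is `items`.
fun <T> balancedCounted(items: List<T>, from: Int = 0, to: Int = items.size): CountedNode<T>? {
    if (from >= to) return null
    val mid = (from + to) ushr 1
    return CountedNode(items[mid], balancedCounted(items, from, mid), balancedCounted(items, mid + 1, to))
}

// Returns a new version with `value` at `index`; only nodes on the search path are copied.
fun <T> setAt(node: CountedNode<T>?, index: Int, value: T): CountedNode<T> {
    if (node == null) throw IndexOutOfBoundsException("index out of range")
    val leftSize = node.left?.size ?: 0
    return when {
        index < leftSize -> CountedNode(node.value, setAt(node.left, index, value), node.right)
        index == leftSize -> CountedNode(value, node.left, node.right)
        else -> CountedNode(node.value, node.left, setAt(node.right, index - leftSize - 1, value))
    }
}

fun main() {
    val base = balancedCounted(List(1023) { it })
    NodeCounter.created = 0
    val updated = setAt(base, 0, -1)
    println("new nodes: ${NodeCounter.created} of ${updated.size}") // new nodes: 10 of 1023
    println(updated.right === base?.right)                          // true
}
```

An array-backed list has to copy all 1,023 slots to produce the same new version, which is why retaining many intermediate versions favors the tree representation.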
ericeil authored Feb 8, 2024
1 parent 38f10b6 commit 315e3db
Showing 46 changed files with 1,822 additions and 582 deletions.
12 changes: 10 additions & 2 deletions benchmarks/build.gradle.kts
@@ -37,12 +37,12 @@ benchmark {

 register("compare") {
     param("size", "10", "1000", "10000")
-    param("implementation", "hash_map", "hamt", "treap")
+    param("implementation", "java", "treap")
 }

 register("named") {
     param("size", "10", "1000", "10000")
-    param("implementation", "hash_map", "hamt", "treap")
+    param("implementation", "java", "treap")
     include("${project.findProperty("benchmark")}")
 }

@@ -62,7 +62,15 @@ benchmark {
     include("immutableMap.ParallelUpdateValues")
 }

+register("list") {
+    param("size", "1", "10", "1000", "10000")
+    param("implementation", "treap", "java")
+
+    include("benchmarks.immutableList")
+}
+
 configureEach {
+    reportFormat = "csv"
     warmups = 5
     iterations = 10
     iterationTime = 100
33 changes: 33 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/FakePersistentList.kt
@@ -0,0 +1,33 @@
package benchmarks

import kotlinx.collections.immutable.*

/**
 * A [PersistentList] that wraps a plain [List] and copies it on every modification.
 */
class FakePersistentList<T>(val value: List<T>) : PersistentList<T>, List<T> by value {
    class Builder<T>(val value: MutableList<T>) : PersistentList.Builder<T>, MutableList<T> by value {
        override fun equals(other: Any?) = value == other
        override fun hashCode() = value.hashCode()
        override fun build() = FakePersistentList(value)
    }

    override fun equals(other: Any?) = value == other
    override fun hashCode() = value.hashCode()

    override fun builder() = Builder(value.toMutableList())
    override fun clear() = FakePersistentList(emptyList<T>())

    override fun add(element: T): PersistentList<T> = FakePersistentList(value + element)
    override fun add(index: Int, element: T): PersistentList<T> = FakePersistentList(value.toMutableList().apply { add(index, element) })
    override fun addAll(elements: Collection<T>): PersistentList<T> = FakePersistentList(value + elements)
    override fun addAll(index: Int, c: Collection<T>): PersistentList<T> = FakePersistentList(value.toMutableList().apply { addAll(index, c) })
    override fun remove(element: T): PersistentList<T> = FakePersistentList(value - element)
    override fun removeAll(predicate: (T) -> Boolean): PersistentList<T> = FakePersistentList(value.filterNot(predicate))
    override fun removeAll(elements: Collection<T>): PersistentList<T> = FakePersistentList(value - elements)
    override fun removeAt(index: Int): PersistentList<T> = FakePersistentList(value.toMutableList().apply { removeAt(index) })
    // retainAll keeps only the elements that ARE contained in `elements`.
    override fun retainAll(elements: Collection<T>): PersistentList<T> = FakePersistentList(value.filter { it in elements })
    override fun set(index: Int, element: T): PersistentList<T> = FakePersistentList(value.toMutableList().apply { set(index, element) })

    override fun subList(fromIndex: Int, toIndex: Int): ImmutableList<T> =
        super<PersistentList>.subList(fromIndex, toIndex)
}

fun <T> fakePersistentListOf(): PersistentList<T> = FakePersistentList(emptyList<T>())
6 changes: 3 additions & 3 deletions benchmarks/src/main/kotlin/benchmarks/hashCodeTypes.kt
@@ -35,7 +35,7 @@ private fun generateIntWrappers(hashCodeType: String, size: Int): List<IntWrappe
 fun generateKeys(hashCodeType: String, size: Int) = generateIntWrappers(hashCodeType, size)
 fun generateElements(hashCodeType: String, size: Int) = generateIntWrappers(hashCodeType, size)

-const val ORDERED_HAMT_IMPL = "ordered_hamt"
-const val HAMT_IMPL = "hamt"
+const val KOTLIN_IMPL = "kotlin"
+const val KOTLIN_ORDERED_IMPL = "kotlin_ordered"
 const val TREAP_IMPL = "treap"
-const val HASH_MAP_IMPL = "hash_map"
+const val JAVA_IMPL = "java"
60 changes: 60 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/immutableList/Add.kt
@@ -0,0 +1,60 @@
/*
 * Modified from the kotlinx.collections.immutable sources, which contained the following notice:
 * Copyright 2016-2019 JetBrains s.r.o.
 * Use of this source code is governed by the Apache 2.0 License that can be found in the LICENSE.txt file.
 */

package benchmarks.immutableList

import benchmarks.*
import kotlinx.collections.immutable.*
import kotlinx.benchmark.*

@State(Scope.Benchmark)
open class Add {
    @Param(KOTLIN_IMPL, TREAP_IMPL, JAVA_IMPL)
    var implementation = ""

    @Param(BM_1, BM_10, BM_100, BM_1000, BM_10000, BM_100000, BM_1000000, BM_10000000)
    var size: Int = 0

    var initial = persistentListOf<String>()

    @Setup
    fun prepare() {
        initial = persistentListAdd(implementation, size)
    }

    @Benchmark
    fun addLast(): ImmutableList<String> {
        return initial.add("another element")
    }

    /**
     * Adds [size] - 1 elements to an empty persistent list
     * and then inserts one element at the beginning.
     *
     * Measures mean time and memory spent per `add` operation.
     *
     * Expected time: nearly constant.
     * Expected memory: nearly constant.
     */
    @Benchmark
    fun addFirst(): ImmutableList<String> {
        return initial.add(0, "another element")
    }

    /**
     * Adds [size] - 1 elements to an empty persistent list
     * and then inserts one element at the middle.
     *
     * Measures mean time and memory spent per `add` operation.
     *
     * Expected time: nearly constant.
     * Expected memory: nearly constant.
     */
    @Benchmark
    fun addMiddle(): ImmutableList<String> {
        return initial.add(initial.size / 2, "another element")
    }
}
107 changes: 107 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/immutableList/AddAll.kt
@@ -0,0 +1,107 @@
/*
 * Modified from the kotlinx.collections.immutable sources, which contained the following notice:
 * Copyright 2016-2019 JetBrains s.r.o.
 * Use of this source code is governed by the Apache 2.0 License that can be found in the LICENSE.txt file.
 */

package benchmarks.immutableList

import benchmarks.*
import kotlinx.collections.immutable.ImmutableList
import kotlinx.collections.immutable.persistentListOf
import kotlinx.benchmark.*

@State(Scope.Benchmark)
open class AddAll {
    @Param(KOTLIN_IMPL, TREAP_IMPL, JAVA_IMPL)
    var implementation = ""

    @Param(BM_1, BM_10, BM_100, BM_1000, BM_10000, BM_100000, BM_1000000, BM_10000000)
    var size: Int = 0

    private var initialHalf = persistentListOf<String>()
    private var initialTwoThirds = persistentListOf<String>()

    private var listToAdd = emptyList<String>()
    private var halfList = emptyList<String>()
    private var oneThirdList = emptyList<String>()

    @Setup
    fun prepare() {
        listToAdd = persistentListAdd(implementation, size)
        halfList = persistentListAdd(implementation, size / 2)
        initialHalf = persistentListAdd(implementation, size - halfList.size)
        oneThirdList = persistentListAdd(implementation, size / 3)
        initialTwoThirds = persistentListAdd(implementation, size - oneThirdList.size)
    }

    // Results of the following benchmarks do not indicate memory or time spent per operation;
    // however, regressions there do indicate changes.
    //
    // The benchmarks measure mean time and memory spent per added element.
    //
    // Expected time: nearly constant.
    // Expected memory: nearly constant.

    /**
     * Adds [size] elements to an empty persistent list using `addAll` operation.
     */
    @Benchmark
    fun addAllLast(): ImmutableList<String> {
        return emptyPersistentList<String>(implementation).addAll(listToAdd)
    }

    /**
     * Adds `size - size / 2` elements to an empty persistent list
     * and then adds `size / 2` elements using `addAll` operation.
     */
    @Benchmark
    fun addAllLast_Half(): ImmutableList<String> {
        return initialHalf.addAll(halfList)
    }

    /**
     * Adds `size - size / 3` elements to an empty persistent list
     * and then adds `size / 3` elements using `addAll` operation.
     */
    @Benchmark
    fun addAllLast_OneThird(): ImmutableList<String> {
        return initialTwoThirds.addAll(oneThirdList)
    }

    /**
     * Adds `size - size / 2` elements to an empty persistent list
     * and then inserts `size / 2` elements at the beginning using `addAll` operation.
     */
    @Benchmark
    fun addAllFirst_Half(): ImmutableList<String> {
        return initialHalf.addAll(0, halfList)
    }

    /**
     * Adds `size - size / 3` elements to an empty persistent list
     * and then inserts `size / 3` elements at the beginning using `addAll` operation.
     */
    @Benchmark
    fun addAllFirst_OneThird(): ImmutableList<String> {
        return initialTwoThirds.addAll(0, oneThirdList)
    }

    /**
     * Adds `size - size / 2` elements to an empty persistent list
     * and then inserts `size / 2` elements at the middle using `addAll` operation.
     */
    @Benchmark
    fun addAllMiddle_Half(): ImmutableList<String> {
        return initialHalf.addAll(initialHalf.size / 2, halfList)
    }

    /**
     * Adds `size - size / 3` elements to an empty persistent list
     * and then inserts `size / 3` elements at the middle using `addAll` operation.
     */
    @Benchmark
    fun addAllMiddle_OneThird(): ImmutableList<String> {
        return initialTwoThirds.addAll(initialTwoThirds.size / 2, oneThirdList)
    }
}
37 changes: 37 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/immutableList/Construct.kt
@@ -0,0 +1,37 @@
package benchmarks.immutableList

import benchmarks.*
import kotlinx.collections.immutable.*
import kotlinx.benchmark.*

@State(Scope.Benchmark)
open class Construct {
    @Param(KOTLIN_IMPL, TREAP_IMPL, JAVA_IMPL)
    var implementation = ""

    @Param(BM_1, BM_10, BM_100, BM_1000, BM_10000, BM_100000, BM_1000000, BM_10000000)
    var size: Int = 0

    var toAdd = listOf<Int>()

    @Setup
    fun prepare() {
        toAdd = (1..size).toList()
    }

    @Benchmark
    fun oneAtATime(): ImmutableList<Int> {
        var list = emptyPersistentList<Int>(implementation)
        toAdd.forEach {
            list = list.add(it)
        }
        return list
    }

    @Benchmark
    fun addAll(): ImmutableList<Int> {
        var list = emptyPersistentList<Int>(implementation)
        list = list.addAll(toAdd)
        return list
    }
}
35 changes: 35 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/immutableList/Get.kt
@@ -0,0 +1,35 @@
/*
 * Modified from the kotlinx.collections.immutable sources, which contained the following notice:
 * Copyright 2016-2019 JetBrains s.r.o.
 * Use of this source code is governed by the Apache 2.0 License that can be found in the LICENSE.txt file.
 */

package benchmarks.immutableList

import benchmarks.*
import kotlinx.collections.immutable.PersistentList
import kotlinx.collections.immutable.persistentListOf
import kotlinx.benchmark.*

@State(Scope.Benchmark)
open class Get {
    @Param(KOTLIN_IMPL, TREAP_IMPL, JAVA_IMPL)
    var implementation = ""

    @Param(BM_1, BM_10, BM_100, BM_1000, BM_10000, BM_100000, BM_1000000, BM_10000000)
    var size: Int = 0

    private var persistentList: PersistentList<String> = persistentListOf()

    @Setup
    fun prepare() {
        persistentList = persistentListAdd(implementation, size)
    }

    @Benchmark
    fun getByIndex(bh: Blackhole) {
        for (i in 0 until size) {
            bh.consume(persistentList[i])
        }
    }
}
57 changes: 57 additions & 0 deletions benchmarks/src/main/kotlin/benchmarks/immutableList/Iterate.kt
@@ -0,0 +1,57 @@
/*
 * Modified from the kotlinx.collections.immutable sources, which contained the following notice:
 * Copyright 2016-2019 JetBrains s.r.o.
 * Use of this source code is governed by the Apache 2.0 License that can be found in the LICENSE.txt file.
 */

package benchmarks.immutableList

import benchmarks.*
import com.certora.collect.TreapList
import kotlinx.collections.immutable.PersistentList
import kotlinx.collections.immutable.persistentListOf
import kotlinx.benchmark.*

@State(Scope.Benchmark)
open class Iterate {
    @Param(KOTLIN_IMPL, TREAP_IMPL, JAVA_IMPL)
    var implementation = ""

    @Param(BM_1, BM_10, BM_100, BM_1000, BM_10000, BM_100000, BM_1000000, BM_10000000)
    var size: Int = 0

    private var persistentList: PersistentList<String> = persistentListOf()

    @Setup
    fun prepare() {
        persistentList = persistentListAdd(implementation, size)
    }

    @Benchmark
    fun firstToLast(bh: Blackhole) {
        for (e in persistentList) {
            bh.consume(e)
        }
    }

    @Benchmark
    fun lastToFirst(bh: Blackhole) {
        val iterator = persistentList.listIterator(size)

        while (iterator.hasPrevious()) {
            bh.consume(iterator.previous())
        }
    }

    @Benchmark
    fun forEachElement(bh: Blackhole) {
        when (val list = persistentList) {
            is TreapList<*> -> list.forEachElement { e ->
                bh.consume(e)
            }
            else -> list.forEach { e ->
                bh.consume(e)
            }
        }
    }
}