Columns and Lists

Columns and Lists are two primary data structures that we use to store node and rel properties as well as adjacency lists in the system.

Columns

A Column is simply an array with a value (of the same data type) at each offset. Hence, given an offset, a Column gives constant-time random access to its associated value.

Lists

Lists can be imagined as the extension to Columns where multiple values can be associated per offset. Hence, given an offset, a Lists gives constant-time random access to the associated list that contains an array of values (of the same data type). We use a 2-level CSR structure to organize data in Lists. We logically differentiate lists into the following types:

Small List: A list is defined as small if it can fit into one system page, that is in 4KB. For example, if elements of lists occupy 4Bytes each, then we say that a list is small is it contains less than or equal to 1024.
Large List: If it is not a small list.

Lists are also logically divided into chunks and all the small lists in a chunk are contiguously stored in the file. The size of the chunk is fixed in the system to be 512. Large Lists of a chunk span multiple pages and are stored outside the chunk, though contiguously.

We further go on to group offsets in Lists into chunks of 512. All the lists of a single chunk are stored contiguously in the file With each lists, we use 2 auxiliary structures that contains information about where to find the data (which is a list) in the lists data structure.

Headers: There is one header per offset in the lists data structure. A header contains the
ListsMetadata

Coding Style
- Style Reference
Casting Rules
- Casting Rules
Frontend
- Parser
- Binder
- Optimizer
Processor
- Hash Join
- DataChunksIterator
Storage
Test
- Testing Framework

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Columns and Lists

Columns

Lists

Clone this wiki locally