Slade Maurer edited this page Aug 15, 2013 · 7 revisions

Purpose

The purpose of Friesian is to provide a simple Java application that can be configured to simulate JVM memory behavior, so that it can be used to verify and validate core Groningen features such as the Hypothesizer's generational genetic algorithm. It starts up quickly and can simulate the diurnal QPS load that a real user-facing cloud application experiences. It is multithreaded and 100% pure Java.

Algorithm

Friesian is a multithreaded processing loop. It consists of a master thread and worker threads. A pseudorandom number (PRN) generator injects randomness into the memory access pattern and into the amount of work each worker performs.

Upon startup, the master thread allocates two caches called "hot" and "cold". Then it spawns worker threads at a variable rate determined by the diurnal function.

Worker threads perform a variable amount of work. Each worker partitions a list and sorts the partitions one after another. Partitions are equally sized, except possibly the last one. The number of partitions is chosen randomly. A worker may randomly sleep between partition sorts (mimicking blocking on I/O). When a worker has finished sorting the partitions of its list, at most one of the two caches is updated.
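The partition-and-sort pass described above can be sketched as follows. This is only an illustration, not Friesian's actual source: the class name, method names and parameters are hypothetical, and the use of ceiling division to size the partitions is an assumption (it can yield fewer partitions than requested when the list is short).

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch of a worker's partition-and-sort pass. */
final class PartitionSortSketch {

    /** Picks a partition count uniformly in [minPartitions, maxPartitions]. */
    static int choosePartitionCount(Random rng, int minPartitions, int maxPartitions) {
        return minPartitions + rng.nextInt(maxPartitions - minPartitions + 1);
    }

    /**
     * Sorts each partition of the list in place, one after another.
     * Partitions are equally sized except possibly the last one.
     * partitionCount must be at least 1.
     */
    static void sortPartitions(int[] list, int partitionCount, Random rng,
                               int sleepProbabilityPercent, long sleepMillis) {
        if (list.length == 0) {
            return;
        }
        // Assumption: ceiling division, so the slices always cover the whole list.
        int partitionSize = (list.length + partitionCount - 1) / partitionCount;
        for (int start = 0; start < list.length; start += partitionSize) {
            int end = Math.min(start + partitionSize, list.length);
            Arrays.sort(list, start, end);
            // Between partition sorts, optionally sleep to mimic blocking on I/O.
            if (end < list.length && rng.nextInt(100) < sleepProbabilityPercent) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }
}
```

For a ten-element list and four requested partitions, the sketch sorts three slices of three elements and a final slice of one, leaving the list sorted within each slice but not globally.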

The purpose of the caches is to provide long-lived objects for the JVM to manage. The hot cache is more likely than the cold cache to be updated. It is possible that no cache is updated when work is complete. A cache update is simply a random replacement of a prior object reference. The likelihood of a cache update is inversely proportional to the number of items in the cache, so that it warms up quickly on Friesian startup.

A worker's caching stage is intended to apply varying pressure to the JVM's new and old generations. Sometimes a worker's list is not cached, which implies that its JVM memory is short-lived. Sometimes a worker's list is hot cached, which implies that its JVM memory is long-lived. Infrequently, the worker's list is cold cached, which implies that its JVM memory is very long-lived.
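A minimal sketch of the caching decision, assuming the hot and cold probabilities are percentages and that whatever remains of 100% means the list is dropped. The class, enum and method names are hypothetical. The warm-up behavior (update likelihood tied to the number of items in the cache) is left out, because the text does not spell out its exact form.

```java
import java.util.Random;

/** Hypothetical sketch of a worker's caching stage. */
final class CacheStageSketch {

    enum Outcome { HOT, COLD, DROP }

    /**
     * Maps a roll in [0, 100) to an outcome, assuming
     * drop + hot + cold = 100 percent.
     */
    static Outcome choose(int roll, int hotProbability, int coldProbability) {
        if (roll < hotProbability) {
            return Outcome.HOT;
        }
        if (roll < hotProbability + coldProbability) {
            return Outcome.COLD;
        }
        return Outcome.DROP; // the list is not cached and becomes garbage
    }

    /** A cache update: overwrite a randomly chosen slot with the new list. */
    static void store(Object[] cache, Object list, Random rng) {
        cache[rng.nextInt(cache.length)] = list;
    }
}
```

A worker would call `choose(rng.nextInt(100), hot, cold)` once its partitions are sorted and then call `store` on the matching cache, or do nothing on `DROP`.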

A worker's random work time and random sleeps are intended to "mix up" the lifetimes of the various worker threads in order to fragment the JVM heaps. The intention is to allow users to control the way memory is allocated within the heap, so that the relative lifetimes of objects in memory have a tunable variance.

A worker's random sleeps vary the CPU utilization within our borglet's container. This varies the amount of CPU available to the GC in concurrent modes.

"Work" is performed on a randomly sized array by a thread. Worker threads are spawned by the master thread at a variable rate based on the formula f(t) = M * d(t) + B, where d(t) is the diurnal curve simulation function. M is the user-defined flag "--rateSlopeConstant". B is the user-defined flag "--rateInterceptConstant".

Many command line configuration flags are provided for Friesian. They give the user control over JVM heap pressure, such as the live object footprint, the ratio of short-lived to long-lived objects, heap fragmentation and available CPU resources. This allows tests to be constructed that are biased toward a particular JVM setting to demonstrate that Groningen finds the right answer.

Work State Machine

workflow diagram

DEFINITION OF STATES

  • Init - Create a randomly sized list and populate it with random ints
  • Work - Pick a random, non-overlapping partition of the list and sort it
  • Work Done - All partitions are sorted
  • Cache - Randomly put the list in the hot or cold cache, or delete it
  • Rand Edge - Pick a random edge to take (to Work or to Sleep)
  • Sleep - Sleep for a user-defined amount of time
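
The states above could be modeled with an enum and a transition function, as in the sketch below. The state names come from the list, but the edges are an assumption inferred from the descriptions, since the diagram itself is not reproduced here. In particular, Init leading to Work, Sleep returning to Work, and Cache being terminal are guesses. All identifiers are hypothetical.

```java
/** Hypothetical sketch of the worker state machine; edges are inferred. */
final class WorkStateMachineSketch {

    enum State { INIT, WORK, RAND_EDGE, SLEEP, WORK_DONE, CACHE }

    /**
     * Returns the next state.
     * allPartitionsSorted matters only in WORK; takeSleepEdge only in RAND_EDGE.
     */
    static State next(State current, boolean allPartitionsSorted, boolean takeSleepEdge) {
        switch (current) {
            case INIT:
                return State.WORK;
            case WORK:
                // Assumption: after each partition sort, finish or pick a random edge.
                return allPartitionsSorted ? State.WORK_DONE : State.RAND_EDGE;
            case RAND_EDGE:
                return takeSleepEdge ? State.SLEEP : State.WORK;
            case SLEEP:
                return State.WORK;
            case WORK_DONE:
                return State.CACHE;
            default:
                // Assumption: CACHE is the last state of a worker.
                return State.CACHE;
        }
    }
}
```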

Application Flags

  • --rateSlopeConstant = The "M" used to control the rate function's slope constant (rate is N per second)
  • --rateInterceptConstant = The "B" used to control the rate function's intercept constant
  • --maxListSize = The maximum size of the list workers are sorting
  • --minListSize = The minimum size of the list workers are sorting
  • --maxNumberOfListPartitions = The maximum number of partitions a worker divides a list into
  • --minNumberOfListPartitions = The minimum number of partitions a worker divides a list into
  • --hotProbability = A percentage from 0 to 100 that the hot cache is updated (must be greater than --coldProbability)
  • --hotCacheSize = Number of elements in the hot cache
  • --coldProbability = A percentage from 0 to 100 that the cold cache is updated (drop+hot+cold = 100%)
  • --coldCacheSize = Number of elements in the cold cache
  • --workerSleepTime = Number of milliseconds a worker thread sleeps
  • --sleepProbability = A percentage from 0 to 100 that the worker thread sleeps after sorting a partition
  • --port = The port the HTML server runs on
  • --diurnalPeriod = The amount of time in hours that the diurnal function takes before repeating itself (24.0 hour default value)
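
Several of these flags constrain one another. The sketch below collects the constraints stated in the list (percentages between 0 and 100, --hotProbability greater than --coldProbability, hot plus cold no more than 100%) into one check. It is a hypothetical helper, not part of Friesian, and the requirement that --minListSize not exceed --maxListSize is an assumption drawn from the flag names.

```java
/** Hypothetical consistency check for a few Friesian flag values. */
final class FlagCheckSketch {

    /** Returns null when the values are consistent, otherwise a short message. */
    static String validate(int hotProbability, int coldProbability,
                           int minListSize, int maxListSize) {
        if (hotProbability < 0 || hotProbability > 100
                || coldProbability < 0 || coldProbability > 100) {
            return "probabilities must be percentages from 0 to 100";
        }
        if (hotProbability <= coldProbability) {
            return "--hotProbability must be greater than --coldProbability";
        }
        // drop + hot + cold = 100%, so hot + cold cannot exceed 100.
        if (hotProbability + coldProbability > 100) {
            return "--hotProbability plus --coldProbability must not exceed 100";
        }
        // Assumption: the minimum list size may not exceed the maximum.
        if (minListSize > maxListSize) {
            return "--minListSize must not exceed --maxListSize";
        }
        return null;
    }
}
```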

Diurnal Function

The diurnal function returns a number of milliseconds to sleep with the intention that this is used as a rate to limit the number of worker threads created by the master. Given x, which is a time in milliseconds since the epoch, the diurnal function is defined as:

d(x) = 1000 / (cos( (2 * PI * x) / (PERIOD * 60 * 60 * 1000) ) + 1.5), where PERIOD is the value of "--diurnalPeriod"

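The denominator, cos(...) + 1.5, peaks at midnight and has its trough at noon (in UTC, for the default 24 hour period). Its value at the peak is 2.5 and at the trough is 0.5, so d(x) ranges from 1000 / 2.5 = 400 milliseconds at midnight to 1000 / 0.5 = 2000 milliseconds at noon.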
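
As a quick check of the formula, here is a minimal Java sketch of d(x). The class and method names are hypothetical, and the period is passed as a parameter rather than read from "--diurnalPeriod".

```java
/** Hypothetical sketch of the diurnal function d(x). */
final class DiurnalSketch {

    /**
     * Evaluates d(x) = 1000 / (cos(2 * PI * x / periodMillis) + 1.5),
     * where x is a time in milliseconds since the epoch and the period
     * is given in hours.
     */
    static double diurnalMillis(long epochMillis, double periodHours) {
        double periodMillis = periodHours * 60 * 60 * 1000;
        double phase = 2 * Math.PI * epochMillis / periodMillis;
        return 1000.0 / (Math.cos(phase) + 1.5);
    }
}
```

With a 24 hour period, x = 0 gives a denominator of 2.5 and a result of 400 milliseconds, while x = 43,200,000 (twelve hours later) gives a denominator of 0.5 and a result of about 2000 milliseconds.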