Merge pull request #118 from yetanalytics/time-bounds-fixes
Time bounds fixes + updates
kelvinqian00 authored Oct 17, 2023
2 parents eec11b2 + 24cbdc3 commit 992ec86
Showing 13 changed files with 268 additions and 168 deletions.
27 changes: 14 additions & 13 deletions README.md
@@ -82,33 +82,29 @@ and `weight` values (as described under `verbs`).
- `patterns`: An array of objects with Pattern `id` and the following additional optional values:
- `weights`: An array of child Pattern/Template `id` and `weight` values. Each weight affects how likely each of the Pattern's child patterns are chosen (for `alternates`) or how likely the child Pattern will be selected at all (for `optional`, for these `null` is also a valid option). This has no effect on `sequence`, `zeroOrMore`, or `oneOrMore` Patterns.
- `repeat-max`: A positive integer representing the maximum number of times (exclusive) the child pattern can be generated. Only affects `zeroOrMore` and `oneOrMore` patterns.
- `bounds`: An array of objects containing key-value pairs where each value is an array of singular values (e.g. `"January"`) or pair arrays of start and end values (e.g. `["January", "October"]`). For example `{"years": [2023], "months": [[1 5]]}` describes an inclusive bound from January to May 2023. If not present, indicates an infinite bound, such that any timestamp is valid. The following are valid bound values:
- `bounds`: An array of objects containing key-value pairs where each value is an array of singular values (e.g. `"January"`) or arrays of start, end, and optional step values (e.g. `["January", "October"]`). For example `{"years": [2023], "months": [[1, 5]]}` describes an inclusive bound from January to May 2023; making the `months` bound `[[1, 5, 2]]` would have restricted it to only January, March, and May 2023. If not present, `bounds` indicates an infinite bound, such that any timestamp is valid. The following are valid bound values:
- `years`: Any positive integer
- `months`: `1` to `12`, or their name equivalents, i.e. `"January"` to `"December"`
- `daysOfMonth`: `1` to `31` (though `29` or `30` are skipped at runtime for months that do not include these days)
- `daysOfWeek`: `0` to `6`, or their name equivalents, i.e. `"Sunday"` to `"Saturday"`
- `hours`: `0` to `23`
- `minutes`: `0` to `59`
- `seconds`: `0` to `59`
- `periods`: an array of objects that specify the amount of time between generated Statements. Only the first valid period in the array will be applied to generate the next Statement (see `bounds` property). Each period object has the following optional properties:
- `boundRetries`: An array of Pattern IDs to retry if the timestamp violates `bounds`. The top-most Pattern in `boundRetries` will be tried, e.g. if Pattern A is a parent of Pattern B and both are listed in `boundRetries`, it will be Pattern A that is retried. If `boundRetries` is empty or not present, or if none of the ancestor Patterns are included, then Statement generation will continue at its current point.
- `periods`: An array of objects that specify the amount of time between generated Statements. Only the first valid period in the array will be applied to generate the next Statement (see `bounds` property). Each period object has the following optional properties:
- `min`: a minimum amount of time between Statements; default is `0`
- `mean`: the average amount of time between Statements (added on top of `min`); default is `1`
- `fixed`: a fixed amount of time between Statements; overrides `min` and `mean`
- `unit`: the time unit for all temporal values. Valid values are `millis`, `seconds`, `minutes`, `hours`, `days`, and `weeks`; the default is `minutes`
- `bounds`: an array of the temporal bounds the period can apply in. During generation, the current Statement timestamp is checked against each period's `bounds`, and the first period whose bound satisfies the timestamp will be used to generate the next Statement timestamp. A nonexisting `bounds` value indicates an infinite bound, i.e. any timestamp is always valid. The syntax is the same as the top-level `bounds` array. At least one period must not have a `bounds` value, so it can act as the default period.
- `retry`: One of four options that determine Statement generation retry behavior in the event where a time bound is exceeded:
- `null` (or not present): Terminate the generation on the current Pattern immediately, and move again with the next Pattern's generation.
- `"pattern"`: Retry generation of this Pattern if this Pattern's bound is exceeded.
- `"child"`: Retry generation of whichever child Pattern or Statement Template in which this Pattern's bound is exceeded.
- `"template"`: Retry generation of the Statement Template that exceeded this Pattern's bound.
- `templates`: An array of objects with Statement Template `id` and optional `bounds`, `period`, and `retry` values, as explained above in `patterns`. Note that `weights` and `repeat-max` do not apply here.
- `templates`: An array of objects with Statement Template `id` and optional `bounds`, `boundRetries`, and `period` properties, as explained above in `patterns`. Note that `weights` and `repeat-max` do not apply here.
- `objectOverrides`: An array of objects containing (xAPI) `object` and `weight`. If present, these objects will overwrite any that would have been set by the Profile.
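
As a rough illustration of the `bounds` step syntax described in the list above, the sketch below expands a single bound value into the integers it covers. `expand-bound-value` is a hypothetical helper written for this example, not a DATASIM function, and it assumes numeric values only (month and day names are not handled).

```clojure
(defn expand-bound-value
  "Expand a scalar, or a [start end ?step] vector, into a vector of ints.
   Hypothetical helper for illustration; name strings are not handled."
  [v]
  (if (coll? v)
    (let [[start end ?step] v]
      ;; `range` excludes its upper limit, so `inc` makes `end` inclusive
      (vec (range start (inc end) (or ?step 1))))
    [v]))

(expand-bound-value [1 5])   ; months January through May
(expand-bound-value [1 5 2]) ; months January, March, and May
(expand-bound-value 7)       ; a single value
```

Treating a missing step as `1` keeps the two-element and three-element forms on the same code path.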

An example of a model array with valid `personae`, `verbs`, and `templates` is shown below:

```json
[
{
[
{
"personae": [
{
"id": "mbox::mailto:[email protected]",
@@ -130,15 +126,18 @@ An example of a model array with valid `personae`, `verbs`, and `templates` is s
"months": [["January", "May"]]
}
],
"boundRetries": [
"https://w3id.org/xapi/cmi5#toplevel"
],
"period": {
"min": 1,
"mean": 2.0,
"unit": "second"
}
}
]
}
]
}
]
```
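
The `boundRetries` rule ("the top-most Pattern in `boundRetries` will be tried") can be sketched as a lookup over the chain of ancestor Patterns. This is a minimal sketch under assumptions: `pattern-to-retry` and the `ancestors` vector (ordered from the outermost Pattern inward) are hypothetical and do not reflect how DATASIM actually tracks ancestry.

```clojure
(defn pattern-to-retry
  "Given `ancestors`, a vector of Pattern IDs ordered from the outermost
   Pattern inward, and `bound-retries`, a collection of Pattern IDs (or nil),
   return the top-most ancestor listed in `bound-retries`, or nil if none is."
  [ancestors bound-retries]
  ;; A set acts as a predicate; `some` returns the first ancestor it contains.
  ;; `(set nil)` is the empty set, so a missing `boundRetries` yields nil.
  (some (set bound-retries) ancestors))

;; Pattern A is a parent of Pattern B and both are listed: A is retried.
(pattern-to-retry ["root" "A" "B"] ["B" "A"])
```

A `nil` result corresponds to the case where generation simply continues at its current point.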

#### Simulation Parameters
@@ -151,9 +150,11 @@ The simulation parameters input covers the details of the simulation not covered
"end": "2019-11-19T11:38:39.219768Z",
"max": 200,
"timezone": "America/New_York",
"seed": 42
"seed": 42,
"max-retries": 10
}
```
Note the `max-retries` parameter: it limits the number of times a particular Pattern is retried when a `bounds` constraint is violated.
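
The `parameters.clj` change in this commit fills in `max-retries` with `5` when the parameter is absent. A minimal sketch of that defaulting, using a hypothetical helper `with-max-retries` rather than the project's own `apply-defaults`:

```clojure
(defn with-max-retries
  "Return `params` with `:max-retries` filled in when absent.
   Hypothetical helper; the default of 5 mirrors the `apply-defaults`
   change to parameters.clj shown in this commit."
  [params]
  (update params :max-retries #(or % 5)))

(with-max-retries {:seed 42})                 ; adds :max-retries 5
(with-max-retries {:seed 42 :max-retries 10}) ; keeps the explicit 10
```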

#### (Alternatively) Simulation Specification

13 changes: 12 additions & 1 deletion dev-resources/models/simple_with_temporal.json
@@ -38,11 +38,22 @@
],
"templates": [
{
"id": "https://w3id.org/xapi/cmi5#satisfied",
"id": "https://w3id.org/xapi/cmi5#launched",
"bounds": [
{
"minutes": [[0, 10]]
}
],
"boundRetries": [
"https://w3id.org/xapi/cmi5#typicalsessions"
]
},
{
"id": "https://w3id.org/xapi/cmi5#initialized",
"bounds": [
{
"minutes": [[10, 59, 5]]
}
]
}
]
39 changes: 22 additions & 17 deletions src/main/com/yetanalytics/datasim/input/model/alignments.clj
@@ -40,9 +40,14 @@
(< start end))

(defmacro bound-spec [scalar-spec]
`(s/every (s/or :scalar ~scalar-spec
:interval (s/and (s/tuple ~scalar-spec ~scalar-spec)
interval?))
`(s/every (s/or :scalar
~scalar-spec
:interval
(s/and (s/tuple ~scalar-spec ~scalar-spec)
interval?)
:step-interval
(s/and (s/tuple ~scalar-spec ~scalar-spec pos-int?)
interval?))
:kind vector?
:min-count 1
:gen-max 3))
@@ -112,6 +117,13 @@
:kind vector?
:min-count 1))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Time Bound Retries
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(s/def ::boundRestarts
(s/every ::xs/iri :kind vector?))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Time Period
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -145,18 +157,11 @@
:kind (every-pred vector? has-default-period?)
:min-count 1))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Retry Options
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(s/def ::retry
(s/nilable #{"template" "child" "pattern"}))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Max Repeat
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(s/def ::repeat-max pos-int?)
(s/def ::repeatMax pos-int?)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Object
@@ -203,17 +208,17 @@

(def pattern-spec
(s/keys :req-un [::id]
:opt-un [::weights ; for alternate and optional patterns
::repeat-max ; for oneOrMore and zeroOrMore patterns
:opt-un [::weights ; for alternate and optional patterns
::repeatMax ; for oneOrMore and zeroOrMore patterns
::bounds
::periods
::retry]))
::boundRestarts
::periods]))

(def template-spec
(s/keys :req-un [::id]
:opt-un [::bounds
::periods
::retry]))
::boundRestarts
::periods]))

(def object-override-spec
(s/keys :req-un [::object]
12 changes: 7 additions & 5 deletions src/main/com/yetanalytics/datasim/input/parameters.clj
@@ -82,6 +82,7 @@
:opt-un [::end
::from
::max
::max-retries
::gen-profiles
::gen-patterns])
ordered-timestamps?))
@@ -104,11 +105,12 @@
If `params` is not provided simply return the default parameters."
([]
(apply-defaults {}))
([{:keys [start from timezone seed] :as params}]
([{:keys [start from timezone seed max-retries] :as params}]
(merge
params
(let [start (or start (.toString (Instant/now)))]
{:start start
:from (or from start)
:timezone (or timezone "UTC")
:seed (or seed (random/rand-unbound-int (random/rng)))}))))
{:start start
:from (or from start)
:timezone (or timezone "UTC")
:seed (or seed (random/rand-unbound-int (random/rng)))
:max-retries (or max-retries 5)}))))
21 changes: 11 additions & 10 deletions src/main/com/yetanalytics/datasim/model.clj
@@ -1,5 +1,6 @@
(ns com.yetanalytics.datasim.model
(:require [clojure.spec.alpha :as s]
[xapi-schema.spec :as xs]
[com.yetanalytics.datasim.input.model :as model]
[com.yetanalytics.datasim.input.model.alignments :as model.alignments]
[com.yetanalytics.datasim.math.random :as random]
@@ -42,20 +43,20 @@
(s/def ::pattern/bounds
::temporal/bounds)

(s/def ::pattern/bound-retries
(s/every ::xs/iri :kind set?))

(s/def ::pattern/period
::temporal/period)

(s/def ::pattern/retry
#{:template})

(s/def ::pattern/repeat-max
pos-int?)

(s/def ::pattern
(s/keys :opt-un [::pattern/weights
::pattern/bounds
::pattern/bound-retries
::pattern/period
::pattern/retry
::pattern/repeat-max]))

(s/def ::patterns
@@ -96,13 +97,13 @@
(defn- reduce-patterns
[patterns]
(reduce
(fn [acc {:keys [id weights repeat-max bounds periods retry]}]
(fn [acc {:keys [id weights repeatMax bounds boundRestarts periods]}]
(let [m (cond-> {}
weights (assoc :weights (reduce-weights weights))
bounds (assoc :bounds (temporal/convert-bounds bounds))
periods (assoc :periods (temporal/convert-periods periods))
retry (assoc :retry (keyword retry))
repeat-max (assoc :repeat-max repeat-max))]
weights (assoc :weights (reduce-weights weights))
bounds (assoc :bounds (temporal/convert-bounds bounds))
boundRestarts (assoc :bound-restarts (set boundRestarts))
periods (assoc :periods (temporal/convert-periods periods))
repeatMax (assoc :repeat-max repeatMax))]
(assoc acc id m)))
{}
patterns))
8 changes: 5 additions & 3 deletions src/main/com/yetanalytics/datasim/model/temporal.clj
@@ -224,10 +224,12 @@
(defn- reduce-values
[vs* v]
(if (coll? v)
(let [[start end] v
(let [[start end ?step] v
start* (string->int start)
end* (string->int end)
vrange (range start* (inc end*))]
end* (inc (string->int end))
vrange (if ?step
(range start* end* ?step)
(range start* end*))]
(into vs* vrange))
(let [v* (string->int v)]
(conj vs* v*))))
13 changes: 7 additions & 6 deletions src/main/com/yetanalytics/datasim/sim.clj
@@ -54,7 +54,7 @@
(defn- temp-statement-seq
"Generate sequence of maps of `:template`, `:timestamp`, `:time-since-last`,
and `:registration` values."
[inputs alignments seed timestamp registration-seq]
[inputs alignments seed max-retries timestamp registration-seq]
(let [profile-rng
(random/seed-rng seed)
fill-statement-seq*
@@ -63,7 +63,7 @@
(let [profile-seed
(random/rand-unbound-int profile-rng)
template-maps
(p/walk-profile-patterns inputs alignments profile-seed timestamp)
(p/walk-profile-patterns inputs alignments profile-seed max-retries timestamp)
?next-timestamp
(:timestamp (meta template-maps))]
(cond-> (map #(assoc % :registration registration) template-maps)
@@ -111,13 +111,13 @@
"Generate a lazy sequence of xAPI Statements occurring as a Poisson
process. The sequence will either end at `?end-time` or, if `nil`,
be infinite."
[input seed alignments start-time ?end-time ?from-time zone-region]
[input seed alignments start-time ?end-time ?from-time zone-region max-retries]
(let [sim-rng (random/seed-rng seed)
reg-seed (random/rand-unbound-int sim-rng)
temp-seed (random/rand-unbound-int sim-rng)
stmt-rng (random/seed-rng (random/rand-unbound-int sim-rng))]
(->> (init-statement-seq reg-seed)
(temp-statement-seq input alignments temp-seed start-time)
(temp-statement-seq input alignments temp-seed max-retries start-time)
(drop-statement-seq ?end-time)
(seed-statement-seq stmt-rng)
(from-statement-seq ?from-time)
@@ -155,7 +155,7 @@
Spooky."
[{:keys [profiles personae-array models parameters]}]
(let [;; Input parameters
{:keys [start end from timezone seed]} parameters
{:keys [start end from timezone seed max-retries]} parameters
;; RNG for generating the rest of the seeds
sim-rng (random/seed-rng seed)
;; Set timezone region and timestamps
@@ -200,7 +200,8 @@
start-time
?end-time
?from-time
zone-region)]
zone-region
max-retries)]
(assoc m actor-id actor-stmt-seq)))
{}))))

5 changes: 4 additions & 1 deletion src/main/com/yetanalytics/datasim/xapi/profile.clj
@@ -178,6 +178,7 @@
:args (s/cat :profile-map ::profile-map
:alignments ::model/alignments
:seed ::random/seed
:max-retries pos-int?
:start-time ::temporal/date-time)
:ret (s/every ::pat/template-map))

@@ -186,10 +187,12 @@
[{pattern-iri-map :pattern-map}
{pattern-alignments :patterns}
seed
max-retries
start-time]
(let [pattern-rng (random/seed-rng seed)
root-pattern (get pattern-iri-map ::pat/root)
context {:pattern-map pattern-iri-map
:alignments-map pattern-alignments
:max-retries max-retries
:rng pattern-rng}]
(pat/walk-pattern context nil start-time start-time root-pattern)))
(pat/walk-pattern context [] start-time start-time root-pattern)))