Merge pull request #125 from yetanalytics/camelCaseParameters
camelCase parameters
kelvinqian00 authored Nov 9, 2023
2 parents 080c8ca + 7b03732 commit 3fc93a4
Showing 17 changed files with 111 additions and 94 deletions.
14 changes: 8 additions & 6 deletions README.md
@@ -36,7 +36,7 @@ The inputs to DATASIM consist of four parts, each represented by JSON. They are

One or more valid xAPI Profiles are required for DATASIM to generate xAPI Statements. You can learn more about the xAPI Profile Specification [here](https://github.com/adlnet/xapi-profiles). This input can either be a single Profile JSON-LD document or an array of JSON-LD format profiles. At this time all referenced concepts in a Profile must be included in the input. For instance if in "Profile A" I have a Pattern that references a Statement Template found in "Profile B", both Profiles must be included in an array as the Profile input.

- Note that by default, any patterns with a `primary` property set to `true` in the provided profiles will be used for generation. You can control which profiles these primary patterns are sourced from with the `gen-profiles` option by supplying one or more profile IDs. You can further control which specific primary patterns are used with the `gen-patterns` option by supplying one or more pattern IDs.
+ Note that by default, any patterns with a `primary` property set to `true` in the provided profiles will be used for generation. You can control which profiles these primary patterns are sourced from with the `genProfiles` option by supplying one or more profile IDs. You can further control which specific primary patterns are used with the `genPatterns` option by supplying one or more pattern IDs.
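
The filtering rule for primary patterns can be sketched as a standalone predicate. This is a minimal sketch rather than the project's `select-primary-patterns`: the name `primary-pattern?` and its argument shape are hypothetical, the example IDs are made up, and the pattern half of the check is assumed to mirror the profile half. An absent or empty filter is treated as "no restriction".

```clojure
(defn primary-pattern?
  "Hypothetical helper: true when a pattern already marked `primary`
   passes both the genProfiles and genPatterns filters."
  [{:keys [genProfiles genPatterns]} profile-id pattern-id]
  (let [?profile-set (some-> genProfiles not-empty set)
        ?pattern-set (some-> genPatterns not-empty set)]
    (and (or (nil? ?profile-set)
             (contains? ?profile-set profile-id))
         ;; Assumption: the pattern filter mirrors the profile filter.
         (or (nil? ?pattern-set)
             (contains? ?pattern-set pattern-id)))))

;; No filters supplied: every primary pattern is kept.
(primary-pattern? {}
                  "https://example.org/profile-a"
                  "https://example.org/pattern-1")
;; => true

;; Only patterns from profile-a are kept, so one from profile-b is dropped.
(primary-pattern? {:genProfiles ["https://example.org/profile-a"]}
                  "https://example.org/profile-b"
                  "https://example.org/pattern-1")
;; => false
```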

#### Personae

@@ -90,14 +90,14 @@ and `weight` values (as described under `verbs`).
- `hours`: `0` to `23`
- `minutes`: `0` to `59`
- `seconds`: `0` to `59`
- - `boundRetries`: An array of Pattern IDs to retry if the timestamp violates `bounds`. The top-most Pattern in `boundRetries` will be tried, e.g. if Pattern A is a parent of Pattern B and both are listed in `boundRetries`, it will be Pattern A that is retried. If `boundRetries` is empty or not present, or if none of the ancestor Patterns are included, then Statement generation will continue at its current point.
+ - `boundRestarts`: An array of Pattern IDs to retry if the timestamp violates `bounds`. The top-most Pattern in `boundRestarts` will be tried, e.g. if Pattern A is a parent of Pattern B and both are listed in `boundRestarts`, it will be Pattern A that is retried. If `boundRestarts` is empty or not present, or if none of the ancestor Patterns are included, then Statement generation will continue at its current point.
- `periods`: An array of objects that specify the amount of time between generated Statements. Only the first valid period in the array will be applied to generate the next Statement (see `bounds` property). Each period object has the following optional properties:
- `min`: a minimum amount of time between Statements; default is `0`
- `mean` the average amount of time between Statements (added on top of `min`); default is `1`
- `fixed`: a fixed amount of time between Statements; overrides `min` and `mean`
- `unit`: the time unit for all temporal values. Valid values are `millis`, `seconds`, `minutes`, `hours`, `days`, and `weeks`; the default is `minutes`
- `bounds`: an array of the temporal bounds the period can apply in. During generation, the current Statement timestamp is checked against each period's `bounds`, and the first period whose bound satisfies the timestamp will be used to generate the next Statement timestamp. A nonexisting `bounds` value indicates an infinite bound, i.e. any timestamp is always valid. The syntax is the same as the top-level `bounds` array. At least one period must not have a `bounds` value, so it can act as the default period.
- - `templates`: An array of objects with Statement Template `id` and optional `bounds`, `boundRetries`, and `period` properties, as explained above in `patterns`. Note that `weights` and `repeat-max` do not apply here.
+ - `templates`: An array of objects with Statement Template `id` and optional `bounds`, `boundRestarts`, and `period` properties, as explained above in `patterns`. Note that `weights` and `repeat-max` do not apply here.
- `objectOverrides`: An array of objects containing (xAPI) `object` and `weight`. If present, these objects will overwrite any that would have been set by the Profile.
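
To make the `boundRestarts` rule concrete, here is a minimal sketch of choosing which Pattern to restart when a timestamp violates `bounds`. The function `restart-target`, the placeholder IDs, and the idea of passing ancestor Pattern IDs ordered from top-most to innermost are assumptions for illustration, not DATASIM's internal representation.

```clojure
(defn restart-target
  "Return the top-most ancestor Pattern ID that is listed in
   `bound-restarts`, or nil if none is listed (in which case generation
   would simply continue at its current point).
   `ancestor-ids` is assumed to be ordered from top-most to innermost."
  [ancestor-ids bound-restarts]
  (some (set bound-restarts) ancestor-ids))

;; Pattern A is a parent of Pattern B and both are listed: A is chosen.
(restart-target ["pattern-a" "pattern-b"] ["pattern-b" "pattern-a"])
;; => "pattern-a"

;; Empty boundRestarts: nothing to restart.
(restart-target ["pattern-a" "pattern-b"] [])
;; => nil
```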

An example of a model array with valid `personae`, `verbs`, and `templates` is shown below:
@@ -126,7 +126,7 @@ An example of a model array with valid `personae`, `verbs`, and `templates` is s
"months": [["January", "May"]]
}
],
- "boundRetries": [
+ "boundRestarts": [
"https://w3id.org/xapi/cmi5#toplevel"
],
"period": {
@@ -151,10 +151,12 @@ The simulation parameters input covers the details of the simulation not covered
"max": 200,
"timezone": "America/New_York",
"seed": 42,
- "max-retries": 10
+ "maxRestarts": 10
}
```
- Note the `max-retries` parameter; this is to limit the amount of times a particular Pattern is repeated when a `bounds` is violated.
+ Note the `maxRestarts` parameter; this is to limit the amount of times a particular Pattern is restarted when a `bounds` is violated.
+
+ Additional parameters include `genPatterns` and `genProfiles`, which are explained in more detail under [xAPI Profiles](#xapi-profiles).
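
Existing parameter files written against the old kebab-case names would need their keys renamed. The following is a minimal sketch of such a migration, assuming the parameters have already been parsed into a Clojure map with keyword keys; the names `renamed-param-keys` and `migrate-params` are hypothetical and are not part of this commit.

```clojure
(require '[clojure.set :as set])

;; Old parameter keys and the camelCase names that replace them.
(def renamed-param-keys
  {:max-retries  :maxRestarts
   :gen-profiles :genProfiles
   :gen-patterns :genPatterns})

(defn migrate-params
  "Rename any old-style keys in a parameters map; other keys are untouched."
  [params]
  (set/rename-keys params renamed-param-keys))

(migrate-params {:seed 42 :max-retries 10})
;; => {:seed 42, :maxRestarts 10}
```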

#### (Alternatively) Simulation Specification

2 changes: 1 addition & 1 deletion dev-resources/models/simple_with_temporal.json
@@ -44,7 +44,7 @@
"minutes": [[0, 10]]
}
],
- "boundRetries": [
+ "boundRestarts": [
"https://w3id.org/xapi/cmi5#typicalsessions"
]
},
10 changes: 6 additions & 4 deletions dev-resources/parameters/simple.json
@@ -1,4 +1,6 @@
- {"start": "2019-11-18T11:38:39.219768Z",
- "end": "2019-11-19T11:38:39.219768Z",
- "timezone": "America/New_York",
- "seed": 42}
+ {
+ "start": "2019-11-18T11:38:39.219768Z",
+ "end": "2019-11-19T11:38:39.219768Z",
+ "timezone": "America/New_York",
+ "seed": 42
+ }
10 changes: 5 additions & 5 deletions dev-resources/parameters/tccc_dev.json
@@ -1,6 +1,6 @@
{
- "start" : "2020-02-10T11:38:39.219768Z",
- "end" : "2020-02-25T17:38:39.219768Z",
- "timezone" : "America/New_York",
- "seed" : 40
- }
+ "start": "2020-02-10T11:38:39.219768Z",
+ "end": "2020-02-25T17:38:39.219768Z",
+ "timezone": "America/New_York",
+ "seed": 40
+ }
4 changes: 2 additions & 2 deletions src/cli/com/yetanalytics/datasim/main.clj
@@ -115,10 +115,10 @@
:id :async
:default true]
[nil "--gen-profile IRI" "Only generate based on primary patterns in the given profile. May be given multiple times to include multiple profiles."
- :id :gen-profiles
+ :id :genProfiles
:assoc-fn conj-param-input]
[nil "--gen-pattern IRI" "Only generate based on the given primary pattern. May be given multiple times to include multiple patterns."
- :id :gen-patterns
+ :id :genPatterns
:assoc-fn conj-param-input]
;; Help
["-h" "--help"]])
6 changes: 3 additions & 3 deletions src/main/com/yetanalytics/datasim/input.clj
@@ -43,10 +43,10 @@
;; makes use of two different parts of the input spec

(defn validate-pattern-filters
- [{{:keys [gen-profiles gen-patterns]} :parameters
+ [{{:keys [genProfiles genPatterns]} :parameters
profiles :profiles}]
- (concat (profile/validate-profile-filters profiles gen-profiles)
- (profile/validate-pattern-filters profiles gen-patterns)))
+ (concat (profile/validate-profile-filters profiles genProfiles)
+ (profile/validate-pattern-filters profiles genPatterns)))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Input I/O
Original file line number Diff line number Diff line change
@@ -118,7 +118,7 @@
:min-count 1))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
- ;; Time Bound Retries
+ ;; Time Bound Restarts
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(s/def ::boundRestarts
23 changes: 15 additions & 8 deletions src/main/com/yetanalytics/datasim/input/parameters.clj
@@ -51,12 +51,16 @@
(s/def ::max
pos-int?)

+ ;; Max number of bound restarts before giving up
+ (s/def ::maxRestarts
+ pos-int?)
+
;; Restrict Generation to these profile IDs
- (s/def ::gen-profiles
+ (s/def ::genProfiles
(s/every ::prof/id))

;; Restrict Generation to these pattern IDs
- (s/def ::gen-patterns
+ (s/def ::genPatterns
(s/every ::pat/id))

(defn- ordered-timestamps?
@@ -82,9 +86,9 @@
:opt-un [::end
::from
::max
- ::max-retries
- ::gen-profiles
- ::gen-patterns])
+ ::maxRestarts
+ ::genProfiles
+ ::genPatterns])
ordered-timestamps?))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -100,17 +104,20 @@
;; Defaults
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

+ (def utc-timezone "UTC")
+ (def default-max-restarts 5)
+
(defn apply-defaults
"Apply defaults to `params` with the current time and a random seed.
If `params` is not provided simply return the default parameters."
([]
(apply-defaults {}))
- ([{:keys [start from timezone seed max-retries] :as params}]
+ ([{:keys [start from timezone seed maxRestarts] :as params}]
(merge
params
(let [start (or start (.toString (Instant/now)))]
{:start start
:from (or from start)
- :timezone (or timezone "UTC")
+ :timezone (or timezone utc-timezone)
:seed (or seed (random/rand-unbound-int (random/rng)))
- :max-retries (or max-retries 5)}))))
+ :maxRestarts (or maxRestarts default-max-restarts)}))))
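
One consequence of the rename may be worth illustrating: because `apply-defaults` now destructures `maxRestarts`, a parameters map that still carries the old `:max-retries` key would receive the default of 5 rather than its stated value, assuming keys reach the function unchanged. The sketch below is a simplified standalone copy for experimentation, not the function in this diff; the name `apply-defaults-sketch` and the use of `rand-int` in place of the project's `random` namespace are assumptions.

```clojure
(import 'java.time.Instant)

(def utc-timezone "UTC")
(def default-max-restarts 5)

(defn apply-defaults-sketch
  "Simplified stand-in for `apply-defaults`: fills in missing parameters."
  [{:keys [start from timezone seed maxRestarts] :as params}]
  (merge
   params
   (let [start (or start (.toString (Instant/now)))]
     {:start       start
      :from        (or from start)
      :timezone    (or timezone utc-timezone)
      ;; Assumption: any integer seed will do for this sketch.
      :seed        (or seed (rand-int Integer/MAX_VALUE))
      :maxRestarts (or maxRestarts default-max-restarts)})))

;; The new key is honored.
(:maxRestarts (apply-defaults-sketch {:maxRestarts 10}))
;; => 10

;; The old key is carried along but ignored; the default applies.
(select-keys (apply-defaults-sketch {:max-retries 10})
             [:max-retries :maxRestarts])
;; => {:max-retries 10, :maxRestarts 5}
```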
8 changes: 4 additions & 4 deletions src/main/com/yetanalytics/datasim/input/profile.clj
@@ -61,15 +61,15 @@
(into #{}))]
(for [[idx pattern-id] (map-indexed vector gen-patterns)
:when (not (contains? pattern-id-set pattern-id))]
- {:id (str "parameters-gen-patterns-" idx)
- :path [:parameters :gen-patterns idx]
+ {:id (str "parameters-genPatterns-" idx)
+ :path [:parameters :genPatterns idx]
:text (validate-pattern-filters-emsg pattern-id pattern-id-set)})))

(defn validate-profile-filters
[profiles gen-profiles]
(let [profile-id-set (->> profiles (map :id) (into #{}))]
(for [[idx profile-id] (map-indexed vector gen-profiles)
:when (not (contains? profile-id-set profile-id))]
- {:id (str "parameters-gen-profiles-" idx)
- :path [:parameters :gen-profiles idx]
+ {:id (str "parameters-genProfiles-" idx)
+ :path [:parameters :genProfiles idx]
:text (validate-profile-fitlers-emsg profile-id profile-id-set)})))
4 changes: 2 additions & 2 deletions src/main/com/yetanalytics/datasim/model.clj
@@ -44,7 +44,7 @@
(s/def ::pattern/bounds
::bounds/bounds)

- (s/def ::pattern/bound-retries
+ (s/def ::pattern/bound-restarts
(s/every ::xs/iri :kind set?))

(s/def ::pattern/period
@@ -56,7 +56,7 @@
(s/def ::pattern
(s/keys :opt-un [::pattern/weights
::pattern/bounds
- ::pattern/bound-retries
+ ::pattern/bound-restarts
::pattern/period
::pattern/repeat-max]))

16 changes: 10 additions & 6 deletions src/main/com/yetanalytics/datasim/sim.clj
@@ -55,7 +55,7 @@
(defn- temp-statement-seq
"Generate sequence of maps of `:template`, `:timestamp`, `:time-since-last`,
and `:registration` values."
- [inputs alignments seed max-retries timestamp registration-seq]
+ [inputs alignments seed max-restarts timestamp registration-seq]
(let [profile-rng
(random/seed-rng seed)
fill-statement-seq*
@@ -65,7 +65,11 @@
(let [profile-seed
(random/rand-unbound-int profile-rng)
template-maps
- (p/walk-profile-patterns inputs alignments profile-seed max-retries timestamp)
+ (p/walk-profile-patterns inputs
+ alignments
+ profile-seed
+ max-restarts
+ timestamp)
?next-timestamp
(:timestamp (meta template-maps))
template-maps*
@@ -125,13 +129,13 @@
"Generate a lazy sequence of xAPI Statements occuring as a Poisson
process. The sequence will either end at `?end-time` or, if `nil`,
be infinite."
- [input seed alignments start-time ?end-time ?from-time zone-region max-retries]
+ [input seed alignments start-time ?end-time ?from-time zone-region max-restarts]
(let [sim-rng (random/seed-rng seed)
reg-seed (random/rand-unbound-int sim-rng)
temp-seed (random/rand-unbound-int sim-rng)
stmt-rng (random/seed-rng (random/rand-unbound-int sim-rng))]
(->> (init-statement-seq reg-seed)
- (temp-statement-seq input alignments temp-seed max-retries start-time)
+ (temp-statement-seq input alignments temp-seed max-restarts start-time)
(drop-statement-seq ?end-time)
(seed-statement-seq stmt-rng)
(from-statement-seq ?from-time)
@@ -154,7 +158,7 @@
Spooky."
[{:keys [profiles personae-array models parameters]}]
(let [;; Input parameters
- {:keys [start end from timezone seed max-retries]} parameters
+ {:keys [start end from timezone seed maxRestarts]} parameters
;; RNG for generating the rest of the seeds
sim-rng (random/seed-rng seed)
;; Set timezone region and timestamps
@@ -200,7 +204,7 @@
?end-time
?from-time
zone-region
- max-retries)]
+ maxRestarts)]
(assoc m actor-id actor-stmt-seq)))
{}))))

26 changes: 13 additions & 13 deletions src/main/com/yetanalytics/datasim/xapi/profile.clj
@@ -98,13 +98,13 @@
:ret ::type-iri-map)

(defn select-primary-patterns
- "Given `type-iri-map` and the `gen-profiles` and `gen-patterns` params,
+ "Given `type-iri-map` and the `genProfiles` and `genPatterns` params,
update the Pattern map to further specify primary patterns for generation.
- Primary patterns in this context must be specified by `gen-profiles` or
- `gen-patterns`, or else they will no longer be counted as primary patterns."
- [type-iri-map {:keys [gen-profiles gen-patterns]}]
- (let [?profile-set (some-> gen-profiles not-empty set)
- ?pattern-set (some-> gen-patterns not-empty set)
+ Primary patterns in this context must be specified by `genProfiles` or
+ `genPatterns`, or else they will no longer be counted as primary patterns."
+ [type-iri-map {:keys [genProfiles genPatterns]}]
+ (let [?profile-set (some-> genProfiles not-empty set)
+ ?pattern-set (some-> genPatterns not-empty set)
primary-pat? (fn [profile-id pattern-id]
(and (or (nil? ?profile-set)
(contains? ?profile-set profile-id))
@@ -175,24 +175,24 @@
(partial tmp/update-parsed-rules-map profile-map*))))

(s/fdef walk-profile-patterns
- :args (s/cat :profile-map ::profile-map
- :alignments ::model/alignments
- :seed ::random/seed
- :max-retries pos-int?
- :start-time t/local-date-time?)
+ :args (s/cat :profile-map ::profile-map
+ :alignments ::model/alignments
+ :seed ::random/seed
+ :max-restarts pos-int?
+ :start-time t/local-date-time?)
:ret (s/every ::pat/template-map))

(defn walk-profile-patterns
"Walk the primary patterns of the compiled profiles,"
[{pattern-iri-map :pattern-map}
{pattern-alignments :patterns}
seed
- max-retries
+ max-restarts
start-time]
(let [pattern-rng (random/seed-rng seed)
root-pattern (get pattern-iri-map ::pat/root)
context {:pattern-map pattern-iri-map
:alignments-map pattern-alignments
- :max-retries max-retries
+ :max-restarts max-restarts
:rng pattern-rng}]
(pat/walk-pattern context [] start-time start-time root-pattern)))