diff --git a/README.md b/README.md
index 15e6d8ce..b9bcd55b 100755
--- a/README.md
+++ b/README.md
@@ -186,20 +186,8 @@ data.pos: positions `pos` concatenated across manifolds
 data.x: vectors `x` concatenated across manifolds
 data.y: labels for each point denoting which manifold it belongs to
 data.edge_index: edge list of proximity graph (each manifold gets its graph, disconnected from others)
-data.gauges: local coordinate bases when `local_gauge=True`
 ```
-### How to pick good parameters
-
-Choosing good parameters for the description of manifold, in particular `spacing` and `k`, can be essential for the success of your analysis. The illustration below shows three different scenarios to give you intuition.
-
-1. (left) **'Optimal' scenario.** Here, the sample spacing along trajectories and between trajectories is comparable and `k` is chosen such that the proximity graph connects to neighbours but no further. At the same time, `k` is large enough to have enough neighbours for gradient approximation. Notice the trade-off here.
-2. (middle) **Suboptimal scenario 1.** Here, the sample spacing is much smaller along the trajectory than between trajectories. This is probably frequently encountered when there are few trials relative to the dimension of the manifold and the size of the basin of attraction. Fitting a proximity graph to this dataset will lead to a poorly connected manifold or having too many neighbours pointing to consecutive points on the trajectory, leading to poor gradient approximation. Also, too-dense discretisation will mean that second-order features will not pick up on second-order features (curvature) of the trajectories. **Fix:** either increase `spacing` and/or subsample your trajectories before using `construct_dataset()`.
-3. (right) **Suboptimal scenario 2.** Here, there are too few sample points relative to the curvature of the trajectories. As a result, the gradient approximation will be inaccurate. **Fix:** decrease `spacing` or collect more data.
-
-
-
-
 ### Training