Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warning for unbound indices in AO2D merging #234

Merged
merged 3 commits into from
Sep 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions docs/hyperloop/userdocumentation.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@
<div align="center">
<img src="../images/wagonShortcuts.png" width="80%">
</div>

Check failure on line 60 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
## <a name="wagon-settings"></a> Wagon Settings

* <a name="wagonsettings"></a>In _Wagon settings_ you can modify the wagon name, work flow name, and select wagon's dependencies. The dependencies offered are wagons from the same _Analysis_ or from [_Service wagons_](#servicewagons).
Expand All @@ -65,7 +65,7 @@
<div align="center">
<img src="../images/wagonSettings.png" width="70%">
</div>

Check failure on line 68 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
## <a name="wagon-configuration"></a> Wagon Configuration

* <a name="wagonconfiguration"></a>In _Configuration_ the wagon configuration corresponding to the workflow will be available in the _Base_. The configuration is divided per _Task_, hence if you need to add a new parameter, you will need add it in the following order: task, parameter and value.
Expand All @@ -92,26 +92,26 @@

* In order to update the base and subwagon configuration with the latest version of the workflow, click on the button `↻ sync` in _Configuration_. By synchronizing the configuration, the parameters which no longer belong to the workflow will be removed, and the values of the wagon's _Base_ will be updated as well if they have not been modified by the user.

## <a name="wagon-derived-data"></a> Derived data

Check failure on line 95 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.

* <a name="wagonderived"></a>In _Derived Data_ the tables which are produced by the task are displayed. If activated, these are saved to the output if the train is run as a derived data production. The produced derived data can be made available by the operators and serve as input for subsequent trains.

Check failure on line 97 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.

Check failure on line 98 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
### <a name="deriveddatatypes"></a> Derived data types
* At the moment, there are two types of derived data specifications:
* Standard derived data (marked with 🗂️)- if the wagon is used in a train, this will produce derived data to be used for further analysis. The results will not be merged across runs and can be used as input for future train runs. Note that standard derived data trains do not submit automatically and may need additional approval. If in doubt, please seek advise before enabling derived data tables in your wagon configuration.
* Slim derived data (marked with green bordered 🗂️) - similarly to the standard derived data case, if used in a train, this will produce derived data to be used for further analysis. This is reserved for derived data of small output size. The results will be merged across runs and are not available to use in future train runs. The data will be automatically deleted after a preset period of time. You can mark a wagon for running as slim derived data by checking `Ready for slim derived data`.

Check failure on line 103 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
* For wagons set as ready for slim derived data, two more fields need to be correctly set:
* Max DF size - This sets the maximal dataframe size in the merging step. Has to be 0 for not-self contained derived data (which need parent file access).
* Max derived file size - Sets the size limit for the output file size of the derived data file. This is an expert parameter which usually does not have to be changed. Only change this value if the processing in subsequent trains takes so long that the jobs fail. If set to 0 a good value will be automatically determined.

Check failure on line 107 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
* In order to update the derived data configuration with the latest version of the workflow, click on the button `↻ sync` in _Derived data_. By synchronizing the derived data, the tables which no longer belong to the workflow will be removed, and the values of the tables will be updated.

<div align="center">
<img src="../images/derivedDataEx.png" width="70%">
</div>

## <a name="wagon-test-statistics"></a> Test Statistics

Check failure on line 114 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.

* <a name="wagonteststatistics"></a>_Test Statistics_ contains three graphs that display different metrics following the tests this wagon was part of. The first graph plots the _PSS Memory_ corresponding to each test run. The second one diplays the _CPU Time_, _Wall time_ and _Throughput_ along the test runs for this wagon. Finally, the third graph shows the _Output size_ at each test run.

Expand Down Expand Up @@ -260,7 +260,7 @@
</div>

* If you only want to see the top 10 graph with the highest average, check the Show top 10 largest box.

Check failure on line 263 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
* To produce this type of performance graphs for a local O2 execution, follow the instructions [here](#producing-performance-graphs-for-a-local-o2-execution).

* Whenever a wagon configuration is changed, if there are enabled wagons (including wagons that depend on it), then the test is automatically reset and a new test is launched. However, if the enabled wagon was already composed in a train, the train will run with the wagons and dataset configuration of the time at which the train was created.
Expand All @@ -271,7 +271,7 @@

1. [Adding a new wagon](#addwagon): You can create a wagon with your unmerged or unreleased workflow. If the workflow is not available, add manually the configuration of the wagon, and subwagons if needed. You can synchronize the wagon's configuration once the package tag that includes your pull request has been released.
2. [Enabling a wagon in a dataset](#enabling-a-wagon): If you need to enable your wagon with workflow that is unmerged or unreleased, use a `Future tag based on pull request`. There is a list of the latest merged and unmerged pull requests available in the system, you can see the pull request number and description. Select the _pull request tag_ and enable the wagon in a dataset. By doing this, the wagon will be queued to test, and the test will begin once the _pull request_ has been merged to a package tag, and the package tag is released. And then, if the test is successful, it'll be composed in a train with the latest package tag available.

Check failure on line 274 in docs/hyperloop/userdocumentation.md

View workflow job for this annotation

GitHub Actions / PR formatting / whitespace

Trailing spaces

Remove the trailing spaces at the end of the line.
## <a name="warnings"></a> Warnings

When a wagon test finishes in warning, this means that the wagon will not be included in the automatic composition schedule. Therefore, train composition can be requested in the Operation channel, where an operator will take care of the request. Before doing so, please review if you cannot fix the cause of the warning yourself. Depending on the nature of the warning and the degree of exceeding specific constraints, the operator will either compose your train or advise you to review and improve certain parts before requesting a train again. In the latter case, the user can analyze the test and review the logs, searching for ways of improving resource usage or other elements that caused the exceptions.
Expand Down Expand Up @@ -336,6 +336,14 @@
</div>

* This is specific to tests with wagons set as ready for slim derived data. As the entire output is merged into one single file, there is a limit of 4000 MB for this. If exceeded, the user is advised to switch to standard derived data by unchecking the option “Ready for slim derived data” in the wagon edit view. Then a request for standard derived data train can be made.

### 8. <a name="warning-unbound-indices"></a>Unbound indices detected in AO2D merging

<div align="center">
<img src="../images/warningUnboundIndices.png" width="40%">
</div>

* For derived data trains, it notifies the detection of unbound columns during AO2D merging. This means that one of the output tables which has been asked to be stored has index columns to tables which are not within the output. This usually points to a bad or broken data model definition and should be fixed. The only case where this is expected and not worrisome is linked derived data. For both slim derived data and standard derived data, the data model should be fixed.

It is possible that a wagon test will produce multiple warnings. In that case, the same checks above will be done for each warning present, and the decision making regarding train submission will be done considering all the exceptions.

Expand Down
Binary file added docs/images/warningUnboundIndices.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading