diff --git a/docs/GettingStarted.md b/docs/GettingStarted.md
index 6dce18190..9511aad70 100644
--- a/docs/GettingStarted.md
+++ b/docs/GettingStarted.md
@@ -27,6 +27,7 @@ You can get a copy of DataHelix by one of the following means:
 - Install via Chocolatey
 - Download the zip file
 - Clone and build the project
+- Install via apt-get
 
 ### Install via Chocolatey
 
@@ -60,6 +61,15 @@ You are also welcome to download the source code and build the generator yoursel
 
 Datahelix is under active development so expect new features and bug fixes. Please feel free to share any issues, feature requests, or ideas via the [GitHub issues page](https://github.com/finos/datahelix/issues).
 
+### Install via apt-get
+If you are using Debian or a Debian-based Linux distribution such as Ubuntu, you can install and update DataHelix using `apt-get` or `apt`. To do this you will need to add the DataHelix package repository as a source. You can do this by adding the line
+```
+deb [trusted=yes] https://apt.fury.io/datahelix/ /
+```
+to your sources file, which can be found at `/etc/apt/sources.list`. This will require superuser privileges. After refreshing the package index with the command `apt-get update`, DataHelix can be installed with the command `apt-get install datahelix`.
+
+Installing the package sets up DataHelix so that it can be run from the terminal with the command `datahelix`. It also installs a manual page, which can be viewed with the command `man datahelix`.
+
 ## Creating your first profile
 
 We are going to work through creating a profile to generate random personal data about some test users.
diff --git a/docs/developer/packaging/Apt-get.md b/docs/developer/packaging/Apt-get.md
new file mode 100644
index 000000000..0b9d8e4ac
--- /dev/null
+++ b/docs/developer/packaging/Apt-get.md
@@ -0,0 +1,18 @@
+## Apt-get
+
+Apt-get (Advanced Package Tool get) is a package manager that is part of Debian Linux and Debian-based Linux distributions, such as Ubuntu.
+The DataHelix tool is published to a private repository which can be added as a source to apt-get, allowing installation and automatic updates.
+
+### Building the .deb package
+
+A `.deb` package can be built using the `createDeb` Gradle task, which can be run with the command `gradle createDeb`.
+This task packages the DataHelix jar and a launcher shell script so that, once installed, `datahelix` can be run from any terminal. It also packages a compressed documentation file, or 'manual page', which can be viewed with the command `man datahelix` (provided a man page viewer is installed).
+This [documentation file](../../../orchestrator/src/main/resources/datahelix.1) may need to be updated when command line options change. After editing, it should be compressed using [gzip](https://www.gnu.org/software/gzip/) with the command `gzip -k -9 $path` (maximum compression, keep original file).
+
+### Publishing a new version of the package
+
+The package is hosted on [gemfury](https://gemfury.com/), a hosted package repository.
+A new version of the package can be uploaded in two ways.
+To upload through the site, log in with the user details located in the shared folder, click the button on the 'Upload' tab, and select the `.deb` file.
+Alternatively, it is possible to upload via a push request (e.g. by using cURL on the command line).
+ [This is documented on the gemfury site](https://gemfury.com/help/upload-packages).
\ No newline at end of file
diff --git a/chocolatey/README.md b/docs/developer/packaging/Chocolatey.md
similarity index 100%
rename from chocolatey/README.md
rename to docs/developer/packaging/Chocolatey.md
diff --git a/orchestrator/build.gradle b/orchestrator/build.gradle
index 780166f0c..6c38bc130 100644
--- a/orchestrator/build.gradle
+++ b/orchestrator/build.gradle
@@ -18,6 +18,7 @@ plugins {
     id "java"
     id "de.gliderpilot.semantic-release" version "1.4.0"
     id "application"
+    id "nebula.deb" version "8.0.3"
 }
 
 group "com.scottlogic.datahelix.generator"
@@ -97,6 +98,30 @@ task fatJar(type: Jar) {
     archiveName = "${baseName}.${extension}"
 }
 
+task createDeb(type: Deb) {
+    packageName = "datahelix"
+    summary = 'Quickly generate rich and realistic data for simulation and testing. '
+    maintainer = "Data Helix Team "
+    license = "Apache-2.0"
+    url = "https://github.com/finos/datahelix"
+    requires('java8-runtime-headless')
+    requires('jarwrapper')
+    dependsOn("fatJar")
+    from('src/main/resources/datahelix'){
+        into '/usr/bin'
+        fileMode 0755
+        user 'root'
+    }
+    from('build/libs/datahelix.jar'){
+        into '/usr/share/datahelix'
+        user 'root'
+    }
+    from('src/main/resources/datahelix.1.gz'){
+        into '/usr/share/man/man1'
+        user 'root'
+    }
+}
+
 jar {
     manifest {
         attributes 'Main-Class': 'com.scottlogic.datahelix.generator.orchestrator.App'
diff --git a/orchestrator/src/main/resources/datahelix b/orchestrator/src/main/resources/datahelix
new file mode 100644
index 000000000..49e501e54
--- /dev/null
+++ b/orchestrator/src/main/resources/datahelix
@@ -0,0 +1,3 @@
+#!/bin/bash
+# Launch script to kick off Java JAR (/usr/bin/datahelix)
+java -jar /usr/share/datahelix/datahelix.jar "$@"
\ No newline at end of file
diff --git a/orchestrator/src/main/resources/datahelix.1 b/orchestrator/src/main/resources/datahelix.1
new file mode 100644
index 000000000..ee886a743
--- /dev/null
+++ b/orchestrator/src/main/resources/datahelix.1
@@ -0,0 +1,106 @@
+.TH DATAHELIX 1
+
+.SH NAME
+datahelix \- the open-source data generator.
+
+.SH SYNOPSIS
+.B datahelix
+[\fB\-c\fR \fIcombinationType\fR | \fB\-\-combination-strategy=\fR\fIcombinationType\fR]
+[\fB\-\-disable-schema-validation\fR]
+[\fB\-h\fR | \fB\-\-help\fR]
+[\fB\-n\fR \fImaxRows\fR | \fB\-\-max-rows=\fR\fImaxRows\fR]
+[\fB\-o\fR \fIoutputPath\fR | \fB\-\-output-path=\fR\fIoutputPath\fR]
+[\fB\-\-output-format=\fR\fIoutputFormat\fR]
+[\fB\-p\fR \fIprofileFile\fR | \fB\-\-profile-file=\fR\fIprofileFile\fR]
+[\fB\-\-quiet\fR]
+[\fB\-\-replace\fR]
+[\fB\-\-set-from-file-directory=\fR\fIfromFilePath\fR]
+[\fB\-V\fR | \fB\-\-version\fR]
+[\fB\-\-verbose\fR]
+[\fB\-\-visualiser-level=\fR\fIvisualiserLevel\fR]
+[\fB\-\-visualiser-output-folder=\fR\fIvisualiserOutputFolder\fR]
+
+.SH DESCRIPTION
+The generation of representative test and simulation data is a challenging and time-consuming task.
+Although DataHelix was created to address a specific challenge in the financial services industry, you will find it a useful tool for the generation of realistic data for simulation and testing, regardless of industry sector.
+All this from a straightforward JSON data profile document.
+
+.PP
+For further documentation, examples, a getting started guide, and a profile creation guide, visit the project's GitHub page at github.com/finos/datahelix
+
+.SH OPTIONS
+.TP
+.BR \-c ", " \-\-combination-strategy=\fIcombinationType\fR
+Determines the type of combination strategy used
+(EXHAUSTIVE, PINNING, MINIMAL)
+
+.TP
+.BR \-\-disable-schema-validation
+Disables schema validation
+
+.TP
+.BR \-h ", " \-\-help
+Display available command line options.
+
+.TP
+.BR \-n ", " \-\-max-rows=\fImaxRows\fR
+Defines the maximum number of rows that should be generated
+
+.TP
+.BR \-o ", " \-\-output-path=\fIoutputPath\fR
+The path to write the generated data file to.
+
+.TP
+.BR \-\-output-format=\fIoutputFormat\fR
+Output format
+(CSV, JSON)
+
+.TP
+.BR \-p ", " \-\-profile-file=\fIprofileFile\fR
+The path of the profile JSON file.
+
+.TP
+.BR \-\-quiet
+Turns OFF default monitoring
+
+.TP
+.BR \-\-replace
+Defines whether to overwrite/replace existing output files.
+
+.TP
+.BR \-\-set-from-file-directory=\fIfromFilePath\fR
+Custom root for loading sets from file.
+
+.TP
+.BR \-V ", " \-\-version
+Print version information and exit.
+
+.TP
+.BR \-\-verbose
+Turns ON system out monitoring
+
+.TP
+.BR \-\-visualiser-level=\fIvisualiserLevel\fR
+Visualiser level
+(OFF, STANDARD, DETAILED)
+
+.TP
+.BR \-\-visualiser-output-folder=\fIvisualiserOutputFolder\fR
+The path to the folder to write the generated visualiser files to (only used if visualiser-level != OFF).
+
+.SH EXAMPLES
+.TP
+.BR datahelix " " \-\-max-rows=100 " " \-\-replace " " \-\-profile-file=profile.json " " \-\-output-path=output.csv
+The generator is a command line tool which reads a profile and outputs data in CSV or JSON format. The \-\-max-rows=100 option tells the generator to create 100 rows of data, and the \-\-replace option tells it to overwrite previously generated files.
+The compulsory \-\-profile-file option specifies the name of the input profile, and the \-\-output-path option specifies the location to write the output to.
+In generate mode \-\-output-path is optional; the generator will default to standard output if it is not supplied.
+By default the generator outputs progress, in rows per second, to the standard error output.
+This can be useful when generating large volumes of data.
+.SH NOTES
+The GitHub page for the DataHelix project can be found at https://github.com/finos/datahelix.
+If you wish to contribute, request a feature, or report a bug, please do so there.
+
+.PP
+Copyright 2020 Scott Logic Ltd.
+Distributed under the Apache License, Version 2.0.
+SPDX-License-Identifier: Apache-2.0.
diff --git a/orchestrator/src/main/resources/datahelix.1.gz b/orchestrator/src/main/resources/datahelix.1.gz
new file mode 100644
index 000000000..7437a4594
Binary files /dev/null and b/orchestrator/src/main/resources/datahelix.1.gz differ
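
The install steps that this change adds to `docs/GettingStarted.md` could be scripted roughly as follows. This is a minimal sketch and not part of the change itself: the function name `add_datahelix_source` is hypothetical, while the repository line and the `apt-get` commands are the ones quoted in the documentation. The helper appends the repository line only when it is not already present, so re-running it does not create duplicate entries.

```shell
#!/bin/sh
# Repository line as quoted in docs/GettingStarted.md.
REPO_LINE='deb [trusted=yes] https://apt.fury.io/datahelix/ /'

# add_datahelix_source FILE  (hypothetical helper, for illustration only)
# Appends REPO_LINE to the given apt sources file unless an identical
# line already exists there.
add_datahelix_source() {
    sources_file="$1"
    # -F: match as a fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq "$REPO_LINE" "$sources_file" 2>/dev/null; then
        printf '%s\n' "$REPO_LINE" >> "$sources_file"
    fi
}

# Typical use, run with superuser privileges:
#   add_datahelix_source /etc/apt/sources.list
#   apt-get update
#   apt-get install datahelix
```

The privileged commands are left as comments so that the helper can be tried against a scratch file before touching `/etc/apt/sources.list`.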