Ziggy release 0.4.1 (2023-11-21)
wohler committed Nov 27, 2023
1 parent 59aca2a commit 122530e
Showing 144 changed files with 1,482 additions and 1,168 deletions.
4 changes: 2 additions & 2 deletions CITATION.cff
@@ -1,8 +1,8 @@
cff-version: 1.2.0
message: "If you use this software in your research, please cite it using these metadata."
title: Ziggy
version: v0.2.1
date-released: "2022-12-08"
version: v0.4.1
date-released: "2023-11-21"
abstract: "Ziggy, a portable, scalable infrastructure for science data processing pipelines, is the child of the Transiting Exoplanet Survey Satellite (TESS) pipeline and the grandchild of the Kepler Pipeline."
authors:
- family-names: Tenenbaum
2 changes: 2 additions & 0 deletions README.md
@@ -6,6 +6,8 @@
</a>
</div>

[![DOI](https://zenodo.org/badge/518211190.svg)](https://zenodo.org/badge/latestdoi/518211190)

[[Previous]](doc/user-manual/user-manual.md)
[[Up]](doc/user-manual/user-manual.md)
[[Next]](doc/user-manual/system-requirements.md)
4 changes: 3 additions & 1 deletion build.gradle
@@ -7,7 +7,7 @@
// To view a dependency graph using the taskinfo plugin, run "./gradlew tiTree build"

plugins {
id 'com.github.spotbugs' version '5.1.+'
id 'com.github.spotbugs' version '5.2.+'
id 'eclipse'
id 'jacoco'
id 'java'
@@ -198,11 +198,13 @@ jacocoTestReport {
javadoc {
title = "Ziggy API"
options.overview = "src/main/java/overview.html"
options.addBooleanOption("Xdoclint:-missing", true)
}

// The SpotBugs plugin adds spotbugsMain and spotbugsTest to the check task.
spotbugs {
// The SMP requires that all high priority problems are addressed before testing can commence.
// Set reportLevel to 'low' to reveal a handful of interesting and potential bugs.
reportLevel = 'high'
}

@@ -12,6 +12,7 @@

import org.gradle.api.DefaultTask;
import org.gradle.api.tasks.Input;
import org.gradle.api.tasks.OutputFile;
import org.gradle.api.tasks.TaskAction;

import com.google.common.collect.ImmutableList;
@@ -77,7 +78,7 @@ public class ZiggyVersionGenerator extends DefaultTask {
private static final int ABBREV = 10;

private static final String HEADER = "# This file is automatically generated by Gradle."
+ System.lineSeparator() + "# Do not edit." + System.lineSeparator();
+ System.lineSeparator();

private String versionPropertyName = DEFAULT_BUILD_VERSION_PROPERTY_NAME;
private String branchPropertyName = DEFAULT_BUILD_BRANCH_PROPERTY_NAME;
@@ -86,7 +87,7 @@ public class ZiggyVersionGenerator extends DefaultTask {
@TaskAction
public void generateVersionProperties() throws IOException, InterruptedException {

File outputFile = new File(getProject().getProjectDir(), buildConfiguration());
File outputFile = new File(getProject().getProjectDir(), getBuildConfiguration());

try (BufferedWriter output = new BufferedWriter(new FileWriter(outputFile))) {
output.write(HEADER);
@@ -124,7 +125,8 @@ public String runCommand(List<String> command) throws IOException, InterruptedEx
}

/** Override this to create your own subclass for pipeline-side version generation. */
protected String buildConfiguration() {
@OutputFile
public String getBuildConfiguration() {
return BUILD_CONFIGURATION;
}

8 changes: 4 additions & 4 deletions doc/user-manual/event-handler-examples.md
@@ -51,11 +51,11 @@ $

Once you do this, the Pi light on the console will quickly turn green. After a few seconds, you'll see a new pipeline instance appear in the instances panel:

<img src="images/event-handler-instances-1.png" style="width:19cm;"/>
<img src="images/event-handler-instances-1.png" style="width:13cm;"/>

The event handler automatically names the pipeline with the "bare" pipeline name ("sample"), the event handler name ("data-receipt"), and the timestamp of the event that started the processing. The Event name column shows the name of the event handler as well ("data-receipt"). Note that the Event name column is initially hidden as it duplicates the information in the pipeline name. If you want to sort the table by the event handler name, use the context menu in the table header to enable the Event name column. Then you can click in the header to update the sort. Meanwhile, the tasks table looks like this:

<img src="images/event-handler-tasks-1.png" style="width:19cm;"/>
<img src="images/event-handler-tasks-1.png" style="width:14cm;"/>

The data receipt task ran to completion before the display could even update, and the pipeline went on to its `permuter` tasks. After the usual few seconds, the pipeline will finish, with `flip` and `averaging` tasks.

@@ -103,11 +103,11 @@ $

As soon as the second ready file is created, a new pipeline instance will start:

<img src="images/event-handler-instances-2.png" style="width:19cm;"/>
<img src="images/event-handler-instances-2.png" style="width:13cm;"/>

The tasks display will look like this:

<img src="images/event-handler-tasks-2.png" style="width:19cm;"/>
<img src="images/event-handler-tasks-2.png" style="width:14cm;"/>

There's a fair amount of interesting stuff going on here, so let's dig into this display!

35 changes: 35 additions & 0 deletions doc/user-manual/halt-tasks.md
@@ -0,0 +1,35 @@
<!-- -*-visual-line-*- -->

[[Previous]](rerun-task.md)
[[Up]](ziggy-gui-troubleshooting.md)
[[Next]](select-hpc.md)

## Halting Tasks

Sometimes it's necessary to stop the execution of tasks after they start running. The batch system at an HPC facility provides command line tools for halting tasks that run as jobs under its control, but they're a hassle to use when you're trying to halt a large number of jobs. Trying to halt tasks running locally is likewise hassle-tastic.

Fortunately, Ziggy will let you do this from the console.

### Halt all Jobs for a Task

To halt all jobs for a task, go to the tasks table on the instances panel, right click the task, and run the `Halt selected tasks` command:

<img src="images/halt-task-menu-item.png" style="width:14cm;"/>

You'll be prompted to confirm that you want to halt the task. When you do that, you'll see something like this:

<img src="images/halt-in-progress.png" style="width:32cm;"/>

The state of the task will be immediately moved to `ERROR`. The instance will go to state `ERRORS_RUNNING` because the other task is still running; once it completes, the instance will go to `ERRORS_STALLED`. Meanwhile, the alert looks like this:

<img src="images/halt-alert.png" style="width:32cm;"/>

As expected, it notifies you that the task stopped because it was halted and not due to an error of some kind.
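
The state changes described above can be summarized in a few lines of code. The following is only a sketch of the rule as this article describes it, not Ziggy's implementation: the class, enum, and method names are hypothetical, and the real task and instance states may include more values than the ones shown here.

```java
import java.util.List;

/** Hypothetical sketch; not part of Ziggy. */
class InstanceStateSketch {

    // Simplified state sets, assumed for illustration only.
    enum TaskState { RUNNING, COMPLETED, ERROR }

    enum InstanceState { PROCESSING, COMPLETED, ERRORS_RUNNING, ERRORS_STALLED }

    /** Derives the instance state from the states of its tasks. */
    static InstanceState instanceState(List<TaskState> tasks) {
        boolean anyError = tasks.contains(TaskState.ERROR);
        boolean anyRunning = tasks.contains(TaskState.RUNNING);
        if (anyError) {
            // A halted task counts as an error; the instance stalls once nothing else is running.
            return anyRunning ? InstanceState.ERRORS_RUNNING : InstanceState.ERRORS_STALLED;
        }
        return anyRunning ? InstanceState.PROCESSING : InstanceState.COMPLETED;
    }

    public static void main(String[] args) {
        // One task halted while the other is still running.
        System.out.println(instanceState(List.of(TaskState.ERROR, TaskState.RUNNING)));
        // The other task has now completed.
        System.out.println(instanceState(List.of(TaskState.ERROR, TaskState.COMPLETED)));
    }
}
```

With one task halted and another still running, `instanceState` returns `ERRORS_RUNNING`; once the second task completes, it returns `ERRORS_STALLED`.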

### Halt all Tasks for an Instance

This works the same way, except that you use the pop-up menu for the instance table and select `Halt all incomplete tasks`.

[[Previous]](rerun-task.md)
[[Up]](ziggy-gui-troubleshooting.md)
[[Next]](select-hpc.md)
Binary file modified doc/user-manual/images/event-handler-instances-1.png
Binary file modified doc/user-manual/images/event-handler-instances-2.png
Binary file modified doc/user-manual/images/event-handler-tasks-1.png
Binary file modified doc/user-manual/images/event-handler-tasks-2.png
Binary file modified doc/user-manual/images/flip-tasks.png
Binary file added doc/user-manual/images/halt-alert.png
Binary file added doc/user-manual/images/halt-in-progress.png
Binary file added doc/user-manual/images/halt-task-menu-item.png
Binary file modified doc/user-manual/images/instances-running.png
Binary file removed doc/user-manual/images/kill-alert.png
Binary file removed doc/user-manual/images/kill-in-progress.png
Binary file removed doc/user-manual/images/kill-task-menu-item.png
Binary file modified doc/user-manual/images/monitoring-alerts.png
Binary file modified doc/user-manual/images/permuter-tasks.png
Binary file modified doc/user-manual/images/pipeline-done.png
Binary file modified doc/user-manual/images/restart-dialog.png
Binary file modified doc/user-manual/images/tasks-done.png
35 changes: 0 additions & 35 deletions doc/user-manual/kill-tasks.md

This file was deleted.

6 changes: 3 additions & 3 deletions doc/user-manual/rerun-task.md
@@ -2,7 +2,7 @@

[[Previous]](display-logs.md)
[[Up]](ziggy-gui-troubleshooting.md)
[[Next]](kill-tasks.md)
[[Next]](halt-tasks.md)

## Re-Run or Resume a Failed Task

@@ -12,7 +12,7 @@ Let's assume at this point that you've investigated the cause of your failed tas

To view the restart dialog box, select your failed task or tasks, right click, and run the `Restart failed tasks` command. You'll see this:

<img src="images/restart-dialog.png" style="width:15cm;"/>
<img src="images/restart-dialog.png" style="width:17cm;"/>

If you right-click on the `Restart Mode` field, you'll get a menu that offers several restart options.

@@ -60,4 +60,4 @@ In real life, it's possible that you'll encounter a situation like this one, in

[[Previous]](display-logs.md)
[[Up]](ziggy-gui-troubleshooting.md)
[[Next]](kill-tasks.md)
[[Next]](halt-tasks.md)
4 changes: 2 additions & 2 deletions doc/user-manual/select-hpc.md
@@ -1,6 +1,6 @@
<!-- -*-visual-line-*- -->

[[Previous]](kill-tasks.md)
[[Previous]](halt-tasks.md)
[[Up]](user-manual.md)
[[Next]](remote-parameters.md)

@@ -95,6 +95,6 @@ Second: Ziggy allows you to automate the decision on whether a given number of s

By using these two parameters, you can, in effect, tell Ziggy in advance about your decisions about whether to resubmit a task and whether to use remote execution even if the number of subtasks to process is fairly small.

[[Previous]](kill-tasks.md)
[[Previous]](halt-tasks.md)
[[Up]](user-manual.md)
[[Next]](remote-parameters.md)
2 changes: 1 addition & 1 deletion doc/user-manual/start-pipeline.md
@@ -58,7 +58,7 @@ The pipeline and worker lights are grey again, the instance and all the tasks sh

At this point, you'd probably like an explanation of just what everything on the `Instances` panel is trying to tell you. If so, read on! Specifically, the article on [The Instances Panel](instances-panel.md).

<sup>1</sup> Although the Event name column appears in these screenshots, it is now initially hidden to save space. See the article [Event Handler Examples](event-handler-examples.md) for information on how to show and use this column.
<sup>1</sup> The Event name column is initially hidden to save space. See the article [Event Handler Examples](event-handler-examples.md) for information on how to show and use this column.

[[Previous]](ziggy-gui.md)
[[Up]](ziggy-gui.md)
2 changes: 1 addition & 1 deletion doc/user-manual/user-manual.md
@@ -66,7 +66,7 @@ Ziggy is "A Pipeline management system for Data Analysis Pipelines." This is the

11.4.​ [Re-Run or Resume a Failed Task](rerun-task.md)

11.5.​ [Killing Tasks](kill-tasks.md)
11.5.​ [Halting Tasks](halt-tasks.md)

12. [High Performance Computing](select-hpc.md)

2 changes: 1 addition & 1 deletion doc/user-manual/ziggy-gui-troubleshooting.md
@@ -24,7 +24,7 @@ For any given task you can display the task's log files directly from the consol

Ziggy also gives you options when you've decided what you want to do about a failed task.

### [Killing Tasks](kill-tasks.md)
### [Halting Tasks](halt-tasks.md)

You can stop a job or two if necessary.

2 changes: 1 addition & 1 deletion gradle.properties
@@ -7,7 +7,7 @@ org.gradle.parallel = true
// The version is updated when the first release candidate is created
// while following Release Branches in Appendix C of the SMP, Git
// Workflow. This property is used when publishing Ziggy.
version = 0.4.0
version = 0.4.1

// The Maven group for the published Ziggy libraries.
group = gov.nasa
2 changes: 1 addition & 1 deletion src/main/java/gov/nasa/ziggy/crud/AbstractCrud.java
@@ -66,7 +66,7 @@ protected final DatabaseService getDatabaseService() {
* Convenience method that returns the current persistence session. Do not cache this locally as
* it can vary between threads.
*
* @return the persistence session.
* @return the persistence session
*/
protected final Session getSession() {
return getDatabaseService().getSession();
@@ -12,7 +12,6 @@
import java.util.SortedSet;
import java.util.TreeSet;

import gov.nasa.ziggy.module.PipelineException;
import gov.nasa.ziggy.pipeline.definition.PipelineTask;
import gov.nasa.ziggy.pipeline.definition.crud.PipelineTaskCrud;
import gov.nasa.ziggy.util.AcceptableCatchBlock;
@@ -49,10 +48,8 @@ public String produceReport() {
/**
* Calculates the transitive closure of the relation produced(c,p).
*
* @param initialTaskIds
* @param acctCrud
* @return A {@link Map} from consumer to producer task id. If nothing then this returns an
* empty map.
* @return a {@link Map} from consumer to producer task id. If nothing then this returns an
* empty map
*/
Map<Long, Set<Long>> calculateClosure() {
Map<Long, Set<Long>> consumerProducer = new HashMap<>();
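
For readers unfamiliar with the term, the transitive closure of a consumer-to-producer relation can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the code in this class: `ClosureSketch`, `directProducers`, and `closure` are hypothetical names, and the sketch works on an in-memory map, whereas the diff does not show how the real method obtains its data.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; not part of Ziggy. */
class ClosureSketch {

    /**
     * Returns, for each consumer, every task id that directly or indirectly produced its inputs.
     */
    static Map<Long, Set<Long>> closure(Map<Long, Set<Long>> directProducers) {
        Map<Long, Set<Long>> result = new HashMap<>();
        for (Long consumer : directProducers.keySet()) {
            Set<Long> reached = new TreeSet<>();
            Deque<Long> pending = new ArrayDeque<>(directProducers.get(consumer));
            while (!pending.isEmpty()) {
                Long producer = pending.pop();
                // add() returns false for ids already visited, which also guards against cycles.
                if (reached.add(producer)) {
                    pending.addAll(directProducers.getOrDefault(producer, Set.of()));
                }
            }
            result.put(consumer, reached);
        }
        return result;
    }

    public static void main(String[] args) {
        // Task 3 consumed the output of task 2, which consumed the output of task 1.
        Map<Long, Set<Long>> direct = Map.of(3L, Set.of(2L), 2L, Set.of(1L));
        System.out.println(closure(direct).get(3L)); // prints [1, 2]
    }
}
```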
@@ -88,10 +85,6 @@ Map<Long, Set<Long>> calculateClosure() {
return consumerProducer;
}

/**
* @param consumerProducer consumer->producer
* @return producer->consumer
*/
Map<Long, Set<Long>> invertMap(Map<Long, Set<Long>> consumerProducer) {
Map<Long, Set<Long>> producerConsumer = new HashMap<>();

@@ -111,8 +104,6 @@ Map<Long, Set<Long>> invertMap(Map<Long, Set<Long>> consumerProducer) {
/**
* Finds the ids which are not pointed at by anything. Find all the producers which are not
* themselves consumers.
*
* @return
*/
Set<Long> findRoots(Set<Long> producers, Map<Long, Set<Long>> consumerProducer,
Map<Long, Set<Long>> producerConsumer) {
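
The inversion and root-finding steps named in these hunks can be illustrated with a short sketch. It assumes, as the Javadoc says, that a root is a producer that is not itself a consumer. The class and method names are hypothetical, and the signatures are simplified compared with the methods in the diff.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch; not part of Ziggy. */
class RootsSketch {

    /** Turns a consumer -> producers map into a producer -> consumers map. */
    static Map<Long, Set<Long>> invert(Map<Long, Set<Long>> consumerProducer) {
        Map<Long, Set<Long>> producerConsumer = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> entry : consumerProducer.entrySet()) {
            for (Long producer : entry.getValue()) {
                producerConsumer.computeIfAbsent(producer, k -> new TreeSet<>())
                    .add(entry.getKey());
            }
        }
        return producerConsumer;
    }

    /** Returns the producers that do not consume anything themselves. */
    static Set<Long> roots(Map<Long, Set<Long>> consumerProducer) {
        Set<Long> roots = new TreeSet<>(invert(consumerProducer).keySet());
        for (Map.Entry<Long, Set<Long>> entry : consumerProducer.entrySet()) {
            // Only ids that actually have producers count as consumers.
            if (!entry.getValue().isEmpty()) {
                roots.remove(entry.getKey());
            }
        }
        return roots;
    }

    public static void main(String[] args) {
        // Task 3 consumed the output of task 2, which consumed the output of task 1.
        Map<Long, Set<Long>> consumerProducer = Map.of(3L, Set.of(2L), 2L, Set.of(1L));
        System.out.println(roots(consumerProducer)); // prints [1]
    }
}
```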
@@ -142,14 +133,6 @@ Set<Long> findRoots(Set<Long> producers, Map<Long, Set<Long>> consumerProducer,
return roots;
}

/**
* @param producerConsumer producer -&gt; consumer
* @param roots The top level consumers.
* @return report
* @throws IOException
* @throws PipelineException
*/

protected String formatReport(Map<Long, Set<Long>> producerConsumer, Set<Long> roots) {
List<Long> sortedRoots = new ArrayList<>(roots);
Collections.sort(sortedRoots);
33 changes: 13 additions & 20 deletions src/main/java/gov/nasa/ziggy/data/management/DataFileManager.java
@@ -127,8 +127,9 @@ public DataFileManager(DatastorePathLocator datastorePathLocator, PipelineTask p

/**
* Constructor with PipelineTask and Paths to the task directory and the datastore root. Used in
* pipeline modules that use the DefaultUnitOfWork and DataFileType instances to identify and
* manage files that need to be moved between the task directory and the datastore.
* pipeline modules that use the UnitOfWorkGenerator.defaultUnitOfWorkGenerator() and
* DataFileType instances to identify and manage files that need to be moved between the task
* directory and the datastore.
*/
public DataFileManager(Path datastoreRoot, Path taskDirectory, PipelineTask pipelineTask) {
this.pipelineTask = pipelineTask;
@@ -215,8 +216,7 @@ public Map<DataFileType, Set<Path>> copyDataFilesByTypeToTaskDirectory(Path data
*
* @param datastoreDataFilesMap files to be copied, in the form of a {@link Map} that uses
* {@link DataFileType} as its key and a {@link Set} of data file {@link Path} instances as the
* map values.
* @param taskConfig {@link TaskConfigurationParameters} instance.
* map values
*/
public Map<DataFileType, Set<Path>> copyDataFilesByTypeToTaskDirectory(
Map<DataFileType, Set<Path>> datastoreDataFilesMap) {
@@ -227,13 +227,11 @@ public Map<DataFileType, Set<Path>> copyDataFilesByTypeToTaskDirectory(
for (Set<Path> paths : datastoreFilesMap.values()) {
datastoreFiles.addAll(paths);
}
// obtain the originators for all datastore files and add them as producers to the
// current pipeline task; also delete any existing ones so that in the event of a
// reprocess the correct information is reflected.
Set<Long> producerTaskIds = datastoreProducerConsumerCrud()
.retrieveProducers(datastoreFiles);
pipelineTask.clearProducerTaskIds();
pipelineTask.setProducerTaskIds(producerTaskIds);

// Obtain the originators of all datastore files and set them as the producers of the
// current pipeline task, replacing any old ones so that a reprocess records correct data.
pipelineTask
.setProducerTaskIds(datastoreProducerConsumerCrud().retrieveProducers(datastoreFiles));

return datastoreFilesMap;
}
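
The change in this hunk drops the explicit `clearProducerTaskIds()` call, which only works if `setProducerTaskIds` replaces the stored ids rather than appending to them. `PipelineTask` is not shown in this diff, so the following is just a sketch of a setter with those semantics. `TaskSketch` and its field are hypothetical stand-ins.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a task that tracks its producers; not part of Ziggy. */
class TaskSketch {
    private final Set<Long> producerTaskIds = new HashSet<>();

    /** Replaces, rather than appends to, the current producer ids. */
    void setProducerTaskIds(Collection<Long> newIds) {
        producerTaskIds.clear();
        producerTaskIds.addAll(newIds);
    }

    /** Returns a copy so that callers cannot modify the internal set. */
    Set<Long> getProducerTaskIds() {
        return new HashSet<>(producerTaskIds);
    }

    public static void main(String[] args) {
        TaskSketch task = new TaskSketch();
        task.setProducerTaskIds(Set.of(1L, 2L));
        // A reprocess supplies a new set of producers; the old ids must not linger.
        task.setProducerTaskIds(Set.of(3L));
        System.out.println(task.getProducerTaskIds()); // prints [3]
    }
}
```

With a setter like this, the caller can pass the result of a query straight in, as the new code does, without a separate clear step.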
@@ -243,8 +241,6 @@ public Map<DataFileType, Set<Path>> copyDataFilesByTypeToTaskDirectory(
*
* @param datastoreSubDir subdirectory of datastore to use as the file source
* @param dataFileTypes set of DataFileType instances to use for the search
* @param taskConfig {@link TaskConfigurationManager} instance that indicates whether full
* processing or "keep ahead" processing is performed
* @return non-{@code null} set of {@link Path} instances for data files to be used as input
*/
public Set<Path> dataFilesForInputs(Path datastoreSubDir, Set<DataFileType> dataFileTypes) {
@@ -503,13 +499,10 @@ public void copyToTaskDirectory(Set<? extends DataFileInfo> dataFiles) {
taskDirectory.resolve(dataFileInfo.getName()));
}

// obtain the originators for all datastore files and add them as producers to the
// current pipeline task; also delete any existing ones so that in the event of a
// reprocess the correct information is reflected.
Set<Long> producerTaskIds = datastoreProducerConsumerCrud()
.retrieveProducers(new HashSet<>(dataFileInfoToPath.values()));
pipelineTask.clearProducerTaskIds();
pipelineTask.setProducerTaskIds(producerTaskIds);
// Obtain the originators of all datastore files and set them as the producers of the
// current pipeline task, replacing any old ones so that a reprocess records correct data.
pipelineTask.setProducerTaskIds(datastoreProducerConsumerCrud()
.retrieveProducers(new HashSet<>(dataFileInfoToPath.values())));
}

/**