diff --git a/.cproject b/.cproject new file mode 100644 index 0000000..f9670e5 --- /dev/null +++ b/.cproject @@ -0,0 +1,68 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + make + + clean + true + false + true + + + + diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..70cfadb --- /dev/null +++ b/.gitignore @@ -0,0 +1,13 @@ +**/.gradle* +**/__pycache__ +**/build +*.mexmaci64 +.classpath +.cproject +.project +.pydevproject +.settings +dist +doc/pdf +src/test/cpp/ExternalProcess/testprog.o +src/test/cpp/testmi/inputs-0.bin diff --git a/LICENSE.pdf b/LICENSE.pdf new file mode 100644 index 0000000..0dd8cbe Binary files /dev/null and b/LICENSE.pdf differ diff --git a/README.md b/README.md new file mode 100644 index 0000000..5822985 --- /dev/null +++ b/README.md @@ -0,0 +1,72 @@ + + +# Super-Basic Introduction + +## What is Ziggy? + +Ziggy is "A Pipeline management system for Data Analysis Pipelines." + +What does that mean? + +A data analysis pipeline is any data analysis software that proceeds in a step-by-step fashion, in which the inputs to the later steps include (but are not limited to) the outputs from the earlier steps. The vast majority of science data analysis falls into this category. + +A pipeline management system is all of the functionality that makes such a pipeline work. It's everything other than the actual software packages that do the processing. That includes execution of the algorithms on the data, but also such activities as: routing logging messages to the correct destinations; automatically executing the next step when the current step has completed; ensuring that the pipeline does the right thing when an exception occurs (either in Ziggy or in one of the algorithm packages); providing a user interface so that operators can control and monitor activities; managing a datastore of inputs and results; providing persistence for all of the records that need to be preserved across time; and much more. + +## Why does Ziggy exist? + +So -- why should anyone use Ziggy, or for that matter any other "pipeline management system" for their data analysis needs? Here's why: + +Any data analysis activity that handles more than a trivial amount of data will require some sort of pipeline management system. At a minimum, it's going to be essential to ensure that all the data gets processed and that the processing is uniform, because otherwise any results from the processing become suspect: the user has to wonder, "If I missed processing some subset of the data, would that affect my results?" and, "If I didn't process all of the data the same way -- if I changed how I did the processing midway through my dataset -- will that affect my results?" Because of these issues, data analysis inevitably winds up applying some degree of automation to the process, even if it's just a handful of shell scripts that the user runs manually. + +As data volumes get larger, the issue of managing the pipeline becomes more and more onerous, and more and more crucial. At the same time, development and maintenance of the pipeline manager becomes more and more of a distraction to the subject matter experts who just want to perform their data analysis, get their results, publish their papers, etc. 
At some point, rather than taking on the job of writing all this software that's not in their area of interest, the subject matter experts should look around for some existing software that will do all this management for them -- something that allows them to plug their processing application software into the management system and, presto! Complete system.
+
+Ziggy is that "something."
+
+## Where can I run Ziggy?
+
+Ziggy is actually pretty lightweight. During development, we run Ziggy on some fairly standard laptops without any problems, so you shouldn't have any trouble downloading, building, and trying out Ziggy.
+
+In terms of where you run Ziggy, what's more important than Ziggy itself is the data volume you need to process and the requirements you place on things like keeping the data in a location where multiple users have access to it. Depending on the answers to those questions, you might be able to run your analysis on a laptop, a workstation, a server, or a cloud or high-performance computing (HPC) environment. Ziggy has been used in all of these settings, depending on the task at hand.
+
+## When did Ziggy get its start?
+
+Ziggy was originally written as the pipeline infrastructure (PI) component for the pipeline that processed data from NASA's Kepler mission, which used transit photometry to look for signals of planets circling distant stars. It was run on server clusters and on the NASA Advanced Supercomputing (NAS) facility at NASA's Ames Research Center, and the original pipeline team was housed at Ames as well.
+
+A few years later, Ames provided the data analysis pipeline for another transit photometry mission, the Transiting Exoplanet Survey Satellite (TESS). A more advanced version of the Kepler PI component was developed for TESS, and was named Spiffy ("Science pipeline infrastructure for you").
+
+In the fullness of time, members of the Kepler and TESS team realized that there was an opportunity to take Spiffy and make it into a software package that was suitable for truly huge data volumes (terabytes per day, as compared to the terabytes-per-month rate of TESS) and easier to use than Spiffy or PI had been. This resulted in the development process that culminated in Ziggy.
+
+## How do I put Ziggy to use?
+
+Glad you asked! The [user manual](doc/user-manual/user-manual.md) is intended to take you through the process, step by step. Thus, we recommend starting at the first link and clicking your way down as you make progress.
+
+Additionally, Ziggy ships with a sample pipeline. This pipeline uses an extremely simple set of algorithms to demonstrate as much of Ziggy's prowess and as many of its features as possible. As we go through the steps, we'll show highlights from the sample pipeline so that you have a sense of what you should see. After all, it's always easier to explain something by example than in the abstract...
+
+## Who maintains Ziggy?
+
+Ziggy is still maintained by members of the Kepler and TESS team who are based at NASA Ames Research Center.
+
+## License
+
+Copyright © 2022 United States Government as represented by the Administrator of the National Aeronautics and Space Administration. All Rights Reserved.
+
+NASA acknowledges the SETI Institute’s primary role in authoring and producing Ziggy, a Pipeline Management System for Data Analysis Pipelines, under Cooperative Agreement Nos. NNX14AH97A, 80NSSC18M0068 & 80NSSC21M0079.
+
+This file is available under the terms of the NASA Open Source Agreement (NOSA).
You should have received a copy of this agreement with the Ziggy source code; see the file [LICENSE.pdf](LICENSE.pdf). + +Disclaimers + +No Warranty: THE SUBJECT SOFTWARE IS PROVIDED "AS IS" WITHOUT ANY WARRANTY OF ANY KIND, EITHER EXPRESSED, IMPLIED, OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL CONFORM TO SPECIFICATIONS, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR FREEDOM FROM INFRINGEMENT, ANY WARRANTY THAT THE SUBJECT SOFTWARE WILL BE ERROR FREE, OR ANY WARRANTY THAT DOCUMENTATION, IF PROVIDED, WILL CONFORM TO THE SUBJECT SOFTWARE. THIS AGREEMENT DOES NOT, IN ANY MANNER, CONSTITUTE AN ENDORSEMENT BY GOVERNMENT AGENCY OR ANY PRIOR RECIPIENT OF ANY RESULTS, RESULTING DESIGNS, HARDWARE, SOFTWARE PRODUCTS OR ANY OTHER APPLICATIONS RESULTING FROM USE OF THE SUBJECT SOFTWARE. FURTHER, GOVERNMENT AGENCY DISCLAIMS ALL WARRANTIES AND LIABILITIES REGARDING THIRD-PARTY SOFTWARE, IF PRESENT IN THE ORIGINAL SOFTWARE, AND DISTRIBUTES IT "AS IS." + +Waiver and Indemnity: RECIPIENT AGREES TO WAIVE ANY AND ALL CLAIMS AGAINST THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT. IF RECIPIENT'S USE OF THE SUBJECT SOFTWARE RESULTS IN ANY LIABILITIES, DEMANDS, DAMAGES, EXPENSES OR LOSSES ARISING FROM SUCH USE, INCLUDING ANY DAMAGES FROM PRODUCTS BASED ON, OR RESULTING FROM, RECIPIENT'S USE OF THE SUBJECT SOFTWARE, RECIPIENT SHALL INDEMNIFY AND HOLD HARMLESS THE UNITED STATES GOVERNMENT, ITS CONTRACTORS AND SUBCONTRACTORS, AS WELL AS ANY PRIOR RECIPIENT, TO THE EXTENT PERMITTED BY LAW. RECIPIENT'S SOLE REMEDY FOR ANY SUCH MATTER SHALL BE THE IMMEDIATE, UNILATERAL TERMINATION OF THIS AGREEMENT. + +## Other licenses + +Ziggy makes use of third-party software. Their licenses appear in the [licenses](licenses/licenses.md) directory. + +## Contributing + +If you are interested in contributing to Ziggy, please complete the NASA license agreement that applies to you: + +* [doc/legal/NASA-Corporate-CLA.pdf](doc/legal/NASA-Corporate-CLA.pdf) +* [doc/legal/NASA-Individual-CLA.pdf](doc/legal/NASA-Individual-CLA.pdf) diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md new file mode 100644 index 0000000..bc49fe9 --- /dev/null +++ b/RELEASE-NOTES.md @@ -0,0 +1,17 @@ + + +# Ziggy Release Notes + +These are the release notes for Ziggy. While we're able to provide links to GitHub issues, we are not able to provide links to our internal Jira issues. + +## 0.2.0 + +This is the first Ziggy release to appear on GitHub. The major number of 0 indicates that we're still refactoring the Kepler and TESS codebases and reserve the right to make breaking changes from time to time as we make the pipeline more widely useful. However, the general pipeline infrastructure has been in production use since Kepler's launch in 2009. + +### Resolved issues + +In future releases, this section will contain a list of GitHub/Jira issues that were resolved and incorporated into the release. If the resolution for an issue introduced a breaking change, it will be described so that you can update your properties files or pipeline configurations in advance. + +## 0.1.0 + +This release was overtaken by events. This release came and went before we got authorization to push our code to GitHub. diff --git a/build.gradle b/build.gradle new file mode 100644 index 0000000..7563ef9 --- /dev/null +++ b/build.gradle @@ -0,0 +1,105 @@ +// Ziggy Gradle Build File +// +// This file contains the declarative portion of the build only. 
+// Imperative tasks are found in script-plugins and are applied at the +// bottom of this file. + +// See comment for ziggyDependencies in gradle.properties. +ext.ziggyDependencies = "$rootDir/$ziggyDependencies" + +apply plugin: 'java' +apply plugin: 'eclipse' + +defaultTasks "assemble" + +repositories { + mavenCentral() + flatDir { + dirs "$ziggyDependencies/lib" + } +} + +dependencies { + // Needed to compile ziggy. + compile 'com.github.librepdf:openpdf:1.3.+' + compile 'com.github.testdriven.guice:commons-configuration:1.+' + compile 'com.google.guava:guava:23.+' + compile 'com.jgoodies:jgoodies-forms:1.9.+' + compile 'com.jgoodies:jgoodies-looks:2.7.+' + compile 'commons-cli:commons-cli:1.5.+' + compile 'commons-codec:commons-codec:1.+' + compile 'commons-io:commons-io:2.11.+' + compile 'org.apache.commons:commons-collections4:4.+' + compile 'org.apache.commons:commons-compress:1.+' + compile 'org.apache.commons:commons-csv:1.9.+' + compile 'org.apache.commons:commons-exec:1.+' + compile 'org.apache.commons:commons-lang3:3.12.+' + compile 'org.apache.commons:commons-math3:3.6.+' + compile 'org.apache.commons:commons-text:1.+' + compile 'org.apache.logging.log4j:log4j-api:2.17.+' + compile 'org.apache.logging.log4j:log4j-core:2.17.+' + compile 'org.apache.logging.log4j:log4j-slf4j-impl:2.17.+' + compile 'org.hibernate.javax.persistence:hibernate-jpa-2.0-api:1.0.+' + compile 'org.hibernate:hibernate-core:4.2.+' + compile 'org.javassist:javassist:3.18.1-GA' + compile 'org.jfree:jfreechart:1.0.+' + compile 'org.netbeans.api:org-netbeans-swing-outline:+' + compile 'org.tros:l2fprod-properties-editor:1.3.+' + compile 'tanukisoft:wrapper:3.2.+' + + // Libraries built in buildSrc. + compile ':jarhdf5:1.12.+' + + // Needed to compile unit tests. + compile 'junit:junit:4.+' + compile 'org.hamcrest:hamcrest:2.+' + compile 'org.mockito:mockito-core:1.10.+' + + // Needed to run unit tests. + // hsqldb version 2.3.4 causes PipelineInstanceTaskCrudTest to + // fail with HsqlException: data exception: datetime field + // overflow. This can be fixed by replacing Long.MAX_VALUE with a + // smaller date. Other problems ensue with version 2.6. + compile 'org.hsqldb:hsqldb:2.3.2' + + // Needed at runtime. + compile 'org.postgresql:postgresql:9.4-1201-jdbc4' +} + +test { + systemProperty "log4j2.configurationFile","$projectDir/test/data/logging/log4j2.xml" + systemProperty "java.library.path", "$ziggyDependencies/lib" + + forkEvery = 1 + + testLogging { + events "failed", "skipped" + } + + // The RMI tests appear to have an ordering sensitivity. To wit, + // If Intra- runs before Inter-, they all run successfully. + // If Inter- runs before Intra-, Intra- tests all fail. + // Nonetheless, these tests are good to have for interactive use, + // as they allow to troubleshoot RMI problems more simply. Thus, + // we leave that class out of gradle test but keep it around in case + // we need it. + exclude "**/RmiInt*" +} + +javadoc { + title = "Ziggy API" + options.overview = "src/main/java/overview.html" +} + +// Apply Ziggy Gradle script plugins. 
+apply from: "script-plugins/copy.gradle" +apply from: "script-plugins/database-schemas.gradle" +apply from: "script-plugins/eclipse.gradle" +apply from: "script-plugins/gcc.gradle" +apply from: "script-plugins/hdf5.gradle" +// apply from: "script-plugins/matlab.gradle" +apply from: "script-plugins/misc.gradle" +apply from: "script-plugins/test.gradle" +apply from: "script-plugins/wrapper.gradle" +apply from: "script-plugins/xml-schemas.gradle" +apply from: "script-plugins/ziggy-libraries.gradle" diff --git a/buildSrc/build.gradle b/buildSrc/build.gradle new file mode 100644 index 0000000..87be7c2 --- /dev/null +++ b/buildSrc/build.gradle @@ -0,0 +1,61 @@ +/** + * This is the buildSrc/build.gradle file. This should contain a minimal + * amount of dependencies as this is used by the build system itself. + */ +apply plugin: 'eclipse' +apply plugin: 'java' + +defaultTasks 'assemble' + +repositories { + mavenCentral() +} + +dependencies { + // Needed to compile buildSrc. + compile 'com.google.guava:guava:23.+' + compile 'commons-io:commons-io:2.11.+' + compile 'org.apache.commons:commons-exec:1.+' + compile 'org.freemarker:freemarker:2.3.+' + + // Needed to compile unit tests. + compile 'junit:junit:4.+' + compile 'org.mockito:mockito-core:1.10.+' + + // The following stuff is needed in order to build custom gradle tasks and plugins. + compile gradleApi() + compile localGroovy() +} + +eclipse { + classpath { + defaultOutputDir = file("$buildDir/eclipse") + // outputBaseDir = file("$buildDir/eclipse") + downloadSources = false + downloadJavadoc = false + + // Gradle 4.4 now specifies all of the output directories, but puts + // them in the Eclipse default of "bin". There is a feature request + // to add classpath.outputBaseDir that has the same syntax and effect + // as the now-useless defaultOutputDir. In the meantime, update the + // path manually. + file.whenMerged { + entries.each { entry -> + if (entry.kind == "src" && entry.hasProperty("output")) { + // The use of $buildDir does not return build. + entry.output = "build/eclipse" + } + } + } + } +} + +test { + testLogging { + showStandardStreams = true + } +} + +apply from: "script-plugins/hdf5.gradle" +apply from: "script-plugins/wrapper.gradle" + diff --git a/buildSrc/script-plugins/hdf5.gradle b/buildSrc/script-plugins/hdf5.gradle new file mode 100644 index 0000000..23251d0 --- /dev/null +++ b/buildSrc/script-plugins/hdf5.gradle @@ -0,0 +1,45 @@ +// Download and build HDF5. +// +// TODO Next time around, parameterize the version and possibly the URL + +task buildHdf5() { + def tmp = file("$buildDir/tmp/hdf5") + def hdf5 = file("$tmp/hdf5-1.12.2") + def lib = file("$buildDir/lib") + + outputs.file "$lib/jarhdf5-1.12.2.jar" + + doLast() { + tmp.mkdirs() + + // Use the SHA256 links at https://www.hdfgroup.org/downloads/hdf5/source-code/ to determine the URL. 
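+            // The curl command below also fetches the matching .sha256 file, although the
+            // build does not currently verify the checksum. If verification is wanted, a
+            // step along these lines could follow the download (a sketch, assuming the
+            // .sha256 file uses the standard "checksum  filename" format):
+            //
+            //     exec {
+            //         workingDir tmp
+            //         commandLine "shasum", "-a", "256", "-c", "hdf5-1.12.2.sha256"
+            //     }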
+ exec { + workingDir tmp + commandLine "curl", "-o", "hdf5-1.12.2.#1", "https://hdf-wordpress-1.s3.amazonaws.com/wp-content/uploads/manual/HDF5/HDF5_1_12_2/source/hdf5-1.12.2.{tar.bz2,sha256}" + } + exec { + workingDir tmp + commandLine "tar", "-xf", "hdf5-1.12.2.tar.bz2" + } + exec { + workingDir hdf5 + commandLine "sh", "-c", "./configure --with-zlib=/usr --prefix=$buildDir --enable-threadsafe --with-pthread=/usr --enable-unsupported --enable-java" + } + exec { + workingDir hdf5 + commandLine "make" + } + exec { + workingDir hdf5 + commandLine "make", "install" + } + copy { + from("$tmp/lib") { + include "*.jar" + } + into "$lib" + } + } +} + +assemble.dependsOn buildHdf5 diff --git a/buildSrc/script-plugins/wrapper.gradle b/buildSrc/script-plugins/wrapper.gradle new file mode 100644 index 0000000..fc210d2 --- /dev/null +++ b/buildSrc/script-plugins/wrapper.gradle @@ -0,0 +1,56 @@ +// Download and build the wrapper. + +import org.gradle.internal.os.OperatingSystem + +task buildWrapper() { + def version = "3.5.26" + def tmp = file("$buildDir/tmp/wrapper") + def bin = file("$buildDir/bin") + def lib = file("$buildDir/lib") + def format = "" + def libSuffix = "" + + OperatingSystem os = OperatingSystem.current(); + if (os.isLinux()) { + format = "linux-x86-64" + libSuffix = "so" + } else if (os.isMacOsX()) { + format = "macosx-universal-64" + libSuffix = "jnilib" + } + def wrapperBaseName = "wrapper-$format-$version" + def wrapperDir = file("$tmp/$wrapperBaseName") + def wrapperLib = "libwrapper.$libSuffix" + + outputs.file "$bin/wrapper" + outputs.file "$wrapperLib" + outputs.file "$lib/wrapper.jar" + + doLast() { + tmp.mkdirs() + + // See https://wrapper.tanukisoftware.com/doc/english/versions.jsp to determine the URL. + exec { + workingDir tmp + commandLine "curl", "-o", "wrapper-${version}.tar.gz", "https://download.tanukisoftware.com/wrapper/$version/${wrapperBaseName}.tar.gz" + } + exec { + workingDir tmp + commandLine "tar", "-xf", "wrapper-${version}.tar.gz" + } + copy { + from("$wrapperDir/bin") { + include "wrapper" + } + into "$bin" + } + copy { + from("$wrapperDir/lib") { + include "$wrapperLib", "wrapper.jar" + } + into "$lib" + } + } +} + +assemble.dependsOn buildWrapper diff --git a/buildSrc/settings.gradle b/buildSrc/settings.gradle new file mode 100644 index 0000000..e69de29 diff --git a/buildSrc/src/main/groovy/gov/nasa/tess/buildutil/XmlValidator.groovy b/buildSrc/src/main/groovy/gov/nasa/tess/buildutil/XmlValidator.groovy new file mode 100644 index 0000000..ce6384b --- /dev/null +++ b/buildSrc/src/main/groovy/gov/nasa/tess/buildutil/XmlValidator.groovy @@ -0,0 +1,96 @@ +package gov.nasa.tess.buildutil + +import java.io.BufferedReader +import java.io.File +import java.io.FileReader +import java.util.ArrayList + +import javax.xml.XMLConstants +import javax.xml.bind.JAXBContext +import javax.xml.bind.Unmarshaller +import javax.xml.bind.ValidationEvent +import javax.xml.bind.ValidationEventLocator +import javax.xml.bind.ValidationException +import javax.xml.bind.util.JAXBSource +import javax.xml.bind.util.ValidationEventCollector +import javax.xml.validation.SchemaFactory +import javax.xml.transform.stream.StreamSource + +public class XmlValidator { + + public static final class MessageError extends ValidationException { + + def lineNumber + def columnNumber + + public MessageError(String message) { + super(message) + } + + public MessageError(String message, String errorCode) { + super(message, errorCode) + } + + public MessageError(String message, String errorCode, 
Throwable exception) { + super(message, errorCode, exception) + } + + public MessageError(String message, Throwable exception) { + super(message, exception) + } + + public MessageError(Throwable exception) { + super(exception) + } + + public void setPosition(lineNumber, columnNumber) { + this.lineNumber = lineNumber + this.columnNumber = columnNumber + } + + int getLineNumber() { + return lineNumber + } + + int getColumnNumber() { + return columnNumber + } + } + + public static List validate(File xsdFile, File xmlFile) { + + def errors = new ArrayList() + def schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI) + def xsdSource + def xmlSource + try { + xsdSource = new StreamSource(new FileReader(xsdFile)) + xmlSource = new StreamSource(new FileReader(xmlFile)) + def schema = schemaFactory.newSchema(xsdSource) + def validator = schema.newValidator() + + validator.validate(xmlSource) + } catch (javax.xml.bind.UnmarshalException e) { + MessageError error = new MessageError(e.getMessage(), e.getErrorCode(), e) + errors[0] = error + } catch (org.xml.sax.SAXParseException e) { + MessageError error = new MessageError(e.getMessage(), e) + error.setPosition(e.getLineNumber(), e.getColumnNumber()) + errors[0] = error + } finally { + try { + if (xsdSource != null) { + xsdSource.close() + } + } catch (Exception ignore) {} + + try { + if (xmlSource != null) { + xmlSource.close() + } + } catch (Exception ignore) {} + } + + return errors + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/EnvUtil.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/EnvUtil.java new file mode 100644 index 0000000..ba15104 --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/EnvUtil.java @@ -0,0 +1,33 @@ +package gov.nasa.ziggy.buildutil; + +import java.util.NoSuchElementException; + +/** + * Environment variables utilities. + * + * @author Sean McCauliff + * + */ +public class EnvUtil { + + /** + * @param key non-null environment variable key. + * @return the environment variable + * @exception NoSuchElementException if the key does not exist. + */ + public static String environment(String key) { + String value = System.getenv(key); + if (value == null) { + throw new NoSuchElementException("Environment variable \"" + key + "\" is not set."); + } + return value; + } + + public static String environment(String key, String defaultValue) { + String value = System.getenv(key); + if (value == null) { + return defaultValue; + } + return value; + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/GitRevisionHistoryToLatex.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/GitRevisionHistoryToLatex.java new file mode 100644 index 0000000..f957f5a --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/GitRevisionHistoryToLatex.java @@ -0,0 +1,353 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.InputStreamReader; +import java.io.PrintWriter; +import java.io.StringWriter; +import java.text.ParseException; +import java.text.SimpleDateFormat; +import java.util.ArrayList; +import java.util.Calendar; +import java.util.Date; +import java.util.List; +import java.util.Map; +import java.util.SortedMap; +import java.util.TreeMap; +import java.util.concurrent.TimeUnit; + +/** + * Utility methods for dealing with latex files. 
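+ * <p>
+ * The entry point, gitLogToLatex, runs "git log --follow" on a document and writes a
+ * companion revision-history .tex file that \input's one tabularx table per page, with
+ * "Change Date" and "Notes" columns. As an illustration (hypothetical commit), a commit
+ * dated 5 Apr 2022 with the message "Initial commit" becomes the table row
+ * "5 Apr 2022 & ~Initial commit\\".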
+ */ +public class GitRevisionHistoryToLatex { + + private static final int LINES_PER_PAGE = 50; + private static final int CHARACTERS_PER_LINE = 100; + private static final ThreadLocal revisionHistoryDateFormat = ThreadLocal + .withInitial(() -> new SimpleDateFormat("d MMM yyyy")); + + /** + * Runs "git log" on the specified document file to generate the revision history in the same + * directory as -revision-history.tex + * + * @throws Exception + */ + public static void gitLogToLatex(String documentFileName, File destDir) throws Exception { + try { + System.out.println( + "Generating revision history for latex file \"" + documentFileName + "\"."); + gitLogToLatexInternal(documentFileName, destDir); + } catch (Throwable t) { + // This is here because gradle -stacktrace causes this stack trace + // to disappear. + StringWriter stringWriter = new StringWriter(); + PrintWriter pw = new PrintWriter(stringWriter); + t.printStackTrace(pw); + System.out.println(stringWriter.toString()); + throw t; + } + } + + /** + * Escapes stuff like $ and _. + */ + private static String escapeLatex(String raw) { + StringBuilder bldr = new StringBuilder(raw.length()); + for (int i = 0; i < raw.length(); i++) { + char c = raw.charAt(i); + switch (c) { + case '\\': + bldr.append("\\\\"); + break; + case '$': + bldr.append("\\$"); + break; + case '_': + bldr.append("\\_"); + break; + case '{': + bldr.append("\\{"); + break; + case '}': + bldr.append("\\}"); + break; + case '~': + bldr.append("\\~"); + break; + case '&': + bldr.append("\\&"); + break; + // TODO: test me + case '"': + bldr.append("\\verb|\"|"); + break; + default: + bldr.append(c); + } + } + return bldr.toString(); + } + + private static void writeErrorToLatexFile(Throwable exception, File outputFile) + throws IOException { + StringWriter errMsgString = new StringWriter(); + PrintWriter exceptionOutput = new PrintWriter(errMsgString); + exception.printStackTrace(exceptionOutput); + exceptionOutput.close(); + BufferedWriter fileWriter = new BufferedWriter(new FileWriter(outputFile)); + try { + fileWriter.write("Error &"); + fileWriter.write(escapeLatex(errMsgString.toString())); + fileWriter.write("\\\\\n"); + } finally { + fileWriter.close(); + } + } + + private static final class HistoryEntry { + private final Date entryDate; + private final String entry; + private final int pageNumber; + + public HistoryEntry(Date entryDate, String entry, int pageNumber) { + this.entryDate = entryDate; + this.entry = entry; + this.pageNumber = pageNumber; + } + + void writeTo(Appendable writer) throws IOException { + writer.append(revisionHistoryDateFormat.get().format(entryDate)); + writer.append(" & "); + writer.append(entry); + writer.append("\\\\\n"); + } + } + + private static void gitLogToLatexInternal(String documentFileName, File destDir) + throws ParseException, IOException, InterruptedException { + String[] command = new String[] { "git", "log", "--follow", documentFileName }; + + File documentFile = new File(documentFileName); + String fileNameWithoutExtension = stripFileExtension(documentFile.getName()); + String outputFName = fileNameWithoutExtension + "-revision-history.tex"; + File outputFile = new File(destDir, outputFName); + + List lines = readProcessOutput(command, outputFile); + + // System.out.println("Completed reading git history."); + SortedMap dateToLog = collectHistoryEntries(lines); + + // Paginate + List> historyEntriesPerPage = paginate(dateToLog); + + // Write out top level history tex file + System.out.println("Writing history file 
\"" + outputFile + "\"."); + try (BufferedWriter outputWriter = new BufferedWriter(new FileWriter(outputFile))) { + for (int pagei = 0; pagei < historyEntriesPerPage.size(); pagei++) { + outputWriter.write("\\input{build/" + fileNameWithoutExtension + + "-revision-history-" + pagei + ".tex}\n"); + } + } + + // Write out each page's history. + for (int pagei = 0; pagei < historyEntriesPerPage.size(); pagei++) { + File pageFileName = new File(outputFile.getParent(), + fileNameWithoutExtension + "-revision-history-" + pagei + ".tex"); + try (BufferedWriter writer = new BufferedWriter(new FileWriter(pageFileName))) { + // Table header + writer.write("\\begin{table}[H]\n"); + writer.write("\\centering\n"); + writer.write("\\begin{tabularx}{\\linewidth}{l X}\n"); + writer.write("\\thead{Change Date} & \\thead{Notes} \\\\\n"); + writer.write("\\hline\n"); + + for (HistoryEntry historyEntry : historyEntriesPerPage.get(pagei)) { + historyEntry.writeTo(writer); + } + + // Table footer. + writer.write("\\hline\n"); + writer.write("\\end{tabularx}\n"); + writer.write("\\end{table}\n"); + writer.write("\\newpage\n"); + } + } + + } + + private static List> paginate(SortedMap dateToLog) { + + List> entriesPerPage = new ArrayList<>(); + int lineCount = 0; + for (Map.Entry entry : dateToLog.entrySet()) { + String entryString = entry.getValue().toString(); + int lineCountForEntry = entryString.length() / CHARACTERS_PER_LINE + 1; + lineCount += lineCountForEntry; + int pageNo = lineCount / LINES_PER_PAGE; + + if (entriesPerPage.size() <= pageNo) { + entriesPerPage.add(new ArrayList<>(LINES_PER_PAGE)); + } + List l = entriesPerPage.get(pageNo); + l.add(new HistoryEntry(entry.getKey(), entryString, pageNo)); + } + + return entriesPerPage; + } + + private static SortedMap collectHistoryEntries(List lines) + throws ParseException { + TreeMap dateToLog = new TreeMap<>(); + SimpleDateFormat dateFormat = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy Z"); + StringBuilder currentEntry = null; + Date entryDate = null; + boolean skipEntry = false; + for (String line : lines) { + if (line.startsWith("commit")) { + // Start of new entry + if (currentEntry != null && entryDate != null) { + StringBuilder existingLog = dateToLog.get(entryDate); + existingLog.append(currentEntry); + } + currentEntry = new StringBuilder(); + entryDate = null; + skipEntry = false; + } else if (line.startsWith("Date: ")) { + // Start of new date + String dateLine = line.substring(6, line.length()).trim(); + Calendar calendar = Calendar.getInstance(); + calendar.setTime(dateFormat.parse(dateLine)); + calendar.set(Calendar.HOUR, 0); + calendar.set(Calendar.MINUTE, 0); + calendar.set(Calendar.SECOND, 0); + calendar.set(Calendar.AM_PM, Calendar.AM); + entryDate = calendar.getTime(); + if (!dateToLog.containsKey(entryDate)) { + dateToLog.put(entryDate, new StringBuilder()); + } + } else if (line.startsWith("Author")) { + // Ignore + } else if (line.matches("\\s+") || line.length() == 0) { + // Ignore lines with only white space + } else if (line.contains("Merge branch")) { + // Ignore log entries that are merges. + skipEntry = true; + currentEntry = null; + } else if (!skipEntry) { + // Content + if (currentEntry == null) { + throw new NullPointerException("currentEntry == null. 
line=\"" + line + "\""); + } + line = line.trim(); + currentEntry.append("~"); + currentEntry.append(escapeLatex(line)); + } + } + if (currentEntry != null && entryDate != null) { + // Append last entry + StringBuilder existingLog = dateToLog.get(entryDate); + existingLog.append(currentEntry); + } + return dateToLog; + } + + private static List readProcessOutput(String[] command, File outputFile) + throws InterruptedException, IOException { + Process process = null; + List lines = new ArrayList<>(1024); + try { + ProcessBuilder processBuilder = new ProcessBuilder(command); + // processBuilder.inheritIO(); + process = processBuilder.start(); + // System.out.println("Running git process."); + BufferedReader bufferedReader = new BufferedReader( + new InputStreamReader(process.getInputStream())); + + for (String line = bufferedReader.readLine(); line != null; line = bufferedReader + .readLine()) { + // System.out.println("lineRead " + line); + lines.add(line); + } + process.waitFor(1, TimeUnit.SECONDS); + + } catch (IOException ioe) { + System.out.println("Writing error to latex file."); + try { + writeErrorToLatexFile(ioe, outputFile); + } catch (IOException ignored) { + } + throw ioe; + } finally { + if (process != null) { + try { + process.getErrorStream().close(); + } catch (Exception ignored) { + } + try { + process.getOutputStream().close(); + } catch (Exception ignored) { + } + try { + process.getInputStream().close(); + } catch (Exception ignored) { + } + } + } + + if (process.exitValue() != 0) { + throw new IOException( + "Process exited with non-zero exit code (" + process.exitValue() + ")."); + } + return lines; + } + + /** + * Returns a file name without its extension so blah.txt would become blah (no dot). + */ + public static String stripFileExtension(String fname) { + int dotIndex = fname.lastIndexOf('.'); + if (dotIndex == -1 || dotIndex == 0) { + return fname; + } else { + return fname.substring(0, dotIndex); + } + } + + /** + * Returns the file extension + */ + public static String fileExtension(String fname) { + int dotIndex = fname.lastIndexOf('.'); + if (dotIndex == -1 || dotIndex == 0 || dotIndex == (fname.length() - 1)) { + return ""; + } else { + return fname.substring(dotIndex + 1, fname.length()); + } + } + + public static String changeFileExtension(String fname, String newExtension) { + if (newExtension.charAt(0) != '.') { + return newExtension = "." + newExtension; + } + String prefix = stripFileExtension(fname); + return prefix + newExtension; + } + + /** + * Given the specified file name this appends the creation date before the file name extension. + */ + public static String appendCreationDate(String fname) { + String extension = fileExtension(fname); + if (extension.length() != 0) { + extension = "." + extension; + } + String prefix = stripFileExtension(fname); + // TODO: Is this the correct format? 
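+        // "dd-MMM-yyyy" gives a zero-padded day and a locale-dependent month abbreviation,
+        // so (as a hypothetical example) appendCreationDate("report.pdf") run on 5 April
+        // 2023 in an English locale returns "report-05-Apr-2023.pdf".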
+ SimpleDateFormat dateFormatter = new SimpleDateFormat("dd-MMM-yyyy"); + return prefix + "-" + dateFormatter.format(new Date()) + extension; + } +} + diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/Mcc.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/Mcc.java new file mode 100644 index 0000000..72c3bb2 --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/Mcc.java @@ -0,0 +1,221 @@ +package gov.nasa.ziggy.buildutil; + +import static java.nio.file.attribute.PosixFilePermission.GROUP_EXECUTE; +import static java.nio.file.attribute.PosixFilePermission.GROUP_READ; +import static java.nio.file.attribute.PosixFilePermission.OWNER_EXECUTE; +import static java.nio.file.attribute.PosixFilePermission.OWNER_READ; + +import java.io.File; +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.attribute.PosixFilePermission; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +import org.apache.commons.io.FileUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.gradle.api.GradleException; +import org.gradle.api.file.FileCollection; +import org.gradle.api.tasks.InputFiles; +import org.gradle.api.tasks.Optional; +import org.gradle.api.tasks.OutputDirectory; +import org.gradle.api.tasks.OutputFile; +import org.gradle.api.tasks.TaskAction; +import org.gradle.internal.os.OperatingSystem; + +/** + * Compiles the matlab executables (i.e. it runs mcc). This class can not be made final. + *
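+ * <p>
+ * A build script might register the task roughly as follows (a sketch; the task name,
+ * file names, and paths are illustrative, not taken from an actual Ziggy build):
+ * <pre>
+ * task matlabExecutable(type: gov.nasa.ziggy.buildutil.Mcc) {
+ *     controllerFiles = fileTree("src/main/matlab").matching { include "*_controller.m" }
+ *     additionalFiles = fileTree("src/main/matlab/lib")
+ *     outputExecutable = file("$buildDir/bin/my_algorithm")
+ *     singleThreaded = true
+ * }
+ * </pre>
+ * The task disables itself unless the MCC_ENABLED environment variable is set to true, and
+ * it requires MATLAB_HOME to point at a supported MATLAB installation.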

+ * This class puts the variable MCC_DIR into the environment and sets it to the directory of the project that has + * invoked this task. + * + * @author Bill Wohler + * @author Sean McCauliff + * @author Forrest Girouard + */ +public class Mcc extends TessExecTask { + + private static final Logger log = LoggerFactory.getLogger(Mcc.class); + + private FileCollection controllerFiles; + private FileCollection additionalFiles; // added with mcc -a option + private File outputExecutable; + private boolean singleThreaded = true; + + public Mcc() { + setEnabled(isMccEnabled()); + } + + @InputFiles + public FileCollection getControllerFiles() { + return controllerFiles; + } + + public void setControllerFiles(FileCollection controllerFiles) { + this.controllerFiles = controllerFiles; + } + + public FileCollection getAdditionalFiles() { + return additionalFiles; + } + + public void setAdditionalFiles(FileCollection additionalFiles) { + this.additionalFiles = additionalFiles; + } + + @OutputFile + public File getOutputExecutable() { + String path = outputExecutable.getPath(); + if (OperatingSystem.MAC_OS == OperatingSystem.current()) { + path += ".app"; + return new File(path, "Contents/MacOS/" + outputExecutable.getName()); + } + + return outputExecutable; + } + + @OutputDirectory + public File getOutputApplication() { + String path = outputExecutable.getPath(); + if (OperatingSystem.MAC_OS == OperatingSystem.current()) { + path += ".app"; + return new File(path); + } + + return outputExecutable.getParentFile(); + } + + public void setOutputExecutable(File outputExecutable) { + this.outputExecutable = outputExecutable; + } + + @Optional + public boolean isSingleThreaded() { + return singleThreaded; + } + + public void setSingleThreaded(boolean newValue) { + singleThreaded = newValue; + } + + @TaskAction + public void action() { + log.info(String.format("%s.action()\n", this.getClass().getSimpleName())); + File matlabHome = matlabHome(); + + File buildBinDir = new File(getProject().getBuildDir(), "bin"); + List command = new ArrayList<>(); + + command.addAll(Arrays.asList("mcc", "-v", "-m", "-N", "-d", buildBinDir.toString(), "-R", + "-nodisplay", "-R", "-nodesktop")); + + if (isSingleThreaded()) { + command.add("-R"); + command.add("-singleCompThread"); + } + + for (String s : new String[] { "stats", "signal" }) { + command.add("-p"); + command.add(matlabHome + "/toolbox/" + s); + } + + command.add("-o"); + command.add(outputExecutable.getName()); + + String path = outputExecutable.getPath(); + File executable = new File(path); + if (OperatingSystem.MAC_OS == OperatingSystem.current()) { + path += ".app"; + executable = new File(path); + String message = String.format( + "The outputExecutable, \"%s\", already exists and cannot be deleted\n", executable); + if (executable.exists()) { + log.info(String.format("%s: already exists, delete it\n", executable)); + if (executable.isDirectory()) { + try { + FileUtils.deleteDirectory(executable); + } catch (IOException e) { + log.error(message); + throw new GradleException(message, e); + } + } else if (!executable.delete()) { + log.error(message); + throw new GradleException(message); + } + } + if (executable.exists()) { + log.error(message); + throw new GradleException(message); + } + } + + if (controllerFiles != null) { + for (File f : controllerFiles) { + command.add(f.toString()); + } + } + + if (additionalFiles != null) { + for (File f : additionalFiles) { + command.add("-a"); + command.add(f.toString()); + } + } + + String cmd = 
command.stream().collect(Collectors.joining(" ")); + List fullCommand = new ArrayList<>(); + fullCommand.add("/bin/bash"); + fullCommand.add("-c"); + fullCommand.add(cmd); + cmd = fullCommand.stream().collect(Collectors.joining(" ")); + log.info(cmd); + + ProcessBuilder processBuilder = new ProcessBuilder(fullCommand); + try { + processBuilder.environment() + .put("MCC_DIR", getProject().getProjectDir().getCanonicalPath()); + } catch (IOException e) { + log.error(String.format("Could not set MCC_DIR: %s", e.getMessage()), e); + } + execProcess(processBuilder); + + Set neededPermissions = new HashSet<>(Arrays.asList( + new PosixFilePermission[] { OWNER_EXECUTE, GROUP_EXECUTE, OWNER_READ, GROUP_READ })); + + if (!executable.exists()) { + String message = "The outputExecutable,\"" + executable + "\" does not exist."; + log.error(message); + throw new GradleException(message); + } else { + Set currentPosixFilePermissions = null; + try { + currentPosixFilePermissions = Files.getPosixFilePermissions(executable.toPath()); + log.info(currentPosixFilePermissions.stream() + .map(p -> p.name()) + .collect(Collectors.joining(" ", "Current file permissions are: ", "."))); + if (!neededPermissions.containsAll(currentPosixFilePermissions)) { + currentPosixFilePermissions.addAll(neededPermissions); + log.info(currentPosixFilePermissions.stream() + .map(p -> p.name()) + .collect(Collectors.joining(" ", "Setting file permissions to: ", ","))); + Files.setPosixFilePermissions(executable.toPath(), currentPosixFilePermissions); + } + } catch (IOException ioe) { + String message = "Failed to either get or set permissions on outputExecutable \"" + + outputExecutable; + log.error(message); + throw new GradleException(message); + } + } + File readme = new File(outputExecutable.getParentFile(), "readme.txt"); + if (readme.exists()) { + readme.renameTo(new File(outputExecutable.getParentFile(), + outputExecutable.getName() + "-readme.txt")); + } + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/TessExecTask.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/TessExecTask.java new file mode 100644 index 0000000..4d2ef1a --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/TessExecTask.java @@ -0,0 +1,75 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.BufferedReader; +import java.io.File; +import java.io.IOException; +import java.io.InputStreamReader; +import java.util.stream.Collectors; + +import org.gradle.api.DefaultTask; +import org.gradle.api.GradleException; + +/*** + * Gradle has an ExecTask, but you can't subclass it to make your own tasks. + * + * @author Sean McCauliff + */ +abstract class TessExecTask extends DefaultTask { + + protected static boolean isMccEnabled() { + String mccEnabled = System.getenv("MCC_ENABLED"); + if (mccEnabled == null || !Boolean.valueOf(mccEnabled)) { + return false; + } + return true; + } + + protected static File matlabHome() { + String matlabHome = System.getenv("MATLAB_HOME"); + if (matlabHome == null) { + throw new GradleException("MATLAB_HOME is not set"); + } + if (matlabHome.contains("2010")) { + throw new GradleException( + "MATLAB_HOME=" + matlabHome + ". This is not the MATLAB I'm looking for."); + } + File home = new File(matlabHome); + if (!home.exists()) { + throw new GradleException("MATLAB_HOME=\"" + home + "\" does not exist."); + } + return home; + } + + /** + * @param command The first element in this list is the name of the executable. the others are + * arguments. No escaping is required. 
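+ * (The command is read from the supplied ProcessBuilder; stderr is redirected into stdout,
+ * and a non-zero exit status causes a GradleException whose message includes the command
+ * and its output.)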
+ */ + protected static void execProcess(ProcessBuilder processBuilder) { + + processBuilder.redirectErrorStream(true); + String commandString = processBuilder.command().stream().collect(Collectors.joining(" ")); + try { + Process process = processBuilder.start(); + process.waitFor(); + if (process.exitValue() != 0) { + try (InputStreamReader inRead = new InputStreamReader(process.getInputStream()); + BufferedReader breader = new BufferedReader(inRead)) { + StringBuilder bldr = new StringBuilder(); + bldr.append(commandString).append('\n'); + for (String line = breader.readLine(); line != null; line = breader + .readLine()) { + bldr.append(line).append("\n"); + } + throw new GradleException(bldr.toString()); + } catch (IOException ioe) { + throw new GradleException("Command \"" + commandString + + "\" Failed, no other information avaialble."); + } + } + } catch (IOException e) { + throw new GradleException("While trying to exec \"" + commandString + "\".", e); + } catch (InterruptedException e) { + throw new GradleException(commandString, e); + } + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCpp.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCpp.java new file mode 100644 index 0000000..f870025 --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCpp.java @@ -0,0 +1,191 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.File; +import java.util.List; + +import org.gradle.api.DefaultTask; +import org.gradle.api.Project; +import org.gradle.api.tasks.InputFiles; +import org.gradle.api.tasks.OutputFile; +import org.gradle.api.tasks.TaskAction; + +import gov.nasa.ziggy.buildutil.ZiggyCppPojo.BuildType; + +/** + * Performs C++ builds for Gradle. Ziggy and its pipelines use this instead of the + * standard Gradle C++ task classes because it provides only the options we actually + * need without many that we can't use, and also because it allows us to use the CXX + * environment variable to define the location of the C++ compiler. This allows us to + * use the same compiler as some of the third party libraries used by Ziggy (they also + * use CXX to select their compiler). + * + * Because Gradle classes that extend DefaultTask are effectively impossible to unit test, + * all of the actual data and work are managed by the ZiggyCppPojo class (which does have + * unit tests), while this class simply provides access to the ZiggyCppPojo class for Gradle. + * + * @author PT + * + */ +public class ZiggyCpp extends DefaultTask { + + /** + * Data and methods used to perform the C++ compile and link. + */ + private ZiggyCppPojo ziggyCppPojo = new ZiggyCppPojo(); + + /** + * Default constructor. 
Provides the ZiggyCppPojo object with directories for the project and + * default options, if set + */ + public ZiggyCpp() { + Project project = getProject(); + ziggyCppPojo.setBuildDir(project.getBuildDir()); + ziggyCppPojo.setRootDir(pipelineRootDir(project)); + if (project.hasProperty(ZiggyCppPojo.DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY)) { + ziggyCppPojo.setCompileOptions(ZiggyCppPojo.gradlePropertyToList( + project.property(ZiggyCppPojo.DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppPojo.DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY)) { + ziggyCppPojo.setLinkOptions(ZiggyCppPojo.gradlePropertyToList( + project.property(ZiggyCppPojo.DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppPojo.DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY)) { + ziggyCppPojo.setReleaseOptimizations(ZiggyCppPojo.gradlePropertyToList( + project.findProperty(ZiggyCppPojo.DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppPojo.DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY)) { + ziggyCppPojo.setDebugOptimizations(ZiggyCppPojo.gradlePropertyToList( + project.findProperty(ZiggyCppPojo.DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY))); + } + } + + /** + * Returns the root directory of the pipeline + * + * @param project Current project + * @return rootDir if the pipelineRootDir project property is not set; if that property is set, + * the contents of the pipelineRootDir property are returned as a File + */ + public static File pipelineRootDir(Project project) { + File pipelineRootDir = null; + if (project.hasProperty(ZiggyCppPojo.PIPELINE_ROOT_DIR_PROP_NAME)) { + pipelineRootDir = new File( + project.property(ZiggyCppPojo.PIPELINE_ROOT_DIR_PROP_NAME).toString()); + } else { + pipelineRootDir = project.getRootDir(); + } + return pipelineRootDir; + } + + /** Provides access to the ZiggyCppPojo action() method for Gradle. */ + @TaskAction + public void action() { + ziggyCppPojo.action(); + } + + /** + * Provides access to the ZiggyCppPojo list of C++ files for Gradle, and specifies for gradle + * that those files are the inputs for this task. + * + * @return List of C++ files found in the C++ source file directory. + */ + @InputFiles + public List getCppFiles() { + return ziggyCppPojo.getCppFiles(); + } + + /** + * Provides Gradle with access to the ZiggyCppPojo File that will be the final product of the + * build. Also specifies that this file is the output for this task. + * + * @return File containing the target product for a task. + */ + @OutputFile + public File getBuiltFile() { + return ziggyCppPojo.getBuiltFile(); + } + + // Below are setters and getters for the ZiggyCppPojo members that must be mutated + // by ZiggyCpp tasks in gradle. In principle only setters are needed, but in the + // interest of sanity getters are also provided. Note that not all ZiggyCppPojo + // members need to be set by ZiggyCpp, there are a number that are used internally + // and are not set as part of a task. + // + // Note that the setters take Object and List rather than String and List. + // This is because Gradle allows its text string objects to be either Java String class or + // Groovy GString class. Consequently, we pass everything to ZiggyCppPojo as Objects, and + // ZiggyCppPojo uses the toString() methods to convert everything to Java Strings. 
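+    //
+    // For reference, a build script might configure these properties along these lines
+    // (a sketch; the task name, paths, and library names are illustrative only):
+    //
+    //     task assembleCppLib(type: gov.nasa.ziggy.buildutil.ZiggyCpp) {
+    //         cppFilePaths = ["src/main/cpp/mylib"]
+    //         includeFilePaths = ["src/main/cpp/include"]
+    //         libraryPaths = ["$buildDir/lib"]
+    //         libraries = ["hdf5"]
+    //         outputName = "mylib"
+    //         // outputType is also settable; its accepted values are defined by ZiggyCppPojo.
+    //     }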
+ + // Path to the C++ source files + public void setCppFilePaths(List cppFilePaths) { + ziggyCppPojo.setCppFilePaths(cppFilePaths); + } + + public List getCppFilePaths() { + return ziggyCppPojo.getCppFilePaths(); + } + + // Paths for include files + public void setIncludeFilePaths(List includeFilePaths) { + ziggyCppPojo.setIncludeFilePaths(includeFilePaths); + } + + public List getIncludeFilePaths() { + return ziggyCppPojo.getIncludeFilePaths(); + } + + // paths for libraries that must be linked in + public void setLibraryPaths(List libraryPaths) { + ziggyCppPojo.setLibraryPaths(libraryPaths); + } + + public List getLibraryPaths() { + return ziggyCppPojo.getLibraryPaths(); + } + + // Libraries that must be linked in + public void setLibraries(List libraries) { + ziggyCppPojo.setLibraries(libraries); + } + + public List getLibraries() { + return ziggyCppPojo.getLibraries(); + } + + // compiler options + public void setCompileOptions(List compileOptions) { + ziggyCppPojo.setCompileOptions(compileOptions); + ; + } + + public List getCompileOptions() { + return ziggyCppPojo.getCompileOptions(); + } + + // linker options + public void setLinkOptions(List linkOptions) { + ziggyCppPojo.setLinkOptions(linkOptions); + } + + public List getLinkOptions() { + return ziggyCppPojo.getLinkOptions(); + } + + // output type (executable, shared library, static library) + public void setOutputType(Object outputType) { + ziggyCppPojo.setOutputType(outputType); + } + + public BuildType getOutputType() { + return ziggyCppPojo.getOutputType(); + } + + // Name of file to be produced + public void setOutputName(Object name) { + ziggyCppPojo.setOutputName(name); + } + + public String getOutputName() { + return ziggyCppPojo.getOutputName(); + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMex.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMex.java new file mode 100644 index 0000000..a29f4cd --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMex.java @@ -0,0 +1,235 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.File; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.gradle.api.DefaultTask; +import org.gradle.api.GradleException; +import org.gradle.api.Project; +import org.gradle.api.tasks.InputFiles; +import org.gradle.api.tasks.OutputFile; +import org.gradle.api.tasks.OutputFiles; +import org.gradle.api.tasks.TaskAction; + +import gov.nasa.ziggy.buildutil.ZiggyCppPojo.BuildType; + +/** + * Performs mexfile builds for Gradle. The sequence of build steps is as follows: + * 1. The C/C++ files are compiled with a MATLAB_MEX_FILE compiler directive. + * 2. The object files from (1) are linked into a shared object library. + * 3. The actual mexfiles are built from the appropriate object files and the + * shared object library. + * The user specifies the following: + * 1. The path to the C/C++ files + * 2. Compiler and linker options, including libraries, library paths, include + * file paths, optimization flags. + * 3. The names of the desired mexfiles. + * 4. Optionally, a name for the shared library (otherwise a default name is generated + * from the C++ source file path). + * + * Because it is effectively impossible to unit test any class that extends DefaultTask, + * the actual workings of the ZiggyCppMex class are in a separate class, ZiggyCppMexPojo, + * which has appropriate unit testing. This class provides a thin interface between + * Gradle and the ZiggyCppMexPojo class. 
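+ * <p>
+ * A build script might configure a mexfile build along these lines (a sketch; the task
+ * name, paths, library names, and mexfile names are illustrative only):
+ * <pre>
+ * task buildMexfiles(type: gov.nasa.ziggy.buildutil.ZiggyCppMex) {
+ *     cppFilePath = "src/main/cpp/mex"
+ *     includeFilePaths = ["src/main/cpp/include"]
+ *     libraries = ["mylib"]
+ *     mexfileNames = ["compute_something", "compute_something_else"]
+ * }
+ * </pre>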
+ * + * @author PT + * + */ +public class ZiggyCppMex extends DefaultTask { + + private static final Logger log = LoggerFactory.getLogger(ZiggyCppMex.class); + private ZiggyCppMexPojo ziggyCppMexObject = new ZiggyCppMexPojo(); + + /** + * Default constructor. This constructor populates the 3 project directories needed by the + * ZiggyCppMexPojo class (build, root, and project), and populates the default compile and link + * options, and the default release and debug optimization options, if these are set as extra + * properties in the Project object. + */ + public ZiggyCppMex() { + Project project = getProject(); + ziggyCppMexObject.setBuildDir(project.getBuildDir()); + ziggyCppMexObject.setRootDir(ZiggyCpp.pipelineRootDir(project)); + ziggyCppMexObject.setProjectDir(project.getProjectDir()); + if (project.hasProperty(ZiggyCppMexPojo.DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY)) { + ziggyCppMexObject.setCompileOptions(ZiggyCppPojo.gradlePropertyToList( + project.property(ZiggyCppMexPojo.DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppMexPojo.DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY)) { + ziggyCppMexObject.setLinkOptions(ZiggyCppPojo.gradlePropertyToList( + project.property(ZiggyCppMexPojo.DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppMexPojo.DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY)) { + ziggyCppMexObject.setReleaseOptimizations(ZiggyCppPojo.gradlePropertyToList( + project.findProperty(ZiggyCppMexPojo.DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY))); + } + if (project.hasProperty(ZiggyCppMexPojo.DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY)) { + ziggyCppMexObject.setDebugOptimizations(ZiggyCppPojo.gradlePropertyToList( + project.findProperty(ZiggyCppMexPojo.DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY))); + } + setMatlabPath(); + } + + /** Provides access to the ZiggyCppMexPojo method action() for Gradle. */ + @TaskAction + public void action() { + ziggyCppMexObject.action(); + } + + /** Specifies that the C/C++ source files are the input files for this Gradle task. */ + @InputFiles + public List getCppFiles() { + return ziggyCppMexObject.getCppFiles(); + } + + /** Specifies that the mexfiles are the output files for this Gradle task. */ + @OutputFiles + public List getMexfiles() { + return ziggyCppMexObject.getMexfiles(); + } + + /** Specifies that the shared object library is also an output file for this Gradle task. */ + @OutputFile + public File getBuiltFile() { + return ziggyCppMexObject.getBuiltFile(); + } + + // Below are setters and getters for the ZiggyCppMexPojo members that must be mutated + // by ZiggyCppMex tasks in gradle. In principle only setters are needed, but in the + // interest of sanity getters are also provided. Note that not all ZiggyCppMexPojo + // members need to be set by ZiggyCppMex, there are a number that are used internally + // and are not set as part of a task. + // + // Note that the setters take Object and List rather than String and List. + // This is because Gradle allows its text string objects to be either Java String class or + // Groovy GString class. Consequently, we pass everything to ZiggyCppMexPojo as Objects, and + // ZiggyCppMexPojo uses the toString() methods to convert everything to Java Strings. 
+ + // Path to the C++ source files + public void setCppFilePath(Object cppFilePath) { + ziggyCppMexObject.setCppFilePath(cppFilePath); + } + + public String getCppFilePath() { + return ziggyCppMexObject.getCppFilePaths().get(0); + } + + // Paths for include files + public void setIncludeFilePaths(List includeFilePaths) { + ziggyCppMexObject.setIncludeFilePaths(includeFilePaths); + } + + public List getIncludeFilePaths() { + return ziggyCppMexObject.getIncludeFilePaths(); + } + + // paths for libraries that must be linked in + public void setLibraryPaths(List libraryPaths) { + ziggyCppMexObject.setLibraryPaths(libraryPaths); + } + + public List getLibraryPaths() { + return ziggyCppMexObject.getLibraryPaths(); + } + + // Libraries that must be linked in + public void setLibraries(List libraries) { + ziggyCppMexObject.setLibraries(libraries); + } + + public List getLibraries() { + return ziggyCppMexObject.getLibraries(); + } + + // compiler options + public void setCompileOptions(List compileOptions) { + ziggyCppMexObject.setCompileOptions(compileOptions); + ; + } + + public List getCompileOptions() { + return ziggyCppMexObject.getCompileOptions(); + } + + // linker options + public void setLinkOptions(List linkOptions) { + ziggyCppMexObject.setLinkOptions(linkOptions); + } + + public List getLinkOptions() { + return ziggyCppMexObject.getLinkOptions(); + } + + // output type (executable, shared library, static library) + public void setOutputType(Object outputType) { + ziggyCppMexObject.setOutputType(outputType); + } + + public BuildType getOutputType() { + return ziggyCppMexObject.getOutputType(); + } + + // Name of shared library file to be produced + public void setOutputName(Object name) { + ziggyCppMexObject.setOutputName(name); + } + + public String getOutputName() { + return ziggyCppMexObject.getOutputName(); + } + + // Names of mexfiles to be produced + public void setMexfileNames(List mexfileNames) { + ziggyCppMexObject.setMexfileNames(mexfileNames); + } + + public List getMexfileNames() { + return ziggyCppMexObject.getMexfileNames(); + } + + /** + * Sets the path to the MATLAB executable. This searches the following options in the following + * order: 1. If there is a project extra property, matlabPath, use that. 2. If no matlabPath + * extra property, use the PATH environment variable to find the first path that includes both + * MATLAB (case-insensitive) and "bin" (case-sensitive). Use that. 3. If neither the path env + * var nor the project have the needed information, use the MATLAB_HOME env var. 4. If all of + * the above fail, throw a GradleException. 
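+ * <p>
+ * For option 1, the property can be supplied on the command line, for example (hypothetical
+ * path) "gradle build -PmatlabPath=/usr/local/MATLAB/R2021b", or set in gradle.properties.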
+ */ + public void setMatlabPath() { + String matlabPath = null; + Project project = getProject(); + if (project.hasProperty(ZiggyCppMexPojo.MATLAB_PATH_PROJECT_PROPERTY)) { + matlabPath = project.findProperty(ZiggyCppMexPojo.MATLAB_PATH_PROJECT_PROPERTY) + .toString(); + log.info("MATLAB path set from project extra property: " + matlabPath); + } + if (matlabPath == null) { + String systemPath = System.getenv("PATH"); + if (systemPath != null) { + String[] systemPaths = systemPath.split(":"); + for (String path : systemPaths) { + String pathLower = path.toLowerCase(); + if (pathLower.contains("matlab") && path.endsWith("bin")) { + matlabPath = path.substring(0, path.length() - 4); + log.info("MATLAB path set from PATH environment variable: " + matlabPath); + break; + } + } + } + } + if (matlabPath == null) { + String matlabHome = System.getenv(ZiggyCppMexPojo.MATLAB_PATH_ENV_VAR); + if (matlabHome != null) { + matlabPath = matlabHome; + log.info("MATLAB path set from MATLAB_HOME environment variable: " + matlabPath); + } + } + if (matlabPath == null) { + throw new GradleException( + "Unable to find MATLAB path in Gradle, PATH env var, or MATLAB_HOME env var"); + } + ziggyCppMexObject.setMatlabPath(matlabPath); + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojo.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojo.java new file mode 100644 index 0000000..d253ad8 --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojo.java @@ -0,0 +1,342 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.File; +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.LinkedHashMap; +import java.util.List; +import java.util.Map; + +import org.apache.commons.exec.DefaultExecutor; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.gradle.api.GradleException; +import org.gradle.internal.os.OperatingSystem; + +/** + * Manages the construction of mexfiles from C/C++ source code. The source code is compiled + * using the C++ compiler in the CXX environment variable, with appropriate compiler options + * for use in creating object files that can be used in mexfiles. These object files are then + * combined into a shared library. Finally, the C++ compiler is used to produce mexfiles for + * each source file that contains the mexFunction entry point by linking the object files with + * the shared library and attaching an appropriate file type. + * + * Because Gradle task classes cannot easily be unit tested, the key functionality needed for + * mexfile construction is in this class; a separate class, ZiggyCppMex, extends the Gradle + * DefaultTask and provides the interface from Gradle to ZiggyCppMexPojo. 
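+ * <p>Schematically (file names, options, and the g++ compiler below are illustrative; the
+ * compiler actually comes from the CXX environment variable), the build proceeds in three
+ * stages:
+ * <pre>
+ *   g++ -c -DMATLAB_MEX_FILE ...                each .c/.cpp file -> build/obj/*.o
+ *   g++ -o build/lib/lib&lt;name&gt;.so -shared ...   object files -> shared library
+ *   g++ -o build/lib/&lt;mexfile&gt;.mexa64 ... -lmex -lmx -lmat -l&lt;name&gt; -shared
+ * </pre>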
+ * + * @author PT + * + */ +public class ZiggyCppMexPojo extends ZiggyCppPojo { + + private static final Logger log = LoggerFactory.getLogger(ZiggyCppMexPojo.class); + + public static final String DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY = "defaultCppMexCompileOptions"; + public static final String DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY = "defaultCppMexLinkOptions"; + public static final String DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY = "defaultCppMexReleaseOptimizations"; + public static final String DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY = "defaultCppMexDebugOptimizations"; + public static final String MATLAB_PATH_PROJECT_PROPERTY = "matlabPath"; + public static final String MATLAB_PATH_ENV_VAR = "MATLAB_HOME"; + + /** Path to the MATLAB directories to be used in the build */ + private String matlabPath; + + /** Names of the mexfiles that are to be built (these need to match names of C++ files) */ + private List mexfileNames = null; + + /** Files for the mexfiles that are to be built (used to determine the task inputs) */ + private List mexfiles = null; + + /** Project directory, used to generate a library file name */ + private File projectDir = null; + + public ZiggyCppMexPojo() { + super.setOutputType(BuildType.SHARED); + } + + /** + * Returns the correct file type for a mexfile given the OS. + * + * @return string "mexmaci64" for a Mac, "mexa64" for Linux, GradleException for all other + * operating systems. + */ + String mexSuffix() { + OperatingSystem os = getOperatingSystem(); + String mexSuffix = null; + if (os.isMacOsX()) { + mexSuffix = "mexmaci64"; + } else if (os.isLinux()) { + mexSuffix = "mexa64"; + } else { + throw new GradleException("Operating system " + os.toString() + " not supported"); + } + return mexSuffix; + } + + /** + * Returns the correct MATLAB architecture name given the OS. + * + * @return string "maci64" for a Mac, "glnxa64" for Linux, Gradle exception for all other + */ + String matlabArch() { + OperatingSystem os = getOperatingSystem(); + String matlabArch = null; + if (os.isMacOsX()) { + matlabArch = "maci64"; + } else if (os.isLinux()) { + matlabArch = "glnxa64"; + } else { + throw new GradleException("Operating system " + os.toString() + " not supported"); + } + return matlabArch; + } + + /** + * Generates the mexfiles that are the output of this class, and stores them in the mexfiles + * list. The files that are generated are named $mexfileName.$mexfileSuffix, and are stored in + * $buildDir/lib . + */ + void populateMexfiles() { + if (getBuildDir() == null || mexfileNames == null) { + throw new GradleException("buildDir and mexfileNames must not be null"); + } + mexfiles = new ArrayList<>(); + for (String mexfileName : mexfileNames) { + String fullMexfileName = mexfileName + "." + mexSuffix(); + File mexfile = new File(libDir(), fullMexfileName); + mexfiles.add(mexfile); + } + } + + /** + * Generates a Map between the mexfiles and their corresponding object files. + * + * @return HashMap from mexfiles to object files. If any mexfile is missing its corresponding + * object file, a GradleException is thrown. 
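+     * <p>For example (names illustrative): a mexfile {@code build/lib/MyFunction.mexa64} is
+     * paired with the object file {@code build/obj/MyFunction.o}; pairing is by base name,
+     * with the file types stripped off.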
+ */ + private Map mapMexfilesToObjectFiles() { + + // A linked hashmap is used to preserve order -- which doesn't matter so + // much for actual use (though it is convenient), but matters a lot for testing + Map mexfileMap = new LinkedHashMap<>(); + List mexfiles = getMexfiles(); + List objfiles = getObjectFiles(); + for (String mexfileName : mexfileNames) { + File mexfile = getFileByName(mexfiles, mexfileName); + File objfile = getFileByName(objfiles, mexfileName); + if (objfile == null) { + throw new GradleException("No object file for mexfile " + mexfileName); + } + mexfileMap.put(mexfile, objfile); + } + return mexfileMap; + } + + /** + * Finds the file out of a list of files that has a particular name when the file type is + * removed. + * + * @param files List of files + * @param fileName Name of desired match, assumed to have no file type attached to it. + * @return File with a name that contains the desired match, or null if no match is found. + */ + private File getFileByName(List files, String fileName) { + File foundFile = null; + for (File file : files) { + String nameOfFile = file.getName(); + int finalDot = nameOfFile.lastIndexOf('.'); + String nameWithoutType = nameOfFile.substring(0, finalDot); + if (nameWithoutType.equals(fileName)) { + foundFile = file; + break; + } + } + return foundFile; + } + + /** + * Generates the command to perform compilation of a source file. The command includes the + * MATLAB include directory as an include path, and includes the mexfile compiler directive. + */ + @Override + public String generateCompileCommand(File sourceFile) { + + // generate the include path + String matlabIncludePath = matlabPath + "/extern/include"; + return generateCompileCommand(sourceFile, matlabIncludePath, "DMATLAB_MEX_FILE"); + } + + public String matlabLibPath() { + String matlabLibPath = matlabPath + "/bin/" + matlabArch(); + return matlabLibPath; + } + + @Override + protected void populateBuiltFile() { + if (getOutputName() == null || getOutputName().isEmpty()) { + setOutputName(generateSharedObjectName()); + } + super.populateBuiltFile(); + } + + /** + * Generates the command to link the source files into a shared object. The commad includes the + * MATLAB library path and library names. If the user has not selected a name for the library, + * one will be generated from the project and C++ file paths. + */ + @Override + public String generateLinkCommand() { + + // if the build file is not set, then set it now to a default value + if (getOutputName() == null || getOutputName().isEmpty()) { + setOutputName(generateSharedObjectName()); + } + + // construct the path to the MATLAB shared object libraries + return generateLinkCommand(matlabLibPath()); + } + + /** + * Generates a name for the shared object library in the event that none has been set. This is + * done by taking the project name and adding to it the components of the C++ path name, + * separated by hyphens. For example, if the project directory is /path/to/pipeline/module1, and + * the source directory is /path/to/pipeline/module1/src/main/cpp/mex, the name of the library + * will be module1-src-main-cpp-mex, resulting in a shared object named + * libmodule1-src-main-cpp-mex.so or .dylib. + * + * @return Generated name for the shared object library. 
+ */ + public String generateSharedObjectName() { + String objectNameStart = getProjectDir().getName(); + int projectDirLength = getProjectDir().getAbsolutePath().length(); + String truncatedCppPath = getCppFilePaths().get(0).substring(projectDirLength + 1); + String[] truncatedCppPathParts = truncatedCppPath.split("/"); + StringBuilder sharedObjectNameBuilder = new StringBuilder(); + sharedObjectNameBuilder.append(objectNameStart + "-"); + for (int i = 0; i < truncatedCppPathParts.length; i++) { + sharedObjectNameBuilder.append(truncatedCppPathParts[i]); + if (i < truncatedCppPathParts.length - 1) { + sharedObjectNameBuilder.append("-"); + } + } + return sharedObjectNameBuilder.toString(); + } + + /** + * Generates the mex command for a given file. + * + * @param mexfile The desired mexfile output. + * @param obj The object file with the mexFunction entry point for the mexfile. + * @return A complete mex command in string form. + */ + public String generateMexCommand(File mexfile, File obj) { + + // if the build file is not set, then set it now to a default value + if (getOutputName() == null || getOutputName().isEmpty()) { + setOutputName(generateSharedObjectName()); + } + + StringBuilder mexCommandBuilder = new StringBuilder(); + mexCommandBuilder.append(getCppCompiler() + " "); + mexCommandBuilder.append("-o " + mexfile.getAbsolutePath() + " "); + mexCommandBuilder.append(obj.getAbsolutePath() + " "); + mexCommandBuilder.append(argListToString(getLibraryPaths(), "-L")); + mexCommandBuilder.append("-L" + matlabLibPath() + " "); + mexCommandBuilder.append("-L" + libDir().getAbsolutePath() + " "); + mexCommandBuilder.append(argListToString(getLibraries(), "-l")); + mexCommandBuilder.append("-lmex -lmx -lmat "); + mexCommandBuilder.append("-l" + getOutputName() + " -shared"); + return mexCommandBuilder.toString(); + } + + /** + * Main method used by Gradle. This method starts by using the ZiggyCppPojo action() method to + * compile the source files and build the shared library. The desired mexfiles are then looped + * over, and the mexfile commands are run by a DefaultExecutor. 
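+     * <p>On Linux, a single mex command generated by {@link #generateMexCommand} looks roughly
+     * like the following (the compiler, paths, and library name are illustrative):
+     * <pre>
+     *   g++ -o build/lib/MyFunction.mexa64 build/obj/MyFunction.o -L$MATLAB/bin/glnxa64
+     *       -Lbuild/lib -lmex -lmx -lmat -lmymodule -shared
+     * </pre>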
+ */ + @Override + public void action() { + + log.info(String.format("%s.action()\n", this.getClass().getSimpleName())); + + // Start by performing the compilation + compileAction(); + + // Map the mexfiles to their object files + Map mexfileMap = mapMexfilesToObjectFiles(); + + // remove the mexfile objects from the list of objects so they don't go into the + // shared object library + getObjectFiles().removeAll(mexfileMap.values()); + + // construct the shared object + linkAction(); + + // loop over mexfiles + for (File mexfile : mexfileMap.keySet()) { + String mexCommand = generateMexCommand(mexfile, mexfileMap.get(mexfile)); + log.info(mexCommand); + DefaultExecutor mexExec = getDefaultExecutor(); + + // It's not strictly necessary to do this, since all the files have full + // paths, but it's a good practice nonetheless + mexExec.setWorkingDirectory(objDir()); + + // execute the mex command + try { + int returnCode = mexExec.execute(new CommandLineComparable(mexCommand)); + if (returnCode != 0) { + throw new GradleException("Mexing of file " + mexfile.getName() + " failed"); + } + } catch (IOException e) { + throw new GradleException("Mexing of file " + mexfile + " failed", e); + } + } + } + + // Build type is not optional for the C++ Mex builds + @Override + public void setOutputType(BuildType buildType) { + log.warn("ZiggyCppMex does not support build types other than shared"); + } + + @Override + public void setOutputType(Object buildType) { + log.warn("ZiggyCppMex does not support build types other than shared"); + } + + // Setters and getters + public void setMexfileNames(List mexfileNames) { + this.mexfileNames = new ArrayList<>(); + this.mexfileNames.addAll(ZiggyCppPojo.objectListToStringList(mexfileNames)); + } + + public List getMexfileNames() { + return mexfileNames; + } + + public void setMatlabPath(Object matlabPath) { + this.matlabPath = matlabPath.toString(); + } + + public String getMatlabPath() { + return matlabPath; + } + + public List getMexfiles() { + if (mexfiles == null) { + populateMexfiles(); + } + return mexfiles; + } + + public File getProjectDir() { + return projectDir; + } + + public void setProjectDir(File projectDir) { + this.projectDir = projectDir; + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppPojo.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppPojo.java new file mode 100644 index 0000000..8a238bd --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyCppPojo.java @@ -0,0 +1,759 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.File; +import java.io.FilenameFilter; +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; + +import org.apache.commons.exec.CommandLine; +import org.apache.commons.exec.DefaultExecutor; +import org.apache.commons.io.FileUtils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import org.gradle.api.GradleException; +import org.gradle.internal.os.OperatingSystem; + +/** + * Performs compilation and linking of C++ code for Ziggy and for pipelines based on Ziggy. + * The command line options, source directory, include paths, library paths, library names, and + * type of build (executable, shared library, static library) and output file name + * must be specified in the Gradle task that makes use of this class. The compiler is determined + * from the CXX environment variable, thus is compatible with third-party packages that use the same + * convention. 
Actual source file names are deduced from listing the source directory, object file + * information is generated during the compile and saved for use in the link. + * + * If Gradle's JVM has cppdebug set to true as a system property, the compile and link commands will + * have appropriate options (-g and -Og). These will be placed after the compile / link options, thus + * will override any optimization options supplied to Gradle. + * + * NB: this is a POJO that makes minimal use of the Gradle API. In particular, it does not extend + * DefaultTask. This is because classes that extend DefaultTask are de facto impossible to unit test + * in Java. The ZiggyCpp class embeds a ZiggyCpp class to perform its actions and store its members. + * + * @author PT + * + */ + +public class ZiggyCppPojo { + + private static final Logger log = LoggerFactory.getLogger(ZiggyCppPojo.class); + + /** Type of build: shared object, static library, or standalone program */ + enum BuildType { + SHARED, STATIC, EXECUTABLE + } + + public static final String CPP_COMPILER_ENV_VAR = "CXX"; + public static final String[] CPP_FILE_TYPES = { ".c", ".cpp" }; + public static final String CPP_DEBUG_PROPERTY_NAME = "cppdebug"; + + public static final String DEFAULT_COMPILE_OPTIONS_GRADLE_PROPERTY = "defaultCppCompileOptions"; + public static final String DEFAULT_LINK_OPTIONS_GRADLE_PROPERTY = "defaultCppLinkOptions"; + public static final String DEFAULT_RELEASE_OPTS_GRADLE_PROPERTY = "defaultCppReleaseOptimizations"; + public static final String DEFAULT_DEBUG_OPTS_GRADLE_PROPERTY = "defaultCppDebugOptimizations"; + public static final String PIPELINE_ROOT_DIR_PROP_NAME = "pipelineRootDir"; + + /** Path to the C++ files to be compiled */ + private List cppFilePaths = null; + + /** Paths to the include files */ + private List includeFilePaths = new ArrayList<>(); + + /** Paths to the libraries needed in linking */ + private List libraryPaths = new ArrayList<>(); + + /** Libraries needed for linking (minus the "lib" prefix and all file type suffixes) */ + private List libraries = new ArrayList<>(); + + /** compile options (minus the initial hyphen) */ + private List compileOptions = new ArrayList<>(); + + /** linker options (minus the initial hyphen) */ + private List linkOptions = new ArrayList<>(); + + /** Optimizations, if any, desired for a build without cppdebug=true system property */ + private List releaseOptimizations = new ArrayList<>(); + + /** Optimizations, if any, desired for a build with cppdebug=true system property */ + private List debugOptimizations = new ArrayList<>(); + + /** Caller-selected build type */ + private BuildType outputType = null; + + /** Name of the output file (with no "lib" prefix or file type suffix) */ + private String name = null; + + /** C++ files found in the cppFilePath directory */ + private List cppFiles = new ArrayList<>(); + + /** Object files built from the C++ files */ + private List objectFiles = new ArrayList<>(); + + /** Desired output (executable or library) as a File */ + private File builtFile = null; + + /** Desired Gradle build directory, as a File */ + private File buildDir = null; + + /** Root directory for the parent Gradle project, as a File */ + private File rootDir = null; + + /** C++ compiler command including path to same */ + private String cppCompiler = null; + + /** Default executor used only for testing, do not use for real execution! 
*/ + private DefaultExecutor defaultExecutor = null; + + /** Operating system, needed to set options and names for the linker command */ + private OperatingSystem operatingSystem = OperatingSystem.current(); + + // stores logger warning messages. Used only for testing. + private List loggerWarnings = new ArrayList<>(); + + /** + * Converts a list of arguments to a single string that can be used in a command line compiler + * call + * + * @param argList list of arguments + * @param prefix prefix for each argument ("-I", "-L", etc.) + * @return the list of arguments converted to a string, and with the prefix added to each + */ + public String argListToString(List argList, String prefix) { + StringBuilder argStringBuilder = new StringBuilder(); + for (String arg : argList) { + argStringBuilder.append(prefix + arg + " "); + } + return argStringBuilder.toString(); + } + + File objDir() { + return new File(buildDir, "obj"); + } + + File libDir() { + return new File(buildDir, "lib"); + } + + File binDir() { + return new File(buildDir, "bin"); + } + + /** + * Search the specified file path for C and C++ files, and populate the cppFiles list with same. + * If the file path is not set or does not exist, a GradleException will be thrown. + */ + private void populateCppFiles() { + + // check that the path is set and exists + if (cppFilePaths == null) { + throw new GradleException("C++ file path is null"); + } + + // clear any existing files, and also handle the null pointer case + // neither of these should ever occur in real life, but why risk it? + if (cppFiles == null || !cppFiles.isEmpty()) { + cppFiles = new ArrayList<>(); + } + + for (String cppFilePath : cppFilePaths) { + File cppFileDir = new File(cppFilePath); + if (!cppFileDir.exists()) { + String w = "C++ file path " + cppFilePath + " does not exist"; + log.warn(w); + addLoggerWarning(w); + + } else { + + // find all C and C++ files and add them to the cppFiles list + for (String fileType : CPP_FILE_TYPES) { + File[] cFiles = cppFileDir.listFiles(new FilenameFilter() { + public boolean accept(File dir, String name) { + return name.endsWith(fileType); + } + }); + for (File file : cFiles) { + cppFiles.add(file); + } + } + } + } + + if (!cppFiles.isEmpty()) { + Collections.sort(cppFiles); + // write the list of files to the log if info logging is set + StringBuilder fileListBuilder = new StringBuilder(); + for (File file : cppFiles) { + fileListBuilder.append(file.getName()); + fileListBuilder.append(" "); + } + log.info("List of C/C++ files in directory " + cppFilePaths + ": " + + fileListBuilder.toString()); + } + } + + /** + * Populate the builtFile member based on the name of the file to be built, its type, and the + * OS. Throws a GradleException if the name isn't defined, the output type isn't defined, or the + * OS is' something other than Mac OS, Linux, or Unix. 
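+     * <p>For example (the name is illustrative): an output name of {@code mylib} with build type
+     * SHARED yields {@code build/lib/libmylib.so} on Linux or {@code build/lib/libmylib.dylib}
+     * on Mac OS; STATIC yields {@code build/lib/libmylib.a}; EXECUTABLE yields
+     * {@code build/bin/mylib}.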
+ */ + protected void populateBuiltFile() { + + // handle error cases + if (name == null || outputType == null) { + throw new GradleException("Both output name and output type must be specified"); + } + + String outputDirectory = null; + String prefix = null; + String fileType = null; + // determine output directory + if (outputType == BuildType.EXECUTABLE) { + outputDirectory = binDir().getAbsolutePath(); + prefix = ""; + fileType = ""; + } else { + outputDirectory = libDir().getAbsolutePath(); + prefix = "lib"; + if (outputType == BuildType.STATIC) { + fileType = ".a"; + } else { + if (operatingSystem.isMacOsX()) { + fileType = ".dylib"; + } else if (operatingSystem.isLinux()) { + fileType = ".so"; + } else { + throw new GradleException( + "ZiggyCpp class does not support OS " + operatingSystem.getName()); + } + } + } + String outputFile = prefix + name + fileType; + builtFile = new File(outputDirectory, outputFile); + + } + + /** + * Determines the name of the object file that is generated when compiling a given source file. + * + * @param sourceFile source C/C++ file + * @return name of the source file with original type stripped off and replaced with ".o" + */ + public static String objectNameFromSourceFile(File sourceFile) { + String sourceName = sourceFile.getName(); + String strippedName = null; + for (String fileType : CPP_FILE_TYPES) { + if (sourceName.endsWith(fileType)) { + strippedName = sourceName.substring(0, sourceName.length() - fileType.length()); + break; + } + } + return strippedName + ".o"; + } + + /** + * Generates the command to compile a single source file + * + * @param sourceFile File of the C/C++ source that is to be compiled + * @return the compile command as a single string. This command will include the include files + * and command line options specified in the object, and will route the output to the correct + * output directory (specifically $buildDir/obj). It will also take care of setting options + * correctly for a debug build if the JVM has cppdebug=true set as a system property. + */ + public String generateCompileCommand(File sourceFile) { + return generateCompileCommand(sourceFile, null, null); + } + + /** + * Generates the command to compile a single source file, with additional options that are + * needed for mexfiles + * + * @param sourceFile File of the C/C++ source that is to be compiled + * @param matlabIncludePath String that indicates the location of MATLAB include files, can be + * null + * @param matlabCompilerDirective String that contains the MATLAB compiler directive, can be + * null + * @return the compile command as a single string. This command will include the include files + * and command line options specified in the object, and will route the output to the correct + * output directory (specifically $buildDir/obj). It will also take care of setting options + * correctly for a debug build if the JVM has cppdebug=true set as a system property. 
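+     * <p>For example (compiler, options, and paths are illustrative), compiling {@code dmy1.c}
+     * with a MATLAB include path and the MATLAB_MEX_FILE directive produces a command of the
+     * form:
+     * <pre>
+     *   g++ -c -o build/obj/dmy1.o -Isrc/main/include -I$MATLAB/extern/include
+     *       -Wall -DMATLAB_MEX_FILE -O2 dmy1.c
+     * </pre>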
+ */ + public String generateCompileCommand(File sourceFile, String matlabIncludePath, + String matlabCompilerDirective) { + + StringBuilder compileStringBuilder = new StringBuilder(); + + // compiler executable + compileStringBuilder.append(getCppCompiler() + " "); + + // compile only flag + compileStringBuilder.append("-c "); + + // define the output file + compileStringBuilder.append( + "-o " + objDir().getAbsolutePath() + "/" + objectNameFromSourceFile(sourceFile) + " "); + + // add the include paths + compileStringBuilder.append(argListToString(includeFilePaths, "-I")); + + // If there is a MATLAB include path, handle that now + if (matlabIncludePath != null && !matlabIncludePath.isEmpty()) { + compileStringBuilder.append("-I" + matlabIncludePath + " "); + } + + // add the command line options + compileStringBuilder.append(argListToString(compileOptions, "-")); + + // if there is a MATLAB compiler directive, handle that now + if (matlabCompilerDirective != null && !matlabCompilerDirective.isEmpty()) { + compileStringBuilder.append("-" + matlabCompilerDirective + " "); + } + + // depending on whether there is a cppdebug system property set to true, we either set up + // for debugging, or -- not. + boolean debug = false; + if (System.getProperty(CPP_DEBUG_PROPERTY_NAME) != null) { + debug = Boolean.getBoolean(CPP_DEBUG_PROPERTY_NAME); + } + if (debug) { + compileStringBuilder.append(argListToString(debugOptimizations, "-")); + } else { + compileStringBuilder.append(argListToString(releaseOptimizations, "-")); + } + compileStringBuilder.append(sourceFile.getAbsolutePath()); + + // send the results of this to the log if info mode is selected + log.info(compileStringBuilder.toString()); + + return compileStringBuilder.toString(); + + } + + /** + * Generates a linker command line. The line takes into account the linker options, the desired + * output type (executable, static library, or shared object), and library paths and names. + * + * @return Linker command line as a String. + */ + public String generateLinkCommand() { + return generateLinkCommand(null); + } + + /** + * Generates a linker command line. The line takes into account the linker options, the desired + * output type (executable, static library, or shared object), and library paths and names. + * + * @param matlabLibPath String that indicates the path to MATLAB shared objects, can be null + * @return Linker command line as a String. 
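+     * <p>For example (names illustrative): a SHARED build produces a command of the form
+     * {@code g++ -o build/lib/libmylib.so -L<library paths> -shared <object files> -l<libraries>},
+     * whereas a STATIC build uses the archiver instead:
+     * {@code ar rs build/lib/libmylib.a <object files>}.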
+ */ + public String generateLinkCommand(String matlabLibPath) { + + StringBuilder linkStringBuilder = new StringBuilder(); + + // start with the actual command, which is either the compiler or the archive builder + if (outputType == BuildType.STATIC) { + linkStringBuilder.append("ar rs "); + } else { + linkStringBuilder.append(getCppCompiler() + " -o "); + } + + // add the name of the desired output file + linkStringBuilder.append(getBuiltFile().getAbsolutePath() + " "); + + // if this is an executable or shared object, add the linker options + // and library paths + if (outputType != BuildType.STATIC) { + linkStringBuilder.append(argListToString(libraryPaths, "-L")); + if (matlabLibPath != null && !matlabLibPath.isEmpty()) { + linkStringBuilder.append("-L" + matlabLibPath + " "); + } + + } + + // add release or debug options + if (outputType == BuildType.EXECUTABLE) { + linkStringBuilder.append(argListToString(linkOptions, "-")); + if (System.getProperty(CPP_DEBUG_PROPERTY_NAME) != null + && Boolean.getBoolean(CPP_DEBUG_PROPERTY_NAME)) { + linkStringBuilder.append(argListToString(debugOptimizations, "-")); + } else { + linkStringBuilder.append(argListToString(releaseOptimizations, "-")); + } + } + + // if this is to be a shared object, put in the "-shared" option + + if (outputType == BuildType.SHARED) { + linkStringBuilder.append("-shared "); + } + + // if the OS is Mac OS, set the install name. The install name assumes that the library + // will be installed in the build/lib directory under the root directory. + + if (operatingSystem.isMacOsX() && outputType == BuildType.SHARED) { + linkStringBuilder.append("-install_name " + getRootDir().getAbsolutePath() + + "/build/lib/" + getBuiltFile().getName() + " "); + } + + // add the object files + for (File objectFile : objectFiles) { + linkStringBuilder.append(objectFile.getName() + " "); + } + + // Add library names. These have come after the object files due to a positional + // dependence in the Linux linker + if (outputType != BuildType.STATIC) { + linkStringBuilder.append(argListToString(libraries, "-l")); + if (matlabLibPath != null && !matlabLibPath.isEmpty()) { + linkStringBuilder.append("-lmex -lmx -lmat "); + } + } + + log.info(linkStringBuilder.toString()); + return linkStringBuilder.toString(); + + } + + /** + * Returns a DefaultExecutor object. For normal execution, this always returns a new object, but + * if the defaultExecutor member is non-null, that is what is returned. This latter case should + * only happen in testing, when a mocked DefaultExecutor object is stored in defaultExecutor. + * + * @return a new DefaultExecutor (normal execution), or a mocked one (testing). + */ + protected DefaultExecutor getDefaultExecutor() { + if (defaultExecutor == null) { + return new DefaultExecutor(); + } + return defaultExecutor; + } + + /** + * Main action of the class. This method compiles the files and captures information about the + * resulting object files, then performs whatever linking / library building action is required. + * If any compile or the final link / library build command fails, a GradleException is thrown. + * Files in the include directories that end in .h or .hpp are copied to the build directory's + * include subdir. 
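+     * <p>After a successful run the Gradle build directory contains, for example (names
+     * illustrative): {@code obj/*.o} for the object files, {@code lib/libmylib.so} (or
+     * {@code bin/mylib} for an executable build), and {@code include/*.h} for the copied headers.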
+ */ + public void action() { + + log.info(String.format("%s.action()\n", this.getClass().getSimpleName())); + + // compile the source files + compileAction(); + + // perform the linker / archiver step + linkAction(); + } + + protected void compileAction() { + + // create the obj directory + + File objDir = objDir(); + if (!objDir.exists()) { + log.info("mkdir: " + objDir.getAbsolutePath()); + objDir.mkdirs(); + } + // loop over source files, compile them and add the object file to the object file list + for (File file : getCppFiles()) { + DefaultExecutor compilerExec = getDefaultExecutor(); + compilerExec.setWorkingDirectory(new File(cppFilePaths.get(0))); + try { + int returnCode = compilerExec + .execute(new CommandLineComparable(generateCompileCommand(file))); + + if (returnCode != 0) { + throw new GradleException("Compilation of file " + file.getName() + " failed"); + } + objectFiles.add(new File(objDir, objectNameFromSourceFile(file))); + } catch (IOException e) { + throw new GradleException( + "IOException occurred when attempting to compile " + file.getName(), e); + } + } + } + + protected void linkAction() { + + File objDir = objDir(); + DefaultExecutor linkExec = getDefaultExecutor(); + linkExec.setWorkingDirectory(objDir); + File destDir = null; + if (outputType.equals(BuildType.EXECUTABLE)) { + destDir = binDir(); + } else { + destDir = libDir(); + } + if (!destDir.exists()) { + log.info("mkdir: " + destDir.getAbsolutePath()); + destDir.mkdirs(); + } + try { + int returnCode = linkExec.execute(new CommandLineComparable(generateLinkCommand())); + if (returnCode != 0) { + throw new GradleException( + "Link / library construction of " + getBuiltFile().getName() + " failed"); + } + } catch (IOException e) { + throw new GradleException("IOException occurred during link / library construction of " + + getBuiltFile().getName(), e); + } + + // copy the files from each of the include directories to buildDir/include + File includeDest = new File(buildDir, "include"); + for (String include : includeFilePaths) { + File[] includeFiles = new File(include).listFiles(new FilenameFilter() { + @Override + public boolean accept(File dir, String name) { + return (name.endsWith(".h") || name.endsWith(".hpp")); + } + }); + for (File includeFile : includeFiles) { + try { + FileUtils.copyFileToDirectory(includeFile, includeDest); + } catch (IOException e) { + throw new GradleException("Unable to copy include files from" + include + " to " + + includeDest.getAbsoluteFile(), e); + } + } + } + + } + + /** + * Converts a list of Objects to a list of Strings, preserving their order. + * + * @param libraries2 List of objects to be converted. + * @return ArrayList of strings obtained by taking toString() of the objects in the objectList. + */ + static List objectListToStringList(List libraries2) { + List stringList = new ArrayList<>(); + for (Object obj : libraries2) { + stringList.add(obj.toString()); + } + return stringList; + } + + /** + * Converts a Gradle property to a list of Strings. The property can be a scalar or a list, Java + * Strings or Groovy GStrings. + * + * @param gradleProperty property to be converted. + * @return contents of gradleProperty as a list of Java Strings. 
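+     * <p>For example, a property set to the single value {@code "Wall"} is returned as the
+     * one-element list {@code ["Wall"]}, while a property set to the list {@code ["Wall", "O2"]}
+     * is returned as a two-element list of Java Strings; GString elements, if any, are converted
+     * via {@code toString()}.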
+ */ + @SuppressWarnings("unchecked") + static List gradlePropertyToList(Object gradleProperty) { + if (gradleProperty instanceof List) { + return objectListToStringList((List) gradleProperty); + } else { + List gradlePropertyList = new ArrayList<>(); + gradlePropertyList.add(gradleProperty); + return objectListToStringList((List) gradlePropertyList); + } + } + +// setters and getters + + public void setCppFilePath(Object cppFilePath) { + this.cppFilePaths = new ArrayList<>(); + cppFilePaths.add(cppFilePath.toString()); + } + + public List getCppFilePaths() { + return cppFilePaths; + } + + public void setCppFilePaths(List cppFilePaths) { + this.cppFilePaths = objectListToStringList(cppFilePaths); + } + + public List getIncludeFilePaths() { + return includeFilePaths; + } + + public void setIncludeFilePaths(List includeFilePaths) { + this.includeFilePaths = new ArrayList<>(); + this.includeFilePaths.addAll(objectListToStringList(includeFilePaths)); + } + + public List getLibraryPaths() { + return libraryPaths; + } + + public void setLibraryPaths(List libraryPaths) { + this.libraryPaths = new ArrayList<>(); + this.libraryPaths.addAll(objectListToStringList(libraryPaths)); + } + + public List getLibraries() { + return libraries; + } + + public void setLibraries(List libraries) { + this.libraries = new ArrayList<>(); + this.libraries.addAll(objectListToStringList(libraries)); + } + + public List getCompileOptions() { + return compileOptions; + } + + public void setCompileOptions(List compileOptions) { + this.compileOptions = new ArrayList<>(); + this.compileOptions.addAll(objectListToStringList(compileOptions)); + } + + public List getLinkOptions() { + return linkOptions; + } + + public void setLinkOptions(List linkOptions) { + this.linkOptions = new ArrayList<>(); + this.linkOptions.addAll(objectListToStringList(linkOptions)); + } + + public List getReleaseOptimizations() { + return releaseOptimizations; + } + + public void setReleaseOptimizations(List releaseOptimizations) { + this.releaseOptimizations = new ArrayList<>(); + this.releaseOptimizations.addAll(objectListToStringList(releaseOptimizations)); + } + + public List getDebugOptimizations() { + return debugOptimizations; + } + + public void setDebugOptimizations(List debugOptimizations) { + this.debugOptimizations = new ArrayList<>(); + this.debugOptimizations.addAll(objectListToStringList(debugOptimizations)); + } + + public BuildType getOutputType() { + return outputType; + } + + public void setOutputType(BuildType outputType) { + this.outputType = outputType; + } + + public void setOutputType(Object outputType) { + this.outputType = BuildType.valueOf(outputType.toString().toUpperCase()); + } + + public String getOutputName() { + return name; + } + + public void setOutputName(Object name) { + this.name = name.toString(); + } + + public List getCppFiles() { + // always generate the list afresh -- necessary because Gradle calls the ZiggyCpp + // method getCppFiles() prior to the actual build, at which time the directories of + // source files may or may not exist yet! Thus we can't afford to cache the C++ + // file list, since I can't tell whether Gradle creates a new ZiggyCpp object when + // it actually does the build, or whether it simply re-uses the one from pre-build. 
+ populateCppFiles(); + return cppFiles; + } + + public List getObjectFiles() { + return objectFiles; + } + + public void setObjectFiles(List objectFiles) { + this.objectFiles.addAll(objectFiles); + } + + public void setObjectFiles(File objectFile) { + this.objectFiles.add(objectFile); + } + + public File getBuiltFile() { + if (builtFile == null) { + populateBuiltFile(); + } + return builtFile; + } + + public File getBuildDir() { + return buildDir; + } + + public void setBuildDir(File buildDir) { + this.buildDir = buildDir; + } + + public File getRootDir() { + return rootDir; + } + + public void setRootDir(File rootDir) { + this.rootDir = rootDir; + } + + public String getCppCompiler() { + if (cppCompiler == null) { + cppCompiler = System.getenv(CPP_COMPILER_ENV_VAR); + } + return cppCompiler; + } + + public OperatingSystem getOperatingSystem() { + return operatingSystem; + } + + // this method is intended for use only in testing, for that reason it is package-private + void setCppCompiler(String cppCompiler) { + this.cppCompiler = cppCompiler; + } + + // this method is intended for use only in testing, for that reason it is package-private + void setDefaultExecutor(DefaultExecutor defaultExecutor) { + this.defaultExecutor = defaultExecutor; + } + + // this method is intended for use only in testing, for that reason it is package-private + void setOperatingSystem(OperatingSystem operatingSystem) { + this.operatingSystem = operatingSystem; + } + + /** + * Thin wrapper for Apache Commons CommandLine class that provides an equals() method. This is + * needed for unit testing, since Mockito checks argument agreements using the argument's + * equals() method, and CommandLine doesn't have one. + * + * @author PT + */ + class CommandLineComparable extends CommandLine { + + public CommandLineComparable(String executable) { + super(CommandLine.parse(executable)); + } + + public boolean equals(Object o) { + if (o instanceof CommandLine) { + CommandLine oo = (CommandLine) o; + if (this.toString().contentEquals(oo.toString())) { + return true; + } + } + return false; + } + } + + // add a logger warning to the list of same. Used only for testing. + private void addLoggerWarning(String warning) { + loggerWarnings.add(warning); + } + + // retrieve the list of saved logger warnings. Used only for testing. 
+ List loggerWarnings() { + return loggerWarnings; + } +} diff --git a/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyVersionGenerator.java b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyVersionGenerator.java new file mode 100644 index 0000000..3f8afc2 --- /dev/null +++ b/buildSrc/src/main/java/gov/nasa/ziggy/buildutil/ZiggyVersionGenerator.java @@ -0,0 +1,188 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.BufferedReader; +import java.io.BufferedWriter; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.InputStreamReader; +import java.net.URISyntaxException; +import java.text.SimpleDateFormat; +import java.util.ArrayList; +import java.util.Date; +import java.util.List; + +import org.gradle.api.tasks.OutputFile; +import org.gradle.api.tasks.TaskAction; + +import com.google.common.collect.ImmutableList; + +import freemarker.template.Configuration; +import freemarker.template.Template; +import freemarker.template.TemplateException; +import freemarker.template.TemplateExceptionHandler; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Generates version info in the generated file ZiggyVersion.java$ git rev-list --all --abbrev=0 --abbrev-commit | awk '{print length()}' | sort -n | uniq -c + * 1040 4 + * 7000 5 + * 1149 6 + * 68 7 + * 8 8 + * 1 9 + * + * + */ +public class ZiggyVersionGenerator extends TessExecTask { + + private static final Logger log = LoggerFactory.getLogger(ZiggyVersionGenerator.class); + private static final String MAC_OS_X_OS_NAME = "Mac OS X"; + + public File outputFile; + public final String dateFormat = "dd-MMM-yyyy HH:mm:ss"; + private String osType; + + public void generateFile(BufferedWriter out) throws IOException, InterruptedException { + + osType = System.getProperty("os.name"); + log.debug("OS Type: " + osType); + Configuration config = new Configuration(); + config.setClassForTemplateLoading(this.getClass(), "/"); + config.setDefaultEncoding("UTF-8"); + config.setTemplateExceptionHandler(TemplateExceptionHandler.RETHROW_HANDLER); + + VersionInfo versionInfo = new VersionInfo(); + versionInfo.setBuildDate(getBuildDate()); + versionInfo.setSoftwareVersion(getGitRelease()); + versionInfo.setBranch(getGitBranch()); + versionInfo.setRevision(getGitRevision()); + + try { + config.getTemplate("ZiggyVersion.java.ftlh").process(versionInfo, out); + } catch (TemplateException e) { + throw new IllegalStateException("Error processing template", e); + } + } + + public List getProcessOutput(List command) + throws IOException, InterruptedException { + + ProcessBuilder processBuilder = new ProcessBuilder(command); + Process process = processBuilder.start(); + List lines = new ArrayList<>(); + + BufferedReader bufferedReader = new BufferedReader( + new InputStreamReader(process.getInputStream())); + + for (;;) { + String line = bufferedReader.readLine(); + if (line == null) { + break; + } + + lines.add(line); + } + + process.waitFor(); + return lines; + } + + public String getGitRevision() throws IOException, InterruptedException { + if (osType.equals(MAC_OS_X_OS_NAME)) { + return "Not Supported"; + } + List cmd = ImmutableList.of("git", "rev-parse", "--short=10", "HEAD"); + List output = getProcessOutput(cmd); + return output.get(output.size() - 1); + } + + public String getGitBranch() throws IOException, InterruptedException { + if (osType.equals(MAC_OS_X_OS_NAME)) { + return "Not Supported"; + } + List cmd = ImmutableList.of("git", "rev-parse", "--abbrev-ref", "HEAD"); + List output = 
getProcessOutput(cmd); + return output.get(output.size() - 1); + } + + public String getGitRelease() throws IOException, InterruptedException { + if (osType.equals(MAC_OS_X_OS_NAME)) { + return "Not Supported"; + } + List cmd = ImmutableList.of("git", "describe", "--always", "--abbrev=10"); + List output = getProcessOutput(cmd); + return output.get(output.size() - 1); + } + + public String getBuildDate() { + SimpleDateFormat dateFormatter = new SimpleDateFormat(dateFormat); + return dateFormatter.format(new Date()); + } + + @OutputFile + public File getOutputFile() { + return outputFile; + } + + public void setOutputFile(File output) { + outputFile = output; + } + + @TaskAction + public void action() throws IOException, InterruptedException { + try (BufferedWriter output = new BufferedWriter(new FileWriter(outputFile))) { + generateFile(output); + } + } + + /** + * Holds version information in a Java bean suitable for referencing from a template. + */ + public static class VersionInfo { + + private String buildDate; + private String softwareVersion; + private String revision; + private String branch; + + public String getBuildDate() { + return buildDate; + } + + public void setBuildDate(String dateStr) { + buildDate = dateStr; + } + + public String getSoftwareVersion() { + return softwareVersion; + } + + public void setSoftwareVersion(String versionStr) { + softwareVersion = versionStr; + } + + public String getRevision() { + return revision; + } + + public void setRevision(String revision) { + this.revision = revision; + } + + public String getBranch() { + return branch; + } + + public void setBranch(String branch) { + this.branch = branch; + } + } +} diff --git a/buildSrc/src/main/resources/ZiggyVersion.java.ftlh b/buildSrc/src/main/resources/ZiggyVersion.java.ftlh new file mode 100644 index 0000000..da7da6c --- /dev/null +++ b/buildSrc/src/main/resources/ZiggyVersion.java.ftlh @@ -0,0 +1,88 @@ +<#-- Template for generating ZiggyVersion.java --> +package gov.nasa.ziggy.util; + +import java.text.SimpleDateFormat; +import java.text.DateFormat; +import java.text.ParseException; +import java.util.Date; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Provides code versioning information. + * + *
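+ * <p>A minimal usage sketch (the calling code is hypothetical; the methods are the ones
+ * generated below):
+ * <pre>
+ *   String version = ZiggyVersion.getSoftwareVersion();
+ *   String branch = ZiggyVersion.getBranch();
+ *   boolean release = ZiggyVersion.isRelease();
+ * </pre>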

This has been automatically generated! Do not edit. + */ +public class ZiggyVersion { + + private static final DateFormat DATE_FORMAT = new SimpleDateFormat("dd-MMM-yyyy HH:mm:ss"); + + private static final Pattern NON_TAG_PATTERN = Pattern.compile("^.*-g[A-Fa-f0-9]+$"); + + /** + * Gets the build date. + * + * @return the date and time this file was created, as a Java date + * @throws IllegalStateException if there is an error interpreting the build date string + */ + public static Date getBuildDate() { + try { + return DATE_FORMAT.parse("${buildDate}"); + } catch (ParseException e) { + throw new IllegalStateException(e); + } + } + + /** + * Gets the software revision. The format will vary by revision control + * system. For Git repositories, this should be generated by "git describe". + * + * @return the software version, as a string + */ + public static String getSoftwareVersion() { + return "${softwareVersion}"; + } + + /** + * Gets the latest version control commit revision identifier. For Git repositories, this + * is the commit hash. + * + * @return the latest commit revision identifier, as a string + */ + public static String getRevision() { + return "${revision}"; + } + + /** + * Tests whether the software revision corresponds to a release. In TESS, + * we are running a release if it was compiled from a release branch. + * + * @return true, if the software revision is a released version + */ + public static boolean isRelease() { + return getBranch().startsWith("releases/") || (atTag() && getBranch().equals("HEAD")); + } + + /** + * Gets the branch of the software revision. + * + * @return the branch used to build this file + */ + public static String getBranch() { + return "${branch}"; + } + + /** + * Tests whether we have been checked out from a tag. If so, we + * will be in a detached head state, in which case the software + * revision obtained by "git describe" will not have a trailing + * "{@code -g}". 
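+     * <p>For example (version strings are illustrative): a {@code git describe} result of
+     * {@code v1.0.0} indicates a checkout at the tag itself, whereas
+     * {@code v1.0.0-12-g0123456789} ends in "-g" plus a commit hash and therefore does not.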
+ * + * @return true, if we have been checked out from a tag, false otherwise + */ + private static boolean atTag() { + Matcher matcher = NON_TAG_PATTERN.matcher(getSoftwareVersion()); + return !matcher.matches(); + } + +} diff --git a/buildSrc/src/main/sh/macosx-sdk-selection.sh b/buildSrc/src/main/sh/macosx-sdk-selection.sh new file mode 100644 index 0000000..1c0ba64 --- /dev/null +++ b/buildSrc/src/main/sh/macosx-sdk-selection.sh @@ -0,0 +1,31 @@ +#!/bin/bash +# + +# Get the current OS version number + +macosx_version=`uname -r` + +# depending on the version number, we need to set the SDK-related variables differently: + +case $macosx_version in + 14*) + SDKROOT='/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/' + MACOSX_DEPLOYMENT_TARGET='10.10' + ;; + 13*) + SDKROOT='/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/' + MACOSX_DEPLOYMENT_TARGET='10.9' + ;; + 12*) + SDKROOT='/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/' + MACOSX_DEPLOYMENT_TARGET='10.8' + ;; + 11*) + SDKROOT='/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.7.sdk/' + MACOSX_DEPLOYMENT_TARGET='10.7' + ;; + *) + SDKROOT='/Developer/SDKs/MacOSX10.6.sdk' + MACOSX_DEPLOYMENT_TARGET='10.6' + ;; +esac diff --git a/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojoTest.java b/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojoTest.java new file mode 100644 index 0000000..206ab0a --- /dev/null +++ b/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppMexPojoTest.java @@ -0,0 +1,432 @@ +package gov.nasa.ziggy.buildutil; + +import static org.junit.Assert.assertEquals; +import static org.mockito.Mockito.when; + +import java.io.File; +import java.io.IOException; +import java.nio.file.Files; +import java.util.ArrayList; +import java.util.List; + +import org.apache.commons.exec.DefaultExecutor; +import org.apache.commons.exec.ExecuteException; +import org.apache.commons.io.FileUtils; +import org.gradle.api.GradleException; +import org.gradle.internal.os.OperatingSystem; +import org.junit.After; +import org.junit.Before; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.ExpectedException; +import org.mockito.InOrder; +import org.mockito.Mockito; + +import gov.nasa.ziggy.buildutil.ZiggyCppPojo.BuildType; + +public class ZiggyCppMexPojoTest { + + File tempDir = null; + File buildDir = null; + File rootDir = null; + File projectDir = null; + File srcDir = null; + File incDir = null; + ZiggyCppMexPojo ziggyCppMexObject = null; + + DefaultExecutor defaultExecutor = Mockito.mock(DefaultExecutor.class); + + @Rule + public ExpectedException exception = ExpectedException.none(); + + @Before + public void before() throws IOException { + + // create a temporary directory for everything + tempDir = Files.createTempDirectory("rootDir").toFile(); + tempDir.deleteOnExit(); + + // rootDir is the same as tempDir + rootDir = tempDir; + + // projectDir + projectDir = new File(rootDir,"projectDir"); + projectDir.mkdir(); + + // build directory under project + buildDir = new File(projectDir, "build"); + buildDir.mkdir(); + + // lib, bin, obj, and include directories under build + new File(buildDir, "lib").mkdir(); + new File(buildDir, "obj").mkdir(); + new File(buildDir, "bin").mkdir(); + new File(buildDir, "include").mkdir(); + + // add a source directory that's several levels down + srcDir = new 
File(projectDir, "src/main/cpp/mex"); + srcDir.mkdirs(); + + // add an include directory that's several levels down + incDir = new File(projectDir, "src/main/include"); + incDir.mkdirs(); + + // create source files + createSourceFiles(); + + // create the ZiggyCppMexPojo object + ziggyCppMexObject = createZiggyCppMexPojo(); + } + + @After + public void after() throws IOException { + + // explicitly delete the temp directory + FileUtils.deleteDirectory(tempDir); + + // delete any cppdebug system properties + System.clearProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME); + + // delete the ZiggyCpp object + ziggyCppMexObject = null; + buildDir = null; + tempDir = null; + projectDir = null; + srcDir = null; + incDir = null; + } + +//*************************************************************************************** + + // Here begins the actual test classes + + /** Tests that the output type setters have no effect on the output type + * + */ + @Test + public void testOutputTypeSetters() { + ziggyCppMexObject.setOutputType("executable"); + assertEquals(ziggyCppMexObject.getOutputType(), BuildType.SHARED); + ziggyCppMexObject.setOutputType("static"); + assertEquals(ziggyCppMexObject.getOutputType(), BuildType.SHARED); + ziggyCppMexObject.setOutputType(BuildType.EXECUTABLE); + assertEquals(ziggyCppMexObject.getOutputType(), BuildType.SHARED); + ziggyCppMexObject.setOutputType(BuildType.STATIC); + assertEquals(ziggyCppMexObject.getOutputType(), BuildType.SHARED); + } + + /** + * Tests the setters and getters that are unique to the ZiggyCppMexPojo (the ones + * that are inherited from ZiggyCppPojo are not tested). + */ + @Test + public void testSettersAndGetters() { + + // these getter tests implicitly test the setters in createZiggyCppMexPojo(): + assertEquals(projectDir.getAbsolutePath(), ziggyCppMexObject.getProjectDir().getAbsolutePath()); + assertEquals("/dev/null/MATLAB_R2017b", ziggyCppMexObject.getMatlabPath()); + List mexfileNames = ziggyCppMexObject.getMexfileNames(); + assertEquals(2, mexfileNames.size()); + assertEquals("CSource1", mexfileNames.get(0)); + assertEquals("CppSource2", mexfileNames.get(1)); + } + + /** + * Tests the compile command generator, in particular to make certain that the MATLAB include + * path and MATLAB_MEX_FILE compiler directive are present + */ + @Test + public void testGenerateCompileCommand() { + + // test with debug options disabled + String compileString = ziggyCppMexObject.generateCompileCommand(new File("/dev/null/dmy1.c")); + String expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/dmy1.o " + + "-I" + srcDir.getAbsolutePath() + " -I" + incDir.getAbsolutePath() + + " -I/dev/null/MATLAB_R2017b/extern/include -Wall -fPic -DMATLAB_MEX_FILE -O2 -DNDEBUG -g " + + "/dev/null/dmy1.c"; + assertEquals(expectedString, compileString); + + // test with debug options enabled + System.setProperty("cppdebug", "true"); + compileString = ziggyCppMexObject.generateCompileCommand(new File("/dev/null/dmy1.c")); + expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/dmy1.o " + + "-I" + srcDir.getAbsolutePath() + " -I" + incDir.getAbsolutePath() + + " -I/dev/null/MATLAB_R2017b/extern/include -Wall -fPic -DMATLAB_MEX_FILE -Og -g " + + "/dev/null/dmy1.c"; + assertEquals(expectedString, compileString); + + // test with debug property present but set to false + System.setProperty("cppdebug", "false"); + compileString = ziggyCppMexObject.generateCompileCommand(new File("/dev/null/dmy1.c")); + expectedString = "/dev/null/g++ -c 
-o " + buildDir.getAbsolutePath() + "/obj/dmy1.o " + + "-I" + srcDir.getAbsolutePath() + " -I" + incDir.getAbsolutePath() + + " -I/dev/null/MATLAB_R2017b/extern/include -Wall -fPic -DMATLAB_MEX_FILE -O2 -DNDEBUG -g " + + "/dev/null/dmy1.c"; + assertEquals(expectedString, compileString); + } + + @Test + public void testGenerateSharedObjectName() { + String generatedName = ziggyCppMexObject.generateSharedObjectName(); + assertEquals("projectDir-src-main-cpp-mex", generatedName); + } + + @Test + public void testGenerateLinkCommand() { + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + String linkCommand = ziggyCppMexObject.generateLinkCommand(); + String expectedCommand = "/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/" + + "libdummy.so -L/dummy1/lib -L/dummy2/lib " + +"-L/dev/null/MATLAB_R2017b/bin/glnxa64 -shared o1.o o2.o -lhdf5 -lnetcdf -lmex -lmx -lmat "; + assertEquals(expectedCommand, linkCommand); + + // now test the library name for empty object name + ziggyCppMexObject = createZiggyCppMexPojo(); + ziggyCppMexObject.setOutputName(""); + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + linkCommand = ziggyCppMexObject.generateLinkCommand(); + expectedCommand = "/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/" + + "libprojectDir-src-main-cpp-mex.so -L/dummy1/lib -L/dummy2/lib " + +"-L/dev/null/MATLAB_R2017b/bin/glnxa64 -shared o1.o o2.o -lhdf5 -lnetcdf -lmex -lmx -lmat "; + assertEquals(expectedCommand, linkCommand); + + // test for Mac OS + ziggyCppMexObject = createZiggyCppMexPojo(); + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.MAC_OS); + linkCommand = ziggyCppMexObject.generateLinkCommand(); + expectedCommand = "/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/" + + "libdummy.dylib -L/dummy1/lib -L/dummy2/lib " + +"-L/dev/null/MATLAB_R2017b/bin/maci64 " + +"-shared -install_name " + rootDir.getAbsolutePath()+"/build/lib/libdummy.dylib" + + " o1.o o2.o -lhdf5 -lnetcdf -lmex -lmx -lmat "; + assertEquals(expectedCommand, linkCommand); + } + + @Test + public void testGenerateMexCommand() { + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + configureLinkerOptions(ziggyCppMexObject); + File mexfile = new File(buildDir, "lib/o1.mexmaci64"); + File objFile = new File(buildDir, "obj/o1.o"); + String mexCommand = ziggyCppMexObject.generateMexCommand(mexfile, objFile); + String expectedCommand = "/dev/null/g++ -o " + mexfile.getAbsolutePath() + " " + + objFile.getAbsolutePath() + " -L/dummy1/lib -L/dummy2/lib " + + "-L/dev/null/MATLAB_R2017b/bin/glnxa64 -L" + buildDir.getAbsolutePath() + "/lib " + + "-lhdf5 -lnetcdf -lmex -lmx -lmat -ldummy -shared"; + assertEquals(expectedCommand, mexCommand); + + // test for empty library object name + ziggyCppMexObject = createZiggyCppMexPojo(); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOutputName(""); + mexCommand = ziggyCppMexObject.generateMexCommand(mexfile, objFile); + expectedCommand = "/dev/null/g++ -o " + mexfile.getAbsolutePath() + " " + + objFile.getAbsolutePath() + " -L/dummy1/lib -L/dummy2/lib " + + "-L/dev/null/MATLAB_R2017b/bin/glnxa64 -L" + buildDir.getAbsolutePath() + "/lib " + + "-lhdf5 -lnetcdf -lmex -lmx -lmat -lprojectDir-src-main-cpp-mex -shared"; + assertEquals(expectedCommand, mexCommand); + + // test for Mac OS + ziggyCppMexObject = 
createZiggyCppMexPojo(); + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.MAC_OS); + mexCommand = ziggyCppMexObject.generateMexCommand(mexfile, objFile); + expectedCommand = "/dev/null/g++ -o " + mexfile.getAbsolutePath() + " " + + objFile.getAbsolutePath() + " -L/dummy1/lib -L/dummy2/lib " + + "-L/dev/null/MATLAB_R2017b/bin/maci64 -L" + buildDir.getAbsolutePath() + "/lib " + + "-lhdf5 -lnetcdf -lmex -lmx -lmat -ldummy -shared"; + assertEquals(expectedCommand, mexCommand); + } + + @Test + public void testAction() throws ExecuteException, IOException { + + // set the mocked executor into the object + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + ziggyCppMexObject.setDefaultExecutor(defaultExecutor); + InOrder executorCalls = Mockito.inOrder(defaultExecutor); + + // call the method + ziggyCppMexObject.action(); + + // check the calls -- first the 4 compile commands + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(projectDir, + "src/main/cpp/mex")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateCompileCommand(new File(srcDir, "CSource1.c")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(projectDir, + "src/main/cpp/mex")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateCompileCommand(new File(srcDir, "CSource2.c")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(projectDir, + "src/main/cpp/mex")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateCompileCommand(new File(srcDir, "CppSource1.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(projectDir, + "src/main/cpp/mex")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateCompileCommand(new File(srcDir, "CppSource2.cpp")))); + + // then the link command for the dynamic library (and also make sure that 2 of the 4 files + // got removed from the list of object files) + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, + "obj")); + List allObjectFiles = ziggyCppMexObject.getObjectFiles(); + assertEquals(2, allObjectFiles.size()); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateLinkCommand())); + + // then the mex commands + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, + "obj")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateMexCommand(new File(buildDir, "lib/CSource1.mexa64"), + new File(buildDir, "obj/CSource1.o")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, + "obj")); + executorCalls.verify(defaultExecutor).execute(ziggyCppMexObject.new CommandLineComparable( + ziggyCppMexObject.generateMexCommand(new File(buildDir, "lib/CppSource2.mexa64"), + new File(buildDir, "obj/CppSource2.o")))); + + } + + // Here are unit tests that exercise various error cases + + @SuppressWarnings("serial") + @Test + public void testErrorMexfileMissingSourceFile() { + ziggyCppMexObject.setMexfileNames(new ArrayList() {{ + add("CSource3"); + }}); + ziggyCppMexObject.setDefaultExecutor(defaultExecutor); + exception.expect(GradleException.class); + exception.expectMessage("No object 
file for mexfile CSource3"); + ziggyCppMexObject.action(); + } + + @Test + public void testErrorMexReturnCode() throws ExecuteException, IOException { + ziggyCppMexObject.setDefaultExecutor(defaultExecutor); + configureLinkerOptions(ziggyCppMexObject); + ziggyCppMexObject.setOperatingSystem(OperatingSystem.LINUX); + File mexfile = new File(buildDir, "lib/CSource1.mexa64"); + File objFile = new File(buildDir, "obj/CSource1.o"); + String mexCommand = ziggyCppMexObject.generateMexCommand(mexfile, objFile); + when(defaultExecutor.execute(ziggyCppMexObject.new CommandLineComparable( + mexCommand))).thenReturn(1); + exception.expect(GradleException.class); + exception.expectMessage("Mexing of file CSource1.mexa64 failed"); + ziggyCppMexObject.action(); + } + + @Test + public void testBadMexSuffix() { + ziggyCppMexObject.setOperatingSystem(OperatingSystem.WINDOWS); + exception.expect(GradleException.class); + ziggyCppMexObject.mexSuffix(); + } + + @Test + public void testBadMatlabArch() { + ziggyCppMexObject.setOperatingSystem(OperatingSystem.WINDOWS); + exception.expect(GradleException.class); + ziggyCppMexObject.matlabArch(); + } + + @SuppressWarnings("serial") + @Test + public void testNoBuildDir() { + ZiggyCppMexPojo ziggyCppMexObject = new ZiggyCppMexPojo(); + ziggyCppMexObject.setMexfileNames(new ArrayList() {{ + add("CSource1"); + add("CppSource2"); + }}); + exception.expect(GradleException.class); + exception.expectMessage("buildDir and mexfileNames must not be null"); + ziggyCppMexObject.populateMexfiles(); + } + + @Test + public void testNoMexfiles() { + ZiggyCppMexPojo ziggyCppMexObject = new ZiggyCppMexPojo(); + ziggyCppMexObject.setBuildDir(buildDir); + exception.expect(GradleException.class); + exception.expectMessage("buildDir and mexfileNames must not be null"); + ziggyCppMexObject.populateMexfiles(); + } + +//*************************************************************************************** + + // here begins assorted setup and helper methods + + public void createSourceFiles() throws IOException { + + // create 4 temporary "C/C++" files in the source directory + new File(srcDir, "CSource1.c").createNewFile(); + new File(srcDir, "CSource2.c").createNewFile(); + new File(srcDir, "CppSource1.cpp").createNewFile(); + new File(srcDir, "CppSource2.cpp").createNewFile(); + + new File(srcDir, "Header1.h").createNewFile(); + new File(incDir, "Header2.hpp").createNewFile(); + } + + @SuppressWarnings("serial") + public ZiggyCppMexPojo createZiggyCppMexPojo() { + ZiggyCppMexPojo ziggyCppMexObject = new ZiggyCppMexPojo(); + ziggyCppMexObject.setBuildDir(buildDir); + ziggyCppMexObject.setProjectDir(projectDir); + ziggyCppMexObject.setRootDir(rootDir); + ziggyCppMexObject.setCppCompiler("/dev/null/g++"); + ziggyCppMexObject.setCppFilePath(srcDir.getAbsolutePath()); + ziggyCppMexObject.setMatlabPath("/dev/null/MATLAB_R2017b"); + ziggyCppMexObject.setOutputName("dummy"); + ziggyCppMexObject.setMexfileNames(new ArrayList() {{ + add("CSource1"); + add("CppSource2"); + }}); + ziggyCppMexObject.setIncludeFilePaths(new ArrayList() {{ + add(srcDir.getAbsolutePath()); + add(incDir.getAbsolutePath()); + }}); + ziggyCppMexObject.setCompileOptions(new ArrayList() {{ + add("Wall"); + add("fPic"); + }}); + ziggyCppMexObject.setReleaseOptimizations(new ArrayList() {{ + add("O2"); + add("DNDEBUG"); + add("g"); + }}); + ziggyCppMexObject.setDebugOptimizations(new ArrayList() {{ + add("Og"); + add("g"); + }}); + return ziggyCppMexObject; + } + + public void configureLinkerOptions(ZiggyCppPojo 
ziggyCppObject) { + // first we need to add some object files + File o1 = new File(buildDir, "obj/o1.o"); + File o2 = new File(buildDir, "obj/o2.o"); + ziggyCppObject.setObjectFiles(o1); + ziggyCppObject.setObjectFiles(o2); + + // also some linker options and libraries + List linkerOptions = new ArrayList<>(); + linkerOptions.add("u whatevs"); + ziggyCppObject.setLinkOptions(linkerOptions); + List libraryPathOptions = new ArrayList<>(); + libraryPathOptions.add("/dummy1/lib"); + libraryPathOptions.add("/dummy2/lib"); + ziggyCppObject.setLibraryPaths(libraryPathOptions); + List libraryOptions = new ArrayList<>(); + libraryOptions.add("hdf5"); + libraryOptions.add("netcdf"); + ziggyCppObject.setLibraries(libraryOptions); + } +} diff --git a/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppPojoTest.java b/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppPojoTest.java new file mode 100644 index 0000000..daa7d05 --- /dev/null +++ b/buildSrc/src/test/java/gov/nasa/ziggy/buildutil/ZiggyCppPojoTest.java @@ -0,0 +1,764 @@ +package gov.nasa.ziggy.buildutil; + +import java.io.File; +import java.io.FileNotFoundException; +import java.io.IOException; +import java.io.PrintWriter; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.nio.file.Files; +import java.util.ArrayList; +import java.util.List; +import java.util.stream.Collectors; + +import org.apache.commons.exec.CommandLine; +import org.apache.commons.exec.DefaultExecutor; +import org.apache.commons.exec.ExecuteException; +import org.apache.commons.io.FileUtils; +import org.gradle.api.GradleException; +import org.gradle.internal.os.OperatingSystem; +import org.junit.After; +import org.junit.Before; +import org.junit.Ignore; +import org.junit.Rule; +import org.junit.Test; +import org.junit.rules.ExpectedException; +import org.mockito.InOrder; +import org.mockito.Mockito; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.when; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; + +import gov.nasa.ziggy.buildutil.ZiggyCppPojo; +import gov.nasa.ziggy.buildutil.ZiggyCppPojo.BuildType; + +/** + * Unit test class for ZiggyCppPojo class. 
+ * @author PT + * + */ +public class ZiggyCppPojoTest { + + File tempDir = null; + File buildDir = null; + File srcDir = null; + File rootDir = new File("/dev/null/rootDir"); + ZiggyCppPojo ziggyCppObject; + + DefaultExecutor defaultExecutor = Mockito.mock(DefaultExecutor.class); + + @Rule + public ExpectedException exception = ExpectedException.none(); + + @Before + public void before() throws IOException { + + // create a temporary directory for everything + tempDir = Files.createTempDirectory("ZiggyCpp").toFile(); + tempDir.deleteOnExit(); + + // directory for includes + new File(tempDir, "include").mkdir(); + + // directory for source + srcDir = new File(tempDir, "src"); + srcDir.mkdir(); + + // build directory + buildDir = new File(tempDir,"build"); + + // directory for libraries + new File(buildDir,"lib").mkdir(); + + // directory for includes + new File(buildDir, "include").mkdir(); + + // directory for built source + new File(buildDir, "src").mkdir(); + + // directory for objects + new File(buildDir, "obj").mkdir(); + + // directory for executables + new File(buildDir, "bin").mkdir(); + + // create C++ source and header files + createSourceFiles(); + + // create the ZiggyCpp object + ziggyCppObject = createZiggyCppObject(buildDir); + + } + + @After + public void after() throws IOException { + + // explicitly delete the temp directory + FileUtils.deleteDirectory(tempDir); + + // delete any cppdebug system properties + System.clearProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME); + + // delete the ZiggyCpp object + ziggyCppObject = null; + buildDir = null; + tempDir = null; + } + +//*************************************************************************************** + + // here begins the test methods + + /** + * Tests all getter and setter methods. 
+ * @throws InvocationTargetException + * @throws IllegalArgumentException + * @throws IllegalAccessException + * @throws NoSuchMethodException + */ + @Test + public void getterSetterTest() throws NoSuchMethodException, IllegalAccessException, + IllegalArgumentException, InvocationTargetException { + + List dummyArguments = new ArrayList<>(); + dummyArguments.add("DuMMY"); + dummyArguments.add("dUmMy"); + assertEquals(tempDir.getAbsolutePath() + "/src", ziggyCppObject.getCppFilePaths().get(0)); + + testStringListSettersAndGetters("IncludeFilePaths", new String[] { + tempDir.getAbsolutePath() + "/src", + tempDir.getAbsolutePath() + "/include"}); + testStringListSettersAndGetters("CompileOptions", new String[] { + "Wall", "fPic"}); + testStringListSettersAndGetters("ReleaseOptimizations", new String[] { + "O2", "DNDEBUG", "g"}); + testStringListSettersAndGetters("DebugOptimizations", new String[] { + "Og", "g"}); + ziggyCppObject.setLibraries(dummyArguments); + testStringListSettersAndGetters("Libraries", new String[]{"DuMMY", "dUmMy"}); + ziggyCppObject.setLibraryPaths(dummyArguments); + testStringListSettersAndGetters("LibraryPaths", new String[]{"DuMMY", "dUmMy"}); + ziggyCppObject.setLinkOptions(dummyArguments); + testStringListSettersAndGetters("LinkOptions", new String[]{"DuMMY", "dUmMy"}); + + ziggyCppObject.setOutputName("outputName"); + assertEquals("outputName", ziggyCppObject.getOutputName()); + + ziggyCppObject.setOutputType(BuildType.EXECUTABLE); + assertEquals(BuildType.EXECUTABLE, ziggyCppObject.getOutputType()); + ziggyCppObject.setOutputType("executable"); + assertEquals(BuildType.EXECUTABLE, ziggyCppObject.getOutputType()); + + ziggyCppObject.setOutputType(BuildType.SHARED); + assertEquals(BuildType.SHARED, ziggyCppObject.getOutputType()); + ziggyCppObject.setOutputType("shared"); + assertEquals(BuildType.SHARED, ziggyCppObject.getOutputType()); + + ziggyCppObject.setOutputType(BuildType.STATIC); + assertEquals(BuildType.STATIC, ziggyCppObject.getOutputType()); + ziggyCppObject.setOutputType("static"); + assertEquals(BuildType.STATIC, ziggyCppObject.getOutputType()); + + assertEquals(buildDir, ziggyCppObject.getBuildDir()); + assertEquals(buildDir.getAbsolutePath(), ziggyCppObject.getBuildDir().getAbsolutePath()); + + } + + /** + * Tests the ability to find C/C++ source files in the source directory and add them to + * the ZiggyCppPojo as a list of File objects + */ + @Test + public void testGetCppFiles() { + List cppFiles = ziggyCppObject.getCppFiles(); + assertEquals(2, cppFiles.size()); + List cppFilePaths = cppFiles.stream().map(s -> s.getAbsolutePath()) + .collect(Collectors.toList()); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/ZiggyCppMain.cpp")); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/GetString.cpp")); + + cppFiles = ziggyCppObject.getCppFiles(); + assertEquals(2, cppFiles.size()); + cppFilePaths = cppFiles.stream().map(s -> s.getAbsolutePath()) + .collect(Collectors.toList()); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/ZiggyCppMain.cpp")); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/GetString.cpp")); + } + + @Test + public void testGetCppFilesMultipleDirectories() throws FileNotFoundException { + + // put a source directory in build, and populate it + new File(buildDir, "src/cpp").mkdirs(); + createAdditionalSource(); + + // create the list of directories to check out + List cppPaths = new ArrayList<>(); + cppPaths.add(tempDir.toString() + "/src"); + 
cppPaths.add(buildDir.toString() + "/src/cpp"); + ziggyCppObject.setCppFilePaths(cppPaths); + List cppFiles = ziggyCppObject.getCppFiles(); + int nFiles = cppFiles.size(); + assertEquals(3, nFiles); + + List cppFilePaths = cppFiles.stream().map(s -> s.getAbsolutePath()) + .collect(Collectors.toList()); + assertTrue(cppFilePaths.contains(buildDir.getAbsolutePath() + "/src/cpp/GetAnotherString.cpp")); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/ZiggyCppMain.cpp")); + assertTrue(cppFilePaths.contains(tempDir.getAbsolutePath() + "/src/GetString.cpp")); + } + + /** + * Tests the argListToString method, which converts a list of arguments to a string, with a common + * prefix added to each list element + */ + @Test + public void argListToStringTest() { + String compileOptionString = ziggyCppObject.argListToString(ziggyCppObject.getCompileOptions(), + "-"); + assertEquals("-Wall -fPic ", compileOptionString); + } + + /** + * Tests the code that determines the File that is to be the output of the compile and link + * process. + */ + @Test + public void populateBuiltFileTest() { + + // executable + ziggyCppObject.setOutputType("executable"); + File builtFile = ziggyCppObject.getBuiltFile(); + assertEquals(buildDir.getAbsolutePath() + "/bin/dummy", builtFile.getAbsolutePath()); + + // shared library + ZiggyCppPojo ziggyCppShared = createZiggyCppObject(buildDir); + ziggyCppShared.setOutputType("shared"); + ziggyCppShared.setOperatingSystem(OperatingSystem.MAC_OS); + builtFile = ziggyCppShared.getBuiltFile(); + String builtFilePath = builtFile.getAbsolutePath(); + String sharedObjectFileType = ".dylib"; + assertEquals(buildDir.getAbsolutePath() + "/lib/libdummy" + sharedObjectFileType, builtFilePath); + ziggyCppShared = createZiggyCppObject(buildDir); + ziggyCppShared.setOutputType("shared"); + ziggyCppShared.setOperatingSystem(OperatingSystem.LINUX); + builtFile = ziggyCppShared.getBuiltFile(); + builtFilePath = builtFile.getAbsolutePath(); + sharedObjectFileType = ".so"; + assertEquals(buildDir.getAbsolutePath() + "/lib/libdummy" + sharedObjectFileType, builtFilePath); + + // static library + ZiggyCppPojo ziggyCppStatic = createZiggyCppObject(buildDir); + ziggyCppStatic.setOutputType("static"); + builtFile = ziggyCppStatic.getBuiltFile(); + builtFilePath = builtFile.getAbsolutePath(); + assertEquals(buildDir.getAbsolutePath() + "/lib/libdummy.a", builtFilePath); + } + + /** + * Tests the process of converting a source File to an object file name (with the + * path of the former stripped away). + */ + @Test + public void objectNameFromSourceFileTest() { + File f1 = new File("/tmp/dummy/s1.c"); + String s1 = ZiggyCppPojo.objectNameFromSourceFile(f1); + assertEquals("s1.o", s1); + File f2 = new File("/tmp/dummy/s1.cpp"); + String s2 = ZiggyCppPojo.objectNameFromSourceFile(f2); + assertEquals("s1.o", s2); + } + + /** + * Tests the method that generates compile commands. 
+ */ + @Test + public void generateCompileCommandTest() { + File f1 = new File("/tmp/dummy/s1.c"); + String compileCommand = ziggyCppObject.generateCompileCommand(f1); + String expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/s1.o " + + "-I" + tempDir.getAbsolutePath() + "/src -I" + tempDir.getAbsolutePath() + + "/include -Wall -fPic -O2 -DNDEBUG -g /tmp/dummy/s1.c"; + assertEquals(expectedString, compileCommand); + + // set up for debugging + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "true"); + compileCommand = ziggyCppObject.generateCompileCommand(f1); + expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/s1.o " + + "-I" + tempDir.getAbsolutePath() + "/src -I" + tempDir.getAbsolutePath() + + "/include -Wall -fPic -Og -g /tmp/dummy/s1.c"; + assertEquals(expectedString, compileCommand); + + // have the debugging property but set to false + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "false"); + compileCommand = ziggyCppObject.generateCompileCommand(f1); + expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/s1.o " + + "-I" + tempDir.getAbsolutePath() + "/src -I" + tempDir.getAbsolutePath() + + "/include -Wall -fPic -O2 -DNDEBUG -g /tmp/dummy/s1.c"; + assertEquals(expectedString, compileCommand); + + // test .cpp file type + f1 = new File("/tmp/dummy/s1.cpp"); + compileCommand = ziggyCppObject.generateCompileCommand(f1); + expectedString = "/dev/null/g++ -c -o " + buildDir.getAbsolutePath() + "/obj/s1.o " + + "-I" + tempDir.getAbsolutePath() + "/src -I" + tempDir.getAbsolutePath() + + "/include -Wall -fPic -O2 -DNDEBUG -g /tmp/dummy/s1.cpp"; + assertEquals(expectedString, compileCommand); + } + + /** + * Tests the method that generates link commands. 
+ */ + @Test + public void generateLinkCommandTest() { + + configureLinkerOptions(ziggyCppObject); + ziggyCppObject.setOutputType("executable"); + String linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/bin/dummy -L/dummy1/lib -L/dummy2/lib " + + "-u whatevs -O2 -DNDEBUG -g o1.o o2.o -lhdf5 -lnetcdf ", linkString); + + // now try it with debug enabled + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "true"); + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/bin/dummy -L/dummy1/lib -L/dummy2/lib " + + "-u whatevs -Og -g o1.o o2.o -lhdf5 -lnetcdf ", linkString); + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "false"); + + + // Now for a shared object library + ziggyCppObject = createZiggyCppObject(buildDir); + configureLinkerOptions(ziggyCppObject); + ziggyCppObject.setOutputType("shared"); + ziggyCppObject.setOperatingSystem(OperatingSystem.LINUX); + String sharedObjectFileType = ".so"; + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/libdummy" + sharedObjectFileType + + " -L/dummy1/lib -L/dummy2/lib -shared" + + " o1.o o2.o -lhdf5 -lnetcdf ", linkString); + + // For a Mac, there has to be an install name as well + ziggyCppObject = createZiggyCppObject(buildDir); + configureLinkerOptions(ziggyCppObject); + ziggyCppObject.setOutputType("shared"); + ziggyCppObject.setOperatingSystem(OperatingSystem.MAC_OS); + sharedObjectFileType = ".dylib"; + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/libdummy" + sharedObjectFileType + + " -L/dummy1/lib -L/dummy2/lib -shared" + + " -install_name /dev/null/rootDir/build/lib/libdummy.dylib " + + "o1.o o2.o -lhdf5 -lnetcdf ", linkString); + + // debug enabled shouldn't do anything + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "true"); + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("/dev/null/g++ -o " + buildDir.getAbsolutePath() + "/lib/libdummy" + sharedObjectFileType + + " -L/dummy1/lib -L/dummy2/lib -shared" + + " -install_name /dev/null/rootDir/build/lib/libdummy.dylib " + + "o1.o o2.o -lhdf5 -lnetcdf ", linkString); + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "false"); + + // static library case + ziggyCppObject = createZiggyCppObject(buildDir); + configureLinkerOptions(ziggyCppObject); + ziggyCppObject.setOutputType("static"); + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("ar rs " + buildDir.getAbsolutePath() + "/lib/libdummy.a o1.o o2.o ", linkString); + + // debug enabled shouldn't do anything + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "true"); + linkString = ziggyCppObject.generateLinkCommand(); + assertEquals("ar rs " + buildDir.getAbsolutePath() + "/lib/libdummy.a o1.o o2.o ", linkString); + System.setProperty(ZiggyCppPojo.CPP_DEBUG_PROPERTY_NAME, "false"); + + } + + /** + * Tests the method that executes the main action (compiles and links). 
+ * @throws ExecuteException + * @throws IOException + */ + @Test + public void actionTest() throws ExecuteException, IOException { + + // set values for the ZiggyCppPojo + ziggyCppObject.setOutputName("testOutput"); + ziggyCppObject.setOutputType("executable"); + + // set the mocked executor into the object + ziggyCppObject.setDefaultExecutor(defaultExecutor); + InOrder executorCalls = Mockito.inOrder(defaultExecutor); + + // call the method + ziggyCppObject.action(); + + // check the calls to the executor and their order + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "GetString.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "ZiggyCppMain.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, "obj")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateLinkCommand())); + + // test that the include files were copied + File buildInclude = new File(buildDir, "include"); + File buildInclude1 = new File(buildInclude, "ZiggyCppMain.h"); + assertTrue(buildInclude1.exists()); + File buildInclude2 = new File(buildInclude, "ZiggyCppLib.h"); + assertTrue(buildInclude2.exists()); + + + // create a new object for linking a shared object + ziggyCppObject = createZiggyCppObject(buildDir); + ziggyCppObject.setOutputName("testOutput"); + ziggyCppObject.setOutputType("shared"); + ziggyCppObject.setDefaultExecutor(defaultExecutor); + ziggyCppObject.action(); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "GetString.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "ZiggyCppMain.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, "obj")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateLinkCommand())); + + // and once more for a static library + // create a new object for linking a shared object + ziggyCppObject = createZiggyCppObject(buildDir); + ziggyCppObject.setOutputName("testOutput"); + ziggyCppObject.setOutputType("static"); + ziggyCppObject.setDefaultExecutor(defaultExecutor); + ziggyCppObject.action(); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "GetString.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(tempDir, "src")); + executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateCompileCommand(new File(srcDir, "ZiggyCppMain.cpp")))); + executorCalls.verify(defaultExecutor).setWorkingDirectory(new File(buildDir, "obj")); + 
executorCalls.verify(defaultExecutor).execute(ziggyCppObject.new CommandLineComparable( + ziggyCppObject.generateLinkCommand())); + + } + + // The following tests exercise various error conditions that return GradleExceptions. There are 2 error + // conditions that are not tested by these methods, they occur when the DefaultExecutor throws an IOException. + // These cases are considered sufficiently trivial that we omit them. + + /** + * Tests that a null value for the C++ file path produces the correct error. + */ + @Test + public void testCppFilePathNullError() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + exception.expect(GradleException.class); + exception.expectMessage("C++ file path is null"); + ziggyCppError.getCppFiles(); + } + + /** + * Tests that a nonexistent C++ file path produces the correct warning message. + */ + @Test + public void testCppFilePathDoesNotExist() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + ziggyCppError.setCppFilePath("/this/path/does/not/exist"); + ziggyCppError.getCppFiles(); + List w = ziggyCppError.loggerWarnings(); + assertTrue(w.size() > 0); + assertTrue(w.contains("C++ file path /this/path/does/not/exist does not exist")); + } + + /** + * Tests that a missing output name produces the correct error. + */ + @Test + public void testErrorNoOutputName() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + ziggyCppError.setCppFilePath(tempDir.getAbsolutePath() + "/src"); + ziggyCppError.setBuildDir(buildDir); + ziggyCppError.setOutputType("executable"); + exception.expect(GradleException.class); + exception.expectMessage("Both output name and output type must be specified"); + ziggyCppError.getBuiltFile(); + } + + /** + * Tests that a missing output type produces the correct error. + */ + @Test + public void testErrorNoOutputType() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + ziggyCppError.setCppFilePath(tempDir.getAbsolutePath() + "/src"); + ziggyCppError.setBuildDir(buildDir); + ziggyCppError.setOutputName("dummy"); + exception.expect(GradleException.class); + exception.expectMessage("Both output name and output type must be specified"); + ziggyCppError.getBuiltFile(); + } + + /** + * Tests that missing both the output type and the built file name produces the correct error. + */ + @Test + public void testErrorNoOutputTypeOrName() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + List cppPaths = new ArrayList<>(); + ziggyCppError.setCppFilePath(tempDir.getAbsolutePath() + "/src"); + ziggyCppError.setBuildDir(buildDir); + exception.expect(GradleException.class); + exception.expectMessage("Both output name and output type must be specified"); + ziggyCppError.getBuiltFile(); + } + + /** + * Tests that a non-zero compiler return value produces the correct error. + * @throws ExecuteException + * @throws IOException + */ + @Test + public void testCompilerError() throws ExecuteException, IOException { + ziggyCppObject.setOutputType("executable"); + ziggyCppObject.setDefaultExecutor(defaultExecutor); + when(defaultExecutor.execute(any(CommandLine.class))).thenReturn(1); + exception.expect(GradleException.class); + exception.expectMessage("Compilation of file GetString.cpp failed"); + ziggyCppObject.action(); + } + + /** + * Tests that a non-zero linker return value produces the correct error. 
+ * @throws ExecuteException + * @throws IOException + */ + @Test + public void testLinkerError() throws ExecuteException, IOException { + ziggyCppObject.setOutputType("executable"); + ziggyCppObject.setDefaultExecutor(defaultExecutor); + String linkerCommand = ziggyCppObject.generateLinkCommand() + "GetString.o ZiggyCppMain.o "; + when(defaultExecutor.execute(ziggyCppObject.new CommandLineComparable(linkerCommand))).thenReturn(1); + exception.expect(GradleException.class); + exception.expectMessage("Link / library construction of dummy failed"); + ziggyCppObject.action(); + } + + /** + * Test that an invalid OS produces the correct error. + */ + @Test + public void testInvalidOsError() { + ZiggyCppPojo ziggyCppError = new ZiggyCppPojo(); + ziggyCppError.setBuildDir(buildDir); + ziggyCppError.setOperatingSystem(OperatingSystem.WINDOWS); + ziggyCppError.setOutputName("dummy"); + ziggyCppError.setOutputType("shared"); + exception.expect(GradleException.class); + exception.expectMessage("ZiggyCpp class does not support OS " + + ziggyCppError.getOperatingSystem().getName()); + ziggyCppError.getBuiltFile(); + } + +//*************************************************************************************** + + // here begins assorted setup and helper methods + + /** + * Creates four source files for use in the test. Two are C++ files in the tempDir/src + * directory, ZiggyCppMain.cpp and GetString.cpp. One is the header file ZiggyCppMain.h, + * also in tempDir/src. The final one is ZiggyCppLib.h, in tempdir/include. + * @throws FileNotFoundException + */ + public void createSourceFiles() throws FileNotFoundException { + + // content for ZiggyCppMain.cpp + String[] mainSourceContent = + {"#include \"ZiggyCppMain.h\"" , + "#include ", + "#include \"ZiggyCppLib.h", + "", + "using namespace std;", + "", + "int main(int argc, const char* argv[]) {", + " string s = getString();", + " cout << s << endl;", + "}" + }; + + // content for the getString function + String[] getStringContent = + {"#include \"ZiggyCppMain.h\"", + "", + "using namespace std;", + "", + "string getString() {", + " return string(\"hello world!\");", + "}" + }; + + // content for the ZiggyCppMain.h header + String[] ziggyCppMainHeaderContent = + {"#ifndef ZIGGY_CPP", + "#define ZIGGY_CPP", + "#endif", + "", + "string getString();" + }; + + // content for the ZiggyCppLib.h header + String[] ziggyCppLibHeaderContent = + {"#ifndef ZIGGY_CPP_LIB", + "#define ZIGGY_CPP_LIB", + "#endif" + }; + + // create the source files first + PrintWriter mainSource = new PrintWriter(tempDir.getAbsolutePath() + "/src/ZiggyCppMain.cpp"); + for (String line : mainSourceContent) { + mainSource.println(line); + } + mainSource.close(); + + PrintWriter getStringSource = new PrintWriter(tempDir.getAbsolutePath() + "/src/GetString.cpp"); + for (String line : getStringContent) { + getStringSource.println(line); + } + getStringSource.close(); + + // put the main header in the src directory + PrintWriter h1 = new PrintWriter(tempDir.getAbsolutePath() + "/src/ZiggyCppMain.h"); + for (String line : ziggyCppMainHeaderContent) { + h1.println(line); + } + h1.close(); + + // put the other header in the include directory + PrintWriter h2 = new PrintWriter(tempDir.getAbsolutePath() + "/include/ZiggyCppLib.h"); + for (String line : ziggyCppLibHeaderContent) { + h2.println(line); + } + h2.close(); + + } + + /** + * Add an additional source file in another directory to test accumulating files from multiple directories + * @throws FileNotFoundException + */ + public void 
createAdditionalSource() throws FileNotFoundException { + + // content for the getString function + String[] getStringContent = + {"#include \"ZiggyCppMain.h\"", + "", + "using namespace std;", + "", + "string getAnotherString() {", + " return string(\"hello again world!\");", + "}" + }; + + PrintWriter getStringSource = new PrintWriter(buildDir.getAbsolutePath() + "/src/cpp/GetAnotherString.cpp"); + for (String line : getStringContent) { + getStringSource.println(line); + } + getStringSource.close(); + + } + + /** + * Creates a ZiggyCpp object that is properly formed and has the following members populated: + * cppFilePath + * includeFilePaths + * compileOptions + * name + * Project (with a mocked Project object) + * @return properly-formed ZiggyCpp object + */ + @SuppressWarnings("serial") + public ZiggyCppPojo createZiggyCppObject(File buildDir) { + ZiggyCppPojo ziggyCppObject = new ZiggyCppPojo(); + ziggyCppObject.setCppFilePath(tempDir.getAbsolutePath() + "/src"); + ziggyCppObject.setIncludeFilePaths(new ArrayList(){{ + add(tempDir.getAbsolutePath() + "/src"); + add(tempDir.getAbsolutePath() + "/include"); + }}); + ziggyCppObject.setCompileOptions(new ArrayList() {{ + add("Wall"); + add("fPic"); + }}); + ziggyCppObject.setReleaseOptimizations(new ArrayList() {{ + add("O2"); + add("DNDEBUG"); + add("g"); + }}); + ziggyCppObject.setDebugOptimizations(new ArrayList() {{ + add("Og"); + add("g"); + }}); + ziggyCppObject.setOutputName("dummy"); + ziggyCppObject.setBuildDir(buildDir); + ziggyCppObject.setCppCompiler("/dev/null/g++"); + ziggyCppObject.setRootDir(rootDir); + return ziggyCppObject; + } + + public void testStringListSettersAndGetters(String fieldName, String[] initialValues) + throws NoSuchMethodException, IllegalAccessException, IllegalArgumentException, + InvocationTargetException { + String getter = "get" + fieldName; + String setter = "set" + fieldName; + int nOrigValues = initialValues.length; + Method getMethod = ZiggyCppPojo.class.getDeclaredMethod(getter, null); + Method setMethod = ZiggyCppPojo.class.getDeclaredMethod(setter, List.class); + List initialGetValues = (List) getMethod.invoke(ziggyCppObject, null); + assertEquals(nOrigValues, initialGetValues.size()); + for (int i=0 ; i replacementValues = new ArrayList<>(); + replacementValues.add("R1"); + replacementValues.add("R2"); + setMethod.invoke(ziggyCppObject, replacementValues); + List replacementGetValues = (List) getMethod.invoke(ziggyCppObject, null); + assertEquals(replacementValues.size(), replacementGetValues.size()); + for (int i=0 ; i linkerOptions = new ArrayList<>(); + linkerOptions.add("u whatevs"); + ziggyCppObject.setLinkOptions(linkerOptions); + List libraryPathOptions = new ArrayList<>(); + libraryPathOptions.add("/dummy1/lib"); + libraryPathOptions.add("/dummy2/lib"); + ziggyCppObject.setLibraryPaths(libraryPathOptions); + List libraryOptions = new ArrayList<>(); + libraryOptions.add("hdf5"); + libraryOptions.add("netcdf"); + ziggyCppObject.setLibraries(libraryOptions); + + } +} diff --git a/buildSrc/src/test/resources/gov/nasa/tess/buildutil/processed-jaxb.xsd b/buildSrc/src/test/resources/gov/nasa/tess/buildutil/processed-jaxb.xsd new file mode 100644 index 0000000..6138e05 --- /dev/null +++ b/buildSrc/src/test/resources/gov/nasa/tess/buildutil/processed-jaxb.xsd @@ -0,0 +1,19 @@ + + + + + + + + + + + + + + + + + + + diff --git a/buildSrc/src/test/resources/gov/nasa/tess/buildutil/raw-jaxb-generated.xsd b/buildSrc/src/test/resources/gov/nasa/tess/buildutil/raw-jaxb-generated.xsd new file mode 
100644 index 0000000..ee199fe --- /dev/null +++ b/buildSrc/src/test/resources/gov/nasa/tess/buildutil/raw-jaxb-generated.xsd @@ -0,0 +1,20 @@ + + + + + + + + + + + + + + + + + + + + diff --git a/doc/NPR-7150.2/Ziggy-NPR7150.2C.xlsx b/doc/NPR-7150.2/Ziggy-NPR7150.2C.xlsx new file mode 100644 index 0000000..1b4bd6a Binary files /dev/null and b/doc/NPR-7150.2/Ziggy-NPR7150.2C.xlsx differ diff --git a/doc/build.gradle b/doc/build.gradle new file mode 100644 index 0000000..0933172 --- /dev/null +++ b/doc/build.gradle @@ -0,0 +1,115 @@ +defaultTasks 'build' + +def getDate() { + def date = new Date() + def formattedDate = date.format('yyyyMMdd') + return formattedDate +} + +task cleanestDryRun(type: Exec) { + description = "Removes pdf and .gradle directories (DRY RUN)." + + outputs.upToDateWhen { false } + + workingDir = rootDir + commandLine "sh", "-c", "git clean --force -x -d --dry-run" +} + +task cleanest(type: Exec) { + description = "Removes pdf and .gradle directories." + + outputs.upToDateWhen { false } + + workingDir = rootDir + commandLine "sh", "-c", "git clean --force -x -d" +} + +subprojects { + defaultTasks 'build' + + task build() { + } + + task makeDocId() { + description = "Generates a doc-id.sty file." + + inputs.files fileTree(dir: projectDir, include: '**/*.tex', exclude: '**/build/**').files + outputs.file "$buildDir/doc-id.sty" + + makeDocId.doFirst { + mkdir buildDir + } + + doLast() { + if (!project.hasProperty('docId')) { + return + } + + exec { + workingDir buildDir + commandLine "bash", "-c", "echo -E '\\newcommand{\\DOCID}{$docId}' > doc-id.sty" + } + } + } + + task compileLaTex(dependsOn: makeDocId) { + description = "Compiles the .tex files into a .pdf file." + + inputs.files fileTree(dir: projectDir, include: '**/*.tex', exclude: '**/build/**').files + outputs.files fileTree(dir: buildDir, include: '**/*.pdf').files + + doFirst { + mkdir buildDir + } + + doLast { + if (!project.hasProperty('texFileName')) { + return + } + + // Execute twice to update references and a third time for BibTeX. + 3.times { + exec { + executable 'pdflatex' + workingDir project.workingDir + args '-output-directory=build' + args '-interaction=nonstopmode' + args '-halt-on-error' + args texFileName + } + } + } + } + build.dependsOn compileLaTex + + task publish(dependsOn: build) { + description = "Publishes the .pdf file into the pdf directory." 
+ + inputs.dir buildDir + outputs.files fileTree(rootDir.getPath() + '/pdf').include('**/*-' + getDate() + '.pdf').files + + doFirst() { + mkdir rootDir.getPath() + '/pdf/' + publishDir + } + + doLast() { + if (!project.hasProperty('texFileName') || !project.hasProperty('publishDir') || !project.hasProperty('docId')) { + return + } + + copy { + from(buildDir) { + rename '^(.*).pdf$', docId + '-$1-' + getDate() + '.pdf' + } + into rootDir.getPath() + '/pdf/' + publishDir + include '**/*.pdf' + } + } + } + + task clean() { + doLast() { + delete buildDir + } + } +} diff --git a/doc/legal/NASA-Corporate-CLA.pdf b/doc/legal/NASA-Corporate-CLA.pdf new file mode 100644 index 0000000..2e822fc Binary files /dev/null and b/doc/legal/NASA-Corporate-CLA.pdf differ diff --git a/doc/legal/NASA-Individual-CLA.pdf b/doc/legal/NASA-Individual-CLA.pdf new file mode 100644 index 0000000..2a3adc6 Binary files /dev/null and b/doc/legal/NASA-Individual-CLA.pdf differ diff --git a/doc/section-508/Ziggy-Section-508-Checklist.docx b/doc/section-508/Ziggy-Section-508-Checklist.docx new file mode 100644 index 0000000..1f24f8d Binary files /dev/null and b/doc/section-508/Ziggy-Section-508-Checklist.docx differ diff --git a/doc/settings.gradle b/doc/settings.gradle new file mode 100644 index 0000000..5dff80a --- /dev/null +++ b/doc/settings.gradle @@ -0,0 +1,2 @@ +include 'hdf5-module-interface' +include 'datastore' diff --git a/doc/user-manual/advanced-topics.md b/doc/user-manual/advanced-topics.md new file mode 100644 index 0000000..28085bb --- /dev/null +++ b/doc/user-manual/advanced-topics.md @@ -0,0 +1,73 @@ +## Advanced Topics + +Here's where we get into some more fun stuff. Much of it is connected with the use of the console, so the advanced topics and Ziggy console section are combined. + +### High Performance Computing + +You don't really want to try to process 55 TB of flight data on your laptop, do you? + +[High Performance Computing Overview](select-hpc.md) + +[The Remote Execution Dialog Box](remote-dialog.md) + +[HPC Cost Estimation](hpc-cost.md) + +[Deleting Tasks](delete-tasks.md) + +### Data Receipt + +More about how Ziggy pulls information into the datastore. + +[Data Receipt Execution Flow](data-receipt.md) + +[Data Receipt Display](data-receipt-display.md) + + diff --git a/doc/user-manual/alerts.md b/doc/user-manual/alerts.md new file mode 100644 index 0000000..356356d --- /dev/null +++ b/doc/user-manual/alerts.md @@ -0,0 +1,17 @@ +## Alerts Panel + +Ziggy uses alerts to tell the pipeline operator that something has happened that they ought to know about. Alerts are displayed on the `Alerts` status panel. It looks like this: + +![](images/monitoring-alerts.png) + +There are two flavors of alert that you're likely to see: warnings and errors. Warnings will turn the alerts stoplight yellow, errors turn it red. The alerts panel shows which task generated the alert, when it happened, and a hopefully-useful message. If there are no alerts, the stoplight will be green. + +Sadly, in this case it tells you pretty much what you already knew: task 12 blew up. + +### Acknowledging Alerts + +Once an alert arrives, the stoplight color will stay whatever color is appropriate for that alert (at least it will stay that color until another alert comes in). 
This may not be convenient: once you've dealt with whatever problem caused the alert, you'll want to start running again; at which point you'll want the alert stoplight to be green again so you can see if any new alerts come in, rather than staying yellow or red because of some issue that's already been fixed. To make the stoplight color turn back to green, use the `Ack` button. The alerts will still be shown on the table but the stoplight will return to green. + +### Clearing Alerts + +Alternately, once an alert is addressed you may want to get it completely out of the table of alerts. The `Clear` button will clear all alerts from the display and return the stoplight to green. \ No newline at end of file diff --git a/doc/user-manual/building-pipeline.md b/doc/user-manual/building-pipeline.md new file mode 100644 index 0000000..e6a537f --- /dev/null +++ b/doc/user-manual/building-pipeline.md @@ -0,0 +1,152 @@ +## Building Your Pipeline + +The question of whether your pipeline even needs a build system is one that only you can answer. If all the components of the pipeline are written in interpreted languages (Python, shell script, etc.), you might be able to get away without one! + +In this article, we review the "build system" for the sample pipeline, and in the process offer arguments why some kind of build system is a good idea in general. + +Before we move on, you should make sure that your environment variables `PIPELINE_CONFIG_PATH` and `ZIGGY_ROOT` are set correctly (as a reminder of what that means, take a look at the "Set up the Environment Variables" section of [the article on configuring a pipeline](configuring-pipeline.md)). + +### The Sample Pipeline Directory + +Just in case you haven't looked yet, here's what the sample pipeline directory should look like: + +```console +sample-pipeline$ ls +build-env.sh config data etc multi-data src +sample-pipeline$ +``` + +As we've discussed, the `src` directory is the various bits of source code for the pipeline, `etc` is the location of the pipeline properties file, `config` is the location of the XML files that define the pipeline. The `data` is the initial source of the data files that will be used by the sample pipeline. + +At the top you can see `build-env.sh`, which is the "build system" for the sample pipeline. In this case, the sample pipeline is so simple that none of the grown-up build systems were seen as needed or even desirable; a shell script would do what was needed. + +If you run the shell script from the command line (`./build-env.sh`), you should quickly see something that looks like this: + +```console +sample-pipeline$ /bin/bash ./build-env.sh +Collecting h5py + Using cached h5py-3.7.0-cp38-cp38-macosx_10_9_x86_64.whl (3.2 MB) +Collecting Pillow + Using cached Pillow-9.2.0-cp38-cp38-macosx_10_10_x86_64.whl (3.1 MB) +Collecting numpy + Using cached numpy-1.23.4-cp38-cp38-macosx_10_9_x86_64.whl (18.1 MB) +Installing collected packages: Pillow, numpy, h5py +Successfully installed Pillow-9.2.0 h5py-3.7.0 numpy-1.23.4 +sample-pipeline$ +``` + + + +Meanwhile, the directory now looks like this: + +```console +sample-pipeline$ ls +build build-env.sh config data etc multi-data src +sample-pipeline$ ls build +bin env pipeline-results +sample-pipeline$ +``` + +There's now a `build` directory that contains additional directories: `bin, data-receipt`, and `env`. + +#### The build-env.sh Shell Script + +Let's go through the build-env.sh script in pieces. 
The first piece is the familiar code chunk that sets up some shell variables with paths: + +```bash +# Check for a SAMPLE_PIPELINE_PYTHON_ENV. +if [ -n "$SAMPLE_PIPELINE_PYTHON_ENV" ]; then + if [ -z "$ZIGGY_HOME" ]; then + echo "SAMPLE_PIPELINE_PYTHON_ENV set but ZIGGY_HOME not set!" + exit 1 + fi +else + etc_dir="$(dirname "$PIPELINE_CONFIG_PATH")" + sample_home="$(dirname "$etc_dir")" + ZIGGY_HOME="$(dirname "$sample_home")" + SAMPLE_PIPELINE_PYTHON_ENV=$sample_home/build/env +fi +``` + +Next: + +```bash +# put the build directory next to the env directory in the directory tree +BUILD_DIR="$(dirname "$SAMPLE_PIPELINE_PYTHON_ENV")" +mkdir -p $SAMPLE_PIPELINE_PYTHON_ENV + +# Create and populate the data receipt directory from the sample data +DATA_RECEIPT_DIR=$BUILD_DIR/data-receipt +mkdir -p $DATA_RECEIPT_DIR +cp $sample_home/data/* $DATA_RECEIPT_DIR +``` + +Here we create the `build` directory and its `env` and `data-receipt` directories. The contents of the data directory from the sample directory gets copied to `data-receipt`. + +```bash +# build the bin directory in build +BIN_DIR=$BUILD_DIR/bin +mkdir -p $BIN_DIR +BIN_SRC_DIR=$sample_home/src/main/sh + +# Copy the shell scripts from src to build. There's probably some good shell script +# way to do this, but I'm too lazy. +cp $BIN_SRC_DIR/permuter.sh $BIN_DIR/permuter +cp $BIN_SRC_DIR/flip.sh $BIN_DIR/flip +cp $BIN_SRC_DIR/averaging.sh $BIN_DIR/averaging +chmod -R a+x $BIN_DIR +``` + +Here we construct `build/bin` and copy the shell scripts from `src/main/sh` to `build/bin`. In the process, we strip off the `.sh` suffixes. The shell script copies in `build/bin` now match what Ziggy expects to see. + +```bash +python3 -m venv $SAMPLE_PIPELINE_PYTHON_ENV + +# We're about to activate the environment, so we should make sure that the environment +# gets deactivated at the end of script execution. +trap 'deactivate' EXIT + +source $SAMPLE_PIPELINE_PYTHON_ENV/bin/activate + +# Build the environment with the needed packages. +pip3 install h5py Pillow numpy + +# Get the location of the environment's site packages directory +SITE_PKGS=$(python3 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") + +# Copy the pipeline major_tom package to the site-packages location. +cp -r $ZIGGY_HOME/sample-pipeline/src/main/python/major_tom $SITE_PKGS + +# Copy the Ziggy components to the site-packages location. +cp -r $ZIGGY_HOME/src/main/python/hdf5mi $SITE_PKGS +cp -r $ZIGGY_HOME/src/main/python/zigutils $SITE_PKGS + +exit 0 +``` + +Here at last we build the Python environment that will be used for the sample pipeline. The environment is built in `build/env`, and the packages `h5py`, `Pillow`, and `numpy` are installed in the environment. Finally, the sample pipeline source code and the Ziggy utility modules are copied to the `site-packages` directory of the environment, so that they will be on Python's search path when the environment is activated. + +### Okay, but Why? + +Setting up the `build` directory this way has a number of advantages. First and foremost, everything that Ziggy will eventually need to find is someplace in `build`. If your experience with computers is anything like mine, you know that 90% of what we spend our time doing is figuring out why various search paths aren't set correctly. Putting everything you need in one directory minimizes this issue. + +The second advantage is related: if you decide that you need to clean up and start over, you can simply delete the `build` directory. 
You don't need to go all over the place looking for directories and deleting them. Indeed, most grown up build systems will include a command that will automatically perform this deletion. + +### Why a Build System? + +If you look at build-env.sh, you'll see that it's relatively simple: just 80 lines, including comments and whitespace lines. Nonetheless, you wouldn't want to have to type all this in every time you want to generate the `build` directory! Having a build system -- any build system -- allows you to ensure that the build is performed reproducibly -- every build is the same as all the ones before. It allows you to implement changes to the build in a systematic way. + +### Which Build System? + +Good question! There are a lot of them out there. + +For simple systems, good old `make` is a solid choice. Support for `make` is nearly universal, and most computers ship with some version of it already loaded so you won't even need to install it yourself. + +For more complex systems, we're enamored of Gradle. The secret truth of build systems is that a build is actually a program: it describes the steps that need to be taken, their order, and provides conditionals of various kinds (like, "if the target is up-to-date, skip this step"). Gradle embraces this secret truth and runs with it, which makes it in many ways easier to use for bigger systems than `make`, with its frequently bizarre use of symbols and punctuation to represent relationships within the software. + +### Postscript + +Note that while going through all this exposition, we've also sneakily built the sample pipeline! This positions us for the next (exciting) step: [running the pipeline](running-pipeline.md)! + +Unfortunately, before you get there, we'll need to talk some about [relational databases](rdbms.md). + diff --git a/doc/user-manual/change-param-values.md b/doc/user-manual/change-param-values.md new file mode 100644 index 0000000..ae5a6b4 --- /dev/null +++ b/doc/user-manual/change-param-values.md @@ -0,0 +1,28 @@ +## Changing Module Parameter Values + +To see how this works, go back to the `Configuration` tab and select the `Parameter Library` option from the menu on the left hand side. You'll see this: + +![](images/parameter-library.png) + +As you can see, all of the parameter sets defined in `pl-sample.xml` are represented here. What else does the table tell us? + +- The `Type` column is the name of the Java class that supports the module parameter set. For now you can ignore this. +- The `Version` column shows the current version of the parameter set. They all show zero because none of the parameter sets has been modified since they were imported from `pl-sample.xml`. +- The `Locked` column shows whether the current version of each parameter set is locked. What that means is this: before a version of a parameter set is used in the pipeline, it's unlocked, and the user can make changes to it; once the version has been used in processing, that version becomes locked, and any changes the user makes will create a new version (that is unused, hence unlocked). The versioning and locking features allow Ziggy to preserve a permanent record of the parameters used in each instance of each pipeline. + +Now: double-click the Algorithm Parameters row in the table. You'll get a new dialog box: + + + +The parameters that were defined as booleans in `pl-sample.xml` have check boxes you can check or uncheck. The other parameter types mostly behave the way you expect, but the array types offer some additional capabilities. 
If you click the `dummy array parameter` parameter, it will change thusly: + + + +If you click the "X", all the values will be deleted, which is rarely what you want. Instead click the other button. You'll get this window: + + + +This allows you to edit the array elements, remove them, add elements, etc., in a more GUI-natural way. Go ahead and change the second element (`idx` of 1) to 4. Click `ok`, then on the Edit Parameter Set dialog click `Save`. The `Version` for Algorithm Parameters will now be set to 1, and the `Locked` checkbox is unchecked. + +If you were to now run the sample pipeline, when you returned to the parameter library window, version 1 of `Algorithm Parameters` will show as locked. + diff --git a/doc/user-manual/configuring-pipeline.md b/doc/user-manual/configuring-pipeline.md new file mode 100644 index 0000000..9fd2bd8 --- /dev/null +++ b/doc/user-manual/configuring-pipeline.md @@ -0,0 +1,241 @@ +## Configuring a Pipeline + +In this article, we'll walk through the process by which you can write your own pipeline and connect it to Ziggy. As we do so, we'll show how the sample pipeline addresses each of the steps, so you can see a concrete example. For this reason, it's probably worthwhile to have the sample-pipeline folder open as we go along (though we'll make use of screen shots, so it's not absolutely essential to have that open; just recommended). + +It also might be worthwhile to take open the [article on pipeline architecture](pipeline-architecture.md) in a separate window, as we'll be referring to it below. + +### Write the Algorithm Software + +At the heart of your pipeline are the algorithm packages that process the data and generate the results; on the architecture diagram, it's the big green "Algorithms" box on the bottom. On the one hand, we can't help you much with this -- only you know what you want your pipeline to do! On the other hand, Ziggy doesn't really put any particular requirements on how you do this. You can write what you want, the way you want it, in the language you want. At the moment, Ziggy has especially good support for C++, Java, MATLAB, and Python as algorithm languages, but really, it can be anything! + +In the sample pipeline, the algorithm code is in sample-pipeline/src/main/python/major_tom/major_tom.py; with a little luck, [this link](../../sample-pipeline/src/main/python/major_tom/major_tom.py) will open the file for you! There are 4 algorithm functions, each of which does some rather dopey image processing on PNG images: one of them permutes the color maps, one performs a left-right flip, one does an up-down flip, and one averages together a collection of PNG files. They aren't written particularly well, and I can't advocate for using them as an example of how to write Python code, but the point is that they don't do anything in particular to be usable in Ziggy. + +#### Pipeline Design + +That said, when you write your pipeline, there are a number of design issues that you must implicitly address: + +- What steps will the pipeline perform, and in what order? +- What will be the file name conventions for the inputs and outputs of each step? +- What additional information will each step need: instrument models, parameters, etc. + +The reason I bring this up is that these are the things that you'll need to teach to Ziggy so it knows how to run your pipeline for you. We'll get into that in the next few sections. + +### Write the Pipeline Configuration Files + +The issues described above are collectively the "pipeline configuration." 
This is represented on the architecture diagram by the green box in the upper left, "Pipeline Configuration (XML)." As advertised, Ziggy uses a set of XML files to define the pipeline steps, data types, etc. In the interest of this article not being longer than *Dune*, we're going to cover each of them in its own article: + +[Module Parameters](module-parameters.md) + +[Data File Types](data-file-types.md) + +[Pipeline Definition](pipeline-definition.md) + +### Write the "Glue" Code between Algorithms and Ziggy + +When writing the algorithm software, we asserted that there were no particular requirements on how the algorithms were written, which is true. However, it's also true that inevitably there has to be a certain amount of coding to the expectations of Ziggy, and this is what we mean when we talk about the "glue" code. "Glue" code is the code that is called directly by Ziggy and which then calls the algorithm code. For physicists and electrical engineers, you can think of this as providing an impedance match between Ziggy and the algorithms. + +Ziggy has really 3 requirements for the code it calls: + +1. The code has to be callable as a single executable with no arguments (i.e., it has to be something that could run at the command line). +2. The code should be effectively unable to error out. What this means is that any non-trivial content should be in a try / catch or try / except block. +3. The code has to return a value of 0 for successful execution, any other integer value for failure. + +There's also a "desirement:" in the event that the algorithm code fails, Ziggy would like a stack trace to be provided in a particular format. + +The good news here is that Ziggy will provide tools that make it easy to accomplish the items above. Also, Ziggy's "contract" with the algorithm code is as follows: + +1. The data files and instrument model files needed as input will be provided in the working directory that the algorithm uses (so you don't need to worry about search paths for these inputs). +2. Results files can be written to the working directory (so you don't need to worry about sending them someplace special). +3. Ziggy will provide a file in the working directory that specifies the names of all data files, the names of all model files, and the contents of all parameter sets needed by the algorithm. Thus it is not necessary for the "glue" code to hunt around in the working directory looking for files with particular name conventions; just open the file that defines the inputs and read its contents. + +#### "Glue" Code in the Sample Pipeline + +In the case of the sample pipeline, the "glue" code is actually written in 2 pieces: each algorithm module has 1 Python script plus 1 shell script that calls Python and gives it the name of the Python script to execute. We'll examine each of these in turn. + +##### Python-Side "Glue" Code + +For the purposes of this discussion we'll use the code that calls the `permute_colors` Python function: in the Python source directory (src/main/python/major_tom), it's [permuter.py](../../sample-pipeline/src/main/python/major_tom/permuter.py). This is the Python code that performs some housekeeping and then calls the permuter function in major_tom.py. 
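+
+Before walking through the real thing piece by piece, it may help to see the whole pattern in one place. The following is a minimal sketch of a glue script for an imaginary module named `widget`: the module name, the `widget_algorithms` package, and the `process_widget` function are invented for illustration, while the Ziggy helper imports and the inputs-file naming convention are the same ones you will see in `permuter.py` below.
+
+```python
+# Sketch of a hypothetical glue script; widget_algorithms and process_widget
+# stand in for your own algorithm code and are not part of the sample pipeline.
+from zigutils.stacktrace import ZiggyErrorWriter
+from hdf5mi.hdf5 import Hdf5ModuleInterface
+from zigutils.pidfile import write_pid_file
+from widget_algorithms import process_widget
+
+hdf5_module_interface = Hdf5ModuleInterface()
+
+if __name__ == '__main__':
+    try:
+        # Leave a hidden file containing this process's PID in the working
+        # directory; it can be useful later as a diagnostic.
+        write_pid_file()
+
+        # Ziggy puts everything in the working directory, including a single
+        # HDF5 file (module name plus "-inputs-0.h5") that lists the data
+        # files, the model files, and the module parameters for this task.
+        inputs = hdf5_module_interface.read_file("widget-inputs-0.h5")
+
+        # Hand the algorithm what it needs; it writes its results back into
+        # the working directory, which is where Ziggy expects to find them.
+        # Parameter sets appear as subfields of inputs.moduleParameters.
+        process_widget(inputs.dataFilenames, inputs.modelFilenames,
+            inputs.moduleParameters)
+
+        # Requirement 3: return 0 for success...
+        exit(0)
+    except Exception:
+        # ...and a nonzero value (plus a stack trace file) for failure.
+        ZiggyErrorWriter()
+        exit(1)
+```
+
+The real `permuter.py`, examined below, has exactly this shape, plus a little extra bookkeeping for the parameters it uses to demonstrate some of Ziggy's features.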
+ +The `permuter.py` module starts off with the usual collection of import statements: + +```Python +from zigutils.stacktrace import ZiggyErrorWriter +from hdf5mi.hdf5 import Hdf5ModuleInterface +from zigutils.pidfile import write_pid_file +from major_tom import permute_color +``` + +These are the Ziggy-specific imports (the other imports are standard Python packages and modules). The first, `ZiggyErrorWriter`, provides the class that writes the stack trace in the event of a failure; `Hdf5ModuleInterface` provides a specialized HDF5 module that reads the file with Ziggy's input information and writes the stack trace; `write_pid_file` writes to the working directory a hidden file that contains the process ID for the Python process that runs the algorithm, which is potentially useful later as a diagnostic; `permute_color` is the name of the algorithm function that `permuter.py` will run. + +The next block of code looks like this: + +```python +# define an instance of the HDF5 read/write class as a global variable. +hdf5_module_interface = Hdf5ModuleInterface() + +if __name__ == '__main__': + try: + + # generate the process ID (PID) file in the working directory. + write_pid_file() + + # Read inputs: note that the inputs contain the names of all the files + # that are to be used in this process, as well as model names and + # parameters. All files are in the working directory. + inputs = hdf5_module_interface.read_file("permuter-inputs-0.h5") + data_file = inputs.dataFilenames + parameters = inputs.moduleParameters.Algorithm_Parameters + models = inputs.modelFilenames + +``` + +The main thing of interest here is that the HDF5 file with the inputs information is opened and read. The file name will always be the module name followed by `-inputs-0.h5`. The data object that's read from that file provides the data file names as a Python list in the `.filenames` field; all the module parameters in the `.moduleParameters` field; and the names of the models as a Python list in the `.modelFilenames` field. Note that, as described in [the article on parameter sets](module-parameters.md), the module parameter set named `Algorithm Parameters` in the parameters XML file is renamed to `Algorithm_Parameters` here. + +Next: + +```Python + # Handle the parameter values that can cause an error or cause + # execution to complete without generating output + dir_name = os.path.basename(os.getcwd()) + if dir_name == "st-0": + throw_exception = parameters.throw_exception_subtask_0 + else: + throw_exception = False + + if dir_name == "st-1": + produce_output = parameters.produce_output_subtask_1 + else: + produce_output = True +``` + +The main thing that's interesting here is that it shows how to access the individual parameters within a parameter set. You may well ask: What is this code actually doing with the parameters? Well, it's setting up to allow a demonstration of some of Ziggy's features later when we run the pipeline. For now, just focus on the fact that the parameters are subfields of the parameter struct, and that whitespace in the parameter names has been turned to underscores. + +```python + # Run the color permuter. The permute_color function will produce + # the output with the correct filename. + permute_color(data_file, throw_exception, produce_output) +``` + +Here the actual algorithm code is called. Thus we see that the Python-side "glue" has taken the information from Ziggy and reorganized it for use by the algorithm code. 
+ +In this case the algorithm code doesn't return anything because it writes its outputs directly to the working directory. This is a choice, but not the only one: it would also be allowed for the algorithm code to return results, and for the "glue" code to perform some additional operations on them and write them to the working directory. + +Anyway, moving on to the last chunk of the Python-side "glue" code, we see this: + +```python + # Sleep for a user-specified interval. This is here just so the + # user can watch execution run on the pipeline console. + time.sleep(parameters.execution_pause_seconds) + exit(0) + + except Exception: + ZiggyErrorWriter() + exit(1) +``` + +The first part of this block does something you definitely won't want to do in real life: it uses a module parameter to insert a pause in execution! We do this here for demonstration purposes only: the algorithms in the sample pipeline are so simple that they run instantaneously, but in real life that won't happen; so this slows down the execution so you can watch what's happening and get more of a feel for what real life will be like. + +Once execution completes successfully, the "glue" returns a value of 0. If an exception occurs at any point in all the foregoing, we jump to the `except` block: the stack trace is written and the "glue" returns 1. + +The foregoing is all the Python "glue" code needed by an algorithm. The equivalent code in MATLAB or Java or C++ is about equally simple. + +###### Digression: Do We Really Need In-Language "Glue" Code? + +Not really. In principle, all of the "glue" code, above, could have been included in the algorithm function itself. So why do it this way? + +It's really a combination of convenience and division of labor. By separating the stuff that Ziggy needs into one file and the algorithm itself into another, the subject matter experts are free to develop the algorithm as they see fit and without needing to worry about how it will fit into the pipeline. This also makes it easier for the algorithm code to run in a standalone or interactive way. This is especially helpful because the algorithm code is usually written, or at least prototyped, before the pipeline integration is performed. For example: in this case, all of the Python algorithm code in the sample pipeline was developed and debugged by running Python interactively; thus I provided an interface to each function that was optimal for interactive use (in this case, just send the name of the file you want to process as an argument). Once I was happy with the algorithm code, I wrote the Python "glue." + +This also means that the algorithm packages can still be run interactively, which is generally useful. If the algorithm functions also had all of the Ziggy argle-bargle in them, it wouldn't be possible to run the algorithm outside of the context of either Ziggy itself or else an environment that emulates Ziggy. + +Like I say, this isn't the only way to write a pipeline; but over time we've found that something like this has been the best way to do business. + +##### Shell Script "Glue" Code + +The shell script that provides the connection between permuter.py and Ziggy is in src/main/sh: [permuter.sh](../../sample-pipeline/src/main/sh/permuter.sh). The script can be seen as having 2 blocks of code. Here's the first: + +```bash +# Check for a SAMPLE_PIPELINE_PYTHON_ENV.
+if [ -n "$SAMPLE_PIPELINE_PYTHON_ENV" ]; then + if [ -z "$ZIGGY_HOME" ]; then + echo "SAMPLE_PIPELINE_PYTHON_ENV set but ZIGGY_HOME not set!" + exit 1 + fi +else + etc_dir="$(dirname "$PIPELINE_CONFIG_PATH")" + sample_home="$(dirname "$etc_dir")" + ZIGGY_HOME="$(dirname "$sample_home")" + SAMPLE_PIPELINE_PYTHON_ENV=$sample_home/build/env +fi +``` + +All this really does is define 2 variables: `ZIGGY_HOME`, the location of the Ziggy main directory; and `SAMPLE_PIPELINE_PYTHON_ENV`, the location of the Python environment for use in running the pipeline. + +The next block does all the actual work: + +```bash +# We're about to activate the environment, so we should make sure that the environment +# gets deactivated at the end of script execution. +trap 'deactivate' EXIT + +source $SAMPLE_PIPELINE_PYTHON_ENV/bin/activate + +# Get the location of the environment's site packages directory +SITE_PKGS=$(python3 -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") + +# Use the environment's Python to run the permuter Python script +python3 $SITE_PKGS/major_tom/permuter.py + +# capture the Python exit code and pass it to the caller as the script's exit code +exit $? +``` + +Again, really simple: activate the environment; find the location of the environment's site-packages directory; run `permuter.py` in the Python copy in the environment; return the exit code from `permuter.py` as the exit code from the shell script. The `trap` statement at the top is the way that shell scripts implement a kind of try-catch mentality: it says that when the script exits, no matter the reason or the condition of the exit, deactivate the Python environment on the way out the door. + +All of the above has focused on the permuter algorithm, but the same pattern of "glue" files is used for the flip and averaging algorithms. + +### Set up the Properties File + +As you can probably imagine, Ziggy actually uses a lot of configuration items: it needs to know numerous paths around your file system, which relational database application you want to use, how much heap space to provide to Ziggy, and on and on. All of this stuff is put into two locations for Ziggy: the pipeline properties file and the Ziggy properties file. + +For the sample pipeline, the pipeline properties file is [etc/sample.properties](../../sample-pipeline/etc/sample.properties). It uses a fairly standard name-value pair formalism, with capabilities for using property values or environment variables as elements of other properties. + +In real life, you would want the working properties file to be outside of the directories managed by the version control system. This allows you to modify the file without fear that you will accidentally push your changes back to the repository's origin! When you set up your own pipeline, we suggest that you start by copying the [pipeline.properties.EXAMPLE file](../../etc/pipeline.properties.EXAMPLE) to someplace outside of the Git-controlled directories, renaming it, and modifying it so that it suits your needs. For our purposes, though, we've put together a pipeline properties file that you can use without modification, so feel free to just leave it where it is. + +Meanwhile, the Ziggy properties file is [etc/ziggy.properties](../../etc/ziggy.properties), which is in the etc subdirectory of the main Ziggy directory.
The properties here are things that you are unlikely to ever need to change, but which Ziggy needs. + +The properties file is a sufficiently important topic that it has its own separate article. See the article on [The Properties File](properties.md) for discussion of all the various properties in the pipeline properties file. + +### Set up the Environment Variables + +In a normal pipeline, you will need to set up only one environment variable: the variable `PIPELINE_CONFIG_PATH`, which has as its value the absolute path to the pipeline properties file. Ziggy can then use the pipeline properties file to get all its configuration parameters. + +For the sample pipeline, it was necessary to add a second environment variable: `ZIGGY_ROOT`, which is set to the absolute path to the top-level Ziggy directory. Why was this necessary? + +Under normal circumstances, the user would set the values of the path properties in the properties file based on their own arrangement of the file system, the location of the Ziggy directory, etc. All these things are known to the user, so the user can put all that path information into the pipeline properties file. + +In the case of the sample pipeline, we wanted to provide a properties file that would work for the end user, but we don't know anything about any end user's file system organization. We don't even know your username on your local system! So how do we make a properties file that will work for you without modification? + +Answer: the sample properties file sets all its paths relative to the top-level Ziggy directory. Of course, we here at Ziggy World Headquarters don't know where that directory is, either; it's unknown for the same reason that the user's username, file system arrangement, etc., are unknown. Thus, the user has to provide that information in the form of a second environment variable. And here we are. + +Anyway: set those environment variables now, before going any further. + +#### What About the Ziggy Properties File? + +The sample properties file contains a property that is the location of the Ziggy properties file. Thus there's no need to have a separate environment variable for that information. Like we said, to the extent possible we've put everything configuration related into the pipeline properties file. + +### And That's It + +Well, in fact we've covered quite a lot of material here! But once you've reached this point, you've covered everything that's needed to set up your own data analysis pipeline and connect it to Ziggy. + +That said, if you're paying attention you've probably noticed that this article ignored some issues, or at least posed some mysteries: + +- The shell script for the permuter module is `permuter.sh`, but the module name is `permuter`. Why isn't the module name `permuter.sh`? +- You activate a Python environment, but -- where did that environment come from? What's in it? + +These questions will be discussed in the article on [Building a Pipeline](building-pipeline.md). diff --git a/doc/user-manual/data-file-types.md b/doc/user-manual/data-file-types.md new file mode 100644 index 0000000..bfa61b3 --- /dev/null +++ b/doc/user-manual/data-file-types.md @@ -0,0 +1,131 @@ +## Data File Types + +As the user, one of your jobs is to define, for Ziggy, the file naming patterns that are used for the inputs and outputs for each algorithm, and the file name patterns that are used for instrument models. The place for these definitions is in data file type XML files.
These have names that start with "pt-" (for "Pipeline Data Type"); in the sample pipeline, the data file type definitions are in [config/pt-sample.xml](../../sample-pipeline/conf/pt-sample.xml). + +Note that when we talk about data file types, we're not talking about data file formats (like HDF5 or geoTIFF). Ziggy doesn't care about data file formats; use whatever you like, as long as the algorithm software can read and write that format. + +### The Datastore and the Task Directory + +Before we get too deeply into the data file type definitions, we need to have a brief discussion about two directories that Ziggy uses: the datastore, on the one hand, and the task directories, on the other. + +#### The Datastore + +"Datastore" here is just a $10 word for an organized directory tree where Ziggy keeps the permanent copies of its various kinds of data files. Files from the datastore are provided as inputs to the algorithm modules; when the modules produce results, those outputs are transferred back to the datastore. + +Who defines the organization of the datastore? You do! The organization is implicitly defined when you define the data file types that go into, and come out of, the datastore. This will become clear in a few paragraphs (at least I hope it's clear). + +#### The Task Directory + +Each processing activity has its own directory, known as the "task directory." The task directory is where the algorithm modules look to find the files they operate on, and it's where they put the files they produce as results. Unlike the datastore, these directories are transient; once processing is complete, you can feel free to delete them at some convenient time. In addition, there are some other uses that benefit from the task directory. First, troubleshooting. In the event that a processing activity fails, you have in one place all the inputs that the activity uses, so it's easy to inspect files, watch execution, etc. In fact, you can even copy the task directory to some other system (say, your laptop) if that is a more convenient place to do the troubleshooting! Second, and relatedly, the algorithm modules are allowed to write files to the task directory that aren't intended to be persisted in the datastore. This means that the task directory is a logical place to put files that are used for diagnostics or troubleshooting or some other purpose, but which you don't want to save for posterity in the datastore. + +#### And My Point Is? + +The key point is this: the datastore can be, and generally is, heirarchical; the task directory is flat. Files have to move back and forth between these two locations. The implications of this are twofold. First, **the filenames used in the datastore and the task directory generally can't be the same.** You can see why: because the datastore is heirarchical, two files that sit in different directories can have the same name. If those two files are both copied to the task directory, one of them will overwrite the other unless the names are changed when the files go to the task directory. + +Second, and relatedly, **the user has to provide Ziggy with some means of mapping the two filenames to one another.** Sorry about that; but the organization of the datastore is a great power, and with great power comes great responsibility. + +### Mission Data + +Okay, with all that throat-clearing out of the way, let's take a look at some sample data file type definitions. + +```xml + + + +``` + +Each data file type has a name, and that name can have whitespace in it. 
That much makes sense. + +#### fileNameRegexForTaskDir + +This is how we define the file name that's used in the task directory. This is a [Java regular expression](https://docs.oracle.com/en/java/javase/12/docs/api/java.base/java/util/regex/Pattern.html) (regex) that the file has to conform to. For `raw data`, for example, a name like `some-kinda-name-set-1-file-9.png` would conform to this regular expression, as would `another_NAME-set-4-file-3.png`, etc. + +#### fileNameWithSubstitutionsForDatastore + +Remember that the task directory is a flat directory, while the datastore can be heirarchical. This means that each part of the path to the file in the datastore has to be available somewhere in the task directory name, and vice-versa, so that the two can map to each other. + +In the `fileNameWithSubstitutionsForDatastore`, we accomplish this mapping. The way that this is done is that each "group" (one of the things in parentheses) is represented with $ followed by the group number. Groups are numbered from left to right in the file name regex, starting from 1 (group 0 is the entire expression). In raw data, we see a value of `$2/L0/$1-$3.png`. This means that group 2 is used as the name of the directory under the datastore root; `L0` is the name of the next directory down; and groups 1 and 3 are used to form the filename. Thus, `some-kinda-name-set-1-file-9.png` in the task directory would translate to `set-1/L0/some-kinda-name-file-9.png` in the datastore. + +Looking at the example XML code above, you can (hopefully) see what we said about how you would be organizing the datastore. From the example, we see that the directories immediately under the datastore root will be `set-0, set-1`, etc. Each of those directories will then have, under it, an `L0` directory and an `L1` directory. Each of those directories will then contain PNG files. + +Notice also that the filenames of `raw data` files and `permuted colors` files in the datastore can potentially be the same! This is allowed because the `fileNameWithSubstitutionsForDatastore` values show that the files are in different locations in the datastore, and the `fileNameRegexForTaskDir` values show that their names in the task directory will be different, even though their names in the datastore are the same. + +### Instrument Model Types + +Before we can get into this file type definition, we need to answer a question: + +#### What is an Instrument Model, Anyway? + +Instrument models are various kinds of information that are needed to process the data. These can be things like calibration constants; the location in space or on the ground that the instrument was looking at when the data was taken; the timestamp that goes with the data; etc. + +Generally, instrument models aren't the data that the instrument acquired (that's the mission data, see above). This is information that is acquired in some other way that describes the instrument properties. Like mission data, instrument models can use any file format that the algorithm modules can read. + +#### Instrument Model Type Definition + +Here's our sample instrument model type definition: + +​ `` + +As with the data file types, model types are identified by a string (in this case, the `type` attribute) that can contain whitespace, and provides a regular expression that can be used to determine whether any particular file is a model of the specified type. In this case, in a fit of no-imagination, the regex is simply a fixed name of `sample-model.txt`. 
Thus, any processing algorithm that needs the `dummy model` will expect to find a file named `sample-model.txt` in its task directory. + +#### Wait, is That It? + +Sadly, no. Let's talk about model names and how they fit into all of this. + +##### Datastore Model Names + +Ziggy permanently stores every model of every kind that is imported into it. This is necessary because someday you may need to figure out what model was used for a particular processing activity, but on the other hand it may be necessary to change the model as time passes -- either because the instrument itself changes with time, or because your knowledge of the instrument changes (hopefully it improves). + +But -- in the example above, the file name "regex" is a fixed string! This means that the only file name that Ziggy can possibly recognize as an instance of `dummy model` is `sample-model.txt`. So when I import a new version of `sample-model.txt` into the datastore, what happens? To answer that, let's take a look at the `dummy model` subdirectory of the `models` directory in the datastore: + +```console +models$ ls "dummy\ model" +2022-10-31.0001-sample-model.txt +models$ +``` + +(Yes, I broke my own strongly-worded caution against using whitespace in names, and in a place where it matters a lot -- a directory name! Consistency, hobgoblins, etc.) + +As you can see, the name of the model in the datastore isn't simply `sample-model.txt`. It's had the date of import prepended, along with a version number. By making these changes to the name, Ziggy can store as many versions of a model as it needs to, even if the versions all have the same name at the time of the import. + +##### Task Directory Model Names + +Ziggy also maintains a record of the name the model file had at the time of import. When the model is provided to the task directory so the algorithms can use it, this original name is restored. This way, the user never needs to worry about Ziggy's internal renaming conventions; the algorithms can use whatever naming conventions the mission uses for the model files, even if the mission reuses the same name over and over again. + +##### Which Version is Sent to the Algorithms? + +The most recent version of each model is the one provided to the algorithms at runtime. If there were 9 different models in `dummy model`, the one with version number `0009` would be the one that is copied to the task directories. If, some time later, a tenth version was imported, then all subsequent processing would use version `0010`. + +##### What Happens if the Actual Model Changes? + +Excellent question! Imagine that, at some point in time, one or more models change -- not your knowledge of them, the actual, physical properties of your instrument change. Obviously you need to put a new model into the system to represent the new properties of the instrument. But equally obviously, if you ever go back and reprocess data taken prior to the change, you need to use the model that was valid at that time. How does Ziggy handle that? + +Answer: Ziggy always, *always* provides the most recent version of the model file. If you go and reprocess, the new processing will get the latest model. 
In order to properly represent a model that changes with time, **the changes across time must be reflected in the most recent model file!** Also, and relatedly, **the algorithm code must be able to pull model for the correct era out of the model file!** + +In practice, that might mean that your model file contains multiple sets of information, each of which has a datestamp; the algorithm would then go through the file contents to find the set of information with the correct datestamp, and use it. Or, it might mean that the "model" is values measured at discrete times that need to be interpolated by the algorithm. How the time-varying information is provided in the model file is up to you, but if you want to have a model that does change in time, this is how you have to do it. + +##### Model Names with Version Information + +The above example is kind of unrealistic because in real life, a mission that provides models that get updated will want to put version information into the file name; if for no other reason than so that when there's a problem and we need to talk about a particular model version, we can refer to the one we're concerned about without any confusion ("Is there a problem with sample model?" "Uh, which version of sample model?" "Dunno, it's just called sample model."). Thus, the file name might contain a timestamp, a version number, or both. + +If the model name already has this information, it would be silly for Ziggy to prepend its own versioning; it should use whatever the mission provides. Fortunately, this capability is provided: + +```xml + +``` + +In this case, the XML attribute `versionNumberGroup` tells Ziggy which regex group it should use as the version number, and the attribute `timestampGroup` tells it which to use as the file's timestamp. When Ziggy stores this model in the `versioned-model` directory, it won't rename the file; it will keep the original file name, because the original name already has a timestamp and a version number. + +In general, the user can include in the filename a version number; a timestamp; or both; or neither. Whatever the user leaves out, Ziggy will add to the filename for internal storage, and then remove again when providing the file to the algorithms. + +##### Models Never Get Overwritten in the Datastore + +One thing about supplying timestamp and version information in the filename is that it gives some additional protection against accidents. **Specifically: Ziggy will never import a model that has the same timestamp and version number as one already in the datastore.** Thus, you can never accidentally overwrite an existing model with a new one that's been accidentally given the same timestamp and version information. + +For models that don't provide that information in the filename, there's no protection against such an accident because there can't be any such protection. If you accidentally re-import an old version of `sample-model.txt`, Ziggy will assume it's a new version and store it with a new timestamp and version number. When Ziggy goes to process data, this version will be provided to the algorithms. diff --git a/doc/user-manual/data-receipt-display.md b/doc/user-manual/data-receipt-display.md new file mode 100644 index 0000000..e7ea57f --- /dev/null +++ b/doc/user-manual/data-receipt-display.md @@ -0,0 +1,11 @@ +## Data Receipt Display + +The console has the ability to display data receipt activities. From the `Configuration` tab, expand the `Data Receipt` folder and select `Available Datasets`. 
You'll see something like this: + +![](images/data-receipt-display.png) + +Double-clicking a row in the table brings up a display of all the files in the dataset: + +![](images/data-receipt-list.png) + +Note that the file names are the datastore names. \ No newline at end of file diff --git a/doc/user-manual/data-receipt.md b/doc/user-manual/data-receipt.md new file mode 100644 index 0000000..9ae1be2 --- /dev/null +++ b/doc/user-manual/data-receipt.md @@ -0,0 +1,157 @@ +## Data Receipt Execution Flow + +Remember data receipt? Here's where we get into how it works. You'll want to know this when you're setting up data transfers in your own mission. + +### Data Files and Manifest + +The sample pipeline's data receipt directory uses a copy of the files from the `data` subdirectory in the `sample-pipeline` main directory. Let's take a look at that directory now: + +```console +sample-pipeline$ ls data +nasa_logo-set-1-file-0.png +nasa_logo-set-1-file-3.png +nasa_logo-set-2-file-2.png +sample-pipeline-manifest.xml +nasa_logo-set-1-file-1.png +nasa_logo-set-2-file-0.png +nasa_logo-set-2-file-3.png +nasa_logo-set-1-file-2.png +nasa_logo-set-2-file-1.png +sample-model.txt +sample-pipeline$ +``` + +Most of these files are obviously the files that get imported. But what about the manifest? Here's the contents of the manifest: + +```xml + + + + + + + + + + + + +``` + +Some parts of this are obvious: the number of files in the delivery, the fact that every file has an entry in the manifest, every file's size is listed in the manifest. + +The `datasetId` is a unique identifier for a data delivery. This serves two purposes: + +1. It prevents you from re-importing the same files multiple times, as the `datasetId` from each successful import is saved and new imports are checked against them (exception: `datasetId` 0 can be reused). +2. When there's a problem with the delivery and you need to work the issue with whoever sent it to you, it lets you refer to exactly which one failed: "Yeah, uh, looks like dataset 12345 won't import. Got a minute?" + +The `checksumType` is the name of the algorithm that's used to generate a checksum for each file in the manifest. In this example we're using SHA1, which is a reasonable balance between calculation speed, checksum size, and checksum quality (anyway, we're not using these SHA1 hashes for secure communication, we're just using them to make sure the files didn't get corrupted). + +Each file has a `checksum` that's computed using the specified `checksumType`. + +### The Data Receipt Directory + +Data receipt needs to have a directory that's used as the source for files that get pulled into the datastore. There's a [property in the properties file](properties.md) that specifies this, namely `data.receipt.dir`. Ziggy allows this directory to be used in either of two ways. + +#### Files in the Data Receipt Directory + +Option 1 is for all the files, and the manifest, to be in the data receipt directory. In this case, the data receipt pipeline node will produce 1 task. + +#### Files in Data Receipt Subdirectories + +Option 2 is that there are no data files or manifests in the top-level data receipt directory. Instead, there are subdirectories within data receipt, each of which contains files for import and a manifest. In this case, data receipt will create a pipeline task per subdirectory. + +### What Data Receipt Does + +Here's the steps data receipt takes, in order. 
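+Before going through them one at a time, here's a rough sketch of the core check that the manifest makes possible: comparing each transferred file's size and SHA-1 checksum against its manifest entry. This is purely illustrative -- it's not Ziggy's actual validation code, and the file name and expected values in the usage comment are hypothetical.
+
+```python
+import hashlib
+import os
+
+def file_matches_manifest(path, expected_size, expected_sha1):
+    """Return True if a transferred file has the size and SHA-1 checksum listed in the manifest."""
+    if os.path.getsize(path) != expected_size:
+        return False
+    sha1 = hashlib.sha1()
+    with open(path, "rb") as f:
+        # Read in chunks so large data files don't have to fit in memory.
+        for chunk in iter(lambda: f.read(65536), b""):
+            sha1.update(chunk)
+    return sha1.hexdigest() == expected_sha1
+
+# Hypothetical usage, with made-up size and checksum values from a manifest entry:
+# file_matches_manifest("nasa_logo-set-1-file-0.png", 10489, "c2a52b...")
+```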
+ +#### Validate the Manifest + +In this step, Ziggy checks that every file that's present in the manifest is also present in the directory, and that the size and checksum of every file is correct. Ziggy produces an acknowledgement file that lists each file with its transfer status (was the file there?) and its validation status (were the size and checksum correct?), plus an overall status for the transfer (did any file fail either of its validations?). + +#### Look for Files Not Listed in the Manifest + +The step above ensures that every file in the manifest was transferred, but it doesn't rule out that there were extra files transferred that aren't in the manifest. Ziggy now checks to make sure that there aren't any such extra files, and throws an exception if any are found. The idea here is that Ziggy can't tell whether an extra file is supposed to be imported or not, and if it is supposed to be imported there's no size or checksum information to validate it with. So to be on the safe side, better to stop and ask for help. + +#### Do the Imports + +At this point Ziggy loops through the directory and imports all the files into the datastore. As each file is imported into the datastore it's removed from the data receipt directory. + +#### Clean Up the Data Receipt Directory + +All manifest and acknowledgement files are transferred to the `manifests` sub-directory of the `logs` directory. + +Empty subdirectories of the data receipt directory are removed. If the data receipt directory has any remaining content other than the .manifests directory, an exception is thrown. An exception at this point due to non-empty directories means that files that were supposed to be imported weren't. + +### The Acknowledgement XML File + +In the interest of completeness, here's the content of the acknowledgement file for the sample pipeline data delivery: + +```xml + + + + + + + + + + + + +``` + +Note that the manifest file must end with "`-manifest.xml`", and the acknowledgement file will end in "`-manifest-ack.xml`", with the filename prior to these suffixes being the same for the two files. + +### Systems that Treat Directories as Data Files + +There may be circumstances in which it's convenient to put several files into a directory, and then to use a collection of directories of that form as "data files" for the purposes of data processing. For example, consider a system where there's a data file with an image, and then several files that are used to background-subtract the data file. Rather than storing each of those files separately, you might put the image file and its background files into a directory; import that directory, as a whole, into the datastore; then supply that directory, as a whole, as an input for a subtask. + +In that case, the manifest still needs to have an entry for each regular file, but in this case the name of the file includes the directory it sits in. Here's what that looks like in this example: + +```xml + + + + + + + + + + + + + +``` + +Now the only remaining issue is how to tell Ziggy to import the files in such a way that each of the `data-#####` directories is imported and stored as a unit. To understand how that's accomplished, let's look back at the data receipt node in `pd-sample.xml`: + +```xml + + + + +``` + +Meanwhile, the definition of the raw data type is in `pt-sample.xml`: + +```xml + +``` + +Taken together, these two XML snippets tell us that data receipt's import is going to import files that match the file name convention for the `raw data` file type. 
We can do the same thing when the "file" to import is actually a directory. If you define a data file type that has `fileNameRegexForTaskDir` set to `data-[0-9]{5}`, Ziggy will import directory `data-00001` and all of its contents as a unit and store that unit in the datastore, and so on. + +Note that the manifest ignores the fact that import of data is going to treat the `data-#####` directories as the "files" it imports, and the importer ignores that the manifest validates the individual files even if they are in these subdirectories. + +### Generating Manifests + +Ziggy also comes with a utility to generate manifests from the contents of a directory. Use `runjava generate-manifest`. This utility takes 3 command-line arguments: + +1. Manifest name, required. +2. Dataset ID, required. +3. Path to directory with files to be put into the manifest, optional (default: working directory). + diff --git a/doc/user-manual/datastore-task-dir.md b/doc/user-manual/datastore-task-dir.md new file mode 100644 index 0000000..7d12d9b --- /dev/null +++ b/doc/user-manual/datastore-task-dir.md @@ -0,0 +1,190 @@ +## The Datastore and the Task Directory + +The datastore is Ziggy's organized, permanent file storage system. The task directory is temporary file storage used by processing algorithms. Let's take a look at these now. + +### The Datastore + +Before we can look at the datastore, we need to find it! Fortunately, we can refer to the [properties file](properties.md). Sure enough, we see this: + +``` +pipeline.root.dir = ${ziggy.root}/sample-pipeline +pipeline.home.dir = ${pipeline.root.dir}/build +pipeline.results.dir = ${pipeline.home.dir}/pipeline-results +datastore.root.dir = ${pipeline.results.dir}/datastore +``` + +Well, you don't see all of those lines laid out as conveniently as the above, but trust me, they're all there. Anyway, what this is telling us is that Ziggy's data directories are in `build/pipeline-results/datastore`. Looking at that location we see this: + +```console +datastore$ tree +├── models +│   └── dummy model +│   └── 2022-10-31.0001-sample-model.txt +├── set-1 +│   ├── L0 +│   │   ├── nasa_logo-file-0.png +│   │   ├── nasa_logo-file-1.png +│   │   ├── nasa_logo-file-2.png +│   │   └── nasa_logo-file-3.png +│   ├── L1 +│   │   ├── nasa_logo-file-0.png +│   │   ├── nasa_logo-file-1.png +│   │   ├── nasa_logo-file-2.png +│   │   └── nasa_logo-file-3.png +│   ├── L2A +│   │   ├── nasa_logo-file-0.png +│   │   ├── nasa_logo-file-1.png +│   │   ├── nasa_logo-file-2.png +│   │   └── nasa_logo-file-3.png +│   ├── L2B +│   │   ├── nasa_logo-file-0.png +│   │   ├── nasa_logo-file-1.png +│   │   ├── nasa_logo-file-2.png +│   │   └── nasa_logo-file-3.png +│   └── L3 +│   └── averaged-image.png +└── set-2 + datastore$ +``` + +Summarizing what we see: + +- a `models` directory, with a `dummy model` subdirectory and within that a sample model. +- A `set-1` directory and a `set-2` directory. The `set-2` directory layout mirrors the layout of `set-1`; take a look if you don't believe me, I didn't bother to expand set-2 in the interest of not taking up too much space. +- Within `set-1` we see a directory `L0` with some PNG files in it, a directory `L1` with some PNG files, and then `L2A`, `L2B`, and `L3` directories which (again, trust me or look for yourself) contain additional PNG files. + +Where did all this come from? 
Let's take a look again at part of the `pt-sample.xml` file: + +```xml + + + +``` + +If you don't remember how data file type definitions worked, feel free to [go to the article on Data File Types](data-file-types.md) for a quick refresher course. In any event, you can probably now see what we meant when we said that the data file type definitions implicitly define the structure of the datastore. The `set-1/L0` and `set-2/L0` directories come from the `fileNameWithSubstitutionsForDatastore` value for raw data; similarly the permuted colors data type defines the `set-1/L1` and `set-2/L1` directories. + +#### Model Names in the Datastore + +If you look at the figure above, you've probably noticed that the name of the dummy model has been mangled in some peculiar fashion: instead of `sample-model.txt`, the name is `2022-10-31.0001-sample-model.txt`. What's up with that? + +Well -- models are different from mission data in that it's sometimes necessary to update models; but it's also necessary to keep every copy of every model, because we need to be able to work out the full provenance of all Ziggy's data products, which includes knowing which models were used for every processing activity. But the name "regex" for the dummy model is just `sample-model.txt`. That means that every version of the model has to have the same name, which means that ordinarily a new model file would overwrite an old one. And that's not acceptable. + +So: when a model is imported, its "datastore name" includes the import date and a version number (starting at 1) that get prepended to the file name. Thus the name of the file we see in the datastore. Version numbers increase monotonically and are never re-used (thus are unique), but multiple models of a given type can have the same timestamp. + +If you think you're going to have more than 9,999 versions of a model, let us know and we'll change the format to accommodate you. + +Note that when the model gets copied to a task directory, any Ziggy-supplied version number or timestamp will be removed and the filename seen by the algorithm will match the original name of the model file. + +#### Supplying Your Own Version Information for Models + +Of course, it's also possible (indeed, likely) that any actual flight mission will include version and datestamp information in the name of its model files. You can imagine a model name regex that's something like "`calibration-constants-([0-9]{8}T[0-9]{6})-v([0-9]+).xml`", where the first group in the regex is a datestamp in ISO 8601 format (YYYYMMDDThhmmss), and the second group is a version number. In this case, you might want Ziggy to use the datestamp and version number from the filename and not append its own (if for no other reason than having 2 datestamps and 2 version numbers in the datastore filename would look dumb). You can make this happen by specifying in the XML the group number for the timestamp and the group number for the version: + +```xml + +``` + +You can use this mechanism to specify that filenames have version numbers in them; or timestamps, or both. If the filename has only one of the two, Ziggy will prepend its version of the other to the filename when storing in the datastore. + +### The Task Directory + +When Ziggy runs algorithm code, it doesn't give the algorithm direct access to files in the datastore. Instead, each pipeline task gets its own directory, known as the "task directory" (clever!).
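+Since files move back and forth between the datastore and the task directory, their names get translated using exactly these data file type definitions. To make that concrete, here's a hedged illustration of the translation for the raw data type. The `$2/L0/$1-$3.png` substitution string is the one quoted in [the article on data file types](data-file-types.md); the task-directory regex used here is an assumption (the real one lives in `pt-sample.xml`), so treat this as a sketch rather than Ziggy's actual code.
+
+```python
+import re
+
+# Assumed task-directory file name regex for the raw data type (three groups:
+# a free-form prefix, the dataset, and the file number).
+TASK_DIR_REGEX = re.compile(r"([\w-]+)-(set-[0-9])-(file-[0-9])\.png")
+
+# Substitution pattern from the raw data type definition: $N refers to regex group N.
+DATASTORE_PATTERN = "$2/L0/$1-$3.png"
+
+def datastore_path(task_dir_name):
+    """Translate a task-directory file name into its datastore location."""
+    match = TASK_DIR_REGEX.fullmatch(task_dir_name)
+    if match is None:
+        raise ValueError(f"{task_dir_name} does not match the raw data regex")
+    path = DATASTORE_PATTERN
+    for group_number in range(TASK_DIR_REGEX.groups, 0, -1):
+        path = path.replace(f"${group_number}", match.group(group_number))
+    return path
+
+# "some-kinda-name-set-1-file-9.png" -> "set-1/L0/some-kinda-name-file-9.png"
+print(datastore_path("some-kinda-name-set-1-file-9.png"))
+```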
+ +To find the task directory, look first to the pipeline results location in the properties file: + +``` +pipeline.results.dir=${pipeline.home.dir}/pipeline-results +``` + +The pipeline-results directory contains a number of subdirectories. First, let's look at task-data: + +``` +task-data$ ls +1-2-permuter 1-4-flip 1-6-averaging 2-10-flip 2-8-permuter +1-3-permuter 1-5-flip 1-7-averaging 2-11-flip 2-9-permuter +task-data$ +``` + +Every pipeline task has its own directory. The name of a task's directory is the instance number, the task number, and the module name, separated by hyphens. If we drill down into `1-2-permuter`, we see this: + +```console +1-2-permuter$ ls -R +ARRIVE_PFE.1667003280019 QUEUED_PBS.1667003279236 st-1 +PBS_JOB_FINISH.1667003320029 permuter-inputs.h5 st-2 +PBS_JOB_START.1667003280021 st-0 st-3 + +1-2-permuter/st-0: +SUB_TASK_FINISH.1667003287519 nasa_logo-set-2-file-0-perm.png permuter-inputs-0.h5 sample-model.txt +SUB_TASK_START.1667003280036 nasa_logo-set-2-file-0.png permuter-stdout-0.log + +1-2-permuter/st-1: +SUB_TASK_FINISH.1667003294982 nasa_logo-set-2-file-1-perm.png permuter-inputs-0.h5 sample-model.txt +SUB_TASK_START.1667003287523 nasa_logo-set-2-file-1.png permuter-stdout-0.log + +1-2-permuter/st-2: +SUB_TASK_FINISH.1667003302619 nasa_logo-set-2-file-2-perm.png permuter-inputs-0.h5 sample-model.txt +SUB_TASK_START.1667003294987 nasa_logo-set-2-file-2.png permuter-stdout-0.log + +1-2-permuter/st-3: +SUB_TASK_FINISH.1667003310303 nasa_logo-set-2-file-3-perm.png permuter-inputs-0.h5 sample-model.txt +SUB_TASK_START.1667003302623 nasa_logo-set-2-file-3.png permuter-stdout-0.log +1-2-permuter$ +``` + +At the top level there's some stuff we're not going to talk about now. What's interesting is the contents of the subtask directory, st-0: + +- The sample model is present with its original (non-datastore) name, `sample-model.txt`. +- The inputs file for this subtask is present, also with its original (non-datastore) name, `nasa-logo-set-2-file-0.png`. +- The outputs file for this subtask is present: `nasa-logo-set-2-file-0-perm.png`. +- The HDF5 file that contains filenames is present: `permuter-inputs-0.h5`. +- There's a file that contains all of the standard output (i.e., printing) from the algorithm: `permuter-stdout-0.log`. +- There are a couple of files that show the Linux time that the subtask started and completed processing. + +### The Moral of this Story + +So what's the takeaway from all this? Well, there's actually a couple: + +- Ziggy maintains separate directories for its permanent storage in the datastore and temporary storage for algorithm use in the task directory. +- The task directory, in turn, contains one directory for each subtask. +- The subtask directory contains all of the content that the subtask needs to run. This is convenient if troubleshooting is needed: you can copy a subtask directory to a different computer to be worked on, rather than being forced to work on it on the production file system used by Ziggy. +- There's some name mangling between the datastore and the task directory. +- You can put anything you want into the subtask or task directory; Ziggy only pulls back the results it's been told to pull back. This means that, if you want to dump a lot of diagnostic information into each subtask directory, which you only use if something goes wrong in that subtask, feel free; Ziggy won't mind. + +### Postscript: Copies vs. 
Symbolic Links + +If you look closely at the figure that shows the task directory, you'll notice something curious: the input and output "files" aren't really files. They're symbolic links. Specifically, they're symbolic links to files in the datastore. Looking at an example: + +```console +st-0$ ls -l +total 64 +-rw-r--r-- 1 0 Oct 31 16:01 SUB_TASK_FINISH.1667257285445 +-rw-r--r-- 1 0 Oct 31 16:01 SUB_TASK_START.1667257269376 +lrwxr-xr-x 1 104 Oct 31 16:01 nasa_logo-set-2-file-0-perm.png -> ziggy/sample-pipeline/build/pipeline-results/datastore/set-2/L1/nasa_logo-file-0.png +lrwxr-xr-x 1 104 Oct 31 16:01 nasa_logo-set-2-file-0.png -> ziggy/sample-pipeline/build/pipeline-results/datastore/set-2/L0/nasa_logo-file-0.png +-rw-r--r-- 1 25556 Oct 31 16:01 permuter-inputs-0.h5 +-rw-r--r-- 1 174 Oct 31 16:01 permuter-stdout-0.log +lrwxr-xr-x 1 126 Oct 31 16:01 sample-model.txt -> ziggy/sample-pipeline/build/pipeline-results/datastore/models/dummy model/2022-10-31.0001-sample-model.txt +st-0$ +``` + +Ziggy allows the user to select whether to use actual copies of the files or symbolic links. This is configured in -- yeah, you got it -- the properties file: + +``` +moduleExe.useSymlinks = true +``` + +The way this works is obvious for the input files: Ziggy puts a symlink in the working directory, and that's all there is to it. For the outputs file, what happens is that the algorithm produces an actual file of results; when Ziggy goes to store the outputs file, it moves it to the datastore and replaces it in the working directory with a symlink. This is a lot of words to say that you can turn this feature on or off at will and your code doesn't need to do anything different either way. + +The advantages of the symlinks are fairly obvious: + +- Symbolic links take up approximately zero space on the file system. If you use symbolic links you avoid having multiple copies of every file around (one in the datastore, one in the subtask directory). For large data volumes, this can be valuable. +- Similarly, symbolic links take approximately zero time to instantiate. Copies take actual finite time. Again, for large data volumes, it can be a lot better to use symlinks than copies in terms of how much time your processing needs. + +There are also situations in which the symlinks may not be a good idea: + +- It may be the case that you're using one computer to run the worker and database, and a different one to run the algorithms. In this situation, the datastore can be on a file system that's mounted on the worker machine but not the compute machine, in which case the symlink solution won't work (the compute node can't see the datastore, so it can't follow the link). diff --git a/doc/user-manual/delete-tasks.md b/doc/user-manual/delete-tasks.md new file mode 100644 index 0000000..d870088 --- /dev/null +++ b/doc/user-manual/delete-tasks.md @@ -0,0 +1,25 @@ +## Deleting Tasks + +Sometimes it's necessary to stop the execution of tasks after they start running. Tasks that are running as jobs under control of a batch system at an HPC facility will provide command line tools for this, but they're a hassle to use when you're trying to delete a large number of jobs. Trying to delete tasks running locally is likewise hassle-tastic. + +Fortunately, Ziggy will let you do this from the console. 
+ +### Delete all Jobs for a Task + +To delete all jobs for a task, go to the tasks table on the Instances panel, right click the task, and select `Delete tasks` from the pop-up menu: + +![delete-task-menu-item](images/delete-task-menu-item.png) + +You'll be prompted to confirm that you want to delete the task. When you do that, you'll see something like this: + +![delete-in-progress](images/delete-in-progress.png) + +The state of the task will be immediately moved to `ERROR`, P-state `ALGORITHM_COMPLETE`. The instance will go to state `ERRORS_RUNNING` because the other task is still running; once it completes, the instance will go to `ERRORS_STALLED`. Meanwhile, the alert looks like this: + +![delete-alert](images/delete-alert.png) + +As expected, it notifies you that the task stopped due to deletion and not due to an error of some kind. + +### Delete all Tasks for an Instance + +This is the same idea, except it's the pop-up menu for the instance table, and you select `Delete all tasks`. \ No newline at end of file diff --git a/doc/user-manual/display-logs.md b/doc/user-manual/display-logs.md new file mode 100644 index 0000000..d226c1e --- /dev/null +++ b/doc/user-manual/display-logs.md @@ -0,0 +1,23 @@ +## Log Files Display + +Ziggy provides a mechanism for viewing task logs that is more convenient than going to the `logs` directory and hunting around. + +To use this feature, go to the Instance tab under the Operations tab. Select the task of interest and right-click to bring up the tasks menu: + + + +Select `List task logs`. You'll get this dialog box: + +![](images/logs-list.png) + +By default the logs are ordered by name, which means that they're also ordered by time, from earliest to latest. If you double-click on one of the rows in the table, the log file in question will be opened in a new window: + +![](images/task-log-display.png) + +The log will always be opened with the view positioned at the end of the log, since that's most often where you can find messages that inform you about the problems. In this case, that's not true, so you can use the `To Top` button to jump to the start of the log, or simply scroll around until you find what you're looking for: + +![](images/task-log-showing-problem.png) + +Here you can see the stack trace produced by the Python algorithm when it deliberately threw an exception, and the Java stack trace that was generated when Ziggy detected that the algorithm had thrown an exception. As it happens, the log shows exactly what the problem is: the user set the parameter that tells subtask 0 to fail, and subtask 0 duly failed. + +Note that this is the same content we saw in the subtask algorithm log in the subtask directory (if you don't remember what I'm talking about, it's near the bottom of [the Log Files article](log-files.md)). The difference is that the file in the subtask directory only has the output from one subtask, while the task log has all the logging from all the subtasks plus additional logging from the Ziggy components that run and manage the algorithm execution. diff --git a/doc/user-manual/downloading-and-building-ziggy.md b/doc/user-manual/downloading-and-building-ziggy.md new file mode 100644 index 0000000..3f41721 --- /dev/null +++ b/doc/user-manual/downloading-and-building-ziggy.md @@ -0,0 +1,105 @@ +## Downloading and Building Ziggy + +Before you start, you should check out the [system requirements](system-requirements.md) article. 
This will ensure that you have the necessary hardware and software to follow the steps in this article. + +### Downloading Ziggy + +Ziggy's source code is stored on GitHub, which you probably know already since you're reading this document, which is also stored on GitHub along with Ziggy. A discussion of GitHub is way beyond the scope of this document, but the things you need to know to download Ziggy are a really small subset of that: + +TODO BW put in what we need to say about GitHub. + +Once you've done that, you should see something like this in your Ziggy folder: + +```console +ziggy$ ls +LICENSE.pdf doc gradlew script-plugins +README.md etc ide settings.gradle +build.gradle gradle licenses src +buildSrc gradle.properties sample-pipeline test +ziggy$ +``` + +Let's go through these items: + +- The files `build.gradle`, `settings.gradle`, `gradle.properties`, and `gradlew` are used by our build system. Hopefully you won't need to know anything more about them than that. +- Likewise, directories `gradle` and `script-plugins` are used in the build. +- The `buildSrc` directory contains some Java and Groovy classes that are part of Ziggy but are used by the build. The idea here is that Gradle allows users to extend it by defining new kinds of build tasks; those new kinds of build tasks are implemented as Java or Groovy classes and by convention are put into a "buildSrc" folder. This is something else you probably won't need to worry about; certainly not any time soon. +- The `doc` directory contains this user manual, plus a bunch of other, more technical documentation. +- The `etc` directory contains files that are used as configuration inputs to various programs. These are things like the file that tells the logger how to format text lines, and so on. Two of these are going to be particularly useful and important to you: the ziggy.properties file, and the pipeline.properties.EXAMPLE file. These files provide all of the configuration that Ziggy needs to locate executables, working directories, data storage, and so on. We'll go into a lot of detail on this at the appropriate time. +- The `ide` directory contains auxiliary files that are useful if you want to develop Ziggy in the Eclipse IDE. +- The `licenses` directory contains information about both the Ziggy license and the licenses of third-party applications and libraries that Ziggy uses. +- The `src` directory contains the directory tree of Ziggy source code, both main classes and test classes. +- The `test` directory contains test data for Ziggy's unit tests. +- The `sample-pipeline` directory contains the source and such for the sample pipeline. + +### Building Ziggy + +Before you build Ziggy, you'll need to set up 4 environment variables: + +- The `JAVA_HOME` environment variable should contain the location of the Java Development Kit (JDK) you want to use for Java compilation. +- The `CC` environment variable should contain the location of the C compiler you want to use for C compilation. +- The `CXX` environment variable should contain the location of the C++ compiler you want to use for C++ compilation. +- When Ziggy goes to build its copy of the HDF5 libraries, for some reason the HDF5 build system doesn't always find the necessary include files for building the Java API. If this happens to you, create the environment variable `JNIFLAGS`: `export JNIFLAGS="-I<jdk>/include -I<jdk>/include/<os>"`. Here `<jdk>` is the location of your JDK (so it should be identical to the contents of `JAVA_HOME`, above).
The `/include` directory will have an additional subdirectory of include files that are specific to a particular OS (for example, on my Macbook this is `$JAVA_HOME/include/darwin`). With a properly-set `JNIFLAGS` environment variable, you should be able to build HDF5 without difficulty (famous last words...). + +Once you've got that all set up, do the following: + +1. Open a terminal window. +2. Change directory to the Ziggy main directory (the one that looks like the figure above). +3. At the command line, type `./gradlew`. + +The first time you do this, it will take a long time to run and there will be a lot of C and C++ compiling. This is because Gradle is building the HDF5 libraries, which can take a long time. Gradle will also download all the third party libraries and Jarfiles Ziggy relies upon. Barring the unforeseen, eventually you should see something like this in your terminal: + +```console +ziggy$ ./gradlew +ar: creating archive ziggy/build/lib/libziggy.a +ar: creating archive ziggy/build/lib/libziggymi.a + +> Task :compileJava +Note: Some input files use or override a deprecated API. +Note: Recompile with -Xlint:deprecation for details. + + +BUILD SUCCESSFUL in 14s +19 actionable tasks: 19 executed +ziggy$ +``` + +At this point, it's probably worthwhile to run Ziggy's unit tests to make sure nothing has gone wrong. To do this, at the command line type `./gradlew test`. The system will run through a large number of tests (around 700) over the course of about a minute, and hopefully you'll get another "BUILD SUCCESSFUL" message at the end. + +If you look at the Ziggy folder now, you'll see the following: + +```console +ziggy$ ls +LICENSE.pdf buildSrc gradle.properties sample-pipeline test +README.md doc gradlew script-plugins +build etc ide settings.gradle +build.gradle gradle licenses src +ziggy$ +``` + +Pretty much the same, except that now there's a `build` folder. What's in the `build` folder? + +```console +build$ ls +bin etc lib obj schema tmp +classes include libs resources src +build$ +``` + +The main folders of interest here are: + +- The `bin` folder, which has all the executables. +- The `lib` folder, which has shared object libraries (i.e., compiled C++) +- The `libs` folder, which has Jarfiles (so why not name it "jars"? I don't know). +- The `etc` folder, which is a copy of the main `ziggy/etc` folder. + +Everything that Ziggy uses in execution comes from the subfolders of `build`. That's why there's a copy of `etc` in `build`: the `etc` in the main directory isn't used, the one in `build` is, so that everything that Ziggy needs is in one place. + +Before we move on, a few useful details about building Ziggy: + +- Ziggy makes use of a number of dependencies. Most of them are Jarfiles, which means that they can be downloaded and used without further ado, but at least one, the [HDF5 library](https://www.hdfgroup.org/solutions/hdf5/), requires compilation via the C++ compiler. This is the most time-consuming part of the build. +- The first time you run `./gradlew`, the dependencies will be automatically downloaded. On subsequent builds with `./gradlew`, the dependencies will mostly not be downloaded, but instead cached copies will be used. This means that the subsequent uses are much faster. +- Why "mostly not ... downloaded?" Well, the build system checks the dependencies to see whether any new versions have come out. New versions of the third party libraries are automatically downloaded in order to ensure that Ziggy remains up-to-date with security patches. 
If all has gone well, you're now ready to move on to [defining your own pipeline](configuring-pipeline.md)! diff --git a/doc/user-manual/event-handler-basics.md b/doc/user-manual/event-handler-basics.md new file mode 100644 index 0000000..eefe56d --- /dev/null +++ b/doc/user-manual/event-handler-basics.md @@ -0,0 +1,196 @@ +## Event Handler Basics + +So far we've talked about Ziggy pipelines as being completely driven by human-in-the-loop control: a pipeline operator is required to launch any pipelines, after which the pipeline runs without further human assistance until it either completes or fails. This is acceptable at the scale of the sample pipeline that ships with Ziggy for demonstration purposes. As the volume of data increases, and more relevantly as the frequency of data deliveries increases, this becomes less desirable. At some data volume it becomes essential for Ziggy to have its own system for determining that some action is required and initiating that action without any help from a human. + +Fortunately, Ziggy has this capability in the form of its event handler mechanism. + +Note that, for the exercises in this section, I've reset the sample pipeline back to its initial state. The commands for that are as follows: + +``` +runjava cluster stop +rm -rf sample-pipeline/build +/bin/bash sample-pipeline/build-env.sh +runjava cluster init +runjava cluster start gui & +``` + +### What it Does + +To get a sense of how the event handler works, do the following: + +First, get the data receipt directory set up with content. If you followed my suggestion to reset the pipeline back to its initial state, you don't have to do anything further. If you've rejected my suggestion, then you have to copy all the files in `sample-pipeline/data` to `sample-pipeline/build/data-receipt`. As a reminder, the `sample-pipeline/build/data-receipt` directory should look like this when you're done: + +```console +data-receipt$ ls -1 +nasa_logo-set-1-file-0.png +nasa_logo-set-1-file-1.png +nasa_logo-set-1-file-2.png +nasa_logo-set-1-file-3.png +nasa_logo-set-2-file-0.png +nasa_logo-set-2-file-1.png +nasa_logo-set-2-file-2.png +nasa_logo-set-2-file-3.png +sample-model.txt +sample-pipeline-manifest.xml +data-receipt$ +```
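If you do end up copying the files by hand, it's just an ordinary copy. Something like the sketch below, run from the top-level `ziggy` directory, should do it (if your copy of the data directory contains subdirectories, add `-r`):

```
cp sample-pipeline/data/* sample-pipeline/build/data-receipt/
```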
Now that you've done that, start the pipeline and the console (if they aren't already running); from the `Configuration` tab select `Event Definitions`. You'll see a table like this appear: + +![](images/event-defs-display.png) + +As shown, an event definition has a name, a directory, a pipeline, and a toggle that enables or disables the event handler. Right now this one is turned off. As you might have inferred, when this event handler is turned on, it watches the `sample-pipeline/build/data-receipt` directory for an event; when it sees an event, it starts the `sample` pipeline. At its essence, that's all a Ziggy event handler is: a tool that watches a specified directory and then starts a specified pipeline. + +Next step: click the checkbox under `Enabled`. Wait for 10-15 seconds, and -- + +-- nothing happens. The pipeline doesn't start, the `Pi` stoplight doesn't turn green. Nothing. + +If you read the preceding paragraphs carefully, you'll see that we left something out. We said that an event handler watches a directory for an event and then starts a pipeline. We never defined what constitutes an event! Clearly, it's not just the presence of files in the watched directory: there are files in the watched directory, and the event handler isn't interested. + +### What is an Event? + +If you think about it, you'll soon realize that it wouldn't be safe for Ziggy to simply start running whenever files show up in the watched directory. For one thing, what would stop Ziggy from starting the pipeline while the files it needs are still being put into the watched directory? For another, what if there are things happening that need to be treated as multiple different events? And what if Ziggy needs more information than just, "Go do something"? + +To solve these problems, we defined an event signal, known as a ready file, as follows: + +**A Ziggy ready file is a zero-length file whose filename contains all the information Ziggy needs to know how it should respond to the event.** + +More specifically, the filename convention for a ready file is as follows: + +```