#HPCC-Platform

####Table of Contents

  1. HPCC-Platform Overview
  2. About the Module
  3. Setup
  4. Usage
  5. Limitations
  6. License
  7. Contact Us

##Overview

The HPCC-Platform is a massively parallel processing computing platform that solves Big Data problems.

##About the Module

This module provides a quick and easy way to install and set up an HPCC-Platform cluster in environments that use Puppet to administer Linux hosts. It is not designed for heavy production use, because running Puppet alongside a Roxie or Thor node adds unnecessary resource overhead. Please use this tool to become better acquainted with our implementation of a High Performance Computing Cluster and to see whether it meets your organization's Big Data needs. Documentation and whitepapers on running the HPCC-Platform can be found at www.hpccsystems.com.

##Setup

####Master Configuration

In your nodes.pp or site.pp, declare the hpcc class with a role of 'controller' on the puppetmaster that controls your cluster. This is necessary because the module relies heavily on Puppet's file resource to ensure that environment.xml is propagated across the entire cluster (see Envgen for more information). Set up the 'controller' instantiation of the class as follows.

# $(confdir)/environments/<myenv>/manifests/nodes.pp
node 'puppetmaster.exampledomain' {
  class { 'hpcc':
    role => 'controller',
  }
}

####Computation Nodes

In your nodes.pp or site.pp, include the hpcc class for each computation node. All settings for the computation nodes are pulled directly from the hpcc::params class. The master is the only truly unique node, so it should be the only declaration with modified parameters (if any are needed).

# $(confdir)/environments/<myenv>/manifests/nodes.pp
node 'computationNode1.exampledomain' {
  include hpcc
}

####Keygen

One of the ways the nodes in an HPCC-Platform cluster initially communicate and sync with each other is through SSH keys. These keys are handled by the Keygen script, and the module runs it automatically.

####Envgen

Envgen is the script that maps nodes in the cluster to specific roles within the HPCC-Platform. It sets up the support, Roxie, and Thor nodes and their slaves. The file it creates, environment.xml, tells each node in the cluster what its roles are and which services it must run.

  • .../hpcc/files/environment.xml is generated automatically by the hpcc instance declared with role => 'controller'. This instantiation of the hpcc class must be on the puppetmaster, because it both generates environment.xml and serves the file to all the hpcc nodes in your cluster.
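Because every node depends on receiving the same environment.xml, it can be handy to confirm that a node's installed copy matches the copy held by the module. The following is a minimal sketch, not part of the module: the function name `env_in_sync` is hypothetical, and the default installed path /etc/HPCCSystems/environment.xml is an assumption that should be checked against your installation.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with this module): compare a reference
# environment.xml with the copy installed on a node.
# ASSUMPTION: the installed copy lives at /etc/HPCCSystems/environment.xml
# unless a second argument says otherwise.
env_in_sync() {
  reference=$1
  installed=${2:-/etc/HPCCSystems/environment.xml}

  # Both files must be readable before they can be compared.
  for f in "$reference" "$installed"; do
    [ -r "$f" ] || { echo "cannot read $f" >&2; return 2; }
  done

  # cmp -s is silent and exits 0 only when the files are byte-identical.
  if cmp -s "$reference" "$installed"; then
    echo "in sync"
  else
    echo "out of sync"
    return 1
  fi
}
```

Call it with the path to the module's files/environment.xml as the first argument. An exit status of 1 means the two copies differ, and 2 means one of them could not be read.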

####The Iplist File

You must create .../hpcc/files/iplist. Use hpcc/files/iplist.example as a reference for the file format. The iplist file is used to generate the environment.xml file that configures all your nodes.
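Since a malformed iplist leads to a bad environment.xml, a quick sanity check before the first Puppet run may save some debugging. The sketch below assumes the file holds one plain IPv4 address per line; that format is an assumption, so confirm it against hpcc/files/iplist.example, and note that the check may be stricter than what Envgen actually accepts. The function names `is_ipv4` and `validate_iplist` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with this module).
# ASSUMPTION: iplist contains one dotted-quad IPv4 address per line.

# Succeed only if $1 is four dot-separated decimal fields, each 0-255.
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Report every line of the file that is not an IPv4 address.
# Returns 0 if all lines are valid, 1 if any are not, 2 if unreadable.
validate_iplist() {
  file=$1
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
  rc=0
  lineno=0
  while IFS= read -r line || [ -n "$line" ]; do
    lineno=$((lineno + 1))
    # Blank lines are ignored.
    [ -n "$line" ] || continue
    if ! is_ipv4 "$line"; then
      echo "line $lineno: not an IPv4 address: $line" >&2
      rc=1
    fi
  done < "$file"
  return $rc
}
```

Run `validate_iplist` with the path to your iplist file. Any offending lines are printed to standard error along with their line numbers.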

##Usage

####Running the cluster

The primary tool for running the cluster is hpcc-run.sh, located in /opt/HPCCSystems/sbin. It remotely runs hpcc-init or individual component commands on all the machines declared in environment.xml (which is in turn generated by Envgen from the iplist file). Run the tool as the user 'hpcc', as in the following examples.

sudo -u hpcc /opt/HPCCSystems/sbin/hpcc-run.sh -a hpcc-init status

sudo -u hpcc /opt/HPCCSystems/sbin/hpcc-run.sh -a hpcc-init start

sudo -u hpcc /opt/HPCCSystems/sbin/hpcc-run.sh -c <component name> <action>

sudo -u hpcc /opt/HPCCSystems/sbin/hpcc-run.sh -c mythor start
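If you find yourself typing these commands often, a small helper can build them for you. The following is a sketch only: `hpcc_cmd` and the `HPCC_RUN` variable are hypothetical names, and the keyword `all` is a convention of this sketch, not of hpcc-run.sh (so it would clash with a component actually named "all"). The function prints the command rather than running it, so you can review it first.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with this module or the platform).
# HPCC_RUN can be overridden if hpcc-run.sh lives elsewhere.
HPCC_RUN=${HPCC_RUN:-/opt/HPCCSystems/sbin/hpcc-run.sh}

# Print the hpcc-run.sh command line for an action.
#   hpcc_cmd all <action>               -> hpcc-init action on every machine
#   hpcc_cmd <component name> <action>  -> action on a single component
hpcc_cmd() {
  target=$1
  action=$2
  if [ -z "$target" ] || [ -z "$action" ]; then
    echo "usage: hpcc_cmd all|<component name> <action>" >&2
    return 2
  fi
  if [ "$target" = "all" ]; then
    echo "sudo -u hpcc $HPCC_RUN -a hpcc-init $action"
  else
    echo "sudo -u hpcc $HPCC_RUN -c $target $action"
  fi
}
```

For example, `hpcc_cmd all status` prints the first command shown above, and `hpcc_cmd mythor start` prints the last one.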

##Limitations

The platform is designed and packaged for the following operating systems. Other operating systems have not been tested.

  • CentOS 5
  • CentOS 6
  • Ubuntu 10.04 LTS
  • Ubuntu 12.04 LTS
  • Ubuntu 13.10
  • Ubuntu 14.04 LTS
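The list above can be turned into a simple pre-flight check. This is a sketch under stated assumptions: the function name `hpcc_os_supported` is hypothetical, and it expects the distributor ID and release string in the form that `lsb_release -is` and `lsb_release -rs` usually print (for example "CentOS" and "6.5", or "Ubuntu" and "14.04"), which may differ on your hosts.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not shipped with this module).
# Succeeds only for the distributions and releases listed in this README.
# CentOS is matched on its major version; Ubuntu on its full release number.
hpcc_os_supported() {
  distro=$1
  release=$2
  case "$distro $release" in
    "CentOS 5"|"CentOS 5."*|"CentOS 6"|"CentOS 6."*) return 0 ;;
    "Ubuntu 10.04"*|"Ubuntu 12.04"*|"Ubuntu 13.10"*|"Ubuntu 14.04"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On a host with lsb_release installed, it could be called as `hpcc_os_supported "$(lsb_release -is)" "$(lsb_release -rs)"`. A non-zero exit status means the host is not on the tested list.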

##License

HPCC SYSTEMS software Copyright (C) 2012 HPCC Systems.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

##Contact

In order to contact us, please visit our community forums at http://hpccsystems.com/bb/.