Connection Handling

mjpearson edited this page Sep 13, 2010 · 27 revisions

The ‘PandraCore’ connection handler provides managed access to the underlying Cassandra/Thrift transports and API.

Core natively supports named connection pooling against Thrift’s TBinaryProtocol, TBinaryProtocolAccelerated and thrift_protocol.so; tweakable read/write modes (active connection, round-robin and random); dynamic consistency levels; robust logging and error correction; and a complete abstraction suite against the Thrift API. It’s therefore straightforward to create your own data model without any reliance on the packaged object model (Containers) while retaining the power of Core’s socket pool.

Getting Connected

Connection pooling allows nodes in a single ring to be connected to and managed on a host-by-host basis, or collectively auto-discovered from a handful of seed hosts. This short guide shows how to do both, as well as how to set up authentication, toggle read/write consistency levels and interact with the Thrift API directly.

Pooling Overview

Pools are labelled collections of open sockets to Cassandra, usually named the same as the Keyspace they’re connecting to. Pandra’s default pool name is ‘Keyspace1’, matching the default Cassandra install. This can be changed in config.php to reflect your working keyspace, or at runtime by passing it to the connection methods.

Once hosts have been attached to the pool (described later), Core picks a host for each request according to the selection modes below, and marks hosts that go down.

Read/Write host selection modes* can be toggled independently, e.g.:

PandraCore::setWriteMode(PandraCore::MODE_ROUND);
PandraCore::setReadMode(PandraCore::MODE_ACTIVE);

*By default, both Read and Write modes are random.

PandraCore::MODE_ROUND Iterates over each host in the pool (round robin)

PandraCore::MODE_RANDOM Randomly selects a node in the active pool

PandraCore::MODE_ACTIVE Always selects a single host: the one set by PandraCore::setActive(‘connection id’), or otherwise the last connected node.
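To make the three modes concrete, here is a minimal standalone sketch of how such a selector could behave. It is an illustration only: the class HostSelectorSketch, its constants and its methods are hypothetical and are not PandraCore’s implementation. It assumes that the "last connected node" is simply the last host added to the pool.

```php
<?php
// Hypothetical stand-in for illustration; NOT PandraCore's actual code.
class HostSelectorSketch
{
    const MODE_ACTIVE = 0;
    const MODE_ROUND  = 1;
    const MODE_RANDOM = 2;

    private $hosts;        // connection id => host address
    private $activeId;     // id returned by MODE_ACTIVE
    private $cursor = -1;  // index of the previous round-robin pick

    public function __construct(array $hosts)
    {
        if (count($hosts) === 0) {
            throw new InvalidArgumentException('Pool needs at least one host');
        }
        $this->hosts = $hosts;
        // Assumption: the active host defaults to the last one attached.
        $ids = array_keys($hosts);
        $this->activeId = end($ids);
    }

    // Pins MODE_ACTIVE to a known connection id; returns false for unknown ids.
    public function setActive($id)
    {
        if (!isset($this->hosts[$id])) {
            return false;
        }
        $this->activeId = $id;
        return true;
    }

    // Returns the connection id chosen for the given mode.
    public function select($mode)
    {
        $ids = array_keys($this->hosts);
        switch ($mode) {
            case self::MODE_ROUND:
                // Advance the cursor and wrap around at the end of the pool.
                $this->cursor = ($this->cursor + 1) % count($ids);
                return $ids[$this->cursor];
            case self::MODE_RANDOM:
                return $ids[array_rand($ids)];
            case self::MODE_ACTIVE:
            default:
                return $this->activeId;
        }
    }
}
```

Keeping the read and write modes as separate settings, as Core does, means a selector like this can be consulted with a different mode for each kind of operation.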

Core will attempt 2 successive retries before marking a host as ‘down’, at which point it closes the connection and trims the host from the connection pool. If Memcached or APC is available, the host is also flagged there for a cooldown period of 10 seconds, so that other PHP instances avoid polling the dead host.

This is configurable, of course.
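The retry-then-cooldown idea can be sketched in plain PHP as follows. This is a hypothetical illustration, not Core’s internals: the class DownedHostTrackerSketch and its methods are invented for the example, an in-memory array stands in for Memcached or APC, and "2 successive retries" is assumed to mean one initial attempt plus up to two more.

```php
<?php
// Hypothetical sketch; names and structure are assumptions, not PandraCore internals.
class DownedHostTrackerSketch
{
    private $maxRetries;
    private $cooldownSeconds;
    private $downUntil = array(); // host => unix timestamp at which the cooldown ends

    public function __construct($maxRetries = 2, $cooldownSeconds = 10)
    {
        $this->maxRetries = $maxRetries;
        $this->cooldownSeconds = $cooldownSeconds;
    }

    // True while the host is still inside its cooldown window.
    public function isDown($host, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        return isset($this->downUntil[$host]) && $this->downUntil[$host] > $now;
    }

    // Calls $connect($host) once, then up to $maxRetries more times on failure.
    // Returns true on success; otherwise marks the host down and returns false.
    public function attempt($host, $connect, $now = null)
    {
        $now = ($now === null) ? time() : $now;
        if ($this->isDown($host, $now)) {
            return false; // do not poll a host that is cooling down
        }
        for ($try = 0; $try <= $this->maxRetries; $try++) {
            if (call_user_func($connect, $host)) {
                return true;
            }
        }
        $this->downUntil[$host] = $now + $this->cooldownSeconds;
        return false;
    }
}
```

In a real deployment the down marker has to live in a store shared between PHP processes, which is why Core uses Memcached or APC; the array here only protects the current request.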

Note that when using multiple pools (keyspaces) in the same instance, it’s currently necessary to toggle the active pool as required, via setActivePool(‘keyspace name’). Future iterations of the Container models will be keyspace aware.

Authentication

Setting up an authentication token for your keyspace is simple. For a properly configured Cassandra instance, call
PandraCore::authKeySpace('Keyspace1', 'username', 'password');

… prior to any API calls. This will bind all clients in your keyspace pool.

Seeded connections

Core can take an array of hosts, even if that array is only 1 element long, and discover Cassandra’s logical topology on its own.

PandraCore::connectSeededKeyspace(array('127.0.0.1', '10.0.0.20')); 

This works for a single host or dozens, and is the generally preferred method for talking to your cluster.

Host by Host

Individual hosts can be attached to a connection pool by calling the connect() method:

PandraCore::connect('unique id', '127.0.0.1'); 

Optionally, include a pool name and an alternate port:

PandraCore::connect('unique1', '127.0.0.1', 'MyKeyspace');
PandraCore::connect('unique2', '127.0.0.1', 'MyKeyspace', 5690); 

It’s necessary to give each connection a unique ID as well as a hostname, because the ID serves as the handle for managing the node.

To disconnect a host:

PandraCore::disconnect('uniqueid');

… disconnectAll(); drops everything.

Retrieving the Thrift client

Sometimes it’s necessary to retrieve the underlying Thrift client for use in your code, for example where an API has no meaning in the ‘Pandra’ context or you’re running a newer Cassandra/Thrift build than what’s supported by this library.

To retrieve the client, simply call getClient(). Calling this method will invoke the pool manager, and authenticate the connection if necessary.

PandraCore::getClient();

By default, getClient will return a client via read mode. To select the write mode client for a specific pool…

PandraCore::getClient(TRUE, 'MyKeyspace');

… Although unless you’re managing these modes for a specific purpose, there’s usually no need to make this distinction.

Putting it all together

The following example connects to two pools within the same PHP instance: one seeded connection pool against the default ‘Keyspace1’, and one self-managed pool. If you have a cluster to play with, it’s worth reconfiguring this for your setup and downing some hosts to see how it behaves. Check out syslog while you’re doing this; Pandra will tell you of any problems and continue gracefully unless there’s a complete failure. We know Cassandra is eventually consistent: even if only one host in a cluster is up, that’s still enough for a consistency ‘ONE’ write.

<?php
require_once '/path/to/pandra/config.php';
PandraCore::addLogger('Syslog');

$keySpace = 'Keyspace1';
$seeds = array('127.0.0.1', '10.0.0.250', '10.0.0.251');

//  Prepare an authentication request
PandraCore::authKeySpace($keySpace, 'jsmith', 'havebadpass');

if (!PandraCore::connectSeededKeyspace($seeds, $keySpace)) {
  echo 'Not Connected!<br>';
  // Retrieve the underlying exception - based on the logger declaration earlier,
  // this should also appear in Syslog.
  print_r(PandraCore::getLastError());
  exit;
}

// --- Now talk to a few select hosts in the 'MyKeySpace' pool
// (default port 9160) which don't need authentication

$keySpace = 'MyKeySpace';
$seeds = array('192.168.2.12', '192.168.3.12');

foreach ($seeds as $host) {
  if (!PandraCore::connect(md5($host), $host, $keySpace)) {
    echo 'Not Connected!<br>';
    print_r(PandraCore::getLastError());
    exit; 
  }
}

// Setup round robin access on writes, with quorum consistency
// ** this is global, and affects both pools!
PandraCore::setWriteMode(PandraCore::MODE_ROUND);
PandraCore::setWriteConsistency(cassandra_ConsistencyLevel::QUORUM);

// describe the 'Keyspace1' ring directly from Thrift
$client = PandraCore::getClient(FALSE, 'Keyspace1');
$tokenMap = $client->describe_ring('Keyspace1');
echo 'Keyspace1 token map:<br>';
var_dump($tokenMap);

// describe the 'MyKeySpace' ring directly from Thrift
$client = PandraCore::getClient(FALSE, $keySpace);
$tokenMap = $client->describe_ring($keySpace);
echo $keySpace.' token map:<br>';
var_dump($tokenMap);
?>