Getting Started

elandau edited this page Jan 30, 2012 · 34 revisions

Initializing Astyanax

To start using Astyanax you must first create a context that captures the connection pool, API and monitoring for your Cassandra client. Default implementations of these components are provided, but you can supply your own. Configuration and monitoring are the components you will most likely want to implement yourself, to bind them to your company's specific implementations.

AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
	.forCluster("ClusterName")
	.forKeyspace("KeyspaceName")
	.withAstyanaxConfiguration(new AstyanaxConfigurationImpl())
	.withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
		.setPort(MockConstants.PORT)  // test constant; use your cluster's Thrift port
		.setMaxConnsPerHost(1)
	)
	.withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
	.buildKeyspace(ThriftFamilyFactory.getInstance());

// Start the connection pool, then obtain the Keyspace client used in the examples below
context.start();
Keyspace keyspace = context.getEntity();

Define your column family structure

Cassandra internally stores all keys, columns and values as byte arrays. Astyanax makes use of serializers to convert to and from various primitives and more complex data types. To avoid having to specify serializers for every call you must first set up the column family definition. This definition will most likely be a static final in your DAO or other higher level wrapper. Notice that the definition is at the column family and not the keyspace level. This allows you to have different key types for column families within the same keyspace.

ColumnFamily<String, String> CF_USER_INFO =
  new ColumnFamily<String, String>(
    "Standard1",              // Column Family Name
    StringSerializer.get(),   // Key Serializer
    StringSerializer.get());  // Column Serializer
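
To make the role of serializers concrete, the following is a minimal sketch in plain Java of what a key or column serializer does conceptually: it turns a typed value into bytes and back again. The class SerializerSketch and its methods are hypothetical names used only for illustration and are not part of Astyanax. In real code you would use the serializers the library provides, such as StringSerializer.get() above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not an Astyanax class.
final class SerializerSketch {

    // Encode a String as UTF-8 bytes.
    static ByteBuffer stringToBytes(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    // Decode UTF-8 bytes back into a String without disturbing the caller's buffer.
    static String stringFromBytes(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }

    // Encode an int as 4 big-endian bytes.
    static ByteBuffer intToBytes(int value) {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.putInt(value);
        buffer.flip();
        return buffer;
    }

    // Decode 4 big-endian bytes back into an int.
    static int intFromBytes(ByteBuffer bytes) {
        return bytes.duplicate().getInt();
    }

    public static void main(String[] args) {
        ByteBuffer key = stringToBytes("acct1234");
        ByteBuffer age = intToBytes(30);
        System.out.println(stringFromBytes(key) + " is stored as " + key.remaining() + " bytes");
        System.out.println(intFromBytes(age) + " is stored as " + age.remaining() + " bytes");
    }
}
```

Declaring the key and column types once on the ColumnFamily object is what lets the library pick the matching conversion for you on every call.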

Inserting data

Data may be inserted either one column at a time or in batches. You may combine updates to multiple column families in the same keyspace into a single batch.

// Inserting data
MutationBatch m = keyspace.prepareMutationBatch();

m.withRow(CF_USER_INFO, "acct1234")
  .putColumn("firstname", "john", null)
  .putColumn("lastname", "smith", null)
  .putColumn("address", "555 Elm St", null)
  .putColumn("age", 30, null);

m.withRow(CF_USER_STATS, "acct1234")
  .incrementCounterColumn("loginCount", 1);

try {
  OperationResult<Void> result = m.execute();
} catch (ConnectionException e) {
  // Handle or log the failure here rather than ignoring it
}

Reading data from Cassandra

Astyanax queries begin with a single interface and guide you through the possible query options in a natural flow, starting with different key queries (single, slice, range) followed by column-level qualifiers. Once the query is ready you may call one of several execute() methods, which provide different internal implementations for parallelizing the query execution on the client. Each query returns the proper level of information (Rows, Row, ColumnList, Column, Count) with the appropriate types derived from the ColumnFamily definition object.

Reading a single row

OperationResult<ColumnList<String>> result =
  keyspace.prepareQuery(CF_USER_INFO)
    .getKey("Key1")
    .execute();
ColumnList<String> columns = result.getResult();

// Lookup columns in response by name 
int age        = columns.getColumnByName("age").getIntegerValue();
long counter   = columns.getColumnByName("loginCount").getLongValue();
String address = columns.getColumnByName("address").getStringValue();

// Or, iterate through the columns
for (Column<String> c : result.getResult()) {
  System.out.println(c.getName());
}

Reading a set of non-contiguous rows

OperationResult<Rows<String, String>> result =
  keyspace.prepareQuery(CF_STANDARD1)
    .getKeySlice("Key1", "Key2", "Key3")
    .execute();

// Iterate rows and their columns 
for (Row<String, String> row : result.getResult()) {
  System.out.println(row.getKey());
  for (Column<String> column : row.getColumns()) {
    System.out.println(column.getName());
  }
}

Composite columns

Astyanax provides a simple annotation-based definition for composite columns. The ordinal annotation attribute is necessary to guarantee the order in which the composite components are serialized.

// Annotated composite class
class SessionEvent {
  @Component(ordinal=0) String   sessionId;
  @Component(ordinal=1) TimeUUID timestamp;
}

static AnnotatedCompositeSerializer<SessionEvent> eventSerializer
      = new AnnotatedCompositeSerializer<SessionEvent>(SessionEvent.class);
static ColumnFamily<String, SessionEvent> CF_SESSION_EVENTS
  = new ColumnFamily<String, SessionEvent>("SessionEvents", 
    StringSerializer.get(), eventSerializer);

// Query Cassandra for all columns of row key "SomeUserName" whose first component is "sessionid1"
keyspace.prepareQuery(CF_SESSION_EVENTS)
  .getKey("SomeUserName")
  .withColumnRange(
      eventSerializer.makeEndpoint("sessionid1", Equality.EQUAL).toBytes(),
      eventSerializer.makeEndpoint("sessionid1", Equality.LESS_THAN_EQUAL).toBytes(),
      false, 100)
  .execute();
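
To illustrate why the order of components matters, here is a minimal sketch in plain Java. It assumes the standard Cassandra CompositeType layout, in which each component is written as a 2-byte length, the component bytes, and a 1-byte end-of-component marker. CompositeLayoutSketch and the "login" component are hypothetical and are not part of Astyanax. The example above uses a TimeUUID as its second component, while plain strings are used here to keep the sketch short.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not an Astyanax class.
final class CompositeLayoutSketch {

    // Concatenate components in the order given, using the assumed layout:
    // <2-byte big-endian length><component bytes><1-byte end-of-component>.
    static byte[] encode(byte[]... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] component : components) {
            out.write((component.length >> 8) & 0xFF);
            out.write(component.length & 0xFF);
            out.write(component, 0, component.length);
            out.write(0); // 0 marks an ordinary (non-range) component
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] sessionId = "sessionid1".getBytes(StandardCharsets.UTF_8);
        byte[] event = "login".getBytes(StandardCharsets.UTF_8);

        byte[] declaredOrder = encode(sessionId, event);
        byte[] swappedOrder = encode(event, sessionId);

        // Same components, different order, different column name on disk.
        System.out.println("same bytes: " + Arrays.equals(declaredOrder, swappedOrder));
    }
}
```

Because columns are sorted by these bytes, a stable ordinal is what keeps all columns for one session id adjacent, which is what the range query above relies on.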

Pagination

Pagination is done transparently by calling autoPaginate(true) where appropriate. It is possible to paginate key ranges (all using the same non-paginating column slice), index queries and the columns of a single row. Astyanax handles the internal details of determining the next row key or column name without returning duplicates to the caller.

ColumnList<String> columns;
try {
    RowQuery<String, String> query = keyspace
        .prepareQuery(CF_STANDARD1)
        .getKey("TheRowKey")
        .autoPaginate(true)
        .withColumnRange(new RangeBuilder().setLimit(10).build());
	
    while (!(columns = query.execute().getResult()).isEmpty()) {
        for (Column<String> c : columns) {
            System.out.println(c.getName());
        }
    }
} catch (ConnectionException e) {
    // Handle or log the failure here rather than ignoring it
}
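
The idea behind paging without duplicates can be sketched in plain Java: remember the last column name returned and start the next page strictly after it. This is only an illustration of the general technique over an in-memory sorted set. PaginationSketch is a hypothetical class, and Astyanax's internal approach may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; not an Astyanax class.
final class PaginationSketch {
    private final NavigableSet<String> columnNames; // stands in for one row's sorted columns
    private final int pageSize;
    private String lastSeen; // null until the first page has been returned

    PaginationSketch(NavigableSet<String> columnNames, int pageSize) {
        this.columnNames = columnNames;
        this.pageSize = pageSize;
    }

    // Return the next page; an empty list means the row is exhausted.
    List<String> nextPage() {
        // Start strictly after the last returned name so no column repeats.
        NavigableSet<String> remaining =
            (lastSeen == null) ? columnNames : columnNames.tailSet(lastSeen, false);
        List<String> page = new ArrayList<>();
        for (String name : remaining) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(name);
        }
        if (!page.isEmpty()) {
            lastSeen = page.get(page.size() - 1);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> row =
            new TreeSet<>(Arrays.asList("address", "age", "firstname", "lastname"));
        PaginationSketch pager = new PaginationSketch(row, 3);
        List<String> page;
        while (!(page = pager.nextPage()).isEmpty()) {
            System.out.println(page);
        }
    }
}
```

The loop in main has the same shape as the autoPaginate loop above: keep executing until an empty page comes back.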

Iterate through the entire keyspace

This is arguably a very bad use case for Cassandra, but one that is very useful for column families that store a small number of rows. Astyanax provides two methods for executing this type of query. The first wraps the iterator interface and transparently queries Cassandra when .next() is called. The second queries each token range from a separate thread and returns the rows via a callback interface.

try {
    OperationResult<Rows<String, String>> rows = keyspace.prepareQuery(CF_STANDARD1)
        .getAllRows()
        .setRowLimit(100)  // This is the page size
        .withColumnRange(new RangeBuilder().setMaxSize(10).build())
        .setExceptionCallback(new ExceptionCallback() {
            @Override
            public boolean onException(ConnectionException e) {
                Assert.fail(e.getMessage());
                return true;
            }
        })
        .execute();
    for (Row<String, String> row : rows.getResult()) {
        LOG.info("ROW: " + row.getKey() + " " + row.getColumns().size());
    }
} catch (ConnectionException e) {
    Assert.fail();
}

The ExceptionCallback is necessary since the Iterator cannot throw a checked exception. The callback gives the caller the opportunity to force Astyanax to retry the query.

keyspace.prepareQuery(CF_STANDARD1)
    .getAllRows()
    .setRowLimit(100)  // This is the page size
    .setRepeatLastToken(false)
    .withColumnRange(new RangeBuilder().setLimit(2).build())
    .executeWithCallback(new RowCallback<String, String>() {
        @Override
        public void success(Rows<String, String> rows) {
            for (Row<String, String> row : rows) {
                System.out.println("ROW: " + row.getKey() + " " + row.getColumns().size());
            }
        }
        @Override
        public boolean failure(ConnectionException e) {
            return false;
        }
    });
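
The reason the iterator-based form needs an ExceptionCallback can be shown with a small plain-Java sketch. Iterator.hasNext() and next() cannot declare checked exceptions, so a failure while fetching has to be handed to a callback that decides what happens next. RetryingIteratorSketch, Source and FailureHandler are hypothetical names. The convention that returning true means "retry" is an assumption of this sketch, so check the Astyanax source for the actual semantics of ExceptionCallback.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not an Astyanax class.
final class RetryingIteratorSketch<T> implements Iterator<T> {

    // Fetches the next item, or returns null when exhausted; may throw a checked exception.
    interface Source<T> { T next() throws Exception; }

    // Assumption of this sketch: return true to retry the fetch, false to stop iterating.
    interface FailureHandler { boolean onException(Exception e); }

    private final Source<T> source;
    private final FailureHandler handler;
    private T buffered;
    private boolean done;

    RetryingIteratorSketch(Source<T> source, FailureHandler handler) {
        this.source = source;
        this.handler = handler;
    }

    @Override
    public boolean hasNext() {
        while (buffered == null && !done) {
            try {
                buffered = source.next();
                if (buffered == null) {
                    done = true;
                }
            } catch (Exception e) {
                // The checked exception cannot escape hasNext(), so ask the callback.
                if (!handler.onException(e)) {
                    done = true;
                }
            }
        }
        return buffered != null;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = buffered;
        buffered = null;
        return result;
    }

    // Demo source: yields "row1", fails once, then yields "row2", then ends.
    static Source<String> flakySource() {
        final int[] calls = {0};
        return () -> {
            calls[0]++;
            switch (calls[0]) {
                case 1: return "row1";
                case 2: throw new java.io.IOException("simulated timeout");
                case 3: return "row2";
                default: return null;
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> rows = new RetryingIteratorSketch<>(flakySource(), e -> true);
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

The callback-based executeWithCallback form above does not need this workaround, because failures are delivered directly to the failure() method.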

Range Builder

The range builder simplifies specifying the start and end columns for a column range query.