Kafka New Consumer

The new consumer API, KafkaConsumer, is available since Kafka 0.9.0.0. Keep the following in mind when using it:

 * The consumer is not thread-safe. 
 * The consumer maintains TCP connections to the necessary brokers to fetch data. 
 * Failure to close the consumer after use will leak these connections.

To use it, configure the new receiver pool like this:

<!-- Kafka New Receiver Pool -->
<bean id="messageReceiverPool" class="org.darkphoenixs.kafka.pool.KafkaMessageNewReceiverPool"
      init-method="init" destroy-method="destroy">
    <property name="destination" ref="destination"/>
    <property name="messageAdapter" ref="messageAdapter"/>
    <property name="config" value="kafka/newconsumer.properties"/>
    <property name="poolSize" value="10"/>
    <property name="model" value="MODEL_1"/>
    <property name="batch" value="NON_BATCH"/>
    <property name="commit" value="AUTO_COMMIT"/>
</bean>

destination : org.darkphoenixs.kafka.core.KafkaDestination (the topic name).

messageAdapter : org.darkphoenixs.kafka.core.KafkaMessageAdapter.

config : the consumer config file.

pros : the consumer config properties (same as config).

model : MODEL_1 runs one consumer per thread; MODEL_2 decouples consumption from processing.

batch : BATCH or NON_BATCH message processing.

commit : AUTO_COMMIT, SYNC_COMMIT or ASYNC_COMMIT.

poolSize : the size of the consumer thread pool.

handleMultiple : a multiple of the consumer thread pool size; takes effect only with MODEL_2.

retryCount : the number of retries for fault tolerance; takes effect only with NON_BATCH.

Note: init-method and destroy-method are required.
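
The config property above points at a consumer properties file. As a rough illustration of what such a file might carry, the sketch below builds the equivalent java.util.Properties in code. The keys are standard Kafka consumer settings. The class name, the broker address, the group id and the choice of deserializers are assumptions made for this example, not values required by the receiver pool.

```java
import java.util.Properties;

class NewConsumerConfigSketch {

    // Builds the kind of settings a file such as kafka/newconsumer.properties might hold.
    // Every value here is a placeholder chosen for illustration.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", "example-group");           // assumed consumer group
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings in key=value form, as they would appear in a properties file.
        consumerProps().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The same key=value pairs, written one per line, would form the properties file referenced by config.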

Model Pros and Cons

One Consumer Per Thread (MODEL_1)

PRO : It is the easiest to implement.
PRO : It is often the fastest, as no inter-thread co-ordination is needed.
PRO : It makes in-order processing on a per-partition basis very easy to implement (each thread just processes messages in the order it receives them).
CON : More consumers mean more TCP connections to the cluster (one per thread). Kafka handles connections very efficiently, so this is generally a small cost.
CON : Multiple consumers mean more requests being sent to the server and slightly less batching of data, which can cause some drop in I/O throughput.
CON : The total number of threads across all processes is limited by the total number of partitions.
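
The one-consumer-per-thread idea can be sketched without a broker. In the example below, FakeConsumer is a hypothetical stand-in for a consumer reading a single partition (it is not the Kafka API), and the OPEN counter is an assumed way of modelling open connections. Each thread creates, uses and closes its own consumer, so no locking is needed and records are handled in the order they arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class OneConsumerPerThreadSketch {

    // Counts stand-in "connections" that are currently open.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for a consumer reading one partition; not the Kafka API.
    static final class FakeConsumer implements AutoCloseable {
        private final int partition;
        private final int total;
        private int next = 0;

        FakeConsumer(int partition, int total) {
            this.partition = partition;
            this.total = total;
            OPEN.incrementAndGet();
        }

        // Returns the next record, or null once the partition is drained.
        String poll() {
            return next < total ? "p" + partition + "-" + next++ : null;
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Starts one thread per partition; each thread owns its consumer from creation to close.
    static List<List<String>> run(int partitions, int recordsPerPartition) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int p = 0; p < partitions; p++) {
                final int partition = p;
                futures.add(pool.submit(() -> {
                    List<String> processed = new ArrayList<>();
                    // The consumer never leaves this thread, so no locking is needed.
                    try (FakeConsumer consumer = new FakeConsumer(partition, recordsPerPartition)) {
                        String record;
                        while ((record = consumer.poll()) != null) {
                            processed.add(record); // handled in arrival order
                        }
                    }
                    return processed;
                }));
            }
            List<List<String>> result = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                result.add(f.get());
            }
            return result;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(3, 2));
        System.out.println("open connections: " + OPEN.get());
    }
}
```

Because every consumer is closed by try-with-resources, the open-connection count returns to zero once all threads finish, which mirrors the advice to always close the consumer after use.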

Decouple Consumption and Processing (MODEL_2)

PRO : This option allows independently scaling the number of consumers and processors. This makes it possible to have a single consumer that feeds many processor threads, avoiding any limitation on partitions.
CON : Guaranteeing order across the processors requires particular care. Because the threads execute independently, an earlier chunk of data may actually be processed after a later chunk, just due to the luck of thread execution timing. For processing that has no ordering requirements this is not a problem.
CON : Manually committing the position becomes harder as it requires that all threads co-ordinate to ensure that processing is complete for that partition.
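
The co-ordination cost of manual commits in this model can be illustrated with a small broker-free sketch. It assumes a single consuming thread that hands each record of a batch to a worker pool and only records a commit after every worker has finished that batch. The class name, the batch representation and the committed-offset list are hypothetical and stand in for real polling and commit calls.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DecoupledProcessingSketch {

    // Processes each batch on a worker pool and returns the offset "committed" after each batch.
    // processedSink must be a thread-safe collection, since workers add to it concurrently.
    static List<Long> run(List<List<String>> batches, int processors, Collection<String> processedSink) {
        ExecutorService workers = Executors.newFixedThreadPool(processors);
        List<Long> committed = new ArrayList<>();
        long offset = 0;
        try {
            for (List<String> batch : batches) { // the single consuming thread "polls" a batch
                List<Future<?>> inFlight = new ArrayList<>();
                for (String record : batch) {
                    // Records within a batch may finish in any order.
                    inFlight.add(workers.submit(() -> {
                        processedSink.add(record);
                    }));
                }
                // Co-ordination point: wait until every record of the batch is processed.
                for (Future<?> f : inFlight) {
                    f.get();
                }
                offset += batch.size();
                committed.add(offset); // stand-in for a manual commit of the position
            }
            return committed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            workers.shutdown();
        }
    }
}
```

Waiting for the whole batch keeps the committed position safe, but it also means the consuming thread idles while the slowest worker finishes. That trade-off is one reason manual commits are harder in this model.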
