Kafka New Consumer
业余布道师 edited this page Aug 13, 2016
If you want to use the new Kafka consumer API (KafkaConsumer), keep the following in mind:
* The consumer is not thread-safe.
* The consumer maintains TCP connections to the necessary brokers to fetch data.
* Failure to close the consumer after use will leak these connections.
To use the new receiver pool, configure it like this:
<!-- Kafka New Receiver Pool -->
<bean id="messageReceiverPool" class="org.darkphoenixs.kafka.pool.KafkaMessageNewReceiverPool"
      init-method="init" destroy-method="destroy">
    <property name="messageAdapter" ref="messageAdapter"/>
    <property name="config" value="kafka/newconsumer.properties"/>
    <property name="poolSize" value="10"/>
    <property name="model" value="MODEL_1"/>
</bean>
messageAdapter
: org.darkphoenixs.kafka.core.KafkaMessageAdapter.
config
: the consumer config file.
pros
: the consumer config properties (same purpose as config).
model
: MODEL_1 uses one consumer per thread; MODEL_2 decouples consumption from processing.
poolSize
: When model is MODEL_1, poolSize is the size of the consumer thread pool. When model is MODEL_2, poolSize is the size of the handler thread pool, and the consumer thread pool size equals the number of topic partitions.
Note: init-method and destroy-method are required.
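The sizing rule for poolSize can be restated as a small sketch. PoolSizing and its methods are hypothetical names used only for illustration; they are not part of the library. The sketch also assumes that in MODEL_1 each consumer thread processes its own messages, so there is no separate handler pool.

```java
// Hypothetical helper (not part of the library): restates how poolSize is interpreted per model.
final class PoolSizing {
    enum Model { MODEL_1, MODEL_2 }

    /** Consumer threads: poolSize in MODEL_1, one per topic partition in MODEL_2. */
    static int consumerThreads(Model model, int poolSize, int partitionCount) {
        return model == Model.MODEL_1 ? poolSize : partitionCount;
    }

    /** Handler threads: none in MODEL_1 (assumed: the consumer thread handles messages itself), poolSize in MODEL_2. */
    static int handlerThreads(Model model, int poolSize) {
        return model == Model.MODEL_1 ? 0 : poolSize;
    }

    public static void main(String[] args) {
        System.out.println(consumerThreads(Model.MODEL_1, 10, 4)); // prints 10
        System.out.println(consumerThreads(Model.MODEL_2, 10, 4)); // prints 4
        System.out.println(handlerThreads(Model.MODEL_2, 10));     // prints 10
    }
}
```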
MODEL_1 (one consumer per thread):
- PRO : It is the easiest to implement.
- PRO : It is often the fastest, as no inter-thread co-ordination is needed.
- PRO : It makes in-order processing on a per-partition basis very easy to implement (each thread just processes messages in the order it receives them).
- CON : More consumers mean more TCP connections to the cluster (one per thread). Kafka generally handles connections very efficiently, so this is a small cost.
- CON : Multiple consumers mean more requests being sent to the server and slightly less batching of data, which can cause some drop in I/O throughput.
- CON : The number of total threads across all processes will be limited by the total number of partitions.
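The one-consumer-per-thread idea can be illustrated without a broker. The following is a simulation only, under the assumption that each partition is an in-memory list and each worker thread stands in for one consumer; OneConsumerPerThreadSketch and run are hypothetical names and no Kafka API is used. Because each list is drained by exactly one thread, per-partition order is preserved without any coordination between threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simulation only: in-memory lists stand in for partitions, worker threads stand in for consumers.
final class OneConsumerPerThreadSketch {

    /** Processes every partition on its own thread and returns the messages in processing order. */
    static Map<Integer, List<String>> run(Map<Integer, List<String>> partitions) {
        Map<Integer, List<String>> processed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        for (Map.Entry<Integer, List<String>> entry : partitions.entrySet()) {
            pool.execute(() -> {
                List<String> seen = new ArrayList<>(); // touched by this thread only
                for (String message : entry.getValue()) {
                    seen.add(message); // "processing" happens in arrival order
                }
                processed.put(entry.getKey(), seen);
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> partitions = new HashMap<>();
        partitions.put(0, Arrays.asList("a0", "a1", "a2"));
        partitions.put(1, Arrays.asList("b0", "b1"));
        System.out.println(run(partitions));
    }
}
```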
MODEL_2 (decouple consumption and processing):
- PRO : This option allows independently scaling the number of consumers and processors. This makes it possible to have a single consumer that feeds many processor threads, avoiding any limitation on partitions.
- CON : Guaranteeing order across the processors requires particular care. Because the threads execute independently, an earlier chunk of data may actually be processed after a later chunk of data, purely due to thread execution timing. For processing that has no ordering requirements this is not a problem.
- CON : Manually committing the position becomes harder as it requires that all threads co-ordinate to ensure that processing is complete for that partition.
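One way to picture the commit co-ordination problem is a per-partition tracker that only advances the committable position once every earlier offset has finished. This is a minimal sketch under stated assumptions, not the library's method: CommitTracker, markDone, and committablePosition are hypothetical names, and the sketch follows the Kafka convention that the committed position is the offset of the next message to read.

```java
import java.util.TreeSet;

// Hypothetical helper for ONE partition; not part of the library or of kafka-clients.
final class CommitTracker {
    // Offsets that finished while an earlier offset was still in flight.
    private final TreeSet<Long> finishedAhead = new TreeSet<>();
    // Every offset below this value has been fully processed.
    private long nextToCommit;

    CommitTracker(long firstOffset) {
        this.nextToCommit = firstOffset;
    }

    /** Called by a handler thread once it has finished the record at the given offset. */
    synchronized void markDone(long offset) {
        if (offset < nextToCommit) {
            return; // already covered by the committable position
        }
        finishedAhead.add(offset);
        // Advance over the contiguous run of finished offsets, if any.
        while (!finishedAhead.isEmpty() && finishedAhead.first() == nextToCommit) {
            finishedAhead.pollFirst();
            nextToCommit++;
        }
    }

    /** Position that is safe to commit: the next offset that has not been processed yet. */
    synchronized long committablePosition() {
        return nextToCommit;
    }

    public static void main(String[] args) {
        CommitTracker tracker = new CommitTracker(0);
        tracker.markDone(2);                                // finished out of order
        System.out.println(tracker.committablePosition()); // prints 0
        tracker.markDone(0);
        System.out.println(tracker.committablePosition()); // prints 1
        tracker.markDone(1);
        System.out.println(tracker.committablePosition()); // prints 3
    }
}
```

The consuming thread would read committablePosition() before each manual commit, so a slow handler holds the commit back instead of letting unprocessed messages be skipped after a restart.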