We have been observing that the KafkaConsumer API is significantly slower compared to the previous low-level Kafka API we were using (e.g., FetchRequest, FetchResponse, ByteBufferMessageSet). Below is a detailed overview of the issue and the current implementation, along with an explanation of the bottlenecks and potential optimization suggestions.
Performance Issues
Use Case:
The application requires fetching 1,000 messages starting from a specific user-provided offset and returning the next offset in the response (e.g., 1001 when starting at offset 1). That offset is then used as input for the subsequent request.
Despite setting MAX_POLL_RECORDS_CONFIG=1000, the consumer fetches only ~300 records per poll in ~2 seconds. Fetching 1,000 records typically takes ~4 polls, for a total of ~8–10 seconds.
I have tried tuning several consumer settings, including MAX_PARTITION_FETCH_BYTES_CONFIG, FETCH_MIN_BYTES_CONFIG, and MAX_POLL_RECORDS_CONFIG, and have increased max poll records. Even so, within a 2-second poll the consumer cannot fetch 1,000 records and sometimes returns 0 records.
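As a starting point for this kind of tuning, the sketch below groups the fetch-size settings that control how much data the broker accumulates per fetch. The numeric values are illustrative assumptions, not recommendations benchmarked against this cluster; plain string keys are used so the fragment stands alone without the ConsumerConfig constants.

```java
import java.util.Properties;

public class FetchTuning {
    // Illustrative fetch tuning: a larger fetch.min.bytes combined with a
    // bounded fetch.max.wait.ms lets the broker build bigger batches instead
    // of answering each fetch with a few hundred records.
    public static Properties fetchTuning() {
        Properties props = new Properties();
        props.setProperty("max.poll.records", "1000");             // cap records returned by one poll()
        props.setProperty("fetch.min.bytes", "1048576");           // broker waits for ~1 MiB of data...
        props.setProperty("fetch.max.wait.ms", "500");             // ...but at most 500 ms
        props.setProperty("max.partition.fetch.bytes", "5242880"); // up to 5 MiB per partition per fetch
        return props;
    }
}
```

These settings trade latency for throughput: fetch.min.bytes makes the broker delay its response until enough data is buffered, so it should stay comfortably below the poll timeout used by the application.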
Observed Delays:
Consumer Assignment and Seeking:
The time taken for consumer.assign() and consumer.seek() operations adds to the overall latency.
Polling:
The consumer.poll() call often returns fewer records than expected, resulting in multiple iterations to achieve the required batch size.
Comparison with Low-Level API:
The low-level Kafka API (e.g., FetchRequest and FetchResponse) performs better, with reduced latency for fetching records. It appears to bypass some of the high-level abstractions (e.g., consumer group coordination and offset management) that introduce overhead.
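Since the application assigns partitions and seeks to offsets explicitly, the group-coordination overhead mentioned above can be avoided by running the consumer in standalone "assign mode": with no group.id and auto-commit disabled, there is no coordinator or offset-commit traffic, which is closer to the old FetchRequest path. A minimal sketch, assuming offsets are tracked by the application itself (string keys are used so the fragment stands alone):

```java
import java.util.Properties;

public class AssignModeConfig {
    // Standalone configuration: without a "group.id", subscribe() is not
    // available, but assign()/seek()/poll() work and no group coordination
    // or rebalancing takes place. Auto-commit must stay off, since offset
    // commits also require a group.
    public static Properties assignModeProps(String brokers, String clientName) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", brokers);
        props.setProperty("client.id", clientName);
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```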
Consumer Creation Method
public static KafkaConsumer<String, String> createConsumer(String clientName, int fetchSize) {
    Properties props = new Properties();
    props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaBrokerStr);
    props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.setProperty(ConsumerConfig.CLIENT_ID_CONFIG, clientName);
    props.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1000");
    // Security and additional properties...
    return new KafkaConsumer<>(props);
}
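If createConsumer() is invoked per request, connection setup, metadata fetch, and authentication dominate the observed assign()/seek() latency. One option, sketched below with a hypothetical cache class, is to build each consumer once and reuse it across requests; note that KafkaConsumer is not thread-safe, so real code must confine each cached instance to a single thread or guard it with a lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class ConsumerCache<C> {
    // Hypothetical per-client cache: the factory runs once per client name,
    // and later requests reuse the same instance, so assign()/seek() operate
    // on an already-warm connection instead of a freshly created one.
    private final Map<String, C> cache = new ConcurrentHashMap<>();
    private final Function<String, C> factory;

    public ConsumerCache(Function<String, C> factory) {
        this.factory = factory;
    }

    public C get(String clientName) {
        return cache.computeIfAbsent(clientName, factory);
    }
}
```

Usage, assuming the createConsumer() method above: new ConsumerCache<>(name -> createConsumer(name, 1000)).get("reader-1").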
github-actions bot changed the title from "Subject: KafkaConsumer API Performance Challenges and Optimization Suggestions" to "SNOW-1820229: Subject: KafkaConsumer API Performance Challenges and Optimization Suggestions" on Nov 21, 2024.
gmungi changed the title from "SNOW-1820229: Subject: KafkaConsumer API Performance Challenges and Optimization Suggestions" to "Subject: KafkaConsumer API Performance Challenges and Optimization Suggestions (Kafka clients 3.6.1)" on Nov 21, 2024.
This issue falls outside the scope of Snowflake Kafka Connector support, as it pertains to tuning the Kafka Connect framework versus other Kafka APIs. For these types of questions, Confluent support is a good resource.
Hi All,
public static KafkaConsumer<String, String> createConsumer(String clientName, int fetchSize) {
    Properties props = new Properties();
    // Broker list and group id are read from external configuration.
    String kafkaBrokerStr = Config.getConsumerPropValue("kafkabrokerslist");
    String groupId = Config.getConsumerPropValue("group.id");
    props.setProperty(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaBrokerStr);
    props.setProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.setProperty(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    props.setProperty(ConsumerConfig.CLIENT_ID_CONFIG, clientName);
    props.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.setProperty(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1000");
    // Security and additional properties...
    return new KafkaConsumer<>(props);
}
public List<String> consume(long offset, String topicName, int partition, CEConsumeRequest inputReq) throws CustomException {
    List<String> msglist = new ArrayList<>();
    // ... (method body truncated in the original)
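The consume() loop described above (keep polling until 1,000 records are collected or a deadline passes, then return the next offset) can be sketched as follows. The poll call is abstracted behind a Supplier so the fragment is self-contained; in real code pollOnce would wrap consumer.poll(Duration), and the next-offset arithmetic assumes contiguous offsets (transactional markers or compaction can introduce gaps, in which case the last record's offset + 1 should be used instead).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class BatchFetcher {
    /** Result of one consume call: the records plus the offset to request next. */
    public static final class Batch {
        public final List<String> records;
        public final long nextOffset;
        Batch(List<String> records, long nextOffset) {
            this.records = records;
            this.nextOffset = nextOffset;
        }
    }

    // Accumulate poll results until `target` records are collected or the
    // deadline passes. An empty poll (common right after a seek, while the
    // first fetch is still in flight) is not treated as end-of-data; the
    // loop simply polls again until the deadline.
    public static Batch fetch(Supplier<List<String>> pollOnce, long startOffset,
                              int target, long deadlineMillis) {
        List<String> out = new ArrayList<>();
        long deadline = System.currentTimeMillis() + deadlineMillis;
        while (out.size() < target && System.currentTimeMillis() < deadline) {
            for (String record : pollOnce.get()) {
                out.add(record);
                if (out.size() == target) break; // drop nothing, stop exactly at target
            }
        }
        // Assumes contiguous offsets: next offset = start + records returned.
        return new Batch(out, startOffset + out.size());
    }
}
```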
Please suggest how to improve this.