Worker |
spark.mobius.CSharpWorker.maxProcessCount |
Sets max number of C# worker processes in Spark executors |
Streaming (Kafka) |
spark.mobius.streaming.kafka.CSharpReader.enabled |
Enables use of C# Kafka reader in Mobius streaming applications |
Streaming (Kafka) |
spark.mobius.streaming.kafka.maxMessagesPerTask.<topicName> |
Sets the max number of messages per RDD partition created from specified Kafka topic to uniformly spread load across tasks that process them |
Streaming (Kafka) |
spark.mobius.streaming.kafka.numPartitions.<topicName>[.<clusterId>] |
Sets RDD partitions to a different number from kafka parations per topic and per cluster(optional, defined as "cluster.id" in kafkaParams if the topic is from multiple kafka clusters) to uniformly and better spread load across tasks that process them |
Streaming (Kafka) |
spark.mobius.streaming.kafka.fetchRate |
Set the number of Kafka metadata fetch operation per batch |
Streaming (Kafka) |
spark.mobius.streaming.kafka.numReceivers |
Set the number of threads used to materialize the RDD created by applying the user read function to the original KafkaRDD. |
Streaming (UpdateStateByKey) |
spark.mobius.streaming.parallelJobs |
Sets 0-based max number of parallel jobs for UpdateStateByKey so that next N batches can start its tasks on time even if previous batch not completed yet. default: 0, recommended: 1. It's a special version of spark.streaming.concurrentJobs which does not observe UpdateStateByKey's state ordering properly |
Worker |
spark.mobius.CSharp.socketType |
Sets the socket type that will be used in IPC when transferring data between JVM and CLR. Valid values for this setting are: - Normal: default .Net Socket implementation will be used. This is the default socket type in Mobius.
- Rio: Windows RIO socket will be used. This option can be used only in Windows OS. To use this socket type, "Prefer 32-bit" option should be set to false when building Mobius driver program.
- Saea: .Net Socket implementation with SocketAsyncEventArgs class will be used
Riosocket and SaeaSocket has better performance when dealing with larger data transmission than traditional .Net Socket. Significant performance improvement has been observed by using RIO/SAEA socket types when the average size of each row in the data processed in Mobius is over 4KB. You can profile your application for different socket types and decide which one offers best performance for your data. Depending on the OS, either Rio (Windows-only) or Saea (Windows/Linux) socket types can be used for data with larger row sizes |
Worker |
spark.mobius.CSharpWorker.readBufferSize |
Sets the buffer size in bytes for data read operation from JVM to CSharpWorker. By default the value is 8KB if not explicitly specified. A typical scenario which can benefits a lot from this option is that CSharpWorker reads large amount of small records from JVM process. Please adjust the number based on your scenario. |
Worker |
spark.mobius.CSharpWorker.writeBufferSize |
Sets the buffer size in bytes for data write operation from CSharpWorker to JVM. The default value is 8KB. Usually better performance can be gained if specify this option with a proper value when CSharpWorker needs to sends lots of small records (multiple bytes size) back to JVM process. Please adjust the buffer size based on your scenario. |