ForEachDStreams

ForEachDStream is an internal DStream that depends on its parent stream and uses exactly the same slideDuration as that parent.

Its compute method never returns an RDD.

When generateJob is called, it returns a streaming job for a batch only if the parent stream generated an RDD for that batch. If it did, the job applies the "foreach" function (given as foreachFunc) to that RDD.
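
The behaviour described above can be sketched with a small, self-contained model. This is only an illustration under stated assumptions, not Spark's code: `ParentSketch`, `ForEachSketch`, `JobSketch` and `demo` are hypothetical names, a plain `Seq[Int]` stands in for an RDD, and a `Long` stands in for the batch time.

```scala
// Toy model of the behaviour described above; NOT Spark's actual classes.
// ParentSketch, ForEachSketch and JobSketch are hypothetical stand-ins.
object ForEachDStreamSketch {
  type Time = Long
  type Rdd = Seq[Int] // stand-in for RDD[T]

  // Stand-in for a streaming job: a batch time plus a deferred action.
  final case class JobSketch(time: Time, run: () => Unit)

  // Stand-in for the parent stream: it may or may not have an RDD for a batch.
  final class ParentSketch(batches: Map[Time, Rdd], val slideDuration: Long) {
    def getOrCompute(time: Time): Option[Rdd] = batches.get(time)
  }

  final class ForEachSketch(parent: ParentSketch, foreachFunc: (Rdd, Time) => Unit) {
    // Same slide duration as the parent.
    def slideDuration: Long = parent.slideDuration

    // Never produces an RDD of its own.
    def compute(time: Time): Option[Rdd] = None

    // A job exists for a batch only if the parent has an RDD for that batch.
    def generateJob(time: Time): Option[JobSketch] =
      parent.getOrCompute(time).map { rdd =>
        JobSketch(time, () => foreachFunc(rdd, time))
      }
  }

  // Runs two batches through the model and returns what foreachFunc observed.
  def demo(): List[(Time, Int)] = {
    val seen = scala.collection.mutable.ListBuffer.empty[(Time, Int)]
    val parent = new ParentSketch(Map(1000L -> Seq(1, 2, 3)), slideDuration = 1000L)
    val stream = new ForEachSketch(parent, (rdd, time) => { seen += ((time, rdd.sum)); () })
    // Batch 1000 has data; batch 2000 does not, so no job is generated for it.
    Seq(1000L, 2000L).flatMap(t => stream.generateJob(t)).foreach(job => job.run())
    seen.toList
  }
}
```

In this model, `demo()` runs `foreachFunc` only for the batch at time 1000, because the parent has nothing for the batch at time 2000 and `generateJob` therefore returns `None` for it.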

Note

Although ForEachDStreams may look like output streams by design, they are not. You have to use DStreamGraph.addOutputStream to register a stream as an output stream.

In practice, you use stream operators, like print, that do the registration as part of their operation.
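
Why registration matters can be shown with another small model. Again, this is a sketch under assumptions rather than Spark's implementation: `StreamSketch`, `GraphSketch` and `demo` are hypothetical names, and a job is represented by a plain string. The only idea carried over from the note is that the graph asks registered output streams, and no others, for jobs.

```scala
// Toy model of output-stream registration; NOT Spark's actual classes.
// StreamSketch and GraphSketch are hypothetical stand-ins.
object OutputRegistrationSketch {
  import scala.collection.mutable.ArrayBuffer

  // Stand-in for a ForEachDStream: it can describe a job for a batch time.
  final class StreamSketch(val name: String) {
    def generateJob(time: Long): Option[String] = Some(s"$name@$time")
  }

  // Stand-in for DStreamGraph: jobs come only from registered output streams.
  final class GraphSketch {
    private val outputStreams = ArrayBuffer.empty[StreamSketch]

    def addOutputStream(stream: StreamSketch): Unit = {
      outputStreams += stream
      ()
    }

    def generateJobs(time: Long): List[String] =
      outputStreams.toList.flatMap(s => s.generateJob(time))
  }

  def demo(): List[String] = {
    val graph = new GraphSketch
    // Created but never registered, so the graph never asks it for a job.
    val unregistered = new StreamSketch("unregistered")
    val registered = new StreamSketch("registered")
    graph.addOutputStream(registered)
    graph.generateJobs(1000L)
  }
}
```

Here only the stream passed to `addOutputStream` contributes a job. The unregistered one is silently ignored, which is the situation the note warns about.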