
Flume Frequently Asked Questions

These FAQs and answers are relative to the current release unless otherwise noted.

Setup + Installation

Why is Flume running out of memory?

Ensure that you have provided the JVM with sufficient heap space. By default Flume starts the JVM with its default heap allocation, which varies depending on the JVM version, the host type (OS, 32/64-bit, etc.), the total host memory available, and other factors.

The environment variable UOPTS can be used to pass additional JVM parameters when running Flume, e.g.

$ UOPTS="-Xms1g -Xmx2g" bin/flume node

which starts a Flume node with an initial heap of 1 GB and a max heap of 2 GB. See "java -h" or "java -X" for more details on available JVM options.

Why is ZooKeeper a dependency? Why is it included?

We use it to make the master reliable.

When installed from packages, Flume runs as the flume user, but it can't read certain files because they belong to root!

For now, we suggest creating a group that the flume user belongs to, assigning the files to that group, and giving group members read rights.
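
For example (a sketch; the group name 'logreaders' and the log path are hypothetical, so substitute your own):

$ groupadd logreaders                       # create a group for log access
$ usermod -a -G logreaders flume            # add the flume user to it
$ chgrp logreaders /var/log/myapp/app.log   # hand the file to that group
$ chmod g+r /var/log/myapp/app.log          # grant group read access

Note that newly rotated log files may need the same treatment unless the log directory's group ownership and permissions carry over to new files.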

How can I tell if the Flume master process is running?

You can point your browser at http://&lt;master hostname&gt;:35872/ and you should get a Flume-generated web page.

If you have the JDK installed, you can use the 'jps' program to find out the names of the Java processes currently running. You should see something like this:

$ jps
4711 FlumeMaster
4677 FlumeWatchdog

If not, you can run 'ps aux | grep Flume' to see if it is running:

$ ps aux | grep Flume
flume 4677 0.0 0.1 2383292 32140 ? Sl Sep19 1:42 java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m -Dpid=4677 -Dpidfile=/tmp/flumemaster.pid com.cloudera.flume.watchdog.FlumeWatchdog java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m com.cloudera.flume.master.FlumeMaster
flume 4711 0.0 0.2 2536336 55900 ? Sl Sep19 0:09 java -Dflume.log.dir=/var/log/flume -Dflume.log.file=flume-flume-master-monster01.sf.cloudera.com.log -Dflume.root.logger=INFO,DRFA -Dzookeeper.root.logger=INFO,zookeeper -Dwatchdog.root.logger=INFO,watchdog -Djava.library.path=/usr/lib/flume/lib -Xmx2000m com.cloudera.flume.master.FlumeMaster

Configuration

I’ve edited my flume-site.xml but my changes aren’t showing up!

You may have edited a copy of the file that is not where Flume expects it. Try starting the Flume node manually by going to the command line and entering:

$ flume node

In the first few lines of output there should be something like:

10/07/21 10:25:20 INFO conf.FlumeConfiguration: Loading configurations
from /etc/flume/conf

The flume-site.xml file that you edit should be in that directory
(in this case '/etc/flume/conf/flume-site.xml').

I seem to have active nodes using a tail source but nothing seems to show up.

Check the logs and the permissions of the files to make sure Flume has permission to read them. Oftentimes logs are owned by root or another system user, while Flume runs as the user who started it or as the user 'flume', which may not have read access.

Data is saved to HDFS in *.seq sequence files. How do I make Flume save it in something closer to its original form?

You need to set the output format property in your flume-site.xml file:

  <property>
    <name>flume.collector.output.format</name>
    <value>raw</value>
    <description>The output format for the data written by a Flume 
    collector node.  There are several formats available:
      syslog - outputs events in a syslog-like format
      log4j - outputs events in a pattern similar to Hadoop's log4j pattern 
      avrojson - this outputs data as JSON encoded by Avro
      avrodata - this outputs data as Avro binary encoded data
      debug - used only for debugging
      raw - output only the event body, no metadata
    </description>
  </property>  

How do I change the maximum raw event size?

Set the flume.event.max.size.bytes property in your flume-site.xml file to the desired maximum event size in bytes.
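
For example (a sketch; the 1 MB value is illustrative, not a recommended default):

  <property>
    <name>flume.event.max.size.bytes</name>
    <value>1048576</value> <!-- 1 MB; choose a limit larger than your biggest events -->
  </property>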

Does this version support output file compression?

Yes. v0.9.1 supports gzip compression, and v0.9.2 supports any compression codec Hadoop supports.
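
In v0.9.2+ the codec is selected with a collector property along these lines (a sketch; check your version's flume-conf.xml for the exact property name and the codec values your Hadoop installation provides):

  <property>
    <name>flume.collector.dfs.compress.codec</name>
    <value>GzipCodec</value> <!-- e.g. GzipCodec, BZip2Codec; None disables compression -->
  </property>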

autoDFOChain stuff seems to hang!

That bug should be fixed in v0.9.1u1 and v0.9.2+.

Using an agent or auto E2E sink results in many periodic duplicates!

To use the E2E reliability mode, you currently must use the collectorSink at the end point! The collectorSink contains the code that checks and responds to the acking and flushing logic injected by the ackedWriteAhead decorator, which is used/generated by the auto/agent E2E sinks.

Also, make sure that if you change flume.collector.roll.millis, you change flume.agent.logdir.retransmit to a value at least twice as big.
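
For example (a sketch with illustrative values: a 60-second roll and a retransmit timeout 2.5x that):

  <property>
    <name>flume.collector.roll.millis</name>
    <value>60000</value> <!-- collector rolls files every 60 s -->
  </property>
  <property>
    <name>flume.agent.logdir.retransmit</name>
    <value>150000</value> <!-- at least 2x roll.millis so acks have time to arrive -->
  </property>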

Huh? Why isn't the ack checking stuff in the collectorSource?

To guarantee data gets written, we can only send acknowledgements after we have successfully written. Sinks do the writing so only they can send the acknowledgement signals!

I get this exception in my logs:

2136264 [pool-1-thread-3] ERROR org.apache.thrift.server.TSaneThreadPoolServer - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:201)
        at com.cloudera.flume.conf.thrift.FlumeClientServer$Processor.process(FlumeClientServer.java:290)
        at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

This happens when an incorrect client attempts to talk to one of the Thrift services. You may have accidentally added a port to flume.master.servers or some other place.

Plugins

The HBase plugin doesn't seem to be loading its configuration file. Where should it go?

The hbase-site.xml file should be put in a directory that is on Flume's classpath. This means you could put the file in ./flume/conf/ or add the directory containing it to FLUME_CLASSPATH.
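
For example (a sketch; /etc/hbase/conf is a hypothetical location for your hbase-site.xml):

$ export FLUME_CLASSPATH=/etc/hbase/conf
$ bin/flume node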

Dev Stuff

Where can I get a tar ball with source?

http://archive.cloudera.com/cdh/3/

I can start a Flume Master or Flume Node in Eclipse, but I can't seem to load the web pages! I get a flumemaster.jsp not found error!?

The default setting is to precompile the JSPs into Java code. Currently these are generated when 'ant' is run from the command line. The Java servlets are written to ./build/src/. You need to make sure to add 'build/src' to your Eclipse build path.

I’ve found a problem: bug with the program, typo in the docs, etc.

Please let us know! We use a system called JIRA for bug reporting, tracking, and resolution. File an issue there to let us know what you have found! Please include the version, the component (if you can tell), and ideally a way to reproduce the bug!

Flume has weak ordering guarantees.

Flume has weaker guarantees than some other systems (message queues, for example) in the interest of moving data around more quickly and enabling cheaper fault tolerance. The idea is to minimise the amount of state that Flume has to keep; replicated state is what makes fault tolerance hard and makes reasoning about failure conditions difficult. In Flume's end-to-end reliability mode, events are delivered at least once, but with no ordering guarantees. We've found this sufficient for using Flume as a data conduit, since messages can be de-duplicated either at write time or by a post-hoc batch process. However, this means that Flume is harder to use as a message passing or eventing framework unless your application is set up to be idempotent with respect to duplicate events and no causal relationship between events needs to be preserved upon delivery.

There are two ways that events may be re-ordered:

1. They are transmitted in DFO or E2E mode, and a failure delays them until after the successful delivery of some chronologically later events. The agent will try to retransmit unacknowledged events, but that can happen after later events have already been delivered just fine.

2. The network reorders the packets. That can't happen over current TCP transports (buffering and reordering are done at the receiver), but I can't rule out a move to UDP, precisely because we don't need those guarantees.

You can always reconstruct causal order after all events are delivered by looking at their timestamps, but at the time of delivery you don't know whether there are events you missed, unless you attach sequence numbers to each. If you are using Flume for alerting, you just need to track when the last interesting state occurred. Say you received an ERROR notification with timestamp t: just make sure you save t and silently drop any messages that arrive after it with timestamps < t, as in the sketch below.
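
That rule can be captured in a tiny filter like this (a minimal sketch; LatestAlertFilter is a hypothetical helper, not part of Flume):

// Remembers the timestamp t of the newest interesting event handled so far
// and silently drops later-arriving events that are chronologically older.
public class LatestAlertFilter {
  private long lastInterestingTs = Long.MIN_VALUE; // the saved t

  /** Returns true if the event should be acted on; false if it is stale. */
  public synchronized boolean accept(long eventTimestamp) {
    if (eventTimestamp < lastInterestingTs) {
      return false; // arrived late: a newer alert was already handled
    }
    lastInterestingTs = eventTimestamp;
    return true;
  }
}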

In BE mode, events should currently arrive in order, but it's possible they could be delivered to different collectors if you have more than one. You also have to be aware that events could be arbitrarily delayed, although the delay you see in BE mode should be less than in DFO or E2E (i.e. events are usually delivered quickly, or not at all).
