Components of Apache Flume
What is the role of interceptors in Apache Flume? Interceptors let us inspect or alter Flume Events in flight. They are placed between the Source and the Channel, so every event headed for the Channel can be inspected, modified, or dropped if we don’t want it passed on.
For example, say we are streaming all the log files from an application server. It will produce several types of log, such as info, warning, and trace logs. If we are not interested in the trace logs, an interceptor can filter them out so that they never reach the Channel.
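The trace-log example can be sketched as a configuration fragment using Flume's built-in regex_filter interceptor. The agent name a1, source name r1, and interceptor name i1 are placeholders, and the pattern assumes that every trace event contains the literal text TRACE in its body; a real log format may need a stricter pattern.

```properties
# Attach one interceptor, i1, to source r1 of agent a1 (names are placeholders).
a1.sources.r1.interceptors = i1

# regex_filter matches the regex against the event body.
a1.sources.r1.interceptors.i1.type = regex_filter
a1.sources.r1.interceptors.i1.regex = .*TRACE.*

# excludeEvents = true drops matching events; the default (false) keeps only matches.
a1.sources.r1.interceptors.i1.excludeEvents = true
```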
An Interceptor is attached to a Source and always sits between that Source and the Channel; Flume does not place interceptors between the Channel and the Sink. There need not be only a single interceptor; a Source can have several. If there is more than one interceptor, each Flume event passes through all of them one by one, in the order in which they are configured. Each interceptor contains its own logic and can either drop an undesirable event or modify the event, for example by adding a timestamp or host name to its headers. Hence interceptors serve two purposes: filtering out unwanted data and enriching the data that remains. So, this is the role of an interceptor in Apache Flume.
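A chain of interceptors might be configured as in the sketch below. It uses Flume's built-in timestamp, host, and regex_filter interceptors; the names a1, r1, i1, i2, and i3 are placeholders, and the TRACE pattern is an assumption about the log format.

```properties
# Interceptors run in the order listed here: i1, then i2, then i3.
a1.sources.r1.interceptors = i1 i2 i3

# Enrichment: add a "timestamp" header with the processing time in milliseconds.
a1.sources.r1.interceptors.i1.type = timestamp

# Enrichment: add a "host" header; useIP = false records the hostname instead of the IP.
a1.sources.r1.interceptors.i2.type = host
a1.sources.r1.interceptors.i2.useIP = false

# Filtering: drop events whose body matches the pattern.
a1.sources.r1.interceptors.i3.type = regex_filter
a1.sources.r1.interceptors.i3.regex = .*TRACE.*
a1.sources.r1.interceptors.i3.excludeEvents = true
```

Putting the filter last means the dropped events were enriched for nothing; moving i3 to the front of the list would avoid that work.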
The Sink Processor is another important component of Apache Flume. It operates on a group of sinks and has two main uses:
- Failover: the Sink Processor is the mechanism through which we can create failover paths, so that if one sink fails, events are sent to another sink.
- Load balancing: when there are a lot of events, we can configure the Sink Processor to distribute the load across multiple sinks.
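Both uses can be sketched with sink groups, as in the configuration fragment below. The agent name a1, the group names g1 and g2, and the sink names k1 to k4 are placeholders, and the sinks themselves are assumed to be defined elsewhere in the same file.

```properties
# Two sink groups on agent a1 (all names are placeholders).
a1.sinkgroups = g1 g2

# g1: failover. The available sink with the highest priority value is used;
# if it fails, events go to the next one.
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 10
a1.sinkgroups.g1.processor.priority.k2 = 5
# Upper limit, in milliseconds, on the backoff period for a failed sink.
a1.sinkgroups.g1.processor.maxpenalty = 10000

# g2: load balancing. Events are spread across k3 and k4.
a1.sinkgroups.g2.sinks = k3 k4
a1.sinkgroups.g2.processor.type = load_balance
# Selection mechanism: round_robin or random.
a1.sinkgroups.g2.processor.selector = round_robin
# Temporarily skip sinks that have failed.
a1.sinkgroups.g2.processor.backoff = true
```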