Fluentd (official website) and Logstash (official website) are open source data collectors, mainly used to transport log data to a centralized location or server. In this article we compare Fluentd and Logstash as data collectors for transporting log data from various servers to a centralized server. We compare them on the following metrics:

  • Installation & Setup
  • Features & Performance

We recently had to choose between Fluentd and Logstash for use with Elasticsearch and Kibana, a setup that will eventually be used for analytics and tracking. Below we list the important points under each of the above metrics that directly or indirectly influenced “our requirements”. After the comparison we tell you which one we chose and why. If you are in a similar boat, we suggest you choose between Fluentd and Logstash based on “your own project requirements”, by evaluating which one suits them best.

 

Fluentd Vs Logstash

 

Installation & Setup
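
                     | Fluentd                     | Logstash
---------------------+-----------------------------+-------------------------------------------
Installation Process | Installer, Ruby Gem, Source | Repository (apt/yum) or Compressed Archive
Size on Disk         | 165.3M                      | 154.9M
Difficulty           | Easy                        | Easy
Supported Platforms  | Linux, Mac OS X, Windows    | Linux, Mac OS X
Requirements         | -                           | JVM
Technology           | CRuby                       | JRuby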

 

Features & Performance

                                   | Fluentd                                                    | Logstash
-----------------------------------+------------------------------------------------------------+--------------------------------------------
Input Routing                      | Tagging; better for complex routing                        | Algorithmic statements; better for structural or procedural programming
Data Transportation                | Highly configurable buffering system, in-memory or on-disk | No built-in persistent message queue; persistence relies on a third-party message queue such as Redis or ZeroMQ (check out this GitHub issue)
Available Transportation Protocols | Active-Standby, Active-Active, Require Acknowledgement     | Active-Standby
Configuration                      | Simple                                                     | Complex
Extensibility                      | Supported via plugins (Plugins List)                       | Supported via plugins (Plugins Repository)
Memory Consumption (Avg)           | 20 – 40 MB                                                 | 100 – 130 MB

The transportation protocols in the table work as follows:

  • Active-Standby: logs are sent to a secondary server in case the primary fails.
  • Active-Active: logs are sent to multiple destinations via load balancing and/or weighted load balancing.
  • Require Acknowledgement: logs are retransmitted until an acknowledgement is received back. This may affect performance, though.
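
The delivery modes named in the table (Active-Standby, Active-Active and Require Acknowledgement) can be sketched in a few lines of Python. This is an illustrative model under our own assumptions, not Fluentd's or Logstash's implementation: the `Server` class, `send_active_standby`, `pick_weighted` and the `max_attempts` parameter are all hypothetical names, and a "send" here is just a method call that returns True when the receiver acknowledges the record.

```python
import random


class Server:
    """Hypothetical stand-in for a log receiver (not a real Fluentd/Logstash API)."""

    def __init__(self, name, weight=1, fail_first=0):
        self.name = name
        self.weight = weight
        self.fail_first = fail_first  # number of initial sends that go unacknowledged
        self.received = []

    def send(self, record):
        """Return True if the record was stored and acknowledged."""
        if self.fail_first > 0:
            self.fail_first -= 1
            return False
        self.received.append(record)
        return True


def send_active_standby(record, primary, standby, max_attempts=3):
    """Retransmit to the primary until it acknowledges, then fall back to the standby."""
    for _ in range(max_attempts):
        if primary.send(record):
            return primary.name
    # The primary never acknowledged: treat it as failed and use the standby.
    if standby.send(record):
        return standby.name
    raise RuntimeError("no server acknowledged the record")


def pick_weighted(servers, rng=random):
    """Active-Active: choose a destination in proportion to its weight."""
    return rng.choices(servers, weights=[s.weight for s in servers], k=1)[0]
```

For example, with `Server("primary", fail_first=2)` the third attempt is acknowledged and the record stays on the primary. Those extra round trips are the performance cost that acknowledgements can introduce.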

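To make the Input Routing comparison more concrete, here is a minimal Python sketch of the two styles. It is not code from either project: `tag_matches`, `route_by_tag`, `route_by_conditionals` and the destination names are hypothetical, the tag patterns are a simplified imitation of Fluentd's match patterns (`*` for exactly one tag part, `**` for zero or more parts), and the if/elif chain is our reading of what the table calls algorithmic statements in Logstash.

```python
def tag_matches(pattern, tag):
    """Simplified tag matching: '*' = exactly one part, '**' = zero or more parts."""
    def match(p, t):
        if not p:
            return not t
        if p[0] == "**":
            # '**' may swallow zero or more of the remaining tag parts
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        return (p[0] == "*" or p[0] == t[0]) and match(p[1:], t[1:])
    return match(pattern.split("."), tag.split("."))


# Tag style: routing is a table of patterns, and the first match wins.
TAG_ROUTES = [
    ("app.error.**", "alerts"),
    ("app.*", "app_index"),
    ("**", "default"),
]


def route_by_tag(tag):
    for pattern, destination in TAG_ROUTES:
        if tag_matches(pattern, tag):
            return destination
    return None


# Conditional style: routing is written as procedural branches on event fields.
def route_by_conditionals(event):
    if event.get("type") == "app" and event.get("level") == "error":
        return "alerts"
    elif event.get("type") == "app":
        return "app_index"
    else:
        return "default"
```

In the tag style, adding a route means adding a row to the table. In the conditional style, it means adding another branch, which feels natural if you are used to procedural code but grows harder to follow as the routing rules multiply.
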
What we chose and why

So after all the evaluation, hours of research, comparison and discussion, we decided to go with the EFK stack and use Fluentd to transport our logs to our centralized logging servers. The reason, as mentioned earlier, was our exact requirements and not just the comparison points listed above. I personally was in favour of Logstash, as it not only collects and transports log data but can also process it and be used to analyze the collected data. In our case, however, Kibana was a requirement for data representation and analysis: the data was to be used by our analytics team, who are much more comfortable with a graphical representation on a web dashboard. With Kibana being a requirement, we felt we would not be using Logstash to its fullest. We would miss out on features like filtering and codecs and use it merely for transportation. This led us to decide in favour of Fluentd. Let us know in the comments below what worked for you and why you chose it.

If you would like to know more about the EFK stack, follow this Elasticsearch, Fluentd and Kibana installation, configuration and setup guide.