ELK Stack with log4j

Boring introduction that no one ever reads

ELK stack (or Elastic Stack) consists of three programs that, used together, act as a full-scale centralized logging and log analytics solution. Those programs are Elasticsearch, Logstash and Kibana:
  • Logstash acts as a log router. Multiple applications can ship their logs to Logstash at the same time, where they are filtered, transformed and sent to one or more outputs according to one simple configuration file. For the setup to be called an ELK stack, Logstash must send the logs to Elasticsearch.
  • Elasticsearch is a full-text search engine much like Solr (both use Apache Lucene as the underlying storage) that stores the logs and lets you run queries to retrieve log entries. As soon as a log entry comes into ES, it's marked with a timestamp and indexed. Later, it can be retrieved almost instantaneously thanks to the Lucene index.
  • This is where Kibana steps in. Kibana performs queries on ES and visualizes the logs. A picture is worth a thousand words, so without further ado here are some screenshots of Kibana visualizing logs from an app I've worked on recently:

Among the many Logstash input plugins, there used to be a separate one for log4j users. Since the introduction of a new product in the ELK lineup, called Beats, and the subsequent rebranding and re-orientation, the Elastic Stack creators mostly recommend Beats as a one-stop shop for all kinds of inputs and a replacement for the wide range of plugins they used to support. That's how the log4j input plugin got deprecated:

This tutorial explores the solutions for log4j input (except the new-fangled Beats; to read about that, check this or this one link) and points you to the best alternative.

Solutions for log4j input

  • File input (the old-school approach). It's the simplest way of getting the precious logs from your apps. All you have to do is point to your log files in the .conf file:
    The upside of this approach is that you won't have to change your application's code; the downside is the rigidity of the log format. The unstructured log messages can and should be parsed with grok, but the grok expression will have to be rewritten for each new logging framework and log message layout.
    Unstructured logs read from files.

  • Graylog Extended Log Format (GELF). Looking for a networked alternative to the file input, I stumbled upon a log format I'd never heard of before. GELF, unlike the plain old log message, is a structured log format. Each log entry is represented as a JSON document, with any conceivable number of fields.
    I realized that was something I wanted to try out, and did. The results and a detailed how-to follow in the next section.
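To make "structured" a bit more concrete, here is a minimal sketch in plain Java of what a single GELF entry could look like on the wire. It is not how the log4j appender used later in this tutorial is implemented; the class name GelfSketch, the host name app01 and the logger field are made up for illustration. The version, host, short_message, timestamp and level fields and the underscore prefix for additional fields are taken from the GELF 1.1 payload format. The payload is sent uncompressed for simplicity; GELF also allows gzip or zlib compression, and I'm assuming the receiver accepts the uncompressed form.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class GelfSketch {

    /** Escapes a string so it can sit inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a GELF 1.1 JSON document; additional fields get the "_" prefix. */
    static String buildGelf(String host, String shortMessage, int syslogLevel,
                            double timestampSeconds, Map<String, String> extraFields) {
        StringBuilder json = new StringBuilder("{");
        json.append("\"version\":\"1.1\"");
        json.append(",\"host\":\"").append(escape(host)).append('"');
        json.append(",\"short_message\":\"").append(escape(shortMessage)).append('"');
        // Seconds since the epoch, with milliseconds as the fractional part.
        json.append(",\"timestamp\":").append(String.format(Locale.ROOT, "%.3f", timestampSeconds));
        json.append(",\"level\":").append(syslogLevel);
        for (Map.Entry<String, String> e : extraFields.entrySet()) {
            json.append(",\"_").append(escape(e.getKey())).append("\":\"")
                .append(escape(e.getValue())).append('"');
        }
        return json.append('}').toString();
    }

    /** Sends the JSON as a single uncompressed UDP datagram. */
    static void sendUdp(String json, String host, int port) throws Exception {
        byte[] payload = json.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(host), port));
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("logger", "AUDIT"); // hypothetical additional field
        String json = buildGelf("app01", "User logged in", 6,
                System.currentTimeMillis() / 1000.0, extra);
        System.out.println(json);
        sendUdp(json, "localhost", 12201); // port used throughout this tutorial
    }
}
```

Since every field arrives already named, nothing on the receiving side has to guess where the level ends and the message begins, which is exactly what the file input approach has to do with grok.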

Log4j GELF log shipping how-to

ELK download and configuration

https://www.elastic.co/downloads has all the software you'll need to set up your very own ELK stack:

After you've downloaded and unzipped all the files, the only bit of configuration you'll have to do is the Logstash config. Create this file in the logstash/config directory:
gelf.config
With this config, Logstash accepts GELF log messages over UDP on localhost:12201. Now, the entire stack can be launched from one .bat file:

Log4j configuration

Configure your application to send logs to the Logstash input (GELF-formatted, to localhost:12201). For instance, log4j.properties will look like this:
log4j.properties
To write to the separate audit log, you'll have to grab the logger like this: getLogger("AUDIT"), but that's outside the scope of this tiny tutorial.
The GelfLogAppender requires an additional dependency (Maven):
That's all you need to try out the ELK stack with a log4j appender on your PC! Thanks to the rubydebug output section in the Logstash config, the messages stored in the ES index can also be viewed in the console window.

Some useful info to learn more about ELK:

On a final note, here are some of the links I used extensively while making this tutorial.
Configuring Logstash
Field manipulation, if-then conditions, etc...
Full list of Logstash input plugins
List of Logstash filter plugins
Full list of Logstash output plugins
Logstash GELF library for all major logging frameworks
The power of the filter section is mostly concentrated in so-called grok sections: regexp-like log transformers that are much simpler to write than actual regexps. (example)
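For a feel of what such a transformer does under the hood, here is a rough analogy in plain Java. It is not grok itself, just a hand-written regular expression with named groups. The log layout is an assumption on my part (something like the log4j pattern "%d{ISO8601} %-5p [%c] %m"), and the class name GrokLikeSketch is made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GrokLikeSketch {

    // Assumed (hypothetical) log4j layout: "%d{ISO8601} %-5p [%c] %m", e.g.
    // "2017-03-01 12:00:00,123 INFO  [com.example.App] Server started"
    static final Pattern LINE = Pattern.compile(
            "^(?<timestamp>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+"
          + "(?<level>[A-Z]+)\\s+"
          + "\\[(?<logger>[^\\]]+)\\]\\s+"
          + "(?<message>.*)$");

    /** Returns the named field, or null when the line does not follow the assumed layout. */
    static String field(String line, String name) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        String line = "2017-03-01 12:00:00,123 INFO  [com.example.App] Server started";
        for (String name : new String[] {"timestamp", "level", "logger", "message"}) {
            System.out.println(name + " = " + field(line, name));
        }
    }
}
```

Grok spares you from writing the raw expression by letting you combine ready-made named patterns, but the limitation is the same: as soon as the layout changes, the expression stops matching and has to be rewritten.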