Push Application Logs to Elasticsearch and Kibana

In my previous articles I covered in detail what the ELK stack (Elasticsearch, Logstash, and Kibana) is, along with installation steps. In this post we will specifically cover how to automatically push your application logs to ELK. If you are using centralized logging for your different applications, it is important to forward application logs to a centralized log aggregation server, which in our case is the ELK stack. The task of forwarding logs to Elasticsearch, either via Logstash or directly, is done by an agent. The agent's only job is to forward the logs to the pre-defined destination configured in the agent itself. So, let's go ahead and see how to do this interesting thing. P.S.: We will be using the open source versions of all software (this note applies to all my posts :))

Architecture

Before jumping into the architecture, let's first understand the different components involved.
  1. Server
    • This is the physical instance where the application or service is hosted.
  2. Application
    • This is the service that is hosted on the server.
  3. Filebeat
    • Filebeat is a lightweight agent for forwarding and centralizing log data. Installed as an agent on your servers, Filebeat monitors the log files or locations that you specify, collects log events, and forwards them either to Elasticsearch or Logstash for indexing.
    • When you start Filebeat, it starts one or more inputs that look in the locations you’ve specified for log data. For each log that Filebeat locates, Filebeat starts a harvester. Each harvester reads a single log for new content and sends the new log data to libbeat, which aggregates the events and sends the aggregated data to the output that you’ve configured for Filebeat.
  4. Logstash
    • Logstash is an open source data collection engine with real-time pipelining capabilities. Logstash can dynamically unify data from different sources and normalize the data into destinations of your choice. Cleanse and democratize all your data for diverse advanced downstream analytics and visualization use cases.
  5. Elasticsearch
    • Elasticsearch is a distributed search and analytics engine. It is the component of the ELK stack that does the indexing, search, and analysis. Elasticsearch provides near real-time search and analytics for all types of data. Whether you have structured or unstructured text, numerical data, or geospatial data, Elasticsearch can efficiently store and index it in a way that supports fast searches.
  6. Kibana
    • Kibana is an open-source analytics and visualization platform. Use Kibana to explore your Elasticsearch data and build beautiful visualizations and dashboards. Using Kibana you can also manage your security settings, assign user roles, take snapshots, roll up your data, and more, all from the convenience of the Kibana UI.
To give you an idea of how these components fit together, I will explain them through the high-level architectures of the solution below. There are two approaches here, and we will discuss both.

First approach: Forward logs using Filebeat to Logstash, and Logstash pushes them to Elasticsearch.

[Diagram: application server with Filebeat -> Logstash -> Elasticsearch -> Kibana]

As you can see in the above architecture, the Filebeat agent needs to be up and running on the application server instance. Filebeat continuously monitors the application logs (the paths configured in the agent) and pushes any new changes to Logstash. Logstash should be up and running on a separate server instance; its task is to read the logs sent by Filebeat, process them, and finally index them into Elasticsearch. Once logs are indexed in Elasticsearch, they are automatically available to view in the Kibana dashboard.

Second approach: Forward logs using Filebeat directly to Elasticsearch, skipping Logstash.

[Diagram: application server with Filebeat -> Elasticsearch -> Kibana]

In this approach, Filebeat forwards logs directly to Elasticsearch, skipping Logstash. However, this approach does not provide a way to process and format your logs as per your needs, which is what the Logstash component offers.

How it Works

Filebeat consists of two main components: inputs and harvesters. These components work together to tail files and send event data to the output that you specify.

What is a harvester?

A harvester is responsible for reading the content of a single file. The harvester reads each file, line by line, and sends the content to the output. One harvester is started for each file. The harvester is responsible for opening and closing the file, which means that the file descriptor remains open while the harvester is running. If a file is removed or renamed while it’s being harvested, Filebeat continues to read the file.
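Harvester behavior is tunable per input (inputs are described next). Here is a minimal sketch using two real log-input options; the values are illustrative assumptions, not recommendations:
filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
  # Close the harvester (and release the file descriptor) if the file
  # has not changed for 5 minutes (illustrative value).
  close_inactive: 5m
  # How often Filebeat checks the configured paths for new files
  # (illustrative value).
  scan_frequency: 10s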

What is an input?

An input is responsible for managing the harvesters and finding all sources to read from. If the input type is log, the input finds all files on the drive that match the defined glob paths and starts a harvester for each file. Each input runs in its own Go routine. The following example configures Filebeat to harvest lines from all log files that match the specified glob patterns:
filebeat.inputs:
- type: log
  paths:
    - /var/log/*.log
    - /var/path2/*.log
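If your application writes multi-line events, such as Java stack traces, the log input also supports multiline settings so that one event is not split into many. A hedged sketch; the path is hypothetical and the pattern assumes log lines that start with an ISO-style date:
filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/*.log   # hypothetical application log path
  # Lines that do NOT start with a date are appended to the previous
  # event, so a stack trace is shipped as one event.
  multiline.pattern: '^\d{4}-\d{2}-\d{2}'
  multiline.negate: true
  multiline.match: after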

How does Filebeat keep the state of files?

Filebeat keeps the state of each file and frequently flushes the state to disk in the registry file. The state is used to remember the last offset a harvester was reading from and to ensure all log lines are sent. If the output, such as Elasticsearch or Logstash, is not reachable, Filebeat keeps track of the last lines sent and will continue reading the files as soon as the output becomes available again. While Filebeat is running, the state information is also kept in memory for each input. When Filebeat is restarted, data from the registry file is used to rebuild the state, and Filebeat continues each harvester at the last known position. For each input, Filebeat keeps a state of each file it finds. Because files can be renamed or moved, the filename and path are not enough to identify a file. For each file, Filebeat stores unique identifiers to detect whether a file was harvested previously.
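In Filebeat 7.x, the registry location and flush interval can be configured in filebeat.yml. A minimal sketch with illustrative values:
# Directory (under the Filebeat data path) where registry data is stored.
filebeat.registry.path: registry
# Flush registry state to disk at most once per second (illustrative value).
filebeat.registry.flush: 1s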

How does Filebeat ensure at-least-once delivery?

Filebeat guarantees that events will be delivered to the configured output at least once and with no data loss. Filebeat is able to achieve this behavior because it stores the delivery state of each event in the registry file. In situations where the defined output is blocked and has not confirmed all events, Filebeat will keep trying to send events until the output acknowledges that it has received the events.
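One related setting: by default Filebeat may be stopped before the output acknowledges all in-flight events, and those events are re-sent after restart (which can mean duplicates). If you want Filebeat to wait for acknowledgements on shutdown, there is a shutdown timeout option; the value below is an illustrative assumption:
# Wait up to 5 seconds on shutdown for the output to acknowledge
# all events published before Filebeat stops.
filebeat.shutdown_timeout: 5s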

Installation & Configuration

Now, let's see how to download and install the different components. We will be using the open source versions of all components.

Open Distro Elasticsearch

Please follow the below steps to download and install Elasticsearch.
  1. Download the ZIP file.
  2. Extract the file to a directory, and open that directory at the command prompt.
  3. Edit the “elasticsearch.yml” file located in the “config” directory.
    • Change the below property value to false; this disables HTTPS on Elasticsearch.
    • opendistro_security.ssl.http.enabled: false
  4. Run Open Distro for Elasticsearch:
    .\bin\elasticsearch.bat
    
Now, let's verify that Elasticsearch was installed successfully. After you start Open Distro for Elasticsearch, open a new command prompt window. Then send requests to the server to verify that it is up and running:
curl -XGET http://localhost:9200 -u "admin:admin"
curl -XGET http://localhost:9200/_cat/plugins?v -u "admin:admin"
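If Elasticsearch is up, the first request returns a small JSON document describing the node and cluster. The sketch below shows only the general shape; the names and version number on your machine will differ:
{
  "name" : "node-1",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "7.10.2"
  },
  "tagline" : "You Know, for Search"
}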

Open Distro Kibana

Please follow the below steps to download and install Kibana.
  1. Download the ZIP.
  2. Extract the ZIP file to a directory and open that directory at the command prompt.
  3. Edit the “kibana.yml” file located in the “config” directory.
    • Update the below property so that Kibana connects to the Elasticsearch host over HTTP (not HTTPS).
    • elasticsearch.hosts: http://localhost:9200
  4. Run Kibana:
    .\bin\kibana.bat
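Once Kibana starts, you can verify it by opening http://localhost:5601 in a browser, or by querying Kibana's status API from another command prompt. A quick check, assuming the default host and port (depending on your security settings, you may be prompted for credentials):
curl -XGET http://localhost:5601/api/status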
 

Logstash

Follow the below steps to install Logstash.
  1. Download the open source version of Logstash from this link.
  2. Unzip the archive and create a logstash-filter.conf file in the config folder.
  3. Use the below configuration only when you want to forward logs from Filebeat to Logstash (the first approach):
input {
  beats {
    port => 5044
    ssl => false
    ssl_verify_mode => "none"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    user => "admin"
    password => "admin"
  }
  stdout { codec => rubydebug }
}
4. Run the below command to start Logstash using our conf file:
D:\logstash-7.10.2\bin>logstash -f ..\config\logstash-filter.conf
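Optionally, before starting Logstash for real, you can validate the pipeline syntax using Logstash's built-in test-and-exit flag:
D:\logstash-7.10.2\bin>logstash -f ..\config\logstash-filter.conf --config.test_and_exit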

Filebeat

Follow the below steps to download and install Filebeat.
  1. Download the Apache 2.0 licensed distribution of Filebeat from here.
  2. Unzip the archive and edit the filebeat.yml file.
  3. To forward logs directly to Elasticsearch (the second approach), use the below configuration. Make sure to comment out the “Logstash Output” section. (A sketch of the corresponding Logstash output for the first approach is shown right after this block.)
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "http"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "admin"
  password: "admin"
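Conversely, for the first approach (Filebeat to Logstash), you would comment out the “Elasticsearch Output” section and enable the Logstash output instead. A minimal sketch, assuming Logstash listens on the default Beats port 5044 from our pipeline above:
output.logstash:
  # The Logstash hosts (host:port of the beats input configured earlier).
  hosts: ["localhost:5044"]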
4. Configure the Filebeat log input. All the logs from the path “D:\setups\logs\” will be processed.
filebeat.inputs:

# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.

- type: log

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - D:\setups\logs\*
5. Run Filebeat using the below command:
D:\setups\filebeat-7.12.1-windows-x86_64>filebeat.exe -e -c filebeat.yml
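Before relying on it, you can also ask Filebeat to validate the configuration and test connectivity to the configured output; both subcommands below are built into Filebeat:
D:\setups\filebeat-7.12.1-windows-x86_64>filebeat.exe test config -c filebeat.yml
D:\setups\filebeat-7.12.1-windows-x86_64>filebeat.exe test output -c filebeat.yml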
 

Execution Result

Now, let's walk through a successful execution of the process step by step.
  1. Start Elasticsearch service
  2. Start Kibana service
  3. Start Logstash service
  4. Start Filebeat service
Now all our required services are up and running. Let's see how logs are pushed to Elasticsearch and Kibana when a new log entry is added to the log file, or a new log file is created in the directory. Let's add the below sample log entries to the existing log file, and then we will check them on the Kibana dashboard:
66.249.73.135 - - [17/Jan/2021:11:05:26 +0000] "GET /?flav=atom HTTP/1.1" 200 32352 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
207.241.237.220 - - [17/Jan/2021:11:05:24 +0000] "GET /blog/tags/C?page=2 HTTP/1.0" 200 16311 "http://www.semicomplete.com/blog/tags/C" "Mozilla/5.0 (compatible; archive.org_bot +http://www.archive.org/details/archive.org_bot)"
68.184.202.186 - - [17/Jan/2021:11:05:28 +0000] "GET /projects/xpathtool/ HTTP/1.1" 200 10745 "https://www.google.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
68.184.202.186 - - [17/Jan/2021:11:05:02 +0000] "GET /reset.css HTTP/1.1" 200 1015 "http://www.semicomplete.com/projects/xpathtool/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
68.184.202.186 - - [17/Jan/2021:11:05:05 +0000] "GET /images/jordan-80.png HTTP/1.1" 200 6146 "http://www.semicomplete.com/projects/xpathtool/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
68.184.202.186 - - [17/Jan/2021:11:05:02 +0000] "GET /style2.css HTTP/1.1" 200 4877 "http://www.semicomplete.com/projects/xpathtool/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
68.184.202.186 - - [17/Jan/2021:11:05:37 +0000] "GET /images/web/2009/banner.png HTTP/1.1" 200 52315 "http://www.semicomplete.com/projects/xpathtool/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
68.184.202.186 - - [17/Jan/2021:11:05:58 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
46.105.14.53 - - [17/Jan/2021:11:05:29 +0000] "GET /blog/tags/puppet?flav=rss20 HTTP/1.1" 200 14872 "-" "UniversalFeedParser/4.2-pre-314-svn +http://feedparser.org/"
66.249.73.135 - - [17/Jan/2021:11:05:00 +0000] "GET /?flav=rss20 HTTP/1.1" 200 29941 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
24.233.162.179 - - [17/Jan/2021:11:05:31 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
123.125.71.117 - - [17/Jan/2021:11:05:16 +0000] "GET / HTTP/1.1" 200 36824 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
220.181.108.153 - - [17/Jan/2021:11:05:09 +0000] "GET / HTTP/1.1" 200 36824 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
65.19.138.34 - - [17/Jan/2021:11:05:40 +0000] "GET / HTTP/1.1" 200 37932 "-" "Feedly/1.0 (+http://www.feedly.com/fetcher.html; like FeedFetcher-Google)"
66.249.73.135 - - [17/Jan/2021:11:05:32 +0000] "GET /blog/geekery/rhapsody-on-linux.html HTTP/1.1" 200 9109 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
97.116.185.190 - - [17/Jan/2021:11:05:59 +0000] "GET /articles/dynamic-dns-with-dhcp/ HTTP/1.1" 200 18848 "http://ubuntuforums.org/showthread.php?t=2003644" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
97.116.185.190 - - [17/Jan/2021:11:05:39 +0000] "GET /reset.css HTTP/1.1" 200 1015 "http://www.semicomplete.com/articles/dynamic-dns-with-dhcp/" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
97.116.185.190 - - [17/Jan/2021:11:05:29 +0000] "GET /style2.css HTTP/1.1" 200 4877 "http://www.semicomplete.com/articles/dynamic-dns-with-dhcp/" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
97.116.185.190 - - [17/Jan/2021:11:05:39 +0000] "GET /images/jordan-80.png HTTP/1.1" 200 6146 "http://www.semicomplete.com/articles/dynamic-dns-with-dhcp/" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
97.116.185.190 - - [17/Jan/2021:11:05:02 +0000] "GET /images/web/2009/banner.png HTTP/1.1" 200 52315 "http://www.semicomplete.com/articles/dynamic-dns-with-dhcp/" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
97.116.185.190 - - [17/Jan/2021:11:05:35 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36"
5.255.72.168 - - [17/Jan/2021:11:05:21 +0000] "GET / HTTP/1.0" 200 37932 "http://www.semicomplete.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:21.0) Gecko/20100101 Firefox/21.0"
5.255.72.168 - - [17/Jan/2021:11:05:08 +0000] "GET /blog/geekery/installing-windows-8-consumer-preview.html HTTP/1.0" 200 8948 "http://www.semicomplete.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
Let's create an index pattern in Kibana so our logs become searchable.

[Screenshot: creating the index pattern in Kibana]

Then go to Kibana -> Discover and you will see your logs there.

[Screenshot: logs in the Kibana Discover view]
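You can also confirm from the command line that the documents were indexed. With the Logstash pipeline above, indices are named after the beat and date (e.g. filebeat-YYYY.MM.dd), so listing them via the _cat API should show them:
curl -XGET "http://localhost:9200/_cat/indices/filebeat-*?v" -u "admin:admin"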


Conclusion

So, in this article we saw different approaches to forwarding logs to a centralized server, the high-level architecture for each, and how to push logs to ELK, along with installation and configuration. Hope this post is useful!
