Ingesting 2FA Logs into a SIEM

Summary

Here we are going to review the process for ingesting authentication logs from a 2FA service into a SIEM. For my purposes I am using Duo for 2FA and Elastic for my SIEM, though the concepts and at least some parts of the process should carry over to other combinations of 2FA/SIEM platforms.

This tutorial assumes that you have an Elastic SIEM in place that is ready to receive traffic. If you need help with that please consult Elastic’s documentation or see part one of my SIEM home lab series for a detailed walkthrough.

All of this is being done in a home lab environment and you can see the diagram below for a high level overview of the topology.


Duo

The first place I checked to start this project was Duo’s own knowledge base (which is pretty good btw). A quick search for “SIEM” led me to an article on exporting 2FA logs into a SIEM, and after evaluating the options available I settled on their own open source utility, Duologsync. They have a separate blog post detailing the underlying inspiration for the tool and the problems it is meant to solve if you are interested in reading more about it.

If you wind up using this tool yourself, I would strongly encourage you to read through the README docs carefully before jumping in, as there are a few gotchas that initially tripped me up and wasted a bit of my time. The installation instructions are straightforward so I won’t go over them here; just read through the docs carefully and make sure you address the prerequisites.

Admin API

Before we get into the main configuration, I want to point out one of the things that initially tripped me up because it isn’t stated as explicitly in the documentation as it could be.

Duo uses the concept of applications to bind their 2FA to the service or platform you are trying to protect. After setting up the Duologsync utility and configuring it to query the desired authentication service, I was confused when it didn’t pull any of the user authentications I expected. As it turns out, you are not supposed to point the utility at the application for your authentication service itself; instead, you use the Admin API application for the Duo platform.

Make sure you take the time to set up the Admin API application in the Duo dashboard before proceeding. Once you have it set up and configured, take note of the integration key, secret key, and API hostname.

Configuration

After you’ve cloned the GitHub repo into the /opt directory (or wherever you decide to put it) and run the installation script, the main config file that you want to edit is…config.yml 😉

This is not meant to be an exhaustive review of every possible configuration option but I did want to point out a few things about how I am sending the data to Logstash.

The utility outputs logs in JSON format by default. You should leave it this way because the JSON logs are richer than the alternative CEF format.

Duologsync logs in CEF format…

…versus JSON format…

See what I mean?

The section related to destination server(s) is the first part of the config that you will want to customize.

The id field is arbitrary and can be whatever you want, so long as you use it consistently throughout the config. The hostname is self-explanatory, and the port is also arbitrary; I chose 10514 but you can use whatever port you want. The protocol is very important and you need to make a decision here - do you want the traffic to be secure or not? If yes, you need to select TCPSSL and configure the additional parameters for the certificates used to secure the connection. This is my private lab environment and, for the sake of testing, I’ve chosen not to bother with encryption.

DO NOT USE UNSECURED PROTOCOLS IN A PRODUCTION ENVIRONMENT!!!

Ye be warned.
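For reference, here is roughly what my servers section looks like. The field names follow the template config.yml that ships with the repo, so double-check them against the version you cloned, and treat the hostname below as a placeholder for your own Logstash server:

servers:
    # Arbitrary ID; just reference it consistently later in the config
  - id: "logstash"
    # IP address or hostname of the Logstash server (placeholder value)
    hostname: "logstash.lab.local"
    # Arbitrary port; must match the port Logstash listens on
    port: 10514
    # UDP is unencrypted - use TCPSSL (plus the cert settings) outside a lab
    protocol: "UDP"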

Remember earlier when I suggested taking note of your ikey/skey/hostname information? This is where you will need to plug it in.

Finally, you need to tell Duologsync what kinds of logs to send to which server. In this case I have it set to auth, which will pull all user authentication logs from the Duo Admin API and forward them to my Logstash server.
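The corresponding account section looks roughly like this (keys redacted and values shortened; again, verify the exact field names against the repo’s template config):

account:
  # Integration key, secret key, and API hostname from the Admin API application
  ikey: "DIXXXXXXXXXXXXXXXXXX"
  skey: "<redacted>"
  hostname: "api-xxxxxxxx.duosecurity.com"
  endpoint_server_mappings:
    # Pull user authentication logs and forward them to the server ID defined above
    - endpoints: ["auth"]
      server: "logstash"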

Startup and Recovery

Something to keep in mind when ingesting important logs is the persistence of the data flow. In this case, what happens if the server running the Duologsync utility suddenly loses power or has a scheduled reboot? In its current configuration the utility’s process will simply die and the flow of logs will stop until you manually start it up again. There are probably several ways to solve this problem, but for me the simplest seemed to be systemd.

If you are unfamiliar with systemd or need a quick refresher, you can check out part one of this helpful tutorial. At its heart the Duologsync utility is a Python program, so I adapted my unit file from this example.

[Unit]
Description=Enable fetching various log types from the Duo cloud
After=multi-user.target
[Service]
Type=idle
ExecStart=/usr/local/bin/duologsync /opt/duo_log_sync/config.yml
WorkingDirectory=/opt/duo_log_sync
User=root
[Install]
WantedBy=multi-user.target

Two things that I want to point out:

  1. ExecStart - the command-line syntax for running the utility is “duologsync </path/to/config.yml>”, so in the unit file we explicitly state the absolute path to each element needed to start the utility.

  2. User - I have it set to the root user in my lab, but due to the security risks involved you should not do this unless absolutely necessary.

Set the appropriate permissions for the file…
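For example, assuming the unit file was saved as /etc/systemd/system/duologsync.service (adjust the path and permissions to fit your environment):

sudo chmod 644 /etc/systemd/system/duologsync.service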

…and configure systemd to load the new unit file.

sudo systemctl daemon-reload
sudo systemctl enable duologsync.service

After the next system reboot you should see the utility running automatically.

From here onward you can use systemd to easily start/stop or check the running status of the utility.
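For example, to check on the service or control it manually:

sudo systemctl status duologsync.service
sudo systemctl stop duologsync.service
sudo systemctl start duologsync.service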


Logstash

The next link in our chain is an application called Logstash; it is going to accept incoming traffic from Duologsync, perform some special formatting of the data, and output the results to Elasticsearch. If you are unfamiliar with Logstash you can read this introduction, which gives a really great overview of the high-level concepts as well as some practical configuration scenarios. If you need help getting it installed and set up, you can reference my last blog post on Elastic cross-cluster search, which has a section covering Logstash.

Pipelines

For this setup we are using multiple pipelines, one of which is devoted to Duologsync traffic. We start by editing the /etc/logstash/pipelines.yml file:

Change it from the default configuration…

…and customize it to explicitly declare two separate pipelines. I already had one pipeline for my Beats traffic, so it’s simply a matter of adding another that is dedicated to Duo traffic. The names of the pipeline IDs and the .conf files themselves are arbitrary; they just need to be referenced consistently in the configuration. In this case, since my Duo traffic is using the UDP protocol, I labeled the configuration file “udp.conf” and created a separate folder in the /conf.d directory to keep it organized.
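It ends up looking something like this (the pipeline IDs are arbitrary, and the Beats path is just an example of how my conf.d directory happens to be organized):

- pipeline.id: beats
  path.config: "/etc/logstash/conf.d/beats/*.conf"
- pipeline.id: duo
  path.config: "/etc/logstash/conf.d/duo/udp.conf"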

Side note: you can actually use a single pipeline/configuration file if you want, but that requires heavy use of conditionals, and for me it wasn’t worth the additional complexity and effort.

Below is my complete pipeline configuration file with a detailed breakdown of each stage.


Input

input {
  udp {
    port => 10514
    id => "duologsync"
    tags => "duologsync"
  }
}

UDP plugin

Read messages as events over the network via udp. The only required configuration item is port, which specifies the udp port logstash will listen on for event streams.

Port

The port which logstash will listen on. Remember that ports less than 1024 (privileged ports) may require root or elevated privileges to use.

This is the only required parameter - in this case I chose the arbitrary value 10514 to match the setting in the Duologsync configuration.

ID

Add a unique ID to the plugin configuration. If no ID is specified, Logstash will generate one. It is strongly recommended to set this ID in your configuration.

Tags

Add any number of arbitrary tags to your event, this can help with processing later.


Filter

filter {
  json {
    source => "message"
  }
  geoip {
    source => "[access_device][ip]"
    target => "[access_device][geoip]"
  }
  geoip {
    source => "[auth_device][ip]"
    target => "[auth_device][geoip]"
  }
}

Remember from earlier when we chose to keep the default JSON output in the Duologsync utility? Now we need to tell Logstash that this is the format to expect and to process it accordingly.

JSON plugin

It takes an existing field which contains JSON and expands it into an actual data structure within the Logstash event.

I want to expound on this one a bit because I spent a lot of time wrangling with it and hopefully I can save you some time. To illustrate why this plugin is important let’s look at what happens before and after it is enabled.

Before:

Notice that the entire Duo authentication log is tucked into the message field? This is better than nothing at all but it’s still hard to parse and doesn’t take full advantage of all the filtering and visualization capabilities of Kibana.

After:

It’s the exact same information, but every value in the message has been broken out into its own unique field.
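To make the difference concrete, here is a heavily trimmed, made-up example. The field names follow the Duo Admin API authentication log format, but the values are purely illustrative. Before the json filter, the whole log is a single string stuffed into the message field:

{
  "message": "{\"access_device\": {\"ip\": \"203.0.113.10\"}, \"auth_device\": {\"ip\": \"198.51.100.20\"}, \"factor\": \"duo_push\", \"result\": \"success\", \"user\": {\"name\": \"jsmith\"}}"
}

After the filter, each value becomes its own field that Kibana can filter and visualize on:

{
  "access_device": { "ip": "203.0.113.10" },
  "auth_device": { "ip": "198.51.100.20" },
  "factor": "duo_push",
  "result": "success",
  "user": { "name": "jsmith" }
}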

Before settling on the JSON filter I had first spent a lot of time researching how to use the Dissect and Grok filters to structure the logs the way I wanted them. Technically those would have worked in creating separate, structured fields out of the message data but only after much more effort and additional configuration complexity.

Take my word for it that the JSON filter is the best solution for this particular implementation 🙂


GeoIP plugin

The GeoIP filter adds information about the geographical location of IP addresses, based on data from the Maxmind GeoLite2 databases.

This plugin is optional, and its usefulness really depends on whether you want to plot data from incoming logs on a map or visualize the resulting GeoIP data in a particular way.

target

Specify the field into which Logstash should store the geoip data. This can be useful, for example, if you have src_ip and dst_ip fields and would like the GeoIP information of both IPs.

There are two important devices in a Duo authentication log - the ‘access device’ that is being used to log in to a service and the ‘auth device’ that is used to complete the Duo second-factor prompt. In this configuration we are telling Logstash to add fields that show the geographical data associated with the IP addresses of both devices.


Output

output {
  elasticsearch {
    hosts => ["https://10.0.2.12:9200","https://10.0.2.16:9200"]
    index => "duologsync-%{+YYYY.MM.dd}"
    user => "elastic"
    password => "Password123"
    ssl => "true"
    cacert => "/etc/logstash/certs/elasticsearch-ca.pem"
  }
}

Elasticsearch plugin

If you plan to use the Kibana web interface to analyze data transformed by Logstash, use the Elasticsearch output plugin to get your data into Elasticsearch.

Hosts

Sets the host(s) of the remote instance. If given an array it will load balance requests across the hosts specified in the hosts parameter.

Index

The index to write events to. This can be dynamic using the %{foo} syntax. The default value will partition your indices by day so you can more easily delete old data or only search specific date ranges.

Notice that we are creating an index specifically for Duologsync, and based on this configuration the resulting index would look something like ‘duologsync-2021.04.21’ in Kibana. You don’t have to do it this way, but structuring the index based on the timestamp makes index management and lifecycle policies much easier.

The user, password, ssl, and cacert fields are pretty self-explanatory, but do note that cacert should point to the certificate of the certificate authority that signed the certificate presented by the remote Elasticsearch nodes. This is usually only necessary if you are using a private CA that is not included in the operating system’s default trust store.

Also, the value of the password field in my config is plaintext, which is no bueno in a production environment! Always use a keystore for sensitive values in any Elastic configuration.
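A rough sketch of that approach (paths assume a package install where the keystore tool lives under /usr/share/logstash/bin and the settings live in /etc/logstash, and ES_PWD is just a name I made up):

sudo /usr/local/bin/true # placeholder removed
sudo /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash create
sudo /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add ES_PWD

You can then reference the stored value in the pipeline config with password => "${ES_PWD}" instead of hardcoding it.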

Once you have finished with the configuration, go ahead and save the file and then restart Logstash:

systemctl restart logstash.service

Do the same thing on the Duologsync server, if necessary.

If you want to watch the apps start up or troubleshoot issues, you can monitor them by tailing /var/log/logstash/logstash-plain.log on the Logstash server and /tmp/duologsync.log on the Duo server (assuming you didn’t change the default log locations).
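For example (each command on its respective server):

sudo tail -f /var/log/logstash/logstash-plain.log
sudo tail -f /tmp/duologsync.log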


Conclusion

Hopefully this tutorial was helpful to you and you were able to avoid some of the frustrations that I suffered through to get this working.

Take care!