The 21st century is the age where information is power. This information comes in various forms of machine data, generated by a multitude of computerized sources. However, when you face terabytes of information, it is difficult to feel powerful; it feels more like facing an uncontrolled demon.
In this article, we’ll look at an application of the Splunk Enterprise platform. We’ll analyze an open source set of iptables logs and answer a few interesting questions. The sample set is available here.
Note: I’ve already created a Splunk app called iptables logs and ingested the data set into an index named iptables.
What is the time range of the data set?
index="iptables" | timechart count
We can see that the data set has a time range from February 1, 2004 to February 27, 2004.
What is the average number of events per day? Are there any abnormalities?
index="iptables" | timechart count | stats avg(count)
On average, there are about 11390 events per day.
I call a day abnormal if it has very few or very many events. A day with very few events can be considered a very good day, and a day with a very high number of events can be considered a very bad day; both are abnormal. With this in mind, we can flag the days whose event count deviates from the average by 3000 or more (+/- 3000), that is, days with at most 8390 or at least 14390 events.
index="iptables" | timechart count | eval abnormal=if(count <= 8390 OR count >= 14390, 1, 0) | replace 1 with "True", 0 with "False" in abnormal
Very few events on the following days:
February 1, 2004
February 6, 2004
February 10, 2004
February 27, 2004
Very high number of events on the following days:
February 3, 2004
February 7, 2004
February 8, 2004
February 21, 2004
February 26, 2004
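The threshold arithmetic behind the abnormal-day check can be sketched outside Splunk. This is a minimal Python illustration, assuming the band is the reported average of 11390 plus or minus 3000; the function name and the sample counts are hypothetical and are not taken from the data set.

```python
AVG_EVENTS_PER_DAY = 11390  # average reported in the article
TOLERANCE = 3000            # half-width of the "normal" band

LOW = AVG_EVENTS_PER_DAY - TOLERANCE   # lower bound of the band
HIGH = AVG_EVENTS_PER_DAY + TOLERANCE  # upper bound of the band


def is_abnormal(count: int) -> bool:
    """Return True when a daily event count lies on or outside the band."""
    return count <= LOW or count >= HIGH


# Hypothetical daily counts, for illustration only.
sample_counts = {"day A": 2500, "day B": 11000, "day C": 16500}
flags = {day: is_abnormal(n) for day, n in sample_counts.items()}
print(flags)
```

Days A and C fall outside the band and are flagged, while day B sits close to the average and is not.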
How many distinct source and destination IPs exist?
index="iptables" | stats dc(SRC) AS "Distinct Source IPs" dc(DST) AS "Distinct Destination IPs"
There are 20382 distinct source IPs and 369 distinct destination IPs. This is expected since the data set comes from a honeynet (refer to the data set docs here). A high number of inbound connections (likely malicious) is expected in a honeynet.
What connection types exist in the data set?
index="iptables" | rex field=_raw "bridge kernel: (?<conn_type>(.+)?):" | top conn_type
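The rex step pulls a connection type out of each raw event with a named capture group. The same idea can be sketched in Python, with two differences worth noting: Python's re module spells named groups as (?P<name>...), and this sketch swaps the greedy (.+)? for [^:]+ so the match stops at the first colon. The log lines below are hypothetical examples in the usual iptables/syslog layout, not lines copied from the data set.

```python
import re
from collections import Counter

# Named group "conn_type": everything between "bridge kernel: " and the next colon.
CONN_TYPE = re.compile(r"bridge kernel: (?P<conn_type>[^:]+):")

# Hypothetical log lines, for illustration only.
lines = [
    "Feb  1 00:00:02 bridge kernel: INBOUND TCP: IN=br0 SRC=192.0.2.10 DST=198.51.100.5 PROTO=TCP SPT=220 DPT=6129",
    "Feb  1 00:00:07 bridge kernel: INBOUND TCP: IN=br0 SRC=192.0.2.11 DST=198.51.100.6 PROTO=TCP SPT=4410 DPT=135",
    "Feb  1 00:01:13 bridge kernel: OUTG TCP: IN=br0 SRC=198.51.100.5 DST=192.0.2.99 PROTO=TCP SPT=1030 DPT=80",
]


def conn_type(line):
    """Return the extracted connection type, or None if the line does not match."""
    match = CONN_TYPE.search(line)
    return match.group("conn_type") if match else None


# Rough equivalent of "top conn_type": count each extracted value.
counts = Counter(t for t in map(conn_type, lines) if t is not None)
print(counts.most_common())
```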
We can see there are 10 types of connections. The important connection type here is the INBLOCK type. During normal operation, a honeynet is open to all inbound connections. The INBLOCK connection type suggests that something went wrong with the honeynet. Let’s see at what points in time the INBLOCK connection type was observed.
index="iptables" | rex field=_raw "bridge kernel: (?<conn_type>(.+)?):" | search conn_type=INBLOCK* | timechart count
We can see that there were over 150 INBLOCK connections on February 10 and around February 11. This could mean that something went wrong around that time.
Since the honeynet attracts malicious traffic from all over the world, it is possible that the honeynet was compromised around February 10. We also know that there were very few events on February 10. This could mean that the honeynet was not operational (thus, no events were logged) because of the compromise.
The next step is to find what actually occurred on February 10 and answer questions related to the abnormality.
Are there any scan patterns?
Simple scan patterns are relatively easy to detect. When executing a sweeping scan, a single source host pings multiple destination hosts on the same network.
index=iptables ICMP | stats dc(DST) AS dst_ip by SRC | where dst_ip > 20 | sort - dst_ip
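The sweep-scan heuristic in this query (count the distinct destinations each source pings, then keep sources above a threshold) can be sketched in Python as follows. The function name, the threshold parameter, and the sample events are assumptions made for illustration; the addresses come from documentation ranges, not from the data set.

```python
from collections import defaultdict


def sweep_candidates(events, min_targets=20):
    """Return (src, distinct_dst_count) pairs for sources that pinged more
    than min_targets distinct destinations, largest count first.

    events is an iterable of (src, dst) pairs taken from ICMP log entries.
    """
    targets = defaultdict(set)
    for src, dst in events:
        targets[src].add(dst)  # a set keeps each destination once per source
    hits = [(src, len(dsts)) for src, dsts in targets.items() if len(dsts) > min_targets]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical events: one source sweeps 24 hosts, another pings one host 50 times.
events = [("203.0.113.7", f"198.51.100.{i}") for i in range(1, 25)]
events += [("203.0.113.8", "198.51.100.1")] * 50
print(sweep_candidates(events))
```

Counting distinct destinations rather than raw events is what separates a sweep from a single noisy host: the second source sends more pings but touches only one address, so it is not reported.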
There are 23 hosts (1 actual system and 22 virtual) and 1 router on the honeynet. Since multiple external sources are pinging 24 distinct IP addresses, we can say with high probability that the honeynet is being scanned (possibly automated with tools like
Was the honeynet compromised?
index=iptables | rex field=_raw "bridge kernel: (?<conn_type>(.+)?):" | timechart limit=0 count(conn_type) by conn_type
We can see that there was an abnormally high number of INBOUND TCP connections on February 7 and 8. But the honeynet was still operational, as the number of OUTG TCP connections was non-zero and higher than average (at first sight).
On February 10 and 11, there was an abnormally high number of INBLOCK connections, and the number of OUTG TCP connections was 0. The honeynet might have been shut down on February 10. Since there is a non-zero number of outgoing TCP connections on February 11, the honeynet must have been turned back on.
From the above observations, we can theorize that the compromise occurred on or before February 10 and
index=iptables | timechart count by DPT
We can see from the above statistic that there were more than 16000 connections on port 443 on February 7 and more than 7500 connections on February 8. It is possible that there was a compromise using the HTTP over TLS/SSL protocol.
index=iptables OUTG* | where DPT==21 OR DPT == 80 | timechart count by DPT
We can see that there was an abnormally high number of outgoing FTP and HTTP connections on ports 21 and 80 respectively on February 7 and 8. This is a sign of compromise since attackers are using FTP and HTTP to exfiltrate data out of the honeynet.
What possible evidence of malware is there?
If the honeynet was infected with malware, there must be a particular destination port number which was used many times during the connection for purposes like backdoor communication, etc.
index=iptables | stats count by DPT | where count > 100 | sort - count
We can see that port numbers 135, 445, 443, 3127, 53, 139, 80, 137, 1434, 138 were used in more than 3000 connections each.
Note: I created an automatic lookup named port_desc, which returns port descriptions for the ports mentioned above.
index=iptables | stats count by DPT | where count > 3000 | sort - count limit=10 | lookup port_desc DPT OUTPUT
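The count, filter, sort, and lookup steps of this query can be sketched in Python to show how the pieces fit together. The description table below is an assumption: it uses IANA service names for a few well-known ports and is not the contents of the author's port_desc lookup. The function name and the sample port list are likewise hypothetical.

```python
from collections import Counter

# Assumed descriptions (IANA service names), NOT the author's actual lookup table.
PORT_DESC = {
    53: "domain (DNS)",
    80: "http",
    443: "https",
    445: "microsoft-ds (SMB)",
}


def top_ports(dpts, min_count, limit=10):
    """Count destination ports, keep those seen more than min_count times,
    and return up to `limit` rows of (port, count, description), busiest first."""
    counts = Counter(dpts)
    rows = [(port, n) for port, n in counts.most_common() if n > min_count]
    return [(port, n, PORT_DESC.get(port, "unknown")) for port, n in rows[:limit]]


# Hypothetical destination ports, for illustration only.
dpts = [445] * 5 + [80] * 3 + [3127] * 2 + [53]
print(top_ports(dpts, min_count=1))
```

Ports missing from the table come back as "unknown", which is a useful prompt to investigate them further, as with port 3127 here.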
The following popular malware (around the 2004 time frame) used the ports shown above:
For example, I used the following procedure to determine if the honeynet was infected with the Mydoom virus.
Mydoom Virus Check
Mydoom is a virus which opens a backdoor on port 3127 and can download and execute arbitrary files. To check whether the honeynet was compromised by Mydoom, I visualized the traffic on port 3127.
index=iptables INBOUND | where DPT==3127 | timechart count
We can see that there was no INBOUND traffic on port 3127 until February 5. However, the traffic considerably increased after February 9. It is possible that the honeynet was infected with Mydoom.
Thanks for reading!
In this article, we looked at an application of Splunk Enterprise where we determined if a honeynet was compromised and infected. This is a valuable skill for those of us who are interested in the fields of incident response, threat intelligence, etc.
We determined that the honeynet was compromised using HTTP over TLS/SSL on February 7 and that it was shut down on February 10. There were also signs of infection by malware like Mydoom. Other malware might also have been used, since less common ports like 1434 were sending/receiving traffic.
Thank you for reading! If you have any questions, leave them in the comments section below and I’ll get back to you as soon as I can!