Showing posts from 2015

Memcached logging (and others) under Systemd on RHEL7

I've been getting into RHEL7 lately (yay, a valid reason to get into it at work!) and this means learning about systemd. This post is not about systemd... at least it's not another systemd tutorial. This post is about how I got memcached to emit its logging to syslog while running under systemd, and how to configure memcached to sanely log to a file that I can expose via a file share.

It's 2:21am, so this is gonna be quick. I wrote this because I needed memcached, and in response to some bad advice on Stack Overflow.

Use IPTables NOTRACK to implement stateless rules and reduce packet loss.

I recently struck a performance problem with a high-volume Linux DNS server and found a very satisfying way to overcome it. This post is not about DNS specifically; it is also useful for any service with a high rate of connections/sessions (UDP or TCP). It is especially useful for UDP-based traffic, as the stateful firewall doesn't really buy you much with UDP, but it is also applicable to services such as HTTP/HTTPS or anything where you have a lot of connections...

We observed times when DNS would not respond, but retrying very soon after would generally work. For TCP, you may find that you get a connection timeout (or possibly a connection reset? I haven't checked that recently).

Looking at the kernel logs, you might see the following:
kernel: nf_conntrack: table full, dropping packet.
You might be inclined to increase net.netfilter.nf_conntrack_max and net.nf_conntrack_max, but a better response might be found by looking at what is actually taking up those entries in your conne…

The importance of being liberal in a Cisco environment

Okay, so today I grappled with a Cisco sized gorilla and won. -- me, earlier this year, apparently feeling rather chuffed with myself.

I had recently launched a new service for a client, and a harvester on the Internet was experiencing timeouts when trying to connect, so our data was not being harvested as we needed. There was no evidence of a problem in the web-server logs because no HTTP request ever had a chance to make it through.

It seems that Cisco products (some unstudied subset, but likely firewalls and NATs) play a bit too fast and loose for Linux’s liking and change packets in ways that make the Linux firewall (iptables) stateful connection tracking occasionally see such traffic as INVALID. This manifests in things such as connection timeouts, and as such you won’t notice it in things like webserver logs. In a traffic capture, you may recognise it as a lot of retransmissions.

How long has that command been running? (in seconds)

The ps command is fairly obtuse at the best of times, but I am very thankful for things like ps -eo uid,command. I wish it were as easy to have ps report the start time of a process in a form I can use. See my pain looking at the start time and elapsed time (which is not wall-clock) for a rather antiquarian system.

# ps -eo stime,etime
... selected extracts ...
Apr20  1-02:30:30
 2013 731-03:01:37
12:59    01:42:25
14:41       00:00
Yeah, I really don't want to touch that with any scripting tool. Best to head to /proc/... you may actually want to use ps or something like pgrep to determine the PID of the process of interest.

According to proc(5), we want the 22nd element in /proc/PID/stat:  "starttime: ...The time in jiffies the process started after system boot.". A jiffy is explained in time(7) -- this from RHEL 5:

   The Software Clock, HZ, and Jiffies
       The  accuracy  of  many system calls and timestamps is limited by the resolution of
       the software clock, …
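
As a rough illustration of the idea, the Python sketch below pulls starttime out of /proc/PID/stat and turns it into elapsed seconds. It assumes a Linux /proc, that starttime is field 22 as listed in current proc(5), and that the value is in clock ticks as reported by sysconf(SC_CLK_TCK), which holds for 2.6 and later kernels. The function names are made up for this example.

```python
import os


def starttime_ticks(stat_line: str) -> int:
    """Return field 22 (starttime) from the contents of /proc/PID/stat."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces
    # or ')', so cut after the LAST ')' instead of splitting the whole line.
    after_comm = stat_line.rsplit(")", 1)[1].split()
    # after_comm[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
    return int(after_comm[19])


def elapsed_seconds(pid: int) -> float:
    """Seconds since the process started (Linux only, reads /proc)."""
    with open(f"/proc/{pid}/stat") as f:
        ticks = starttime_ticks(f.read())
    with open("/proc/uptime") as f:
        uptime = float(f.read().split()[0])  # seconds since boot
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    return uptime - ticks / ticks_per_second
```

Calling elapsed_seconds(os.getpid()) from a running script should give a small non-negative number. Splitting after the last ')' matters because a process name such as "my (odd) name" would otherwise shift every later field.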

Answering 'Are we there yet?' in a Unix setting

Often -- commonly during an outage window -- you might get asked "How far through is that (insert-lengthy-process-here)?". Such processes are common in outage windows; particularly unscheduled outages where filesystem-related work may be involved, but crop up in plenty of places.

In a UNIX/Linux environment, a lot of processes are very silent about progress (certainly with regard to % completed), but a lot of the time, we can deduce how far through an operation is. This post illustrates with a few examples, and then slaps on a very simple and easy user-interface.

But 'Are we there yet?' is rather similar in spirit to 'Where is it up to?' or 'What is it doing?', so I'll address that here too. In fact, I'll address those first, because they often lead up to the first question. And we won't just cover filesystem operations, but they will be first because that's what's on my mind as I write this.

Please don't use dig etc. in reporting scripts... use 'getent hosts' instead (gawk example)

Okay, so the excrement of the day is flying from the fan and you need to get some quick analytics of who/what is being the most-wanted of the day. Perhaps you don't currently have the analytics you need at your fingertips, so you hack up a huge 'one-liner' that perhaps takes a live sample of traffic (tcpdump -p -nn ...), extracts some data using sed, collates the data using awk etc. and then does some sort of reporting step. Yay, that's what we call agile analytics (see 'hyperbole'); it's the all-too-common fallback go-to, but it does prove to be immensely useful.

Okay, so you've got some report, perhaps it lacks a bit of polish, but it contains IP addresses, and we'd like to see something more recognisable (mind you, some IP addresses can become pretty recognisable). So, you scratch your head a bit and have the usual internal debate "do I do this in bash, or fall up to awk, perl/python". At this point (if you go with bash etc. or awk), you'll perhap…

My Scriptorium on GitHub

I consider myself something of a toolsmith... when solving problems I often end up writing a script (or lengthy 'one-liner' more likely). Sometimes that script ends up being rather useful and I find myself using it frequently. Now you can perhaps benefit from this too (and perhaps improve on it).

You can find a hand-picked assortment of widely applicable such scripts and tools in my scriptorium on GitHub. If you're an admin of Linux systems (or Unix systems), then there may be something of use for you there.

[Humbledown highlights] Managing IPv6 Zone Pain

Originally published by myself on Wed Aug 11 12:49:36 NZST 2010 and since recovered to this location. It has not been tested since its original publication.

If you’re the type who prefers to hand-edit their DNS zone files (and there are an awful lot of us), then you’ll recognise the pain of managing IPv6 PTR records in DNS. You might even have a coping strategy to help you input them without making an all-too-easy typo, such as by using a command such as ipv6calc. However, if that’s how you do it, then it is still very difficult to look for the address, or errors, after it has been entered; IPv6 PTR records are highly unrecognisable at a glance. A better way is to separate the edited view from the production view, to a small extent, by pre-processing the input with a tool. That is what this post is about; I present to you: ipv6-dns-revnibbles.
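
For anyone who has not met the format, the Python sketch below shows what a reverse-nibble PTR owner name looks like for a given address. It is only an illustration of the record format, not the ipv6-dns-revnibbles tool or how it works; the function name is invented for this example and 2001:db8::1 is a documentation address.

```python
import ipaddress


def ipv6_ptr_name(address: str) -> str:
    """Build the ip6.arpa owner name for an IPv6 address."""
    addr = ipaddress.IPv6Address(address)
    # Expand to all 32 hex digits (nibbles), dropping the colons.
    nibbles = addr.exploded.replace(":", "")
    # PTR names list the nibbles least-significant first, dot-separated.
    return ".".join(reversed(nibbles)) + ".ip6.arpa"


if __name__ == "__main__":
    print(ipv6_ptr_name("2001:db8::1"))
```

The standard library exposes the same value as ipaddress.IPv6Address(...).reverse_pointer; it is spelled out by hand here only to show why the result is so hard to read at a glance.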

[Humbledown highlights] Finding IPv6 Addresses Derived from SLAAC

Originally published by myself on Thu Feb 3 17:49:38 NZDT 2011 and since recovered to this location. It has not been tested since its original publication.

There is a common desire, often derived from security requirements, to find out what IPv6 addresses are being used on the network, and who has them. Granted, many IPv6 host addresses can have their MAC address inferred from the address itself, assuming the address is a SLAAC address and not manually assigned, a privacy address, or assigned by a stateful mechanism (i.e. DHCPv6).
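
To make the "inferred from the address itself" part concrete, here is a minimal Python sketch that recovers a MAC address from an interface identifier in modified EUI-64 form (ff:fe in the middle, universal/local bit flipped). The function name is hypothetical and the addresses are documentation examples. A match is only a hint: nothing stops a manually assigned address from having the same shape.

```python
import ipaddress
from typing import Optional


def mac_from_slaac(address: str) -> Optional[str]:
    """Return the embedded MAC if the address looks EUI-64 derived, else None."""
    # The interface identifier is the low 64 bits of the address.
    iid = int(ipaddress.IPv6Address(address)) & ((1 << 64) - 1)
    b = iid.to_bytes(8, "big")
    # Modified EUI-64 inserts ff:fe between the two halves of the MAC.
    if b[3:5] != b"\xff\xfe":
        return None
    # Flip the universal/local bit back and drop the ff:fe filler.
    mac = bytes([b[0] ^ 0x02]) + b[1:3] + b[5:8]
    return ":".join(f"{octet:02x}" for octet in mac)
```

For example, mac_from_slaac("2001:db8::211:22ff:fe33:4455") gives 00:11:22:33:44:55, while an address such as 2001:db8::1 gives None.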

To take an IPv4 analog of the problem, let's say that we wanted to determine a table of mappings of MAC addresses, IP addresses, and perhaps the time at which that binding was valid. The easiest way to do this for address assignments managed via DHCP is to look at your DHCP server's leases file. In an IPv6 context, we could do exactly the same thing... or rather, we could, if only all clients used stateful assignment, an…

[Humbledown highlights] Generating Tomcat Keystore from Key, Cert, and CA Bundle

Originally published by myself on Sun Jun 12 06:05:32 NZST 2011 and since recovered to this location. It has not been tested since its original publication.

I had to configure a new Tomcat installation yesterday for SSL. The server also has an Apache (httpd) installation, and the SSL certificate was generated using the OpenSSL command (umask 077; cd /etc/pki/tls/private; openssl req -nodes -newkey rsa:2048 -keyout servername.key -out servername.csr). This generates the private key and a certificate signing request, which gets sent to the Certificate Authority (CA). The response from the CA includes a servername.crt and a CA bundle, which contains a list of PEM-encoded x509 certificates that are used in establishing the chain of trust from your certificate to a root certificate.

That process works completely fine for the httpd server, but Tomcat expects its key to be generated using the keytool command, and the resulting certificate and CA bundle to …

[Humbledown highlights] Build an inode Lookup Database using SQLite3

Originally published by myself on Sun Aug 28 19:25:47 NZST 2011 and since recovered to this location. It has not been tested since its original publication.

Occasionally, such as when investigating NFS traffic, you need to determine which filename(s) correspond to a given inode, or set of inodes. Now, you could answer this question using a command such as find /someplace -inum 12345 -print, but that will generally be quite slow, and very cumbersome if you wanted to look up multiple inodes.

I decided to do this a different way, by creating a database to map between inode and path. Furthermore, I decided to do this using sqlite3, because a) it ought to be faster than grepping through a file looking for a particular inode, b) sqlite3 is pretty useful and I thought this would be a good excuse to practice using it, c) it was easy to do so.
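
The general shape of such a database can be sketched in a few lines. The version below uses Python's built-in sqlite3 module rather than whatever the original post used, and the table, column, and function names are assumptions made for this example.

```python
import os
import sqlite3


def build_inode_db(root: str, db_path: str = ":memory:") -> sqlite3.Connection:
    """Walk root and record (device, inode, path) for every entry beneath it."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inodes (dev INTEGER, inode INTEGER, path TEXT)"
    )
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            rows.append((st.st_dev, st.st_ino, path))
    conn.executemany("INSERT INTO inodes VALUES (?, ?, ?)", rows)
    # Index after the bulk insert so lookups by inode avoid a full scan.
    conn.execute("CREATE INDEX IF NOT EXISTS inodes_by_inode ON inodes (inode)")
    conn.commit()
    return conn


def paths_for_inode(conn: sqlite3.Connection, inode: int) -> list:
    """Return every recorded path (hard links included) for an inode number."""
    cur = conn.execute(
        "SELECT path FROM inodes WHERE inode = ? ORDER BY path", (inode,)
    )
    return [row[0] for row in cur]
```

Typical use would be conn = build_inode_db("/someplace", "inodes.db") once, followed by as many paths_for_inode(conn, 12345) lookups as needed. Inode numbers are only unique per filesystem, which is why the device number is stored alongside.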

[Humbledown highlights] NFS Lock Analysis with tshark (Wireshark) and Python

Originally published by myself on Sun Aug 28 07:23:18 NZST 2011 and since recovered to this location. It has not been tested since its original publication.
I had a need to investigate what was happening on an NFS export, looking at what files were being locked, how often, and how long it took for locks to be granted. I decided I would use the tshark command, which is the command-line equivalent of Wireshark. However, since tshark isn't on most servers, I recorded 200,000 NFS-related packets using tcpdump, then analysed it on my Mac laptop using tshark and a Python script.

Before I delve into how you can do this, let's first look at the final result:

$ nfs-lock-report -f nfs.pcap
inode  lk_den  lk_reqs  lk_ok_min  lk_ok_max  lk_int_min  lk_int_max  filename
22753  0       4        0.0        60.0       60.0        120.0
22754  0       4        0.0        60.0       60.0        120.0
22759  0       0        0.0        0.0        empty       emp…

From DNS Packet Capture to analysis in Kibana

UPDATE June 2015: Forget this post, just head for the Beats component for ElasticSearch. Beats is based on PacketBeat (the same people). That said, I haven't used it yet.

If you're trying to get analytics on DNS traffic on a busy or potentially overloaded DNS server, then you really don't want to enable query logging. You'd be better off getting data from a traffic capture. If you're capturing this on the DNS server, ensure the capture file doesn't flood the disk or degrade performance overmuch (here I'm capturing it on a separate partition, and running it at a reduced priority).

# nice tcpdump -p -nn -i eth0 -s0 -w /spare/dns.pcap port domain

Great, so now you've got a lot of packets (let's say at least a million, which is a reasonably short capture). Despite being short, that is still a massive pain to work with in Wireshark, and Wireshark is not the best tool for faceting the message stream so you can look for patterns (eg. to find relationshi…