Srlog2 Design Documentation

Bruce Guenter

February 10, 2015


1 Introduction

The original srlog package originated as an internal mechanism for collecting all system logs at FutureQuest, Inc. in one place for analysis.


1.1 Requirements

When figuring out how to accomplish this, we identified several key requirements:

We considered using some tools that were already available for this task. In particular, reusing a tool like SSH would have been ideal. However, we were unaware of any such tool that gave the delivery guarantees we wanted.


1.2 Initial Implementation

The initial implementation was fairly limited, and there were a number of design mistakes. The packet format allowed no variation in the cryptographic mechanisms used. It was hard-coded to use MD5 for authentication, the nistp224 elliptic curve for key exchange, and AES192-CBC for encryption, with no hashing of the shared secret to produce the encryption key, no IV, and no resets between packets. Each service required its own secret key, and needed the server key copied into its directory. Senders were identified exclusively by IP address and authenticated by a manually copied public key. The packet format was also overly optimized for the established connection path, and only allowed one line per packet.


1.3 Rewriting to srlog2

I recognized that a number of the original design decisions were poor choices or outright mistakes, and set out to fix them. To avoid recreating some of the original mistakes or throwing away existing knowledge, all of the changes were made incrementally, resulting in a system that was at least minimally usable at each step.

However, many of the choices resulted in a system that was completely incompatible with the original srlog externally, even though much of the internal mechanism was still the same. In particular, the protocol and the key file handling were completely overhauled. So, the package name (and the names of all the programs) was changed to reflect these differences, and to prevent confusion between the old and new packages.


2 Design


2.1 Network Protocol


2.1.1 Network Transport

All data is exchanged over UDP with a default port number of 11014. The sender and receiver first optionally negotiate encryption parameters, and then establish a virtual connection over which the sender delivers its log messages. The receiver sends acknowledgements only for packets it has successfully received; no negative acknowledgements are possible.


2.1.2 Packet Formats


2.1.2.1 Data Formats

All integers are unsigned, and encoded in LSB order.

A “timestamp” is encoded as a 4-byte integer number of seconds since the UNIX epoch, and a 4-byte integer nanosecond offset since the last whole second. Using unsigned integers, this will be adequate until the year 2106.

A string is encoded as a 1- or 2-byte length integer followed by the unencoded data. No trailing NUL byte is used (externally).
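The encodings above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not code from srlog2: the function names are hypothetical, and the final lines simply check the year in which an unsigned 4-byte second count runs out.

```python
import struct
from datetime import datetime, timedelta, timezone


def encode_timestamp(seconds: int, nanoseconds: int) -> bytes:
    """Pack a timestamp as two unsigned 4-byte integers in LSB order."""
    return struct.pack("<II", seconds, nanoseconds)


def encode_string(data: bytes, length_size: int = 1) -> bytes:
    """Prefix data with a 1- or 2-byte LSB length; no trailing NUL is added."""
    if length_size not in (1, 2):
        raise ValueError("length prefix must be 1 or 2 bytes")
    if len(data) >= 1 << (8 * length_size):
        raise ValueError("data too long for the length prefix")
    return len(data).to_bytes(length_size, "little") + data


def decode_string(buf: bytes, offset: int = 0, length_size: int = 1):
    """Return (data, next_offset) for a string starting at offset."""
    end = offset + length_size
    length = int.from_bytes(buf[offset:end], "little")
    return buf[end:end + length], end + length


# The largest second count that fits in 4 unsigned bytes falls in 2106.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
last = EPOCH + timedelta(seconds=2**32 - 1)
```

Because every string carries its own length, a decoder can walk a packet field by field by feeding each returned offset into the next call.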


2.1.2.2 PRQ: Preferences Query

Offset  Size  Type      Description
0       4     Constant  Packet type ‘SRL2’
4       4     Constant  Message type ‘PRQ1’
8       8     String    Nonce
16      1+N   String    Authenticator list (‘HMAC-MD5’)
??      1+N   String    Key exchange list (‘nistp224’ or ‘curve25519\000nistp224’)
??      1+N   String    Key hash list (‘SHA256’)
??      1+N   String    Encryptor list (‘AES128-CBC-ESSIV’)
??      1+N   String    Compressor list (‘null’)

Notes:


2.1.2.3 PRF: Preferences Response

Offset  Size  Type      Description
0       4     Constant  Packet type ‘SRL2’
4       4     Constant  Message type ‘PRF1’
8       8     String    Copy of nonce
16      1+N   String    Authenticator choice
??      1+N   String    Key exchange choice
??      1+N   String    Key hash choice
??      1+N   String    Encryptor choice
??      1+N   String    Compressor choice

2.1.2.4 INI: Initialization Packet

Offset  Size       Type       Description
0       4          Constant   Packet format ‘SRL2’
4       4          Constant   Packet type ‘INI1’
8       8          Integer    Initial sequence number
16      8          Timestamp  Initial timestamp
24      1+N        String     Sender name
??      1+N        String     Service name
??      1+N        String     Authenticator name (A)
??      1+N        String     Key exchange name (E)
??      1+N        String     Key hash name (H)
??      1+N        String     Cipher name (C)
??      1+N        String     Compressor name (Z)
??      sizeof(E)  E          Client session public key
??      sizeof(A)  A          Authenticator

2.1.2.5 CID: Initialization Response

Offset  Size       Type      Description
0       4          Constant  Packet type ‘SRL2’
4       4          Constant  Message type ‘CID1’
8       sizeof(E)  E         Server session public key
??      sizeof(A)  A         Authenticator

2.1.2.6 MSG: Message Packet

Offset  Size       Type       Description
0       4          Constant   Packet type ‘SRL2’
4       4          Constant   Message type ‘MSG1’
8       8          Unsigned   Initial sequence number
16      1          Unsigned   Message count M
??      8          Timestamp  Timestamp
??      2+N        String     Line
??      ??         Char       Padding to fill out encryption block
??      4          CRC-32     Check code on encrypted data
??      sizeof(A)  A          Authenticator

Notes:


2.1.2.7 ACK: Message Acknowledgement Packet

Offset  Size       Type      Description
0       4          Constant  Packet type ‘SRL2’
4       4          Constant  Message type ‘ACK1’
8       8          Unsigned  Sequence number
16      sizeof(A)  A         Authenticator
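As an illustration of how the ACK layout above maps onto bytes, here is a minimal Python sketch. It assumes HMAC-MD5 as the authenticator (so sizeof(A) is 16) and, since the table does not say which bytes the authenticator covers, it assumes the authenticator is computed over all preceding bytes of the packet. The function names and the way the key is supplied are hypothetical.

```python
import hashlib
import hmac
import struct

AUTH_SIZE = 16  # HMAC-MD5 digest length, i.e. sizeof(A) for this authenticator


def build_ack(sequence: int, auth_key: bytes) -> bytes:
    """Lay out 'SRL2', 'ACK1', an 8-byte LSB sequence number, then the authenticator."""
    body = b"SRL2" + b"ACK1" + struct.pack("<Q", sequence)
    # Assumption: the authenticator covers every byte that precedes it.
    return body + hmac.new(auth_key, body, hashlib.md5).digest()


def parse_ack(packet: bytes, auth_key: bytes) -> int:
    """Return the acknowledged sequence number, or raise ValueError."""
    if len(packet) != 16 + AUTH_SIZE or packet[:8] != b"SRL2ACK1":
        raise ValueError("not an ACK packet")
    body, tag = packet[:16], packet[16:]
    expected = hmac.new(auth_key, body, hashlib.md5).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad authenticator")
    return struct.unpack("<Q", body[8:])[0]
```

A sender would call parse_ack on each reply and drop any packet that raises, which fits a protocol that has no negative acknowledgements.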

2.1.2.8 SRQ: Status Request

Offset  Size  Type      Description
0       4     Constant  Packet type ‘SRL2’
4       4     Constant  Message type ‘SRQ1’
8       8     String    Nonce

2.1.2.9 SRP: Status Response

Offset  Size  Type      Description
0       4     Constant  Packet type ‘SRL2’
4       4     Constant  Message type ‘SRP1’
8       8     String    Copy of nonce
16      2+N   String    Text status report

2.2 Key Exchange

The shared secrets for the INI, CID, MSG, and ACK packets are computed as follows:

Packet      Client Computes                                Server Computes
INI         Client secret * Server public                  Server secret * Client public
CID         Client session secret * Server public          Server secret * Client session public
ACK or MSG  Client session secret * Server session public  Server session secret * Client session public
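The table relies on the usual Diffie-Hellman property that each side reaches the same value by combining its own secret with the other side's public key. The sketch below shows that property for the three rows using a toy finite-field exchange in place of nistp224 or curve25519; the prime, generator, and secret values are illustrative assumptions and far too weak for real use.

```python
# Toy finite-field Diffie-Hellman standing in for the elliptic-curve exchange.
P = 2**127 - 1  # a Mersenne prime; much too small for real security
G = 3


def public_key(secret: int) -> int:
    """Derive the public value that is sent to the peer."""
    return pow(G, secret, P)


def shared_secret(own_secret: int, peer_public: int) -> int:
    """Combine our secret with the peer's public value ('secret * public')."""
    return pow(peer_public, own_secret, P)


# Hypothetical fixed secrets so the example is reproducible.
client_secret, server_secret = 0x1234567, 0x89ABCDE
client_session_secret, server_session_secret = 0x13579BD, 0x2468ACE

client_public = public_key(client_secret)
server_public = public_key(server_secret)
client_session_public = public_key(client_session_secret)
server_session_public = public_key(server_session_secret)

# One pair per table row: (client computes, server computes).
ini = (shared_secret(client_secret, server_public),
       shared_secret(server_secret, client_public))
cid = (shared_secret(client_session_secret, server_public),
       shared_secret(server_secret, client_session_public))
msg = (shared_secret(client_session_secret, server_session_public),
       shared_secret(server_session_secret, client_session_public))
```

In each pair the two values are equal, and the three rows give three distinct secrets because each row mixes a different combination of long-term and session keys.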

2.3 Encryption Parameters

The current system is hard-coded to use HMAC-MD5 for authentication and AES-CBC with a 128-bit key and ESSIV for encryption. The first 32 bytes of the SHA256 hash of the nistp224 shared secret are used for the key. Additionally, the first 32 bytes of the SHA256 hash of that hash are used to key the ESSIV encryptor. The system may use either nistp224 or curve25519 for key exchange, depending on whether curve25519 keys and software support are present on both ends.
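The two-step hashing described above can be sketched with Python's hashlib. The function name is hypothetical. A SHA256 digest is exactly 32 bytes, so "the first 32 bytes" is the whole digest; how those bytes are mapped onto the cipher's key size is not spelled out here, so the sketch returns both digests untruncated.

```python
import hashlib


def derive_keys(shared_secret: bytes):
    """Return (cipher_key, essiv_key) derived from a key-exchange shared secret."""
    # First hash: key material for the primary cipher.
    cipher_key = hashlib.sha256(shared_secret).digest()
    # Second hash, taken over the first: key material for the ESSIV encryptor.
    essiv_key = hashlib.sha256(cipher_key).digest()
    return cipher_key, essiv_key
```

Chaining the hashes means the ESSIV key follows from the cipher key but not the other way round, so the two encryptors never share a key.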


2.4 Logging Format

The logging format (the format of the lines written by srlog2d to be read by a log processor) reflects the fact that multiple lines will frequently be output for the same sender/service combination. In this way, it encapsulates the manner in which log data arrives – each packet contains one or more log lines (usually more).

So, instead of having information about the service on each line of output, there is a separate line type for identifying the service. This actually simplifies the sender, as the actual log lines can be passed by the logger into the output file or pipe without modification.


3 Detailed Changes

This chapter describes in detail the changes made between the original package and srlog2. Some of the rationale for the design decisions above is given here.


3.1 Multiple lines per packet

The largest real problem encountered with the original system was the high system load caused by the receiver. Having the protocol handle a single line per packet meant that each log line would cause the system to handle two interrupts (incoming and outgoing), and the receiver would have to do a decryption and two full secure hashes. This ended up being a significant issue as we were handling well over 1,000 lines/sec.

Adding a new packet type that would transmit multiple lines was not a big problem, but the bigger issue came with encryption. Since the CBC state was not reset between packets, retransmissions caused a huge implementation headache that could not be satisfactorily resolved.


3.2 IV computed using E(Salt|Sector)

To resolve the CBC issue, the IV was initially forced to zero at the start of each packet. Then while researching disk encryption I came across a scheme called E(Salt,Sector)IV or ESSIV. In this scheme, the key used for the primary encryption is hashed to key another encryptor. To produce the IV for each packet, the (public) sequence number is encrypted (in simple ECB mode) with this (secret) key material to produce a deterministic but still secret IV. This eliminated encryption ordering issues, making one of the issues with having multiple lines per packet disappear.
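A minimal sketch of the IV derivation follows. To stay independent of any particular cipher library, the single-block ECB encryption is passed in as a callable; in practice it would be AES keyed with the hashed secondary key. The function name, the callable interface, and the way the 8-byte sequence number is padded to one cipher block are assumptions made for illustration, not details taken from srlog2.

```python
import struct
from typing import Callable

BLOCK_SIZE = 16  # AES block size in bytes


def essiv_iv(encrypt_block: Callable[[bytes], bytes], sequence: int) -> bytes:
    """Derive a per-packet IV by encrypting the public sequence number.

    encrypt_block must encrypt exactly one 16-byte block in ECB mode under
    the secondary (hashed) key; the cipher library in use supplies it.
    """
    # Assumption: 8-byte LSB sequence number, zero-padded to a full block.
    block = struct.pack("<Q", sequence).ljust(BLOCK_SIZE, b"\x00")
    iv = encrypt_block(block)
    if len(iv) != BLOCK_SIZE:
        raise ValueError("encrypt_block must return one cipher block")
    return iv
```

Because the IV depends only on the sequence number and the secret key, a retransmitted packet encrypts identically to the original, and packets can be decrypted in any order.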


3.3 Introducing curve25519

After the original package was written, the author of the nistp224 package, Daniel J. Bernstein, produced another, stronger elliptic curve key exchange protocol called curve25519. The nistp224 package was no longer being maintained and had known bugs causing serious performance regressions with modern compilers, and its author was advocating the use of curve25519 over it.

Initially I was inclined to switch the entire system to curve25519 and drop nistp224 entirely, but the core math of the new system was written entirely in assembler, and the released code only worked on Intel/AMD 32-bit systems. As a result, a mechanism was introduced which would allow either system to be used, with a preference for the longer keys where both were supported.


3.4 New packet format

The original packet format had two shortcomings. First, there was no identification information in the packet other than the leading sequence number, and that was only useful if there was a single line in the packet. To add more packet formats, the sequence numbers from 0xffffffff00000000 and up were reserved. While it is improbable that any sender would ever get close to this number, it is still a poor kludge for multiple packet types. Second, all numbers were represented in MSB order but all the systems using it used LSB ordering, requiring byte swapping on each packet.

So, a new packet format was designed that improved on several attributes. First, the format itself included a version number in both a format identifier and a separate type identifier, allowing for easily adding more packet types and for future updates to the format. Second, the single-line packet was rejected in favor of explicitly handling multiple lines in each transmission. Finally, all numbers were encoded in LSB order.
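To illustrate how the two identifiers make packets self-describing, the sketch below splits the leading 8 bytes of a packet into a format tag and a type tag. It assumes the trailing digit of each 4-byte tag (as in ‘SRL2’ and ‘MSG1’) is the version number, and the function names are hypothetical.

```python
KNOWN_TYPES = {"PRQ", "PRF", "INI", "CID", "MSG", "ACK", "SRQ", "SRP"}


def parse_header(packet: bytes):
    """Split the 8-byte header into (format, format_version, type, type_version)."""
    if len(packet) < 8:
        raise ValueError("packet too short")
    fmt = packet[0:4].decode("ascii")
    kind = packet[4:8].decode("ascii")
    # Assumption: the fourth character of each tag is a one-digit version.
    return fmt[:3], int(fmt[3]), kind[:3], int(kind[3])


def classify(packet: bytes) -> str:
    """Return the three-letter packet type, rejecting unknown formats or versions."""
    fmt, fmt_version, kind, kind_version = parse_header(packet)
    if (fmt, fmt_version) != ("SRL", 2):
        raise ValueError("unsupported packet format")
    if kind not in KNOWN_TYPES or kind_version != 1:
        raise ValueError("unsupported packet type")
    return kind
```

A receiver can dispatch on the returned type before touching the rest of the packet, so no range of sequence numbers needs to be reserved to signal other packet types.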


3.5 Sender Names

The first design for srlog used IP addresses exclusively to identify senders in the receiver program. This, however, led to problems when the IP address of a sender changed. In particular, when a sender had multiple IP addresses, the kernel would make an arbitrary choice of which one to use for sending, and that could confuse the receiver. Switching from strictly IP addresses to names has the additional benefit of supporting roaming senders, which has happened when we set up servers in one place and installed them in another.


4 Miscellaneous


4.1 External Encryption Libraries

Originally, I had set up the package to use a built-in Rijndael (AES) implementation for symmetric encryption. There are, however, several encryption libraries available which may be preferable because they are more portable and/or faster (through the use of assembler, for example).

Here are the features I have identified in an encryption library as being required or desirable for srlog2:

The candidate libraries that I found are:

I have switched to using libtomcrypt based on its good API and documentation, compact size, and public domain status. The encryption support in srlog2 is already encapsulated into a single source file, so switching to another library should not be a large effort. Ideally the build process could switch between several libraries depending on which was present at build time, but that’s more work than it’s worth for now.

