cipherdyne.org

Michael Rash, Security Researcher



2012 Blog Archive

Software Release - fwsnort-1.6.2

The 1.6.2 release of fwsnort is available for download. The most impactful change in this release is in how fwsnort loads translated rules into the running iptables policy. Instead of attempting to parse the local policy and add only those rules that appear to match protocols that the policy allows, fwsnort now loads all translated rules by default. The reasoning for this change is in the ChangeLog below. There are a few bug fixes and updates to get fwsnort working without warnings on recent versions of perl, as well as an ICMP type fix for recent versions of iptables. As usual, please let me know if there are any issues.

Here is the complete fwsnort-1.6.2 ChangeLog:

  • Switched --no-ipt-sync to default to not syncing with the iptables policy. Previously, fwsnort attempted by default to match translated Snort rules to the running iptables policy, but this is tough to do well because iptables policies can be complex. And, before fwsnort switched to the iptables-save format for instantiating the policy, a large set of translated rules could take a really long time to make active within the kernel. Finally, many Snort rules restrict themselves to established TCP connections anyway, and if a restrictive policy doesn't allow connections to a given port to reach the established state, then there is little harm in having translated Snort rules for this port. A small amount of kernel memory would be wasted, but no performance would be lost since packets won't be processed against these rules anyway. The end result is that the default behavior is now to not sync with the local iptables policy in favor of translating and instantiating as many rules as possible.
  • Replaced Net::IPv4Addr with the excellent NetAddr::IP module which has comprehensive support for IPv6 address network parsing and comparisons.
  • Moved the fwsnort.sh script and associated files into the /var/lib/fwsnort/ directory. This was suggested by Peter Vrabec.
  • Bug fix for recent versions of iptables (such as 1.4.12) where the icmp match requires --icmp-type to be set - some Snort rules look for a string to match in icmp traffic, but don't also specify an icmp type.
  • Bug fix for 'qw(...) usage as parenthesis' warnings for perl > 5.14
  • Removed the ExtUtils::MakeMaker RPM build requirement from the fwsnort.spec file. This is a compromise which will allow the fwsnort RPM to be built even if RPM doesn't or can't see that ExtUtils::MakeMaker is installed - most likely it will build anyway. If it doesn't, there are bigger problems since fwsnort is written in perl. If you want to build the fwsnort RPM with a .spec file that requires ExtUtils::MakeMaker, then use the "fwsnort-require-makemaker.spec" file that is bundled in the fwsnort sources.

Software Release - psad-2.2

After a long development cycle, the 2.2 release of psad is available for download. This release adds major new functionality for the detection of malicious traffic that is delivered over IPv6 by parsing ip6tables logs. A significant portion of this capability is enabled by the excellent NetAddr::IP CPAN module that can properly handle IPv6 addresses. In addition, speed optimizations have been made that result in psad-2.2 being about 15% faster than previous releases, several bugs have been fixed (including one that caused compile time warnings on recent versions of perl), and a comprehensive test suite has been added. psad-2.2 is a stepping stone to the upcoming psad-3.0 release that will include support for both PF and ipfw firewalls running on *BSD systems. Quite a bit of this work has already been done in the openbsd_integration branch.

Here is an excerpt of the psad-2.2 ChangeLog:

  • Added support for detection of malicious traffic that is delivered via IPv6. This is accomplished by parsing ip6tables log messages - these are in a slightly different format than the iptables log messages. Here is an example:

    Mar 17 13:39:13 linux kernel: [956932.483644] DROP IN=eth0 OUT= MAC=00:13:46:3a:41:36:00:1b:b9:76:9c:e4:86:dd SRC=2001:0db8:0000:f101:0000:0000:0000:0002 DST=2001:0db8:0000:f101:0000:0000:0000:0001 LEN=80 TC=0 HOPLIMIT=64 FLOWLBL=0 PROTO=TCP SPT=50326 DPT=993 WINDOW=5760 RES=0x00 SYN URGP=0

    Detection of malicious IPv6 traffic can be disabled via a new ENABLE_IPV6_DETECTION config variable.
  • For ICMP6 traffic, added protocol validation for ICMP6 type/code combinations.
  • Added a new test suite in the test/ directory to validate psad run time operations (scan detection, signature matching, and more). To support this, a new '--install-test-dir' option was added to the install.pl script. Once this is executed, the test suite can be run via the test-psad.pl script in the test/ directory.
  • Added a new MAX_SCAN_IP_PAIRS config variable to allow psad memory usage to be constrained by restricting the number of unique IP pairs that psad tracks. This is useful when psad is deployed on systems with little memory, and is best utilized in conjunction with disabling ENABLE_PERSISTENCE so that old scans will also be deleted (thereby making room for tracking new scans under the MAX_SCAN_IP_PAIRS threshold).
  • Bug fix for 'qw(...) usage as parenthesis' warnings for perl > 5.14
  • Bug fix for an issue that caused psad to emit the following:

    Undefined subroutine &main::LOG_DAEMON called at ./psad line 10071.

    This problem was noticed by Robert and reported on the psad mailing list.
  • Bug fix for ICMP packet handling where psad would incorrectly interpret ICMP port unreachable messages as UDP packets because the UDP specifics are included in the iptables log message. This bug was first reported by Lukas Baxa to the Debian maintainers and was followed up by Franck Joncourt: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=596240 An example ICMP log message that exposed the bug is included below:

    Sep 8 18:04:26 baxic kernel: [28241.572876] IN_DROP IN=wlan0 OUT= MAC=00:1a:9f:91:df:ae:00:21:27:e8:0a:a0:08:00 SRC=10.0.0.138 DST=192.168.1.103 LEN=96 TOS=0x00 PREC=0xC0 TTL=254 ID=63642 PROTO=ICMP TYPE=3 CODE=3 [SRC=192.168.1.103 DST=10.0.0.138 LEN=68 TOS=0x00 PREC=0x00 TTL=0 ID=22458 PROTO=UDP SPT=35080 DPT=33434 LEN=48 ]
The complete psad-2.2 ChangeLog can be found here via the psad gitweb interface.
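
To make the two log formats above more concrete, here is a minimal Python sketch of how such lines could be split into fields. This is not psad's actual code (psad is written in perl); the function name parse_log_line and the "colon in SRC means IPv6" heuristic are assumptions made for illustration. The key point for the ICMP bug is that the quoted header inside [SRC=... ] is parsed separately, so its PROTO=UDP cannot be mistaken for the protocol of the outer packet.

```python
import re

# KEY=VALUE tokens; the value may be empty (e.g. "OUT=").
_TOKEN = re.compile(r"\b([A-Z0-9]+)=(\S*)")
# ICMP error messages quote the offending packet header inside [SRC=... ].
_EMBEDDED = re.compile(r"\[(SRC=[^\]]*)\]")

# The two example log messages from the ChangeLog above.
IPV6_LINE = (
    "Mar 17 13:39:13 linux kernel: [956932.483644] DROP IN=eth0 OUT= "
    "MAC=00:13:46:3a:41:36:00:1b:b9:76:9c:e4:86:dd "
    "SRC=2001:0db8:0000:f101:0000:0000:0000:0002 "
    "DST=2001:0db8:0000:f101:0000:0000:0000:0001 LEN=80 TC=0 HOPLIMIT=64 "
    "FLOWLBL=0 PROTO=TCP SPT=50326 DPT=993 WINDOW=5760 RES=0x00 SYN URGP=0"
)
ICMP_LINE = (
    "Sep 8 18:04:26 baxic kernel: [28241.572876] IN_DROP IN=wlan0 OUT= "
    "MAC=00:1a:9f:91:df:ae:00:21:27:e8:0a:a0:08:00 SRC=10.0.0.138 "
    "DST=192.168.1.103 LEN=96 TOS=0x00 PREC=0xC0 TTL=254 ID=63642 PROTO=ICMP "
    "TYPE=3 CODE=3 [SRC=192.168.1.103 DST=10.0.0.138 LEN=68 TOS=0x00 "
    "PREC=0x00 TTL=0 ID=22458 PROTO=UDP SPT=35080 DPT=33434 LEN=48 ]"
)


def parse_log_line(line):
    """Return (outer_fields, embedded_fields) for one iptables/ip6tables log line."""
    embedded = {}
    match = _EMBEDDED.search(line)
    if match:
        # A repeated key (LEN appears twice) keeps its last value in the dict.
        embedded = dict(_TOKEN.findall(match.group(1)))
        # Drop the quoted header so its PROTO/SPT/DPT cannot shadow the outer ones.
        line = line[:match.start()] + line[match.end():]
    outer = dict(_TOKEN.findall(line))
    # Heuristic (an assumption): a colon in SRC means an IPv6 address.
    outer["FAMILY"] = "ipv6" if ":" in outer.get("SRC", "") else "ipv4"
    return outer, embedded


if __name__ == "__main__":
    for sample in (IPV6_LINE, ICMP_LINE):
        outer, embedded = parse_log_line(sample)
        print(outer["FAMILY"], outer["PROTO"], embedded.get("PROTO"))
```

For the ICMP example, the outer protocol comes back as ICMP and the UDP details only show up in the embedded fields, which is the distinction the bug fix is about.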

cipherdyne.org git Repositories Safe After github Hack

After a widely publicized hack of github (now fixed), I thought it would be a good idea to ensure that the cipherdyne.org git repositories remain secure on both github and on the cipherdyne.org webserver. The techniques in this blog post may not be well suited to large git repositories with a lot of different people able to commit code, but for my repositories these checks provide a fairly high level of confidence that no malicious code has been introduced. First, the github hack was made possible through a "mass assignment" vulnerability in Ruby on Rails, and would have permitted an attacker to gain admin privileges to any project on github. Once admin access is acquired, an attacker would be in a position to do anything to the underlying code base - including adding new code that implements "undocumented features".

Now, in order to add a backdoor into a code base on github, what would an attacker need to do?

Altering the code base for a project would need to be done through standard git operations as a new commit - i.e., make code changes to a local copy, git add ..., git commit ... - as opposed to manually editing previously committed code in the git repository itself. This is because every commit must match a corresponding SHA1 hash according to git's object model, and a SHA1 collision such that the bogus data is also working code would be "computationally difficult" to say the least (attacks against SHA1 notwithstanding). As a basic check, one can create a git repository for testing purposes, write a random byte to a random position within the .git/objects/pack/*.pack file and then try to clone it to see what happens:
error: packfile ./objects/pack/pack-9c886ed427d9a7538093f09edf516a0a718201ac.pack does not match index
error: packfile ./objects/pack/pack-9c886ed427d9a7538093f09edf516a0a718201ac.pack cannot be accessed
fatal: git upload-pack: cannot find object 7e8e48412ff985461095a09874059e955145d513:
fatal: The remote end hung up unexpectedly
The repository has essentially been corrupted and the clone operation fails.
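
As a small illustration of why in-place edits are detectable, here is a Python sketch of how git names a file ("blob") object: the ID is the SHA1 of a short header plus the content. The helper name git_blob_id is made up for this example, and it only covers loose blob objects, not the pack file checksums that triggered the errors above.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob with this content."""
    # git hashes "blob <size>\0" followed by the raw file bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    original = b"hello\n"
    tampered = b"hellp\n"
    print(git_blob_id(original))  # same value as: echo hello | git hash-object --stdin
    print(git_blob_id(tampered))  # a one byte change yields a different object ID
```

Since trees reference blobs by ID and commits reference trees by ID, a changed byte in any file changes every ID above it, so previously cloned history no longer matches.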

Ok, so a malicious code modification would most likely need to be done via an entirely new commit. This could certainly be done by an attacker, but anyone who has cloned the repository would be able to see the change. For a large, highly active project without rigorous code review and committer hierarchy, it is conceivable that such a change might just get lost within the noise of lots of commits. After all, a human would need to review the change and recognize it as being malicious.

Could such a malicious code change affect the cipherdyne.org projects? In a word, "no", and here's why: all commits to the cipherdyne.org code bases are pushed to github from private git repositories on a dedicated system that is not generally accessible, and git pull ... is only done occasionally and every change comes from a known source and is reviewed. The accessible non-private git repositories on cipherdyne.org are mirrors of the github repositories, so if a malicious change were introduced into github, then they would have this change too. The private repositories would still be safe however. So, given this work flow, what I need is a way to verify that there are no commits in either the github repositories or cipherdyne.org mirrors that are not in private repositories. Further, I would like to be able to verify this without having to push or pull code into the private repositories (so I can regularly check at any time without any modifications coming through). There are probably many ways to do this in the git world - for example, one could just use git fetch to bring in changes into remote tracking branches and compare these against local branches (any malicious code would not be merged into a local branch after it is discovered), but here is an alternate solution:

  • On my system where the private repositories live, create two directories GH/ and CD/, and clone all of the github repositories into the GH directory and all cipherdyne.org repository mirrors into the CD directory. We'll assume that the private repositories live in a directory called private/
  • For each repository pair in the GH and CD directories, diff the output of git rev-list --all. There should be zero differences here.
  • If the step above checks out, then diff the git rev-list --all output across the GH/<repo> and private/<repo> pairs. For this step we expect that the private repositories will have local commits that are not necessarily pushed upstream - what we're concerned about is any commit in a GH/<repo> that is not in a private/<repo>.
Here are a few commands to accomplish the above (we'll assume that the git clone --bare <repo> steps have already been done for brevity):
$ for r in fwknop psad gpgdir fwsnort IPTables-Parse IPTables-ChainMgr
> do
> echo "[+] Checking $r...";
> diff -u <(cd ~/CD/$r.git && git rev-list --all ) <( cd ~/GH/$r.git && git rev-list --all)
> done
[+] Checking fwknop...
[+] Checking psad...
[+] Checking gpgdir...
[+] Checking fwsnort...
[+] Checking IPTables-Parse...
[+] Checking IPTables-ChainMgr...
The output above indicates there are identical git commits in the github repositories vs. the cipherdyne.org mirrors. Good. Now, let's compare the github repositories vs the private ones - we grep on "+" which would indicate new commits in github that are not in the private repositories:
$ for r in fwknop psad gpgdir fwsnort IPTables-Parse IPTables-ChainMgr
> do
> echo "[+] Checking $r...";
> diff -u <(cd ~/private/$r.git && git rev-list --all ) <( cd ~/GH/$r.git && git rev-list --all) | egrep "^\+" | grep -v @
> done
[+] Checking fwknop...
+++ /dev/fd/62  2012-03-07 22:50:48.004281002 -0500
[+] Checking psad...
+++ /dev/fd/62  2012-03-07 22:50:48.054281002 -0500
[+] Checking gpgdir...
+++ /dev/fd/62  2012-03-07 22:50:48.164281002 -0500
[+] Checking fwsnort...
+++ /dev/fd/62  2012-03-07 22:50:48.194281002 -0500
[+] Checking IPTables-Parse...
[+] Checking IPTables-ChainMgr...
Again, good, no commits in github that are not in the private repositories. If there had been a line like the following I would have been concerned:
+fff688f5b4275152636d8959f67bbcd46839fbbb
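
The same check can be expressed as a set difference instead of a diff/grep pipeline, which avoids having to filter out the "+++" header lines. The following Python sketch assumes the same ~/GH and ~/private bare clones as above; the function names are made up for this example.

```python
import os
import subprocess


def rev_list(git_dir):
    """Return the set of all commit IDs reachable from any ref in a repository."""
    result = subprocess.run(
        ["git", "--git-dir", git_dir, "rev-list", "--all"],
        check=True, capture_output=True, text=True,
    )
    return set(result.stdout.split())


def unknown_commits(public_revs, private_revs):
    """Commits present in the public copy but missing from the private one."""
    return sorted(set(public_revs) - set(private_revs))


if __name__ == "__main__":
    home = os.path.expanduser("~")
    for repo in ["fwknop", "psad", "gpgdir", "fwsnort",
                 "IPTables-Parse", "IPTables-ChainMgr"]:
        public = rev_list(os.path.join(home, "GH", repo + ".git"))
        private = rev_list(os.path.join(home, "private", repo + ".git"))
        print("[+] Checking %s..." % repo)
        for commit in unknown_commits(public, private):
            print("    suspicious commit: " + commit)
```

Commits that exist only in the private repositories are ignored by design, since unpushed local work is expected there.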
Rather than modifying code and committing it to a git repository on github, it would have been far more damaging for an attacker to just alter the github website to serve up drive-by exploits for a popular web browser. Either way, I'm glad they fixed the vulnerability.

On SPA Cross-Packet Ciphertext Entropy

With fwknop now re-written in C for the 2.0 release, I thought it would be a good idea to take a look at how close encrypted SPA packet data comes to having high levels of entropy - as understood to be a measure of randomness - from one packet to the next. If fwknop is properly using encryption, and the ciphers themselves are also well-implemented (fwknop can use either Rijndael or GPG), then we would expect there to be no obvious relationship between SPA packets even for repeated access requests to the same service. If there are any such relationships in the encrypted data across multiple SPA packets, then an adversary might be able to infer things about the underlying plaintext - precisely what strong encryption is supposed to make difficult. This blog post covers SPA packet entropy for AES (Rijndael) CBC and ECB encryption modes, and leaves GPG to another post.

Although this post has some similarities with an older blog entry "Visualizing SPA Packet Randomness", a more rigorous and automated way of measuring cross-packet SPA entropy will be presented. In addition, we'll take a look at what happens when (normally) random salt values for AES encrypted SPA packets are artificially forced to be constant. This helps to highlight some real differences in AES electronic codebook (ECB) and cipher block chaining (CBC) encryption modes.

First, the next release of fwknop will most likely offer the ability to select different AES encryption modes (such as cipher feedback (CFB) mode and output feedback (OFB) mode), and a dedicated "crypto_update" branch has been created for this work. The default AES encryption mode used by fwknop is cipher block chaining (CBC) mode as defined here. Within the crypto_update branch there is a new script "spa-entropy.pl" that is designed to execute the fwknop client multiple times, collect the encrypted SPA packet data, use the ent program to measure the entropy in slices for each byte position across the SPA data set, and then plot the results with gnuplot. What does this accomplish? It allows us to easily see for any given byte position within a collection of SPA packets whether there is a relation from one to the next. If there is such a relation, then the cipher used to encrypt the data was not very good at achieving high levels of entropy in the ciphertext across multiple packets.
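
The core measurement can be sketched in a few lines of Python. This is not the spa-entropy.pl script (which is perl and shells out to ent); it is a simplified stand-in that computes Shannon entropy directly, and the function names are made up for this example. It assumes a non-empty list of already-decoded packets, each given as a bytes object.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy in bits of the observed distribution of byte values."""
    total = len(values)
    counts = Counter(values)
    return sum(n / total * math.log2(total / n) for n in counts.values())


def entropy_by_position(packets):
    """Entropy of each byte position, measured across all packets."""
    # Only positions that exist in every packet are compared.
    width = min(len(p) for p in packets)
    return [shannon_entropy([p[i] for p in packets]) for i in range(width)]


if __name__ == "__main__":
    # Position 0 never changes, position 1 takes four equally likely values.
    packets = [bytes([0x41, v]) for v in (0, 1, 2, 3)]
    print(entropy_by_position(packets))
```

A position that never changes scores 0 bits, and a position that looks random across the data set approaches 8 bits.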

As a motivating example from Wikipedia, AES in ECB mode encrypts identical plaintext blocks into identical ciphertext blocks, and this results in patterns in plaintext data being preserved to some extent in the ciphertext. So, an adversary can make good guesses about the underlying plaintext just by looking at the ciphertext! Wikipedia does a nice job of illustrating this with the following two images of the Linux kernel mascot "Tux" - before and after AES encryption in ECB mode:

[Images: plaintext Tux -> AES ECB encryption -> AES ECB encrypted Tux]
Encryption Fail.
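
The pattern leak does not require any knowledge of the key to spot. As a minimal sketch (the function name is made up, and a 16 byte block size is assumed to match AES), the following Python counts how many ciphertext blocks appear more than once:

```python
from collections import Counter


def repeated_blocks(ciphertext: bytes, block_size: int = 16) -> int:
    """Number of blocks whose exact value occurs more than once."""
    # A trailing partial block, if any, is ignored.
    last_start = len(ciphertext) - block_size
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, last_start + 1, block_size)]
    counts = Counter(blocks)
    return sum(n for n in counts.values() if n > 1)


if __name__ == "__main__":
    # Stand-in data: three identical blocks followed by one different block.
    sample = b"A" * 48 + b"B" * 16
    print(repeated_blocks(sample))
```

For ciphertext produced by a well-used cipher the count should almost always be zero, so a large count is a strong hint that ECB mode was applied to repetitive plaintext.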

Now, let's take a look at SPA packet entropy with the spa-entropy.pl script. For reference, fwknop builds SPA packets according to the following data format before encryption:

[random data: 16 bytes]:[username]:[timestamp]:[version]:[message type]:[access request]:[digest]

So, if a user wants repeated access to the same service protected behind fwknopd on some system, then several fields above will be identical across the corresponding SPA packets before they are encrypted. The username, version, message type, and access request fields will likely be the same. If fwknop has made proper use of encryption, then the fact that these fields are the same across multiple SPA packets should not matter. After encryption, an observer should not be able to tell anything about the underlying plaintext (other than perhaps size since AES is a block cipher). Let's verify this for 1,000 SPA packets encrypted with the default CBC mode - they are all encrypted with the same key 'fwknoptest' by the spa-entropy.pl script:
$ ./spa-entropy.pl -f 1000_pkts.data -r -c 1000 --base64-decode
[+] Running fwknop client via the following command:

LD_LIBRARY_PATH=../../lib/.libs ../../client/.libs/fwknop -A tcp/22 -a 127.0.0.2 -D 127.0.0.1 --get-key local_spa.key -B 1000_pkts.data -b -v --test -M cbc

[+] Read in 1000 SPA packets...
[+] Min entropy: 7.75 at byte: 54
[+] Max entropy: 7.86 at byte: 115
[+] Creating entropy.gif gnuplot graph...
This produces the gnuplot graph below. Perfectly random data would produce 8 bits of entropy per byte, and the min/max values of 7.75 and 7.86 along with the fairly uniform distribution of similar values across all of the SPA byte positions imply that there is little relation from one SPA packet to the next - good.

[Graph: SPA entropy for CBC mode]

As an aside, here is what ent reports against the local /dev/urandom entropy source on my Linux system, and it is the "Entropy =" line that spa-entropy.pl parses for each SPA byte slice:
$ dd if=/dev/urandom count=1000 |ent
1000+0 records in
1000+0 records out
512000 bytes (512 kB) copied, 0.128497 s, 4.0 MB/s
Entropy = 7.999625 bits per byte.

Optimum compression would reduce the size
of this 512000 byte file by 0 percent.

Chi square distribution for 512000 samples is 265.77, and randomly
would exceed this value 50.00 percent of the times.

Arithmetic mean value of data bytes is 127.5076 (127.5 = random).
Monte Carlo value for Pi is 3.138715386 (error 0.09 percent).
Serial correlation coefficient is -0.001293 (totally uncorrelated = 0.0).
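
One reason the per-position figures sit near 7.8 while ent reports 7.9996 for /dev/urandom is sample size: each SPA byte slice has only 1,000 samples, whereas the dd run supplies 512,000. Estimating entropy from observed frequencies is biased low for small samples. The Python sketch below applies the standard first-order correction (often associated with Miller and Madow), assuming perfectly uniform random bytes; it is a rough approximation, not a statement about what ent computes internally.

```python
import math


def expected_entropy_bits(samples: int, symbols: int = 256) -> float:
    """Approximate expected entropy estimate for uniform data of a given sample size."""
    # First-order bias of the plug-in estimator: (symbols - 1) / (2 * samples) nats.
    bias_bits = (symbols - 1) / (2 * samples * math.log(2))
    return math.log2(symbols) - bias_bits


if __name__ == "__main__":
    print(round(expected_entropy_bits(1000), 2))    # 7.82
    print(round(expected_entropy_bits(512000), 4))  # 7.9996
```

Under this approximation, values between 7.75 and 7.86 for 1,000 packets are about what truly random bytes would give.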
Now, let's switch to ECB mode and see what happens (just run the spa-entropy.pl script with '-e ecb'):

[Graph: SPA entropy for ECB mode]

Well, that still looks pretty good. Revisiting the ECB encrypted image of Tux above for a moment - the reason that the Tux outline can be seen in the encrypted version is that the underlying image data must contain identical blocks in multiple locations to represent the solid black regions. These blocks are all encrypted in the same way by AES in ECB mode, so the outline persists. But, this is one instance of ECB encryption against a file that has multiple identical blocks. For the encrypted SPA packets, we're dealing with 1,000 separate instances of encrypted data (all with the same key). Across this data set there are certainly lots of identical plaintext blocks (all of the SPA packets request access for source IP 127.0.0.2 to destination port tcp/22 for example), but the encrypted data still shows a high level of entropy. This source of entropy is provided by the random salt values that are used to generate the initialization vector and final encryption key for each encrypted SPA packet. As proof, we can apply the following patch to force the salt to zero for all SPA packets (of course, one would not want to use this patch in practice):
$ git diff lib/cipher_funcs.c
diff --git a/lib/cipher_funcs.c b/lib/cipher_funcs.c
index 0a0ce3b..32c8bd6 100644
--- a/lib/cipher_funcs.c
+++ b/lib/cipher_funcs.c
@@ -153,6 +153,8 @@ rij_salt_and_iv(RIJNDAEL_context *ctx, const char *pass, const unsigned char *da
         get_random_data(ctx->salt, 8);
     }

+    memset(ctx->salt, 0x00, 8);
+
     /* Now generate the key and initialization vector.
      * (again it is the perl Crypt::CBC way, with a touch of
      * fwknop).
Here is what spa-entropy.pl reports after recompiling fwknop with the patch above:

[Graph: SPA entropy for ECB mode, zero salt]

Now we can easily see where there are identical blocks across the SPA packet data set. The first eight bytes contain the salt, so these are all zero (note that fwknop strips the usual "Salted__" prefix before transmitting an SPA packet on the wire). The next 16 bytes are the random bytes that fwknop includes in every SPA packet so these bytes have high entropy. Next up are the username and timestamp - the latter changes with each second, so there is some entropy there since it takes a few seconds to create the 1,000 SPA packet data set. Then the entropy goes back to zero with the next fields and there isn't any decent entropy until the final message digest.
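
To see why the salt matters so much, here is a Python sketch of the salted key and IV derivation that perl's Crypt::CBC uses in its OpenSSL-compatible mode (repeated MD5 over the previous digest, passphrase, and salt). The comment in the patch says fwknop follows "the perl Crypt::CBC way, with a touch of fwknop", so treat this as an approximation rather than fwknop's exact code; the 32 byte key and 16 byte IV sizes are assumptions as well.

```python
import hashlib


def salted_key_and_iv(passphrase: bytes, salt: bytes,
                      key_len: int = 32, iv_len: int = 16):
    """Derive (key, iv) from a passphrase and salt via chained MD5 digests."""
    material = b""
    digest = b""
    while len(material) < key_len + iv_len:
        # Each round hashes the previous digest together with passphrase and salt.
        digest = hashlib.md5(digest + passphrase + salt).digest()
        material += digest
    return material[:key_len], material[key_len:key_len + iv_len]


if __name__ == "__main__":
    zero_salt = bytes(8)
    key_a, iv_a = salted_key_and_iv(b"fwknoptest", zero_salt)
    key_b, iv_b = salted_key_and_iv(b"fwknoptest", zero_salt)
    # With a constant salt, every packet gets the same key and IV.
    print(key_a == key_b, iv_a == iv_b)
```

With a fresh random salt per packet, both the key and the IV change from one SPA packet to the next, which is why even ECB mode showed high cross-packet entropy before the patch was applied.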

As a final contrasting case, let's leave the patch applied to force the salt to zero, but now switch back to CBC mode:

[Graph: SPA entropy for CBC mode, zero salt]

In CBC mode, the random data included by the fwknop client now results in decent entropy even though the salt is zero. This is because every ciphertext block in CBC mode depends on all previous plaintext blocks, so randomness in one plaintext block implies that every subsequent encrypted block will look different from one SPA packet to the next. This graphically shows that CBC mode is a better choice for strong security. Now, if the pseudo random number generator on the local operating system is poorly implemented, this will negatively impact ciphertext entropy regardless of the encryption mode, but still CBC mode is a better alternative than ECB mode.
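
The chaining effect can be reproduced without AES. The Python sketch below uses a toy 4-round Feistel cipher built from SHA-256 as a stand-in block cipher (it is NOT AES and not suitable for real use), with a fixed key and an all-zero IV to mimic the zero-salt patch. Each "packet" is one random block followed by one constant block; all names here are made up for this example.

```python
import hashlib
import os

BLOCK = 16


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 16 byte block cipher (4-round Feistel); a stand-in for AES."""
    left, right = block[:8], block[8:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:8]
        left, right = right, _xor(left, f)
    return left + right


def ecb(key, blocks):
    # Every block is encrypted independently of its neighbors.
    return [toy_encrypt_block(key, b) for b in blocks]


def cbc(key, iv, blocks):
    # Each plaintext block is XORed with the previous ciphertext block first.
    out, prev = [], iv
    for b in blocks:
        prev = toy_encrypt_block(key, _xor(b, prev))
        out.append(prev)
    return out


if __name__ == "__main__":
    key, iv = b"fwknoptest", bytes(BLOCK)
    fixed = b"tcp/22,127.0.0.2"  # constant 16 byte field
    pkt_a = [os.urandom(BLOCK), fixed]
    pkt_b = [os.urandom(BLOCK), fixed]
    print("ECB second blocks equal:", ecb(key, pkt_a)[1] == ecb(key, pkt_b)[1])
    print("CBC second blocks equal:", cbc(key, iv, pkt_a)[1] == cbc(key, iv, pkt_b)[1])
```

In ECB mode the constant block encrypts identically in both packets, while in CBC mode the leading random block changes everything after it, matching the two zero-salt graphs.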

Although spa-entropy.pl is geared towards measuring SPA packet entropy, this technique could certainly be generalized to arbitrary collections of ciphertext. If you know of such an implementation, please email me.

Bing Indexing of gitweb.cgi Links

In June, 2011, all of the cipherdyne.org software projects were switched over to git from svn, and at the same time the web interface was switched to gitweb (along with hosting at github) from trac. Given the switch, I knew there would be a change to how search engines indexed the code/data, and one question would be whether any particular search engine would take a specific interest in the code provided via git and/or gitweb. Note that each of the fwknop, psad, fwsnort, and gpgdir projects has a raw git repository that can be cloned directly over HTTP from cipherdyne.org (a nice feature of git), or viewed with any browser through gitweb. (Personally, I like the "links2" text-based browser rendering of gitweb pages - nice and clean.)

First, here are some stats for indexing bots from major search engines across all cipherdyne.org Apache log data for hits against gitweb.cgi from June, 2011 to today:

Hits     Percentage  User-Agent
505055   81.01%      Mozilla/5.0 (compatible; bingbot/2.0;)
50242    8.06%       msnbot/2.0b (+http://search.msn.com/msnbot.htm)._
25707    4.12%       Mozilla/5.0 (compatible; Ezooms/1.0; ezooms.bot@gmail.com)
6583     1.06%       Feedfetcher-Google; (+http://www.google.com/feedfetcher.html;)
4310     0.69%       Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
1956     0.31%       Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)
1905     0.31%       Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)
1751     0.28%       Mozilla/5.0 (compatible; Yahoo! Slurp;)
1625     0.26%       Mozilla/5.0 (compatible; MJ12bot/v1.4.0;)
1451     0.23%       TwengaBot-Discover (http://www.twenga.fr/bot-discover.html)


Wow! So bots associated with Microsoft's Bing search engine take the top two spots for a combined hit total of well over 500,000 since June, 2011. If spread out over the entire time period (which it's not as we'll see) that would be an average of about 2,600 hits per day, and this figure is more than 20 times the third place bot. Google is in a distant fourth place, even though Google used to heavily index Trac repositories.

So, let's see how the search engine hits are distributed since June, 2011. First, here is a graph of just gitweb hits by the top five crawlers:

[Graph: top 5 gitweb indexers]

Clearly, that is not a very uniform distribution from day to day. It looks like Bing has been hitting the gitweb interface at a rate of over 17,000 hits per day for a significant portion of late December and early January. The other search engines hardly even show up in the graph - you know there are big spikes when everything looks better on a logarithmic scale:

[Graph: top 5 gitweb indexers, logarithmic scale]

With some additional work, it looks like the gitweb.cgi links that Bing is indexing are not all unique. One might expect that Bing would hit a link, grab the content, and then not return to the same link for a while, but some gitweb.cgi links were hit more than 10 times and more than 100,000 links were hit more than once during this time period.

How does this compare with hits across other portions of cipherdyne.org? Bing indexing is still far and away the largest outlier:

[Graph: top 5 indexers of cipherdyne.org]

Given that 1) all of the information gitweb displays is derived from the underlying git repositories, and 2) the git repositories are directly accessible via HTTP anyway, it would seem that a better way for search engines to behave would be to just ignore gitweb altogether and pull directly from git. That would certainly cut down on the server-side resources necessary to service search engine requests. Perhaps though the general strategy of search engines is not to be too smart about such things - they probably just want access to data, and when they see a link they go after it. Either way, the kind of dedicated and repetitive indexing that Bing is doing against gitweb is a bit much, and it certainly seems as though they could implement a less intensive crawler. I'm curious if other server admins are seeing similar behavior.

Update 01/23: There are tons of web analysis tools out there, but I wrote a couple of quick scripts to generate the data in this blog post. The first "user_agent_stats.pl" parses Apache logs and produces user-agent graphs with Gnuplot as shown in this post. The second "uniq_hits.pl" is extremely simple and just counts the number of hits against the same links within the Apache log data. Both scripts accept log data via STDIN - here is an example where user agents who hit any "index.html" link are plotted (graph is not shown):
$ zcat ../logs/cipherdyne.org*.gz |grep "index.html" | ./user_agent_stats.pl -p index_hits
[+] Parsing Apache log data...
[+] Total agents: 1769 (abbreviated to: 174 agents)
[+] Executing gnuplot...
Plot file: index_hits.gif
Agent stats: index_hits.agents
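
For anyone who wants to try the same kind of counting without the perl scripts, here is a rough Python equivalent of the counting step only (no plotting). It is not a port of user_agent_stats.pl or uniq_hits.pl; the function name and regular expression are made up for this example, and it assumes logs in Apache's "combined" format where the user agent is the last quoted field.

```python
import re
import sys
from collections import Counter

# "METHOD path protocol" status bytes "referer" "user-agent"
_LINE = re.compile(
    r'"\S+ (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (hits per user agent, hits per requested link)."""
    agents, links = Counter(), Counter()
    for line in lines:
        match = _LINE.search(line)
        if not match:
            continue  # skip lines that are not in combined log format
        agents[match.group("agent")] += 1
        links[match.group("path")] += 1
    return agents, links


if __name__ == "__main__":
    agents, links = summarize(sys.stdin)
    total = sum(agents.values())
    for agent, hits in agents.most_common(10):
        print("%d\t%.2f%%\t%s" % (hits, 100.0 * hits / total, agent))
    repeated = sum(1 for hits in links.values() if hits > 1)
    print("links hit more than once: %d" % repeated)
```

It reads log lines on STDIN in the same way, so it can sit at the end of a zcat and grep pipeline like the one shown above.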

Software Release - fwknop-2.0

After a long development cycle, fwknop-2.0 has been released. This is the first production release of the fully re-written C version of fwknop, and is the culmination of an effort to provide Single Packet Authorization to multiple open source firewalls, embedded systems, mobile devices, and more. On the "server" side, supported firewalls now include iptables on Linux, ipfw on FreeBSD and Mac OS X, and pf on OpenBSD. The fwknop client is known to run on all of these platforms, and also functions on Windows systems running under Cygwin. There is also an Android client, and a good start on an iPhone client as well. On a personal note, I wish to thank Damien Stuart for a heroic effort to port most of the original perl code over to C. Also, several other people have made significant contributions including Jonathan Bennet, Max Kastanas, Sebastien Jeanquier, Ozmart, and others. If there are any issues, please get in touch with me directly or send an email to the fwknop mailing list.

Update 01/03: Both the libfko library that powers much of fwknop's operations and the fwknop client can be compiled as native Windows executables. In addition, there are perl and python bindings to libfko.

Update 01/07: Damien Stuart has built RPM files for fwknop on RHEL5, RHEL6, Fedora 15, 16, and 17 and for other architectures the Fedora koji build system can produce.