[Snort-devel] Exceeding the Speed of Snort

C. Jason Coit jasonc at ...60...
Wed Jan 31 00:20:50 EST 2001

Silicon Defense is pleased to announce the proof of concept release of the
Aho-Corasick Boyer-Moore (AC_BM) pattern matching algorithm for Snort.  Our
implementation is integrated into Snort 1.6.3.  It is intended as a
motivation to explore advanced pattern matching techniques in NIDS,
particularly Snort. 


Snort currently uses a standard Boyer-Moore searching
algorithm to perform pattern matching.  It repeated applies this
algorithm to examine packet content against a long list of patterns held
by various OTN's that often contain common substrings - one example
being the myriad of WEB-GET rules.  The standard search proceeds
linearly down the OTN list and attempts to match a pattern contained in
a rule against the entire packet.  If no match is found for the content
of a particular rule, Snort will try to match the packet against the
next OTN in the list.  Thus the packet requires several content searches
of the entire packet before a match is found.  Our Aho-Corasick,
Boyer-Moore Style (AC_BM) algorithm takes advantage of the similarities
to examine the packet against multiple patterns at the same time and
reduce the time it takes for a match to be found.  Since a majority of the
rules (786 out of the 854 rules in 10102kany.rules) contain at least one
"content:"  option, we thought any improvement on content searching would
dramatically improve the overall
efficiency of Snort.

The AC_BM algorithm is an application of Boyer-Moore strategies of exact
pattern matching to a set of patterns held in an Aho-Corasick like
keyword tree.  It is essentially an implementation of a "Boyer-Moore
Approach to Exact Set Matching" described by Dan Gusfield in "Algorithms
on Strings, Trees, and Sequences."  In the text, Gusfield outlines an
algorithm which uses suffix trees, and examines the text from left to
right. Our implementation mirrors the Gusfield algorithm - it examines
packet data from right to left and uses a common prefix approach instead
of a common suffix approach.  The AC_BM implementation allows the
various content rules to be placed in a tree that can be searched using
elements of Boyer-Moore.


We tested our implementation on Capture the Flag (CTF) data from Defcon 8
available at www.shmoo.org.  For 786 content rules the AC_BM implementation
took 7.8 seconds while standard Boyer-Moore approach took 25.92 seconds.  

We exchange performance speed for memory.  AC_BM takes approximately 3 times
the memory to run, for the example above standard Snort requires 3684 KB and
ours requires 12312 KB. 


For more information, please visit the website at

where you will find our paper describing the algorithm, results, and
behavioral changes to snort as well as our source code for the AC_BM
integrated into Snort 1.6.3.

or, download these directly from

paper: http://www.silicondefense.com/acbm/speed_of_snort_01_10_2001.pdf
readme: http://www.silicondefense.com/acbm/README.ACBMSEARCH
code: http://www.silicondefense.com/acbm/snort-acbm-1.0.tar.gz

Future Work:
This is a proof of concept implementation.  Thus while it demonstrates the
benefits of advanced pattern matching in Snort, it still needs some work to
be integrated into future versions of Snort.  It is a quick and dirty
approach that relies on the wonderful world of #ifdef's.  The memory required
for the AC_BM can most likely be reduced by implementing a true suffix tree
approach rather than a hybrid Aho-Corasick tree with a node for every
character.  Snort's RTN/OTN list can also be redesigned to better condense
similar options to eliminate unnecessary option checks and facilitate "exact
set pattern matching." 

Silicon Defense does not currently plan to continue this research.  We are
releasing the code for developers to do with as they wish.  

Feel free to email me with questions or comments.


-Jason Coit

|  Jason Coit, Silicon Defense   |
| http://www.silicondefense.com/ |

More information about the Snort-devel mailing list