[Snort-users] String matching in snort.

C. Jason Coit jasonc at ...155...
Sat May 18 15:01:02 EDT 2002

Matt and Ashley,

Thanks for reading the paper.  You should also check out the simultaneous
work done by George Varghese and Mike Fisk (whose code, I believe, is
being included in the 1.9 branch of Snort) at

Neil Desai also wrote a good technical paper which discusses pattern
matching in Snort that can be found at the Snort website in the News
section.  It is titled "Increasing Performance in High Speed NIDS: A
Look at Snort's Internals." I highly suggest anyone interested in
pattern matching and Snort read this paper.

It is important to understand that the work done for my paper was
intended as a proof of concept and as an incentive to look deeper into
the benefits of advanced string matching methods for IDS.  It was not
meant to be used as is in production-quality Snort.  Thus the quick and
dirty implementation has memory requirements and forced case
insensitivity that were left to be addressed in further work on
integrating different pattern matching behavior into Snort.
Mike Fisk and George Varghese implemented a library of pattern matching
algorithms to improve Snort's performance.  (I believe one of their
setwise algorithms is currently being incorporated into Snort by Chris
Green and Mike Fisk; right, Chris?)

    > It also is optimized for improving the performance of string matches
    > where the content contains repeated substrings. In general most snort
    > rules can be written in a manner which avoids such problems, and in
    > it's been repeatedly said that rule writers should avoid repetitive
    > strings whenever possible.

The key idea of the work done by Stuart Staniford, Joe McAlerney, and
myself, as well as the concurrent work done by Mike Fisk and George
Varghese, is grouping similar content strings so they can be searched in
a setwise manner.  Setwise pattern matching improves performance by
eliminating pointless searches.  If "hello" is not found in a packet, it
makes no sense to then search for "hello world".  Matt mentions most
rules don't or shouldn't be written to contain repeated substrings.  To
see that in practice many such commonalities exist, take a look at the
webserver rules.  Several content-based rules indeed have substrings in
common.  For this reason, as the number of rulesets increases, the
setwise pattern matching algorithms scale much better than repeated
applications of standard Boyer-Moore.
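
To make the idea concrete, here is a minimal sketch of setwise matching
in Python.  It uses Aho-Corasick only because it is a well-known setwise
algorithm; I am not claiming it is the algorithm used in the paper or in
Snort.  The function names, the sample content strings, and the sample
payload are all made up for illustration.  The point is that the payload
is scanned once for the whole set, and content strings with common
prefixes ("hello" / "hello world") share states in the automaton.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton (illustrative sketch, not Snort code)."""
    goto = [{}]   # goto[state][char] -> next state (a trie of all patterns)
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> patterns that end at this state

    # Phase 1: insert every pattern into one shared trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = nxt
            state = nxt
        out[state].append(pat)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, child in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit patterns that end at the suffix state as well.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_all(payload, automaton):
    """Scan the payload once; return (start_offset, pattern) for each match."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, ch in enumerate(payload):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches


# Hypothetical content strings, loosely in the style of webserver rules.
contents = ["hello", "hello world", "/cgi-bin/phf", "/cgi-bin/test-cgi"]
automaton = build_automaton(contents)
hits = find_all("GET /cgi-bin/phf?hello world", automaton)
```

With one search per rule, the same payload would be scanned once for
each content string; here the cost of the scan does not grow with the
number of strings, which is where the better scaling comes from.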

Obviously, the setwise string matching won't help for rules that don't
require content matching. The real performance enhancements of setwise
string matching are dependent on many factors including the number of
rules used that require content matching, the amount of traffic that
triggers these rules, and the content similarity.

   > And yes, the all-content rule case went up quite a lot in speed, but
   > that's completely not realistic in the case of snort. It does show that
   > that aspect of snort was improved quite a bit, but also shows (when
   > combined with the other results) that improving that aspect quite a lot
   > does not radically improve general performance.


The all-content tests were just to demonstrate the speedup of the string
matching algorithm itself.  While that case is not all too realistic,
some environments may be much more reliant on packet content string
matching than others, depending on the factors I previously mentioned.

The current work on setwise pattern matching by Mike Fisk, which is
being incorporated into Snort (1.9?) by Chris Green, should be
beneficial to the overall performance of Snort.  In most environments
you should see some improvement, and in others you should see vast
improvements in speed.

Thanks for the interest in setwise pattern matching.




+--                                  --+
|   C. Jason Coit Programmer/Analyst   |
|    Silicon Defense: IDS Solutions    |
|    http://www.silicondefense.com/    |
+--                                   -+
