[Snort-users] Detecting Social Security Numbers?

Matt Kettler mkettler at ...4108...
Fri Sep 10 14:59:32 EDT 2004

At 04:10 PM 9/10/2004, Harper, Patrick wrote:
>alert ip $HOME_NET any -> $EXTERNAL_NET any
>(pcre:"/[1-9]{3,3}[-][1-9]{2,2}[-][1-9]{4,4}/"; msg:"SSN# in clear
>text"; classtype:policy-violation; sid:2000370; rev:2;)

Note: your rule doesn't catch SSNs that contain zeros (mine contains at
least one zero, so this is a real case). Use [0-9] or \d instead of [1-9].
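
To see the zero problem concretely, here is a minimal sketch. It assumes
Python's re module as a stand-in for libpcre (Snort itself is not involved),
which should be safe for simple character classes like these. The sample
values are made up for illustration and are not real SSNs.

```python
import re

# Pattern from the original rule: [1-9] excludes the digit 0.
original = re.compile(r"[1-9]{3,3}[-][1-9]{2,2}[-][1-9]{4,4}")
# Same pattern with [0-9], so zeros are accepted.
fixed = re.compile(r"[0-9]{3,3}[-][0-9]{2,2}[-][0-9]{4,4}")

# Made-up sample values (hypothetical, not real SSNs).
samples = ["123-45-6789", "100-20-3004", "555-01-2345"]

def hits(pattern, texts):
    """Return the texts in which the pattern matches somewhere."""
    return [t for t in texts if pattern.search(t)]

caught_by_original = hits(original, samples)
missed = [s for s in samples if s not in caught_by_original]

print("missed by [1-9]:", missed)
print("caught by [0-9]:", hits(fixed, samples))
```

Only the sample without any zeros is caught by the [1-9] version; the
[0-9] version catches all three.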

PCRE style and optimization suggestions:

For fixed-value repeats, consider using single-value notation instead of
ranged notation, i.e. use {3} instead of {3,3}. This is purely a style thing
in perl, and I assume the same of libpcre. Some regex engines might handle
{n} differently than {n,n}, and you might lose some performance here, but I
doubt it.

If you're not matching a set or range of characters, don't use []. If you
have to match punctuation, use \ to escape it instead of superfluous
brackets, i.e. \- instead of [-]. This actually impacts performance and
memory consumption in perl, and it probably hurts when using libpcre as well.
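
As a sanity check that these style changes do not alter what gets matched,
here is a small sketch that compares the verbose and compact spellings on a
few made-up strings. It assumes Python's re module as a stand-in for
libpcre, and it says nothing about speed or memory, only about equivalence.
The re.ASCII flag is there so that \d means exactly [0-9] in Python.

```python
import re

# Verbose spelling: ranged repeats and bracketed hyphens.
verbose = re.compile(r"[0-9]{3,3}[-][0-9]{2,2}[-][0-9]{4,4}")
# Compact spelling: \d, single-value repeats, escaped hyphens.
compact = re.compile(r"\d{3}\-\d{2}\-\d{4}", re.ASCII)

# Made-up test strings (hypothetical data, not real SSNs).
tests = [
    "ssn 123-45-6789 here",  # should match
    "100-20-3004",           # should match
    "12-345-6789",           # wrong grouping, should not match
    "1234567890",            # no hyphens, should not match
    "123-45-678",            # last group too short, should not match
]

def span(pattern, text):
    """Return the (start, end) of the first match, or None if no match."""
    m = pattern.search(text)
    return m.span() if m else None

results = [(t, span(verbose, t), span(compact, t)) for t in tests]
for text, v, c in results:
    print(f"{text!r}: verbose={v} compact={c}")
```

Both spellings should report the same span (or no match) for every string.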

Take a look at how perl (5.8.0 tested here) handles /[-]/

         $ perl -Mre=debug -e "/[-]/"
         Freeing REx: `","'
         Compiling REx `[-]'
         size 12 Got 100 bytes for offset annotations.
         first at 1
         1: ANYOF[\-](12)
         12: END(0)
         stclass `ANYOF[\-]' minlen 1
         Offsets: [12]
                 1[3] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 4[0]

Now look at how perl handles /\-/

         $ perl -Mre=debug -e "/\-/"
         Freeing REx: `","'
         Compiling REx `\-'
         size 3 Got 28 bytes for offset annotations.
         first at 1
         rarest char - at 0
         1: EXACT <->(3)
         3: END(0)
         anchored `-' at 0 (checking anchored isall) minlen 1
         Offsets: [3]
                 1[134624689] 0[0] 3[0]

Note the factor of 4 size difference between the two compiled regexes
(12 words vs 3 words) and the more than factor of 3 difference between the
sizes of the offset tables (100 bytes vs 28 bytes). (A detailed explanation
of this output can be found in man perldebguts.)

Admittedly Snort uses libpcre, not perl, but looking at how perl handles a
regex can give you a general idea of which constructs are faster than
others.
