[Snort-devel] sdf preprocessor: partial matches/false positives

Bram bram-fabeg at ...3414...
Fri Jul 19 17:21:23 EDT 2013


Hi,


There appears to be an issue with the sdf preprocessor: when the regex
partially matches at the end of a data packet, the match count is
still incremented.
This results in false positives.

Attached are two capture files:
* sdf_0.cap: this contains: 2048 x A, 2048 x B, 2048 x C, 2048 x D;
* sdf_1.cap: this contains: foo at ...3417..., 2048 x A, 2048 x B, 2048 x C, 2048 x D;

My expectation is/was that:
* sdf_0.cap: E-Mail address pattern matched 0 times
* sdf_1.cap: E-Mail address pattern matched 1 time

Instead:
* sdf_0.cap: E-Mail address pattern matched 4 times
* sdf_1.cap: E-Mail address pattern matched 5 times


Snort config used:
	dynamicpreprocessor directory /usr/lib/snort_dynamicpreprocessor/
	preprocessor stream5_global: track_tcp yes, \
	    track_udp no, \
	    track_icmp no, \
	    max_tcp 262144, \
	    max_udp 131072,
	preprocessor stream5_tcp: policy first, ports client 25

	preprocessor sensitive_data: alert_threshold 250

	config event_queue: max_queue 15 log 15 order_events content_length

	alert tcp any any -> any any (msg:"Email - count = 1"; sd_pattern:1,\w@\w; sid:201; gid:138; )
	alert tcp any any -> any any (msg:"Email - count = 2"; sd_pattern:2,\w@\w; sid:202; gid:138; )
	alert tcp any any -> any any (msg:"Email - count = 3"; sd_pattern:3,\w@\w; sid:203; gid:138; )
	alert tcp any any -> any any (msg:"Email - count = 4"; sd_pattern:4,\w@\w; sid:204; gid:138; )
	alert tcp any any -> any any (msg:"Email - count = 5"; sd_pattern:5,\w@\w; sid:205; gid:138; )
	alert tcp any any -> any any (msg:"Email - count = 6"; sd_pattern:6,\w@\w; sid:206; gid:138; )

	output alert_fast: stdout


Note: in the config a custom pattern was used for clarity.
This is just as reproducible with the 'email' pattern.
That is: "sd_pattern:1,email;" instead of "sd_pattern:1,\w@\w;"


Running it:
	$ snort -v -l /var/log -c /etc/ips/snort.conf --daq-dir /lib/daq/ -r /tmp/sdf_0.cap 2>&1 | grep '138'
	07/22-05:35:32.774356  [**] [138:201:0] Email - count = 1 [**] [Priority: 0] {TCP} 192.168.173.1:53313 -> 192.168.173.153:25
	07/22-05:35:32.774573  [**] [138:202:0] Email - count = 2 [**] [Priority: 0] {TCP} 192.168.173.1:53313 -> 192.168.173.153:25
	07/22-05:35:32.774718  [**] [138:203:0] Email - count = 3 [**] [Priority: 0] {TCP} 192.168.173.1:53313 -> 192.168.173.153:25
	07/22-05:35:32.774856  [**] [138:204:0] Email - count = 4 [**] [Priority: 0] {TCP} 192.168.173.1:53313 -> 192.168.173.153:25

	$ snort -v -l /var/log -c /etc/ips/snort.conf --daq-dir /lib/daq/ -r /tmp/sdf_1.cap 2>&1 | grep '138'
	07/22-05:37:38.020377  [**] [138:201:0] Email - count = 1 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25
	07/22-05:37:38.020377  [**] [138:202:0] Email - count = 2 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25
	07/22-05:37:38.020581  [**] [138:203:0] Email - count = 3 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25
	07/22-05:37:38.020718  [**] [138:204:0] Email - count = 4 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25
	07/22-05:37:38.020856  [**] [138:205:0] Email - count = 5 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25

Looking with gdb:
	(gdb) set args -v -l /var/log -c /etc/ips/snort.conf --daq-dir /lib/daq/ -r /tmp/sdf_0.cap
	(gdb) break spp_sdf.c:281
	...
	Breakpoint 1, SDFSearch (config=config at ...3418...=0x8ea2350, packet=packet at ...3418...=0x8eba130, session=session at ...3418...=0x8ebd950, position=0x8f6fa19 "A", end=0x8f6fa1a "", buflen=1)
	    at spp_sdf.c:281
	(gdb) print buflen
	$1 = 1
	(gdb) print position
	$2 = 0x8f6fa19 "A"
	(gdb) print matched_node
	$3 = (sdf_tree_node *) 0x8ec2338
	(gdb) print *matched_node
	$4 = {pattern = 0x8ec2128 "\\w@\\w", num_children = 0, num_option_data = 6, children = 0x0, option_data_list = 0x8ed3180}
	(gdb) c
	Continuing.
	07/22-05:35:32.774356  [**] [138:201:0] Email - count = 1 [**] [Priority: 0] {TCP} 192.168.173.1:53313 -> 192.168.173.153:25

It matched the pattern '\w@\w' against the single character 'A'.


src/dynamic-preprocessors/sdf/sdf_pattern_match.c: FindPiiRecursively: line 501 - 509:
     /* Match pattern buf against current node. Evaluate escape sequences.

        NOTE: node->pattern is a NULL-terminated string, but buf is network data
              and may legitimately contain NULL bytes. */
     while (*buf_index < buflen &&
            *(node->pattern + pattern_index) != '\0' &&
            node_match )
     {
         /* Match a byte at a time. */

src/dynamic-preprocessors/sdf/sdf_pattern_match.c: FindPiiRecursively: line 576 - 592:
         }
     }

     if (node_match)
     {
         int i = 0;
         uint16_t j = 0;
         int node_contains_matches = 0;

         /* Check the children first. Always err on the side of a larger match. */
         while (i < node->num_children && matched_node == NULL)
         {
             matched_node = FindPiiRecursively(node->children[i], buf, buf_index, buflen, config);
             i++;
         }

         if (matched_node != NULL)
             return matched_node;


What happens:
* FindPiiRecursively is called with pattern '\w@\w', buffer 'A' and buflen 1
* it goes through the while loop:
** it sets node_match to 1 because 'A' matches '\w'
** it increases buf_index
** it increases pattern_index by 2 (2 because it matched an escape sequence)
* it rechecks the while condition:
** buf_index is 1
** buflen is 1
** => while loop is stopped
* after the while loop it checks the value of node_match
* node_match is true
* matching node is returned
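
The steps above can be reproduced outside Snort with a stripped-down model of the loop. This is only a sketch and not the Snort source: the function name buggy_match is hypothetical, it handles a single flat pattern instead of the pattern tree, it starts node_match at 1, and it assumes '\w' means "alphanumeric byte".

```c
#include <ctype.h>
#include <stddef.h>

/* Simplified stand-in for the matching loop; NOT the Snort source.
 * Supports literal bytes and the escape \w (assumed: alphanumeric). */
static int buggy_match(const char *pattern, const unsigned char *buf,
                       size_t buflen)
{
    size_t buf_index = 0;
    size_t pattern_index = 0;
    int node_match = 1;

    while (buf_index < buflen &&
           pattern[pattern_index] != '\0' &&
           node_match)
    {
        if (pattern[pattern_index] == '\\' &&
            pattern[pattern_index + 1] == 'w')
        {
            /* Escape sequence: consumes two pattern characters. */
            node_match = isalnum(buf[buf_index]) != 0;
            pattern_index += 2;
        }
        else
        {
            /* Literal byte: consumes one pattern character. */
            node_match =
                (buf[buf_index] == (unsigned char)pattern[pattern_index]);
            pattern_index += 1;
        }
        buf_index++;
    }

    /* Behaviour being modelled: the loop can stop because the buffer ran
     * out, and nothing checks that the pattern was fully consumed. */
    return node_match;
}
```

With this model, buggy_match("\\w@\\w", (const unsigned char *)"A", 1) returns 1, which mirrors the gdb session where buflen is 1 and a node is still returned.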

The problem: it never checks whether all the characters of the pattern were used.

Result: when the pattern partially matches at the end of the
buffer (that is, at the end of the packet), a match is still returned.
This causes plenty of false positives.



A work-around that reduces the number of false positives is to check
whether all characters of the pattern were used:

--- src/dynamic-preprocessors/sdf/sdf_pattern_match.c.orig	2013-07-19 23:43:05.000000000 +0200
+++ src/dynamic-preprocessors/sdf/sdf_pattern_match.c	2013-07-19 23:42:49.000000000 +0200
@@ -575,6 +575,11 @@
          }
      }

+    if (node_match && *(node->pattern + pattern_index) != '\0') {
+        /* pattern not entirely matched */
+        node_match = 0;
+    }
+
      if (node_match)
      {
          int i = 0;


This work-around is obviously not ideal because it will fail to match
when the data is split across multiple packets.
For example, 'foo at ...3417...' split across two packets:
* packet 1 ends with 'foo'
* packet 2 starts with '@bar'

Ideally, when it sees a partial match, it would retry the pattern once
more data is available.
To do this it would need to remember the data that partially
matched.
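
One way to picture "remembering the data that partially matched" is a small per-session carry-over buffer. The sketch below is only an illustration under loud assumptions: struct carry_state, feed_segment and the helpers are hypothetical names and not part of Snort, the pattern is hard-coded to '\w@\w' (three bytes, with '\w' assumed to mean alphanumeric), and the pattern tree and match thresholds are ignored. A start position is only decided once all three bytes are available; undecided trailing bytes are kept for the next segment.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MATCH_LEN 3                 /* bytes consumed by \w@\w */
#define CARRY_MAX (MATCH_LEN - 1)   /* longest undecided tail */

/* Hypothetical per-session state; not a Snort structure. */
struct carry_state {
    unsigned char tail[CARRY_MAX];  /* undecided bytes from earlier data */
    size_t tail_len;
    unsigned matches;
};

/* Full match of \w@\w on exactly MATCH_LEN bytes. */
static int full_match_at(const unsigned char *p)
{
    return isalnum(p[0]) && p[1] == '@' && isalnum(p[2]);
}

/* Byte idx of the virtual buffer "carried tail followed by seg". */
static unsigned char byte_at(const struct carry_state *st,
                             const unsigned char *seg, size_t idx)
{
    return idx < st->tail_len ? st->tail[idx] : seg[idx - st->tail_len];
}

static void feed_segment(struct carry_state *st,
                         const unsigned char *seg, size_t seg_len)
{
    size_t total = st->tail_len + seg_len;
    size_t i = 0, k, new_len;
    unsigned char window[MATCH_LEN];
    unsigned char new_tail[CARRY_MAX];

    /* Only decide a start position when the whole pattern fits. */
    while (i + MATCH_LEN <= total) {
        for (k = 0; k < MATCH_LEN; k++)
            window[k] = byte_at(st, seg, i + k);
        if (full_match_at(window)) {
            st->matches++;
            i += MATCH_LEN;         /* non-overlapping matches */
        } else {
            i++;
        }
    }

    /* Keep the undecided bytes (at most CARRY_MAX) for the next call. */
    new_len = total - i;
    for (k = 0; k < new_len; k++)
        new_tail[k] = byte_at(st, seg, i + k);
    memcpy(st->tail, new_tail, new_len);
    st->tail_len = new_len;
}
```

Starting from a zeroed state, feeding "foo" and then "@bar" yields one match, while feeding a run of 'A' bytes followed by a run of 'B' bytes yields none. A real fix would have to keep such state per pattern node and per stream, which this sketch does not attempt.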


Running snort with the work-around applied:
	$ snort -v -l /var/log -c /etc/ips/snort.conf --daq-dir /lib/daq/ -r /tmp/sdf_0.cap 2>&1 | grep '138'

	$ snort -v -l /var/log -c /etc/ips/snort.conf --daq-dir /lib/daq/ -r /tmp/sdf_1.cap 2>&1 | grep '138'
	07/22-05:37:38.020377  [**] [138:201:0] Email - count = 1 [**] [Priority: 0] {TCP} 192.168.173.1:53321 -> 192.168.173.153:25



Best regards,

Bram


-------------- next part --------------
A non-text attachment was scrubbed...
Name: sdf_0.cap
Type: application/octet-stream
Size: 9872 bytes
Desc: not available
URL: <https://lists.snort.org/pipermail/snort-devel/attachments/20130719/5dad59b9/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sdf_1.cap
Type: application/octet-stream
Size: 9879 bytes
Desc: not available
URL: <https://lists.snort.org/pipermail/snort-devel/attachments/20130719/5dad59b9/attachment-0001.obj>