[Snort-users] newbie: http and uris

Matt Kettler mkettler at ...4108...
Wed Apr 13 14:16:05 EDT 2005


mosquitooth at ...158... wrote:

>Hi,
>
>I've got some (newbie) questions concerning http and especially URIs that I
>couldn't find an answer to - but nevertheless I do need the answers to
>write snort rules with the "uricontent" keyword.
>
>- What does the string "\....\" in a URI mean? There are some hints on
>"directory traversal" - could someone explain this any further?
>  
>
First, ditch your windows roots. For URIs it's / not \. uricontent
normalizes \ to / for you, so write your rules the way URIs are supposed
to be.

That said,  /..../ is supposed to be invalid, but some (broken) products
have strange parsers that interpret it as /../../ or /../

 /../ is standard, and means go up one directory. /../../ would be go up 2.
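
To make the /../ behaviour concrete, here is a small Python sketch using
the standard library's posixpath.normpath, which collapses ".." segments
the way a standards-following parser would. Python is only a convenient
illustration here, not what snort or any web server uses internally, and
the example paths are made up.

```python
import posixpath

# "/../" means "go up one directory": each ".." cancels the segment before it.
one_up = posixpath.normpath("/cgi-bin/scripts/../test.cgi")
two_up = posixpath.normpath("/cgi-bin/scripts/../../etc/passwd")

# normpath gives "...." no special meaning; it stays a literal path segment.
four_dots = posixpath.normpath("/cgi-bin/..../test.cgi")

print(one_up)     # /cgi-bin/test.cgi
print(two_up)     # /etc/passwd
print(four_dots)  # /cgi-bin/..../test.cgi
```

A broken parser that treats /..../ as /../../ would resolve the last
path differently, which is exactly why the string shows up in attacks.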

>- Every whitespace character in a URI is replaced by a "+" when encoded to
>html (correct?).
>
I don't think that's correct. AFAIK whitespace should be encoded as %20,
not a +. Usually + is used for spaces in CGI parameters, not URI targets.

e.g.:
http://www.google.com/search?hl=en&q=foo+bar&btnG=Google+Search

Everything after the ? is all parameters to the CGI script named
"search". & delimits the parameters, + represents spaces within a parameter.

However if I had a document with a space in the name it would be:

http://www.example.com/my%20document.txt
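
The two encodings can be reproduced with Python's standard urllib.parse
module. This is only a sketch of the general convention (quote for path
segments, urlencode for CGI-style parameters); the parameter names are
simply the ones from the google example above.

```python
from urllib.parse import quote, urlencode

# Path part: a space in a document name becomes %20.
path = "/" + quote("my document.txt")

# Query part: urlencode joins parameters with & and writes spaces as +.
query = urlencode({"hl": "en", "q": "foo bar", "btnG": "Google Search"})

print(path)   # /my%20document.txt
print(query)  # hl=en&q=foo+bar&btnG=Google+Search
```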


> Now, does snort remove this "+" when it decodes the http
>stream?
>  
>
It will decode %20's. Since I don't think + is proper syntax for a space
in the path part of a URI, I'm unsure if it will decode that.
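
For comparison, this is how a general-purpose decoder treats the two
forms. The sketch uses Python's urllib.parse and says nothing about what
snort itself does; it only shows that "+ means space" is a convention of
form/CGI parameter decoding, not of plain percent-decoding.

```python
from urllib.parse import unquote, unquote_plus, parse_qs

# Plain percent-decoding turns %20 into a space...
decoded_path = unquote("/my%20document.txt")

# ...but leaves a literal + alone.
plus_untouched = unquote("foo+bar")

# Form-style decoding is what maps + to a space.
plus_as_space = unquote_plus("foo+bar")
params = parse_qs("hl=en&q=foo+bar&btnG=Google+Search")

print(decoded_path)    # /my document.txt
print(plus_untouched)  # foo+bar
print(plus_as_space)   # foo bar
print(params["q"])     # ['foo bar']
```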

>- What is the standard decoding for snort? UTF7, UTF8, Unicode, ASCII...?
>  
>
Not my area of expertise.

>- Several papers I tried to read about the subject contain the term "regular
>expression". What's this?
>  
>
Welcome to the world of computers outside of Microsoft.

A regular expression, aka regex, is a generic search string. Think of it
as being like DOS wildcards, but MUCH more flexible. Regexes are a
more-or-less standard feature of most Unix utilities, such as grep, and
there's even a POSIX standard for them.

With regular expressions you can do very, very specific search strings.
Things like [a-z], which will match any single lowercase letter but not
any other character such as a number, punctuation, or space, or [b-y],
which excludes a and z as well. There are also flexible repeat options,
backreferences to a previous portion of the match (used to force
repeated words), etc.
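
A quick way to get a feel for character classes is to try them. The
sketch below uses Python's re module rather than grep or snort, and
matches_class is just a hypothetical helper name for this example, but
the [a-z] and [b-y] syntax is the same.

```python
import re

def matches_class(char_class, ch):
    """Return True if the single character ch matches the class."""
    return re.fullmatch(char_class, ch) is not None

print(matches_class("[a-z]", "q"))  # True: a lowercase letter
print(matches_class("[a-z]", "7"))  # False: a digit
print(matches_class("[a-z]", " "))  # False: a space
print(matches_class("[b-y]", "m"))  # True: inside the range
print(matches_class("[b-y]", "a"))  # False: a is excluded
print(matches_class("[b-y]", "z"))  # False: z is excluded
```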

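Things like this craziness:
/\b([a-z]{1,12}) repeats \1\b/i

Will match any whole word of 1-12 letters that appears on both sides of
the word "repeats". (The \b word-boundary anchors matter: without them the
pattern is free to match just the tail of a longer word.)

e.g.: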

hello repeats hello
boo repeats boo

but not:
supercalafragalizticexpialadocious repeats
supercalafragalizticexpialadocious

Because supercalafragalizticexpialadocious doesn't fit the 12-character
limit.
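
The backreference example can be tried out in Python, whose re module
is not PCRE but shares this syntax. The sketch compares the pattern with
and without \b word-boundary anchors; the variable names are made up for
the example. Unanchored, a search can still find a hit in the tail of
the long word, so the anchors are what enforce the whole-word 12-letter
limit.

```python
import re

# Whole-word version: \b pins the group to the start and end of a word.
anchored = re.compile(r"\b([a-z]{1,12}) repeats \1\b", re.IGNORECASE)
# Unanchored version: the group may start anywhere inside a word.
unanchored = re.compile(r"([a-z]{1,12}) repeats \1", re.IGNORECASE)

long_word = "supercalafragalizticexpialadocious"
long_text = long_word + " repeats " + long_word

print(bool(anchored.search("hello repeats hello")))  # True
print(bool(anchored.search("boo repeats boo")))      # True
print(bool(anchored.search(long_text)))              # False

# Without \b, the final "s" of the first word and the initial "s" of the
# second are enough to satisfy the pattern.
tail_hit = unanchored.search(long_text)
print(tail_hit.group(0))                             # s repeats s
```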


The regexes used by snort are Perl-compatible regexes, hence the name
pcre. They use the same regular expression extensions that the Perl
language uses. Perl, being a highly flexible language in terms of string
manipulation, supports a lot of very powerful extensions to the standard
POSIX extended regular expression syntax.

Do some googling; there are lots of references out there on Perl regexes.

A few good references on perl regular expressions are:

 http://www.english.uga.edu/humcomp/perl/regex2a.html
 http://www.perldoc.com/perl5.6/pod/perlre.html
 http://www.troubleshooters.com/codecorn/littperl/perlreg.htm
 http://directory.google.com/Top/Computers/Programming/Languages/Regular_Expressions/Perl/

Admittedly these are targeted at Perl programming, but the regular
expression syntax is the same.



