[Snort-users] Draft: Database layout

Mike Andersen mike at ...207...
Thu Aug 10 09:48:57 EDT 2000


[Jed Pickel]
|
| >  * What is the best way to represent TCP flags in the database?
| 
| I think we will stick with the current unless anyone is strongly
| opposed to this. 

This is one of the three things I've really wanted to change in the
current database format (the other two are the IP src/dst addresses and
the possibility to store all available data from the header(s)).

In other words, I'm strongly opposed to the current way of storing this
data. :-) The reason is that I believe that the select query for listing
all packages with (for example) the SYN flag set, will be much more
complex when you also have to consider all the possible combinations
that SYN exists in.

| > ##   	Question:
| > ##		Which format is better for flags, true/fals or all
| > ##            bits represented in one TINYINT?
| 
| There would be some wasted space even with using a tinyint if we
| use one for each field. Sticking with the current way they all go
| into one tinyint.

I agree to the argument of wasted space, but if the alternative is a
more complex SQL query, I would argue that disk is an inexpensive
resources. ;-)  For 20M packages that are using flags, we would waste
around 83MB of disk space.  IMHO it's worth it. 

I have to add that I don't know much about the pro/cons related to the
database performance here.

| > Hmm... does anyone know any good tools that I can use to draw a
| > database diagram?  And runs under Linux? 
| 
| Good question. I guess emacs doesn't count. ;) Interested to hear
| if anyone has an answer.

Hehe... I was actually thinking about using emacs (which I'm using to
almost everything else).  It should be possible to draw a database
layout in plain ASCII. :-)

| > ## Changes  from the original:
| > ##	Added:
| > ##		- tstamp	Timestamp (automaticly updated in 
| > ##              MySQL)
| 
| I am opposed to this because a timestamp is not part of an IP
| header. 

Even if it's not a part of the header, it's still a very important peace
of information -- and I believe that this is the most natural place to
put it.  Anyway, this is not a big issue since it's possible to solve it
in other ways (MySQL update it automatically, and other databases can
use triggers).

| The event table is intended for this sort of data. To associate an
| ip header with a timestamp you could just join with the event
| table.

Which means that an event is generated for each packet (something I
don't like).  I believe I'm thinking on something else when we are
talking about the event table.

Hmm...  Since we are on the snort list, it's natural to think of
database as a part of snort, but I'll try to explain how I'm seeing
this:

  +---------+      +--------+     +---------------------+
  | Sniffer +----->+ SQL DB +<--->+ analyst application |
  +---------+      +----+---+     +---------------------+
                        ^
                        |
                        v
              +---------+---------+
              | Tools for finding |
              |     anomalies.    |
              +-------------------+

The tools for finding anomalies is using the event table too tell us
where in the ip table we can find suspicious traffic.  The tool can be a
part of the sniffer, but might also be a separate tool that analyses the
data in the database.

As soon as we have the sniffer, DB and the analyst application ready,
it's going to be much easier for others to come up with brilliant tools
to _find_ the anomalies...


I'll better get back to the topic of this discussion. :-)


| > ##		- ip_ver	The IP version	      
| > ##		- ip_ihl	Internet Header length
| > ##		- ip_len	Total Length (of datagram)
| > ##		- ip_flags	Various control Flags
| 
| Hmm... I just noticed that the IPHdr structure in decode.h does not
| have a field for the ip_flags. I guess you could get that by doing
| a bitmask with the first three bits of the fragment offset. I will
| do some testing to see if this works as I expect.

I don't know if it's important to have separate rows for ip_flags and
ip_off as long as the information is available.

[Jed Pickel] (About changing the naming convention)
| 
| This comment makes sense, but th is consistent with how these
| values are represented in the snort source; thus, I am reluctant to
| make that change. Anyone else feel strong about this? 
:
| It may be an option to update the snort source code to the ways
| that you suggest. I will see how hard that would be.

The reason I suggested the name change, was to get a more consistent
naming, which increases the readability.  If it's too difficult to
achieve, I'll jump down from this barricade... ;-)

| > ##		- tcp_chks	Header Checksum
| > ##		- tcp_opt	Options
| 
| Because options are optional, and there are an unknown number of
| them we are using in this field an option id number. Actual option
| data is then stored in a separate table with one option per row. So
| then to see all of the options in this table you would SELECT in
| the option table based on the option id. You will see when this is
| ready to be released.

Ahh... I believe I understand what you mean here, and I like it.  If
it's zero, there are no options -- else it points me to the option
table?  Will this table also contain the raw data?


| > ## Changes from the original:
| > ##	The names starts with icmp
| > ##
| > ##	Added:
| > ##		- icmp_chks	Checksum
| > ##		- icmp_div	Used depending on icmp_type
| 
| I am not quite sure what the "div" field is. I will have to take a
| look at that RFC.

It's used for different things depending on the value of icmp_type. It
does not seem that 32bits is enough if we want to collect all the data.
ICMP type 13 and 14 (timestamp and timestamp reply messages) are using
4x32bits.  And type 8 and 0 (echo and echo reply) has not defined the
length...

What about another pointer to the option table?


mike
-- 
Lisp Users:
Due to the holiday next Monday, there will be no garbage collection.





More information about the Snort-users mailing list