[Snort-devel] [Snort-users] OS options to monitor traffic over a 1GiB and 10 GiB
vineyard at ...3300...
Sun Jul 1 00:24:23 EDT 2012
Thank you very much for your generous offer to share your work. I always
appreciate collaborating with a peer who shares my passion for
contributing the fruits of our labor back to the community in the name
of the greater good.
Now, with regard to your BPF tricks, that's waaaay beyond what I
imagined a "low-end" controller like the Intel 82599 series was capable
of. Especially if you want to do filtering at that granularity
*inline* without inducing an intolerable amount of latency. If you can
pull that off, I would like to personally shake your hand and tell you
that you, sir, are a gentleman *and* a scholar.
If that's what you're looking to do, and you want to do it on
fully-saturated links at line-rate with zero packet loss, you really
can't beat a comprehensive hardware-based solution.
I spent several years working with a programmable FPGA-based Napatech
NT20E (aka the nPulse PCAPExpress), and found it to be an incredibly
capable product. It's also incredibly expensive - when the organization
I was working for in 2009 made the purchase, the cost was around $15k if
I remember correctly. Hopefully it's more affordable these days, but I
haven't checked pricing recently. That said, I had no problems doing
everything you've just described and much more - including in-kernel
zero-copy DMA packet capturing, onboard microsecond-latency
almost-BPF-style dynamic filtering and slicing, and best of all -
hardware capture buffering and real-time packet de-duplication according
to near-arbitrary criteria. Oh, and native 32-way 5-tuple hash-based
load balancing.
Think Gigamon on steroids, in a PCI Express form-factor. They've now got
a native 40Gbps card available as well, using the same unified driver /
firmware / API programming syntax. They also offer a custom libpcap
implementation that can simply be dynamically-linked against an existing
pcap-based application such as Snort or Wireshark.
IMHO, it's the Cadillac of both packet-capture and inline IPS-style
network security analysis and filtering. Napatech also sells a
Tilera-based "inspection" accelerator card. I've not had the pleasure of
working with that one, but from what I understand, there are ways to
compile standard Snort signatures into its native pattern-matching language.
Granted, we're talking some serious money at this point - but if the
goal is to never, ever, drop packets, and to be as close as possible to
a "bump in the wire" - you can't go wrong with Napatech / nPulse. For
what it's worth, nPulse was founded by a small contingent of
highly-capable sales and engineering refugees who became frustrated with
their previous employer, Endace, and wanted to create something better. ;-)
I digress... but I'm hoping to reproduce as much as possible of what I
was able to do with the Napatech / nPulse cards with the latest
generation of Intel networking products. It sounds like you're on the
same page. The fact that Napatech now offers a version of their unified
driver and API stack targeted for the very same Intel chipsets as Luca
Deri's TNAPI/DNA drivers gives me hope that this is an achievable goal.
Looking forward to working with you soon!
On 06/30/2012 07:31 PM, Livio Ricciulli wrote:
> I am all for sharing and collaborating. Use our work for whatever
> purpose you'd like. It would be great to make it into a howto.
> In my experience, more transparency==better quality; so the more you can
> add the better for everyone. Luca is always very helpful..
> Another issue I was exploring with not much success is hardware
> filtering with the 82599 (the X520 is just a dual-port version of it).
> I can translate a subset of bpf expressions to hw registers to install
> into the hardware instead of the software bpf. Unfortunately I
> ran into a snag.. If anyone can add or subtract from the following I
> would appreciate it, since hw filtering would be very interesting for
> 10G Snort deployments.. Please tell me that I am wrong..
> As far as I can tell the 82599 designers have made very powerful
> hardware exclusively designed
> for endpoint (therefore simplex) filtering applications.
> In plain English here is what the chip is capable of (using bpf language).
> 128 5-tuple /32 rules capable of masking out any of the 5 fields.
> These filters could only handle:
> o 1/2 of a simplex class c.
> For example to implement a bpf like (src net 192.168.1.128/25)
> one could add (src host 192.168.1.128) or (src host 192.168.1.129) or
> ..(src host 192.168.1.255)
> o 1/4 of duplex class c.
> For example to implement a bpf like (net 192.168.1.192/26)
> one could add (src host 192.168.1.192) or (src host 192.168.1.193) or
> ..(src host 192.168.1.255)
> or (dst host 192.168.1.192) or (dst host 192.168.1.193) or ..(dst host
> 192.168.1.255)
> 8k perfect FDIR filters.
> Unfortunately (&##*&@!!!) these filters have 1 single global 96-bit mask
> value for the source/destination addresses and ports.
> So, you could only implement simplex things like:
> o (src net 192.168.1.0/26 or src net 192.168.2.0/32 or src net
> 10.1.0.0/24) #note that the prefix lengths can appear to differ,
> because you can enumerate up to 8k individual /32 addresses under the
> same global mask.
> o (src portrange 0-8000) #in this case you would be masking both src and
> dst addresses.
> o (vlan 19 or vlan 20 or vlan 21)
> duplex address and port filters are impossible (afaict).
> So, in short, you can have only very small and simple duplex expressions
> as 5-tuple filters or larger but only simplex FDIR expressions.
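[To make the enumeration workaround above concrete, here's a small Python sketch. It is purely illustrative: it only generates the rule strings a driver would have to install, and does not touch any real hardware or driver API.]

```python
import ipaddress

def expand_net(direction, cidr, max_rules=128):
    """Expand a BPF-style '(src|dst) net CIDR' expression into individual
    /32 host rules, since the 82599's 128 5-tuple filters can only mask a
    field out entirely, not match on arbitrary prefixes."""
    net = ipaddress.ip_network(cidr)
    # Iterate the whole block, including network and broadcast addresses.
    rules = [f"({direction} host {ip})" for ip in net]
    if len(rules) > max_rules:
        raise ValueError(f"{cidr} needs {len(rules)} rules; only {max_rules} slots exist")
    return rules

# A simplex /25 consumes the entire 128-entry table -- hence "1/2 of a
# simplex class c"; a duplex /26 would likewise need 64 src + 64 dst rules.
simplex = expand_net("src", "192.168.1.128/25")
print(len(simplex), simplex[0], simplex[-1])
# 128 (src host 192.168.1.128) (src host 192.168.1.255)
```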
> I think the best solution regarding the 82599 hw filtering is to do vlan
> ID filtering..
> Would any of the other address filtering options (very small duplex or
> larger simplex) be useful to you?
> I am not sure if it is worth implementing them (even though I could do
> it with a few more days of work).
> I guess if you wanted to do hw filtering using an inline 10Gbps box,
> the simplex FDIR filtering could work by applying mirrored filters
> on both interfaces. For example to implement an expression like
> (net 192.168.1.0/24) eth1 would get src net 192.168.1.0/24 and eth2
> would get dst net 192.168.1.0/24.
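[A tiny sketch of that mirroring idea -- the helper, interface names, and rule-string format are hypothetical illustrations; nothing here programs the NIC:]

```python
def mirror_duplex_filter(cidr, ingress="eth1", egress="eth2"):
    """Emulate a duplex 'net CIDR' filter on an inline dual-82599 box by
    installing mirrored simplex FDIR rules: the interface facing one side
    matches on source, the other interface matches on destination."""
    return {ingress: f"src net {cidr}", egress: f"dst net {cidr}"}

print(mirror_duplex_filter("192.168.1.0/24"))
# {'eth1': 'src net 192.168.1.0/24', 'eth2': 'dst net 192.168.1.0/24'}
```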
> I am not sure how many people would have a need for hw filtering
> for an inline box based on a dual 82599..
> I am trying to figure out if it is worth pursuing it further..
> What do you think?
> On 6/30/2012 12:21 AM, Robert Vineyard wrote:
>> Hats off to you, it appears you've beaten me to the punch :-)
>> You've just done what I was about to do...
>> For what it's worth, I'll be testing on a third-generation E5-1650 with
>> VT-d technology (separate writeup coming soon on a similar, virtualized
>> approach, leveraging IOV and related technologies).
>> I've got an I350 waiting to install when the box arrives on Monday, and
>> if all goes well, I'm going to see about acquiring an X520-based card too.
>> It sounds like you've already done the heavy lifting on this. If you
>> don't mind, I'd like to integrate your methods and lessons learned into
>> my own HOWTO guides -- attributing all credit for that part of work to
>> you, of course.
>> Best regards,
>> Robert Vineyard
>> On 06/29/2012 02:23 PM, livio Ricciulli wrote:
>>> You can also check out
>>> It gives detailed instructions on how to build a reliable PF_RING-based
>>> 10G box on CentOS using the Intel 82599 and compare the relative
>>> performance of PF_RING in NAPI mode and DNA mode. DNA wins but note that
>>> if you use DNA, you can only attach up to 16 Snort processes to the
>>> Intel 82599, and only 1 application to the interface (for example
>>> Snort), so you could not (for example) run both Snort and Ntop at the
>>> same time. NAPI lets you do that.
>>> The number of rules you run is extremely important in determining how
>>> much bandwidth you can handle. With our traces we could process 4-6 Gigs
>>> on a dual X5670, depending on whether we ran 7000 or 4000 rules.
>>> On 06/29/2012 07:41 AM, Robert Vineyard wrote:
>>>> On 6/29/2012 9:23 AM, Joel Esler wrote:
>>>>> Probably BSD. But I think it's less dependent on the OS, and is more dependent on hardware. When you are talking about 10 Gig, there's lots of factors that come into play.
>>>> Some hardware options I'd recommend, in decreasing order of cost:
>>>> To monitor that much traffic reliably, you're going to have to employ a load-balancing technique. The best way I've found to go about doing that is to use something that can perform a hash function on the 5-tuple of any given flow. The 5-tuple is composed of the source and destination IP addresses, ports, and protocol. Hashing in this manner ensures that traffic is distributed roughly evenly, and that bidirectional conversations are preserved and sent to the same sensor engine.
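[A minimal Python sketch of such a symmetric 5-tuple hash -- illustrative only; the hardware products discussed here implement their own hash functions in silicon:]

```python
import hashlib

def flow_bucket(src_ip, src_port, dst_ip, dst_port, proto, n_buckets=16):
    """Symmetric 5-tuple hash: sorting the two (ip, port) endpoints first
    guarantees that both directions of a conversation map to the same
    bucket, so each sensor engine sees complete flows."""
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a}|{b}|{proto}".encode()
    return int.from_bytes(hashlib.sha1(key).digest()[:4], "big") % n_buckets

fwd = flow_bucket("10.0.0.1", 12345, "10.0.0.2", 80, 6)   # client -> server
rev = flow_bucket("10.0.0.2", 80, "10.0.0.1", 12345, 6)   # server -> client
print(fwd == rev)  # True: both directions land on the same sensor
```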
>>>> The more expensive products in my list above can do this in hardware, often using FPGA tricks and DMA buffering to dramatically accelerate the process. When you're trying to monitor a fully-saturated link, every CPU cycle counts.
>>>> The less expensive products (typically from Intel or Myricom) can do some of it in hardware, but they really shine when you pair them with capture-optimized drivers like PF_RING DNA (http://www.ntop.org/products/pf_ring/dna/) or Myricom Sniffer10G (http://www.myricom.com/support/downloads/sniffer.html).
>>>> In any case you'll want a big server with lots of CPU cores and as much RAM as you can afford. If you'll be logging payloads and/or expect heavy alert volumes, you'll also need fast disk, like SSD or a hardware RAID10 array. The idea is to run multiple sensor engines (Snort, for example) and bind each one to one of the load-balanced virtual network interfaces presented by the setup I just described. If your traffic is fairly predictable or you have plenty of headroom on your sensor box, you can use CPU affinity to peg those engines to individual cores (there are ways to do this for the firehose of interrupts coming from the NIC too) to avoid spurious context-switching and buy yourself a few more precious CPU cycles. You'll want to run one sensor process per core.
>>>> On the other hand, if your traffic is bursty and unpredictable, you may want to forgo the CPU affinity and let the kernel scheduler do its job. For cases like that, I prefer to run two sensor processes per core (doubling the number of required virtual interfaces on your packet-capture NIC). That way, the chunks are smaller and if one needs to burst up to consume a full CPU core, the kernel scheduler will happily relocate the lesser-utilized processes to other cores.
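[On Linux, the pinning itself is a one-liner from inside each sensor process. A minimal sketch follows; the worker/core layout is hypothetical, and a real deployment would also steer the NIC interrupt affinity separately (via /proc/irq):]

```python
import os

def pin_to_core(core):
    """Restrict the calling process to a single CPU core (Linux-only),
    so the scheduler never migrates it and its caches stay warm."""
    os.sched_setaffinity(0, {core})   # pid 0 == the current process

# Hypothetical layout: sensor worker i handles virtual interface i and is
# pegged to core i. A launcher would fork each worker, and the child would
# call pin_to_core(i) before exec'ing the sensor engine.
pin_to_core(0)
print(os.sched_getaffinity(0))
```

For the bursty-traffic variant described above, one would simply skip the call and let the scheduler balance the (doubled) set of workers itself.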
>>>> Happy sniffing! :-)
>>>> -- Robert Vineyard
>>>> Live Security Virtual Conference
>>>> Exclusive live event will cover all the ways today's security and
>>>> threat landscape has changed and how IT managers can respond. Discussions
>>>> will include endpoint security, mobile security and the latest in malware
>>>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
>>>> Snort-users mailing list
>>>> Snort-users at lists.sourceforge.net
>>>> Go to this URL to change user options or unsubscribe:
>>>> Snort-users list archive:
>>>> Please visit http://blog.snort.org to stay current on all the latest Snort news!