[Snort-devel] Optimized implementation of AC and AC_Q pattern matching algorithms

Pablo Cantos pablocantos at ...2499...
Mon Jan 28 15:40:31 EST 2013


Hi Abed,

2013/1/28 abed mohammad kamaluddin <abedamu at ...2499...>

> Hi Pablo,
>
> I tried you modification on OCTEON today and was disappointed that
> there was a drop in performance on it, however on Intel there is
> significant improvement in perf. In comparison to unmodified snort
> using Network traffic and ac-q:
>
>                                       cross-compiling (Octeon)
>       gcc (Intel)
> Using your modification - 15% down
>  21% up
> Using mine                  -  11% up
>        12% up
>
> I supposed it. I mean, when I have been working on getting a better
performance for the MPSE a Intel machine was used and I found that in many
cases a better performance was obtained using gcc, but a worse performance
was obtained with icc. This time, probably we won't be able to perform
these tests on the Intel machine until next week.

>
> On Intel, we can remove sindex altogether:
>
>        state = ps[2 + xlatcase[T[0]]]
>
> Yes, I removed it too and I got very similar results, maybe because the
compiler (gcc) removed it by itself.


> Maybe it could be because I am cross compiling ; there were lot of
> wasted cycles in my case. Unless I introduce an intermediate state, I
> am not getting any improvement. I had earlier analyzed this routine in
> detail and each cycle reduced in this macro gave significant perf
> improvements.
>
> Could you please mention the following:
> - Which compiler have you used (gcc/icc version).
> - If you can provide the pcap file you are using for the tests it would
> help.
>
> I have used gcc 4.6.1-9ubuntu3.
One of them (1gb) was downloaded from here:
https://www.itoc.usma.edu/research/dataset/ but I don't remember which it
is exactly.
I will send you tomorrow a private link to download them. Anyway, if
someone else wants to download the pcap files just send me an email and
I'll provide him/her the link.

I will try your modifications on a bigger Intel machine sometime this
> week and report the findings.
>
> I'll be waiting for your news


> Thanks,
> Abed M K


>
> On Sun, Jan 27, 2013 at 12:27 AM, abed mohammad kamaluddin
> <abedamu at ...2499...> wrote:
> >
> > Hi Pablo,
> >
> > That looks neater :) I will try your modifications on our system on
> Monday
> > when I return to work. I used zero-alert traffic to measure perf which
> gave
> > me 11% up on 2.9.0.4.
> >
> > PS: I have noticed a more than 15% overall performance drop on 2.9.4 as
> > compare to 2.9.3.1 when compiled with exactly same options. I haven't
> > analyzed it so far, however this particular optimization has the same
> > relative improvement on all versions (2.9.0.4, 2.9.3.1 and 2.9.4) .
> >
> > Thanks,
> > Abed
> >
> >
> >
> > Message: 2
> >>
> >> Date: Sat, 26 Jan 2013 16:50:37 +0100
> >> From: Pablo Cantos <pablocantos at ...2499...>
> >> Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
> >>         pattern matching algorithms
> >> To: snort-devel at lists.sourceforge.net
> >> Message-ID:
> >>
> >> <CADQcQ1OzJDso9FEQhG77dGtSTEiX=J80bxRDMv9t8Yz7WhH0-g at ...2500...>
> >> Content-Type: text/plain; charset="iso-8859-1"
> >>
> >> Hi Abed,
> >>
> >> First of all, thanks for your contribution.
> >>
> >> I have checked your proposal in snort 2.9.1 by using two different pcap
> >> files, one of them (612MB-sized) throws around 900k alerts and the other
> >> one (1GB-sized) throws just 55 alerts. I have used an AMD Turion II
> >> dual-core mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB
> RAM.
> >>
> >> These are the performance jumps that I have obtained:
> >>
> >> 612MB file -> 4.24% for MPSE, 5.93% for ac-q
> >> 1GB file -> 7.93% for MPSE, 8.03% for ac-q
> >>
> >> These results were obtained by measuring the times taken to analyze each
> >> pcap file by the MPSE and AC, separately.
> >>
> >> I have been working for several months with Snort to get improvements in
> >> other areas of the MPSE, I have also studied the AC_SEARCH macros and I
> >> didn't find them completely efficients but I didn't go further. Now,
> >> after
> >> seeing your proposal and checking the AC_SEARCH_Q again I thought that
> >> taking small changes it could work even better.
> >>
> >> You have suggested to pre-fetch the new state before was used it:
> >>
> >> #define AC_SEARCH_Q \
> >> +   *acstate_t new_state;* \
> >>     for (; T < Tend; T++) \
> >>     { \
> >>         ps = NextState[state]; \
> >>         sindex = xlatcase[T[0]]; \
> >> +       *new_state = ps[2 + sindex];*  \
> >>         if (ps[1]) \
> >>         { \
> >>             if (MatchList[state]) \
> >>             { \
> >>                 if (_add_queue(&acsm->q,MatchList[state])) \
> >>                 { \
> >>                     if (_process_queue(&acsm->q, Match,data)) \
> >>                     { \
> >>                         *current_state = state; \
> >>                         return 1; \
> >>                     } \
> >>                 } \
> >>             } \
> >>         } \
> >> *-       state = ps[2 + sindex]; \
> >> *
> >> +       *state = new_state;* \
> >>     }
> >>  #endif
> >>
> >> But I think the routine could be more efficient too if we just fetch the
> >> new state in the end, and we move down one line:
> >>
> >> #define AC_SEARCH_Q \
> >>     for (; T < Tend; T++) \
> >>     { \
> >>         ps = NextState[state]; \
> >> *-       sindex = xlatcase[T[0]]; \*
> >>         if (ps[1]) \
> >>         { \
> >>             if (MatchList[state]) \
> >>             { \
> >>                 if (_add_queue(&acsm->q,MatchList[state])) \
> >>                 { \
> >>                     if (_process_queue(&acsm->q, Match,data)) \
> >>                     { \
> >>                         *current_state = state; \
> >>                         return 1; \
> >>                     } \
> >>                 } \
> >>             } \
> >>         } \
> >> *+       sindex = xlatcase[T[0]]; \*
> >>         state = ps[2 + sindex]; \
> >>     }
> >> #endif
> >>
> >> In this way, I think the number of instructions could be reduced too.
> >>
> >> And these are the results that I have obtained:
> >>
> >> 612MB file -> 15.66% for MPSE, 22.13% for ac-q
> >> 1GB file -> 34.49% for MPSE, 35.13% for ac-q
> >>
> >> I will try to repeat these tests using other more powerful computers. In
> >> any case, we will give more information on Monday.
> >> -------------- next part --------------
> >> An HTML attachment was scrubbed...
> >>
> >> ------------------------------
> >>
> >> Message: 3
> >> Date: Sat, 26 Jan 2013 11:26:55 -0500
> >> From: Joel Esler <jesler at ...402...>
> >> Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
> >>         pattern matching algorithms
> >> To: Pablo Cantos <pablocantos at ...2499...>
> >> Cc: "snort-devel at lists.sourceforge.net"
> >>         <snort-devel at lists.sourceforge.net>
> >> Message-ID: <2B25430D-9736-49BB-9A81-EC9AF163C28F at ...402...>
> >> Content-Type: text/plain; charset="utf-8"
> >>
> >> Pablo,
> >>
> >> Great stuff. We need to make sure we are testing and modifying against
> >> 2.9.4 though.
> >>
> >> --
> >> Joel Esler
> >> Sent from my iPhone ?
> >>
> >> On Jan 26, 2013, at 10:50 AM, Pablo Cantos <pablocantos at ...2499...>
> wrote:
> >>
> >> > Hi Abed,
> >> >
> >> > First of all, thanks for your contribution.
> >> >
> >> > I have checked your proposal in snort 2.9.1 by using two different
> pcap
> >> > files, one of them (612MB-sized) throws around 900k alerts and the
> other one
> >> > (1GB-sized) throws just 55 alerts. I have used an AMD Turion II
> dual-core
> >> > mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB RAM.
> >> >
> >> > These are the performance jumps that I have obtained:
> >> >
> >> > 612MB file -> 4.24% for MPSE, 5.93% for ac-q
> >> > 1GB file -> 7.93% for MPSE, 8.03% for ac-q
> >> >
> >> > These results were obtained by measuring the times taken to analyze
> >> > each pcap file by the MPSE and AC, separately.
> >> >
> >> > I have been working for several months with Snort to get improvements
> >> > in other areas of the MPSE, I have also studied the AC_SEARCH macros
> and I
> >> > didn't find them completely efficients but I didn't go further. Now,
> after
> >> > seeing your proposal and checking the AC_SEARCH_Q again I thought that
> >> > taking small changes it could work even better.
> >> >
> >> > You have suggested to pre-fetch the new state before was used it:
> >> >
> >> > #define AC_SEARCH_Q \
> >> > +   acstate_t new_state; \
> >> >     for (; T < Tend; T++) \
> >> >     { \
> >> >         ps = NextState[state]; \
> >> >         sindex = xlatcase[T[0]]; \
> >> > +       new_state = ps[2 + sindex];  \
> >> >         if (ps[1]) \
> >> >         { \
> >> >             if (MatchList[state]) \
> >> >             { \
> >> >                 if (_add_queue(&acsm->q,MatchList[state])) \
> >> >                 { \
> >> >                     if (_process_queue(&acsm->q, Match,data)) \
> >> >                     { \
> >> >                         *current_state = state; \
> >> >                         return 1; \
> >> >                     } \
> >> >                 } \
> >> >             } \
> >> >         } \
> >> > -       state = ps[2 + sindex]; \
> >> > +       state = new_state; \
> >> >     }
> >> > #endif
> >> >
> >> > But I think the routine could be more efficient too if we just fetch
> >> > the new state in the end, and we move down one line:
> >> >
> >> > #define AC_SEARCH_Q \
> >> >     for (; T < Tend; T++) \
> >> >     { \
> >> >         ps = NextState[state]; \
> >> > -       sindex = xlatcase[T[0]]; \
> >> >         if (ps[1]) \
> >> >         { \
> >> >             if (MatchList[state]) \
> >> >             { \
> >> >                 if (_add_queue(&acsm->q,MatchList[state])) \
> >> >                 { \
> >> >                     if (_process_queue(&acsm->q, Match,data)) \
> >> >                     { \
> >> >                         *current_state = state; \
> >> >                         return 1; \
> >> >                     } \
> >> >                 } \
> >> >             } \
> >> >         } \
> >> > +       sindex = xlatcase[T[0]]; \
> >> >         state = ps[2 + sindex]; \
> >> >     }
> >> > #endif
> >> >
> >> > In this way, I think the number of instructions could be reduced too.
> >> >
> >> > And these are the results that I have obtained:
> >> >
> >> > 612MB file -> 15.66% for MPSE, 22.13% for ac-q
> >> > 1GB file -> 34.49% for MPSE, 35.13% for ac-q
> >> >
> >> > I will try to repeat these tests using other more powerful computers.
> >> > In any case, we will give more information on Monday.
> >>
> >> >
> >> >
> ------------------------------------------------------------------------------
> >> > Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> >> > MVC, Windows 8 Apps, JavaScript and much more. Keep your skills
> current
> >> > with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> >> > MVPs and experts. ON SALE this month only -- learn more at:
> >> > http://p.sf.net/sfu/learnnow-d2d
> >> > _______________________________________________
> >> > Snort-devel mailing list
> >> > Snort-devel at lists.sourceforge.net
> >> > https://lists.sourceforge.net/lists/listinfo/snort-devel
> >> > Archive:
> >> > http://sourceforge.net/mailarchive/forum.php?forum_name=snort-devel
> >> >
> >> > Please visit http://blog.snort.org for the latest news about Snort!
> >> -------------- next part --------------
> >> An HTML attachment was scrubbed...
> >>
> >> ------------------------------
> >>
> >>
> >>
> >>
> ------------------------------------------------------------------------------
> >> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> >> MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
> >> with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> >> MVPs and experts. ON SALE this month only -- learn more at:
> >> http://p.sf.net/sfu/learnnow-d2d
> >>
> >> ------------------------------
> >>
> >>
> >> _______________________________________________
> >> Snort-devel mailing list
> >> Snort-devel at lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/snort-devel
> >>
> >>
> >> End of Snort-devel Digest, Vol 78, Issue 10
> >> *******************************************
> >
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.snort.org/pipermail/snort-devel/attachments/20130128/72c4fa10/attachment.html>


More information about the Snort-devel mailing list