[Snort-devel] draft rfc - a module system for snort

Todd Lewis tlewis at ...255...
Sat Apr 21 14:58:32 EDT 2001


An RFC for your reading pleasure.

**************************************************************************
TITLE: A Module System for Snort
AUTHOR: Todd Lewis <tlewis at ...255...>
STATUS: draft
VERSION: 0.00
DATE: Sat Apr 21 10:13:41 EDT 2001

0. INTRODUCTION

Shortly, I shall be submitting my packet acquisition engine (paengine)
modifications to snort for inclusion in the 1.8 version.  That work
is based on the presence of a module system, and so as a prerequisite,
I first submit this design document for a module system, accompanied by
an implementing patch file.

The paengine system makes, it seems to me, a good test case for this
system.  Of the design criteria listed below, most of them affect the
design of the paengine system.  Therefore, it is my hope that our
collective experience with this system as used by the paengines will
serve as a means for us to fine-tune this design in anticipation of its
being applied to other snort subsystems.

If I recall correctly, a precursor to this system has been part of
my paengine patch since its initial publication.  Having received no
feedback on that system, I have not perceived any opposition to the sort
of system proposed herein, but I am, nonetheless, very receptive to any
counter-arguments against the system as a whole or requests for changes
to the details, and any feedback on this proposal is welcome.

1. THE NEED

Martin Roesch, the inventor and lead developer of snort, has often stated
his desire that snort consist of a 100-line core and 100,000 lines of
modules (or words to that effect).  Prime candidates under the existing
architecture include data link decoders, preprocessors and output plugins.
I personally have already submitted several proposals to the snort
development community for new subsystems that will, in my opinion,
make the codebase more segmented, manageable and efficient, to wit:

	- the packet acquisition engines
	- the protocol engines
	- rule matching implementations
	- rule parsers

The first two would potentially have multiple implementations existing
within the same snort instance while the latter two will only have one
live instance at a time.  All of them, however, are modular, in that
they presume multiple, separate implementations of identical interfaces.
For the paengine example, this is already the case, with (at least)
three paengines having already been written by two authors.

Module systems, then, serve to facilitate the existence and (parallel
or serial) use of multiple implementations of an identical interface.
In so doing, they encourage the use of such standardised interfaces and
ease the separation of interface from implementation.  Such evolution
of the snort code base will serve the developer community and the user
base exceedingly well as time goes on.

2. DESIGN CRITERIA

	1) provide standardized support for use of modules

A module system should provide a single interface through which all
modular snort subsystems can be accessed.  This includes discovering
the modules, activating them, accessing them and deactivating them.

	2) multiple implementations of the same interface

As the packet acquisition engine and output plugin examples demonstrate,
snort can require the use of multiple implementations of the same
interface.  Therefore, the module system should allow for the discovery
of multiple implementations, as opposed to merely allowing for the
loading of a single implementation.  A standardised convention for
labeling modules with their name is therefore included in order to
support name-based module discovery.

	3) provide as much assistance to user as possible and prudent

Further, the system should be designed at a high-enough level of
abstraction that the implementation can take care of as many bothersome
details as possible for the user.  (This is the main change from
the earlier incarnation of this system, which made many more demands
of the user.)  This includes supporting automatic module discovery,
automated re-use of already-activated modules, and garbage collection
to ensure that the resources associated with modules are actually freed,
and further that they are freed at the proper time.

	4) platform independence

It is a major goal of snort to be as platform-independent as reasonably
possible.  Therefore, this proposal includes several measures to
ensure that the module system will work on a wide range of platforms.
For platforms that do not have support for dynamic code loading or whose
dynamic code system happens to be unsupported, static compilation of
modules is a supported feature.  Although the initial implementation,
reflecting the platforms to which I have access, is limited to supporting
the dlopen interface, the interface to the module system is written in
such a way that support for other systems is easy to add, and there is
an implementation note explaining how such additions can occur.

	5) safely allow evolution of interfaces

Interfaces will change over time, and the possibility exists that
the core snort code and the module code will drift from each other.
While no module system can prevent ABI incompatibilities, it can offer
assistance in detecting them.  Therefore, the inclusion of versioning
information is mandatory, and the module system will strictly enforce
version requirements, refusing to load code which is version-mismatched.
(The user has the option to avoid version matching, but he must explicitly
choose to do so.)

	6) allow ultimate flexibility in how code is deployed

There should be as few constraints as possible imposed on the user in how
he can deploy code.  If he wishes to take a new module and load it into a
running snort instance, and if there is an appropriate underlying shared
code system on the platform, then doing so should be possible as far as
the module system is concerned.  Therefore, the interface below supports
not only scanning a fixed set of module collections but also rescanning,
potentially with a changed set of dynamic module directories and static
module collections.

	7) allow mixing of modules from different systems in same dir

Some administrators and/or policy regimes may not wish to have a
surfeit of directories.  Therefore, it is desirable to support having
modules of different types in the same directory and having the
system able to find the right module in the right context.

3. PROPOSED INTERFACE

The name of this system shall be "smodule", for "Snort Module".

3.1 DATA TYPES

	typedef int smcol_t;
	typedef struct _smodule {
		char* name;
		unsigned int vmaj;
		unsigned int vmin;
		void* interface;
	} smodule;

3.1.1 smodule

The smodule structure is the interface which module implementations
use to export their functionality through the module system:

	typedef struct _smodule {
		char* name;
		unsigned int vmaj;
		unsigned int vmin;
		void* interface;
	} smodule;

3.1.2 smcol_t

	typedef int smcol_t;

smcol_t is a handle to a module collection.  It is specified as an opaque
type, but just between you and me and the header file, it is almost
certainly a plain int, since scan_collection() returns -1 on error.
Users pass an array of static modules
and a list of directories to the module system; the module system uses
this to prepare a collection, the external representation of which is
the collection handle, smcol_t.
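
As a rough illustration of how an integer handle can stand for a
collection, the sketch below keeps a small fixed table of collection
records and hands out the table index as the smcol_t.  The names col_rec,
col_table, col_alloc, col_lookup and col_free, the MAX_COLS limit and the
use of ENOMEM for a full table are all hypothetical; the proposal does
not prescribe any of them.

```c
#include <errno.h>
#include <stddef.h>

typedef int smcol_t;

#define MAX_COLS 16		/* hypothetical limit on live collections */

/* Hypothetical internal record behind one collection handle. */
typedef struct {
	int in_use;
	unsigned int vmaj;
	unsigned int vmin;
} col_rec;

static col_rec col_table[MAX_COLS];

/* Reserve a free slot and return its index as the handle, or -1. */
static smcol_t
col_alloc(unsigned int vmaj, unsigned int vmin)
{
	int i;

	for (i = 0; i < MAX_COLS; i++) {
		if (!col_table[i].in_use) {
			col_table[i].in_use = 1;
			col_table[i].vmaj = vmaj;
			col_table[i].vmin = vmin;
			return i;
		}
	}
	errno = ENOMEM;		/* assumed errno for a full table */
	return -1;
}

/* Map a handle back to its record; NULL if the handle is not live. */
static col_rec *
col_lookup(smcol_t c)
{
	if (c < 0 || c >= MAX_COLS || !col_table[c].in_use)
		return NULL;
	return &col_table[c];
}

/* Mark the slot free so the handle value can be reused. */
static void
col_free(smcol_t c)
{
	if (col_lookup(c) != NULL)
		col_table[c].in_use = 0;
}
```

Keeping the handle a small signed integer leaves -1 free as the error
value and keeps the internal record layout out of the header file.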

3.2 INTERFACES

(N.b., update this information as implementation evolves, especially
error specifications.)

	smcol_t scan_collection(char* mod_name, smodule* statics[],
		char** dirs, unsigned int vmaj, unsigned int vmin);
	void release_collection(smcol_t collection);
	smodule* get_smodule(smcol_t collection, char* name);
	void release_smodule(smodule* mod);

3.2.1 scan_collection

	smcol_t scan_collection(char* mod_name, smodule* statics[],
		char** dirs, unsigned int vmaj, unsigned int vmin);

DESCRIPTION

scan_collection allows the module system to scan a set of modules
in preparation for serving get_smodule requests.

In support of mixing modules of different types in the same directory,
the caller may pass the name of the symbol which the module system should
look for in dynamic modules.  (The issue is moot for static modules.)
In the absence of any such name, the system will default to looking for
the symbol named "smodule".
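
On a dlopen-based platform, the symbol lookup described above might look
roughly like the following.  This is only a sketch: the helper name
load_module_file and its handle_out parameter are hypothetical, and error
reporting is reduced to a NULL return.  On some older systems the program
must also be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

typedef struct _smodule {
	char* name;
	unsigned int vmaj;
	unsigned int vmin;
	void* interface;
} smodule;

/*
 * Open one shared object and look up its exported module array.
 * sym_name may be NULL, in which case the default "smodule" is used.
 * On success *handle_out receives the dlopen handle, which the caller
 * must eventually pass to dlclose().  Returns NULL on failure.
 */
static smodule **
load_module_file(const char *path, const char *sym_name, void **handle_out)
{
	void *h;
	smodule **mods;

	if (sym_name == NULL)
		sym_name = "smodule";

	h = dlopen(path, RTLD_NOW);
	if (h == NULL)
		return NULL;		/* dlerror() describes the failure */

	mods = (smodule **)dlsym(h, sym_name);
	if (mods == NULL) {
		/* No such symbol: the file belongs to another subsystem. */
		dlclose(h);
		return NULL;
	}
	*handle_out = h;
	return mods;
}
```

A file that lacks the requested symbol is simply skipped, which is what
lets modules of different types share one directory.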

scan_collection() further takes as arguments a null-terminated array
of smodule pointers pointing to the statically-compiled modules for
this system as well as an array of names of directories holding dynamic
module files.  For unix systems, these must be fully-qualified path names;
other platforms may issue guidelines for these names.

Additionally, a collection has version requirements signified by a
major and minor version number.  The module system assumes the normal
major/minor semantic convention, namely that minor version differences
are upwardly-compatible (e.g., a requirement for vmin=10 is satisfied
by vmin=13), while major version differences are presumed completely
incompatible.  Module system users who wish to eschew enforcement of
version compatibility by the module system should use the value '0'
for both vmaj and vmin, which disables version checking.
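
The version rule can be written down in a few lines.  The function name
version_ok is hypothetical; the logic is just the major/minor convention
and the 0/0 opt-out described in this section.

```c
/*
 * Return 1 if a module at (mod_maj, mod_min) satisfies a collection
 * requirement of (req_maj, req_min), else 0.
 */
static int
version_ok(unsigned int req_maj, unsigned int req_min,
	unsigned int mod_maj, unsigned int mod_min)
{
	/* 0/0 means the caller opted out of version checking. */
	if (req_maj == 0 && req_min == 0)
		return 1;
	/* Major versions must match exactly. */
	if (mod_maj != req_maj)
		return 0;
	/* A newer minor version satisfies an older requirement. */
	return mod_min >= req_min;
}
```

So version_ok(1, 10, 1, 13) accepts the module, while
version_ok(1, 10, 2, 13) and version_ok(1, 10, 1, 9) both reject it.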

The directories may be non-existent, but if they do exist then they must
be accessible, else an error will be returned.  Any named directories
that do not exist will be skipped without error, but the module system
will place a warning in the log.

RETURN VALUE

On success, scan_collection() returns an integer descriptor which can
be used for future discovery of modules in the collection.  On error,
-1 is returned and errno is set.  In this case, release_collection()
does not need to be called.

ERRORS

Any error will be accompanied by an explanatory log entry.

Here are some of the errno values that can be returned and what they
mean in this context:

EACCES - one of the named directories was inaccessible
ENOTDIR - one of the named directories is actually not a directory
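
One way the directory rules and errno values above could be checked on a
unix system is sketched below.  The function name check_module_dir and
its three-way return value are hypothetical, and fprintf() stands in for
whatever logging facility the module system ends up using.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Classify one directory name.
 * Returns 1 if the directory is usable, 0 if it does not exist (the
 * caller should skip it), and -1 with errno set on a real error.
 */
static int
check_module_dir(const char *dir)
{
	struct stat st;

	if (stat(dir, &st) < 0) {
		if (errno == ENOENT) {
			/* stand-in for the module system's log */
			fprintf(stderr, "warning: %s does not exist\n", dir);
			return 0;
		}
		return -1;	/* e.g. EACCES on a parent directory */
	}
	if (!S_ISDIR(st.st_mode)) {
		errno = ENOTDIR;
		return -1;
	}
	/* Read permission to list it, search permission to open files. */
	if (access(dir, R_OK | X_OK) < 0)
		return -1;	/* typically EACCES */
	return 1;
}
```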

3.2.2 release_collection

	void release_collection(smcol_t collection);

DESCRIPTION

release_collection() allows the module system to free the resources
associated with a collection.  N.b., this does not release the modules
already activated out of a collection; they must be released individually.

RETURN VALUE

None.

ERRORS

None.

3.2.3 get_smodule

	smodule* get_smodule(smcol_t collection, char* name);

DESCRIPTION

get_smodule() attempts to find an instance of the named module that
satisfies the version requirements of the collection.

RETURN VALUE

On success, a pointer to the smodule with the passed name is returned.
On failure, NULL is returned and errno is set.

ERRORS

Any error will be accompanied by an explanatory log entry.

Here are some of the errno values that can be returned and what they
mean in this context:

ENOENT - a module with this name could not be found in this collection.

3.2.4 release_smodule

	void release_smodule(smodule* mod);

DESCRIPTION

release_smodule() informs the module system that the user no longer
requires use of the smodule.  When all users have released the smodule,
then it will automatically be deactivated.  (On dl-based systems, this
means dlclose() will be called on the shared object; behaviour on other
systems shall be similar and defined precisely in their implementation
documentation.)
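
The "all users have released" rule amounts to reference counting.  A
minimal sketch follows, assuming a per-module bookkeeping record; the
names mod_rec, mod_ref and mod_unref are hypothetical, and the dlclose()
call is only indicated in a comment so that the example stands alone.

```c
#include <stddef.h>

/* Hypothetical bookkeeping record for one activated module. */
typedef struct {
	void *dl_handle;	/* dlopen() handle, NULL for a static module */
	unsigned int refs;	/* number of outstanding users */
	int active;		/* nonzero while the module is loaded */
} mod_rec;

/* Called when get_smodule() hands the module to another user. */
static void
mod_ref(mod_rec *r)
{
	r->active = 1;
	r->refs++;
}

/*
 * Called from release_smodule().  Returns 1 if this was the last
 * user and the module was deactivated, 0 otherwise.
 */
static int
mod_unref(mod_rec *r)
{
	if (r->refs == 0)
		return 0;	/* unbalanced release; ignore */
	if (--r->refs > 0)
		return 0;
	/* Last user gone: a dl-based system would dlclose() here. */
	r->active = 0;
	return 1;
}
```

The same counter is what allows get_smodule() to hand back an
already-activated module instead of loading it a second time.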

RETURN VALUE

None.

ERRORS

None.

3.3 NOTES

All pointer arrays are null-terminated.

The symbol exported from the module (either "smodule" or another name
specific to the system using smodule) should be of type "smodule **".
It consists of an array of smodule pointers terminated by a NULL.

The actual type of the "void* interface" exported from the smodule
is defined by the using system.  E.g., for paengine modules, they
will be of type "paengine_s".  It is recommended that this be a
structure or some other suitable data structure that serves as the
entry point to access all functionality in the module.
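
From the module author's side, the two notes above might come together
as follows.  The example_iface structure, the foo_* functions and the
module name "foo" are hypothetical placeholders.  The exported array is
given the subsystem-specific name "paengine" here because, in a C file
that can see the smodule typedef, a variable literally named "smodule"
would clash with that typedef.

```c
#include <stddef.h>

typedef struct _smodule {
	char* name;
	unsigned int vmaj;
	unsigned int vmin;
	void* interface;
} smodule;

/* Hypothetical interface structure for one subsystem. */
typedef struct {
	int (*open)(const char *source);
	void (*close)(void);
} example_iface;

static int foo_open(const char *source) { (void)source; return 0; }
static void foo_close(void) { }

static example_iface foo_iface = { foo_open, foo_close };

/* One smodule record per implementation in this file. */
static smodule mod_foo = { "foo", 1, 1, &foo_iface };

/* The exported symbol: a NULL-terminated array of smodule pointers. */
smodule* paengine[] = { &mod_foo, NULL };
```

The user of the module casts the void* interface back to the agreed
structure type and calls through its function pointers.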

The "char** dirs" passed into scan_collection belongs to the caller,
and the module system will make copies if it needs any of the
directory names.  However, the "smodule* statics[]" that is passed
in is assumed to be, well, static.

There is no guarantee of when the passed directories are scanned or
when the modules will be activated, although presumably both will
happen before an activated module is returned to the user.  If
a material change happens to the available set of modules, then it
is up to the user to discard the present collection, create a new one
and use that for all future activations.

When a module is requested for activation, if an instance of that
module is already activated, then it can be returned to the caller.
If not, then dynamic module directories, if present and supported,
will be scanned in the order they were passed to scan_collection().
If no dynamic module can be found, then static modules will be scanned.
This scanning may already have happened and the results cached before
get_smodule() is called.
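
The search order can be sketched with two small helpers.  The names
find_by_name and find_module are hypothetical, and the dyn argument
merely stands in for whatever the dynamic directory scan produced;
version checking and reuse of already-activated modules are left out to
keep the example short.

```c
#include <stddef.h>
#include <string.h>

typedef struct _smodule {
	char* name;
	unsigned int vmaj;
	unsigned int vmin;
	void* interface;
} smodule;

/*
 * Walk a NULL-terminated array of smodule pointers and return the
 * first entry with the given name, or NULL if there is none.
 */
static smodule *
find_by_name(smodule *mods[], const char *name)
{
	size_t i;

	if (mods == NULL)
		return NULL;
	for (i = 0; mods[i] != NULL; i++) {
		if (strcmp(mods[i]->name, name) == 0)
			return mods[i];
	}
	return NULL;
}

/* Dynamic candidates are consulted first, then the static list. */
static smodule *
find_module(smodule *dyn[], smodule *statics[], const char *name)
{
	smodule *m = find_by_name(dyn, name);

	return m != NULL ? m : find_by_name(statics, name);
}
```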

3.4 USAGE EXAMPLE

3.4.1 static module list creation

A little shell script will be enough to create the list of static
modules:

	#!/bin/sh

	# usage: create_static_mod_list (module files)
	# called from the makefile with a list of statically-compiled mods
	echo "static smodule* statics[] = {"
	for i in "$@"
	do
		# take the last field of the first line mentioning MODNAME
		n=`grep MODNAME "$i"|head -1|awk '{print $NF}'`
		echo "	&$n,"
	done
	echo "	NULL"
	echo "};"

This will create a list like this:

----------------------------------------------------------------
	static smodule* statics[] = {
		&mod_foo,
		&mod_bar,
		&mod_baz,
		NULL
	};
----------------------------------------------------------------

3.4.2 code invocation

Here is an example of a piece of client code:

----------------------------------------------------------------
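#include <stdio.h>
#include <stdlib.h>
#include "smodule.h"

/* statics[] is the generated list from 3.4.1; the paengine type comes
 * from the paengine subsystem's own header. */

static char* dyn_dirs[] = {
	"/var/lib/snort/modules/paengines/",
	/* N.b.: the caller must expand "~" to a fully-qualified path
	 * before passing it in (see 3.2.1). */
	"~/.snort/paengines/",
	NULL
};

paengine* pae;

void
find_paengine(char* paname)
{
	smcol_t c;
	smodule* s;

	c=scan_collection("paengine", statics, dyn_dirs, 1, 1);
	if(c<0){
		fprintf(stderr, "Can't scan available paengine modules.  Exiting.\n");
		exit(-1);
	}
	s=get_smodule(c, paname);
	if(s==NULL){
		fprintf(stderr, "Can't find paengine named \"%s\"; exiting.\n", paname);
		exit(-1);
	}
	pae=s->interface;
	/* an activated module stays valid after its collection is released */
	release_collection(c);
}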
----------------------------------------------------------------

4. IMPLEMENTATION STRATEGIES

A naive implementation can just re-scan the directories and static
module collections every time there is a request.  This is more expensive,
but there's no particular need to optimize this operation, and it is
far less code.
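
For the naive approach, the per-request directory scan could start from
something like this sketch, which walks one directory and counts the
candidate module files.  The function name count_module_files and the
".so" suffix test are assumptions; a real scan would go on to open each
candidate and look up the module symbol.

```c
#include <dirent.h>
#include <string.h>

/*
 * Count the entries in dir whose names end in ".so".
 * Returns -1 (with errno set by opendir) if dir cannot be opened.
 */
static int
count_module_files(const char *dir)
{
	DIR *d;
	struct dirent *e;
	int n = 0;

	d = opendir(dir);
	if (d == NULL)
		return -1;
	while ((e = readdir(d)) != NULL) {
		size_t len = strlen(e->d_name);

		if (len > 3 && strcmp(e->d_name + len - 3, ".so") == 0)
			n++;
	}
	closedir(d);
	return n;
}
```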

Support for other dynamic code systems should be easy.  Look at glib's
gmodule subsystem for a list of candidates (including dld, win32, beos
and os2) and for code to steal.

5. SMODULE.H

The following is the header file for the proposed interface:

----------------------------------------------------------------
#ifndef SMODULE_H
#define SMODULE_H

typedef int smcol_t;
typedef struct _smodule {
	char* name;
	unsigned int vmaj;
	unsigned int vmin;
	void* interface;
} smodule;

extern smcol_t scan_collection(char* mod_name, smodule* statics[],
	char** dirs, unsigned int vmaj, unsigned int vmin);
extern void release_collection(smcol_t collection);
extern smodule* get_smodule(smcol_t collection, char* name);
extern void release_smodule(smodule* mod);

#endif /*SMODULE_H*/
----------------------------------------------------------------

**************************************************************************

--
Todd Lewis
tlewis at ...255...




