PROPOSED SPECIFICATION FOR GNUTELLA .GNL FILES
---------------------------------------------------------------------------
v.1.0.0, 8/24/2000
by John Hoffman, theshadow@shambala.net
with Albert Vierling, avierling@yahoo.com
updated version available at http://gnutella.shambala.net/gnl_spec.txt

I had the idea of a .GNL (GNutella Link) file when thinking about private
Gnutella networks, and why they are so few and far between.  The .GNL file
is a way of defining such a network and passing connect information on to
other people.  It is much like a RealMedia .RAM file or a .M3U file, in
that it can be quickly downloaded by a browser and then point the client
to a network.

Optimally, a .GNL file should be opened by a client as a command-line
argument, with no flags.  A GNet client should install itself as the
default handler for the .GNL file type, in the OS and preferably in all
installed web browsers.

A .GNL file is an ASCII text file, with lines delimited by either PC-style
CRLF or Unix-style LF.  It may have one of two content formats:

* FORMAT #1: Private Network Definition
  -------------------------------------

GNL VERSION SECTION
[GNL]
This section contains information specific to the .GNL file.

GNL VERSION
gnlver={GNL version}
Example: gnlver=1.0
The client should give an error if it does not recognize this version of
the GNL specification or if this key is missing.  This line should only
include the major version and sub-version information; minor sub-versions
(e.g. 1.0.1, 1.0.2) designate changes that do not require a new format for
the .GNL file.

NAME
name={name of the network}
Example: name=The GnutellaNet
This gives the name of the private (or general) network, and is mostly
informative.  The client may display this name, or ignore it completely.

HOME
home={URL}
Example: home=http://gnutella.wego.com
This is a URL to the HTTP home page of the private network.  The client
might provide a button that would pop up a browser window with this page;
or it could ignore it.

MOTD
motd={URL}
Example: motd=http://www.gnutellanews.com
This is a URL to an HTTP page with a "message of the day".  The client
could handle it in all sorts of fancy ways, popping it up if it's changed;
or it could ignore this key entirely.

NETWORK SECTION
[network]
Contains the definitions for accessing the network.  Also specifies that
this .GNL contains a network definition and not a redirect.

VERSION
version={GNet protocol version}
Example: version=0.4
This gives the version of the GnutellaNet protocol being used in the
network, allowing newer clients to use the older protocol without
compatibility issues.  If a client cannot support the version shown, it
should display an error message.  The protocol version will be included in
the login.

LOGIN
login={challenge line}
Example: login=GNUTELLA CONNECT
This line contains the challenge sent by the client to nodes it connects
with, minus the forward-slash, protocol version number, and two newlines.
Any leading or trailing whitespace should be stripped to prevent problems.

RESPONSE
response={response line}
Example: response=GNUTELLA OK
This line contains the response the client should expect from nodes it
connects to, minus newlines; it is also the line the client should respond
with when another client attempts to connect with it.  Any leading or
trailing whitespace should be removed.

CACHE LIST SECTION
[caches]
Contains a list of hostcaches, for a client to initialize itself.  If this
section is missing or empty, the client should display a warning; it
shouldn't give an outright error, since the owner of the private network
might want people to find nodes another way.  The client may use these in
any way it pleases; I would suggest connecting to one at random if at any
time the host catcher is empty and there are no active connections or
connection attempts.

CACHE LIST
caches={address and port}{,address and port}...
Example: caches=gnet1.ath.cx:6346,gnet2.ath.cx:6346
These lines contain the addresses of one or more hostcaches provided for
this network.  The port numbers should always be included.  There may be
more than one "caches=" line in the definition file.

SHARED FILE TYPE LIST SECTION
[shared]
Contains a list of the file types this network was set up to share.  This
allows the scope of a private GNet to be limited, allowing special purpose
GNets to be set up.  The filename buffer should be filtered with this list
on scanning.  Implementation of this is optional but strongly recommended.

Optional but recommended: All GNet QueryHit packets containing a filename
whose type isn't shared should be filtered out.

Optional but recommended: All GNet Query packets containing a search for
only one of these extensions should be filtered out.  For instance, if
"mp3" is one of the shared file types, searches for "mp3", ".mp3" and
"*.mp3" (and perhaps other variations) should be disallowed.

If this section is not present, the client's default list of search
extensions should be used, and GNet QueryHit packets should NOT be filtered
for those extensions.  (You might want to keep the Query packet filter in
place, though.)  If this section is present but empty, a warning should be
given.

SHARED FILE TYPE LIST
shared={file extension}{,file extension}...
Example: shared=mp3,mp2,mp4,aac
These lines contain a list of file extensions to be passed by this private
network.  There may be more than one "shared=" line in the definition file.

FILTER LIST SECTION
[filters]
Contains a list of search terms considered undesirable on this private
network.  Query packets matching any of these terms should be filtered out.
Implementation of this filtering is optional but strongly recommended.

FILTER LIST
filters={filespec}{,filespec}...
Example: filters=mp3,.mp3,*xxx*,*sex*,*adult*
These lines contain a list of filters.
A filter should be case insensitive, and may contain an asterisk to match
zero or more characters, or a question mark to match any single character.
The example filters shown above should filter searches for "mp3", ".mp3",
or anything containing the string "xxx", "sex" or "adult".  A search for
"metallica mp3" should be allowed to pass.  There may be more than one
"filters=" line in the definition file.

COMMENT
;{commentary}
Example: ;this is a comment
Any line with a semicolon as the first character should be ignored.

OTHER
Any unrecognized keyword, or any keyword in an unrecognized section, should
be ignored; this allows for some expansion in the .GNL protocol without
needing to change the file specification completely.

Therefore, this file (gnet.gnl) would be a definition to connect to the
main Gnutella network:
============================================
[GNL]
gnlver=1.0
name=The GnutellaNet
home=http://gnutella.wego.com
motd=http://www.gnutellanews.com
[network]
version=0.4
login=GNUTELLA CONNECT
response=GNUTELLA OK
[caches]
caches=gnet.ath.cx:6346
caches=gnet1.ath.cx:6346,gnet2.ath.cx:6346
caches=gnet3.ath.cx:6346,gnet4.ath.cx:6346
============================================

This file (cmovies.gnl) might be used to define a private network dedicated
to sharing classic movies:
============================================
[GNL]
gnlver=1.0
name=The Classic Movie Connection
home=http://www.classics.com
motd=http://www.classics.com/motd.html
[network]
version=0.4
login=CLASSIC MOVIES
response=GNUTELLA OK
[caches]
caches=cache1.classics.com:6346
caches=cache2.classics.com:6346
[shared]
shared=mov,avi,asf,mpeg,mpg,rm,divx
[filters]
filters=mp3,.mp3,*xxx*,*sex*,*adult*
============================================

* FORMAT #2: Redirect
  -------------------

GNL VERSION SECTION
[GNL]
This section contains information specific to the .GNL file.

GNL VERSION
gnlver={GNL version}
Example: gnlver=1.0
The client should give an error if it does not recognize this version of
the GNL specification or if this key is missing.  This line should only
include the major version and sub-version information; minor sub-versions
(e.g. 1.0.1, 1.0.2) designate changes that do not require a new format for
the .GNL file.

REDIRECT LIST SECTION
[redirect]
A URL, or list of URLs, pointing to a different .GNL file.  (This is so a
private network definition can be maintained and updated as necessary.)
If the "[network]" section is present, this section should be ignored.

REDIRECT ADDRESS LIST
url={URL of new .GNL}{,URL of new .GNL}...
Example: url=http://gnutella.wego.com/gnet.gnl
The client should download the .GNL file at the given URL and use it
instead.  The protocol used should be limited to HTTP, since GNet already
incorporates the HTTP protocol.  There may be more than one "url=" line in
the definition file.

I suggest only one level of redirection to a new .GNL file be allowed;
otherwise, one .GNL could point to itself, resulting in an infinite loop.

More than one URL should be specified; if the first one fails, the client
should try each of the others in order.  This allows for server problems
or government intervention.

COMMENT
;{commentary}
Example: ;this is a comment
Any line with a semicolon as the first character should be ignored.

OTHER
Any unrecognized keyword, or any keyword in an unrecognized section, should
be ignored; this allows for some expansion in the .GNL protocol without
needing to change the file specification completely.
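
The redirect rules above (ignore "[redirect]" when "[network]" is present,
try each URL in order, HTTP only, one level of redirection) can be sketched
in Python roughly as follows.  This is only an illustration, not part of the
specification.  The names split_sections, redirect_urls and resolve_gnl are
hypothetical, as is the "fetch" callable, which is assumed to take a URL and
return the downloaded text or raise OSError on failure.  Matching section
names case-insensitively, and treating a redirect target that is itself a
redirect as a failed URL, are assumptions the spec does not spell out.

```python
def split_sections(text):
    """Group key/value pairs by section heading; comments are skipped."""
    sections = {}
    current = None
    for raw in text.splitlines():              # accepts CRLF and LF
        line = raw.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            # Assumption: section names are compared case-insensitively.
            current = line[1:-1].strip().lower()
            sections.setdefault(current, [])
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            sections[current].append((key.strip(), value.strip()))
    return sections


def redirect_urls(sections):
    """Return every URL from all "url=" lines, in file order."""
    urls = []
    for key, value in sections.get("redirect", []):
        if key == "url":
            urls.extend(u.strip() for u in value.split(",") if u.strip())
    return urls


def resolve_gnl(text, fetch):
    """Return the text of a network definition, following one redirect."""
    sections = split_sections(text)
    if "network" in sections:
        return text                            # [redirect] is ignored
    for url in redirect_urls(sections):
        if not url.lower().startswith("http://"):
            continue                           # redirects are HTTP only
        try:
            target = fetch(url)
        except OSError:
            continue                           # server problem: try next
        if "network" in split_sections(target):
            return target
        # Assumption: a target that is itself a redirect counts as a
        # failure, which enforces the one-level limit.
    raise ValueError("no usable network definition found")
```

In a real client, "fetch" might wrap urllib.request.urlopen; the errors it
raises for unreachable servers are subclasses of OSError.  Checking the
gnlver key is left out of this sketch.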

A redirect file (gnutella.gnl) such as this one would therefore be made
available for download:
============================================
[GNL]
gnlver=1.0
[redirect]
url=http://gnutella.wego.com/gnet.gnl
url=http://gnutelladev.wego.com/gnet.gnl
url=http://www.gnutelliums.com/gnet.gnl
url=http://gnet.ath.cx/gnet.gnl,http://gnet1.ath.cx/gnet.gnl
url=http://www.gnuranium.com/gnet.gnl
============================================

NOTES:

The original purpose of the .GNL specification was to provide a nice,
simple way of opening up separate GNet networks.  Adding in dedication to
specific file types and filtering is admittedly going beyond that purpose.
I've consulted with several people on the subject, however, and I've
decided to go ahead and add them to the specification.  It is up to you
whether you wish to support them or not.
___

GNL Compliancy:  In order for a client to rightfully claim to be GNL
compliant, it needs to fully implement checking the gnl:gnlver key, and
reading and using all keys in the network and cache sections.  Reading and
using the other keys is recommended but optional.
___

It's been suggested that IP/subnet banning be added; this would be
infeasible for the GNet v.0.4 protocol, in that the .GNL file is only
loaded at the start of the session.  A ban would only hold for newly
established nodes; a client at a banned IP could simply keep trying nodes
until it found one where its IP was allowed access.

An IP/subnet ban would be workable for a new version of the GNet protocol,
if:
1. The protocol were not backwards compatible, keeping unprotected nodes
   off the network;
2. The client were required to reload the .GNL file (or a separate list of
   banned IPs, specified in the .GNL) every few minutes; and
3. Each client were required to use a .GNL file, and there were a way to
   validate that the correct .GNL file was being used.  Otherwise, someone
   could advertise an altered .GNL file at another location, adding
   unprotected nodes and circumventing the ban list.

Note that this sort of validation may be desirable for other features to
be incorporated.  For the current features, changing the filter list on
anything less than a large number of nodes would have little effect, and
changing any of the other parameters would result in no connection to the
GNet being established.

Also consider that banning an IP/subnet involves finding the IP of a
spammer in the first place; this may be impossible without also publicizing
the addresses of people searching for items.
___

Some novel ideas for using .GNL files:
1. having CGI generate them dynamically
2. having CGI switch people onto one of several different private networks
   randomly, cutting down on the network sizes
___

I've also been asked how filtering could be "enforced" on clients.  It
can't; but if the suggested filtering were used by a majority of the
clients on a private network, searches for undesirable content would only
travel one or two hops before being filtered out.  This would discourage
people from doing such searches on dedicated private networks, since their
efforts would be more fruitful on an unlimited GNet.
___

My original specification was simple, line-keyed...  I was convinced to go
with a Microsoft INI-style format by some people, and others kept urging me
to switch to XML.  Well, part of the design philosophy was to keep the file
size down; INI-style adds a 40% overhead to the size, and XML would be well
over 100%.

http://www.enjoy.ne.jp/~gm/program/parsecfg/ is a C library for parsing INI
files, though it has some drawbacks.

To keep things simple for implementing a parser, I recommend that any
future additions to this spec not have keys duplicated between sections.
For instance, use the keys "gnlver" and "version", instead of "version" in
both sections.  In this way, a custom parser can ignore section headings
and parse keywords alone.
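
As an illustration of that last point, here is a minimal Python sketch of a
parser that ignores section headings and collects keywords alone, together
with the wildcard matching described under FILTER LIST.  It is not a
reference implementation.  The names parse_gnl, list_values,
filter_to_regex and query_is_filtered are hypothetical.  It also assumes
that leading whitespace before a semicolon or keyword may be stripped, and
that a filter must match the whole search string (which is how "metallica
mp3" passes the "mp3" filter).

```python
import re


def parse_gnl(text):
    """Map each keyword to the list of its values, in file order."""
    keys = {}
    for raw in text.splitlines():              # accepts CRLF and LF
        line = raw.strip()                     # assumption: lenient trim
        if not line or line[0] in ";[":        # comment or section heading
            continue
        name, sep, value = line.partition("=")
        if sep:                                # skip lines without "="
            keys.setdefault(name.strip(), []).append(value.strip())
    return keys


def list_values(keys, name):
    """Flatten comma-separated values from all lines of one keyword."""
    items = []
    for value in keys.get(name, []):
        items.extend(p.strip() for p in value.split(",") if p.strip())
    return items


def filter_to_regex(spec):
    """Translate a filespec: '*' is any run of characters, '?' is one."""
    parts = []
    for ch in spec:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def query_is_filtered(query, filters):
    """True if the whole search string matches any filter."""
    q = query.strip()
    return any(filter_to_regex(f).fullmatch(q) for f in filters)


# The cmovies.gnl example from this document.
SAMPLE = """[GNL]
gnlver=1.0
name=The Classic Movie Connection
home=http://www.classics.com
motd=http://www.classics.com/motd.html
[network]
version=0.4
login=CLASSIC MOVIES
response=GNUTELLA OK
[caches]
caches=cache1.classics.com:6346
caches=cache2.classics.com:6346
[shared]
shared=mov,avi,asf,mpeg,mpg,rm,divx
[filters]
filters=mp3,.mp3,*xxx*,*sex*,*adult*
"""
```

With this sketch, list_values(parse_gnl(SAMPLE), "filters") yields the five
filters from the example file, and query_is_filtered can then be applied to
each incoming search string.  Because headings are skipped, the approach
only works while no key is duplicated between sections.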