GNV: PROPOSED SPECIFICATION FOR GNUTELLA NETWORK VERIFICATION
---------------------------------------------------------------------------
v.0.0.4, 9/10/2000
by John Hoffman, theshadow@shambala.net
updated version available at http://gnutella.shambala.net/gnv_spec.txt

I am providing this specification in response to the perceived need for
an IP blacklisting system for Gnutella. This is a rather difficult task,
since there can be no central, trusted source for information about a
GNet. Having such a central node would give a network a "soft
underbelly", allowing the network to be shut down. The only solution I
can see to this problem is the use of a public key cryptosystem, where a
digital signature can be used to verify whether a message came from the
sponsor of that GNet.

* KEY/SIGNATURE FORMAT
  --------------------

Each GNet should have associated with it a private key, kept
confidential by the sponsor of that network. From that private key
would be generated a public key, to be openly distributed. The private
key would be used to generate digital signatures from specific data;
anyone possessing the correct public key would then be able to verify
that the data was not counterfeit. Note that a high-security key is not
necessary; should the private key be discovered, the key the network
uses can be changed fairly quickly.

The public key used should be a 1024-bit RSA key. Key or signature data
will always be passed in binary form via packet, or in text files using
a MIME base64 encoding, in a single 86 character line. The data will be
stored in Little-Endian (x86) byte order.

INCOMPLETE! THE EXACT FORMATS, AND METHODS FOR GENERATING, PUBLIC AND
PRIVATE KEYS AND DIGITAL SIGNATURES NEED TO BE SPECIFIED HERE!

There is a software library called "MIRACL", available at
http://indigo.ie/~mscott/ that includes public key encryption
algorithms and is free for non-commercial use; this might be a good
basis to start with.
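
As an illustration of the text encoding described above, here is a
minimal sketch in Python. The function names are hypothetical, and
since the exact key format is still unspecified, treating the key
material as one fixed-width little-endian integer is an assumption.
Note that under standard base64 a 128-byte (1024-bit) value encodes to
172 characters, while 86 unpadded characters correspond to 64 bytes
(512 bits); the sketch does not try to decide which figure the final
specification should use.

```python
import base64


def encode_key_le(value: int, bits: int = 1024) -> str:
    """Serialize an integer as fixed-width little-endian bytes, then base64.

    Hypothetical helper: the spec does not yet define the key layout.
    """
    raw = value.to_bytes(bits // 8, "little")
    return base64.b64encode(raw).decode("ascii")


def decode_key_le(text: str) -> int:
    """Inverse of encode_key_le: base64 text back to an integer."""
    return int.from_bytes(base64.b64decode(text), "little")
```

A fixed width keeps every encoded key the same length, so a client can
reject a malformed line before attempting any arithmetic on it.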

* CONNECTION VERIFICATION
  -----------------------

The purpose of using connection verification is to make certain all
"honest" GNet clients have the same public key and other configuration
information; in this way, it becomes impossible either for a dishonest
client to counterfeit system-wide broadcasts, or for nodes to be added
using incorrect configuration information.

THERE ARE TWO POSSIBLE METHODS OF ESTABLISHING A NETWORK'S PUBLIC KEY;
ONLY ONE SHOULD BE USED FOR THE FINAL SPECIFICATION:

* KEY LOADED FROM CONFIG

This method is dependent on the existence of a configuration file;
either a .GNL file or other proprietary format. The configuration file
would contain the public key used for the network, in ASCII format.
Also, in the same location as the configuration file, with the same
file name except that the file extension is replaced with "KEY", would
be a file containing a digital signature for that configuration file,
created using the private key for the network, in ASCII format. The
first task of the client would be to check the configuration file
against its digital signature, and give an error if they didn't match.

Upon connecting to another GNet client, the local client would transmit
the public key to the remote machine. (The exact format and timing of
this transmission would be specified for that version of the GNet
protocol.) If the remote machine did not receive the identical key it
is using as the network's public key, it would refuse the connection.

* KEY LOADED DYNAMICALLY

Upon connection to the GNet, a client would receive the network's
public key, along with some checksum information. (The exact format and
timing of this transmission would be specified for that version of the
GNet protocol.) If the local client already had a key in place and was
connecting to an additional host, and did not receive the same key as
from the first host, it would refuse the connection.
If the local client repeatedly received keys that didn't match, it
would assume the first host it connected to was using an incorrect key,
drop that connection and begin anew.

If a configuration file were used to connect to the network, then in
the same location as the configuration file, with the same file name
except that the file extension is replaced with "KEY", would be a file
containing a digital signature for that configuration file, created
using the private key for the network, in ASCII format. The
configuration file should be checked against its digital signature
AFTER at least two GNet connections had been made, so as to make
certain the right key was being used.

NOTE: Given the existence of a configuration file, I recommend the
first method be used, as it is more resistant to exploitation. Further,
with the second method it would be necessary for at least ONE node to
be "hard-wired" with the key, to jump-start the GNet. Dynamically
loaded keys may also cause problems if it becomes necessary to expire
the key.

* MESSAGE VERIFICATION
  --------------------

The purpose of having message verification is to allow the
administration of a GNet from any point on the network. Any message
containing administration information should be signed with the private
key for the network. Since after the connection verification process
all honest clients would have the correct public key, they would be
able to use that key to verify that an administration message was not
counterfeit.

IMPORTANT: Any message containing configuration information would need
to include a time stamp, and that stamp would have to be included in
the data signed. Messages older than a specified age would need to be
expired. Otherwise a configuration message could be captured and
rebroadcast at a later date, possibly having unintended effects.
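
The time stamp rule above might be sketched as follows. This is only an
illustration: the packet layout, the one-hour window, and all names are
assumptions, and the signature check itself is left to a caller-supplied
function, since the signature format is not yet specified.

```python
import struct
import time
from typing import Callable, Optional

MAX_AGE_SECONDS = 3600  # hypothetical expiry window; the spec leaves it open


def signed_payload(timestamp: int, body: bytes) -> bytes:
    """Bytes covered by the signature: the time stamp, then the message body."""
    return struct.pack("<Q", timestamp) + body


def accept_admin_message(
    timestamp: int,
    body: bytes,
    signature: bytes,
    verify: Callable[[bytes, bytes], bool],
    now: Optional[int] = None,
) -> bool:
    """Accept a message only if it is fresh and its signature covers the stamp."""
    now = int(time.time()) if now is None else now
    if now - timestamp > MAX_AGE_SECONDS:
        return False  # too old: may be a captured message being rebroadcast
    return verify(signed_payload(timestamp, body), signature)
```

Because the stamp is part of the signed bytes, an attacker who captures
a message cannot simply rewrite the stamp to make it look fresh without
invalidating the signature.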

The exact format of the packets containing configuration information
and digital signatures would need to be specified for that version of
the GNet protocol.

* PRIORITY CONNECTION
  -------------------

The purpose of priority connections is to allow a "network crawler",
sanctioned by the GNet's sponsor, to sample node traffic. This would
allow spamming/flooding nodes to be located, and their IPs banned. The
crawler should be allowed to log on even if a client's connection list
was full. Whether and how priority connections were implemented would
be dependent on that version of the GNet protocol.

On negotiation, the crawler would announce to the remote client its
claim to require a priority connection, and present a verification
token. This token would need to contain the IP address of the computer
the crawler is running on, and an expiration date, and would be signed
using the private key for the network. (The exact format and timing of
this transmission would be specified for that version of the GNet
protocol.) This would verify the crawler was authorized by the network
sponsors. The use of an expiration date would allow machines on dynamic
IPs to work as crawlers, without leaving tickets that could be
retransmitted later when another machine was using that IP. The token
would need to be re-issued periodically by the holder of the private
key.

Once a crawler had connected to a client via a priority connection, and
that connection was verified, the GNet protocol could opt to have the
client send normal traffic to the crawler, and/or provide special
statistics to the crawler. Statistics might include the IPs of all
peers that client was connected to, how much traffic was originating
from each peer, what packets were coming from which peer, or other
similar information. It would be up to that GNet's protocol version to
determine what statistics would be made available to crawlers.
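
A client's check of a crawler's token could look something like the
sketch below. The token layout (a packed IPv4 address followed by a
64-bit expiration time) and every name in it are hypothetical; the
signature check is again passed in by the caller, because the signature
format is not yet defined.

```python
import ipaddress
import struct
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PriorityToken:
    crawler_ip: str   # IPv4 address the token was issued for
    expires_at: int   # Unix time after which the token is void
    signature: bytes  # made by the sponsor over signed_bytes()

    def signed_bytes(self) -> bytes:
        """Hypothetical layout: 4-byte address, then little-endian expiry."""
        ip = ipaddress.IPv4Address(self.crawler_ip).packed
        return ip + struct.pack("<Q", self.expires_at)


def grant_priority(
    token: PriorityToken,
    peer_ip: str,
    now: int,
    verify: Callable[[bytes, bytes], bool],
) -> bool:
    """Grant a priority slot only for an unexpired, correctly signed token."""
    if ipaddress.IPv4Address(peer_ip) != ipaddress.IPv4Address(token.crawler_ip):
        return False  # token was issued to a different machine
    if now >= token.expires_at:
        return False  # stale ticket, possibly from a previous holder of the IP
    return verify(token.signed_bytes(), token.signature)
```

The address and expiry are compared first since those checks are cheap;
the signature is verified only for tokens that could otherwise be
valid.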

* KEY EXPIRATION
  --------------

The purpose of the key expiration message is to restart a GNet when a
private key is discovered and exploited. The GNet protocol
specification should include an administration message (with a
verifying digital signature) forcing a GNet client to disconnect from
the network, re-read any configuration files, and reconnect. File
transfers in progress need not be disturbed.

IMPORTANT: If the private key is discovered, this command could be
exploited in order to "thrash" the network, forcing clients to
disconnect and reconnect repeatedly until the sponsor was able to
create a new public key and change the configuration files. If a
configuration file containing the key is used, the file should be
reloaded and checked to see if there is a new key before restarting the
client. Otherwise, once a key expiration notice has been received and
acted on, the client could perhaps ignore further such messages for a
certain amount of time (perhaps 15-30 minutes).

NOTES:

Originally I'd planned on using DH or ElGamal public key encryption,
since both systems were in the public domain; but since the RSA
algorithm (used in the original version of PGP, and very well
understood) just went into the public domain, it'd be better if that
were used.

___

I had a request to use this to "white-list" clients, or have a log-on
scheme; it'd work, but would probably compromise the security of the
network. It would be better to have a second layer of digital
signatures for this purpose. Just a few notes on a possible
implementation:

* The login public key could be present in the network's configuration
  file. Since the configuration file is signed by the network's
  private key, it would be safe from exploitation.

* The login public key could also be changed quickly via configuration
  message, signed with the network's private key.

* Before attempting to log onto the GNet, a client would perhaps open a
  web page containing CGI and having access to the login private key.
  (This web page would need to be available at several locations for
  redundancy's sake.) The client might need to give a user name and
  password. The page would then present the client with a login token,
  containing the client's IP and an expiration date, signed with the
  login private key.

* On login, the client would need to present the token to any client it
  connected with. The remote client would examine the token, compare
  the IP contained with the IP of the connecting client, and verify the
  validity of its signature with the login public key. The connection
  would be terminated if the token were not valid.

* Note that, since the client would need to be able to make new
  connections as nodes dropped out, perhaps after the login token has
  expired, the client should be able to cache the login and password
  and automatically obtain a fresh token.

The login private key would need to be available at all the web sites
offering login tokens, which is a security risk; however, since the
login private key is separate from the network private key, it would be
a limited risk. Any security breach could be alleviated by shutting
down the exploited site, creating a new login private/public key pair,
transmitting it to the remaining login servers, transmitting the new
login public key over the GNet in a configuration message, and then
repairing the exploited web site.

___

PUBLIC KEY CRYPTOGRAPHY -- a quick explanation of the technology:

There exist mathematical systems where calculations cannot easily be
run backwards, and this may be applied to generate public key
cryptosystems, of which one is known as RSA. Using these systems, it is
possible to generate public and private keys, where one cannot
practically determine the private key from the public. A message can be
encrypted using the public key, after which it can only be decrypted
using the private key.
If the private key is held confidential by one person, anyone can write
a message meant for that person, encrypted using the public key, which
would be made publicly available. While the message was in transit, no
one intercepting that message would be able to read it.

Alternatively, many public key cryptosystems also work to write digital
signatures. With this, a block of data (or a checksum of that data)
would be encrypted using the private key (rather than the public key).
Only someone holding that private key would be able to generate that
particular pattern of data, which would be verifiable by anyone holding
the matching public key. If it is assumed that only one person holds a
private key, then any data "signed" with that key can be verified to
have come from that person.

Attacks: Public key cryptosystems can be attacked in one of several
ways; the most common way is to substitute a different key pair for the
original, so that victims trust a public key whose private key the
attacker holds. Other attacks include stealing the private key,
attempting to reverse the public key (very difficult), attacking
intermediate cryptosystems (used especially when large encrypted
messages are being sent: since public key encryption is typically very
slow, a random number is generated instead, the data is encrypted using
that number and a faster system like IDEA or Blowfish, and the random
number is then encrypted using the public key and included in the
message), and exploiting deficiencies in the random number generator
used to build the public key.
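
To make the digital signature idea above concrete, here is a toy sketch
of RSA signing and verification in Python (3.8 or later). Everything in
it is an illustrative assumption: the key is built from two small
Mersenne primes and is far too weak for real use, the checksum is
SHA-256 reduced modulo the key, and no padding scheme is applied. It is
not the format this specification will use, only a demonstration that a
value made with the private exponent can be checked with the public
one.

```python
import hashlib

# Toy key from two known Mersenne primes (insecure, for illustration only).
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, kept by the sponsor


def digest(data: bytes) -> int:
    """Checksum of the data, reduced so that it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """'Encrypt' the checksum with the private exponent."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Undo the signature with the public exponent and compare checksums."""
    return pow(signature, E, N) == digest(data)
```

A real implementation would use a full-size key, a standard padding
scheme, and a reviewed cryptographic library rather than raw modular
exponentiation.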