WWW Book - Security

This was written for a possible book in around 1993/4. Written in Microsoft Word, copied to HTML and salvaged by hand. ..Tim BL.

This section was written as a layman's backgrounder in security because any discussion of security needs some background in what is possible. Not my area of expertise so probably wrong! Now (1996) needs bringing up to date with DSig etc.

Security

Security is an aspect of W3 which you can almost, but not quite, separate from the other aspects of hypertext. Then, when you get into it, you find it's tied up with all sorts of other things. There are other books on network security [refs], and as I am certainly no expert, this will be a shallow summary with particular emphasis on how security and W3 interact. You should interpret this chapter as being more journalistic than authoritative, but a book on the web would not be complete without it.

First we'll go over what we mean by security, and in which ways W3 needs security. Then we'll gloss over the cryptographic methods which mathematics gives us, and which allow us to make security where there was none. Then we'll discuss the very important question of how keys are distributed, and finally, having gathered a list of technical cans and can'ts, we'll enumerate the political issues. These are interesting because here we can choose, through what we allow to happen technically, the political climate and social structure of the "new world".

The need.

The web grew, as we have seen, initially through the dissemination of public information. This required little security.

Soon, some groups wanted private information which required password access. Next, commercial information distributors wanted to be able to identify subscribers to their services, and later forms started to crop up on the web which asked people to type in sensitive information such as, for example, their credit card number.

There was clearly a need for privacy, and it was obvious that all forms of security of message exchange would be required. These forms have been studied in connection with electronic mail [refs]. They can be applied independently to the request and the response of a W3 transaction*. They are, basically, security that:

the person one corresponds with is really who they purport to be (authentication);
the message is indeed the one that person sent (message integrity);
no one else has had access to the information (privacy);
the sender (or receiver) of a message cannot deny having sent (received) it (non-repudiation).

In practice, it is difficult to dispense with any of these forms, as if you can break one, you can use that leak to break another, but some argue that one can use a system which allows authentication and message integrity checking without privacy.

Value Added Networks or End to End security

One way of achieving security is to use a "value added" network which consists of a set of completely trusted nodes. This acts like a notary, vouching for the identities of parties, guaranteeing the integrity of messages, and registering documents so that neither their emission nor their delivery can be denied. This is the service which the Post Office has historically provided. It relies on laws which mandate strict punishment for those who "interfere with Her Majesty's Mails" or who abuse the trust which is placed in them as postal officials by, for example, reading mail. This single monopoly service provider has worked well enough for many years but it offends our sense of decentralisation (be it an engineering or political sense) and of free enterprise. If competition is to exist among network providers, then several parallel value added networks must exist. This implies gateways between VANs, and some way of specifying which carriers you would trust for a given transaction.

Just as the Internet's Transmission Control Protocol (TCP) provides reliable communication over a motley collection of interconnected unreliable networks, end to end security protocols provide security features without requiring secure VANs. They put the quality of service decision back with the communicating parties, separating it from the choice of carrier. They appeal to our sense of decentralisation. They can run over, and are in the spirit of, the Internet and the Web.

Traffic hiding

With end-to-end security, one has the choice as to what sorts of security one applies, and whether it is applied to the whole transaction or just parts of it. One might want, for example, not to bother with security when browsing a catalogue or ordering car parts, and might only wish to encrypt one's payment credentials. However, the VAN will always gain over end to end security if it is necessary to conceal not only the content of the communication, but also the fact that communication takes place at all. This can be important information. (Footnote: The discovery of the identity of the perpetrators of the bomb attack on the World Trade Center involved the fact that a phone call had been logged between a particular public phone and a private address. The existence of the call, without its content, was sufficient.) As another example, even though I may not be interested in having consumer catalogues I browse encrypted for security, I may not want the valuable marketing facts as to which companies I do business with to be available.

An answer to this problem in the end-to-end security world is the use of anonymous proxy gateways. These reroute requests: you set up your W3 client to send all requests through some such gateway. If the gateway has a good mix of people using it for a good mix of purposes, it will be more difficult to trace, at an intermediate node in the network, who is talking to whom. An anonymous gateway for electronic mail has been operating for some time in Finland, with the related but different aim of allowing one to disguise one's identity not only from prying eyes but also from one's correspondent.

What Cryptography gives us

Security, when not enforced by armed patrols of the equipment, is enforced using cryptography. An intriguing science, this is based on a few mathematical results. One is the fact that some problems (such as finding two large primes if you only know their product) are simply difficult to solve, and are believed to have no short cut solution. This allows us to make the breaking of a security system a problem which will take, say, more computing effort than we expect to be available from all existing computers combined for the next century.
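The lopsidedness of the prime product problem can be felt even at toy scale. The sketch below, in Python, multiplies two known primes (one step) and then recovers them by trial division (tens of thousands of steps). The function name smallest_factor and the choice of primes are mine, picked only for illustration; real keys use primes hundreds of digits long, where this kind of search is hopeless.

```python
def smallest_factor(n: int) -> int:
    """Find the smallest prime factor of n by trial division (deliberately naive)."""
    if n % 2 == 0:
        return 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n  # no divisor found, so n itself is prime


# Two known primes, chosen for illustration: 2**17 - 1 and 2**19 - 1.
p, q = 131071, 524287

n = p * q                    # the easy direction: a single multiplication
found = smallest_factor(n)   # the hard direction: about 65,000 trial divisions

print(n, "=", found, "x", n // found)
```

Each extra digit in the primes multiplies the work of the search by a constant factor while barely changing the cost of the multiplication, which is the asymmetry the text relies on.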

There are "secret key" functions which allow data to be encrypted in a way which is believed to be impractically difficult to reverse unless you know the "key" number which was used in its encryption.

There are the so-called trap door functions, whose output you can't practically reverse even if you know exactly how it was computed (cryptographers more usually call these one-way hash functions). And then there are the methods of "public key" encryption in which you can't practically decrypt even if you know the encryption key: you also have to know a separate decryption key.

These are the tools which mathematics gives us to engineer security protocols. There are all sorts of tricks with which one can use these tools to do things such as creating documents in which some information is hidden from certain people, such as cheques which one can prove are signed without being able to know who signed them.

Clearly the public key methods of encryption are powerful in that the private keys they use do not have to be known by more than one person. (Ben Franklin: "Three people can keep a secret if two of them are dead".) The calculations required to encrypt data with public key methods are slow by comparison with secret key methods, too slow to use practically for a whole web document each time it is received. Often, a faster secret key method is used to encrypt the whole message using a randomly generated key, and then the value of that key is sent using public key encryption as a preface to the real message.
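The two-step scheme just described can be sketched in a few lines of Python. Everything here is a toy under stated assumptions: the recipient's key pair is textbook RSA built from two small known primes, the "secret key method" is a simple XOR against a SHA-256 keystream standing in for a real cipher, and the function names are invented for the example. It shows the shape of the idea, not a system anyone should deploy.

```python
import hashlib
import secrets

# Toy public key pair for the recipient (assumption: textbook RSA, no padding).
P, Q = 2**61 - 1, 2**89 - 1            # two known Mersenne primes
N, E = P * Q, 65537                    # public key: (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy secret key cipher: XOR the data with a keystream derived from the key.
    Applying it twice with the same key gives back the original data."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


def hybrid_encrypt(message: bytes):
    session_key = secrets.token_bytes(16)                    # fresh random key per message
    wrapped = pow(int.from_bytes(session_key, "big"), E, N)  # slow step, but only 16 bytes
    return wrapped, keystream_xor(session_key, message)      # fast step, whole message


def hybrid_decrypt(wrapped: int, body: bytes) -> bytes:
    session_key = pow(wrapped, D, N).to_bytes(16, "big")     # only the private key opens this
    return keystream_xor(session_key, body)


wrapped_key, body = hybrid_encrypt(b"A whole web document would go here.")
print(hybrid_decrypt(wrapped_key, body))
```

The expensive public key arithmetic touches only the 16-byte session key, however long the document is.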

Trap door functions and digital signatures

A trap door function can be applied to a message to generate a large number which is called a signature of the message. The signature is a function of the entire contents of the message. A common function is known as "MD5", and a signature of a message produced with MD5 is known as its MD5 signature. The fact that the function is "one way" means that a signature can't (practically) be faked. If you change a message in any small way, the signature changes completely. Given a signature, it is (practically) impossible to create a message which has that signature: you can only get a signature by starting with the message. This means that in order to prove the authenticity of a message, you only have to be able to verify its signature.
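To see how completely the signature changes, here is a small Python sketch using the MD5 function from the standard hashlib module. The two messages are made up, and differ in a single character. MD5 is used only because it is the function named above; it is no longer considered safe against deliberately constructed collisions, so a present-day system would choose a newer function.

```python
import hashlib

# Two hypothetical messages differing in one character.
original = b"Pay the bearer 100 pounds"
altered = b"Pay the bearer 900 pounds"

sig_original = hashlib.md5(original).hexdigest()
sig_altered = hashlib.md5(altered).hexdigest()

# Count how many of the 128 signature bits differ between the two.
differing_bits = bin(int(sig_original, 16) ^ int(sig_altered, 16)).count("1")

print(sig_original)
print(sig_altered)
print(differing_bits, "of 128 bits differ")
```

Running it shows two 32-digit hexadecimal strings with no visible relationship, although the inputs are nearly identical.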

Signing a message

Using public key cryptography, to prove that you originated a message, therefore, all you have to do is make its digital signature, and then encrypt the signature with your private key. You could have encrypted the whole message, but that would take longer. Anyone can check that you signed it by decrypting the signature with your public key, and verifying it against the message.

(There are two uses of the word "signature" in this context, each meaning an unforgeable characteristic mark. The digital signature of the message is a string of bits characteristic of the message. When encrypted with a person's private key, it becomes characteristic of the person.)
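A minimal sketch of signing as described above, in Python. The key pair is a toy one (textbook RSA from two small known primes, an assumption made so the example is self-contained), and the helper names digest, sign and verify are made up for illustration. Real systems use far larger keys and padding rules not shown here.

```python
import hashlib

# Toy key pair for the signer (assumption: textbook RSA, no padding).
P, Q = 2**61 - 1, 2**89 - 1            # two known Mersenne primes
N, E = P * Q, 65537                    # public key, known to everyone
D = pow(E, -1, (P - 1) * (Q - 1))      # private key, known only to the signer


def digest(message: bytes) -> int:
    """The message's MD5 signature as a number: characteristic of the message."""
    return int.from_bytes(hashlib.md5(message).digest(), "big")


def sign(message: bytes) -> int:
    """Encrypt the digest with the private key: now characteristic of the person."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt with the public key and compare against a freshly computed digest."""
    return pow(signature, E, N) == digest(message)


order = b"Please send one copy of the book."
signature = sign(order)

print(verify(order, signature))                                    # genuine message
print(verify(b"Please send nine copies of the book.", signature))  # tampered message
```

Only the short digest passes through the slow public key arithmetic, which is the speed advantage the text mentions.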

There is a difference in function between a manuscript signature and a public key based signature, in that a private key can be passed from one person to another, whereas handwriting style, we generally assume, can't. A key can be stolen, too. This means that when you want to hold someone to a document which they seem to have signed, they can always protest that they gave away their key or had it stolen. Clearly the law would need to have an attitude as to when having given away a key constitutes a defence.

Sealing a message

A paper message is sealed so that no one else can read it. To seal an electronic message, you encrypt it with the recipient's public key, or you encrypt it with a secret key method and then append the secret key encrypted with the recipient's public key. The latter method, as for signing, you use for speed.

-- encryption interfering with e.g. caching.

Applying cryptographic security to WWW

When applying security to the web's HTTP protocol, we can consider the HTTP request and the HTTP response as messages for encryption. As all W3 transactions currently consist of a message each way, the only remaining visible information available to potential network eavesdroppers is the existence and length of the messages. We can even pad out messages with random sized blanks and send spurious dummy messages around to further disguise what we're up to.
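As a rough illustration of padding, the Python sketch below rounds a message up to a whole number of fixed-size blocks and then adds a random number of spare blocks, so that the length seen on the wire says little about the true length. The block size, the four-byte length prefix and the function names are all assumptions of this example, not part of HTTP. Padding of this kind would be applied before encryption, so that the filler is hidden along with the content.

```python
import secrets

BLOCK = 256  # assumed block size in bytes


def pad(message: bytes) -> bytes:
    """Prefix the true length, then fill with random bytes up to whole blocks,
    plus between zero and three extra blocks chosen at random."""
    framed = len(message).to_bytes(4, "big") + message
    blocks = -(-len(framed) // BLOCK) + secrets.randbelow(4)  # ceiling division, plus spares
    return framed + secrets.token_bytes(blocks * BLOCK - len(framed))


def unpad(padded: bytes) -> bytes:
    """Read the length prefix and discard the filler."""
    length = int.from_bytes(padded[:4], "big")
    return padded[4:4 + length]


request = b"GET /catalogue/carparts.html"
padded = pad(request)

print(len(request), "bytes padded to", len(padded))
print(unpad(padded))
```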

There are other approaches, in which we encrypt less: information in one direction only, for example. Another thing we may want to do is store some files for safety only in encrypted form. When sending these, we may not bother to encrypt the whole transaction, relying on the encrypted state of the file. When ordering a book, we may be interested only in encrypting the credit card number, not bothering to prove our identity as the credit card number effectively does that through the credit card company. Or we may prefer to put the purchase on account, in which case we would need to prove that the message was indeed from us, but without bothering to encrypt it for privacy.

Interference with other functions

To as large an extent as possible, security is introduced as a separate layer, independently of the other aspects of the web. There is one side-effect of using encryption, however: because the messages are hidden from intermediate nodes on the network, those nodes can't optimize access by taking cached copies of information. Similarly, if a company operates a proxy gateway which selectively allows access to breach the company firewall, it must be able to see at least the address on some "envelope" transaction in order to be able to do the forwarding.

Key distribution

The above would have you believe that privacy is a solved problem, but ignores a very big question. We have assumed that each side knew the key numbers it would need, with whatever system, to encrypt outgoing and decrypt incoming messages. In fact, knowing these keys and keeping them secure is a really big problem, as it involves not only the technology but also the people who use it, and so the problem has different solutions for different social situations.

Key distribution is a whole lot easier with public key systems than secret key systems. Remember that with a public key system, one of a pair of keys must be kept private, but the other one can be made public. You only need (basically) one key pair per person: everyone writing to the person encrypts with the public key, and the person decrypts it with the private key. This gives privacy as only that person can decrypt it. If the sender of a message encrypts with his or her private key, then anyone can decrypt the message, proving that the message did in fact come from the person. This is authentication. To get both, you encrypt first with your private key and then with the recipient's public key. Pretty neat, eh? The private key has to be kept safe, but it can be, as it only has to be known by one person. So if you believe that "three people can keep a secret so long as two of them are dead" (Benjamin Franklin, I think), you will like the idea of public key systems.
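The "encrypt first with your private key and then with the recipient's public key" recipe can be tried out with toy numbers. In the Python sketch below, the two parties (called Alice and Bob here purely for illustration) each have a textbook RSA key pair built from small known primes. Bob's modulus is deliberately the larger of the two so that Alice's output always fits inside it, a detail which real systems handle with proper message formats.

```python
def make_keys(p: int, q: int, e: int = 65537):
    """Build a toy key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)   # needs Python 3.8+


def apply_key(value: int, key) -> int:
    """Encrypting and decrypting are the same operation with different keys."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


# Hypothetical parties; the primes are known Mersenne primes.
alice_public, alice_private = make_keys(2**61 - 1, 2**89 - 1)
bob_public, bob_private = make_keys(2**89 - 1, 2**107 - 1)

message = int.from_bytes(b"Meet at noon", "big")

# Alice: private key first (authentication), then Bob's public key (privacy).
authenticated = apply_key(message, alice_private)
sealed = apply_key(authenticated, bob_public)

# Bob: his private key first, then Alice's public key.
opened = apply_key(sealed, bob_private)
recovered = apply_key(opened, alice_public)

text = recovered.to_bytes((recovered.bit_length() + 7) // 8, "big")
print(text)
```

Only Bob can perform the first unwrapping step, and only something produced with Alice's private key survives the second, which is how the one pair of operations gives both properties.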

Certificates

However, the scheme can still be attacked by misleading someone about a public key. You might ask your friends personally what their public keys are, and note them down directly. However, when you are communicating with a new mail order store you have just discovered, or your pocket calculator is communicating with the checkout till of a supermarket, you need to find out someone else's public key in a hurry. What do you do? You ask them -- if you know they really are who they say they are. But supposing they are impostors, impersonating say the tax authorities to find out all about you? They give you a false identity along with a false public key and you are none the wiser. So you have to ask someone else to verify that the key really belongs to that person. Who do you trust? Maybe you ask, say, a person's bank, and to be sure you check the bank is who they really say they are by checking their public key back with your bank. You don't have to use banks, but you have to have a web of trust between them and you.

A link in this web of trust can be represented by a "certificate". A certificate is a document to the effect that a given quoted public key does in fact belong to a given named party. It can say other things about the party in question, as often the name is not all you want: you may need to verify the home address or registered office, or membership for example of licensed practitioners of a trade. The certificate has, attached to it, its digital signature encrypted with the private key of some authority, and the name and public key of that authority. Before you believe the certificate, of course you need to verify that it itself is not a forgery, by getting some proof of the certifier's public key. This may be in the form of another certificate, and so on .. until you reach a public key which you know. This path of trust may pass through a hierarchy of institutions, or it may pass simply through a circle of friends, or a society, or two individuals who have met in person and exchanged certificates on business cards.

A certificate is just a document. It can be stored in the web, it can be mailed, it can be scribbled on the back of a (large) envelope. The web provides a useful way of distributing certificates, in fact. It allows one to append to any signature pointers to certificates of the signer. As a certificate is signed by at least one certifying party, it will have pointers to certificates of those signing authorities. This enables this certificate checking to be done over the net.
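The business of following certificates back to a key you already know can be sketched as a short Python program. This is a toy under stated assumptions: the Certificate layout, the party names (shop, shop_bank, my_bank) and the functions issue and check_chain are all invented for the example, the keys are textbook RSA from small known primes, and no real certificate format is being described.

```python
import hashlib
from dataclasses import dataclass


def make_keys(p: int, q: int, e: int = 65537):
    """Build a toy key pair from two primes: returns (public, private)."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)


def digest(text: str) -> int:
    return int.from_bytes(hashlib.md5(text.encode()).digest(), "big")


@dataclass
class Certificate:
    subject: str          # the named party
    subject_key: tuple    # the public key said to belong to that party
    issuer: str           # the authority vouching for it
    signature: int = 0    # digest of the body, encrypted with the issuer's private key

    def body(self) -> str:
        return f"{self.subject}|{self.subject_key}|{self.issuer}"


def issue(subject, subject_key, issuer, issuer_private) -> Certificate:
    cert = Certificate(subject, subject_key, issuer)
    d, n = issuer_private
    cert.signature = pow(digest(cert.body()), d, n)
    return cert


def check_chain(chain, trusted) -> bool:
    """chain[0] certifies the party of interest; each later certificate certifies
    the issuer of the one before. trusted maps names to keys we already know."""
    for i, cert in enumerate(chain):
        if cert.issuer in trusted:
            issuer_key = trusted[cert.issuer]
        elif len(chain) > i + 1 and chain[i + 1].subject == cert.issuer:
            issuer_key = chain[i + 1].subject_key
        else:
            return False                     # nobody vouches for this issuer
        e, n = issuer_key
        if pow(cert.signature, e, n) != digest(cert.body()):
            return False                     # forged or altered certificate
        if cert.issuer in trusted:
            return True                      # reached a key we already know
    return False


shop_pub, shop_priv = make_keys(2**61 - 1, 2**89 - 1)
shop_bank_pub, shop_bank_priv = make_keys(2**89 - 1, 2**107 - 1)
my_bank_pub, my_bank_priv = make_keys(2**107 - 1, 2**127 - 1)

chain = [
    issue("shop", shop_pub, "shop_bank", shop_bank_priv),
    issue("shop_bank", shop_bank_pub, "my_bank", my_bank_priv),
]
print(check_chain(chain, {"my_bank": my_bank_pub}))
```

The check succeeds only because my_bank's key is known beforehand; with an empty set of trusted keys the same chain proves nothing.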

It's useful to think about what you are actually trusting when you believe a message authenticated with a chain of certificates. You are trusting the sender, and all of the chain of certificate issuers. You are not only trusting them, but also the means they use to keep their private keys secret. You are trusting that no one has broken into their house and stolen their notebook, or downloaded their keys from their pocket computer while they were at lunch. You are trusting the people who created the keys in the first place, and the people who wrote the programs those people used. Cryptography can protect you against hackers on the net, but it doesn't help you much against burglary, carelessness or betrayal.

(on to politics and social aspects)