[Hypertext version Tim Berners-Lee 1990. Note that this RFC is obsoleted by RFC1036.]
There are five sections to this document. Section two defines the format. Section three defines the valid control messages. Section four specifies some valid transmission methods. Section five describes the overall news propagation algorithm.
An example message is included to illustrate the fields.
     Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
     Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
     Path: cbosgd!mhuxj!mhuxt!eagle!jerry
     From: jerry@eagle.uucp (Jerry Schwarz)
     Newsgroups: net.general
     Subject: Usenet Etiquette -- Please Read
     Message-ID: <642@eagle.UUCP>
     Date: Friday, 19-Nov-82 16:14:55 EST
     Followup-To: net.news
     Expires: Saturday, 1-Jan-83 00:00:00 EST
     Date-Received: Friday, 19-Nov-82 16:59:30 EST
     Organization: Bell Labs, Murray Hill
     The body of the article comes here, after a blank line.
Here is an example of a message in the old format (before the existence of this standard). It is recommended that implementations also accept articles in this format to ease upward conversion.
     From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz)
     Newsgroups: net.general
     Title: Usenet Etiquette -- Please Read
     Article-I.D.: eagle.642
     Posted: Fri Nov 19 16:14:55 1982
     Received: Fri Nov 19 16:59:30 1982
     Expires: Mon Jan  1 00:00:00 1990
     The body of the article comes here, after a blank line.
Some news systems transmit news in the "A" format, which looks like this:
     Aeagle.642
     net.general
     cbosgd!mhuxj!mhuxt!eagle!jerry
     Fri Nov 19 16:14:55 1982
     Usenet Etiquette - Please Read
     The body of the article comes here, with no blank line.
An article consists of several header lines, followed by a blank line, followed by the body of the message. The header lines consist of a keyword, a colon, a blank, and some additional information. This is a subset of the ARPANET standard, simplified to allow simpler software to handle it. The "from" line may optionally include a full name, in the format above, or use the ARPANET angle bracket syntax. To keep the implementations simple, other formats (for example, with part of the machine address after the close parenthesis) are not allowed. The ARPANET convention of continuation header lines (beginning with a blank or tab) is allowed.
Certain headers are required, certain headers are optional. Any unrecognized headers are allowed, and will be passed through unchanged. The required headers are Relay-Version, Posting-Version, From, Date, Newsgroups, Subject, Message-ID, Path. The optional headers are Followup-To, Date-Received, Expires, Reply-To, Sender, References, Control, Distribution, Organization.
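The header rules above (keyword, colon, blank, continuation lines, and pass-through of unrecognized headers) can be sketched in Python. This is a minimal illustration and not part of the standard: the names `parse_article` and `missing_required` are hypothetical, and comparing keywords without regard to case is an assumption.

```python
REQUIRED_HEADERS = ("Relay-Version", "Posting-Version", "From", "Date",
                    "Newsgroups", "Subject", "Message-ID", "Path")


def parse_article(text):
    """Split an article into (headers, body).

    headers is a list of (keyword, value) pairs in their original order,
    so unrecognized headers can be passed through unchanged.
    """
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            # Continuation line: append it to the previous header's value.
            keyword, value = headers[-1]
            headers[-1] = (keyword, value + " " + line.strip())
        else:
            keyword, sep, value = line.partition(":")
            if not sep:
                raise ValueError("malformed header line: %r" % line)
            headers.append((keyword.strip(), value.strip()))
    return headers, body


def missing_required(headers):
    """Return the required header keywords that the article lacks."""
    # Assumption: keywords are compared case-insensitively.
    present = {keyword.lower() for keyword, _ in headers}
    return [name for name in REQUIRED_HEADERS if name.lower() not in present]
```

A relay could call `missing_required` on an incoming article and decide locally how to treat an incomplete header. The standard does not say what to do in that case.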
The line contains two fields, separated by semicolons. The fields are the version and the full domain name of the site. The version should identify the system program used (e.g., "B") as well as a version number and version date. For example, the header line might contain
     Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
This header should not be passed on to additional sites. A relay program, when passing an article on, should include only its own Relay-Version, not the Relay-Version of some other site. (For upward compatibility with older software, if a Relay-Version is found in a header which is not the first line, it should be assumed to be moved by an older version of news and deleted.)
RFC 822 specifies that all text in parentheses is to be interpreted as a comment. It is common in ARPANET mail to place the full name of the user in a comment at the end of the From line. This standard specifies a more rigid syntax. The full name is not considered a comment, but an optional part of the header line. Either the full name is omitted, or it appears in parentheses after the electronic address of the person posting the article, or it appears before an electronic address enclosed in angle brackets. Thus, the three permissible forms are:
     From: mark@cbosgd.UUCP
     From: mark@cbosgd.UUCP (Mark Horton)
     From: Mark Horton <mark@cbosgd.UUCP>
Full names may contain any printing ASCII characters from space through tilde, with the exceptions that they may not contain parentheses "(" or ")", or angle brackets "<" or ">". Additional restrictions may be placed on full names by the mail standard, in particular, the characters comma ",", colon ":", and semicolon ";" are inadvisable in full names.
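As an illustration of the three permissible From forms and the full name restrictions, here is a small Python sketch. The function names are hypothetical, and the regular expressions are one possible reading of the rules above, not a complete RFC 822 address parser.

```python
import re


def valid_full_name(name):
    """True if name uses only printing ASCII (space through tilde),
    excluding parentheses and angle brackets."""
    return bool(name) and all(
        " " <= ch <= "~" and ch not in "()<>" for ch in name)


def parse_from(value):
    """Return (address, full_name); full_name is None when omitted."""
    value = value.strip()
    m = re.fullmatch(r"(\S+) \((.*)\)", value)
    if m:
        # Form 2: address (Full Name)
        address, name = m.group(1), m.group(2)
    else:
        m = re.fullmatch(r"(.*\S)\s*<(\S+)>", value)
        if m:
            # Form 3: Full Name <address>
            name, address = m.group(1), m.group(2)
        elif value and not any(ch in value for ch in " ()<>"):
            # Form 1: bare address
            address, name = value, None
        else:
            raise ValueError("From line not in a permitted form: %r" % value)
    if name is not None and not valid_full_name(name):
        raise ValueError("invalid full name: %r" % name)
    return address, name
```

Text after the close parenthesis, which the standard disallows, is rejected because neither pattern matches and the value contains a blank.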
     Weekday, DD-Mon-YY HH:MM:SS TIMEZONE
Several examples of valid dates appear in the sample article above. Note in particular that ctime format:
     Wdy Mon DD HH:MM:SS YYYY
is not acceptable because it is not a valid ARPANET date. However, since older software still generates this format, news implementations are encouraged to accept this format and translate it into an acceptable format.
The contents of the TIMEZONE field are currently subject to worldwide time zone abbreviations, including the usual American zones (PST, PDT, MST, MDT, CST, CDT, EST, EDT), the other North American zones (Bering through Newfoundland), European zones, Australian zones, and so on. Lacking a complete list at present (and unsure if an unambiguous list exists), authors of software are encouraged to keep this code flexible, and in particular not to assume that time zone names are exactly three letters long. Implementations are free to edit this field, keeping the time the same, but changing the time zone (with an appropriate adjustment to the local time shown) to a known time zone.
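The translation from ctime format into the acceptable date format might be sketched as follows. This is an illustration only: the function names are hypothetical, and because a ctime string carries no time zone, the sketch assumes the caller supplies one. The zone is treated as an opaque string of any length, in keeping with the advice above.

```python
import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def ctime_to_news_date(ctime_text, timezone):
    """Convert 'Wdy Mon DD HH:MM:SS YYYY' into
    'Weekday, DD-Mon-YY HH:MM:SS TIMEZONE'.

    The timezone argument is an assumption: ctime does not record one.
    """
    _wdy, mon, day, clock, year = ctime_text.split()
    date = datetime.date(int(year), MONTHS.index(mon) + 1, int(day))
    weekday = WEEKDAYS[date.weekday()]  # recomputed, not trusted from input
    return "%s, %d-%s-%02d %s %s" % (
        weekday, date.day, mon, date.year % 100, clock, timezone)


def parse_news_date(text):
    """Split 'Weekday, DD-Mon-YY HH:MM:SS TIMEZONE' into its fields."""
    weekday, rest = text.split(", ", 1)
    dmy, clock, zone = rest.split(None, 2)
    day, mon, yy = dmy.split("-")
    if mon not in MONTHS:
        raise ValueError("unknown month: %r" % mon)
    # The zone is kept as-is; its length is not assumed to be three letters.
    return weekday, int(day), mon, int(yy), clock, zone.strip()
```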
Wildcards (e.g., the word "all") are never allowed in a Newsgroups line. For example, a newsgroup "net.all" is illegal, although a newsgroup name "net.sport.football" is permitted.
If an article is received with a Newsgroups line listing some valid newsgroups and some invalid newsgroups, a site should not remove invalid newsgroups from the list. Instead, the invalid newsgroups should be ignored. For example, suppose site A subscribes to the classes "btl.all" and "net.all", and exchanges news articles with site B, which subscribes to "net.all" but not "btl.all". Suppose A receives an article with "Newsgroups: net.micro,btl.general". This article is passed on to B because B receives net.micro, but B does not receive btl.general. A must leave the Newsgroups line unchanged. If it were to remove "btl.general", the edited header could eventually reenter the "btl.all" class, resulting in an article that is not shown to users subscribing to "btl.general". Also, followups from outside "btl.all" would not be shown to such users.
     "<" "string not containing blank or >" ">"
In order to conform to RFC 822, the Message-ID must have the format
     "<" "unique" "@" "full domain name" ">"
where "full domain name" is the full name of the host at which the article entered the network, including a domain that host is in, and "unique" is any string of printing ASCII characters, not including " ", or "@". For example, the "unique" part could be an integer representing a sequence number for articles submitted to the network, or a short string derived from the date and time the article was created. For example, a valid message ID for an article submitted from site ucbvax in domain Berkeley.ARPA would be " ". Programmers are urged not to make assumptions about the content of message ID fields from other hosts, but to treat them as unknown character strings. It is not safe, for example, to assume that a message ID will be under 14 characters, nor that it is unique in the first 14 characters.
The angle brackets are considered part of the message ID. Thus, in references to the message ID, such as the ihave/sendme and cancel control messages, the angle brackets are included. White space characters (e.g., blank and tab) are not allowed in a message ID. All characters between the angle brackets must be printing ASCII characters.
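A receiving program that wants to check only the general shape of a message ID, while otherwise treating it as an unknown string, might use a test like the following Python sketch. The function name is hypothetical. The check covers only the rules stated here: enclosing angle brackets, printing ASCII, no white space, and no ">" inside.

```python
def is_valid_message_id(message_id):
    """Check the general "<...>" shape of a message ID.

    The angle brackets are part of the ID, so they are required here.
    No assumption is made about the length or the content in between.
    """
    if len(message_id) < 3:
        return False
    if message_id[0] != "<" or message_id[-1] != ">":
        return False
    inner = message_id[1:-1]
    # "!" through "~" is printing ASCII without the blank; tab and other
    # control characters fall outside this range as well.
    return all("!" <= ch <= "~" and ch != ">" for ch in inner)
```

Because the brackets belong to the ID, a history file or lookup table would store the whole string, for example `"<642@eagle.UUCP>"`, as its key.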
Normally, the rightmost name will be the name of the originating system. However, it is also permissible to include an extra entry on the right, which is the name of the sender. This is for upward compatibility with older systems.
The Path line is not used for replies, and should not be taken as a mailing address. It is intended to show the route the message travelled to reach the local site. There are several uses for this information. One is to monitor USENET routing for performance reasons. Another is to establish a path to reach new sites. Perhaps the most important is to cut down on redundant USENET traffic by failing to forward a message to a site that is known to have already received it. In particular, when site A sends an article to site B, the Path line includes "A", so that site B will not immediately send the article back to site A. The site name each site uses to identify itself should be the same as the name by which its neighbors know it, in order to make this optimization possible.
A site adds its own name to the front of a path when it receives a message from another site. Thus, if a message with path A!X!Y!Z is passed from site A to site B, B will add its own name to the path when it receives the message from A, e.g., B!A!X!Y!Z. If B then passes the message on to C, the message sent to C will contain the path B!A!X!Y!Z, and when C receives it, C will change it to C!B!A!X!Y!Z.
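The path handling described above amounts to prepending the local site name and an "!" on receipt. A short Python sketch, with hypothetical function names, follows. Note that `site_in_path` compares whole entries, so it could also match the optional user name at the right end of the path. That is a limitation of this sketch.

```python
def add_to_path(path, site):
    """Prepend the receiving site's name to the Path value."""
    return site + "!" + path


def site_in_path(path, site):
    """True if the article is known to have passed through site."""
    return site in path.split("!")
```

For example, `add_to_path("A!X!Y!Z", "B")` gives `"B!A!X!Y!Z"`, matching the case worked through above.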
Special upward compatibility note: Since the From, Sender, and Reply-To lines are in internet format, and since many USENET sites do not yet have mailers capable of understanding internet format, it would break the reply capability to completely sever the connection between the Path header and the reply function. Thus, sites are required to continue to keep the Path line in a working reply format as much as possible, until January 1, 1984. It is recognized that the path is not always a valid reply string in older implementations, and no requirement to fix this problem is placed on implementations. However, the existing convention of placing the site name and an "!" at the front of the path, and of starting the path with the site name, an "!", and the user name, should be maintained at least until 1984.
For example, if John Smith is visiting CCA and wishes to post an article to the network, using friend Sarah Jones's account, the message might read
     From: smith@ucbvax.uucp (John Smith)
     Sender: jones@cca.arpa (Sarah Jones)
If a gateway program enters a mail message into the network at site sri-unix, the lines might read
     From: John.Doe@CMU-CS-A.ARPA
     Sender: network@sri-unix.ARPA
The primary purpose of this field is to be able to track down articles to determine how they were entered into the network. The full name may be optionally given, in parentheses, as in the From line.
This field is intended to be used to clean up articles with a limited usefulness, or to keep important articles around for longer than usual. For example, a message announcing an upcoming seminar could have an expiration date the day after the seminar, since the message is not useful after the seminar is over. Since local sites have local policies for expiration of news (depending on available disk space, for instance), users are discouraged from providing expiration dates for articles unless there is a natural expiration date associated with the topic. System software should almost never provide a default Expires line. Leave it out and allow local policies to be used unless there is a good reason not to.
The purpose of the References header is to allow articles to be grouped into conversations by the user interface program. This allows conversations within a newsgroup to be kept together, and potentially users might shut off entire conversations without unsubscribing to a newsgroup. User interfaces need not make use of this header, but all automatically generated followups should generate the References line for the benefit of systems that do use it, and manually generated followups (e.g. typed in well after the original article has been printed by the machine) should be encouraged to include them as well.
For upward compatibility, messages that match the newsgroup pattern "all.all.ctl" should also be interpreted as control messages. If no Control: header is present on such messages, the subject is used as the control message. However, messages on newsgroups matching this pattern do not conform to this standard.
     Newsgroups: net.auto,net.wanted
     Distribution: nj.all
so that it would only go to persons subscribing to net.auto or net.wanted within New Jersey. The intent of this header is to further restrict the distribution of a newsgroup, not to increase it. A local newsgroup, such as nj.crazy-eddie, will probably not be propagated by sites outside New Jersey that do not show such a newsgroup as valid. Wildcards in newsgroup names in the Distribution line are allowed. Followup articles should default to the same Distribution line as the original article, but the user can change it to a more limited one, or escalate the distribution if it was originally restricted and a more widely distributed reply is appropriate.
Implementors and administrators may choose to allow control messages to be automatically carried out, or to queue them for manual processing. However, manually processed messages should be dealt with promptly.
     cancel <message ID>
If an article with the given message ID is present on the local system, the article is cancelled. This mechanism allows a user to cancel an article after the article has been distributed over the network.
Only the author of the article or the local super user is allowed to use this message. The verified sender of a message is the Sender line, or if no Sender line is present, the From line. The verified sender of the cancel message must be the same as either the Sender or From field of the original message. A verified sender in the cancel message is allowed to match an unverified From in the original message.
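The authorization rule for cancel can be written out as a short Python sketch. The function names are hypothetical. For simplicity the sketch assumes that headers are held in a dictionary and that Sender and From values are bare addresses compared as exact strings. A real implementation would first separate the address from any full name.

```python
def verified_sender(headers):
    """The Sender line if present, otherwise the From line."""
    return headers.get("Sender") or headers["From"]


def may_cancel(cancel_headers, original_headers, is_local_superuser=False):
    """Decide whether a cancel control message should be honored."""
    if is_local_superuser:
        return True
    who = verified_sender(cancel_headers)
    # The verified sender of the cancel must equal either the Sender or
    # the From field of the original message.
    return who in (original_headers.get("Sender"),
                   original_headers.get("From"))
```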
     ihave <message ID list> <remotesys>
     sendme <message ID list> <remotesys>
This message is part of the "ihave/sendme" protocol, which allows one site (say "A") to tell another site ("B") that a particular message has been received on A. Suppose that site A receives article "ucbvax.1234", and wishes to transmit the article to site B. A sends the control message "ihave ucbvax.1234 A" to site B (by posting it to newsgroup "to.B"). B responds with the control message "sendme ucbvax.1234 B" (on newsgroup to.A) if it has not already received the article. Upon receiving the Sendme message, A sends the article to B.
This protocol can be used to cut down on redundant traffic between sites. It is optional and should be used only if the particular situation makes it worthwhile. Frequently, the outcome is that, since most original messages are short, and since there is a high overhead to start sending a new message with UUCP, it costs as much to send the Ihave as it would cost to send the article itself.
One possible solution to this overhead problem is to batch requests. Several message ID's may be announced or requested in one message. If no message ID's are listed in the control message, the body of the message should be scanned for message ID's, one per line.
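The receiving side of the ihave/sendme exchange, including the batched form where the message IDs are listed one per line in the body, might look like the following Python sketch. The function names and the use of a set for the history are assumptions made for illustration.

```python
def parse_ihave(control, body=""):
    """Return (command, message_ids, remotesys) for an ihave or sendme."""
    words = control.split()
    if len(words) < 2:
        raise ValueError("control message needs at least a remote system")
    command, args = words[0], words[1:]
    remotesys = args[-1]
    ids = args[:-1]
    if not ids:
        # No IDs in the control line: scan the body, one ID per line.
        ids = [line.strip() for line in body.splitlines() if line.strip()]
    return command, ids, remotesys


def sendme_reply(control, body, history, local_site):
    """Build the reply to an ihave message.

    Returns (newsgroup, control_message), or None when every announced
    article has already been received.
    """
    _command, ids, remotesys = parse_ihave(control, body)
    wanted = [mid for mid in ids if mid not in history]
    if not wanted:
        return None
    return "to." + remotesys, "sendme %s %s" % (" ".join(wanted), local_site)
```

With an empty history, `sendme_reply("ihave ucbvax.1234 A", "", set(), "B")` yields the newsgroup `to.A` and the control message `sendme ucbvax.1234 B`, as in the example above.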
     newgroup <groupname>
This control message creates a new newsgroup with the name given. Since no articles may be posted or forwarded until a newsgroup is created, this message is required before a newsgroup can be used. The body of the message is expected to be a short paragraph describing the intended use of the newsgroup.
     rmgroup <groupname>
This message removes a newsgroup with the given name. Since the newsgroup is removed from every site on the network, this command should be used carefully by a responsible administrator.
     sendsys (no arguments)
The "sys" file, listing all neighbors and which newsgroups are sent to each neighbor, will be mailed to the author of the control message (Reply-to, if present, otherwise From). This information is considered public information, and it is a requirement of membership in USENET that this information be provided on request, either automatically in response to this control message, or manually, by mailing the requested information to the author of the message. This information is used to keep the map of USENET up to date, and to determine where netnews is sent.
The format of the file mailed back to the author should be the same as that of the "sys" file. This format has one line per neighboring site (plus one line for the local site), containing four colon separated fields. The first field has the site name of the neighbor, the second field has a newsgroup pattern describing the newsgroups sent to the neighbor. The third and fourth fields are not defined by this standard. A sample response:
     From cbosgd!mark  Sun Mar 27 20:39:37 1983
     Subject: response to your sendsys request
     To: mark@cbosgd.UUCP
     Responding-System: cbosgd.UUCP
     cbosgd:osg,cb,btl,bell,net,fa,to,test
     ucbvax:net,fa,to.ucbvax:L:
     cbosg:net,fa,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews/cbosg
     cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb
     sescent:net,fa,bell,btl,cb,to.sescent:F:/usr/spool/outnews/sescent
     npois:net,fa,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois
     mhuxi:net,fa,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi
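A program that builds maps from sendsys replies needs only the first two fields of each line. The following Python sketch shows one way to read them. The function name is hypothetical, and the remaining fields are returned uninterpreted because this standard does not define them.

```python
def parse_sys_line(line):
    """Split one "sys" line into (site, patterns, other_fields)."""
    fields = line.strip().split(":")
    site = fields[0]
    patterns = fields[1].split(",") if len(fields) > 1 else []
    other_fields = fields[2:]  # not defined by the standard
    return site, patterns, other_fields
```

Applied to the sample response above, the line for ucbvax yields the site name `ucbvax` and the patterns `net`, `fa`, and `to.ucbvax`.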
     senduuname (no arguments)
The "uuname" program is run, and the output is mailed to the author of the control message (Reply-to, if present, otherwise From). This program lists all uucp neighbors of the local site. This information is used to make maps of the UUCP network. The sys file is not the same as the UUCP L.sys file. The L.sys file should never be transmitted to another party without the consent of the sites whose passwords are listed therein.
It is optional for a site to provide this information. Some reply should be made to the author of the control message, so that a transmission error won't be blamed. It is also permissible for a site to run the uuname program (or in some other way determine the uucp neighbors) and edit the output, either automatically or manually, before mailing the reply back to the author. The file should contain one site per line, beginning with the uucp site name. Additional information may be included, separated from the site name by a blank or tab. The phone number or password for the site should NOT be included, as the reply is considered to be in the public domain. (The uuname program will send only the site name and not the entire contents of the L.sys file, thus, phone numbers and passwords are not transmitted.)
The purpose of this message is to generate and maintain UUCP mail routing maps. Thus, connections over which mail can be sent using the site!user syntax should be included, regardless of whether the link is actually a UUCP link at the physical level. If a mail router should use it, it should be included. Since all information sent in response to this message is optional, sites are free to edit the list, deleting secret or private links they do not wish to publicise.
     version (no arguments)
The name and version of the software running on the local system is to be mailed back to the author of the article (Reply-to if present, otherwise From).
It is not a requirement that USENET sites have mail systems capable of understanding the ARPA Internet mail syntax, but it is strongly recommended. Since From, Reply-To, and Sender lines use the Internet syntax, replies will be difficult or impossible without an internet mailer. A site without an internet mailer can attempt to use the Path header line for replies, but this field is not guaranteed to be a working path for replies. In any event, any site generating or forwarding news messages must have an internet address that allows them to receive mail from sites with internet mailers, and they must include their internet address on their From line.
One problem with this method is that it may not be possible to convince the mail system that the From line of the message is valid, since the mail message was generated by a program on a system different from the source of the news article. Another problem is that error messages caused by the mail transmission would be sent to the originator of the news article, who has no control over news transmission between two cooperating hosts and does not know who to contact. Transmission error messages should be directed to a responsible contact person on the sending machine.
A solution to this problem is to encapsulate the news article into a mail message, such that the entire article (headers and body) is part of the body of the mail message. The convention here is that such mail is sent to user "rnews" on the remote system. A mail message body is generated by prepending the letter "N" to each line of the news article, and then attaching whatever mail headers are convenient to generate. The N's are attached to prevent any special lines in the news article from interfering with mail transmission, and to prevent any extra lines inserted by the mailer (headers, blank lines, etc.) from becoming part of the news article. A program on the receiving machine receives mail to "rnews", extracting the article itself and invoking the "rnews" program. An example in this format might look like this:
     Date: Monday, 3-Jan-83 08:33:47 MST
     From: news@cbosgd.UUCP
     Subject: network news article
     To: rnews@npois.UUCP
     NRelay-Version: B 2.10  2/13/83 cbosgd.UUCP
     NPosting-Version: B 2.9 6/21/82 sask.UUCP
     NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek
     NFrom: derek@sask.UUCP (Derek Andrew)
     NNewsgroups: net.test
     NSubject: necessary test
     NMessage-ID: <176@sask.UUCP>
     NDate: Monday, 3-Jan-83 00:59:15 MST
     N
     NThis really is a test.  If anyone out there more than 6
     Nhops away would kindly confirm this note I would
     Nappreciate it.  We suspect that our news postings
     Nare not getting out into the world.
     N
Using mail solves the spooling problem, since mail must always be spooled if the destination host is down. However, it adds more overhead to the transmission process (to encapsulate and extract the article) and makes it harder for software to give different priorities to news and mail.
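The "N" encapsulation and its inverse are simple enough to sketch in a few lines of Python. The function names are hypothetical. The extraction step keeps only lines that begin with "N", which discards mail headers and any blank lines added by the mailer.

```python
import re


def _lines(text):
    """Split text into lines on "\n" only, keeping the line ends."""
    return re.findall(r"[^\n]*\n|[^\n]+", text)


def encapsulate(article):
    """Prepend "N" to each line of the news article."""
    return "".join("N" + line for line in _lines(article))


def extract(mail_body):
    """Recover the article: keep lines starting with "N", drop the "N"."""
    return "".join(line[1:] for line in _lines(mail_body)
                   if line.startswith("N"))
```

The recovered text would then be handed to the "rnews" program on the receiving machine.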
News articles are combined into a script, separated by a header of the form:
      #! rnews 1234
where 1234 is the length, in bytes, of the article. Each such line is followed by an article containing the given number of bytes. (The newline at the end of each line of the article is counted as one byte, for purposes of this count, even if it is stored as CRLF.) For example, a batch of articles might look like this:
      #! rnews 374
      Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
      Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
      Path: cbosgd!mhuxj!mhuxt!eagle!jerry
      From: jerry@eagle.uucp (Jerry Schwarz)
      Newsgroups: net.general
      Subject: Usenet Etiquette -- Please Read
      Message-ID: <642@eagle.UUCP>
      Date: Friday, 19-Nov-82 16:14:55 EST
      Here is an important message about USENET Etiquette.
      #! rnews 378
      Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP
      Posting-Version: version B 2.10 2/13/83; site eagle.UUCP
      Path: cbosgd!mhuxj!mhuxt!eagle!jerry
      From: jerry@eagle.uucp (Jerry Schwarz)
      Newsgroups: net.followup
      Subject: Notes on Etiquette article
      Message-ID: <643@eagle.UUCP>
      Date: Friday, 19-Nov-82 17:24:12 EST
      There was something I forgot to mention in the last message.
Batched news is recognized because the first character in the message is "#". The message is then passed to the unbatcher for interpretation.
USENET is a directed graph. Each node in the graph is a host computer, each arc in the graph is a transmission path from one host to another host. Each arc is labelled with a newsgroup pattern, specifying which newsgroup classes are forwarded along that link. Most arcs are bidirectional, that is, if site A sends a class of newsgroups to site B, then site B usually sends the same class of newsgroups to site A. This bidirectionality is not, however, required.
USENET is made up of many subnetworks. Each subnet has a name, such as "net" or "btl". The special subnet "net" is defined to be USENET, although the union of all subnets may be a superset of USENET (because of sites that get local newsgroup classes but do not get net.all). Each subnet is a connected graph, that is, a path exists from every node to every other node in the subnet. In addition, the entire graph is (theoretically) connected. (In practice, some political considerations have caused some sites to be unable to post articles reaching the rest of the network.)
An article is posted on one machine to a list of newsgroups. That machine accepts it locally, then forwards it to all its neighbors that are interested in at least one of the newsgroups of the message. (Site A deems site B to be "interested" in a newsgroup if the newsgroup matches the pattern on the arc from A to B. This pattern is stored in a file on the A machine.) The sites receiving the incoming article examine it to make sure they really want the article, accept it locally, and then in turn forward the article to all their interested neighbors. This process continues until the entire network has seen the article.
An important part of the algorithm is the prevention of loops. The above process would cause a message to loop along a cycle forever. In particular, when site A sends an article to site B, site B will send it back to site A, which will send it to site B, and so on. One solution to this is the history mechanism. Each site keeps track of all articles it has seen (by their message ID) and whenever an article comes in that it has already seen, the incoming article is discarded immediately. This solution is sufficient to prevent loops, but additional optimizations can be made to avoid sending articles to sites that will simply throw them away.
One optimization is that an article should never be sent to a machine listed in the Path line of the header. When a machine name is in the Path line, the message is known to have passed through the machine. Another optimization is that, if the article originated on site A, then site A has already seen the article. (Origination can be determined by the Posting-Version line.)
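The interaction between the history mechanism and the Path optimization can be seen in a small simulation. The Python sketch below is illustrative only: the function name, the dictionary of links, and the breadth-first order are assumptions, newsgroup patterns are ignored, and the path omits the user name for brevity.

```python
def propagate(links, origin, message_id):
    """Flood one article through the network.

    links maps each site to the list of neighbors it feeds; every site
    must appear as a key. Returns (paths, transmissions), where paths
    maps each site to the Path it recorded for the article.
    """
    history = {site: set() for site in links}
    history[origin].add(message_id)
    paths = {origin: origin}
    transmissions = 0
    pending = [(origin, origin)]
    while pending:
        site, path = pending.pop(0)
        for neighbor in links[site]:
            if neighbor in path.split("!"):
                continue  # optimization: neighbor is already in the Path
            transmissions += 1
            if message_id in history[neighbor]:
                continue  # history: neighbor discards the duplicate
            history[neighbor].add(message_id)
            new_path = neighbor + "!" + path
            paths[neighbor] = new_path
            pending.append((neighbor, new_path))
    return paths, transmissions
```

In a triangle of sites A, B, and C that all feed each other, an article posted on A reaches B and C once each. B and C then offer it to each other, and the history check discards both copies. Neither sends it back to A, because A is in the Path.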
Thus, if an article is posted to newsgroup "net.misc", it will match the pattern "net.all" (where "all" is a metasymbol that matches any string), and will be forwarded to all sites that subscribe to net.all (as determined by what their neighbors send them). These sites make up the "net" subnetwork. An article posted to "btl.general" will reach all sites receiving "btl.all", but will not reach sites that do not get "btl.all". In effect, the article reaches the "btl" subnetwork. An article posted to newsgroups "net.micro,btl.general" will reach all sites subscribing to either of the two classes.
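The pattern matching and the forwarding decision might be sketched in Python as follows. The function names are hypothetical. Treating "all" as matching any string, including one that contains further periods, is one reading of the description above. Existing news software may differ in detail.

```python
import re


def matches(pattern, newsgroup):
    """True if newsgroup matches pattern, where the word "all" in the
    pattern stands for any string."""
    parts = [".*" if part == "all" else re.escape(part)
             for part in pattern.split(".")]
    return re.fullmatch(r"\.".join(parts), newsgroup) is not None


def should_forward(newsgroups_line, neighbor_patterns):
    """True if the neighbor is interested in at least one newsgroup.

    The Newsgroups line itself is only read, never edited.
    """
    groups = newsgroups_line.split(",")
    return any(matches(pattern, group)
               for group in groups for pattern in neighbor_patterns)
```

An article with "Newsgroups: net.micro,btl.general" is forwarded to a neighbor whose pattern is "net.all" and also to one whose pattern is "btl.all", with the header unchanged in both cases.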