Web Architecture: Information Representation Abstract The World Wide Web is the universe of network-accessible information, an embodyment of human knowledge. In this report on the software archi tecture of the web, we discuss formally (using Larch) the data models and structures used in the web, and their semantics; that is, their r elationship to human knowledge. d d @@From octets to characters to markup languages and message formats fo r email and other transports. d d Keywords: Larch, Character Set, Hypermedia, MIME, Internet Media Types , SGML, ILU, Relational Data Model, GIF, Image Formats d d @@terminology needs refinement: data/information, information represen tation, knowledge, knowledge representation, semantics, language, met a-information. d d d Information Primitives d d @@Bits, bytes, characters, numbers, symbols, and Strings. d d -720 octet: trait d -720 includes d -720 intfield(Octet, 0..255) d -720 sequence(OctetSeq, Octet) d -720 d @@discuss i18n. Use examples from "character set considered harmful" d d -720 character: trait d -720 includes octet d -720 sequence(CharSeq, Char) d -720 d -720 introduces d -720 encodes: Encoding, CharSeq, OctetSeq -> Bool d -720 "encodes(e,cs,os) ::= d -720 In the encoding e, the sequence of d -720 characters cs is represented by d -720 the sequence of octets os" d -720 d -720 code: CodedCharSet, Char, Int d -720 "code(ccs, c, p) ::= d -720 In the coded character set ccs, d -720 the character at position p is c." d -720 d Simple Structures d d lists, sequences, strings d d Relations, Tables, and Databases d d -720 relational-table: trait d -720 includes d -720 initeSet(Rows, Row) d -720 list(Row, Value) d -720 set(Type, Value) d -720 list(Schema, ColDesc) d -720 set(Query, Row) d -720 d -720 introduces: d -720 tuple: Table[Schema, Rows] d -720 tuple: ColDesc[name: Identifier, t: Type] d -720 d -720 empty: Schema -> Table d -720 d -720 check: Schema, Row -> Bool d -720 insert: Table, Row -> Table d -720 project: Table, Schema -> Table d -720 select: Table, Query -> Table d -720 join: Table, Table -> Table d -720 d -720 asserts: d -720 check([], []) = True d -720 check(s, r) = r[0] \\ in s[0].type /\\ check(rest(s), rest(r)) d -720 d -720 check(t.schema, r) -> insert(t,r).schema = t.schema d -720 check(t.schema, r) -> insert(t,r).rows = insert(r, t.rows) d -720 d -720 select(t,q).schema = t.schema d -720 r \ select(t,q).rows = r \ t.rows /\\ r \\ in q d -720 d -720 project(t,s).schema = s d -720 r \ project(t, []) = (r = []) d -720 subseq(s, t.schema, map) -> ( d -720 r \\ in project(t,s).rows = rr \ t.rows /\\ subseq(r, rr, map) ) d -720 d -720 join(t1, t2).schema = append(t1.schema, t2.schema) d -720 r \\ in join(t1, t2).rows = head(r, length(t1.schema)) \ t1.rows /\\ tail(r, length(t2.schema)) \ t2.rows d -720 d -720 relational-sort: trait d -720 includes: d -720 relational-table d -720 d -720 introduces: d -720 ordered: Schema, Order, Row, Row -> Bool d -720 d -720 relational-query: trait d -720 d -720 includes: d -720 relational-table, d -720 lambda(Schema for Formals, Row for Actuals, Bool for Value) d -720 expression d -720 d -720 introduces: d -720 where: Schema, Expr -> Query d -720 d -720 asserts: d -720 r \ where(s, e) = apply(lambda(s, e), r) d -720 d -720 d Regular Expressions and Finite State Machines d d @@protocol state machines, language idioms d d -720 regular-expression: trait d -720 d -720 introduces d -720 empty -> RE d -720 union: RE, RE -> RE (* RE / RE *) d -720 concatenate: RE, RE -> RE (* RE , RE *) d -720 kleene-star: RE -> RE (* RE* *) d -720 d -720 asserts: d -720 Assoc(concatenate) d -720 AssocComm(union) d -720 ForAll a, b, c: RE d -720 a/a = a d -720 a** = a* d -720 a,a* = a* d -720 a*,a = a* d -720 a / (b,c) = (a/b), (a/c) d -720 d -720 re-util: trait d -720 includes: d -720 regular-expression d -720 d -720 introduces d -720 optional: RE -> RE (* RE? *) d -720 one-or-more: RE -> RE (* RE+ *) d -720 asserts: d -720 a? = a/empty d -720 a+ = a,a* d -720 d Languages and Grammars d d -720 grammar: trait d -720 includes d -720 FiniteSet(Alphabet, Symbol) d -720 List(String, Symbol) d -720 Set(Language, String) d -720 FiniteSet(Productions, Production) d -720 Relation(Derives, String) d -720 d -720 introduces d -720 tuple: Grammar[a: Alphabet, s: Symbol, d -720 p: Productions] d -720 tuple: Production[lhs,rhs] d -720 derives: Grammar -> Derives d -720 L: Grammar -> Language d -720 d -720 asserts d -720 forall w: String, g: Grammar d -720 w \ L(g) = [g.s] w d d Automated Parsing d d -720 d scanner: trait d -720 includes character d -720 sequence(Construct, Token) d -720 finite(State) d -720 d -720 introduces: d -720 Token = [Symbol, OctetSeq] d -720 recognize: Scanner, State, OctetSeq, d -720 Construct, State, OctetSeq -> Bool d -720 "recognize(s, si, osi, c, so, oso) ::= d -720 Scanner s, in state si, when presented d -720 with OctetSeq osi, recognizes Construct c, d -720 enters state so, and leaves oso remaining." d -720 d Message and Document Formats d d @@more traits: internet media type, ILU message (pickle) d d -720 BodyPart: trait d -720 d -720 SGML: trait d -720 includes: d -720 character d -720 list(Entity, Character) d -720 list(Names, Name) d -720 d -720 introduces: d -720 document-character-set: Document -> CodedCharacterSet d -720 d -720 document-element: Document -> Element d -720 d -720 name: ElementType -> Name d -720 name: Element -> Name d -720 d -720 attributes: ElementType -> Names d -720 attributes: Element -> Names d -720 attrval : Element, Name -> String d -720 d -720 conforming: Document -> Bool d -720 d Turing Machines and Programs d d read, @@cite "Ultimate..." papers d d -720 lambda: trait d -720 includes d -720 list(Formals) d -720 list(Actuals) d -720 d -720 introduces: d -720 lambda: Formals, Expr -> Closure d -720 apply: Closure, Actuals -> Value d -720 d -720 expression: trait d -720 includes relational-table d -720 @@change this to generic expr trait d -720 introduces: d -720 where: Expr -> Query d -720 d -720 equal: Identifier, Value -> Expr d -720 equal: Identifier, Identifier -> Expr d -720 conjunction: Expr /\\ Expr -> Expr d -720 disjunction: Expr \\/ Expr -> Expr d -720 d -720 asserts: d -720 s[n].name = i -> d -720 (apply(lambda(s, equal(i,v)), r) = d -720 (r[n] = v)) d -720 s[n1].name = i1 / \\ s[n2].name = i2 -> d -720 (apply(lambda(s, equal(i1,i2)), r) = d -720 (r[n1] = r[n2])) d -720 d Objects, Operations, Attributes, and Types d d Logic: Argument, Deduction, and Proof d d Hypertext: web of nodes and links d (graph of vertices and edges) d semantics of n1 -l-> n2 d is predicate: P(n1, n2) d object types d ontology d frames d rules d d -720 Logic: trait d -720 d -720 includes d -720 Language(Formula) d -720 Set(Formulas, Formula) d -720 Relation(Logic, Formulas, Formula) d -720 Tree(Argument, Formula, Arc) d -720 d -720 introduces d -720 axiom: Logic, Formula -> Bool d -720 theorem: Logic, Formula -> Bool d -720 proof: Logic, Argument, Formula -> Bool d -720 d -720 asserts d -720 ForAll l:Logic, f,n: Formula, d -720 hyp: Formulas d -720 axiom(f) -> theorem(f) d -720 theorems(hyp) /\\ -> d -720 theorem(f) d -720 root(a, f) d -720 (forall n:Formula d -720 leaf(a, n) -> axiom(n)) /\\ d -720 (forall hyp: Formulas, c: Formula d -720 children(a, c, hyp) -> )) d -720 -> proof(l, a, f) d -720 d -720 theorem(l, f) = Exist a:Argument d -720 proof(l, a, f) d d }