Survival guide - Programming Languages Theory

#me

works at W3C, we do Web-related stuff :-)
do Linked Data, and a big soon-to-be-released project in Scala
slides available at http://www.w3.org/2012/Talks/bertails-devoxxfr/languages/

Why this talk?

One of the challenges that I face today is to make programmers believe that learning new stuff will only help them think better. It's not mandatory that they will need all of these tools as part of their day job. But broadening your mental model can only help your thought process to leverage a wider playground.

Debasish Ghosh, It's the familiarity model!

Agenda

Defining main concepts
On Language design
On Programming Paradigms
On Languages
Commonalities in Programming Languages
Functional programming
Object Oriented Programming
Types
Polymorphism
Programming with Types

Theoritical Foundations

Formal Languages
Automata
Algorithmics
λ-calculus
Semantics
Formal verification
Type Theory
Complexity
Logic
Philosophy

Computer Science and Philosophy

What philosophy gave us

moving away from hardware
logic
abstraction

What philosophy can teach us

conceptual analysis
reflective examination of practice and thought
thinking about aspects of experience
the practice of a distinctive way of life and thought

Finding your way among alternatives 1/2

[...] engineering design is a social process [...] When there are alternatives, [Bucciarelli] says, then there can be better and worse. In such a situation, "The really important and interesting question becomes: What do we mean by a better design?"

Bucciarelli (1994), p. 197

Finding your way among alternatives 2/2

Alternatives and choice

programming style
programming methodology
understanding and choosing concepts
...

Assumption

There is more than one truth
What to do when the inherent philosophy of a language is not enough for solving a given problem?

Defining main concepts

What's in a Language

Specification

Syntax
Features
Semantics

Connected concepts

Compiler: Parser, Type-Checker, ...
Runtime
IDE

What's in a Program?

computation / calculation
input / output
other systems machine are there too: filesystem, network, console, etc.
sometimes, the machine is there too: memory, disk, etc.

Q: what's more important, the calculation or its result? Are you declarative or procedural?

My definition: Programming is about transforming a specification into phrases that could be understood by a machine.

The nature of Programming

"natural" programming style

"it's better because that's the natural way of programming"
our mind works with logic, what about Logic Programming then?
on the field (eg. my experience as a teacher :-), novice think more easily with state
what about OOP then?

A Social Aspect to Programming

Starts at school, with pedagogy

book
practice

Social pressure and local culture

communities, trends and perception
Java to Scala: complex type system, for academics
Scala to Java: old, verbose
Ruby developpers about everything else: it stinks :-)

Culture and Environment

the former imposes the later

startup: time to market
corporate IT: least surprise
academics and others: exploring / refreshing

lttle research on what matters

impact from environment
impact from language features
impact from cognitive models

The Tasks of Programming

programming subtasks

conceptualization / understanding the problem: domain knowledge (statistics, banking)
design: design strategy, architecture, algorithms
coding: programming language, programming conventions
maintenance: debugging, testing, evolution

Q: do people actually take those into accounts?

Programming is a complex cognitive and social task: need for abstraction

Language Design and Acquisition of Programming

Evolution implies changes in truth, because we have a better understanding of the problems

started with ingeneers/hackers

they provide improvements based on needs (speed, power, elegance)

moved in the hands of theoricians

improvements by design rather than demand
langages are conform to formal models (functional languages, lambda-calculus, etc.)

what we see today

trade-off between utility and well-foundness
keep in mind that there is no proof that well-foudness is ok with the cognitive effort (still I believe in it)

On Language design

Language design

Good language design may be summarized in five catch phrases: simplicity, security, fast translation, efficient object code, and readability. [...] Many language designers have adopted alternative principles.

Hoare, 73

Simplicity

small range of instructions
uniform format
simple effects
can be described and understood independently

Obviously related to syntax!

Orthogonality and Complexity

related to simplicity
there should be not more than one way of expressing any action in the language
language components are mutually independent

In a truly orthogonal language, a small set of basic facilities may be combined without arbitrary restrictions according to systematic rules.

M. Petre, Expert Programmers and Programming Languages

Q: is Scala complex or difficult?

Security

Hoare's principles

only syntactically correct programs should be accepted
results and error should be predictable
surprising and wild programs are not wanted

This lead us to Correctness

Correctness

a program can be proven to conform to a specification
this drives the aspiration to develop languages that conform to formal models

Readability

Programs are read by people: the language should encouraged clarity, hence:

The readability of programs is immeasurably more important than their writability.

Hoare

Authors choose stylistics factors commonly thought to influence style with little or no supporting evidence demonstrating the importance of effect of that characteristic on program comprehension and maintainability.

Homan and Cook, 88

Clarity of structure

let the developper express the intended solution structure visibly
not that much related to syntax

Modularity

large programs are manageable in "local" and independent pieces
with locality (aka. limited scope), modules interact via interface
with independance: separate coompilation, separate debugging, independent testing, etc.

Abstraction

hiding underlying complexity
more high-level operations

The influence of the programming language on programming

common conviction: the programming language guides the developper to the solution
studies only prove that they influence (read Expert Programmers and Programming Languages, by M. Petre)
to be considered: evangelism around languages tend to tell programmers how they should think

Vehiculing meaning

The highest goal of programming language design [is] to enable good ideas to be expressed

Hoare, 81

I view a programming language as a vehicule for the description of (potentially highly sophisticated) abstract mechanisms.

Dijsktra, 76

The goal should be: language design should not influence thinking but reflect it clearly

On Programming Paradigms

Paradigm

A paradigm is what members of a scientific community, and they alone, share.

Thomas Khun, The Essential Tension, 1977

Imperative programming style

computation = sequence of statements
a program manipulates state

usually related to

procedural
object oriented
scripting
...

Procedural Programming

aka. imperative programming
references to the state of the machine, in particular memory
Object Programming is somewhat an orthogonal concept
Q: what do records and procedures become in OOP?

Scripting

general language vs task-centric
hosted / embedded in other languages
sometimes, starts as DSL and evolve in full-feature language
usually, light environment: Web browser, shell, etc.

Declarative programming style

computation = description
functional programming
logic programming
...

Logic programming

search for truth
constrained enough to be decidable
facts + rules (predicates)
can be extended with constraints

parent(bill,mary).
parent(mary,john).
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).

Datalog

restriction on Prolog
pretty useful for data/databases (check Datomic!)

On Languages

Language

Syntax

parser
grammar

Semantics

compiler / interpreter
Semantics preservation
runtime

what about

standard library
framework
standards

Parsing

from text to abstract representation of the program
sometimes, tokenizer is another task

How to parse

lex/yacc and variants
parser combinators

Formal Grammar

formally describes valid sentences
does not say anything about correct ones
BNF = Backus Normal Form

Abstract Syntax Tree

abstract model to manipulate a program
different ways to encode an AST

If(
  Equal(
    Add(Int(1), Int(2))
    Int(3)),
  Call("println", "true"),
  Call("println", "false"))

Semantics

Denotational semantics
Operational semantics
Axiomatic semantics

Commonalities in Programming Languages

Variables

variable declaration
variable assignment
value
by value vs by reference
are functions just values?

// Java
String i;
i = "foo";
i = "bar";

// Scala
val i = new java.lang.StringBuilder("foo")
i = new java.lang.StringBuilder("foobar") // not valid
i.append("bar") // valid

// PHP
$i // the value depends on the context

Lazy evaluation 1/2

def time(computation: => Unit) = {
  val timestamp = System.currentTimeMillis()
  computation
  println(System.currentTimeMillis() - timestamp)
}

time(Thread.sleep(2000))

Lazy evaluation 2/2

Sometimes I searched eight hours for the source of a space leak. Other times I had to really dig around the internet to learn how to force a computation so that I could measure the time of said computation.

Eugen Kiss, The Functional View of the New Languages

lazyness per default considered harmful
some things are difficult without it: lazy datastructures
hard to get it right when done by hand
read John Hughes' Why functional programming matters

(im)mutability

concerns a data-structure, or the state of an object
val is not enough
immutable objects can be passed around and read with no lock
easier to reason about (especially for invariants)

can be achieved

per convention (eg. no setter)
per construction (language feature)

Q: What about your favorite language?

Immutable data-structure 1/4

sealed trait ListInt

case class Node(i: Int, next: ListInt) extends ListInt

case class Empty() extends ListInt

def concat(l1: ListInt, l2: ListInt): ListInt = l1 match {
  case Empty() => l2
  case Node(i, next) => Node(i, concat(next, l2))
}

Immutable data-structure 2/4

Immutable data-structure 3/4

Immutable data-structure 4/4

remarks

previous copy still there
can be garbage-collected
sub-parts can be shared

in practice

there are a lot of things you can do this way
to read: Chris Okasaki's Purely Functional Data Structures (thesis)
very functional approach

Functional programming

Disclaimer

Shamelessly stolen from Rúnar's Guerilla Guide to Functional Programming

You should buy his book Functional Programming in Scala

Programming with Functions

Function

A function f: A => B relates every value of type A to exactly one value of type B

A partial function is the same but for a subset of A

no side-effect

Reading/writing files
Assigning to variables
Setting fields on objects
Mutating data structures
Throwing exceptions

Referential Transparency

An expression e is referentially transparent if for every program p every occurrence of e in p can be replaced with the value of e without affecting the observable result of p.

A function f is pure if the expression f(x) is referentially transparent for all referentially transparent x.

Referential Transparency - example 1/3

val e1 = "scala"
val e2 = e1.reverse
val e3 = e2.reverse
val e4 = e1 + e2 + e3

Referential Transparency - example 2/3

val e2 = "scala".reverse
val e3 = e2.reverse
val e4 = "scala" + e2 + e3

Referential Transparency - example 3/3

val e4 = "scala" + "alacs" + "scala"

Benefits of being pure

A restriction on the kinds of programs you write.
Only referentially transparent expressions.
Only pure functions.
No side-effects, ever.
Can be done in any language.

Q: Is this pure?

def reverseTwice(l: List[Int]): List[Int] = {
  val reverseOnce = l.reverse
  println(reverseOnce.toString())
  val reverseAgain = reverseOnce
  return reverseAgain
}

def sum(l: List[Int]): Int = {
  var acc = 0
  for {
    i <- 0 to (l.size - 1)
  } acc += l(i)
  return acc
}

Are there limits to being pure?

I mean, we do interact with the world, right?

Everything is an expression

even if-else
even a block
no need for return
what about statements?

Higher-order function

A higher-order function is a function that takes other functions as input.

val isOdd: Int => Boolean = { x: Int => x % 2 == 0 }

// List(0, 2, 4, 6, 8, 10)  
val odd_numbers = (0 to 10).toList.filter(isOdd)

Functional patterns

map

List(a, b, c).map(f) == List(f(a), f(b), f(c))

foldLeft

List(a, b, c).foldLeft(x)(f) == f(f(f(x, a), b, c))

Object Oriented Programming

Objects

Objects in the real world have only one thing in common -- they are all different.

Antero Taivalsaari, Classes vs. Prototypes Some Philosophical and Historical Observations

Main concepts

Dynamic lookup
Abstraction
Subtyping
Inheritance

Dynamic lookup

message semantics
the object makes the method choice
OOP is all about being dynamic

Dynamic lookup =? overloading

Q: what's in a message?
static lookup table
actor model
general case: dynamic lookup ≠ overloading

Abstraction

General

implementation details are hidden
interface

In the case of objects

data
functions
behavior
public interface / private implementation
relation with Abstract Data Types?

Subtyping / substitutibity

using one type in place of another
constraint on safe/allowed substitutibity?
- syntactic: declarative approach
- dynamic: runtime presence
- semantic: cannot be easily enforced
main property: sharing behavior

Inheritance

all about reusing/sharing code between objects
avoid code duplication
Q: subtyping =? inheritance
general case: subtyping ≠ inheritance
- subtyping is about the interface
- inheritance is about implementation
in pratice: both concepts come together

Virtual method / overriding

semantics dealing with ambiguity
being virtual by default
enforcing a safe use of overriding

Semantics of super

refers to another object
usually, follows class hierarchy

Virtual method table

achieve method dispatch
name binding: early binding vs late binding
single vs multiple dispatch

Class

about classifying objects
grouping shared properties in one description
assigning a name

Multiple inheritence and diamond problem

Fantom's way to solve it: name restriction to avoid collision

Mixins

change usual semantics of super
called traits in Scala
class D extends A with B with C
remark: in Scala, you can't mixi traits dynamically, but the following is ok val d = new A() with B with C
linearization: path following order of mixins declarations
linearization's algorithm proven to be correct

Mixins example

taken from Programming in Scala

Prototypes

not class-based
cloning the prototype is the thing
different object extension strategies
- delegation: the extended object is implemented as if having its components split and untouched
- concatenation: the extended object is copied, then modified

Philosophical considerations on Objects

OOP is hard(er than you think)

class vs prototype
which view of the world do you have? (if you're French, you're probably living by Aristotle's concepts)
no optimum/perfect class hierarchy
good enough approximations
would you trust objects not in your hierarchy?

and you probably get it wrong

do you encapsulate state?
do you enforce invariants?
even in the presence of inheritance?
easy to violate object intent when using code re-use (eg. Java's equals, hashCode, etc.)

Comparing concepts

function
method
constructor

Implementation details

Objects in Python

class MyClass:
  """A simple example class"""
  i = 12345
  def f(self):
    return 'hello world'
  def __init__(self):
    self.data = []

Static methods

where do static methods/values fit in the OOP picture?
singleton

Bridging the gap between FP and OOP

functions as objects
objects as functions

Types

A type indicates a set of values that have the same sort of generic meaning or intended purpose

Some types, such as abstract types and function types, might not be represented as values in the running computer program.

Types can be annotated or inferred.

Type System

A type system defines

how a programming language classifies values and expressions into types
how it can manipulate those types
how they interact

Type Checker

A type checker verifies if an expression has a type that belongs to the type system

Inference

An inference engine

tries to recontruct a type
starts with simple known facts
may try to achieve polymorphism (difficult)
should deal with type annotation

What's in a name? static and dynamic

Let's ask the public :-)

And what about strong vs weak?

Static / dynamic type system

static type system

operates only on source code
assign types to all expressions
types may not be retain at runtime

dynamic type system

compiler generates code to keep tracks of type
the runtime checks that the type is conform to the operation on the expression

the static-vs-dynamic drama

Many programmers have used very poor statically typed languages.

Many programmers have used dynamically typed languages very poorly.

Chris Smith, What To Know Before Debating Type Systems

Critics about being Static or Dynamic 1/2

[Types] are mainly good for documentation — documentation for both man and machine. The idea that they make your programs more reliable is a myth.

Gilad Bracha, Software Engineering Radio, episode 140

Critics about being Static or Dynamic 2/2

[...] what type-checkers do is they attempt to prove that your program conforms to the type system. The type system guarantees you that, if you conform to it, certain types of bad thing won’t happen. That doesn’t mean that if you don’t conform to it bad things will happen. They do not prove that your program is bad; they prove that your program isn’t good, according to their definition of good.

Gilad Bracha, Software Engineering Radio, episode 140

Alternatives

Gradual Typing / Soft Typing
Bringing Dynamics in Static
More powerful type systems
Effect Systems

Lying with Types

String typing™

IO Monad

def reverseTwice(l: List[Int]): IO[List[Int]]

enforcing the use of types everywhere can be cumbersome

Effects

an effect system is an extension of the type system
it modelizes an effect kind
the region denotes "on what" the effect is being done
in A Generic Type-and-Effect System, Daniel Marino and Todd Millstein, the region is generalized as: grant + check

Examples

checked exceptions in Java
race condition checking
region-based memory management

Type System: expressivity and limits

decidability
semi-decidability
tradeoff between power and automatisation (inference)

Static Analysis

Static Analysis refers to any kind of analysis that does not require to execute the program
ex: lint in JavaScript, Sonar for Java, PMD, checkstyle, etc.
can also be used to increase the safety and confidence in a program

Type safety

You know, maybe the dynamic crowd just wants freedom to not have to think as carefully about what they're doing. The software might not be correct or robust, but maybe it doesn't have to be.

Rúnar Bjarnason

Q: Are types enough to achieve safety? Is testing a mandatory phase in any programming language?

Proofs and Types

Curry-Howard Isomorphism: a proof is a program, the formula it proves is a type for the program

What's a proof for:

forall x: nat, exists y: nat,  x = 2 * y  \/  x = 2 * y + 1

related to dependent-types:

div2 :  (x: nat) => { y: nat | x = 2 * y  \/  x = 2 * y + 1 }

Polymorphism

Polymorphism: how to achieve it

Polymorphism, literally means “having multiple forms”, refers to constructs that can take on different types as needed.

Parametric polymorphism
Ad hoc polymorphism
Subtype polymorphism

Dynamically typed languages support all kind of polymorphism by definition.

Parametric polymorphism

for functions and datatype
involves... a parameter
no restriction on type, but can be bounded (Bounded Parametric Polymorphism)

reverse : 'a list -> 'a list

Ad-hoc polymorphism@@

when the polymorphic type are known
aka. overloading
also, typeclasses are a form of ad-hoc polymorphism

Subtype polymorphism

When a function takes a value of type T, and that S is a subtype for T, then we can apply the substitution principle.

Liskov substitution principle

The Liskov Substitution Principle defines that a subtype may be substituted for a super type and the behavior remains unchanged.

Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S where S is a subtype of T.

B. Liskov and J. Wing, 94

counter-example: Rectangle / Square, getter / setter / area

Programming with Types

void / Unit

the fundamental difference between a procedure and a function is in its type
void denotes that there is no type
Unit is a special type for representing procedures
it has only one inhabitant (hence its name)
Q: as we return something, is there a performace impact?
Q: no other use?

Structural Type

sometimes, nominal typing in OOP is too restrictive
and Pythonists brag too much about Duck Typing
Q: if it quacks like a duck, is it really a duck?

Union Type

a Union Type denotes the mathematical union of its contituing types
two kinds: tagged (Scala) and un-tagged union (Ceylon)
impact the way you deal with the type
tagging allow to avoid reflection techniques
tagging imposes to use a discrimant, but that can be very cheap

instanceOfUnionType match {
  case o: SomeType => ...
  case o: SomeType(x, y, z) => ...
}

oh, wait, I'm told that Scala type system is powerful enough to encode unboxed Union Types :-)

Intersection Type

an Intersection Type denotes a type that is the common subset of all its constituing types
by the way, did you know they exist in Java?

T extends Foo & Bar

Algebraic Data Type

an ADT is described by a bunch of functions defining how to construct values for this type
a language using ADT must offer a mechanism to unwrap values from an ADT, usually with pattern matching
Scala generalizes the concept with extractors

case class Foo(s: String, i: Int)

class Bar(val s: String, val i: Int)

object Bar {
  def unapply(bar: Bar): Option[(String, Int)] =
    Some((bar.s, bar.i))
}

Variance

usually, generic type parameters are nonvariant
but often, they must be either covariant or contravariant

Higher-Kinded Types

Kinds are type constructors
you shouldn't be able to get an expression of type List, because it's not a type
Kings make programs much more generic

Typeclass

import java.util.Comparator

def max1[T](l: List[T], comparator: Comparator[T]): Option[T] =
  l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }

def max2[T](l: List[T])(implicit comparator: Comparator[T]): Option[T] =
  l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }

implicit val intComparator = new Comparator[Int] {
  def compare(i: Int, j: Int) = i - j
}

max2(List(1, 2, 3))

def max3[T : Comparator](l: List[T]): Option[T] = {
  val comparator = implicitly[Comparator[T]]
  l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }
}

Phantom Types

A Phantom Type is a type that does not contribute to the computation
it can be used as a witness (living in the type world) to check that a type constraint is as expected
for example, to make sure that a builder was correctly used

Phantom Type - example 1/3

class Pizza()

class PizzaBuilder() {
  def withTomato = new PizzaBuilder()
  def build() = new Pizza()
}

object PizzaBuilder {
  def apply() = new PizzaBuilder
}

val pizza = PizzaBuilder().build()

Phantom Type - example 2/3

class Pizza()

trait TRUE
trait FALSE

class PizzaBuilder[HasTomato] private () {
  def withTomato = new PizzaBuilder[TRUE]()
  def build()(implicit ev: HasTomato =:= TRUE) = new Pizza()
}

object PizzaBuilder {
  def apply() = new PizzaBuilder[FALSE]
}

PizzaBuilder().build() // Cannot prove that FALSE =:= TRUE
PizzaBuilder().withTomato.build() // OK

Phantom Type - example 3/3

class Pizza()

trait TRUE
trait FALSE

class PizzaBuilder private () {
  type HasTomato
  def withTomato = new PizzaBuilder() { type HasTomato = TRUE }
  def build()(implicit ev: HasTomato =:= TRUE) = new Pizza() { type HasTomato = TRUE }
}

object PizzaBuilder {
  def apply() = new PizzaBuilder() { type HasTomato = FALSE }
}

PizzaBuilder().withTomato.build()

Types at runtime

erasure
reification
reflection
manifest

Dependent Types

A dependent type depending on a value
it usually has the form { x | P(x) }
the property can be check at runtime

Singleton Types

a singleton type denotes the unique type of a particular instance
so you can say that a method returns a particular object
makes easy to specify methods that modify the object

Self Types

A self type is an assumed type for this
very useful for trait/mixins

class Foo() { ... }

trait Bar { this: Foo => ... }

val foobar = new Foo() with Bar

HList

A HList with a List with heterogenous types, but still supports the usual operations (map, head, tail, etc.)

trait Fruit
case class Apple() extends Fruit
case class Pear() extends Fruit

type FFFF = Fruit :: Fruit :: Fruit :: Fruit :: HNil
type APAP = Apple :: Pear :: Apple :: Pear :: HNil

val a : Apple = Apple()
val p : Pear = Pear()

val apap : APAP = a :: p :: a :: p :: HNil
val ffff : FFFF = apap  // APAP <: FFFF

Revisiting the visitor pattern

Demo! Remember all we learnt about ADT, dispatch, higher-order function