Devoxx France@DevoxxFR / #DevoxxFR
Survival guide - Programming Languages Theory
Why this talk?
One of the challenges that I face today is to make
programmers believe that learning new stuff will only help
them think better. It's not mandatory that they will need all
of these tools as part of their day job. But broadening your
mental model can only help your thought process to leverage a
wider playground.
Debasish Ghosh, It's the familiarity model!
Agenda
- Defining main concepts
- On Language design
- On Programming Paradigms
- On Languages
- Commonalities in Programming Languages
- Functional programming
- Object Oriented Programming
- Types
- Polymorphism
- Programming with Types
Theoritical Foundations
- Formal Languages
- Automata
- Algorithmics
- λ-calculus
- Semantics
- Formal verification
- Type Theory
- Complexity
- Logic
- Philosophy
Computer Science and Philosophy
What philosophy gave us
- moving away from hardware
- logic
- abstraction
What philosophy can teach us
- conceptual analysis
- reflective examination of practice and thought
- thinking about aspects of experience
- the practice of a distinctive way of life and thought
Finding your way among alternatives 1/2
[...] engineering design is a social process [...] When
there are alternatives, [Bucciarelli] says, then there can be better
and worse. In such a situation, "The really important and interesting
question becomes: What do we mean by a better design?"
Bucciarelli (1994), p. 197
Finding your way among alternatives 2/2
Alternatives and choice
- programming style
- programming methodology
- understanding and choosing concepts
- ...
Assumption
- There is more than one truth
- What to do when the inherent philosophy of a language is not enough for solving a given problem?
What's in a Language
Specification
- Syntax
- Features
- Semantics
Connected concepts
- Compiler: Parser, Type-Checker, ...
- Runtime
- IDE
What's in a Program?
- computation / calculation
- input / output
- other systems machine are there too: filesystem, network, console, etc.
- sometimes, the machine is there too: memory, disk, etc.
Q: what's more important, the calculation or its result?
Are you declarative or procedural?
My definition: Programming is about transforming a
specification into phrases that could be understood by a machine.
The nature of Programming
"natural" programming style
- "it's better because that's the natural way of programming"
- our mind works with logic, what about Logic Programming then?
- on the field (eg. my experience as a teacher :-), novice think more easily with state
- what about OOP then?
A Social Aspect to Programming
Starts at school, with pedagogy
Social pressure and local culture
- communities, trends and perception
- Java to Scala: complex type system, for academics
- Scala to Java: old, verbose
- Ruby developpers about everything else: it stinks :-)
Culture and Environment
the former imposes the later
- startup: time to market
- corporate IT: least surprise
- academics and others: exploring / refreshing
lttle research on what matters
- impact from environment
- impact from language features
- impact from cognitive models
The Tasks of Programming
programming subtasks
- conceptualization / understanding the problem: domain
knowledge (statistics, banking)
- design: design strategy, architecture, algorithms
- coding: programming language, programming conventions
- maintenance: debugging, testing, evolution
Q: do people actually take those into accounts?
Programming is a complex cognitive and social task: need for
abstraction
Language Design and Acquisition of Programming
Evolution implies changes in truth, because we have a better
understanding of the problems
started with ingeneers/hackers
- they provide improvements based on needs (speed, power,
elegance)
moved in the hands of theoricians
- improvements by design rather than demand
- langages are conform to formal models (functional
languages, lambda-calculus, etc.)
what we see today
- trade-off between utility and well-foundness
- keep in mind that there is no proof that well-foudness is
ok with the cognitive effort (still I believe in it)
Language design
Good language design may be summarized in five catch
phrases: simplicity, security, fast translation, efficient
object code, and readability. [...] Many language designers
have adopted alternative principles.
Hoare, 73
Simplicity
- small range of instructions
- uniform format
- simple effects
- can be described and understood independently
Obviously related to syntax!
Orthogonality and Complexity
- related to simplicity
- there should be not more than one way of expressing any
action in the language
- language components are mutually independent
In a truly orthogonal language, a small set of basic
facilities may be combined without arbitrary restrictions
according to systematic rules.
M. Petre, Expert Programmers and Programming
Languages
Q: is Scala complex or difficult?
Security
Hoare's principles
- only syntactically correct programs should be accepted
- results and error should be predictable
- surprising and wild programs are not wanted
This lead us to Correctness
Correctness
- a program can be proven to conform to a specification
- this drives the aspiration to develop languages that conform to
formal models
Readability
Programs are read by people: the language should encouraged
clarity, hence:
The readability of programs is immeasurably more important
than their writability.
Hoare
Authors choose stylistics factors commonly thought to
influence style with little or no supporting evidence
demonstrating the importance of effect of that characteristic
on program comprehension and maintainability.
Homan and Cook, 88
Clarity of structure
- let the developper express the intended solution structure
visibly
- not that much related to syntax
Modularity
- large programs are manageable in "local" and independent
pieces
- with locality (aka. limited scope), modules interact via
interface
- with independance: separate coompilation, separate
debugging, independent testing, etc.
Abstraction
- hiding underlying complexity
- more high-level operations
The influence of the programming language on programming
- common conviction: the programming language guides the
developper to the solution
- studies only prove that they influence (read Expert
Programmers and Programming Languages, by M. Petre)
- to be considered: evangelism around languages tend to tell
programmers how they should think
Vehiculing meaning
The highest goal of programming language design [is] to
enable good ideas to be expressed
Hoare, 81
I view a programming language as a vehicule for the
description of (potentially highly sophisticated) abstract
mechanisms.
Dijsktra, 76
The goal should be: language design should not influence
thinking but reflect it clearly
Paradigm
A paradigm is what members of a scientific community, and
they alone, share.
Thomas Khun, The Essential Tension, 1977
Imperative programming style
- computation = sequence of statements
- a program manipulates state
usually related to
- procedural
- object oriented
- scripting
- ...
Procedural Programming
- aka. imperative programming
- references to the state of the machine, in particular memory
- Object Programming is somewhat an orthogonal concept
- Q: what do records and procedures become in OOP?
Scripting
- general language vs task-centric
- hosted / embedded in other languages
- sometimes, starts as DSL and evolve in full-feature language
- usually, light environment: Web browser, shell, etc.
Declarative programming style
- computation = description
- functional programming
- logic programming
- ...
Logic programming
- search for truth
- constrained enough to be decidable
- facts + rules (predicates)
- can be extended with constraints
parent(bill,mary).
parent(mary,john).
ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).
Datalog
- restriction on Prolog
- pretty useful for data/databases
(check Datomic!)
Language
Syntax
Semantics
- compiler / interpreter
- Semantics preservation
- runtime
what about
- standard library
- framework
- standards
Parsing
- from text to abstract representation of the program
- sometimes, tokenizer is another task
How to parse
- lex/yacc and variants
- parser combinators
Formal Grammar
- formally describes valid sentences
- does not say anything about correct ones
- BNF = Backus Normal Form
Abstract Syntax Tree
- abstract model to manipulate a program
- different ways to encode an AST
If(
Equal(
Add(Int(1), Int(2))
Int(3)),
Call("println", "true"),
Call("println", "false"))
Semantics
- Denotational semantics
- Operational semantics
- Axiomatic semantics
Commonalities in Programming Languages
Variables
- variable declaration
- variable assignment
- value
- by value vs by reference
- are functions just values?
// Java
String i;
i = "foo";
i = "bar";
// Scala
val i = new java.lang.StringBuilder("foo")
i = new java.lang.StringBuilder("foobar") // not valid
i.append("bar") // valid
// PHP
$i // the value depends on the context
Lazy evaluation 1/2
def time(computation: => Unit) = {
val timestamp = System.currentTimeMillis()
computation
println(System.currentTimeMillis() - timestamp)
}
time(Thread.sleep(2000))
Lazy evaluation 2/2
Sometimes I searched eight hours for the source of a space
leak. Other times I had to really dig around the internet to
learn how to force a computation so that I could measure the
time of said computation.
Eugen Kiss, The Functional View of the New Languages
- lazyness per default considered harmful
- some things are difficult without it: lazy datastructures
- hard to get it right when done by hand
- read John Hughes' Why functional programming matters
(im)mutability
- concerns a data-structure, or the state of an object
val
is not enough
- immutable objects can be passed around and read with no lock
- easier to reason about (especially for invariants)
can be achieved
- per convention (eg. no setter)
- per construction (language feature)
Q: What about your favorite language?
Immutable data-structure 1/4
sealed trait ListInt
case class Node(i: Int, next: ListInt) extends ListInt
case class Empty() extends ListInt
def concat(l1: ListInt, l2: ListInt): ListInt = l1 match {
case Empty() => l2
case Node(i, next) => Node(i, concat(next, l2))
}
Immutable data-structure 2/4
Immutable data-structure 3/4
Immutable data-structure 4/4
remarks
- previous copy still there
- can be garbage-collected
- sub-parts can be shared
in practice
Programming with Functions
Function
A function f: A => B
relates every value of
type A
to exactly one value of type B
A partial function is the same but for a subset
of A
no side-effect
- Reading/writing files
- Assigning to variables
- Setting fields on objects
- Mutating data structures
- Throwing exceptions
Referential Transparency
An expression e
is referentially transparent if for every program p
every occurrence of e
in p
can be replaced with the value of e
without affecting the observable result of p
.
A function f
is pure if the expression f(x)
is referentially transparent for all referentially transparent x
.
Referential Transparency - example 1/3
val e1 = "scala"
val e2 = e1.reverse
val e3 = e2.reverse
val e4 = e1 + e2 + e3
Referential Transparency - example 2/3
val e2 = "scala".reverse
val e3 = e2.reverse
val e4 = "scala" + e2 + e3
Referential Transparency - example 3/3
val e4 = "scala" + "alacs" + "scala"
Benefits of being pure
- A restriction on the kinds of programs you write.
- Only referentially transparent expressions.
- Only pure functions.
- No side-effects, ever.
- Can be done in any language.
Q: Is this pure?
def reverseTwice(l: List[Int]): List[Int] = {
val reverseOnce = l.reverse
println(reverseOnce.toString())
val reverseAgain = reverseOnce
return reverseAgain
}
def sum(l: List[Int]): Int = {
var acc = 0
for {
i <- 0 to (l.size - 1)
} acc += l(i)
return acc
}
Are there limits to being pure?
- I mean, we do interact with the world, right?
Everything is an expression
- even if-else
- even a block
- no need for
return
- what about statements?
Higher-order function
A higher-order function is a function that takes other
functions as input.
val isOdd: Int => Boolean = { x: Int => x % 2 == 0 }
// List(0, 2, 4, 6, 8, 10)
val odd_numbers = (0 to 10).toList.filter(isOdd)
Functional patterns
map
List(a, b, c).map(f) == List(f(a), f(b), f(c))
foldLeft
List(a, b, c).foldLeft(x)(f) == f(f(f(x, a), b, c))
Object Oriented Programming
Objects
Objects in the real world have only one thing in common --
they are all different.
Antero Taivalsaari, Classes vs. Prototypes Some
Philosophical and Historical Observations
Main concepts
- Dynamic lookup
- Abstraction
- Subtyping
- Inheritance
Dynamic lookup
- message semantics
- the object makes the method choice
- OOP is all about being dynamic
Dynamic lookup =? overloading
- Q: what's in a message?
- static lookup table
- actor model
- general case: dynamic lookup ≠ overloading
Abstraction
General
- implementation details are hidden
- interface
In the case of objects
- data
- functions
- behavior
- public interface / private implementation
- relation with Abstract Data Types?
Subtyping / substitutibity
- using one type in place of another
- constraint on safe/allowed substitutibity?
- syntactic: declarative approach
- dynamic: runtime presence
- semantic: cannot be easily enforced
- main property: sharing behavior
Inheritance
- all about reusing/sharing code between objects
- avoid code duplication
- Q: subtyping =? inheritance
- general case: subtyping ≠ inheritance
- subtyping is about the interface
- inheritance is about implementation
- in pratice: both concepts come together
Virtual method / overriding
- semantics dealing with ambiguity
- being virtual by default
- enforcing a safe use of overriding
Semantics of super
- refers to another object
- usually, follows class hierarchy
Virtual method table
- achieve method dispatch
- name binding: early binding vs late binding
- single vs multiple dispatch
Class
- about classifying objects
- grouping shared properties in one description
- assigning a name
Multiple inheritence and diamond problem
Fantom's way to solve it: name restriction to avoid collision
Mixins
- change usual semantics of super
- called
trait
s in Scala
-
class D extends A with B with C
- remark: in Scala, you can't mixi traits dynamically, but the following is ok
val d = new A() with B with C
- linearization: path following order of mixins declarations
- linearization's algorithm proven to be correct
Mixins example
taken from Programming in Scala
Prototypes
- not class-based
- cloning the prototype is the thing
- different object extension strategies
- delegation: the extended object is implemented as if having its components split and untouched
- concatenation: the extended object is copied, then modified
Philosophical considerations on Objects
OOP is hard(er than you think)
- class vs prototype
- which view of the world do you have? (if you're French, you're probably living by Aristotle's concepts)
- no optimum/perfect class hierarchy
- good enough approximations
- would you trust objects not in your hierarchy?
and you probably get it wrong
- do you encapsulate state?
- do you enforce invariants?
- even in the presence of inheritance?
- easy to violate object intent when using code re-use (eg. Java's
equals
, hashCode
, etc.)
Comparing concepts
- function
- method
- constructor
Implementation details
Objects in Python
class MyClass:
"""A simple example class"""
i = 12345
def f(self):
return 'hello world'
def __init__(self):
self.data = []
Static methods
- where do static methods/values fit in the OOP picture?
- singleton
Bridging the gap between FP and OOP
- functions as objects
- objects as functions
Types
A type indicates a set of values that have the same sort of
generic meaning or intended purpose
Some types, such as abstract types and function types, might
not be represented as values in the running computer program.
Types can be annotated or inferred.
Type System
A type system defines
- how a programming language classifies values and
expressions into types
- how it can manipulate those types
- how they interact
Type Checker
A type checker verifies if an expression has a type that
belongs to the type system
Inference
An inference engine
- tries to recontruct a type
- starts with simple known facts
- may try to achieve polymorphism (difficult)
- should deal with type annotation
What's in a name? static and dynamic
Let's ask the public :-)
And what about strong vs weak?
Static / dynamic type system
static type system
- operates only on source code
- assign types to all expressions
- types may not be retain at runtime
dynamic type system
- compiler generates code to keep tracks of type
- the runtime checks that the type is conform to the operation on the expression
the static-vs-dynamic drama
- Many programmers have used very poor statically typed
languages.
- Many programmers have used dynamically typed languages
very poorly.
Chris Smith, What To Know Before Debating Type Systems
Critics about being Static or Dynamic 1/2
[Types] are mainly good for documentation — documentation
for both man and machine. The idea that they make your
programs more reliable is a myth.
Gilad Bracha, Software Engineering Radio, episode 140
Critics about being Static or Dynamic 2/2
[...] what type-checkers do is they attempt to prove that your
program conforms to the type system. The type system
guarantees you that, if you conform to it, certain types of
bad thing won’t happen. That doesn’t mean that if you don’t
conform to it bad things will happen. They do not prove that
your program is bad; they prove that your program isn’t good,
according to their definition of good.
Gilad Bracha, Software Engineering Radio, episode 140
Alternatives
- Gradual Typing / Soft Typing
- Bringing Dynamics in Static
- More powerful type systems
- Effect Systems
Effects
- an effect system is an extension of the type system
- it modelizes an effect kind
- the region denotes "on what" the effect is being done
- in A Generic Type-and-Effect System, Daniel Marino and Todd Millstein, the region is generalized as: grant + check
Examples
- checked exceptions in Java
- race condition checking
- region-based memory management
Type System: expressivity and limits
- decidability
- semi-decidability
- tradeoff between power and automatisation (inference)
Static Analysis
- Static Analysis refers to any kind of analysis that does not require to execute the program
- ex: lint in JavaScript, Sonar for Java, PMD, checkstyle, etc.
- can also be used to increase the safety and confidence in a program
Type safety
You know, maybe the dynamic crowd just wants freedom to not
have to think as carefully about what they're doing. The
software might not be correct or robust, but maybe it doesn't
have to be.
Rúnar Bjarnason
Q: Are types enough to achieve safety? Is testing a mandatory phase in any programming language?
Proofs and Types
- Curry-Howard Isomorphism: a proof is a program, the formula it proves is a type for the program
- What's a proof for:
forall x: nat, exists y: nat, x = 2 * y \/ x = 2 * y + 1
- related to dependent-types:
div2 : (x: nat) => { y: nat | x = 2 * y \/ x = 2 * y + 1 }
Polymorphism: how to achieve it
Polymorphism, literally means “having multiple forms”, refers
to constructs that can take on different types as needed.
- Parametric polymorphism
- Ad hoc polymorphism
- Subtype polymorphism
Dynamically typed languages support all kind of polymorphism by
definition.
Parametric polymorphism
- for functions and datatype
- involves... a parameter
- no restriction on type, but can be bounded (Bounded Parametric Polymorphism)
reverse : 'a list -> 'a list
Ad-hoc polymorphism@@
- when the polymorphic type are known
- aka. overloading
- also, typeclasses are a form of ad-hoc polymorphism
Subtype polymorphism
When a function takes a value of type T, and that S is a
subtype for T, then we can apply the substitution principle.
Liskov substitution principle
The Liskov Substitution Principle defines that a subtype may be
substituted for a super type and the behavior remains unchanged.
Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S where S is a subtype of T.
B. Liskov and J. Wing, 94
counter-example: Rectangle / Square, getter / setter / area
void / Unit
- the fundamental difference between a procedure and a function is in its type
void
denotes that there is no type
Unit
is a special type for representing procedures
- it has only one inhabitant (hence its name)
- Q: as we return something, is there a performace impact?
- Q: no other use?
Structural Type
- sometimes, nominal typing in OOP is too restrictive
- and Pythonists brag too much about Duck Typing
- Q: if it quacks like a duck, is it really a duck?
Union Type
- a Union Type denotes the mathematical union of its contituing types
- two kinds: tagged (Scala) and un-tagged union (Ceylon)
- impact the way you deal with the type
- tagging allow to avoid reflection techniques
- tagging imposes to use a discrimant, but that can be very cheap
instanceOfUnionType match {
case o: SomeType => ...
case o: SomeType(x, y, z) => ...
}
oh, wait, I'm told that Scala type system is
powerful enough to encode unboxed Union Types :-)
Intersection Type
- an Intersection Type denotes a type that is the common subset of all its constituing types
- by the way, did you know they exist in Java?
T extends Foo & Bar
Algebraic Data Type
- an ADT is described by a bunch of functions defining how to construct values for this type
- a language using ADT must offer a mechanism to unwrap values from an ADT, usually with pattern matching
- Scala generalizes the concept with extractors
case class Foo(s: String, i: Int)
class Bar(val s: String, val i: Int)
object Bar {
def unapply(bar: Bar): Option[(String, Int)] =
Some((bar.s, bar.i))
}
Variance
- usually, generic type parameters are nonvariant
- but often, they must be either covariant or contravariant
Higher-Kinded Types
- Kinds are type constructors
- you shouldn't be able to get an expression of type List, because it's not a type
- Kings make programs much more generic
Typeclass
import java.util.Comparator
def max1[T](l: List[T], comparator: Comparator[T]): Option[T] =
l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }
def max2[T](l: List[T])(implicit comparator: Comparator[T]): Option[T] =
l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }
implicit val intComparator = new Comparator[Int] {
def compare(i: Int, j: Int) = i - j
}
max2(List(1, 2, 3))
def max3[T : Comparator](l: List[T]): Option[T] = {
val comparator = implicitly[Comparator[T]]
l.reduceLeftOption{ (x, y) => if (comparator.compare(x, y) < 0) x else y }
}
Phantom Types
- A Phantom Type is a type that does not contribute to the
computation
- it can be used as a witness (living in the type world) to check that a type constraint is as expected
- for example, to make sure that a builder was correctly used
Phantom Type - example 1/3
class Pizza()
class PizzaBuilder() {
def withTomato = new PizzaBuilder()
def build() = new Pizza()
}
object PizzaBuilder {
def apply() = new PizzaBuilder
}
val pizza = PizzaBuilder().build()
Phantom Type - example 2/3
class Pizza()
trait TRUE
trait FALSE
class PizzaBuilder[HasTomato] private () {
def withTomato = new PizzaBuilder[TRUE]()
def build()(implicit ev: HasTomato =:= TRUE) = new Pizza()
}
object PizzaBuilder {
def apply() = new PizzaBuilder[FALSE]
}
PizzaBuilder().build() // Cannot prove that FALSE =:= TRUE
PizzaBuilder().withTomato.build() // OK
Phantom Type - example 3/3
class Pizza()
trait TRUE
trait FALSE
class PizzaBuilder private () {
type HasTomato
def withTomato = new PizzaBuilder() { type HasTomato = TRUE }
def build()(implicit ev: HasTomato =:= TRUE) = new Pizza() { type HasTomato = TRUE }
}
object PizzaBuilder {
def apply() = new PizzaBuilder() { type HasTomato = FALSE }
}
PizzaBuilder().withTomato.build()
Types at runtime
- erasure
- reification
- reflection
- manifest
Dependent Types
- A dependent type depending on a value
- it usually has the form
{ x | P(x) }
- the property can be check at runtime
Singleton Types
- a singleton type denotes the unique type of a particular instance
- so you can say that a method returns a particular object
- makes easy to specify methods that modify the object
Self Types
- A self type is an assumed type for
this
- very useful for trait/mixins
class Foo() { ... }
trait Bar { this: Foo => ... }
val foobar = new Foo() with Bar
HList
A HList with a List with heterogenous types, but still supports the usual operations (map, head, tail, etc.)
trait Fruit
case class Apple() extends Fruit
case class Pear() extends Fruit
type FFFF = Fruit :: Fruit :: Fruit :: Fruit :: HNil
type APAP = Apple :: Pear :: Apple :: Pear :: HNil
val a : Apple = Apple()
val p : Pear = Pear()
val apap : APAP = a :: p :: a :: p :: HNil
val ffff : FFFF = apap // APAP <: FFFF
Revisiting the visitor pattern
Demo! Remember all we learnt about ADT, dispatch, higher-order function
←
→
/
#