
By the way, folks, one of the reasons I haven’t completed the Montague Grammar posts is that UCLA’s Terence Parsons has written an excellent introductory text on model-theoretic semantics which is available for free on his website. He even talks about more recent semantic concepts, though it doesn’t quite reach the pitch of abstraction of things like “Dynamic Montague Grammar”. Check it out.

After a longish break, the Montague Grammar series returns with the first installment discussing Montague’s actual analysis of his fragment of English. Montague’s analysis of nouns (proper and common) hinges on logical devices known as *generalized quantifiers*, which were first studied by Mostowski and Lindström in the 1950s and 1960s. They noticed that some concepts, like “there exist uncountably many X”, were not definable in terms of the “ordinary” existential and universal quantifiers: new quantifiers, the generalized quantifiers, had to be introduced. Generalized quantifiers are not of a greater “order” than ordinary quantifiers: first-order generalized quantifiers make use of exactly the same model-theoretic framework as the ordinary quantifiers, but divvy up those structures in different ways. Montague’s logic is a “higher-order” intensional logic, but the same principles apply: his new quantificational devices do not fundamentally alter the theory of types on which they are based.
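To make the idea concrete, here is a minimal sketch (in Python, with an invented example) of a generalized quantifier treated as a relation between sets: “most A are B”, a comparison of cardinalities that is not definable from the ordinary ∀ and ∃ alone.

```python
# A generalized quantifier as a relation between sets, sketched in
# Python. "Most A are B" holds just in case more A's are B than are
# not; unlike "all" and "some", this comparison of sizes cannot be
# expressed with the ordinary quantifiers.
def most(A, B):
    return len(A & B) > len(A - B)

print(most({1, 2, 3}, {2, 3, 4}))  # True: two of the three A's are B's
print(most({1, 2, 3}, {3}))        # False: only one of three
```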

There are three generalized quantifiers introduced in PTQ. The first one is written with a combining inverted breve: Î. This symbolizes λuφ, the set of all objects where proposition φ holds when u is substituted for its variable. The second one is written with a circumflex, used in intensional logic to symbolize the intension of a word: ^uφ represents “the property of objects of type u expressed by φ”. The third generalized quantifier, written with an asterisk, applies the breve quantifier to properties of individual concepts (intensions): a* is equivalent to Î[I(^a)], or the set of properties which the intension of the term *a* (the function determining what object it is in various possible worlds) has. If we iterate the asterisk, we get the set of properties of properties of an individual concept (in PTQ this is written with a cursive P, and forms the basis of Montague’s semantic analysis of the word “to be”).

These devices are necessary to incorporate into its semantic analysis the “puzzles of intensionality” which motivated Montague Grammar, such as the difference between *de re* and *de dicto* readings of “John talks about a unicorn”: in particular, the asterisk is used to symbolize the “meaning” of a word, since no matter whether one is using an expression in an extensional or an intensional context it can be defined by all the properties of its “sense”. (In early Montague Grammar seminars, they used to ask “What is the meaning of life?” and answer by writing “life*” on the chalkboard, but this really means that the meaning of life is everything that is true of every way it could be.)

It’s time for the world’s shortest and most simple-minded introduction to the model theory of modal logic. Since “model theory” generally employs some fairly exotic concepts, I suppose it’d be best to begin by trying to concretize the idea of a model of a sentence. A model of a logical sentence establishes a systematic correspondence between the parts of that sentence and mathematical entities possessing the same formal properties. Since these sentences, like sentences in a natural language, can be indefinitely complicated by operations for building new sentences out of old ones (like joining two sentences by “and”), establishing this correspondence for all sentences requires a way of “disassembling” an arbitrary sentence: as I’ve remarked before, in model theory this takes the form of a recursive definition, where the meaning of a longer sentence is defined in terms of its subsentences until we reach semantic primitives, which are “satisfied” (represented) by arbitrarily chosen mathematical entities. For example, the recursive definition for a logic involving disjunction (“inclusive or”) would feature a clause stating this:

“X v Y” is satisfied if either X is satisfied or Y is satisfied.
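As a concrete illustration, here is a minimal recursive satisfaction definition sketched in Python (the tuple encoding and connective names are my own illustrative choices, not anything from model theory proper):

```python
# A minimal recursive satisfaction definition for propositional logic.
# Sentences are nested tuples; an interpretation maps the primitive
# sentence letters to truth values chosen arbitrarily.

def satisfied(sentence, interpretation):
    """Recursively evaluate a sentence against an interpretation."""
    if isinstance(sentence, str):  # a semantic primitive
        return interpretation[sentence]
    op, *args = sentence
    if op == "not":
        return not satisfied(args[0], interpretation)
    if op == "and":
        return satisfied(args[0], interpretation) and satisfied(args[1], interpretation)
    if op == "or":   # the disjunction clause quoted above
        return satisfied(args[0], interpretation) or satisfied(args[1], interpretation)
    raise ValueError(f"unknown connective: {op}")

# "X v Y" is satisfied if either disjunct is satisfied:
print(satisfied(("or", "X", "Y"), {"X": False, "Y": True}))  # True
```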

All the regular “extensional” connectives have relatively simple clauses like this. When we get to the quantifiers things get a little more complicated, but Tarski solved that problem in the ’30s by making the mathematical entities satisfying quantified sentences infinite sequences and assigning quantified variables to specific positions in those sequences: “for all x, φ” is satisfied by a sequence if φ is satisfied by every sequence differing from it in at most the xth position, and “there is some x, φ” is satisfied if φ is satisfied by at least one such sequence. He used this complete recursive definition of logical sentences to define a logical truth as a sentence which cannot fail to be satisfied. What took a longer time was figuring out how to represent modal or “intensional” operators model-theoretically, since they cause the meaning of a sentence to not be a straightforward truth-functional consequence of the meanings of its component parts; Kripke solved this problem by developing a model theory using “possible worlds”. Where a model of an ordinary first-order language can be thought of as an ordered pair (a domain of entities together with a satisfaction relation), a model of a modal language consists of an ordered triple, written <W,R,V>. Let me explain each element.
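Tarski’s trick for the quantifiers can be sketched the same way; here the “sequences” are finite variable assignments over a tiny invented domain (all names in this sketch are mine):

```python
# Tarski-style quantifier clauses, sketched in Python. An "assignment"
# plays the role of a sequence; a quantified formula is satisfied by
# checking all assignments differing from it at most at the bound variable.
domain = {1, 2, 3}

def variants(assignment, var):
    """All assignments differing from `assignment` in at most `var`."""
    for d in domain:
        yield {**assignment, var: d}

def forall(var, phi, assignment):
    return all(phi(s) for s in variants(assignment, var))

def exists(var, phi, assignment):
    return any(phi(s) for s in variants(assignment, var))

# "For all x, x > 0" over the domain {1, 2, 3}:
print(forall("x", lambda s: s["x"] > 0, {}))  # True
# "There is some x such that x > 2":
print(exists("x", lambda s: s["x"] > 2, {}))  # True
```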

W is the set of possible worlds; V is the valuation, which assigns to each possible world the set of propositions true there; and R is the “accessibility relation”, which determines how relevant truth in one possible world is to truth in another. If a statement is true in an accessible world, the statement is possibly true in the world under consideration (symbolized ◊x): if a statement is true in all accessible worlds, the statement is necessarily true in the world under consideration (□x). Varying the accessibility relation is very important for comparative study of modal logics: a different accessibility relation gives you a model of a different modal logic. Conveniently for us, Montague uses the most intuitive accessibility relation, an equivalence relation: it is reflexive (world x is accessible from itself), transitive (if x is accessible from y and y is accessible from z, x is accessible from z), and symmetric (if x is accessible from y, y is accessible from x). In the simplest case this means that all worlds are accessible from each other, so that a statement which is true in some possible world is possibly true in every world, or is necessarily possible: this is the model-theoretic equivalent of an axiom of the modal logic S5, ◊x→□◊x, and S5 is the logic defined by taking accessibility to be an equivalence relation.
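A toy version of such a triple can be sketched in Python (the worlds and atomic sentences are invented, and V is represented as a per-world set of true atoms):

```python
# A toy Kripke model <W, R, V>. W: the worlds; R: accessibility pairs;
# V: which atomic sentences are true at which world.
W = {"w1", "w2", "w3"}
R = {(u, v) for u in W for v in W}  # equivalence: every world sees every world
V = {"w1": {"p"}, "w2": {"p", "q"}, "w3": set()}

def true_at(world, atom):
    return atom in V[world]

def possibly(world, atom):
    # ◊p: p holds in some world accessible from `world`
    return any(true_at(v, atom) for (u, v) in R if u == world)

def necessarily(world, atom):
    # □p: p holds in every world accessible from `world`
    return all(true_at(v, atom) for (u, v) in R if u == world)

print(possibly("w3", "p"))     # True: p holds at the accessible w1
print(necessarily("w1", "p"))  # False: p fails at the accessible w3
# The S5 pattern ◊p → □◊p: with total accessibility, possibility at
# one world makes "possibly p" hold at every world.
print(all(possibly(w, "p") for w in W))  # True
```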

But the language of metaphysical possibility and necessity, or “alethic” modality, is not the only intensional logic possible. One other such logic is tense logic, which has historical roots in the ancient and medieval philosophy of time but erupted into the modern philosophical consciousness through the work of Arthur Prior. Ordinary tense logic is “multi-modal”, featuring two primitive modalities Gx (x is going to be the case) and Hx (x has been the case). These can be combined to express many statements about time, such as Gx → GHx (if x is going to be the case, then it is going to be the case that x has been the case). Ordinary tense logic requires an accessibility relation modeling the order of instants of time before and after other instants, and Montague chooses a linear order like “less than or equal to”, which is reflexive, transitive, and *antisymmetric* (if a is accessible from b, and b is accessible from a, then a must be b). Gx is true if x is true at some instant of time following (accessible from) the instant of time under consideration, and Hx is true if x was true at some instant of time which the instant of time under consideration is accessible from.
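These tense operators can be sketched over a tiny invented timeline, with integers ordered by ≤ standing in for the instants:

```python
# Tense operators over a toy linear timeline, sketched in Python.
# Instants are integers ordered by <=; since that order is reflexive,
# the present instant counts as accessible from itself.
timeline = {0: {"rain"}, 1: set(), 2: {"sun"}}

def G(instant, atom):
    # Gx: x is true at some accessible (present-or-later) instant
    return any(atom in facts for t, facts in timeline.items() if t >= instant)

def H(instant, atom):
    # Hx: x was true at some instant from which this one is accessible
    return any(atom in facts for t, facts in timeline.items() if t <= instant)

print(G(0, "sun"))   # True: "sun" holds at instant 2
print(H(2, "rain"))  # True: "rain" held at instant 0
print(G(2, "rain"))  # False: no present-or-later instant has "rain"
```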

There is one further application of modality in Montague Grammar: intensions. The arbitrary mathematical entities representing things and truth-values in Montague Grammar are called “e” and “t”: Montague says we can think of them as the numbers 0 and 1, if we like (i.e. it’s not important what they actually are). He adds to this a third entity, “s”, representing “senses” or intensions: these three categories combine type-theoretically, as in <s,t>, a function from senses to truth values. Generally speaking an intension is a function from possible worlds to “extensions”, either things or truth-values: the intension of “blue” would be a set of ordered pairs, each pairing a possible world with the set of things that are blue in that world. Carnap developed intensions as a way of modeling Frege’s idea of the *Sinn* or “cognitive significance” of a word, not just what it happened to refer to but what it would refer to if the world were different (or, as the case might very well be, we *believed* it was different): the adequacy of this as an interpretation of Frege has been hotly contested for decades, but a highly ramified use of intensions is critical for Montague’s analysis of the meaning of words, even common nouns like “ball”.
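A sketch of the idea in Python, with two invented worlds: the intension of “blue” maps each world to an extension (a set of things), and an <s,t>-style proposition maps each world to a truth-value.

```python
# Intensions as functions from possible worlds to extensions. The
# worlds and objects here are invented for illustration; Montague's
# actual indices are richer than bare worlds.
blue_things = {"w1": {"ball", "sky"}, "w2": {"ball"}}

def blue(world):
    """Intension of 'blue': world -> extension (a set of things)."""
    return blue_things[world]

def sky_is_blue(world):
    """An <s,t>-style proposition: world -> truth-value."""
    return "sky" in blue(world)

print(sky_is_blue("w1"))  # True
print(sky_is_blue("w2"))  # False: same sense, different extension
```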

Having discussed categorial grammar, I can introduce a logical notation employed by Montague which in some respects runs counter to it in intention: the “lambda calculus”. In the early 1930s, the logician Alonzo Church was searching for an alternative to axiomatic set theory for formulating fundamental mathematical principles. Instead of the “functions-as-graphs” concept set theory borrowed from Frege, Church wanted to use the more intuitive conception we have of mathematical functions as methods or rules for deriving an answer by following precise steps. The solution he came up with, the lambda calculus, involves two complementary ideas — a way of specifying the methodical content of a function, and a way of computing that function for a specific value.

The first operation, called “function abstraction”, takes a variable and indicates the procedure the value of the variable is to be substituted into. The variable is written after a lower-case Greek lambda (from whence the name) and before a period separating it from the expression of the procedure: λx.x+2 signifies the function that takes a number and adds two to it. Function abstractions can be nested within other function abstractions: using a procedure developed before Church by Moses Schoenfinkel (but known as “currying”, after Haskell Curry, who developed ideas similar to Church’s contemporaneously) functions of two variables can be represented by iteration of function abstractions using only one variable: λx.λy.(x+y) represents the familiar procedure of adding one variable to another. As originally conceived, lambda abstraction could employ predicates as well: λP.”John is P” would symbolize a function taking any predicate and applying it to John — but although a variant of this predicate abstraction is crucially important for Montague Grammar, it is fraught with peril, as I will explain below.
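Python’s own `lambda` descends directly from this notation, so currying can be sketched almost verbatim:

```python
# lambda x. lambda y. (x+y) as nested one-argument abstractions ("currying").
add = lambda x: lambda y: x + y

print(add(2)(3))     # 5: apply to 2, then apply the result to 3
plus_two = add(2)    # partial application falls out for free
print(plus_two(40))  # 42
```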

Like quantifiers, the lambda expression *binds* the variable in the expression: if not all the variables in an expression are bound or “spoken for” by a lambda, either directly or by currying, then the function is not fully defined. If the function is fully defined, then we can generate a result by the operation of “function application”: written (f)x, it returns the value of the function f for the value x. The application (λx.x+2)3 returns the value 5, for example, since 3 is substituted in the expression x+2 and that expression is evaluated. So far, the lambda calculus might seem superfluous, since we already know how to carry out the operation of defining a function and evaluating it. However, “currying” gives a little taste of the power of defining mathematical concepts this way: and in fact all the mathematical objects used in set theory can be given “functional” definitions using the lambda calculus.
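The application example can be checked directly in Python:

```python
# (lambda x. x+2) applied to 3: substitute 3 for x and evaluate.
add_two = lambda x: x + 2
print(add_two(3))            # 5
print((lambda x: x + 2)(3))  # 5: the same application, written inline
```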

For example, function abstraction doesn’t have to be tied to a “concrete” mathematical procedure: we can put a lambda next to a variable ranging over functions, and define a “function of functions” like composition. Even the natural numbers can be defined using the lambda calculus: in “Church numerals”, the number 0 is represented as λf.λx.x, a function which takes another function and applies it to x 0 times, returning x. In fact, the lambda calculus was *so* powerful one could easily derive a contradiction similar to Russell’s paradox for naive set theory by abstracting over predicates, as was quickly noticed by the pioneering computability theorists Stephen Kleene and J. B. Rosser. There are two ways around this. One way is to stick with the contradiction-free fragment of lambda calculus abstracting only over functions, the “λI-calculus”; and although this notation is not powerful enough to derive all set-theoretic concepts it is far from useless, as it is expressively equivalent to the formal model of computation devised by Church’s student Alan Turing, the “Turing Machine”.
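Church numerals can likewise be sketched in Python (the decoding helper `to_int` is my addition, purely for display):

```python
# Church numerals: the numeral n is a function that applies f to x
# n times. lambda f. lambda x. x is zero.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))

def to_int(n):
    """Decode a Church numeral by counting applications of +1."""
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(zero))   # 0
print(to_int(three))  # 3

# Addition: apply f m times on top of applying it n times.
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
print(to_int(add(three)(three)))  # 6
```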

Consequently, virtually all programming languages employ methods similar to the lambda calculus for specifying subroutines, and the “functional” programming languages directly emulate the lambda calculus’s ability to specify functions of functions (in fact, programs written in them are “desugared” into a version of the lambda calculus during compilation). However, the λI-calculus isn’t quite enough for the purposes of Montague Grammar — so I need to say a little bit about the other way around the paradoxes, the typed lambda calculus. The theory of types was introduced by Bertrand Russell to deal with the set-theoretic paradoxes: in all its forms, it amounts to carefully circumscribing the “levels” involved in a mathematical operation, to prevent paradoxical entities like “the set of all sets which are members of themselves”. Using a variant of the theory of types developed by Frank Ramsey, Church introduced a contradiction-free version of the lambda calculus. In the typed lambda calculus, the “type” of the function input and its output are both specified using the notation A→B: the function can only accept inputs of type A and return outputs of type B.

As before, this restriction becomes more intelligible when you consider more complicated formulations: a function which takes a function as argument and returns another function would be typed (A→B)→(A→B); the input must have the type of a function, A→B. Going into how Montague Grammar uses typed lambda expressions would be too much too soon, but it is critically important that the prospective Montague Grammarian develop some facility with them. (If my exposition has left you cold, the paper-puzzle game Alligator Eggs may trick you into “getting” these concepts.)
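The discipline of A→B types can be sketched in Python with type annotations; a function typed (A→B)→(A→B) must take a function and return one:

```python
# Simple types in the spirit of A->B, sketched with Python annotations.
from typing import Callable

def twice(f: Callable[[int], int]) -> Callable[[int], int]:
    """(A->B)->(A->B): takes an int-to-int function, returns another."""
    return lambda x: f(f(x))

inc: Callable[[int], int] = lambda x: x + 1
print(twice(inc)(0))  # 2: increments twice
# twice(3) would be ill-typed: the input must itself be an A->B function.
```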

I’ll start off the series on Montague Grammar today. The exposition will follow Montague’s paper “The Proper Treatment of Quantification in Ordinary English”, since that has historically been the most influential of Montague’s semantic writings: I guess I might say something about “Universal Grammar” at the end, although I expect the reader will find mastering the ideas of “PTQ” to be more than enough. As for acquiring your own copy, “PTQ” was published after Montague’s untimely death in a Synthese volume, then reprinted in *Formal Philosophy*, Montague’s collected papers. Most large university libraries will have *Formal Philosophy* and the paper is (rather unhappily) short and suitable for photocopying: however, it was also recently made available again in the anthology *Formal Semantics*, co-edited by Barbara Partee (a linguist who is responsible for Montague’s posthumous influence in that discipline).

The paper has four sections: the first is devoted to syntactic rules for a fragment of English — which is small, but includes a number of “intensional” verbs that make trouble for less complicated semantic approaches. These syntactic rules make use of *categorial grammar*, the topic of this post. I think explaining categorial grammar in sufficient generality, for people who might have no more “mathematical sophistication” than I had when I started out reading this stuff, will require going pretty far back: back, in fact, to Dirichlet’s definition of a mathematical function. When we are using them naively, to calculate results, mathematical functions seem to be “rules” which we follow to arrive at a certain result. But this approach is fraught with unclarity, and a major advance in the foundations of mathematics occurred when Peter Gustav Lejeune Dirichlet defined a mathematical function as a collection of ordered pairs, one element of a pair being from the domain and another from the range; such a structure is known as a “graph”.

Now, if you took high school algebra after the introduction of “New Math” (i.e., are not seventy years old) someone once tried to teach you this definition; maybe it even took. But the real power of Dirichlet’s definition comes when you consider “higher-order” functions like composition, where you feed the results of one function into another function. Getting the composition of “functions-as-rules” straight in your head is very tricky, but the definition in terms of graphs is simple; just as one function can be represented by the ordered pairs <d, r>, composition can be represented by an ordered pair containing an ordered pair of the two functions being composed, and the composed function with the first function’s domain and (a subset of) the second function’s range: <<<a, b>,<c,d>>,<a,d>>. In this way, you can explain the functional articulation of mathematical concepts with ease.
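The graph picture can be sketched directly: a function is just a set of <domain, range> pairs, and composition chains those pairs together (the particular pairs below are invented):

```python
# Functions-as-graphs: a function is a set of ordered pairs, and
# composition is defined purely on those pairs.
f = {(1, "a"), (2, "b")}         # graph of f : {1,2} -> {"a","b"}
g = {("a", True), ("b", False)}  # graph of g : {"a","b"} -> {True,False}

def compose(f, g):
    """Graph of g∘f: pair <d, r> whenever f sends d to m and g sends m to r."""
    return {(d, r) for (d, m1) in f for (m2, r) in g if m1 == m2}

print(sorted(compose(f, g)))  # [(1, True), (2, False)]
```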

What does all this have to do with the semantics of natural language? Well, enter Gottlob Frege. Frege’s attempt to formalize the language of mathematics required an analysis of language which divided up parts of speech in a really novel way, inventing the *quantifiers* we are familiar with today: functions-as-graphs are at the heart of his method. For example, Frege analyzed predicates (“x is red”) as functions from objects to truth-values: if an object possesses the property described by a predicate, the function maps it onto the truth-value “true”, and if not onto “false”. That might seem obvious, but other parts of speech can be given more complex but illuminating glosses in this manner: the Polish logician Kazimierz Ajdukiewicz consequently took Frege’s syntactic analysis and formalized it as *categorial grammar*.

The building-blocks of categorial grammar are noun phrases (often written “N”), sentences (“S”), and functional relationships between them (symbolized by a slash, with the result written before the argument): the predicate example given above would be written “S/N”, since it takes a noun and returns a sentence. An adverb, which takes a predicate and returns another predicate, would be written (S/N)/(S/N). Now, Montague added one further twist to the categorial-grammar framework; he noticed that some expressions of English were categorially equivalent, yet commonly identified as different parts of speech. For example, some verb phrases modifying another verb phrase (“try to”) would have the same analysis as the adverbs above. To “save the appearances”, he used a double slash, e.g. “S//N”, to keep one set of expressions distinct from the others.
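A toy version of this machinery can be sketched in Python with an invented three-word lexicon; a functor category is represented as a (result, argument) pair rather than with slashes, so the combination rule is just type-checked application:

```python
# A toy categorial-grammar checker. A category is either basic
# ("N", "S") or a (result, argument) pair for functor categories.
lexicon = {
    "John": "N",
    "walks": ("S", "N"),                 # takes a noun, gives a sentence
    "slowly": (("S", "N"), ("S", "N")),  # takes a predicate, gives a predicate
}

def combine(functor, argument):
    """Apply a functor category to an argument category, if the types fit."""
    result, expected = functor
    if expected != argument:
        raise TypeError("category mismatch")
    return result

vp = combine(lexicon["slowly"], lexicon["walks"])  # back to a predicate
sentence = combine(vp, lexicon["John"])            # a sentence
print(sentence)  # S
```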