The
Principles
of Programming

THE THEORY UNDERLYING COMPUTING LANGUAGES

BYRON SPICE AND KEVIN O’CONNELL

If you ask Computer Science Professor Bob Harper, he’ll tell you that “Mommy, go’ed to the store” is more than a statement a small child might make. Though grammatically incorrect, the statement demonstrates that the child understands that adding “ed” to the end of a verb will make it past tense. By using this simple algorithm — defined as a series of sequential steps involved in any process — the child generates a sentence they have never heard anyone speak, and has learned without being taught how to use a variable as a part of the language acquisition process.

Harper is a member of the School of Computer Science’s Principles of Programming group (PoP), which focuses more on the philosophy behind how humans communicate with machines and each other and less on the subtleties of syntax in programming code. “Whatever a program is, it’s a way of expressing ideas,” said Harper. “Fundamentally, what we’re doing is talking to each other.”

Even though we generally think of algorithms strictly in terms of computing, that concept may be limiting, and human cognition shouldn’t be overlooked. The late Dutch mathematician and philosopher L.E.J. Brouwer promoted the idea that math is a linguistic activity and that, as with the child who can fashion a primitive sentence, humans are born with an instinctive understanding of the concept of an algorithm — distinguishing them from other animals.

This concept, known as “intuitionism,” flew in the face of conventional wisdom at the time. But it makes sense to Harper. “It’s a very down-to-earth conception, which I like,” he said.

Bob Harper

Bob Harper, Professor of Computer Science

Theory A and Theory B

The theory of computation developed in the mid-20th century has two major schools of thought. The first, led by British mathematician Alan Turing, conceived of programs as read and write functions that acted on computer memory. Commonly known as Theory A, Turing’s theory is based on abstract mathematical models of computation, or Turing machines, and defines computability in terms of what a Turing machine can compute in step-by-step execution of algorithms.

The second theory, Theory B, originated with American mathematician and computer scientist Alonzo Church, who defined computability in terms of functions that can be computed using a formal system for expressing computation, known as lambda calculus (λ-calculus).

Church’s idea enabled fully expressive languages that didn’t rely on numbers, but on deductive reasoning. He proposed that any effectively calculable function can be expressed in lambda calculus. With lambda calculus and its use of variables at its core, Theory B profoundly influenced the development of programming languages, especially functional programming languages such as Lisp and Haskell. It also played a role in the theory of programming semantics.

Turing’s theory has been embraced by most American computer science departments, while Church’s theory enjoys more popularity in Europe, Harper said. And while both theories have been proven to be equally computable, Harper and his PoP colleagues feel the approach of the latter lends itself to a more complete and therefore more productive understanding of programming.

“It’s called Church’s thesis, but it should be called Church’s law,” Harper said. “As far as I’m aware, it’s the only scientific law in computer science.” There’s nothing directly analogous to it in a Turing machine, he added.

Though he’s not part of PoP, but rather “more of an algorithms guy,” Daniel Sleator, a professor in the Computer Science Department, tends to agree with Harper — at least in theory — about the usefulness of programming based on Church’s approach.

“In a language like Java or C++, you have to festoon your program with all the types of all the objects in the program,” Sleator said. “You have to write them all out. Whereas functional languages can derive the type based on the way you’re using these variables.” Sleator also noted that his programs almost always work — and work the first time — when programming in OCaml rather than when writing in Java or C++, which often need debugging.

The PoP Group

CMU has been the leading American outpost for Theory B since Dana Scott, Church’s former student and a Turing Award winner, joined the Computer Science Department in 1982. Scott recruited such stalwarts to the department as John Reynolds, Stephen Brookes, Frank Pfenning and Peter Lee, “making CMU one of the premier places in the world for PoP,” Harper added.

Church’s use of variables demands that programmers pay attention to how they compose their programs. PoP places a strong emphasis on formal methods, a branch of computer science that uses mathematical techniques to specify and verify software systems. These methods also delve into formal verification, which involves mathematically proving the correctness of software and ensuring it behaves as intended. This focus on formality is more rigorous and theoretical compared to standard programming practices that often rely on testing and debugging.

During the 1980s and 1990s, PoP’s focus on formal methods gained prominence. While these methods offered rigorous ways to ensure software correctness, some people saw them as overly theoretical and detached from real-world programming challenges. This misconception caused skepticism and occasional resistance from other computer scientists focused on programming languages.

Harper worries that most computer scientists think they know about programming languages but fail to truly understand the philosophical underpinnings of languages and the computation involved in them.

Sleator understands Harper’s concerns. Through his years of teaching programming, Sleator has learned to value languages based on both theories, and also respects the practical idea of using different languages for different purposes — and thinking about a problem creatively enough to know when to employ a certain language to accomplish a particular task. The danger comes in over relying on languages students learn early in their education, which can lead to inefficient programs.

“The kids love to use Python, but it’s not a great language for algorithmic programming. It’s slow,” Sleator said. “The point of algorithms is you’ll want your code to run fast. That’s the whole point of the field — we figure out how to speed up algorithms.”

Dana Scott

Dana Scott, Hillman University Professor of Computer Science, Philosophy and Mathematical Logic (Emeritus)

Daniel Sleator

Daniel Sleator, Professor in CSD

Alan J. Perlis

Computer Language Archetype

Alan J. Perlis

Laying the foundation for the Principles of Programming group at CMU, as well as for nearly all programming languages since, Alan Jay Perlis (April 1, 1922 – February 7, 1990) was a visionary computer scientist who reshaped the landscape of programming languages, software engineering and computer science education. Born in Pittsburgh, Perlis made many seminal contributions while a professor at CMU.

Perlis co-developed Fortran, the first programming language specifically designed for symbolic mathematical computations, which significantly influenced the development of subsequent programming languages. He also played a crucial role in the early development of Lisp, one of the oldest high-level programming languages. Lisp introduced the concept of symbolic processing and recursion.

Perlis was an early pioneer in the development of time-sharing systems, which allow multiple users to interact with a computer simultaneously, laying the foundation for concurrent computing systems. Additionally, he advocated for structured programming and software engineering principles, emphasizing the importance of clear and modular code design to improve program reliability and maintainability.

Perlis’s oft quoted book “Epigrams in Programming” is a witty collection of thought-provoking statements about programming and computer science which continue to provide insights and guidance for programmers today. His emphasis on the importance of teaching programming as a problem-solving skill continues to impact curriculum development at CMU and the field of computer science pedagogy more broadly.

Alan Perlis was the first recipient of the ACM Turing Award in 1966 for his contributions to computer programming languages, and his work, ideas and accomplishments continue to influence how we design, write, and teach software today.

Developments in Type Theory

Under Harper’s guidance, one area of emphasis in PoP’s work has been type theory — a foundational concept that deals with classifying data based on its category or type. Type theory provides a formal framework for specifying and verifying the behavior of programs, ensuring they operate correctly and safely.

Harper’s work specifically has hugely influenced type theory, garnering him the 2021 ACM SIGPLAN Programming Languages Achievement Award for his impact on the foundation and design of programming languages, type systems and formal methods.

His work on type systems for programming languages also helped create languages that are both expressive and safe, striking a balance between flexibility and correctness. One obvious example is Standard ML (sML), a statically typed functional programming language known for its powerful type system that has influenced the development of subsequent programming languages like OCaml, Haskell and others.

In addition, Harper is known for his work with type systems for program modules, which are separable components that are needed to manage the sheer complexity of programs and to help share code across implementations.

A sophisticated form of type theory known as dependent types, where types can depend on values, has been another area of focus for Harper. Dependent types prove extremely valuable in ensuring program correctness and have applications in areas like formal verification.

For a deeper dive into Harper’s work, including his work on dependent types, check out the book, “Practical Foundations for Programming Languages,” which explores the theory and practical aspects of dependent types and is the text he uses for the undergraduate course he teaches on Foundations of Programming Languages.

Guy Blelloch

Guy Blelloch, Professor in CSD

Toward the Future

Since its founding, the PoP group has clearly made important contributions and exerted a large influence on the programming languages field. Current members and alumni have applied PoP ideas to fields beyond type theory and formal verification. Technologies from block chain to quantum computing have benefited from programming languages rooted in lambda calculus because of their elegance, simplicity and efficiency.

And there are many examples close to home in SCS, even if we begin talking about blending the ideas within Theories A and B.

A few examples: Guy Blelloch, a professor in CSD, studies parallelism — particularly efficient parallel algorithms and data structures — and his work holds wide-ranging implications for concurrent computing systems.

Harper’s work in graduate school in the later 1980s at the University of Edinburgh led to development of the Logical Framework (LF), also known as the Edinburgh Logical Framework (ELF). LF, co-authored by Furio Honsell and Gordon Plotkin, which provides a means to define (or present) logics and is based on a general treatment of syntax, rules and proofs by means of a typed λ-calculus with dependent types.

After Harper came to CMU, he collaborated with Frank Pfenning and Karl Crary, both professors of computer science and PoP members, on work with LF, resulting in a powerful tool for precisely and formally specifying the syntax and semantics of programming languages. Harper worked closely with Crary on the verification of an important property of a language called type safety, which has mechanized the proof of safety of the entire Standard ML language using the Twelf implementation of LF developed by Pfenning and his students. Twelf, a research project funded by the National Science Foundation, provides a uniform meta-language for specifying, implementing and proving properties of programming languages and logics.

In addition to these works and the many others happening within CMU, Sleator noted that some languages rooted in Turing machine-style computation are beginning to take on aspects traditionally thought to be within the domain of Theory B and λ-calculus. Popular languages like C++ are adopting concepts rooted in Church’s theory that PoP championed decades ago.

“In C++ there’s a type called ‘Auto,’ which says, ‘The programmer can write ‘Auto’ in front of the variable instead of saying ‘Integer Float,’ or whatever it is, and simply say ‘Auto,’ and then the language can decide what type it is,” Sleator said. “They were doing that 30 years ago in these sML-type languages.”

The recent inclusion of these lambda expressions in languages like C++, demonstrates that programmers value such functions and want them in the languages they use, Sleator said.

And while the group has had — and will continue to have — an incredible, practical impact on programming languages, Harper takes a more philosophical approach.

He sees, at its core, something much deeper in their work.

“Computation is fundamental to who we are as a species,” Harper said. “Computing, so conceived, is more fundamental than math.”

As the gap between the two theories draws closer, in some circles, those working closely in the theory of programming languages are finding value beyond the traditional approaches to programming a computer or writing a code. We are learning far more about ourselves.

“God is trying to tell us something,” said Harper.  ■

Frank Pfenning

Frank Pfenning, Professor in CSD

Karl Crary

Karl Crary, Professor in CSD