Go at Google

SPLASH, Tucson, Oct 25, 2012

Rob Pike

Google, Inc.

Preamble

What is Go?

Go is:

History

Design began in late 2007.

Key players:

Became open source in November 2009.

Developed entirely in the open; very active community.
Language stable as of Go 1, early 2012.

Go at Google

Go is a programming language designed by Google to help solve Google's problems.

Google has big problems.

Big hardware

Big software

And of course:

Development at Google can be slow, often clumsy.

But it is effective.

The reason for Go

Goals:

Go was designed by and for people who write—and read and debug and maintain—large software systems.

Go's purpose is not research into programming language design.

Go's purpose is to make its designers' programming lives better.

Today's theme

A talk about software engineering more than language design.

To be more accurate:

In short:

Features?

Reaction upon launch:
My favorite feature isn't in Go! Go Sucks!

This misses the point.

Pain

What makes large-scale development hard with C++ or Java (at least):

Language features don't usually address these.

Focus

In the design of Go, we tried to focus on solutions to these problems.

Example: indentation for structure vs. C-like braces

Dependencies in C and C++

A personal history of dependencies in C

#ifdef

#ifndef "guards":

#ifndef _SYS_STAT_H_

1984: <sys/stat.h> times 37

ANSI C and #ifndef guards:

A personal history of dependencies in Plan 9's C

Plan 9, another take:

Need to document dependencies, but much faster compilation.

In short:

A personal history of dependencies in C++

C++ exacerbated the problem:

2004: Mike Burrows and Chubby: <xxx> times 37,000

1984: Tom Cargill and pi

A personal history of dependencies at Google

Plan 9 demo: a story

Early Google: one Makefile

2003: Makefile generated from per-directory BUILD files

Dependencies still not checkable!

Result

To build a large Google binary on a single computer is impractical.

In 2007, instrumented the build of a major Google binary:

Tools can help

New distributed build system:

Even with Google's massive distributed build system, a large build still takes minutes.
(In 2007 that binary took 45 minutes; today, 27 minutes.)

Poor quality of life.

Enter Go

While that build runs, we have time to think.

Want a language to improve the quality of life.

And dependencies are only one such problem....

Primary considerations

Must work at scale:

Must be familiar, roughly C-like

Modernize

The old ways are old.

Go should be:

The design of Go

From a software engineering perspective.

Dependencies in Go

Dependencies

Dependencies are defined (syntactically) in the language.

Explicit, clear, computable.

import "encoding/json"

Unused dependencies cause error at compile time.

Efficient: dependencies traversed once per source file...

Hoisting dependencies

Consider:
A imports B imports C but A does not directly import C.

The object code for B includes all the information about C needed to import B.
Therefore in A the line

import "B"

does not require the compiler to read C when compiling A.

Also, the object files are designed so the "export" information comes first; compiler doing import does not need to read whole file.

Exponentially less data read than with #include files.

With Go in Google, about 40X fanout (recall C++ was 2000x)
Plus in C++ it's general code that must be parsed; in Go it's just export data.

No circular imports

Circular imports are illegal in Go.

The big picture in a nutshell:

Forces clear demarcation between packages.

Simplifies compilation, linking, initialization.

API design

Through the design of the standard library, great effort spent on controlling dependencies.

It can be better to copy a little code than to pull in a big library for one function.
(A test in the system build complains if new core dependencies arise.)

Dependency hygiene trumps code reuse.

Example:
The (low-level) net package has own itoa to avoid dependency on the big formatted I/O package.

Packages

Packages

Every Go source file, e.g. "encoding/json/json.go", starts

package json

where json is the "package name", an identifier.
Package names are usually concise.

To use a package, need to identify it by path:

import "encoding/json"

And then the package name is used to qualify items from package:

var dec = json.NewDecoder(reader)

Clarity: can always tell if name is local to package from its syntax: Name vs. pkg.Name.
(More on this later.)

Package combines properties of library, name space, and module.

Package paths are unique, not package names

The path is "encoding/json" but the package name is json.
The path identifies the package and must be unique.
Project or company name at root of name space.

import "google/base/go/log"

Package name might not be unique; can be overridden. These are both package log:

import "log"                          // Standard package
import googlelog "google/base/go/log" // Google-specific package

Every company might have its own log package; no need to make the package name unique.

Another: there are many server packages in Google's code base.

Remote packages

Package path syntax works with remote repositories.
The import path is just a string.

Can be a file, can be a URL:

go get github.com/4ad/doozer   // Command to fetch package

import "github.com/4ad/doozer" // Doozer client's import statement

var client doozer.Conn         // Client's use of package

Go's Syntax

Syntax

Syntax is not important...

Tooling is essential, so Go has a clean syntax.
Not super small, just clean:

Declarations

Uses Pascal/Modula-style syntax: name before type, more type keywords.

var fn func([]int) int
type T struct { a, b int }

not

int (*fn)(int[]);
struct T { int a, b; }

Easier to parse—no symbol table needed. Tools become easier to write.

One nice effect: can drop var and derive type of variable from expression:

var buf *bytes.Buffer = bytes.NewBuffer(x) // explicit
buf := bytes.NewBuffer(x)                  // derived

For more information:

Function syntax

Function on type T:

func Abs(t T) float64

Method of type T:

func (t T) Abs() float64

Variable (closure) of type T:

negAbs := func(t T) float64 { return -Abs(t) }

In Go, functions can return multiple values. Common case: error.

func ReadByte() (c byte, err error)

c, err := ReadByte()
if err != nil { ... }

More about errors later.

No default arguments

Go does not support default function arguments.

Why not?

Extra verbosity may happen but that encourages extra thought about names.

Related: Go has easy-to-use, type-safe support for variadic functions.

Naming

Export syntax

Simple rule:

Applies to variables, types, functions, methods, constants, fields....

That Is It.

Not an easy decision.
One of the most important things about the language.

Can see the visibility of an identifier without discovering the declaration.

Clarity.

Scope

Go has very simple scope hierarchy:

Locality of naming

Nuances:

No surprises when importing:

Names do not leak across boundaries.

In C, C++, Java the name y could refer to anything
In Go, y (or even Y) is always defined within the package.
In Go, x.Y is clear: find x locally, Y belongs to it.

Function and method lookup

Method lookup by name only, not type.
A type cannot have two methods with the same name, ever.
Easy to identify which function/method is referred to.
Simple implementation, simpler program, fewer surprises.

Given a method x.M, there's only ever one M associated with x.

Semantics

Basics

Generally C-like:

Should feel familiar to programmers from the C family.

But...

Many small changes in the aid of robustness:

Plus some big ones...

Bigger things

Some elements of Go step farther from C, even C++ and Java:

Concurrency

Concurrency

Important to modern computing environment.
Not well served by C++ or even Java.

Go embodies a variant of CSP with first-class channels.

Why CSP?

Must be able to couple concurrency with computation.

Example: concurrency and cryptography.

CSP is practical

For a web server, the canonical Go program, the model is a great fit.

Go enables simple, safe concurrent programming.
It doesn't forbid bad programming.

Focus on composition of regular code.

Caveat: not purely memory safe; sharing is legal.
Passing a pointer over a channel is idiomatic.

Experience shows this is a practical design.

Garbage collection

The need for garbage collection

Too much programming in C and C++ is about memory allocation.
But also the design revolves too much about memory management.
Leaky abstractions, leaky dependencies.

Go has garbage collection, only.

Needed for concurrency: tracking ownership too hard otherwise.
Important for abstraction: separate behavior from resource management.
A key part of scalability: APIs remain local.

Use of the language is much simpler because of GC.
Adds run-time cost, latency, complexity to the implementation.

Day 1 design decision.

Garbage collection in Go

A garbage-collected systems language is heresy!
Experience with Java: Uncontrollable cost, too much tuning.

But Go is different.
Go lets you limit allocation by controlling memory layout.

Example:

type X struct {
    a, b, c int
    buf [256]byte
}

Example: Custom arena allocator with free list.

Interior pointers

Early decision: allow interior pointers such as X.buf from previous slide.

Tradeoff: Affects which GC algorithms that can be used, but in return reduces pressure on the collector.

Gives the programmer tools to control GC overhead.

Experience, compared to Java, shows it has significant effect on memory pressure.

GC remains an active subject.
Current design: parallel mark-and-sweep.
With care to use memory wisely, works well in production.

Interfaces

Composition not inheritance

Object orientation and big software

Go is object-oriented.
Go does not have classes or subtype inheritance.

What does this mean?

No type hierarchy

O-O is important because it provides uniformity of interface.
Outrageous example: the Plan 9 kernel.

Problem: subtype inheritance encourages non-uniform interfaces.

O-O and program evolution

Design by type inheritance oversold.
Generates brittle code.
Early decisions hard to change, often poorly informed.
Makes every programmer an interface designer.
(Plan 9 was built around a single interface everything needed to satisfy.)

Therefore encourages overdesign early on: predict every eventuality.
Exacerbates the problem, complicates designs.

Go: interface composition

In Go an interface is just a set of methods:

type Hash interface {
    Write(p []byte) (n int, err error)
    Sum(b []byte) []byte
    Reset()
    Size() int
    BlockSize() int
}

No implements declaration.
All hash implementations satisfy this implicitly. (Statically checked.)

Interfaces in practice: composition

Tend to be small: one or two methods are common.

Composition falls out trivially. Easy example, from package io:

type Reader interface {
    Read(p []byte) (n int, err error)
}

Reader (plus the complementary Writer) makes it easy to chain:

Dependency structure is not a hierarchy; these also implement other interfaces.

Growth through composition is natural, does not need to be pre-declared.

And that growth can be ad hoc and linear.

Compose with functions, not methods

Hard to overstate the effect that Go's interfaces have on program design.

One big effect: functions with interface arguments.

func ReadAll(r io.Reader) ([]byte, error)

Wrappers:

func LoggingReader(r io.Reader) io.Reader
func LimitingReader(r io.Reader, n int64) io.Reader
func ErrorInjector(r io.Reader) io.Reader

The designs are nothing like hierarchical, subtype-inherited methods.
Much looser, organic, decoupled, independent.

Errors

Error handling

Multiple function return values inform the design for handling errors.

Go has no try-catch control structures for exceptions.
Return error instead: built-in interface type that can "stringify" itself:

type error interface { Error() string }

Clear and simple.

Philosophy:

Forces you think about errors—and deal with them—when they arise.
Errors are normal. Errors are not exceptional.
Use the existing language to compute based on them.
Don't need a sublanguage that treats them as exceptional.

Result is better code (if more verbose).

(OK, not all errors are normal. But most are.)

Tools

Tools

Software engineering requires tools.

Go's syntax, package design, naming, etc. make tools easy to write.

Standard library includes lexer and parser; type checker nearly done.

Gofmt

Always intended to do automatic code formatting.
Eliminates an entire class of argument.
Runs as a "presubmit" to the code repositories.

Training:

Sharing:

Scaling:

Often cited as one of Go's best features.

Gofmt and other tools

Surprise: The existence of gofmt enabled semantic tools:
Can rewrite the tree; gofmt will clean up output.

Examples:

And good front-end libraries enable ancillary tools:

Gofix

The gofix tool allowed us to make sweeping changes to APIs and language features leading up to the release of Go 1.

Also allows us to update code even if the old code still works.

Recent example:

Changed Go's protocol buffer implementation to use getter functions; updated all Google Go code to use them with gofix.

Conclusion

Go at Google

Go's use is growing inside Google.

Several big services use it:

Many small ones do, many using Google App Engine.

Go outside Google

Many outside companies use it, including:

What's wrong?

Not enough experience yet to know if Go is truly successful.
Not enough big programs.

Some minor details wrong. Examples:

Gofix and gofmt gave us the opportunity to fix many problems, ranging from eliminating semicolons to redesigning the time package.
But we're still learning (and the language is frozen for now).

The implementation still needs work, the run-time system in particular.

But all indicators are positive.

Summary

Software engineering guided the design.
But a productive, fun language resulted because that design enabled productivity.

Clear dependencies
Clear syntax
Clear semantics
Composition not inheritance
Simplicity of model (GC, concurrency)
Easy tooling (the go tool, gofmt, godoc, gofix)

Try it!

Thank you