Go Dynamic Tools

Gophercon 2015, July 9, 2015

Dmitry Vyukov

Google

Video

A video of this talk was recorded at GopherCon in Denver.

About me

Did a bunch of work on Go:

But actually on dynamic testing tools team:

Agenda

Data race detector

What is a data race?

A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.

All bets are off !

Any data race can destroy the memory/type-safety of a Go program.

There are no "benign" data races

// goroutine 1       // goroutine 2
m[k1] = v1           m[k2] = v2

Bad!

// goroutine 1       // goroutine 2
stat++               stat++

Also bad!

Compilers assume race-free programs and do aggressive optimizations
based on that assumption (e.g. assume "ownership" over written-to variables).

Races are non-deterministic and hard to debug.

Usage

$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package

That's it!

Example

package main

func main() {
    m := make(map[int]int)
    go func() {
        m[1] = 1
    }()
    m[2] = 2
}

Example report

WARNING: DATA RACE
Write by goroutine 5:
  runtime.mapassign1()
      runtime/hashmap.go:411 +0x0
  main.main.func1()
      race.go:6 +0x60

Previous write by main goroutine:
  runtime.mapassign1()
      runtime/hashmap.go:411 +0x0
  main.main()
      race.go:8 +0xb6

Goroutine 5 (running) created at:
  main.main()
      race.go:7 +0x76

Achievements

Instrumentation

Compiler instrumentation pass enabled by -race.

func foo(p *int) {
    *p = 1
}

Becomes:

func foo(p *int) {
    runtime.funcenter(caller_pc)
    runtime.racewrite(p)
    *p = 1
    runtime.funcexit()
}

Run-time module

Handles:

Algorithm is based on dynamic modelling of happens-before relation:

Usage tips

Dynamic tools are only as good as your tests are.

Go-fuzz

Randomized testing

A different approach to testing that finds [lots of] bugs that other testing approaches do not. Intended mostly for programs that parse complex inputs.

Generate random blob -> feed into program -> see if it crashes -> profit!

Completely random blobs won't uncover lots of bugs.

How can we generate diverse but meaningful inputs that will trigger
nil derefs, off-by-ones, etc?

Coverage-guided fuzzing

Genetic algorithms to the rescue!

Instrument program for code coverage
Collect initial corpus of inputs
for {
    Randomly mutate an input from the corpus
    Execute and collect coverage
    If the input gives new coverage, add it to corpus
}

Example

The following code wants "ABCD" input:

if input[0] == 'A' {
    if input[1] == 'B' {
        if input[2] == 'C' {
            if input[3] == 'D' {
                panic("input must not be ABCD")
            }
        }
    }
}

Corpus progression:

""
"", "A"
"", "A", "AB"
"", "A", "AB", "ABC"
"", "A", "AB", "ABC", "ABCD"

Game over

CRC32 checksum verification in image/png/reader.go

func (d *decoder) verifyChecksum() error {
    if binary.BigEndian.Uint32(d.tmp[:4]) != d.crc.Sum32() {
        return FormatError("invalid checksum")
    }
    return nil
}

Probability that random mutations will alter input in an interesting way and
guess CRC32 at the same time is basically ZERO.

Sonar

Don't need to guess, program knows it!

+ v1 := binary.BigEndian.Uint32(d.tmp[:4])
+ v2 := d.crc.Sum32()
+ __go_fuzz.Sonar(v1, v2)
if v1 != v2 {
    return FormatError("invalid checksum")
}

Then, find v1 in the input and replace it with v2. Done!

Game over 2

Mutations and sonar do low-level changes ("bit-flipping"):

Original:

`<item name="foo"><prop name="price">100</prop></item>`

Mutated:

`<item name="foo"><prop name="price">100</prop><<item>`

Also want high-level changes!

Versifier

Versifier reverse-engineers [text] protocol and learns its structure.

abc          -> alphanum token
123, 1e-2    -> number
"..."        -> quoted
[...]        -> parenthesized
...,...,...  -> list
...\n...\n   -> lines

Then, applies structural mutations to inputs.

Versifier example

Original:

`<item name="foo"><prop name="price">100</prop></item>`

Versified (all valid xml):

<item    name="rb54ana"><item  name="foo"><prop name="price"></prop><prop/></item></item>
<item name=""><prop name="price">=</prop><prop/> </item>
<item name=""><prop F="">-026023767521520230564132665e0333302100</prop><prop/></item>
<item SN="foo_P"><prop name="_G_nx">510</prop><prop name="vC">-9e-07036514</prop></item>
<item name="foo"><prop name="c8">prop name="p"</prop>/}<prop name="price">01e-6</prop></item>
<item name="foo"><item name="foo"><prop JY="">100</prop></item>8<prop/></item>

Algorithm

Achievements

Achievements

fmt.Sprintf("%.[]")
panic: runtime error: index out of range

regexp.MustCompile("((0(0){0}))").ReplaceAllString("00000", "00$00")
panic: runtime error: slice bounds out of range

ioutil.ReadAll(flate.NewReader(strings.NewReader("4LJNIMK\a\x00\x00\xff..\xff.....\xff")))
runs forever

var x = 1/"."[0]
crashes compiler

archive/tar: hang
archive/zip: cap out of range
encoding/gob: stack overflow
encoding/asn1: index out of range
image/jpeg: Decode hangs
image/png: nil deref
math/big: incorrect string->Float conversion
crypto/x509: division by zero
...

Usage

func Fuzz(data []byte) int {
    gob.NewDecoder(bytes.NewReader(data))
    return 0
}
$ go-fuzz-build github.com/dvyukov/go-fuzz/examples/gob
$ go-fuzz -bin=gob-fuzz.zip -workdir=examples/gob

Execution tracer

Execution tracer

Gives insight into dynamic execution of a program.

Captures with nanosecond precision:

Execution tracer

Recap

Thank you

Dmitry Vyukov

Google

Use the left and right arrow keys or click the left and right edges of the page to navigate between slides.
(Press 'H' or navigate to hide this message.)