
Keys, Values and Rules: Three Important Shake Concepts

The title is click-bait! This article actually tries to explain five, not three, important notions in Shake.

These are:
  1. Rules
  2. Keys
  3. Values
  4. The Build Database
  5. Actions

This short blog post was inspired by the hurdles I hit with my Shake-based build after a new Shake version was released with breaking API changes.

Jump to the next section if you are not interested in the why and how of this blog post.

Shake is a rule-based build system, much like GNU make. Like make, it is robust; unlike make, it is pretty fast and supports dynamic build dependencies.

But you knew all that already if you are the target audience of this post, since this post is about me explaining to myself, by explaining to you, how the build tool I have used for years actually works.

Although I used it for years, I never read the paper or wrapped my head around it more than absolutely necessary to get the job done.

When Shake was updated to version 0.16.x, the internal API for custom rules was removed. Until then I was using this API with code mindlessly hacked together until it worked, without actually knowing what I was doing.

But this API change forced me into reading the paper and understanding the wording and concepts. This blog post contains a glossary of some of the Shake terms that I finally understood.

It turned out that I didn't actually need to read all the documentation. As I always do, I mindlessly threw bits of code at the project to make it work, and it did not work. I concluded that mindlessly throwing bits and pieces of code at the project was not sufficient, and since the Shake upgrade was the only change I had made, I assumed I had misunderstood enough about how Shake works to justify digging deeper into it.

Of course, I later discovered that the problem that persuaded me to leave the path of ignorance had nothing to do with how I used the Shake API. It is important to understand that the Shake API is pretty neat, in that once a build program compiles and correctly resolves rules, one can rely on the correctness of the internal mechanisms of Shake.

Important Shake Concepts

Before we start, let me clarify that by build program I mean a Haskell program using Shake that creates a specific set of external outputs from a specific set of (optionally external) inputs by executing arbitrary IO actions, such as invoking external programs.

Rule

A build program basically consists of rules.

A rule maps a key to a build action that creates the actual output artifact the key refers to.

There are two kinds of rules, and there will be an upcoming blog post about extending Shake with custom rules, and that article will explain these two types.

The build action that is stored for a key in a rule must return meta-data that will be kept in a database and passed to the next invocation of that build action.

This bit of meta-data is called a value.

Value

A value is a representation of the content of the artifact generated for a key.

The value is used by the build action to compare old and new build output in order to determine whether the build output is different or not.
More precisely, it only has to determine whether the output is different enough to justify the rebuild of the actions depending on it.

If, for example, the value represents an executable file generated by a compiler, it is possible to directly use the file contents as the value, but it is often faster and requires less disk space to use some placeholder value like an SHA-1 hash, or maybe even a file system modification timestamp.

Shake would apply the build action for that file to "Just" the previous hash, so the action can compare it to the new hash whenever the output file was rebuilt.

The action returns "ChangedRecomputeSame" if the hashes are equal after a rebuild, and Shake then skips rebuilding the artifacts that depend on that file. If the hashes differ, the action returns "ChangedRecomputeDiff", and Shake also rebuilds the dependent artifacts.
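
The comparison described above can be sketched in a few lines. This is only a model, not Shake's API: the two constructors are local stand-ins that borrow the names from Shake (the real type in Development.Shake.Rule has more constructors, as far as I know), and compareHashes is a hypothetical helper name.

```haskell
-- Local stand-ins mirroring the constructor names used by Shake.
-- They are redefined here so the sketch compiles with base alone.
data RunChanged = ChangedRecomputeSame | ChangedRecomputeDiff
  deriving (Show, Eq)

newtype OutputFileHash = OutputFileHash Integer
  deriving (Show, Eq)

-- | Compare the value stored by the previous build (if any) with the
-- hash of the freshly built output. 'compareHashes' is a hypothetical name.
compareHashes :: Maybe OutputFileHash -> OutputFileHash -> RunChanged
compareHashes (Just old) new
  | old == new = ChangedRecomputeSame -- dependents can be skipped
compareHashes _ _ = ChangedRecomputeDiff -- dependents must be rebuilt

main :: IO ()
main = do
  print (compareHashes (Just (OutputFileHash 42)) (OutputFileHash 42))
  print (compareHashes (Just (OutputFileHash 42)) (OutputFileHash 7))
  print (compareHashes Nothing (OutputFileHash 7))
```

A missing previous value (Nothing, i.e. the first build) is treated like a changed output here, which is an assumption of this sketch.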

A value should be represented by a custom Haskell data type.
For example:

    data OutputFileHash = OutputFileHash Integer

Key

A key represents a specific artifact to be generated by a build program.
Key values are used to specify build targets and dependencies.

A key should also be represented by a custom Haskell data type.
For example:

    data OutputFile = OutputFile FilePath
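
To actually use such types with Shake, they need a handful of type class instances, and the key type has to be tied to its value type. The following is a minimal sketch, assuming a Shake version with the RuleResult type family (0.16 or later) and Generic-based default instances from the hashable, binary and deepseq packages; the sample path in main is made up.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeFamilies  #-}

import Development.Shake (RuleResult)
import Development.Shake.Classes (Binary, Hashable, NFData)
import GHC.Generics (Generic)

-- The key: identifies one artifact of the build.
data OutputFile = OutputFile FilePath
  deriving (Show, Eq, Generic)

instance Hashable OutputFile
instance Binary OutputFile
instance NFData OutputFile

-- The value: the meta-data stored for that artifact.
data OutputFileHash = OutputFileHash Integer
  deriving (Show, Eq, Generic)

instance Hashable OutputFileHash
instance Binary OutputFileHash
instance NFData OutputFileHash

-- Tell Shake which value type belongs to which key type.
type instance RuleResult OutputFile = OutputFileHash

main :: IO ()
main = print (OutputFile "out/app", OutputFileHash 42)
```

The instances are what allow Shake to hash, compare and serialise keys and values for its database; the empty instance bodies rely on the Generic defaults.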
     

Build output meta-data database

Shake uses a persistent database, stored in a file, to pass build output meta-data from one build to the next.

This database basically contains a map of keys to results.

After a key has been (re)built, the database entry for that key is updated with the new result.

A build result contains:
  • the value
  • the timestamp of the last rebuild
  • the timestamp of the last time the value changed

Results contain enough information to determine whether dependent artifacts need to be rebuilt or not.
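
How these three fields are enough to decide about rebuilds can be illustrated with a toy model. This is not Shake's actual database code: Result, recordBuild and needsRebuild are hypothetical names, timestamps are modelled as a simple build-run counter, and the rule "rebuild if the dependency changed after the dependent was last built" is my reading of the mechanism.

```haskell
import qualified Data.Map.Strict as Map

-- A "timestamp" in this model is just the number of the build run.
type Step = Int

data Result v = Result
  { resultValue :: v    -- the value
  , built       :: Step -- last time the key was rebuilt
  , changed     :: Step -- last time the value actually changed
  } deriving (Show, Eq)

type Database k v = Map.Map k (Result v)

-- | Record that a key was rebuilt at step 'now' with the given value.
-- 'changed' only advances when the value differs from the stored one.
recordBuild :: (Ord k, Eq v) => Step -> k -> v -> Database k v -> Database k v
recordBuild now key new db = Map.insert key entry db
  where
    entry = case Map.lookup key db of
      Just old | resultValue old == new -> old { built = now }
      _                                 -> Result new now now

-- | A dependent needs a rebuild if the dependency changed after the
-- dependent was last built, or if either entry is missing.
needsRebuild :: Ord k => Database k v -> k -> k -> Bool
needsRebuild db dependent dependency =
  case (Map.lookup dependent db, Map.lookup dependency db) of
    (Just r, Just d) -> changed d > built r
    _                -> True

main :: IO ()
main = do
  let db0 = Map.empty :: Database String Integer
      db1 = recordBuild 1 "lib.o" 111 db0
      db2 = recordBuild 1 "app" 999 db1
      db3 = recordBuild 2 "lib.o" 111 db2 -- rebuilt, same value
      db4 = recordBuild 3 "lib.o" 222 db3 -- rebuilt, new value
  print (needsRebuild db3 "app" "lib.o") -- False
  print (needsRebuild db4 "app" "lib.o") -- True
```

Keeping "built" and "changed" apart is what lets a rebuild with identical output stop propagating to the dependents.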

Action

A Shake Action can do one of two things:
  1. Actually do something like invoking ghc or gcc, i.e. perform IO via liftIO
  2. Depend on other artifacts via their keys, which is done by apply or apply1 in Shake.
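
Both things can be seen in a tiny build program. This sketch uses Shake's built-in file rules instead of a custom rule, so the dependency is declared with need rather than apply1 (as far as I understand, need is built on the same mechanism for file keys). The file names hello.in and hello.out are made up for the example.

```haskell
import Control.Monad.IO.Class (liftIO)
import Development.Shake

main :: IO ()
main = shake shakeOptions $ do
  want ["hello.out"]

  -- Hypothetical input, generated here so the example is self-contained.
  "hello.in" %> \out ->
    liftIO (writeFile out "hello\n")

  "hello.out" %> \out -> do
    -- 2. Depend on another artifact via its key.
    need ["hello.in"]
    -- 1. Actually do something: plain IO, lifted into the Action monad.
    contents <- liftIO (readFile "hello.in")
    liftIO (writeFile out contents)
```

Running the program twice should show the effect of the database: the second run finds nothing to rebuild.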
