{namespace buck.what_makes_buck_so_fast}
/***/
{template .soyweb}
{call buck.page}
{param title: 'What Makes Buck so Fast?' /}
{param content}
Buck exploits a number of strategies to reduce build times.
<h2>A build rule knows all of the inputs that can affect its output</h2>
Buck is designed so that anything that can affect the output of a build rule must be specified
as an input to the build rule: hidden state is not allowed. (This is also important for ensuring that
results are consistent and reproducible for all developers.) Therefore, we can be sure that once a
rule's <code>deps</code> are satisfied, the rule itself can be built. This gives us confidence that
the <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph">DAG</a> formed by build
rules and their <code>deps</code> is complete: every dependency is captured in the graph.
<p>
Having a DAG makes it straightforward for rules to be built in parallel, which can dramatically reduce
build times. The execution model for Buck is very simple: first, the leaf nodes of the graph are
added to a queue of rules to be built. When a thread is available, a rule is removed
from the queue and built. Assuming it is built successfully, it notifies all of the rules that depend
on it that it is done. When a rule gets such a notification, it checks whether all its dependencies have
been satisfied, and if so, it gets added to the queue. Computation proceeds in this manner until all of
the nodes in the graph have gone through the queue.
Therefore, breaking modules into finer-grained dependencies creates opportunities for increased
parallelism, improving throughput.
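<p>
The queue-based model described above can be sketched in a few lines of Python. This is a
single-threaded simulation, not Buck's implementation: the function <code>build_order</code>, the
<code>GRAPH</code> mapping, and the rule names in it are hypothetical, and appending a rule to
<code>order</code> stands in for actually building it. The sketch assumes every dependency also
appears as a key in the mapping.

```python
from collections import deque

def build_order(deps):
    """Return one valid build order for a mapping of rule name to its deps.

    Leaf rules are queued first; a rule is queued once all of its deps
    have been built.
    """
    # Count of unbuilt deps per rule, plus reverse edges (who depends on me).
    remaining = dict((rule, len(rule_deps)) for rule, rule_deps in deps.items())
    dependents = dict((rule, []) for rule in deps)
    for rule, rule_deps in deps.items():
        for dep in rule_deps:
            dependents[dep].append(rule)

    # Leaf nodes have no deps, so they can be built immediately.
    queue = deque(sorted(rule for rule, count in remaining.items() if count == 0))
    order = []
    while queue:
        rule = queue.popleft()
        order.append(rule)  # stand-in for actually building the rule
        # Notify every rule that depends on the one just built.
        for parent in dependents[rule]:
            remaining[parent] -= 1
            if remaining[parent] == 0:
                queue.append(parent)
    return order

# Hypothetical rule names, used only for illustration.
GRAPH = dict([
    ("app", ["ui", "net"]),
    ("ui", ["base"]),
    ("net", ["base"]),
    ("base", []),
])
```

In this sketch <code>ui</code> and <code>net</code> enter the queue at the same moment, as soon as
<code>base</code> is done. With more than one worker thread, that is the point at which they
could be built in parallel.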
<h2>Buck can store the outputs it generates in a cache</h2>
A build rule knows all of the inputs that can affect its output, and therefore it can combine that
information into a hash that represents the total input. This hash is used as
a <em>cache key</em> where the associated value in the cache is the output produced by the rule.
(See {call buck.concept_buckconfig /} for information on how to set up a cache.)
The following information contributes to the cache key for a build rule:
<ul>
<li>The values of the arguments used to define the build rule in the build file.
<li>The contents of any file arguments for the build rule.
<li>The version of Buck being used to build the rule. (This means that upgrading Buck to a new
version invalidates all of the cache keys generated by the old version.)
<li>The cache key for each of the rule's <code>deps</code>.
</ul>
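<p>
As a rough illustration of how these four inputs could be folded into a single key, consider the
sketch below. It is not Buck's actual key format: the function name <code>cache_key</code>, its
parameters, the use of SHA-1, and the way fields are delimited are all assumptions made for the
example.

```python
import hashlib

def cache_key(rule_args, file_contents, buck_version, dep_keys):
    """Combine the four kinds of input listed above into one hex digest.

    rule_args:     mapping of argument name to value from the build file
    file_contents: mapping of file path to the file's bytes
    buck_version:  string identifying the Buck version
    dep_keys:      cache keys of the rule's deps
    """
    digest = hashlib.sha1()  # assumed hash function, for illustration only
    # Build-file arguments, in a stable order so the key is deterministic.
    for name in sorted(rule_args):
        digest.update(("arg:%s=%r\n" % (name, rule_args[name])).encode("utf-8"))
    # Contents (not timestamps) of every file argument.
    for path in sorted(file_contents):
        data = file_contents[path]
        digest.update(("file:%s:%d\n" % (path, len(data))).encode("utf-8"))
        digest.update(data)
    digest.update(("version:%s\n" % buck_version).encode("utf-8"))
    # Keys of the deps, so a change lower in the graph reaches every dependent.
    for key in sorted(dep_keys):
        digest.update(("dep:%s\n" % key).encode("utf-8"))
    return digest.hexdigest()
```

Because each dependency's key is itself an input, changing one source file changes the key of its
rule and, transitively, the key of every rule that depends on it.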
<p>
When Buck begins to build a build rule, the first thing it does is compute the <em>cache key</em> for
the rule. If there is a hit in any of the caches specified in <code>.buckconfig</code>, then it will
fetch the rule's output from the cache instead of building the rule locally. For outputs that are
expensive to compute, this is a substantial savings. It also makes it fast to rebuild when switching
between branches in a <a href="http://en.wikipedia.org/wiki/Distributed_version_control_system">DVCS</a> such as Git or Mercurial.
<p>
Because Buck uses the cache key to determine whether to rebuild a rule, you should never have to run {call buck.cmd_clean /}.
If anything that could affect the output of the rule changes, then the cache key should change as well.
Because the change in input will cause a cache miss, Buck will rebuild the rule, overwriting its old outputs.
Since out-of-date outputs are guaranteed to be overwritten, there is no reason to clean the build.
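<p>
The hit-or-rebuild decision can be sketched as follows. Each cache is modeled as a plain Python
dictionary, and <code>build_rule</code> and <code>build_locally</code> are hypothetical names;
storing a freshly built output in every cache is an assumption of this sketch, not a statement
about how Buck populates its caches.

```python
def build_rule(key, caches, build_locally):
    """Return (output, source) for a rule whose cache key is already computed.

    caches is a list of dict-like caches; build_locally is a zero-argument
    callable that performs the real build.
    """
    # Consult each configured cache in order; the first hit wins.
    for cache in caches:
        if key in cache:
            return cache[key], "cache"
    # Miss everywhere: build locally and store the output under the new key.
    output = build_locally()
    for cache in caches:
        cache[key] = output
    return output, "local"
```

A changed input produces a key that no cache has seen, so the lookup misses and the rule is
rebuilt. An unchanged input produces the same key and the stored output is reused.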
<p>
If you are using some sort of <a href="http://en.wikipedia.org/wiki/Continuous_integration">continuous
integration (CI)</a> system, you will likely want your CI builds to populate a cache that can be read by your local builds.
That way, when a developer syncs to a revision that has already been built on your CI system, running <code>buck
build</code> should not build anything locally, because all outputs can be fetched from the cache.
This works because the cache key computed by Buck when run on the CI system should match the key computed by Buck
on your local machine.
<h2>A Java rule computes its ABI when it is built</h2>
Oftentimes, a developer will modify Java code in a way that does not affect its interface. For example, adding
or removing private methods, as well as modifying the implementation of existing methods (regardless of their visibility),
does not change the <a href="http://en.wikipedia.org/wiki/Application_binary_interface">ABI</a> of a Java file.
<p>
When Buck builds a {call buck.java_library /} rule, it also computes its ABI<sup><a href="#footnote-abi">1</a></sup>.
Normally, modifying a private method
in a {call buck.java_library /} would cause it and all rules that depend on it to be rebuilt because the change in
cache keys would propagate up the DAG. However, Buck has special logic for a {call buck.java_library /} where,
if the <code>.java</code> input files have not changed since the previous build, and the ABI for each of its Java
dependencies has not changed since the previous build, then the {call buck.java_library /} will not be recompiled.
This is valid because we know that neither the input <code>.java</code> files nor the ABI against which they
would be compiled has changed, so the result would be the same if the rule were rebuilt. This localizes how much
Java code needs to be recompiled in response to a change, again reducing build times.
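<p>
The recompilation check can be illustrated with a small sketch. Here a library's ABI is
approximated by a list of its externally visible signatures, which is a simplification chosen for
the example; the names <code>abi_key</code> and <code>needs_recompile</code> are hypothetical
and do not correspond to Buck's internals.

```python
import hashlib

def abi_key(public_signatures):
    """Hash a library's externally visible signatures, ignoring their order."""
    digest = hashlib.sha1()
    for signature in sorted(public_signatures):
        digest.update((signature + "\n").encode("utf-8"))
    return digest.hexdigest()

def needs_recompile(previous, sources_hash, dep_abi_keys):
    """Decide whether a Java library must be recompiled.

    previous is the (sources_hash, dep_abi_keys) pair recorded by the last
    build, or None if the rule has never been built.
    """
    if previous is None:
        return True
    # Recompile only if the sources or some dependency's ABI changed.
    return previous != (sources_hash, tuple(sorted(dep_abi_keys)))
```

Editing the body of a method in a dependency leaves its list of signatures, and therefore its ABI
key, unchanged, so <code>needs_recompile</code> returns false for the dependent library. Adding a
public method changes the key and forces a recompile.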
<hr>
<p>
<sup id="footnote-abi">1</sup>The ABI computed by Buck is stricter
than <a href="http://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html">what the Java Language Specification (JLS)
uses to define binary compatibility</a>.
In Buck, the ABI defines an equivalence relation (two versions of a library either have the same ABI or they do not),
whereas binary compatibility in the JLS does not. This is primarily done
because the ABI is represented as a SHA-1 hash, which is cheap to compare. However, this does mean that adding a
new method to a class results in a new ABI for a {call buck.java_library /} in Buck, which triggers a rebuild of dependent rules.
By comparison, the JLS would consider the old and new versions of the library binary compatible, so
recompiling dependent rules would be unnecessary.
{/param}
{/call}
{/template}