| {namespace buck.what_makes_buck_so_fast} |
| |
| /***/ |
| {template .soyweb} |
| {call buck.page} |
| {param title: 'What Makes Buck so Fast?' /} |
| {param description} |
| An overview of what makes Buck fast at compiling your code. |
| {/param} |
| {param content} |
| |
| <p>Buck exploits a number of strategies to reduce build times:</p> |
| |
| <h2>A build rule knows all of the inputs that can affect its output</h2> |
| |
| <p> |
| |
| Buck is designed so that anything that can affect the output of a build rule must be specified |
| as an input to the build rule: hidden state is not allowed. (This is also important for ensuring that |
| results are consistent and reproducible for all developers.) Therefore, we can be sure that once a |
| rule's <code>deps</code> are satisfied, the rule itself can be built. This gives us confidence that |
| the <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph">DAG</a> that results from build |
| rules and their <code>deps</code> is complete: every dependency is captured in the graph. |
| |
| <p> |
| |
| Having a DAG makes it straightforward for rules to be built in parallel, which can dramatically reduce |
| build times. The execution model for Buck is very simple: starting with the leaf nodes |
| of the graph, add them to a queue of rules to be built. When a thread is available, a rule is removed |
| from the queue, and built. Assuming it is built successfully, it notifies all of the rules that depend |
| on it that it is done. When a rule gets such a notification, it checks whether all its dependencies have |
| been satisfied, and if so, it gets added to the queue. Computation proceeds in this manner until all of |
| the nodes in the graph have gone through the queue. |
| Therefore, breaking modules into finer dependencies creates opportunities for increased |
| parallelism, improving throughput. |
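<p>
The queue-driven execution model described above can be sketched in a few lines of Python.
This is a minimal, single-worker illustration and not Buck's actual scheduler: the function name
<code>build_order</code>, the rule names, and the convention that every rule appears as a key in the
<code>deps</code> mapping are all assumptions made for the example.

```python
from collections import deque


def build_order(deps):
    """Return the order in which rules leave the build queue.

    deps maps each rule name to the list of rules it depends on.
    Every rule must appear as a key. A single worker is simulated,
    so "building" a rule only records its name.
    """
    # Count unsatisfied dependencies and record reverse edges (dependents).
    pending = dict((rule, len(rule_deps)) for rule, rule_deps in deps.items())
    dependents = dict((rule, []) for rule in deps)
    for rule, rule_deps in deps.items():
        for dep in rule_deps:
            dependents[dep].append(rule)

    # Leaf nodes have no dependencies, so they can be queued immediately.
    queue = deque(sorted(rule for rule, count in pending.items() if count == 0))
    order = []
    while queue:
        rule = queue.popleft()
        order.append(rule)  # stand-in for actually building the rule
        # Notify dependents; queue any whose dependencies are now satisfied.
        for dependent in dependents[rule]:
            pending[dependent] -= 1
            if pending[dependent] == 0:
                queue.append(dependent)
    return order


# Hypothetical rule names, for illustration only.
example_graph = dict(
    app=["ui", "net"],
    ui=["base"],
    net=["base"],
    base=[],
)
```

<p>
In this example, <code>ui</code> and <code>net</code> both become ready as soon as <code>base</code>
finishes, so with more than one worker they could be built at the same time.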
| |
| <h2>Buck can store the outputs it generates in a cache</h2> |
| |
| <p> |
| |
| A build rule knows all of the inputs that can affect its output, and therefore it can combine that |
| information into a hash that represents the total input. This hash is used as |
| a <em>cache key</em> where the associated value in the cache is the output produced by the rule. |
| (See {call buck.concept_buckconfig /} for information on how to set up a cache.) |
| The following information contributes to the cache key for a build rule: |
| |
| <ul> |
| <li>The values of the arguments used to define the build rule in the build file. |
| <li>The contents of any file arguments for the build rule. |
| <li>The version of Buck being used to build the rule. (This means that upgrading Buck to a new |
| version invalidates all of the cache keys generated by the old version.) |
| <li>The cache key for each of the rule's <code>deps</code>. |
| </ul> |
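<p>
As a rough sketch of how the four inputs listed above could be folded into a single key, consider the
following Python function. The use of SHA-1, the length-prefixed encoding, and the name
<code>cache_key</code> are assumptions chosen for illustration; they are not Buck's actual key format.

```python
import hashlib


def cache_key(rule_args, file_contents, buck_version, dep_keys):
    """Combine the four kinds of input into one hex digest.

    rule_args: list of (name, value) string pairs from the build file.
    file_contents: list of bytes objects, one per input file.
    buck_version: string identifying the version of the build tool.
    dep_keys: list of cache keys (hex strings) of the rule's deps.
    """
    hasher = hashlib.sha1()

    def add(data):
        # Length-prefix each field so adjacent fields cannot run together.
        hasher.update(str(len(data)).encode("ascii") + b":" + data)

    for name, value in sorted(rule_args):
        add(name.encode("utf-8"))
        add(value.encode("utf-8"))
    for contents in file_contents:
        add(contents)
    add(buck_version.encode("utf-8"))
    for key in sorted(dep_keys):
        add(key.encode("ascii"))
    return hasher.hexdigest()
```

<p>
Because the keys of the <code>deps</code> feed into the hash, a change deep in the graph changes the key
of every rule that transitively depends on it.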
| |
| <p> |
| |
| When Buck begins to build a build rule, the first thing it does is compute the <em>cache key</em> for |
| the rule. If there is a hit in any of the caches specified in <code>.buckconfig</code>, then it will |
| fetch the rule's output from the cache instead of building the rule locally. For outputs that are |
| expensive to compute, this can save substantial time. It also makes rebuilds fast when switching |
| between branches in a <a href="http://en.wikipedia.org/wiki/Distributed_version_control_system">DVCS</a> such as Git or Mercurial. |
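<p>
The lookup-then-build behavior can be pictured with the following Python sketch. It models each cache as a
plain dictionary and writes a freshly built output back to every cache; both of those choices, along with
the name <code>build_with_cache</code>, are simplifying assumptions rather than a description of Buck's
implementation.

```python
def build_with_cache(key, caches, build_locally):
    """Return (output, source) for a rule whose cache key is `key`.

    caches: ordered list of dict-like caches, consulted in turn.
    build_locally: zero-argument callable that produces the output.
    """
    for index, cache in enumerate(caches):
        if key in cache:
            # Cache hit: reuse the stored output instead of building.
            return cache[key], "cache %d" % index
    output = build_locally()
    # Store the fresh output so later builds with the same key can reuse it.
    for cache in caches:
        cache[key] = output
    return output, "local build"
```

<p>
Switching to a branch whose outputs were built earlier yields the same keys as before, so the first loop
finds every output and nothing is rebuilt.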
| |
| <p> |
| |
| Because Buck uses the cache key to determine whether to rebuild a rule, you should never have to run {call buck.cmd_clean /}. |
| If anything that could affect the output of the rule changes, then the cache key should change, as well. |
| Because the change in input will cause a cache miss, Buck will rebuild the rule, overwriting its old outputs. |
| Since out-of-date outputs are guaranteed to be overwritten, there is no reason to clean the build. |
| |
| <p> |
| |
| If you are using some sort of <a href="http://en.wikipedia.org/wiki/Continuous_integration">continuous |
| integration (CI)</a> system, you will likely want your CI builds to populate a cache that can be read by your local builds. |
| That way, when a developer syncs to a revision that has already been built on your CI system, running |
| {sp}{call buck.cmd_build /} should not build anything locally, because every output can be fetched from the cache. |
| This works because the cache key computed by Buck when run on the CI system should match the key computed by Buck |
| on your local machine. |
| |
| <h2>If a Java library's API doesn't change, code that uses the library doesn't need to be rebuilt</h2> |
| |
| <p> |
| |
| Oftentimes, a developer will modify Java code in a way that does not affect its externally visible |
| API. For example, adding or removing private methods, as well as modifying the implementation of |
| existing methods (regardless of their visibility), does not change the API of a Java file. |
| |
| <p> |
| |
| When Buck builds a {call buck.java_library /} rule, it also computes its API. |
| Normally, modifying a private method |
| in a {call buck.java_library /} would cause it and all rules that depend on it to be rebuilt because the change in |
| cache keys would propagate up the DAG. However, Buck has special logic for a {call buck.java_library /} where, |
| if the <code>.java</code> input files have not changed since the previous build, and the API for each of its Java |
| dependencies has not changed since the previous build, then the {call buck.java_library /} will not be recompiled. |
| This is valid because we know that neither the input <code>.java</code> files nor the API against which they |
| would be compiled has changed, so the result would be the same if the rule were rebuilt. This localizes how much |
| Java code needs to be recompiled in response to a change, again reducing build times. |
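<p>
To make the idea of "computing the API" concrete, here is a toy Python model. It represents a class as a list
of <code>(visibility, signature, body)</code> triples and hashes only the non-private signatures. The
representation and the name <code>api_hash</code> are hypothetical; real Java API extraction works on
compiled classes and handles many more cases.

```python
import hashlib


def api_hash(members):
    """Hash only the externally visible signatures of a toy class model.

    members: list of (visibility, signature, body) string triples.
    Private members and all method bodies are ignored.
    """
    visible = sorted(
        signature for visibility, signature, _body in members
        if visibility != "private"
    )
    return hashlib.sha1("\n".join(visible).encode("utf-8")).hexdigest()
```

<p>
Editing a method body or adding a private helper leaves this hash unchanged, whereas adding a public
method changes it, which is the distinction the optimization relies on.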
| |
| <h2>Rules can calculate their own "ABI" keys</h2> |
| |
| <p> |
| |
| As a generalization of the Java library API optimization, |
| every rule type has the freedom to determine whether or not to rebuild itself |
| based on information about the state of its dependencies. |
| For example, when editing a file in an {call buck.android_resource /} rule, |
| we don't need to recompile all dependent resources and libraries |
| if the set of exposed symbols doesn't change |
| (for example, if we just changed a padding value). |
| If we recompile an {call buck.android_library /} due to a dependency change, |
| but the resulting classes are identical, |
| we don't need to re-run DX. |
| |
| <p> |
| |
| This mechanism is fairly general. |
| When the build engine is preparing to build a rule, |
| in addition to the normal cache key, |
| it generates a key that excludes the keys of the dependencies. |
| This is combined with a key that the rule generates |
| by hashing whatever parts of its dependencies it considers "visible". |
| Usually, the dependency will help with this process |
| by outputting the relevant information |
| (like the Java API or hash of all classes) |
| to a single small file. |
| If both keys match the values from the last build, |
| then there is no need to rebuild. |
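<p>
The two-key comparison can be expressed as a small predicate. In this Python sketch, the names
<code>needs_rebuild</code>, <code>own_key</code>, and <code>visible_dep_key</code> are illustrative
assumptions, and the keys are treated as opaque strings.

```python
def needs_rebuild(own_key, visible_dep_key, last_build):
    """Decide whether a rule must be rebuilt.

    own_key: key over the rule's own inputs, excluding dependency keys.
    visible_dep_key: key over the "visible" parts of its dependencies.
    last_build: (own_key, visible_dep_key) from the previous build,
        or None if the rule has never been built.
    """
    if last_build is None:
        return True
    # Rebuild only if either key differs from the recorded pair.
    return (own_key, visible_dep_key) != last_build
```

<p>
A dependency whose implementation changed but whose visible part did not produces the same
<code>visible_dep_key</code>, so the predicate returns <code>False</code> and the rule is skipped.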
| |
| <p> |
| |
| Note that this optimization is currently separate from the distributed cache. |
| We'd like to combine them so that the cache can be used to fetch rules |
| built by a continuous integration server as long as the source files |
| and visible parts of the dependencies match. |
| |
| <h2>Buck prefers to use first-order dependencies</h2> |
| |
| <p> |
| |
| By default, Buck uses first-order dependencies when compiling Java. |
| This means that compilation can only see explicitly declared dependencies, |
| not other libraries that your dependencies depend on. |
| This behavior can be changed at runtime with the |
| {sp}{call buck.cmd_build /} command by specifying a |
| different value for <code>--build-dependencies</code>. |
| |
| <p> |
| |
| We recommend keeping the default, however. |
| First-order dependencies dramatically shrink the set of APIs |
| that your library is exposed to, |
| which in turn narrows the scope of changes |
| that will force your library to be rebuilt. |
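<p>
The difference between first-order and transitive dependencies can be illustrated with a short Python
sketch. The function names and the example graph are hypothetical, and the graph is a plain mapping from
each rule to its declared <code>deps</code>.

```python
def first_order_classpath(deps, rule):
    """Return only the dependencies that `rule` declares explicitly."""
    return sorted(deps[rule])


def transitive_classpath(deps, rule):
    """Return every rule reachable from `rule` through declared deps."""
    seen = set()
    stack = list(deps[rule])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps[dep])
    return sorted(seen)


# Hypothetical rule names, for illustration only.
example_deps = dict(app=["ui"], ui=["base"], base=[])
```

<p>
Here <code>app</code> sees only <code>ui</code> under first-order dependencies, so an API change in
<code>base</code> cannot directly force <code>app</code> to be recompiled.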
| |
| <h2>Fast Dex merging for Android</h2> |
| |
| <p> |
| |
| Other build tools also use Android's DX merge support |
| to merge your main program's dex file with third-party libraries. |
| However, Buck's support for fine-grained libraries |
| allows dex merging to work at a much finer granularity. |
| |
| <p> |
| |
| Buck also ships a customized version of DX |
| with significant performance improvements. |
| It uses a faster algorithm for merging many dex files. |
| It also has support for running multiple copies of DX |
| concurrently within a single long-lived buckd process, |
| which eliminates most of DX's start-up time. |
| |
| <p> |
| |
| As a result, when editing a small module and performing an incremental build, |
| we frequently see less than one second spent generating <code>classes.dex</code>. |
| |
| <h2>Graph enhancement for increased rule granularity</h2> |
| |
| <p> |
| |
| Frequently, the granularity at which we expect users to declare build rules |
| is very different from the granularity at which |
| we want the build system to model them. |
| Users want coarse-grained rules for simplicity (like "android_binary"), |
| but the build system wants fine-grained rules |
| (like "aapt package" and "dex merge") |
| to allow for parallelism and fine-grained caching. |
| |
| <p> |
| |
| Internally, Buck uses a mechanism called "graph enhancement" |
| that allows its internal "action graph" (the DAG used for building) |
| to be different from what the user declared (internally called the "target graph"). |
| Graph enhancement can add new synthetic rules |
| to break a monolithic task (like {call buck.android_binary /}) |
| into independent subtasks |
| that might only have a subset of the original dependencies, |
| so dex merging does not depend on running a full <code>aapt package</code>. |
| It can also move dependency edges, |
| so compiling Android libraries does not depend on dexing their dependencies. |
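<p>
A toy version of graph enhancement might look like the following Python sketch. The target graph maps each
name to a <code>(rule_type, deps)</code> pair, and each <code>android_binary</code> is split into two
synthetic actions. The synthetic node names, the way dependencies are divided between them, and the function
name <code>enhance</code> are all assumptions made for illustration; Buck's real action graph is considerably
richer.

```python
def enhance(target_graph):
    """Expand a toy target graph into a finer-grained action graph.

    target_graph maps a target name to a (rule_type, deps) pair.
    The result maps each action name to the list of actions it needs.
    """
    action_graph = dict()
    for name, (rule_type, deps) in target_graph.items():
        if rule_type != "android_binary":
            # Other rule types are copied through unchanged.
            action_graph[name] = list(deps)
            continue
        # Each synthetic action keeps only the subset of deps it needs.
        resources = [d for d in deps if target_graph[d][0] == "android_resource"]
        libraries = [d for d in deps if target_graph[d][0] == "android_library"]
        aapt = name + "#aapt_package"
        dex = name + "#dex_merge"
        action_graph[aapt] = resources
        action_graph[dex] = libraries
        action_graph[name] = [aapt, dex]
    return action_graph


# Hypothetical targets, for illustration only.
example_targets = dict(
    app=("android_binary", ["res", "lib"]),
    res=("android_resource", []),
    lib=("android_library", ["res"]),
)
```

<p>
In the resulting graph, the dex-merge action for <code>app</code> has no edge to the resource-packaging
action, so the two can be scheduled and cached independently.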
| |
| {/param} |
| {/call} |
| {/template} |