 {namespace buck.what_makes_buck_so_fast}

 /***/
 {template .soyweb}
   {call buck.page}
     {param title: 'What Makes Buck so Fast?' /}
     {param content}

 Buck exploits a number of strategies to reduce build times.

 <h2>A build rule knows all of the inputs that can affect its output</h2>

 Buck is designed so that anything that can affect the output of a build rule must be specified
 as an input to the build rule: hidden state is not allowed. (This is also important for ensuring that
 results are consistent and reproducible for all developers.) Therefore, we can be sure that once a
 rule's <code>deps</code> are satisfied, the rule itself can be built. This gives us confidence that
 the <a href="http://en.wikipedia.org/wiki/Directed_acyclic_graph">DAG</a> formed by build
 rules and their <code>deps</code> is complete: every dependency is captured in the graph.

 <p>

 Having a DAG makes it straightforward for rules to be built in parallel, which can dramatically reduce
 build times. Buck's execution model is simple: the leaf nodes of the graph are added to a queue
 of rules to be built. When a thread is available, it removes a rule from the queue and builds it.
 If the build succeeds, the rule notifies every rule that depends on it. When a rule receives such
 a notification, it checks whether all of its dependencies have been satisfied, and if so, it is
 added to the queue. Computation proceeds in this manner until every node in the graph has gone
 through the queue. Because only rules with satisfied dependencies can run, breaking modules into
 finer-grained dependencies creates more opportunities for parallelism, which improves throughput.
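
 <p>

 The queue-driven model described above can be sketched in a few lines of Python. This is a
 minimal, single-threaded illustration under stated assumptions, not Buck's actual implementation:
 the function name <code>build_order</code> and the shape of the <code>deps</code> mapping are
 hypothetical, every dependency is assumed to appear as a key in the mapping, and a real build
 would hand ready rules to a pool of worker threads instead of building them one at a time.

```python
from collections import deque


def build_order(deps):
    """Return one order in which the rules could be built.

    `deps` maps each rule name to the list of rules it depends on.
    Both the name and the data shape are hypothetical, for illustration only.
    """
    # Count unsatisfied dependencies and record the reverse edges (dependents).
    remaining = dict((rule, len(ds)) for rule, ds in deps.items())
    dependents = dict((rule, []) for rule in deps)
    for rule, ds in deps.items():
        for dep in ds:
            dependents[dep].append(rule)

    # Leaf nodes have no dependencies, so they are ready immediately.
    queue = deque(sorted(r for r, n in remaining.items() if n == 0))
    order = []
    while queue:
        rule = queue.popleft()          # a free worker takes the next ready rule
        order.append(rule)              # stand-in for actually building the rule
        for parent in dependents[rule]:
            remaining[parent] -= 1      # notify each rule that depends on it
            if remaining[parent] == 0:  # all of its dependencies are satisfied
                queue.append(parent)

    if len(order) != len(deps):
        raise ValueError("dependency cycle: the graph is not a DAG")
    return order
```

 For a graph in which <code>app</code> depends on <code>ui</code> and <code>net</code>, and both of
 those depend on <code>util</code>, the sketch builds <code>util</code> first, then <code>ui</code>
 and <code>net</code> (which could run in parallel), and finally <code>app</code>.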

 <h2>Buck can store the outputs it generates in a cache</h2>

 A build rule knows all of the inputs that can affect its output, and therefore it can combine that
 information into a hash that represents the total input. This hash is used as
 a <em>cache key</em> where the associated value in the cache is the output produced by the rule.
 (See {call buck.concept_buckconfig /} for information on how to set up a cache.)
 The following information contributes to the cache key for a build rule:

 <ul>
   <li>The values of the arguments used to define the build rule in the build file.
   <li>The contents of any file arguments for the build rule.
   <li>The version of Buck being used to build the rule. (This means that upgrading Buck to a new
       version invalidates all of the cache keys generated by the old version.)
   <li>The cache key for each of the rule's <code>deps</code>.
 </ul>
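
 <p>

 As a rough sketch of how the four inputs listed above could be folded into a single key, consider
 the following Python function. It is illustrative only: the function name, the parameter names,
 the field labels, and the choice of SHA-1 for the cache key are all assumptions, and Buck's real
 key computation is not described here.

```python
import hashlib


def cache_key(args, file_contents, buck_version, dep_keys):
    """Combine a rule's inputs into one hex digest (hypothetical sketch).

    args          -- mapping of argument name to value from the build file
    file_contents -- mapping of file path to the bytes of that file
    buck_version  -- string identifying the version of the build tool
    dep_keys      -- cache keys already computed for the rule's deps
    """
    h = hashlib.sha1()

    def feed(label, value):
        # Length-prefix each field so adjacent fields cannot blur together.
        data = value if isinstance(value, bytes) else str(value).encode("utf-8")
        h.update(("%s:%d:" % (label, len(data))).encode("utf-8"))
        h.update(data)

    feed("buck", buck_version)
    for name in sorted(args):            # sorted for a deterministic result
        feed("arg:" + name, args[name])
    for path in sorted(file_contents):
        feed("file:" + path, file_contents[path])
    for key in sorted(dep_keys):         # a changed dep key changes this key too
        feed("dep", key)
    return h.hexdigest()
```

 Because the keys of the <code>deps</code> are themselves inputs, a change deep in the graph
 changes the key of every rule above it.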

 <p>

 When Buck begins to build a build rule, the first thing it does is compute the <em>cache key</em> for
 the rule. If there is a hit in any of the caches specified in <code>.buckconfig</code>, then it will
 fetch the rule's output from the cache instead of building the rule locally. For outputs that are
 expensive to compute, this is a substantial savings. It also makes it fast to rebuild when switching
 between branches in a <a href="http://en.wikipedia.org/wiki/Distributed_version_control_system">DVCS</a> such as Git or Mercurial.
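
 <p>

 The lookup described above might look like the following sketch, which models each cache as a
 plain Python mapping. The function name and its parameters are hypothetical, and writing a
 freshly built output back to every cache is an assumption made for this example, not a statement
 about how Buck populates its caches.

```python
def build_with_cache(key, caches, build_locally):
    """Return (output, was_cache_hit) for a rule with the given cache key.

    caches        -- mappings consulted in order (stand-ins for real caches)
    build_locally -- zero-argument callable that produces the rule's output
    """
    for cache in caches:
        if key in cache:
            return cache[key], True   # hit: fetch instead of building
    output = build_locally()          # miss in every cache: build the rule
    for cache in caches:
        cache[key] = output           # assumption: store the result everywhere
    return output, False
```

 If a build on one machine has already stored the output under a key, a later call with the same
 key and a cache list that includes that shared cache returns the stored output without invoking
 <code>build_locally</code>.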

 <p>

 Because Buck uses the cache key to determine whether to rebuild a rule, you should never have to run {call buck.cmd_clean /}.
 If anything that could affect the output of the rule changes, then the cache key should change, as well.
 Because the change in input will cause a cache miss, Buck will rebuild the rule, overwriting its old outputs.
 Since out-of-date outputs are guaranteed to be overwritten, there is no reason to clean the build.

 <p>

 If you are using some sort of <a href="http://en.wikipedia.org/wiki/Continuous_integration">continuous
 integration (CI)</a> system, you will likely want your CI builds to populate a cache that can be read by your local builds.
 That way, when a developer syncs to a revision that has already been built on your CI system, running <code>buck
 build</code> should not build anything locally, because every output can be fetched from the cache.
 This works because the cache key computed by Buck when run on the CI system should match the key computed by Buck
 on your local machine.

 <h2>A Java rule computes its ABI when it is built</h2>

 Oftentimes, a developer will modify Java code in a way that does not affect its interface. For example, adding
 or removing private methods, as well as modifying the implementation of existing methods (regardless of their visibility),
 does not change the <a href="http://en.wikipedia.org/wiki/Application_binary_interface">ABI</a> of a Java file.

 <p>

 When Buck builds a {call buck.java_library /} rule, it also computes its ABI<sup><a href="#footnote-abi">1</a></sup>.
 Normally, modifying a private method
 in a {call buck.java_library /} would cause it and all rules that depend on it to be rebuilt because the change in
 cache keys would propagate up the DAG. However, Buck has special logic for a {call buck.java_library /} where,
 if the <code>.java</code> input files have not changed since the previous build, and the ABI for each of its Java
 dependencies has not changed since the previous build, then the {call buck.java_library /} will not be recompiled.
 This is valid because we know that neither the input <code>.java</code> files nor the ABI against which they
 would be compiled has changed, so the result would be the same if the rule were rebuilt. This localizes how much
 Java code needs to be recompiled in response to a change, again reducing build times.
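
 <p>

 The decision described above can be sketched as follows. Everything here is a simplified
 assumption for illustration: the function names are hypothetical, and deriving the ABI by hashing
 a sorted list of non-private signatures is a stand-in for whatever Buck actually extracts from the
 compiled classes. Only the use of a SHA-1 hash to represent the ABI comes from this document.

```python
import hashlib


def abi_hash(public_signatures):
    """Hash the externally visible signatures of a library (hypothetical)."""
    joined = "\n".join(sorted(public_signatures))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def needs_recompile(prev_sources, sources, prev_dep_abis, dep_abis):
    """Decide whether a Java library must be recompiled.

    prev_sources / sources   -- mapping of .java path to a content hash
    prev_dep_abis / dep_abis -- mapping of dependency name to its ABI hash
    """
    # Recompile if any .java input changed since the previous build...
    if prev_sources != sources:
        return True
    # ...or if the ABI of any Java dependency changed.
    return prev_dep_abis != dep_abis
```

 Editing the body of a method in a dependency leaves its signature list, and therefore its ABI
 hash, unchanged, so <code>needs_recompile</code> returns <code>False</code> for the dependent
 library. Adding a public method changes the hash and forces a recompile.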

 <hr>

 <p>

 <sup id="footnote-abi">1</sup>The ABI computed by Buck is stricter
 than <a href="http://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html">what the Java Language Specification (JLS)
 uses to define binary compatibility</a>.
 In Buck, the ABI defines an equivalence relation (two versions of a library either have the same ABI or they do not),
 whereas binary compatibility in the JLS does not. Buck takes this approach primarily
 because the ABI is represented as a SHA-1 hash, which is cheap to compare. The cost is that adding a
 new method to a class results in a new ABI for a {call buck.java_library /} in Buck, which triggers a rebuild of dependent rules.
 By comparison, the JLS would consider the old and new versions of the library binary compatible, so
 recompilation of dependent rules would be unnecessary.
     {/param}
   {/call}
 {/template}