tf_1.8_xla_doc
#include <copy_insertion.h>
Inherits HloPassInterface.
Public Member Functions
StatusOr< bool > Run (HloModule *module) override
    Run the pass on the given module.
Static Public Member Functions
static StatusOr< bool > AddCopiesForBufferAssignment (HloModule *module)
Google docs:
Copy insertion is a legalization HLO pass which inserts copies (kCopy instructions) to eliminate several kinds of problems in the HLO module.
(1) An entry parameter or constant that is live out of the entry computation. Entry computation arguments and constants have different lifetimes than the computation result and cannot share the same allocation. Parameters and constants live out of non-entry computations do not need copies.
(2) Different values which are simultaneously live and which must be held in the same buffer. This can occur in while bodies. Specifically, the while loop state (the arguments to the while instruction) is updated in-place and the update may clobber the value from the previous iteration before the previous value is dead. Computations called from kCall instructions do not need such copies because kCall has no update in-place semantics.
(3) The buffer set of the root instruction of the entry computation must be unambiguous and distinct. That is, InstructionAliasSet::IsAmbiguous returns false and InstructionAliasSet::IsDistinct returns true.
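Case (2) can be illustrated with a toy model that does not use XLA at all. The sketch below assumes a hypothetical two-element loop state (a, b) whose body computes (b, a + b); the names LoopState, StepInPlace, StepWithCopy and RunLoop are made up for illustration and are not part of the XLA API. Updating the state in place overwrites a while its previous value is still needed, whereas copying the old value first gives the expected result.

```cpp
#include <cassert>

// Hypothetical loop state: two values whose buffers are reused across
// iterations, as the state of a while instruction is.
struct LoopState {
  int a;
  int b;
};

// In-place update of (a, b) -> (b, a + b) without a copy. Writing `a` first
// clobbers the previous value of `a` while it is still live (needed by `b`).
void StepInPlace(LoopState& s) {
  s.a = s.b;        // the old `a` is overwritten here...
  s.b = s.a + s.b;  // ...so this reads the new `a`, not the old one.
}

// Same update, but the still-live old value is copied to a separate location
// before it is overwritten. This plays the role of an inserted copy.
void StepWithCopy(LoopState& s) {
  const int old_a = s.a;  // the copy
  s.a = s.b;
  s.b = old_a + s.b;
}

// Runs `iterations` steps starting from (0, 1) and returns the final state.
LoopState RunLoop(int iterations, void (*step)(LoopState&)) {
  assert(iterations >= 0);
  LoopState s{0, 1};
  for (int i = 0; i < iterations; ++i) step(s);
  return s;
}
```

Calling RunLoop(5, StepWithCopy) yields the state (5, 8), while RunLoop(5, StepInPlace) yields (16, 32) because each iteration reads the already-overwritten value.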
static StatusOr< bool > AddCopiesForBufferAssignment (HloModule *module)
Google docs:
The CPU and GPU backends need additional copies added due to deficiencies in buffer assignment. Specifically, copies are needed for constants live-out of computations, and for values which are live-in and live-out of the same computation. These copies are needed because buffer assignment uses a computation-scoped analysis (TuplePointsToAnalysis) and has limited visibility across computation boundaries. This method adds these necessary copies. Returns whether the module was modified.
TODO(b/62548313): Remove this when buffer assignment is module-scoped.
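The rule stated above (copy constants that are live-out, and values that are both live-in and live-out of the same computation) can be sketched on a toy data structure. This is only an illustration under simplifying assumptions: ToyComputation and OutputsNeedingCopies are hypothetical names, values are identified by plain strings, and nothing here reflects how the real method inspects an HloModule.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a single computation.
struct ToyComputation {
  std::set<std::string> parameters;  // values live-in to the computation
  std::set<std::string> constants;   // constants defined in the computation
  std::vector<std::string> outputs;  // values live-out of the computation
};

// Returns the outputs that would need a copy under the rule described above:
// constants that are live-out, and values that are both live-in and live-out.
// Output order follows the order of `c.outputs`.
std::vector<std::string> OutputsNeedingCopies(const ToyComputation& c) {
  std::vector<std::string> result;
  for (const std::string& out : c.outputs) {
    const bool is_constant = c.constants.count(out) > 0;
    const bool is_live_in = c.parameters.count(out) > 0;
    if (is_constant || is_live_in) {
      result.push_back(out);
    }
  }
  return result;
}
```

For a computation with parameters {p0, p1}, constant {k0} and outputs (p0, k0, add), the function selects p0 and k0; the freshly computed value add needs no copy under this rule.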
StatusOr< bool > Run (HloModule *module) override
Run the pass on the given module.
Returns whether the module was changed (copies were inserted).
Google docs:
Copy insertion is performed in three steps:
(1) Add copies conservatively to guarantee that there is no live-range interference. This is done simplistically and usually results in more copies than are strictly necessary.
(2) Using a more fine-grained analysis, remove as many of the copies added in (1) as possible while ensuring no live-range interference.
(3) Add copies to resolve issues not related to live range interference such as parameters and constants live out of the entry computation.
We add copies then remove them (step (1) then (2)) rather than simply adding only the copies that are necessary because, in general, it is difficult to figure out the minimal set of copies to add once there is interference. On the other hand, it is easy to determine if removing a copy will introduce interference.
The final copy insertion in (3) is done separately to simplify the implementation of copy removal in (2), which is the most complicated part of the pass. As is, copy removal only has to reason about live-range interference. If all copies were added in step (1), then copy removal would also have to reason about things like constants and parameters live out of the computation.
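The add-then-remove strategy of steps (1) and (2) can be sketched with a toy model. The code below is not the XLA implementation: it assumes a hypothetical Value type with an inclusive live range and a required shared buffer, treats a copy as moving the value into a private buffer, and removes copies greedily in index order. Step (3) is not modeled. It is meant only to show why checking "does removing this copy introduce interference?" is a simple local test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value: live over the inclusive interval [start, end]. It must live in
// shared buffer `buffer` unless `copied` is true, in which case it has been
// copied into a private buffer of its own.
struct Value {
  int start;
  int end;
  int buffer;
  bool copied;
};

// True if the two live ranges overlap.
bool Overlaps(const Value& x, const Value& y) {
  return x.start <= y.end && y.start <= x.end;
}

// True if placing values[i] back into its shared buffer would overlap with
// another uncopied value that occupies the same buffer.
bool WouldInterfere(const std::vector<Value>& values, std::size_t i) {
  assert(i < values.size());
  for (std::size_t j = 0; j < values.size(); ++j) {
    if (j == i || values[j].copied) continue;
    if (values[j].buffer == values[i].buffer &&
        Overlaps(values[i], values[j])) {
      return true;
    }
  }
  return false;
}

// Step (1): copy everything, which trivially removes all interference.
// Step (2): remove each copy whose removal does not introduce interference.
// Returns the number of copies that remain.
int InsertThenRemoveCopies(std::vector<Value>& values) {
  for (Value& v : values) v.copied = true;
  int remaining = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (WouldInterfere(values, i)) {
      ++remaining;  // the copy is needed, keep it
    } else {
      values[i].copied = false;  // the copy is redundant, remove it
    }
  }
  return remaining;
}
```

With three values in buffer 0 that are live over [0, 2], [1, 3] and [4, 5], all three are copied in step (1), and step (2) keeps only the copy of the second value, since it is the only one whose live range overlaps an uncopied value in the same buffer.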
xla::anonymous_namespace{copy_insertion.cc}::AddCopiesToResolveInterference()
xla::TupleSimplifier
for structures done by deep copies