eval is performed.
Why Lazy Evaluation
Lazy evaluation provides several powerful features that make MLX efficient and flexible.
Transforming Compute Graphs
Lazy evaluation lets us record a compute graph without actually doing any computations. This is useful for function transformations like grad and vmap and graph optimizations.
Currently, MLX does not compile and rerun compute graphs. They are all generated dynamically. However, lazy evaluation makes it much easier to integrate compilation for future performance enhancements.
Only Compute What You Use
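In MLX you do not need to worry as much about computing outputs that are never used. For example, if a function returns the result of expensive_fun alongside other outputs and the caller discards that result, the output of expensive_fun is never actually computed. Use this pattern with care though, as the graph of expensive_fun is still built, and that has some cost associated with it.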
Lazy evaluation can also be beneficial for saving memory while keeping code simple. Say you have a very large model derived from mlx.nn.Module. You can instantiate this model without computing anything until you perform an eval. If you update the model with float16 weights before evaluating, the original weights are never materialized, so your maximum consumed memory will be half that required if eager computation were used instead (assuming the initial weights would have been float32).
When to Evaluate
A common question is when to use eval. The trade-off is between letting graphs get too large and not batching enough useful work.
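For example, evaluating after every single operation inside a loop is a bad idea, because each graph evaluation carries some fixed overhead. Most numerical computations have an iterative outer loop (for example, the iterations of stochastic gradient descent). A natural and usually efficient place to use eval is at each iteration of this outer loop.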
Here is a concrete example:
Implicit Evaluation
An important behavior to be aware of is when the graph will be implicitly evaluated. Any time you print an array, convert it to a numpy.ndarray, or otherwise access its memory via memoryview, the graph will be evaluated. Saving arrays via save (or any other MLX saving functions) will also evaluate the array.
Calling array.item() on a scalar array will also evaluate it. In the example above, printing the loss (print(loss)) or adding the loss scalar to a list (losses.append(loss.item())) would cause a graph evaluation. If these lines are before mx.eval(loss, model.parameters()) then this will be a partial evaluation, computing only the forward pass.
Calling eval on an array or set of arrays multiple times is perfectly fine. This is effectively a no-op.