From 509f05ff815727b3bb2157dc09edbf7d6b4ce346 Mon Sep 17 00:00:00 2001
From: Lu Weizheng
Date: Wed, 25 Dec 2024 22:50:08 +0800
Subject: [PATCH] operator fusion doc

---
 doc/source/getting_started/installation.rst |  2 ++
 doc/source/user_guide/best_practices.rst    |  1 +
 doc/source/user_guide/operator_fusion.rst   | 34 +++++++++++++++++++++
 3 files changed, 37 insertions(+)
 create mode 100644 doc/source/user_guide/operator_fusion.rst

diff --git a/doc/source/getting_started/installation.rst b/doc/source/getting_started/installation.rst
index 3c41c9b85..fd198f9a5 100644
--- a/doc/source/getting_started/installation.rst
+++ b/doc/source/getting_started/installation.rst
@@ -34,6 +34,7 @@ an older version of pandas, you should either upgrade your pandas or downgrade X
 ======= =================== ======== ========= ========== =========== ===========
 Xorbits Python              `NumPy`_ `pandas`_ `xgboost`_ `lightgbm`_ `datasets`_
 ======= =================== ======== ========= ========== =========== ===========
+0.8.2   3.9,3.10,3.11,3.12  2.2.1    2.2.3     2.1.3      4.5.0       3.2.0
 0.8.1   3.9,3.10,3.11,3.12  2.1.3    2.2.3     2.1.3      4.5.0       3.1.0
 0.8.0   3.9,3.10,3.11,3.12  2.1.3    2.2.3     2.1.2      4.5.0       3.1.0
 0.7.4   3.9,3.10,3.11       1.26.4   2.2.3     2.1.1      4.5.0       3.0.1
@@ -67,6 +68,7 @@ For example:
 ======= =================== ======== =========
 Xorbits Python              `CuPy`_  `cuDF`_
 ======= =================== ======== =========
+0.8.2   3.10,3.11,3.12      13.3.0   24.10
 0.8.1   3.10,3.11,3.12      13.3.0   24.10
 ======= =================== ======== =========
 
diff --git a/doc/source/user_guide/best_practices.rst b/doc/source/user_guide/best_practices.rst
index 2ad6e9bdd..97b5cea0a 100644
--- a/doc/source/user_guide/best_practices.rst
+++ b/doc/source/user_guide/best_practices.rst
@@ -14,3 +14,4 @@ practices, and helps users solve some common problems.
    loading_data
    storage_backend
    chunking
+   operator_fusion
diff --git a/doc/source/user_guide/operator_fusion.rst b/doc/source/user_guide/operator_fusion.rst
new file mode 100644
index 000000000..2259cc059
--- /dev/null
+++ b/doc/source/user_guide/operator_fusion.rst
@@ -0,0 +1,34 @@
+.. _operator_fusion:
+
+===============
+Operator Fusion
+===============
+
+Xorbits implements operator fusion to reduce memory access overhead and improve computational efficiency.
+Fusion combines multiple adjacent operators into a single fused operator. Rather than implementing its own fusion
+engine from scratch, Xorbits leverages existing state-of-the-art fusion engines: NumExpr, JAX, or CuPy. Operator fusion
+is enabled automatically when one of these packages is installed in your Python environment. It is
+especially effective for ``xorbits.numpy``.
+
+How It Works
+------------
+
+The optimization process works as follows:
+
+1. Identifies sequences of operations that can be fused together.
+
+   Note that NumExpr, JAX, and CuPy are single-machine fusion engines, while Xorbits is a distributed toolkit,
+   so Xorbits checks which operations can be fused. For example, single-axis reductions (``len(op.axis) == 1``
+   for ``xorbits.numpy.sum()`` or ``xorbits.numpy.max()``) can be fused, while other reductions cannot.
+
+2. Groups compatible operations into a single fused operation.
+
+3. Executes the fused operation using the appropriate fusion engine (JAX, NumExpr, or CuPy).
+
+This optimization reduces:
+
+* Memory allocation/deallocation overhead
+* Data movement between operations
+
+Fusion is most effective for chains of element-wise operations and simple reductions,
+where memory bandwidth is often the bottleneck rather than computational intensity.
\ No newline at end of file
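
Illustrative sketch, not Xorbits' actual implementation: the first two steps described in operator_fusion.rst (identify fusible operators, then group them) might look roughly like the following for a linear chain of operators. The `Op` class, the `kind` labels, and the helpers `is_fusible` and `group_fusible` are hypothetical names introduced here for illustration; only the single-axis rule (`len(op.axis) == 1`) comes from the documentation text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Op:
    """Hypothetical operator descriptor; this is not a Xorbits class."""
    name: str
    kind: str  # "elementwise", "reduction", or anything else
    axis: Optional[Tuple[int, ...]] = None  # only meaningful for reductions


def is_fusible(op: Op) -> bool:
    # Assumption of this sketch: every element-wise operator is fusible.
    if op.kind == "elementwise":
        return True
    # Mirrors the documented rule: only single-axis reductions are fusible.
    if op.kind == "reduction":
        return op.axis is not None and len(op.axis) == 1
    return False


def group_fusible(ops: List[Op]) -> List[List[Op]]:
    """Split a linear chain into groups; each run of fusible ops is one group."""
    groups: List[List[Op]] = []
    run: List[Op] = []
    for op in ops:
        if is_fusible(op):
            run.append(op)
            continue
        if run:
            groups.append(run)
            run = []
        groups.append([op])  # a non-fusible operator stays on its own
    if run:
        groups.append(run)
    return groups


chain = [
    Op("add", "elementwise"),
    Op("mul", "elementwise"),
    Op("sum", "reduction", axis=(0,)),        # single axis: fusible
    Op("sum_all", "reduction", axis=(0, 1)),  # two axes: not fusible
    Op("sqrt", "elementwise"),
]
grouped = [[op.name for op in group] for group in group_fusible(chain)]
print(grouped)
```

In this toy chain the first three operators land in one group, the two-axis reduction is left alone, and the trailing `sqrt` forms its own group. A real optimizer works on an operator graph rather than a flat list, and would then hand each group to NumExpr, JAX, or CuPy for execution.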