diff --git a/docs/refmans/gui/viselements/generic/chart.md_template b/docs/refmans/gui/viselements/generic/chart.md_template
index d5d528271..d90c9e6ee 100644
--- a/docs/refmans/gui/viselements/generic/chart.md_template
+++ b/docs/refmans/gui/viselements/generic/chart.md_template
@@ -176,39 +176,44 @@ for more details.
## Large datasets
-Displaying large datasets poses several challenges, both from a technical and user experience
-standpoint. These challenges can impact performance, usability, and the overall effectiveness of
-data presentation.
+Displaying large datasets presents several challenges from both technical and user experience
+perspectives. These challenges can impact performance, usability, and the overall effectiveness of
+data visualization.
The most prominent issues are:
- Performance Issues
- - Loading Time: Large datasets can significantly increase the display time of your application,
- leading to a poor user experience.
- - Memory Usage: Keeping large amounts of data in memory can strain client devices, potentially
- leading to browser crashes or sluggish performance.
- - Network Performance: Transmitting large datasets over the network can be slow and expensive in
- terms of bandwidth, especially for users on limited data plans or slow connections.
+ - Loading Time: Large datasets can significantly increase application load times, leading to a
+ poor user experience;
+ - Memory Usage: Storing large datasets in memory can strain client devices, potentially causing
+ browser crashes or slow performance;
+ - Network Performance: Transmitting large datasets over the network can be slow and
+ resource-intensive, especially for users with limited bandwidth or slow connections.
- User Experience Challenges
- - Designing visualizations that scale well with the size of the data and remain informative and
- actionable can be challenging.
+ Designing visualizations that effectively scale with large datasets while remaining informative
+ and actionable can be difficult.
- Technical Limitations
- Browsers have limitations on how much data they can efficiently process and display, which can
- restrict the amount of data that can be shown at one time.
+ Browsers have inherent limits on how much data they can efficiently process and display,
+ restricting the amount of data that can be shown at once.
-The chart control provides different classes that can help deliver relevant representations of large
-data sets by applying a technical method called *data decimation*.
-Data decimation involves reducing the volume of data to be sent to the browser and displayed without
-significantly losing the usefulness of the original data. The goal is to make visualization more
-efficient while retaining the essential characteristics of the data.
+The chart control in Taipy GUI provides several classes that can help manage and visualize large
+datasets by using a technique called *data decimation*.
+Data decimation reduces the amount of data sent to the browser and displayed without significantly
+losing its value, making visualization more efficient while preserving essential characteristics.
-The [`taipy.gui.data`](../../../../refmans/reference/pkg_taipy/pkg_gui/pkg_data/index.md) package
-defines implementations of different decimation algorithms in classes that inherit the `Decimator^`
-class.
+The `taipy.gui.data^` package offers various implementations of decimation algorithms through
+classes that inherit from the `Decimator^` class.
-To use these algorithms and manage the challenges posed by representing large datasets, you must
-instantiate the decimator class that best matches the dataset you are dealing with and set the
-instance to the [`decimator`](#p-decimator) property of the chart control that represents the data.
+To leverage these algorithms and handle large datasets efficiently:
+
+- Instantiate the decimator class that best suits your dataset.
+- Assign the instance to the [`decimator`](#p-decimator) property of the chart control representing
+ the data.
+
+An [advanced example](charts/advanced.md#large-datasets) demonstrates when and how decimation can
+be used.
+This documentation also contains an [article](../../../../tutorials/articles/decimator/index.md)
+that provides details on this feature.
## The *rebuild* property {data-source="gui:doc/examples/charts/example_rebuild.py"}
@@ -299,7 +304,8 @@ name to select the charts on your page and apply style.
## [Stylekit](../../../../userman/gui/styling/stylekit.md) support
-The [Stylekit](../../../../userman/gui/styling/stylekit.md) provides a specific class that you can use to style charts:
+The [Stylekit](../../../../userman/gui/styling/stylekit.md) provides a specific class that you can
+use to style charts:
* *has-background*
When the chart control uses the *has-background* class, the rendering of the chart
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced.md_template b/docs/refmans/gui/viselements/generic/charts/advanced.md_template
index 858d4b7b7..847227caf 100644
--- a/docs/refmans/gui/viselements/generic/charts/advanced.md_template
+++ b/docs/refmans/gui/viselements/generic/charts/advanced.md_template
@@ -1,4 +1,3 @@
-
# Advanced topics
Taipy exposes advanced features that the Plotly library provides. This section
@@ -90,6 +89,185 @@ Here is what the resulting plot looks like:
Unbalanced data sets
+## Large datasets {data-source="gui:doc/examples/charts/advanced_large_datasets.py"}
+
+When binding a chart component to a large dataset, performance becomes a critical consideration.
+Large datasets can consume significant system resources, resulting in slower rendering times,
+reduced responsiveness, and overall degraded performance, especially in interactive
+applications.
+These issues can negatively impact the user experience, particularly when users need to interact
+with or manipulate the chart.
+
+Consider a scenario where an application needs to visualize large datasets, allowing users to
+trigger calculations and make decisions based on the data.
+Below is a basic example of a large dataset represented in Python:
+```python
+x_values = ...
+y_values = ...
+data = pd.DataFrame({"X": x_values, "Y": y_values})
+```
+
+In this example, *x_values* could be a sequence of 50,000 integers, while *y_values* is generated
+from a noisy log-sine function, also containing 50,000 samples.
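
As a minimal sketch, the two arrays could be generated as shown below. The exact formula is an assumption: "log-sine" is read here as the sine of the logarithm, and the noise level (`scale=0.1`) and the seed are arbitrary illustration values, not taken from the example script.

```python
import numpy as np

n_samples = 50_000
rng = np.random.default_rng(seed=0)  # fixed seed, only for reproducibility

# 50,000 consecutive integers, starting at 1 so the logarithm is defined
x_values = np.arange(1, n_samples + 1)

# Assumed shape of the signal: sin(log(x)) plus Gaussian noise
y_values = np.sin(np.log(x_values)) + rng.normal(scale=0.1, size=n_samples)
```

These arrays can then be passed to `pd.DataFrame` as in the snippet above.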
+
+To represent this data using Taipy GUI, you can define a `chart` control as follows:
+!!! taipy-element
+ default={data}
+ type=markers
+ x=X
+ y=Y
+
+The initial chart display with 50,000 data points will look like this:
+
+
+
+ Initial dataset
+
+
+As seen in the chart, with 50,000 data points, it becomes difficult to interpret the information due
+to over-plotting. Furthermore, rendering such a large dataset takes significant time, and the entire
+dataset must be transmitted to the frontend, further affecting performance.
+
+To improve both application performance and chart readability, it is necessary to reduce the number
+of data points rendered. This can be achieved through techniques such as downsampling, aggregation,
+or filtering, which can limit the volume of data without losing critical insights.
+
+
Solution 1: Linear interpolation
+
+One approach to handle large datasets is to apply linear interpolation. This technique reduces the
+number of data points by approximating the values between points along a straight line, effectively
+downsampling the dataset.
+
+Using Python and the [NumPy](https://numpy.org/) package, this solution can be implemented
+easily and efficiently:
+```python linenums="1"
+x_values = x_values[::100]
+y_values = y_values.reshape(-1, 100)
+y_values = np.mean(y_values, axis=1)
+```
+
+- line 1: The original *x_values* array is reduced by selecting one value for every 100 points.
+- line 2: The *y_values* array is reshaped, grouping every 100 consecutive data points.
+- line 3: The mean of each group of 100 points is calculated, resulting in a smaller dataset.
+
+This results in a dataset with 100 times fewer points, while still preserving the overall shape of
+the original data.
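
Note that `reshape(-1, 100)` only works when the array length is a multiple of 100, and that the reshape-and-mean step is, strictly speaking, a block average. The sketch below wraps the same steps in a hypothetical helper, `block_mean()` (not part of Taipy or NumPy), that trims trailing samples that do not fill a complete group:

```python
import numpy as np

def block_mean(values, factor):
    """Average consecutive groups of `factor` samples (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Drop trailing samples that do not fill a complete group
    usable = (values.size // factor) * factor
    return values[:usable].reshape(-1, factor).mean(axis=1)

# Stand-in data: 50,000 samples reduced by a factor of 100
y_values = np.arange(50_000, dtype=float)
y_reduced = block_mean(y_values, 100)
x_reduced = np.arange(50_000)[::100]
```

Both reduced arrays hold 500 values, so they can be paired in a new `DataFrame`.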
+
+Here is the updated chart using the downsampled (interpolated) dataset:
+
+
+
+ Downsampled dataset
+
+
+While linear interpolation significantly reduces the dataset size and improves chart performance,
+it has some limitations. Specifically, it smooths out the data, which can obscure important details,
+such as sharp changes or high-frequency fluctuations.
+With this technique, the noise present in the original sine function has been smoothed away, which
+might not be desirable for data analysts or scientists who need to observe finer data
+characteristics.
+
+
Solution 2: Sub-sampling
+
+Sub-sampling is a simple technique where a representative subset of the original data is selected,
+for example, by picking every 100th data point. This directly reduces the number of points,
+enhancing performance.
+
+With [NumPy](https://numpy.org/) and Python's slicing syntax, sub-sampling is straightforward:
+```python
+x_values = x_values[::100]
+y_values = y_values[::100]
+```
+Both *x_values* and *y_values* are reduced by selecting every 100th element from the original
+arrays.
+
+As a result, the dataset *data* is reduced to 1/100th of its original size, while still retaining
+the overall shape of the data.
+
+Here is the chart with the sub-sampled dataset:
+
+
+
+ Sub-sampled dataset
+
+
+However, sub-sampling has its limitations. It may skip over significant trends or abrupt changes in
+the data, especially if key points are not selected. While it's a quick and efficient solution, it
+may result in the loss of critical details in datasets with high-frequency variations or sudden
+transitions.
+As you can see, only a small number of the noisy data points remain after the sampling, which is
+expected.
+
+
Solution 3: Decimation
+
+Decimation is a more refined approach to reducing dataset size while preserving essential
+information and trends. By selectively removing data points based on specific criteria, such as
+frequency content or statistical significance, decimation balances performance with data integrity.
+
+
+The `chart` control has a [*decimator*](../chart.md#p-decimator) property that accepts an instance
+of a subclass of `Decimator^`. This class transforms the dataset declared in the
+[*data*](../chart.md#p-data) property, reducing the number of points to be displayed
+while ensuring that key data features are retained.
+
+
+To use decimation, you must instantiate a decimator:
+```python
+decimator = MinMaxDecimator(200)
+```
+
+In this case, the decimator limits the displayed points to 200, which is an extreme reduction, but
+it highlights the effect. For practical usage, typical values range from 1000 to 2000, depending on
+the horizontal resolution of the screen: it's often unnecessary to render more points than a monitor
+can display.
+
+Several decimator types are available, including `MinMaxDecimator^`, `LTTB^`, `RDP^`, and
+`ScatterDecimator^`. Each of these implements different algorithms, better suited for specific
+shapes of data.
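
To build intuition for what a min/max strategy does, here is a minimal NumPy sketch. It is not Taipy's implementation of `MinMaxDecimator^`: the function name `min_max_reduce()` and its bucketing scheme are assumptions made for illustration. The data is split into buckets and only the lowest and highest sample of each bucket is kept, so extremes (and therefore noise amplitude) survive the reduction.

```python
import numpy as np

def min_max_reduce(x, y, n_out):
    """Keep the min and max sample of each bucket (illustrative only)."""
    x = np.asarray(x)
    y = np.asarray(y)
    n_buckets = max(n_out // 2, 1)  # two points kept per bucket
    edges = np.linspace(0, y.size, n_buckets + 1, dtype=int)
    keep = []
    for start, stop in zip(edges[:-1], edges[1:]):
        if stop <= start:  # skip empty buckets
            continue
        bucket = y[start:stop]
        keep.append(start + int(np.argmin(bucket)))
        keep.append(start + int(np.argmax(bucket)))
    keep = np.unique(keep)  # sorted, without duplicates
    return x[keep], y[keep]

rng = np.random.default_rng(seed=0)
x = np.arange(1, 50_001)
y = np.sin(np.log(x)) + rng.normal(scale=0.1, size=x.size)
x_small, y_small = min_max_reduce(x, y, 200)
```

Unlike averaging or plain slicing, the global minimum and maximum of the original data are always part of the reduced set.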
+
+To apply the decimator to the chart, set the [*decimator*](../chart.md#p-decimator) property to
+the *decimator* variable:
+!!! taipy-element
+ default={data}
+ type=markers
+ x=X
+ y=Y
+ decimator={decimator}
+
+Here is the resulting chart after applying decimation:
+
+
+
+ Decimation applied
+
+
+The chart retains more of the dataset's characteristics compared to simpler methods like
+sub-sampling or linear interpolation. The global sine-like shape is preserved, and the noise remains
+visible.
+At the same time, performance is greatly improved: only 200 points are rendered instead of 50,000,
+a reduction by a factor of 250.
+
+Note that decimation neither alters nor duplicates the original dataset.
+You can zoom in and out using the chart's built-in zoom tool to reveal or hide more details
+dynamically.
+
+For example, the following image shows the chart zoomed in on a specific area:
+
+
+
+ Selecting an area to zoom in
+
+
+And here is the result after zooming in:
+
+
+
+ Chart after zoom
+
+
+As you zoom in, more details are revealed, while still adhering to the 200-point limit, ensuring
+smooth performance and responsive interactions.
+
## Adding annotations {data-source="gui:doc/examples/charts/advanced_annotations.py"}
You can add text annotations on top of a chart using the *annotations* property of
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png
new file mode 100644
index 000000000..343c152d5
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png
new file mode 100644
index 000000000..8c4b104cf
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png
new file mode 100644
index 000000000..4d24d6dc0
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png
new file mode 100644
index 000000000..86dd2fa09
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png
new file mode 100644
index 000000000..5bf2899ea
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png
new file mode 100644
index 000000000..adf1c50c1
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png
new file mode 100644
index 000000000..ff82c7435
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png
new file mode 100644
index 000000000..42c6f229a
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png
new file mode 100644
index 000000000..69f45b9ba
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png
new file mode 100644
index 000000000..df5242da1
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png
new file mode 100644
index 000000000..d0ee2ea44
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png differ
diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png
new file mode 100644
index 000000000..5d234e90f
Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png differ
diff --git a/docs/userman/gui/pages/builder.md b/docs/userman/gui/pages/builder.md
index d072c3531..ff56365c4 100644
--- a/docs/userman/gui/pages/builder.md
+++ b/docs/userman/gui/pages/builder.md
@@ -2,16 +2,13 @@
title: The Page Builder API
---
-The Page Builder API is a set of classes located in the
-[`taipy.gui.builder`](../../../refmans/reference/pkg_taipy/pkg_gui/pkg_builder/index.md) package
-that lets users create Taipy GUI pages entirely from Python code.
+The Page Builder API is a set of classes located in the `taipy.gui.builder^` package that lets users
+create Taipy GUI pages entirely from Python code.
This package contains a class for every visual element available in Taipy, including those
defined in [extension libraries](../extension/index.md).
-To access the Page Builder classes, you must import the
-[`taipy.gui.builder`](../../../refmans/reference/pkg_taipy/pkg_gui/pkg_builder/index.md) package in
-your script.
+To access the Page Builder classes, you must import the `taipy.gui.builder^` package in your script.
# Generating a new page
diff --git a/tools/_setup_generation/step_viselements.py b/tools/_setup_generation/step_viselements.py
index 2e0418e05..173328573 100644
--- a/tools/_setup_generation/step_viselements.py
+++ b/tools/_setup_generation/step_viselements.py
@@ -128,7 +128,7 @@ def __generate_toc_file(self, tocs: Dict[str, VEToc]):
md_file.write(md_template)
@staticmethod
- def __get_navigation_section(category: str, prefix:str) -> str:
+ def __get_navigation_section(category: str, prefix: str) -> str:
if category == "blocks":
return "Blocks"
if prefix == "core_":
@@ -238,8 +238,8 @@ def __generate_element_doc(self, element_type: str, category: str):
raise ValueError(
f"Couldn't locate first header in documentation for element '{element_type}'"
)
- before_properties = match.group(1)
- after_properties = match.group(2) + element_documentation[match.end() :]
+ before_properties = match[1]
+ after_properties = match[2] + element_documentation[match.end() :]
# Chart hook
if element_type == "chart":
@@ -288,7 +288,6 @@ def __generate_builder_api(self) -> None:
py_content = py_content[: m.start(0) + 1]
def generate(self, category, base_class: str) -> str:
-
element_types = self.categories[category]
def build_doc(property: str, desc, indent: int):
@@ -382,16 +381,17 @@ def __init__(self, [arguments]) -> None:
element_md_location = (
"corelements" if desc["prefix"] == "core_" else "generic"
)
- if m := (re.search
- (r"(\[`(\w+)`\]\()\2\.md\)", short_doc)):
+ if m := (re.search(r"(\[`(\w+)`\]\()\2\.md\)", short_doc)):
short_doc = (
short_doc[: m.start()]
+ f"{m[1]}../../../../../refmans/gui/viselements/{element_md_location}/{m[2]}.md)"
+ short_doc[m.end() :]
)
- element_md_page = (f"[`{element_type}`](../../../../../../refmans/gui/viselements/{element_md_location}"
- f"/{element_type}.md)")
+ element_md_page = (
+ f"[`{element_type}`](../../../../../../refmans/gui/viselements/{element_md_location}"
+ f"/{element_type}.md)"
+ )
buffer.write(
template.replace("[element_type]", element_type)
.replace("[element_md_page]", element_md_page)
@@ -414,7 +414,9 @@ def __init__(self, [arguments]) -> None:
# Special case for charts: we want to insert the chart gallery that
# is stored in the file whose path is in self.charts_home_html_path
- # This should be inserted before the first level 1 header
+ # The gallery is inserted before the first header.
+ # At the same time, we build a text list of chart types that links to the type pages.
+ # That list is inserted before the "Styling" header.
def __chart_page_hook(
self, element_documentation: str, before: str, after: str, charts_md_dir: str
) -> tuple[str, str]:
@@ -432,12 +434,12 @@ def __chart_page_hook(
chart_gallery = "\n" + chart_gallery[match.end() :]
SECTION_RE = re.compile(r"^([\w-]+):(.*)$")
chart_sections = ""
- for line in match.group(1).splitlines():
+ for line in match[1].splitlines():
if match := SECTION_RE.match(line):
type = match.group(1)
- chart_sections += f"- [{match.group(2)}](charts/{type}.md)\n"
+ chart_sections += f"\n- [{match.group(2)}](charts/{type}.md)"
+ # Generate chart type documentation page from template, if possible
template_doc_path = f"{charts_md_dir}/{type}.md_template"
- # Generate chart type documentation page if possible
if os.access(template_doc_path, os.R_OK):
with open(template_doc_path, "r") as template_doc_file:
documentation = template_doc_file.read()
@@ -454,9 +456,16 @@ def __chart_page_hook(
raise ValueError(
"Couldn't locate first header1 in documentation for element 'chart'"
)
+ styling_match = re.search(
+ r"\n# Styling\n", after, re.MULTILINE | re.DOTALL
+ )
+ if not styling_match:
+ raise ValueError(
+ "Couldn't locate \"Styling\" header1 in documentation for element 'chart'"
+ )
return (
- match.group(1) + chart_gallery + before[match.end() :],
- after + chart_sections,
+ match[1] + chart_gallery + before[match.end() :],
+ after[: styling_match.start()] + chart_sections + "\n\n" + after[styling_match.start() :]
)
def __process_element_md_file(self, type: str, documentation: str) -> str:
diff --git a/tools/postprocess.py b/tools/postprocess.py
index 2edbc7098..e11fc6864 100644
--- a/tools/postprocess.py
+++ b/tools/postprocess.py
@@ -199,8 +199,10 @@ def on_post_build(env):
raise e
# Remove useless spaces for improved processing
- html_content = re.sub(r"[ \t]+", " ", re.sub(r"\n\s*\n+", "\n\n", html_content))
- html_content = html_content.replace("\n\n", "\n")
+ # This breaks the code blocks, so it needs to skip the <pre> elements before
+ # we bring it back.
+ #html_content = re.sub(r"[ \t]+", " ", re.sub(r"\n\s*\n+", "\n\n", html_content))
+ #html_content = html_content.replace("\n\n", "\n")
html_content = html_content.replace(
'