diff --git a/docs/refmans/gui/viselements/generic/chart.md_template b/docs/refmans/gui/viselements/generic/chart.md_template index d5d528271..d90c9e6ee 100644 --- a/docs/refmans/gui/viselements/generic/chart.md_template +++ b/docs/refmans/gui/viselements/generic/chart.md_template @@ -176,39 +176,44 @@ for more details. ## Large datasets -Displaying large datasets poses several challenges, both from a technical and user experience -standpoint. These challenges can impact performance, usability, and the overall effectiveness of -data presentation. +Displaying large datasets presents several challenges from both technical and user experience +perspectives. These challenges can impact performance, usability, and the overall effectiveness of +data visualization. The most prominent issues are: - Performance Issues
- - Loading Time: Large datasets can significantly increase the display time of your application, - leading to a poor user experience. - - Memory Usage: Keeping large amounts of data in memory can strain client devices, potentially - leading to browser crashes or sluggish performance. - - Network Performance: Transmitting large datasets over the network can be slow and expensive in - terms of bandwidth, especially for users on limited data plans or slow connections. + - Loading Time: Large datasets can significantly increase application load times, leading to a + poor user experience; + - Memory Usage: Storing large datasets in memory can strain client devices, potentially causing + browser crashes or slow performance; + - Network Performance: Transmitting large datasets over the network can be slow and + resource-intensive, especially for users with limited bandwidth or slow connections. - User Experience Challenges
- - Designing visualizations that scale well with the size of the data and remain informative and - actionable can be challenging. + Designing visualizations that effectively scale with large datasets while remaining informative + and actionable can be difficult. - Technical Limitations
- Browsers have limitations on how much data they can efficiently process and display, which can - restrict the amount of data that can be shown at one time. + Browsers have inherent limits on how much data they can efficiently process and display, + restricting the amount of data that can be shown at once. -The chart control provides different classes that can help deliver relevant representations of large -data sets by applying a technical method called *data decimation*.
-Data decimation involves reducing the volume of data to be sent to the browser and displayed without -significantly losing the usefulness of the original data. The goal is to make visualization more -efficient while retaining the essential characteristics of the data. +The chart control in Taipy GUI provides several classes that can help manage and visualize large +datasets by using a technique called *data decimation*.
+Data decimation reduces the amount of data sent to the browser and displayed without significantly +losing its value, making visualization more efficient while preserving essential characteristics. -The [`taipy.gui.data`](../../../../refmans/reference/pkg_taipy/pkg_gui/pkg_data/index.md) package -defines implementations of different decimation algorithms in classes that inherit the `Decimator^` -class. +The `taipy.gui.data^` package offers various implementations of decimation algorithms through +classes that inherit from the `Decimator^` class. -To use these algorithms and manage the challenges posed by representing large datasets, you must -instantiate the decimator class that best matches the dataset you are dealing with and set the -instance to the [`decimator`](#p-decimator) property of the chart control that represents the data. +To leverage these algorithms and handle large datasets efficiently: + +- Instantiate the decimator class that best suits your dataset. +- Assign the instance to the [`decimator`](#p-decimator) property of the chart control representing + the data. + +An [advanced example](charts/advanced.md#large-datasets) demonstrates when and how decimation can +be used.
+This documentation also contains an [article](../../../../tutorials/articles/decimator/index.md) +that provides details on this feature. ## The *rebuild* property {data-source="gui:doc/examples/charts/example_rebuild.py"} @@ -299,7 +304,8 @@ name to select the charts on your page and apply style. ## [Stylekit](../../../../userman/gui/styling/stylekit.md) support -The [Stylekit](../../../../userman/gui/styling/stylekit.md) provides a specific class that you can use to style charts: +The [Stylekit](../../../../userman/gui/styling/stylekit.md) provides a specific class that you can +use to style charts: * *has-background*
When the chart control uses the *has-background* class, the rendering of the chart diff --git a/docs/refmans/gui/viselements/generic/charts/advanced.md_template b/docs/refmans/gui/viselements/generic/charts/advanced.md_template index 858d4b7b7..847227caf 100644 --- a/docs/refmans/gui/viselements/generic/charts/advanced.md_template +++ b/docs/refmans/gui/viselements/generic/charts/advanced.md_template @@ -1,4 +1,3 @@ - # Advanced topics Taipy exposes advanced features that the Plotly library provides. This section @@ -90,6 +89,185 @@ Here is what the resulting plot looks like:
Unbalanced data sets
+## Large datasets {data-source="gui:doc/examples/charts/advanced_large_datasets.py"} + +When binding a chart component to a large dataset, performance becomes a critical consideration. +Large datasets can consume significant system resources, resulting in slower rendering times, +reduced responsiveness, and overall degraded performance, especially in interactive +applications.
+These issues can negatively impact the user experience, particularly when users need to interact +with or manipulate the chart. + +Consider a scenario where an application needs to visualize large datasets, allowing users to +trigger calculations and make decisions based on the data.
+Below is a basic example of a large dataset represented in Python: +```python +x_values = ... +y_values = ... +data = pd.DataFrame({"X": x_values, "Y": y_values}) +``` + +In this example, *x_values* could be a sequence of 50,000 integers, while *y_values* is generated +from a noisy log-sine function, also containing 50,000 samples. + +To represent this data using Taipy GUI, you can define a `chart` control as follows: +!!! taipy-element + default={data} + type=markers + x=X + y=Y + +The initial chart display with 50,000 data points will look like this: +
+ + +
Initial dataset
+
+ +As seen in the chart, with 50,000 data points, it becomes difficult to interpret the information due +to over-plotting. Furthermore, rendering such a large dataset takes significant time, and the entire +dataset must be transmitted to the frontend, further affecting performance. + +To improve both application performance and chart readability, it is necessary to reduce the number +of data points rendered. This can be achieved through techniques such as downsampling, aggregation, +or filtering, which can limit the volume of data without losing critical insights. + +

Solution 1: Linear interpolation

+ +One approach to handling large datasets is to apply linear interpolation. This technique reduces the +number of data points by approximating the values between points along a straight line, effectively +downsampling the dataset. + +Using Python and the [NumPy](https://numpy.org/) package, this solution can be implemented +easily and efficiently: +```python linenums="1" +x_values = x_values[::100] +y_values = y_values.reshape(-1, 100) +y_values = np.mean(y_values, axis=1) +``` + +- line 1: The original *x_values* array is reduced by selecting one value for every 100 points. +- line 2: The *y_values* array is reshaped, grouping every 100 consecutive data points. +- line 3: The mean of each group of 100 points is calculated, so each remaining point is the +average of 100 consecutive samples. + +This results in a dataset with 100 times fewer points, while still preserving the overall shape of +the original data. + +Here is the updated chart using the downsampled (interpolated) dataset: +
+ + +
Downsampled dataset
+
+ +While linear interpolation significantly reduces the dataset size and improves chart performance, +it has some limitations. Specifically, it smooths out the data, which can obscure important details, +such as sharp changes or high-frequency fluctuations.
+With this technique, the noise present in the original sine function has been smoothed away, which +might not be desirable for data analysts or scientists who need to observe finer data +characteristics. + +
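The smoothing effect described above can be checked numerically. The following is a minimal sketch and not part of the Taipy example: the synthetic signal, the noise level, and the names `rng`, `clean`, `noise`, `x_small`, and `y_small` are assumptions made for illustration. It applies the same reshape-and-mean reduction and compares the spread of the noise before and after.

```python
import numpy as np

# Hypothetical stand-in for the 50,000-sample dataset (the formula is an assumption).
rng = np.random.default_rng(0)
x_values = np.arange(1, 50_001)
clean = np.sin(np.log(x_values))
noise = rng.normal(scale=0.2, size=x_values.size)
y_values = clean + noise

# Same reduction as in Solution 1: one x value out of 100,
# and the mean of each block of 100 consecutive y values.
x_small = x_values[::100]
y_small = y_values.reshape(-1, 100).mean(axis=1)

# Apply the same averaging to the noise alone to see how much of it survives.
noise_small = noise.reshape(-1, 100).mean(axis=1)

print(x_small.shape, y_small.shape)
print(f"noise std before: {noise.std():.3f}, after: {noise_small.std():.3f}")
```

Averaging 100 independent noise samples shrinks their standard deviation by roughly a factor of 10, which is why the reduced chart looks much smoother than the original data.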

Solution 2: Sub-sampling

+ +Sub-sampling is a simple technique where a representative subset of the original data is selected, +for example, by picking every 100th data point. This directly reduces the number of points, +enhancing performance. + +With [NumPy](https://numpy.org/) and Python's slicing syntax, sub-sampling is straightforward: +```python +x_values = x_values[::100] +y_values = y_values[::100] +``` +Both *x_values* and *y_values* are reduced by selecting every 100th element from the original +arrays. + +As a result, the dataset *data* is reduced to 1/100th of its original size, while still retaining +the overall shape of the data. + +Here is the chart with the sub-sampled dataset: +
+ + +
Sub-sampled dataset
+
+ +However, sub-sampling has its limitations. It may skip over significant trends or abrupt changes in +the data, especially if key points are not selected. While it's a quick and efficient solution, it +may result in the loss of critical details in datasets with high-frequency variations or sudden +transitions.
+As you can see, only a few of the noisy data points remain after sampling, which is +expected. + +
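To make the risk of skipped features concrete, here is a small sketch under stated assumptions: the flat signal with a single spike, and the names `spike_index`, `y_sub`, and `y_block_max`, are hypothetical and chosen only for illustration. It shows that keeping every 100th sample can miss a narrow event entirely, whereas a per-block maximum of the same output size keeps it.

```python
import numpy as np

# Hypothetical signal: flat baseline with one narrow spike (illustrative only).
y_values = np.zeros(50_000)
spike_index = 12_345  # not a multiple of 100, so y_values[::100] never reads it
y_values[spike_index] = 10.0

# Sub-sampling as in Solution 2: keep every 100th sample.
y_sub = y_values[::100]

# Per-block maximum, shown only for comparison: same output size, spike retained.
y_block_max = y_values.reshape(-1, 100).max(axis=1)

print(y_values.max(), y_sub.max(), y_block_max.max())
```

Both reduced arrays hold 500 values, but only the per-block maximum still contains the spike. This is not a description of how any Taipy decimator is implemented; it only illustrates why the choice of which points to keep matters.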

Solution 3: Decimation

+ +Decimation is a more refined approach to reducing dataset size while preserving essential +information and trends. By selectively removing data points based on specific criteria, such as +frequency content or statistical significance, decimation balances performance with data integrity. + + +The `chart` control has a [*decimator*](../chart.md#p-decimator) property that accepts an instance +of a subclass of `Decimator^`. This class transforms the dataset declared in the +[*data*](../chart.md#p-data) property, reducing the number of points to be displayed +while ensuring that key data features are retained. + + +To use decimation, you must instantiate a decimator: +```python +decimator = MinMaxDecimator(200) +``` + +In this case, the decimator limits the displayed points to 200, which is an extreme reduction, but +it highlights the effect. For practical usage, typical values range from 1000 to 2000, depending on +the horizontal resolution of the screen: it's often unnecessary to render more points than a monitor +can display. + +Several decimator types are available, including `MinMaxDecimator^`, `LTTB^`, `RDP^`, and +`ScatterDecimator^`. Each implements a different algorithm, better suited to specific +shapes of data. + +To apply the decimator to the chart, assign the *decimator* variable to the +[*decimator*](../chart.md#p-decimator) property: +!!! taipy-element + default={data} + type=markers + x=X + y=Y + decimator={decimator} + +Here is the resulting chart after applying decimation: +
+ + +
Decimation applied
+
+ +The chart retains more of the dataset's characteristics compared to simpler methods like +sub-sampling or linear interpolation. The global sine-like shape is preserved, and the noise remains +visible.
+At the same time, performance is greatly improved: only 200 points are rendered instead of 50,000, +a reduction by a factor of 250. + +Note that decimation neither alters nor duplicates the original dataset.
+You can zoom in and out using the chart's built-in zoom tool to reveal or hide more details +dynamically. + +For example, the following image shows the chart zoomed in on a specific area: +
+ + +
Selecting an area to zoom in
+
+ +And here is the result after zooming in: +
+ + +
Chart after zoom
+
+ +As you zoom in, more details are revealed, while still adhering to the 200-point limit, ensuring +smooth performance and responsive interactions. + ## Adding annotations {data-source="gui:doc/examples/charts/advanced_annotations.py"} You can add text annotations on top of a chart using the *annotations* property of diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png new file mode 100644 index 000000000..343c152d5 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png new file mode 100644 index 000000000..8c4b104cf Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_average-l.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png new file mode 100644 index 000000000..4d24d6dc0 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png new file mode 100644 index 000000000..86dd2fa09 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-l.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png new file mode 100644 index 000000000..5bf2899ea Binary files /dev/null and 
b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png new file mode 100644 index 000000000..adf1c50c1 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom1-l.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png new file mode 100644 index 000000000..ff82c7435 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png new file mode 100644 index 000000000..42c6f229a Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_decimation-zoom2-l.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png new file mode 100644 index 000000000..69f45b9ba Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png new file mode 100644 index 000000000..df5242da1 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_none-l.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png 
b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png new file mode 100644 index 000000000..d0ee2ea44 Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-d.png differ diff --git a/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png new file mode 100644 index 000000000..5d234e90f Binary files /dev/null and b/docs/refmans/gui/viselements/generic/charts/advanced_large_datasets_subsampling-l.png differ diff --git a/docs/userman/gui/pages/builder.md b/docs/userman/gui/pages/builder.md index d072c3531..ff56365c4 100644 --- a/docs/userman/gui/pages/builder.md +++ b/docs/userman/gui/pages/builder.md @@ -2,16 +2,13 @@ title: The Page Builder API --- -The Page Builder API is a set of classes located in the -[`taipy.gui.builder`](../../../refmans/reference/pkg_taipy/pkg_gui/pkg_builder/index.md) package -that lets users create Taipy GUI pages entirely from Python code. +The Page Builder API is a set of classes located in the `taipy.gui.builder^` package that lets users +create Taipy GUI pages entirely from Python code. This package contains a class for every visual element available in Taipy, including those defined in [extension libraries](../extension/index.md). -To access the Page Builder classes, you must import the -[`taipy.gui.builder`](../../../refmans/reference/pkg_taipy/pkg_gui/pkg_builder/index.md) package in -your script. +To access the Page Builder classes, you must import the `taipy.gui.builder^` package in your script. 
# Generating a new page diff --git a/tools/_setup_generation/step_viselements.py b/tools/_setup_generation/step_viselements.py index 2e0418e05..173328573 100644 --- a/tools/_setup_generation/step_viselements.py +++ b/tools/_setup_generation/step_viselements.py @@ -128,7 +128,7 @@ def __generate_toc_file(self, tocs: Dict[str, VEToc]): md_file.write(md_template) @staticmethod - def __get_navigation_section(category: str, prefix:str) -> str: + def __get_navigation_section(category: str, prefix: str) -> str: if category == "blocks": return "Blocks" if prefix == "core_": @@ -238,8 +238,8 @@ def __generate_element_doc(self, element_type: str, category: str): raise ValueError( f"Couldn't locate first header in documentation for element '{element_type}'" ) - before_properties = match.group(1) - after_properties = match.group(2) + element_documentation[match.end() :] + before_properties = match[1] + after_properties = match[2] + element_documentation[match.end() :] # Chart hook if element_type == "chart": @@ -288,7 +288,6 @@ def __generate_builder_api(self) -> None: py_content = py_content[: m.start(0) + 1] def generate(self, category, base_class: str) -> str: - element_types = self.categories[category] def build_doc(property: str, desc, indent: int): @@ -382,16 +381,17 @@ def __init__(self, [arguments]) -> None: element_md_location = ( "corelements" if desc["prefix"] == "core_" else "generic" ) - if m := (re.search - (r"(\[`(\w+)`\]\()\2\.md\)", short_doc)): + if m := (re.search(r"(\[`(\w+)`\]\()\2\.md\)", short_doc)): short_doc = ( short_doc[: m.start()] + f"{m[1]}../../../../../refmans/gui/viselements/{element_md_location}/{m[2]}.md)" + short_doc[m.end() :] ) - element_md_page = (f"[`{element_type}`](../../../../../../refmans/gui/viselements/{element_md_location}" - f"/{element_type}.md)") + element_md_page = ( + f"[`{element_type}`](../../../../../../refmans/gui/viselements/{element_md_location}" + f"/{element_type}.md)" + ) buffer.write( 
template.replace("[element_type]", element_type) .replace("[element_md_page]", element_md_page) @@ -414,7 +414,9 @@ def __init__(self, [arguments]) -> None: # Special case for charts: we want to insert the chart gallery that # is stored in the file whose path is in self.charts_home_html_path - # This should be inserted before the first level 1 header + # This should be inserted before the first header. + # Simultaneously, we build a list of chart types to point to type pages as text. + # This should be inserted before the "Styling" header. def __chart_page_hook( self, element_documentation: str, before: str, after: str, charts_md_dir: str ) -> tuple[str, str]: @@ -432,12 +434,12 @@ def __chart_page_hook( chart_gallery = "\n" + chart_gallery[match.end() :] SECTION_RE = re.compile(r"^([\w-]+):(.*)$") chart_sections = "" - for line in match.group(1).splitlines(): + for line in match[1].splitlines(): if match := SECTION_RE.match(line): type = match.group(1) - chart_sections += f"- [{match.group(2)}](charts/{type}.md)\n" + chart_sections += f"\n- [{match.group(2)}](charts/{type}.md)" + # Generate chart type documentation page from template, if possible template_doc_path = f"{charts_md_dir}/{type}.md_template" - # Generate chart type documentation page if possible if os.access(template_doc_path, os.R_OK): with open(template_doc_path, "r") as template_doc_file: documentation = template_doc_file.read() @@ -454,9 +456,16 @@ def __chart_page_hook( raise ValueError( "Couldn't locate first header1 in documentation for element 'chart'" ) + styling_match = re.search( + r"\n# Styling\n", after, re.MULTILINE | re.DOTALL + ) + if not styling_match: + raise ValueError( + "Couldn't locate \"Styling\" header1 in documentation for element 'chart'" + ) return ( - match.group(1) + chart_gallery + before[match.end() :], - after + chart_sections, + match[1] + chart_gallery + before[match.end() :], + after[: styling_match.start()] + chart_sections + "\n\n" + after[styling_match.start() :] ) 
def __process_element_md_file(self, type: str, documentation: str) -> str: diff --git a/tools/postprocess.py b/tools/postprocess.py index 2edbc7098..e11fc6864 100644 --- a/tools/postprocess.py +++ b/tools/postprocess.py @@ -199,8 +199,10 @@ def on_post_build(env): raise e # Remove useless spaces for improved processing - html_content = re.sub(r"[ \t]+", " ", re.sub(r"\n\s*\n+", "\n\n", html_content)) - html_content = html_content.replace("\n\n", "\n") + # This breaks the code blocks - so needs to avoid the
 elements before
+                    # we bring it back.
+                    #html_content = re.sub(r"[ \t]+", " ", re.sub(r"\n\s*\n+", "\n\n", html_content))
+                    #html_content = html_content.replace("\n\n", "\n")
 
                     html_content = html_content.replace(
                         '