Config plot libraries #705

Merged
merged 4 commits into from
Oct 31, 2023

Conversation

bsab
Contributor

@bsab bsab commented Oct 30, 2023

Summary by CodeRabbit

New Features:

  • Introduced support for multiple data visualization libraries including Matplotlib, Seaborn, and Plotly, enhancing the flexibility of data visualization.
  • Added a new optional field data_viz_library in the configuration, allowing users to specify their preferred visualization library.
  • Implemented a factory function to get the appropriate instance for a visualization library type, improving code modularity and maintainability.
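A minimal sketch of how an optional data_viz_library field might sit in a configuration object. The dataclass below is a stand-in for the real Config schema, and the helper name preferred_library and the literal "default" are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    """Stand-in for the real Config schema, reduced to the one new field."""

    # Optional: leaving it unset means the user expressed no preference.
    data_viz_library: Optional[str] = None


def preferred_library(config: Config, default: str = "default") -> str:
    """Return the configured library name, or `default` when none is set.

    Hypothetical helper, not part of pandasai.
    """
    return config.data_viz_library or default


# The preference is stated once in the config rather than in every prompt.
print(preferred_library(Config(data_viz_library="plotly")))
print(preferred_library(Config()))
```

Keeping the field optional means existing configurations keep working unchanged.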

Refactor:

  • Updated the __len__ method in the SmartDataFrame class for better code readability, without altering its functionality.

Tests:

  • Added new test cases for the visualization library feature, ensuring its robustness and reliability.

Style:

  • Added a new line {viz_library_type} in the prompt templates for better context and information.

sabatino.severino added 2 commits October 30, 2023 11:43
…lication settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly).

With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.
…ferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.
@coderabbitai
Contributor

coderabbitai bot commented Oct 30, 2023

Walkthrough

The changes introduced in this commit primarily focus on enhancing the data visualization capabilities of the pandasai library. New classes representing different visualization libraries (Matplotlib, Plotly, Seaborn) have been added, along with a factory function to instantiate the appropriate class. The configuration schema has been updated to include a new optional field for specifying the visualization library. Additionally, minor changes have been made to the test cases and some existing classes to accommodate these new features.

Changes

| File(s) | Summary |
|---|---|
| pandasai/assets/.../generate_python_code.tmpl | Added a new line {viz_library_type} for additional context. |
| pandasai/helpers/viz_library_types/__init__.py | Introduced a new factory function viz_lib_type_factory for visualization library type instantiation. |
| pandasai/helpers/viz_library_types/_viz_library_types.py, pandasai/helpers/viz_library_types/base.py | Added new classes for different visualization libraries (Matplotlib, Plotly, Seaborn) inheriting from the BaseVizLibraryType abstract class. Introduced a new enumeration VisualizationLibrary. |
| pandasai/schemas/df_config.py | Added a new optional field data_viz_library in the Config class. |
| pandasai/smart_dataframe/__init__.py, pandasai/smart_datalake/__init__.py | Updated classes to accommodate the new visualization library features. |
| tests/prompts/test_generate_python_code_prompt.py, tests/test_smartdataframe.py | Updated test cases to include the new visualization library features. |

Poem

🍂 As the leaves fall and the season changes, 🍁

Our code too evolves with new ranges. 📈

With colors of Matplotlib, Seaborn, and Plotly, 🎨

Our data now dances, lively and jolly. 💃🕺

So here's to the autumn of code, 🥂

As we hop down this exciting road. 🐇🛣️

Let's celebrate these changes, big and small, 🎉

In the grand scheme, they matter all. 🌐


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai help to get help.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.json

@bsab bsab marked this pull request as ready for review October 30, 2023 11:20
Contributor

@coderabbitai coderabbitai bot left a comment


Review Status

Actionable comments generated: 1

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between 36e307c and b9ddba2.
Files selected for processing (9)
  • pandasai/assets/prompt_templates/generate_python_code.tmpl (1 hunks)
  • pandasai/helpers/viz_library_types/__init__.py (1 hunks)
  • pandasai/helpers/viz_library_types/_viz_library_types.py (1 hunks)
  • pandasai/helpers/viz_library_types/base.py (1 hunks)
  • pandasai/schemas/df_config.py (2 hunks)
  • pandasai/smart_dataframe/__init__.py (1 hunks)
  • pandasai/smart_datalake/__init__.py (9 hunks)
  • tests/prompts/test_generate_python_code_prompt.py (6 hunks)
  • tests/test_smartdataframe.py (4 hunks)
Files skipped from review due to trivial changes (4)
  • pandasai/assets/prompt_templates/generate_python_code.tmpl
  • pandasai/helpers/viz_library_types/base.py
  • pandasai/schemas/df_config.py
  • pandasai/smart_dataframe/__init__.py
Additional comments: 22
pandasai/helpers/viz_library_types/__init__.py (1)
  • 1-63: The viz_lib_type_factory function is well implemented. It maps the string name of each visualization library to its class through a dictionary, a common pattern for factory functions. When an unknown viz_lib_type is passed, it logs a warning and falls back to MatplotlibVizLibraryType; the same default applies when no viz_lib_type is provided. It also logs a debug message naming the visualization library type that will be used, which helps with debugging.
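The behaviour described above can be sketched roughly as follows. The function and class names come from this PR, but the signature, the dictionary keys, the log messages, and the empty class bodies are assumptions for illustration; the merged code may differ.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class MatplotlibVizLibraryType:
    """Placeholder; the real class lives in _viz_library_types.py."""


class SeabornVizLibraryType:
    """Placeholder; the real class lives in _viz_library_types.py."""


class PlotlyVizLibraryType:
    """Placeholder; the real class lives in _viz_library_types.py."""


# Assumed string keys; the real mapping may use different names.
_VIZ_LIB_TYPES = {
    "matplotlib": MatplotlibVizLibraryType,
    "seaborn": SeabornVizLibraryType,
    "plotly": PlotlyVizLibraryType,
}


def viz_lib_type_factory(viz_lib_type: Optional[str] = None):
    """Return an instance for the requested library, defaulting to Matplotlib."""
    cls = _VIZ_LIB_TYPES.get(viz_lib_type) if viz_lib_type else None
    if viz_lib_type and cls is None:
        # Unknown name: warn, then fall through to the default below.
        logger.warning("Unknown viz library type %r, using default", viz_lib_type)
    if cls is None:
        cls = MatplotlibVizLibraryType
    logger.debug("Using viz library type %s", cls.__name__)
    return cls()
```

A dictionary lookup keeps the factory flat: supporting another library means adding one entry rather than another branch.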
tests/prompts/test_generate_python_code_prompt.py (7)
  • 26-47: The test cases are well defined and cover a variety of scenarios. The use of parameterized tests is a good practice as it allows for easy addition of new test cases in the future.

  • 50-63: The docstring provides a clear explanation of the test cases and the expected behavior. This is good for maintainability and understanding the purpose of the test.

  • 74-78: The variables are set correctly for the test. The use of set_var is a good practice as it allows for easy modification of the test setup.

  • 90-97: Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [80-97]

The expected prompt content is well defined and matches the expected output of the function. This is good for ensuring the function behaves as expected.

  • 123-129: The setup for the test_advanced_reasoning_prompt test is correct. The use of a fake LLM and a SmartDataframe is a good practice for isolating the test from external dependencies.

  • 137-140: The variables are set correctly for the test. The use of set_var is a good practice as it allows for easy modification of the test setup.

  • 152-159: Note: This review was outside of the patch, so it was mapped to the patch with the greatest overlap. Original lines [142-159]

The expected prompt content is well defined and matches the expected output of the function. This is good for ensuring the function behaves as expected.
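As an illustration of the parameterized style praised above, here is a small sketch using the standard library's unittest.subTest, which plays a role similar to pytest's parametrize. The render_prompt function and its wording are hypothetical stand-ins, not the prompt class exercised by the real tests.

```python
import unittest
from typing import Optional

# (configured library, text expected in the prompt); illustrative cases only.
CASES = [
    ("matplotlib", "matplotlib"),
    ("seaborn", "seaborn"),
    ("plotly", "plotly"),
    (None, "matplotlib"),  # unset falls back to the assumed default
]


def render_prompt(viz_library: Optional[str]) -> str:
    """Hypothetical stand-in for the prompt under test."""
    library = viz_library or "matplotlib"
    return f"Use the Python {library} library to create charts."


class TestPromptMentionsLibrary(unittest.TestCase):
    def test_each_library(self):
        for configured, expected in CASES:
            # Each case is reported separately if it fails.
            with self.subTest(configured=configured):
                self.assertIn(expected, render_prompt(configured))
```

Adding a new library to the test then amounts to appending one tuple to CASES.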

pandasai/helpers/viz_library_types/_viz_library_types.py (3)
  • 1-88: The code looks good overall. However, there's a minor issue with the template_hint property in the PlotlyVizLibraryType and SeabornVizLibraryType classes. The hint suggests using the 'agg' backend for rendering plots, which is specific to matplotlib and may not apply to Plotly or Seaborn. Please verify and update the hints accordingly.
  • 64-70:
-    @property
-    def template_hint(self):
-        return """When a user requests to create a chart, utilize the Python plotly 
-        library to generate high-quality graphics that will be saved 
-        directly to a file. 
-        If you import matplotlib use the 'agg' backend for rendering plots."""

+    @property
+    def template_hint(self):
+        return """When a user requests to create a chart, utilize the Python plotly 
+        library to generate high-quality graphics that will be saved 
+        directly to a file."""


  • 77-83:
-    @property
-    def template_hint(self):
-        return """When a user requests to create a chart, utilize the Python Seaborn 
-        library to generate high-quality graphics that will be saved 
-        directly to a file. 
-        If you import matplotlib use the 'agg' backend for rendering plots."""

+    @property
+    def template_hint(self):
+        return """When a user requests to create a chart, utilize the Python Seaborn 
+        library to generate high-quality graphics that will be saved 
+        directly to a file."""
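For context, a rough sketch of how an abstract base with a template_hint property could be laid out. BaseVizLibraryType, template_hint, and the concrete class names appear in this PR; the hint wording is abridged and everything else is an assumption. One point worth weighing alongside the suggestion above: Seaborn draws through Matplotlib, so a backend hint can still be relevant there, whereas Plotly does not render through Matplotlib. The sketch keeps the hint for Seaborn on that basis, which is a choice of this example and not necessarily what was merged.

```python
from abc import ABC, abstractmethod

# Abridged wording; the real hints are longer.
_AGG_HINT = " If you import matplotlib use the 'agg' backend for rendering plots."


class BaseVizLibraryType(ABC):
    @property
    @abstractmethod
    def template_hint(self) -> str:
        """Text injected into the prompt to steer chart generation."""


class MatplotlibVizLibraryType(BaseVizLibraryType):
    @property
    def template_hint(self) -> str:
        return "Use the Python matplotlib library to create charts." + _AGG_HINT


class SeabornVizLibraryType(BaseVizLibraryType):
    @property
    def template_hint(self) -> str:
        # Seaborn renders via matplotlib, so the backend hint is kept here.
        return "Use the Python seaborn library to create charts." + _AGG_HINT


class PlotlyVizLibraryType(BaseVizLibraryType):
    @property
    def template_hint(self) -> str:
        # Plotly has its own renderer; no matplotlib backend hint.
        return "Use the Python plotly library to create charts."
```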
pandasai/smart_datalake/__init__.py (7)
  • 28-34: The import of viz_lib_type_factory is correctly placed in the import section.

  • 46-52: The import of VisualizationLibrary is correctly placed in the import section.

  • 70-74: The _viz_lib attribute is correctly initialized to None.

  • 121-129: The _viz_lib attribute is correctly set based on the data_viz_library configuration.

  • 198-207: The _load_data_viz_library method is correctly called if the data_viz_library configuration is set.

  • 358-364: The viz_lib_type_factory is correctly used to initialize viz_lib_helper.

  • 375-379: The "viz_library_type" key is correctly added to the default_values dictionary.
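The last two points can be pictured with a small sketch: a hint string is placed under the "viz_library_type" key of a default_values dictionary and substituted into the {viz_library_type} placeholder of the prompt template. The template text, the render_prompt helper, and the sample values are hypothetical; only the key and placeholder names come from this PR.

```python
# Hypothetical, heavily shortened template; the real .tmpl file is longer.
TEMPLATE = (
    "You are provided with the following pandas DataFrames:\n"
    "{dataframes}\n\n"
    "{viz_library_type}\n\n"
    "Generate python code to answer the question."
)


def render_prompt(template: str, default_values: dict, **overrides) -> str:
    """Fill the template, letting explicit overrides win over the defaults."""
    values = {**default_values, **overrides}
    return template.format(**values)


default_values = {
    "dataframes": "<dataframe> ... </dataframe>",
    # In the PR this text would come from the selected library's template_hint.
    "viz_library_type": "Use the Python plotly library when a chart is requested.",
}

prompt = render_prompt(TEMPLATE, default_values)
print(prompt)
```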

tests/test_smartdataframe.py (4)
  • 28-32: The import statements are correctly placed and the imported modules are used in the test cases.

  • 222-226: This comment seems to be a part of a conversation or a prompt for a test case. However, it's not clear how it's used in the test. Please ensure that it's used correctly or remove it if it's not necessary.

  • 288-292: This comment is identical to the one in lines 222-226. If it's not used, consider removing it to avoid redundancy.

  • 1052-1129: The test case test_run_passing_viz_library_type is well written. It tests the behavior of the SmartDataframe class when a visualization library type is passed. It uses parameterized inputs to test different visualization library types and checks if the expected prompt matches the last prompt.

Comment on lines 222 to 240

        self._llm = llm

    def _load_data_viz_library(self, data_viz_library: str):
        """
        Load the appropriate instance for viz library type to use.

        Args:
            data_viz_library (enum): TODO

        Raises:
            TODO
        """

        self._data_viz_library = VisualizationLibrary.DEFAULT.value
        if data_viz_library in (item.value for item in VisualizationLibrary):
            self._data_viz_library = data_viz_library

    def add_middlewares(self, *middlewares: Optional[Middleware]):
Contributor


The _load_data_viz_library method correctly sets the _data_viz_library attribute based on the data_viz_library argument. However, the method's docstring needs to be updated to reflect its functionality and the possible exceptions it might raise.

def _load_data_viz_library(self, data_viz_library: str):
    """
    Load the appropriate instance for viz library type to use.

    Args:
        data_viz_library (str): The name of the visualization library to use.

    Raises:
        ValueError: If the provided visualization library is not supported.
    """
    self._data_viz_library = VisualizationLibrary.DEFAULT.value
    if data_viz_library in (item.value for item in VisualizationLibrary):
        self._data_viz_library = data_viz_library
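Note that the method body shown above never raises: an unsupported name silently falls back to the default, so a "Raises: ValueError" entry would only be accurate if the logic changed. The sketch below contrasts the two behaviours as standalone functions. The enum member names and values (other than DEFAULT, which appears in the snippet) and both function names are assumptions for illustration.

```python
from enum import Enum


class VisualizationLibrary(Enum):
    """Illustrative enum; members other than DEFAULT are assumed."""

    MATPLOTLIB = "matplotlib"
    SEABORN = "seaborn"
    PLOTLY = "plotly"
    DEFAULT = "default"


def resolve_with_fallback(name):
    """Mirror the reviewed code: unknown names fall back to the default."""
    valid = {item.value for item in VisualizationLibrary}
    return name if name in valid else VisualizationLibrary.DEFAULT.value


def resolve_strict(name):
    """Alternative that matches a 'Raises: ValueError' docstring."""
    try:
        return VisualizationLibrary(name).value
    except ValueError:
        raise ValueError(f"Unsupported visualization library: {name!r}") from None
```

Either behaviour is defensible; what matters is that the docstring describes the one the code actually implements.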

@gventuri gventuri merged commit d20204d into Sinaptik-AI:main Oct 31, 2023
9 checks passed
gventuri added a commit that referenced this pull request Nov 1, 2023
* feat(pipeline): Add pipeline to generate synthetic dataframe

* chore(pipeline): maintain documentation and other flows

* feat(Pipeline): test case for pipeline

* feat(cache): adding cache in pipeline context and fix leftovers

* chore(pipeline): rename and add dependency

* update poetry lock file

* refactor: minor changes from the code review

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: update pipeline_context.py

* chore: use PandasAI logger instead of default one

* refactor: prompt for synthetic data now accepts the amount params

* remove extra print statement

* 'Refactored by Sourcery' (#703)

Co-authored-by: Sourcery AI <>

* chore(pipeline): improve pipeline usage remove passing config to pipeline

* feat: config plot libraries (#705)

* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly).
With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.

* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.

* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests

---------

Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>

* build: use ruff for formatting

* feat: add add_message method to the agent

* Release v1.4.3

* feat: workspace env (#717)

* fix(chart): charts to save to save_chart_path

* refactor sourcery changes

* 'Refactored by Sourcery'

* refactor chart save code

* fix: minor leftovers

* feat(workspace_env): add workspace env to store cache, temp chart and config

* add error handling and comments

---------

Co-authored-by: Sourcery AI <>

---------

Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: Sab Severino <[email protected]>
gventuri added a commit that referenced this pull request Nov 7, 2023
* feat: config plot libraries (#705)

* In this commit, I introduced a new configuration parameter in our application settings that allows users to define their preferred data visualization library (matplotlib, seaborn, or plotly).
With this update, I've eliminated the need for the user to specify in every prompt which library to use, thereby simplifying their interaction with the application and increasing its versatility.

* This commit adds a configuration parameter for users to set their preferred data visualization library (matplotlib, seaborn, or plotly), simplifying interactions and enhancing the application's versatility.

* viz_library_type' in test_generate_python_code_prompt.py, resolved failing tests

---------

Co-authored-by: sabatino.severino <qrxqfspfibrth6nxywai2qifza6jmskt222howzew43risnx4kva>
Co-authored-by: Gabriele Venturi <[email protected]>

* build: use ruff for formatting

* feat: add add_message method to the agent

* Release v1.4.3

* feat: workspace env (#717)

* fix(chart): charts to save to save_chart_path

* refactor sourcery changes

* 'Refactored by Sourcery'

* refactor chart save code

* fix: minor leftovers

* feat(workspace_env): add workspace env to store cache, temp chart and config

* add error handling and comments

---------

Co-authored-by: Sourcery AI <>

* fix: hallucinations was plotting when not asked

* Release v1.4.4

* feat(sqlConnector): add direct config run sql at runtime

* feat(DirectSqlConnector): add sql test cases

* fix: minor leftovers

* fix(orders): check examples of different tables

* 'Refactored by Sourcery'

* chore(sqlprompt): add description only when we have it

---------

Co-authored-by: Sab Severino <[email protected]>
Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: Sourcery AI <>
Development

Successfully merging this pull request may close these issues.

Set the preferred visualization library
2 participants