Optimize memory usage for TSDataset._merge_exog #596

Merged
merged 9 commits into from
Feb 20, 2025


@brsnw250 brsnw250 commented Feb 3, 2025

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Performance graphs (before)

Time: (2 images)

Memory: (1 image)

Proposed Changes

  • Updated how exogenous variables are added to the dataset.
  • Reworked regressor checks.
  • Memory saved: 6.8G -> 5.1G (1.7G, 25%).
  • Memory consumption is more uniform across the timeline.
  • Speedup: 3:59 -> 0:22 (~11x).
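As a quick sanity check on the figures in the list above, the percentages can be recomputed from the raw numbers. This is only an arithmetic sketch: it assumes "G" means gigabytes and that the timings are in minutes:seconds, neither of which is stated explicitly; `to_seconds` is a hypothetical helper written for this check.

```python
# Figures quoted in the list above. Units are assumptions: GB and minutes:seconds.
mem_before_gb = 6.8
mem_after_gb = 5.1

saved_gb = mem_before_gb - mem_after_gb
saved_pct = saved_gb / mem_before_gb * 100


def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


time_before_s = to_seconds("3:59")  # 239 seconds
time_after_s = to_seconds("0:22")   # 22 seconds
speedup = time_before_s / time_after_s

print(f"saved {saved_gb:.1f} GB ({saved_pct:.0f}%), speedup {speedup:.1f}x")
# prints: saved 1.7 GB (25%), speedup 10.9x
```

The speedup works out to roughly 10.9x, which is consistent with the rounded "~11x" quoted above.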

Performance graphs (after)

Time: (2 images)

Memory: (1 image)

Code to reproduce

from etna.datasets import TSDataset, generate_ar_df

# Target series: 100,000 segments with 100 timestamps each.
df = generate_ar_df(n_segments=100_000, periods=100, start_time="2000-01-01")

# Exogenous data spans 110 timestamps, i.e. 10 beyond the end of the target.
df_exog = generate_ar_df(n_segments=100_000, periods=110, start_time="2000-01-01")
df_exog.rename(columns={"target": "exog_0"}, inplace=True)
df_exog["cap"] = 101

# Add nine more exogenous columns derived from exog_0.
for i in range(1, 10):
    df_exog[f"exog_{i}"] = df_exog["exog_0"].values + i

print("Creating dataset")
ts = TSDataset(df=df, df_exog=df_exog, known_future="all", freq="D")
print("Done.")
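To get comparable time and memory numbers from the reproduction script, the dataset construction can be wrapped in a small measuring helper. The sketch below is an assumption about how one might do this with the standard library only; it is not the tooling used to produce the graphs in this PR, and `measure` is a hypothetical helper name. Note that `tracemalloc` reports only allocations it can trace, so its peak may differ from process-level figures such as the 6.8G and 5.1G quoted above.

```python
import time
import tracemalloc
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure(fn: Callable[[], T]) -> Tuple[T, float, int]:
    """Run fn once and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn()
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


# Hypothetical usage with the reproduction script (requires etna, df and df_exog):
# ts, elapsed, peak = measure(
#     lambda: TSDataset(df=df, df_exog=df_exog, known_future="all", freq="D")
# )
# print(f"{elapsed:.1f} s, peak {peak / 2**30:.2f} GiB traced")
```

Running the helper once on the base commit and once on this branch would give a before/after pair, though tracing itself adds overhead, so absolute timings will be slower than an untraced run.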

Closing issues

closes #403

@brsnw250 brsnw250 self-assigned this Feb 3, 2025

github-actions bot commented Feb 3, 2025

🚀 Deployed on https://deploy-preview-596--etna-docs.netlify.app

@github-actions github-actions bot temporarily deployed to pull request February 3, 2025 12:31 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 4, 2025 07:53 Inactive

codecov bot commented Feb 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 90.42%. Comparing base (7508561) to head (36cb957).
Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #596   +/-   ##
=======================================
  Coverage   90.41%   90.42%           
=======================================
  Files         259      259           
  Lines       18022    18036   +14     
=======================================
+ Hits        16295    16309   +14     
  Misses       1727     1727           


@github-actions github-actions bot temporarily deployed to pull request February 17, 2025 09:59 Inactive
@d-a-bunin d-a-bunin self-requested a review February 18, 2025 07:51
@brsnw250 brsnw250 requested a review from d-a-bunin February 18, 2025 12:41
@github-actions github-actions bot temporarily deployed to pull request February 19, 2025 12:02 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 19, 2025 13:33 Inactive
CHANGELOG.md Outdated
@@ -45,7 +45,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Breaking:** Bump minimum `scipy` version to 1.12 ([#599](https://github.com/etna-team/etna/pull/599))
- **Breaking:** Bump minimum `optuna` version to 4.0 ([#599](https://github.com/etna-team/etna/pull/599))
- **Breaking:** Bump minimum `statsforecast` version to 2.0 ([#599](https://github.com/etna-team/etna/pull/599))
-
- Optimized performance of exogenous variables addition to the dataset ([#596](https://github.com/etna-team/etna/pull/596))

Optimized -> Optimize

@github-actions github-actions bot temporarily deployed to pull request February 19, 2025 15:14 Inactive
@github-actions github-actions bot temporarily deployed to pull request February 19, 2025 15:19 Inactive
@brsnw250 brsnw250 merged commit 797cfe3 into master Feb 20, 2025
14 checks passed
@d-a-bunin d-a-bunin deleted the issue-403 branch February 20, 2025 07:31