
[Core] predict_insample with multivariate models gives NaN values on last input_size #1269

Closed
carusyte opened this issue Feb 19, 2025 · 6 comments

@carusyte
What happened + What you expected to happen

It's unclear what the expected result of predict_insample should look like. In previous versions, it returned predictions for all of the training data provided during fit. However, using the latest commit from the main branch, the predictions for the last input_size samples are NaN instead.

Versions / Dependencies

Main branch with commit 939056c

Reproduction script

import pandas as pd
import numpy as np
import logging
import torch
from neuralforecast.core import NeuralForecast
from neuralforecast.models import TSMixerx

# Prep dummy data
start_date = "2024-06-01"
end_date = "2025-02-19"
date_range = pd.date_range(start=start_date, end=end_date, freq="B")
np.random.seed(0)
df = pd.DataFrame(
    {
        "unique_id": "dummy",
        "ds": date_range,
        "y": np.random.randn(len(date_range)),
        "val1": np.random.randn(len(date_range)),
        "val2": np.random.rand(len(date_range)),
    }
)

logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)
torch.set_float32_matmul_precision("medium")

horizon = 10
input_size = 30
val_size = 50
models = [
    TSMixerx(
        h=horizon,
        input_size=input_size,
        n_series=1,
        max_steps=100,
        random_seed=0,
        hist_exog_list=["val1", "val2"],
    ),
]

nf = NeuralForecast(
    models=models,
    freq="B",
    local_scaler_type="robust",
)
nf.fit(df=df, val_size=val_size)

Y_hat_insample = nf.predict_insample(step_size=horizon)
# Y_hat_insample = nf.predict_insample()

print(Y_hat_insample)
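To illustrate the symptom without rerunning the model, a check like the following counts the NaN rows at the tail of the in-sample predictions. Note this is a sketch only: `Y_hat_insample` here is a synthetic stand-in frame mimicking the reported output, not the real result of nf.predict_insample().

```python
import numpy as np
import pandas as pd

# Stand-in for the real predict_insample output: 100 rows whose last
# 30 (= input_size) predictions are NaN, mimicking the reported bug.
input_size = 30
np.random.seed(0)
Y_hat_insample = pd.DataFrame(
    {"TSMixerx": np.concatenate([np.random.randn(70), np.full(input_size, np.nan)])}
)

# Count trailing NaNs: reverse the column, then count rows before the
# first non-NaN value appears.
trailing_nan = Y_hat_insample["TSMixerx"].iloc[::-1].notna().cumsum().eq(0).sum()
print(trailing_nan)  # 30 -> matches input_size, the reported pattern
```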

Issue Severity

High: It blocks me from completing my task.

@carusyte carusyte added the bug label Feb 19, 2025
@marcopeix
Contributor

The issue happens only with multivariate models. Univariate models work fine. I need to investigate further.

@marcopeix marcopeix changed the title [Core] predict_insample gives NaN values on trailing sample data [Core] predict_insample with multivariate models gives NaN values on last input_size Feb 19, 2025
@marcopeix
Contributor

marcopeix commented Feb 19, 2025

Note that this behaviour is also observed in previous versions for multivariate models. Our tests did not include multivariate models, so we missed it.

@carusyte
Author

Note that this behaviour is also observed in previous versions for multivariate models. Our tests did not include multivariate models, so we missed it.

You're right, sorry for my incorrect description. In fact, predict_insample previously raised an error, as recorded in issue #1056, and I nearly forgot that I had applied a stopgap from this #1056 (comment) just to continue my work. I'm not sure whether that is the correct fix, anyway.

@carusyte
Author

You may notice that the windows tensor starts to lose elements here:

if step == "predict":
    predict_step_size = self.predict_step_size
    cutoff = -self.input_size - self.test_size
    temporal = batch["temporal"][:, :, cutoff:]

Whereas the same function in _base_windows.py applies left padding to the temporal tensor:

if step == "predict":
    initial_input = temporal.shape[-1] - self.test_size
    if (
        initial_input <= self.input_size
    ):  # There is not enough data to predict first timestamp
        padder_left = nn.ConstantPad1d(
            padding=(self.input_size - initial_input, 0), value=0.0
        )
        temporal = padder_left(temporal)
    predict_step_size = self.predict_step_size
    cutoff = -self.input_size - self.test_size
    temporal = temporal[:, :, cutoff:]
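The effect of that left padding can be sketched in NumPy, so the logic is runnable without torch. This is illustrative only: the variable names (`temporal`, `input_size`, `test_size`) follow the snippet above, and np.pad with zeros stands in for nn.ConstantPad1d. Without the padding step, the cutoff slice would reach further back than the available history, and the earliest windows would be short, which is where the missing predictions come from.

```python
import numpy as np

# Sketch of the _base_windows.py padding logic using NumPy.
# Left-pad the time axis with zeros when there is not enough history
# to build a full input_size window for the first timestamp.
input_size, test_size = 30, 10
temporal = np.random.randn(1, 3, 35)  # (batch, channels, time): only 25 steps before test

initial_input = temporal.shape[-1] - test_size  # 25 < input_size
if initial_input <= input_size:
    pad = input_size - initial_input  # 5 zero steps on the left
    temporal = np.pad(temporal, ((0, 0), (0, 0), (pad, 0)), constant_values=0.0)

cutoff = -input_size - test_size
temporal = temporal[:, :, cutoff:]
print(temporal.shape)  # (1, 3, 40): enough for input_size + test_size windows
```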

@marcopeix
Contributor

Yes, I saw that too! It seems to fix the issue. I'm testing whether it impacts multivariate models' performance at all before pushing a fix. Thanks for pointing it out!

@marcopeix marcopeix linked a pull request Feb 21, 2025 that will close this issue
@marcopeix marcopeix self-assigned this Feb 21, 2025
@marcopeix
Contributor

This is now fixed by the merge of #1023
