Hi Vitor,
I finished your book and enjoyed it. Thanks for a great book. I do have a question that I hope you can help me with. Forecasting the future is the goal of learning, isn't it? Why does the book only check model predictions against the test dataset, whose values are already known? The book never shows how to forecast the future, which is quite frustrating.
For example, Chapter 4 uses linear regression with PyTorch Lightning. After training the model, which lines should I add to predict the future? Could you add some lines showing how to forecast the future? Thank you so much!
import numpy as np
import pandas as pd
from pytorch_forecasting import TimeSeriesDataSet
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import torch
from torch import nn
N_LAGS = 7
HORIZON = 1
BATCH_SIZE = 10
# mvtseries: multivariate DataFrame, assumed to be loaded already (not shown)
num_vars = mvtseries.shape[1] + 1
print('number of variables is', num_vars)
target_variable = "Incoming Solar"


def create_training_set(
        data: pd.DataFrame,
        n_lags: int,
        horizon: int,
        test_size: float = 0.2,
        batch_size: int = 16):
    data["target"] = data[target_variable]
    data["time_index"] = np.arange(len(data))
    data["group_id"] = 0  # Assuming a single group for simplicity

    # Chronological split: hold out the test part, then a validation part
    time_indices = data["time_index"].values
    train_indices, _ = train_test_split(time_indices, test_size=test_size, shuffle=False)
    train_indices, _ = train_test_split(train_indices, test_size=0.1, shuffle=False)
    train_df = data.loc[data["time_index"].isin(train_indices)]
    train_df_mod = train_df.copy()

    # Fit the target scaler on the training rows only
    target_scaler = StandardScaler()
    target_scaler.fit(train_df_mod[["target"]])
    train_df_mod["target"] = target_scaler.transform(train_df_mod[["target"]])
    train_df_mod = train_df_mod.drop(target_variable, axis=1)

    # Every column except the target; this also includes time_index and group_id
    feature_names = [
        col for col in data.columns
        if col != "target" and col != target_variable
    ]

    # this example only creates the training dataset and training dataloader
    # later, we also need testing and validation datasets and dataloaders
    training_dataset = TimeSeriesDataSet(
        train_df_mod,
        time_idx="time_index",
        target="target",
        group_ids=["group_id"],
        max_encoder_length=n_lags,
        max_prediction_length=horizon,
        time_varying_unknown_reals=feature_names,
        scalers={name: StandardScaler()
                 for name in feature_names},
    )
    loader = training_dataset.to_dataloader(batch_size=batch_size, shuffle=False)
    return loader


class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, X):
        # Flatten (batch, n_lags, num_vars) into (batch, n_lags * num_vars)
        X = X.view(X.size(0), -1)
        return self.linear(X)


data_loader = create_training_set(
    data=mvtseries,
    n_lags=N_LAGS,
    horizon=HORIZON,
    batch_size=BATCH_SIZE,
    test_size=0.3
)
model = LinearRegressionModel(N_LAGS * num_vars, HORIZON)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Training loop
num_epochs = 200
for epoch in range(num_epochs):
    for batch in data_loader:
        x, y = batch
        X = x["encoder_cont"].squeeze(-1)
        y_pred = model(X)
        y_pred = y_pred.squeeze(1)
        # y is a (target, weight) tuple and the weight is None here, so use y[0];
        # with a one-step horizon the target is squeezed to shape (batch,)
        y_actual = y[0].squeeze(1)
        loss = criterion(y_pred, y_actual)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    if epoch % 10 == 0:
        print(f"epoch: {epoch + 1}, loss = {loss.item():.4f}")
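A minimal sketch of the forecasting step asked about above, not the book's method. It assumes that a one-step-ahead forecast of the future means feeding the last `N_LAGS` observed rows into the fitted linear model and undoing the target scaling. To stay self-contained it uses synthetic data and a NumPy least-squares fit as a stand-in for the trained `nn.Linear`; the helper names (`make_windows`, `fit_linear`, `forecast_next`) and the toy series are hypothetical.

```python
import numpy as np

N_LAGS = 7  # same window length as in the code above


def make_windows(features, target, n_lags):
    """Pair each flattened block of n_lags feature rows with the next target value."""
    X = np.stack([features[i:i + n_lags].ravel()
                  for i in range(len(features) - n_lags)])
    y = target[n_lags:]
    return X, y


def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for the trained nn.Linear)."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def forecast_next(features, coef, n_lags, mean, std, t_mean, t_std):
    """Forecast the first time step after the end of `features`."""
    window = (features[-n_lags:] - mean) / std   # scale with the fitted statistics
    x = np.append(window.ravel(), 1.0)           # flatten and add the intercept term
    return float(x @ coef) * t_std + t_mean      # back to the original units


# Synthetic series (assumption): the target depends linearly on the previous row.
rng = np.random.default_rng(0)
T = 200
features = rng.normal(size=(T, 2))
target_full = np.empty(T + 1)
target_full[0] = 3.0
target_full[1:] = 2.0 * features[:, 0] - features[:, 1] + 3.0
target, future_truth = target_full[:T], target_full[T]  # the last value is "the future"

# Scaling statistics, playing the role of the fitted StandardScaler objects.
mean, std = features.mean(axis=0), features.std(axis=0)
t_mean, t_std = target.mean(), target.std()

X, y = make_windows((features - mean) / std, (target - t_mean) / t_std, N_LAGS)
coef = fit_linear(X, y)

forecast = forecast_next(features, coef, N_LAGS, mean, std, t_mean, t_std)
print(f"forecast for the next step: {forecast:.4f}")
```

With the PyTorch model above, the equivalent step would presumably wrap the scaled window in a float tensor of shape `(1, N_LAGS, num_vars)` and call `model(...)` inside `torch.no_grad()`, then invert the target scaling. For horizons longer than one step, one option is to raise `HORIZON` so the linear layer outputs several future values at once; the `squeeze(1)` calls in the training loop would need adjusting in that case.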