Missing Data issues in using TimeSeriesDataSet
#2024
Hi all. Specifically, my code looks like this:

assert data.isnull().sum().sum() == 0
std_per_group = data.groupby("group_id")["target"].std()
assert std_per_group[std_per_group < 0.001].shape[0] == 0
training_dataset = TimeSeriesDataSet(
    data,
    time_idx="time_idx",
    target="target",
    group_ids=["group_id"],
    static_categoricals=static_categoricals,
    time_varying_unknown_reals=time_varying_unknown_reals,
    time_varying_known_categoricals=time_varying_known_categoricals,
    time_varying_known_reals=time_varying_known_reals,
    max_encoder_length=max_encoder_length,
    max_prediction_length=max_prediction_length,
    target_normalizer=GroupNormalizer(transformation="softplus", groups=["group_id"]),
    add_relative_time_idx=True,
    add_target_scales=True,
    add_encoder_length=True,
    allow_missing_timesteps=True,
)

Then I get the error message: "ValueError: 197 (4.92%) of target values were found to be NA or infinite (even after encoding). NA values are not allowed"

Much appreciated if anyone could help me address this problem.
Replies: 1 comment
The NA values appear after encoding because of the GroupNormalizer with the softplus transformation. When softplus encounters certain values (such as zeros, negatives, or very small numbers), the transformation can produce NaN or inf. Possible fixes:
Check the target for zeros and negative values:

print("zeros in target:", (data["target"] == 0).sum())
print("negative values:", (data["target"] < 0).sum())
print("min value:", data["target"].min())

Softplus of very small or zero values can cause issues.
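Use a different transformation, for example none at all or log1p (pass only one `transformation` argument):

target_normalizer=GroupNormalizer(
    groups=["group_id"],
    transformation=None,  # no transformation
    # or: transformation="log1p",  # safer than softplus for zeros
)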
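To see why the choice of transformation matters, here is a minimal standard-library sketch. It assumes the "softplus" option encodes the target with the mathematical inverse of softplus, log(exp(y) - 1), which is my reading of the behaviour and is not confirmed in this thread; the library may also clip or offset values, which the sketch ignores. The function names and sample values are made up for illustration.

```python
import math


def inverse_softplus(y: float) -> float:
    """Mathematical inverse of softplus, log(exp(y) - 1), for a plain float."""
    if y < 0:
        return math.nan   # exp(y) - 1 is negative, so the log is undefined
    if y == 0:
        return -math.inf  # log(0)
    # Stable form of log(exp(y) - 1); avoids overflow for large y.
    return y + math.log(-math.expm1(-y))


def log1p_encode(y: float) -> float:
    """log(1 + y), returning nan / -inf instead of raising outside the domain."""
    if y < -1:
        return math.nan
    if y == -1:
        return -math.inf
    return math.log1p(y)


def count_non_finite(values, encode) -> int:
    """Count encoded values that are NaN or infinite."""
    return sum(not math.isfinite(encode(v)) for v in values)


sample_target = [0.0, 0.5, 2.0, -0.3]  # made-up values, not the poster's data
bad_softplus = count_non_finite(sample_target, inverse_softplus)
bad_log1p = count_non_finite(sample_target, log1p_encode)
print("non-finite after inverse softplus:", bad_softplus)
print("non-finite after log1p:", bad_log1p)
```

In this sample the zero and the negative value become non-finite under the inverse softplus, while log1p keeps all four values finite. log1p has its own limit, though: it breaks for targets at or below -1.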
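Or add a small constant so that zeros become strictly positive (this does not help if the target has negative values):

data["target"] = data["target"] + 1e-6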
Check which groups contain non-positive values:

for group in data["group_id"].unique():
    group_data = data[data["group_id"] == group]
    if group_data["target"].min() <= 0:
        print(f"Group {group} has non-positive values")
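The per-group check can also be done with a single groupby, and extended to infinite values, which `data.isnull()` does not flag. A rough sketch, assuming the column names from the question; the helper `summarize_target_by_group` and the toy frame are hypothetical and not part of pytorch-forecasting:

```python
import numpy as np
import pandas as pd


def summarize_target_by_group(data, group_col="group_id", target_col="target"):
    """Return one row per group that has non-positive or non-finite targets."""
    target = data[target_col]
    flags = pd.DataFrame(
        {
            group_col: data[group_col],
            "n_rows": 1,
            "n_non_positive": (target <= 0).astype(int),
            "n_non_finite": (~np.isfinite(target)).astype(int),  # NaN, inf, -inf
        }
    )
    summary = flags.groupby(group_col).sum()
    summary["min_target"] = data.groupby(group_col)[target_col].min()
    # Keep only the groups that could trip a positive-only transformation.
    problem = (summary["n_non_positive"] > 0) | (summary["n_non_finite"] > 0)
    return summary[problem]


# Toy data for illustration only.
toy = pd.DataFrame(
    {
        "group_id": ["a", "a", "b", "b", "c", "c"],
        "target": [1.0, 2.0, 0.0, 3.0, np.inf, -1.0],
    }
)
report = summarize_target_by_group(toy)
print(report)
```

On the toy frame, group "a" is clean and is left out of the report; "b" is listed because of its zero, and "c" because of its negative and infinite values.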
Or try EncoderNormalizer instead:

from pytorch_forecasting import EncoderNormalizer
target_normalizer=EncoderNormalizer()

The error happens after encoding because GroupNormalizer applies the softplus transformation, which fails on certain values. Check your target distribution for zeros or negatives.