mlforecaster fails with "Unable to obtain: 96 lags_opt values" despite sufficient statistics data
EMHASS Version
v0.16.1
Environment
- Home Assistant OS (Proxmox)
- EMHASS installed as HA addon
- use_websocket: true
- optimization_time_step: 30
Description
When using load_forecast_method: mlforecaster with a trigger-based template sensor (firing every 30 minutes) as sensor_power_load_no_var_loads and var_model, the naive-mpc-optim action consistently fails with:
ERROR in forecast: Unable to obtain: 96 lags_opt values from sensor: power load no var loads,
check optimization_time_step/freq and historic_days_to_retrieve/days_to_retrieve parameters
This occurs despite:
- The sensor having 3.5+ days of long-term statistics (well exceeding the required 96 × 30-min slots = 48 hours)
- The sensor appearing correctly in Developer Tools → Statistics with continuous data
- historic_days_to_retrieve: 2 being set in both config and runtime params
- var_model and sensor_power_load_no_var_loads both pointing to the same correct sensor
- The ML model having been successfully trained against the same sensor (R² ~0.29 with 2 days data)
- Switching to load_forecast_method: naive working perfectly with identical config
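The slot arithmetic behind the "48 hours" figure can be checked with a small sketch. The helper names below (`slots_in_window`, `hours_for_lags`) are hypothetical and are not part of EMHASS; they only reproduce the numbers quoted in this report.

```python
def slots_in_window(days: float, step_minutes: int) -> int:
    """Number of whole time steps contained in `days` of history."""
    return int(days * 24 * 60 // step_minutes)


def hours_for_lags(n_lags: int, step_minutes: int) -> float:
    """Hours of history needed to fill `n_lags` consecutive time steps."""
    return n_lags * step_minutes / 60


required_lags = 96  # value quoted in the error message
step = 30           # optimization_time_step in minutes

print("hours needed:", hours_for_lags(required_lags, step))
for days in (2, 3, 3.5):
    available = slots_in_window(days, step)
    print(f"{days} days -> {available} slots, margin {available - required_lags}")
```

With historic_days_to_retrieve: 2 the window holds exactly 96 slots, so there is no margin for a missing or dropped row. That alone does not explain the failure, since a value of 3 (144 slots) fails in the same way.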
Sensor Setup
The load sensor is a trigger-based template sensor firing every 30 minutes:
template:
  - trigger:
      - trigger: time_pattern
        minutes: "/30"
    sensor:
      - name: "House Load 30min Average"
        unit_of_measurement: "W"
        device_class: power
        state_class: measurement
        state: >
          {{ states('sensor.alphaess_current_house_load') | float(0) }}
This sensor:
- Has state_class: measurement
- Appears in Developer Tools → Statistics
- Has continuous data from creation date
- Returns sensible wattage values
Relevant Config
{
  "load_forecast_method": "mlforecaster",
  "sensor_power_load_no_var_loads": "sensor.house_load_30min_average",
  "sensor_power_photovoltaics": "sensor.pv_30min_average",
  "var_model": "sensor.house_load_30min_average",
  "historic_days_to_retrieve": 2,
  "num_lags": 48,
  "optimization_time_step": 30,
  "sklearn_model": "KNeighborsRegressor",
  "use_websocket": true,
  "sensor_replace_zero": [
    "sensor_power_photovoltaics",
    "sensor_power_load_no_var_loads"
  ],
  "sensor_linear_interp": [
    "sensor.pv_30min_average",
    "sensor.house_load_30min_average"
  ]
}
Runtime Parameters Passed to naive-mpc-optim
{
  "pv_power_forecast": [...],
  "load_cost_forecast": [...],
  "prod_price_forecast": [...],
  "extra_var_model": [...],
  "soc_init": 0.36,
  "prediction_horizon": 96,
  "historic_days_to_retrieve": 2,
  "days_to_retrieve": 2
}
ML Training Call (succeeds)
{
  "var_load": "sensor.house_load_30min_average",
  "historic_days_to_retrieve": 2,
  "num_lags": 48,
  "sklearn_model": "KNeighborsRegressor",
  "extra_var_model": ["sensor.current_apparent_temperature"]
}
Training completes successfully in under 1 second via the statistics API (no REST fallback).
Logs
Successful ML training (for reference)
INFO in retrieve_hass: Statistics data retrieval took 0.76 seconds
INFO in machine_learning_forecaster: Training a KNeighborsRegressor model
INFO in machine_learning_forecaster: Elapsed time for model fit: 0.01
INFO in machine_learning_forecaster: Prediction R2 score of fitted model on test data: 0.289
Failed naive-mpc-optim with mlforecaster
INFO in retrieve_hass: Statistics data retrieval took 0.08 seconds
WARNING in retrieve_hass: Unable to find all the sensors in sensor_replace_zero parameter
WARNING in retrieve_hass: Confirm sure all sensors in sensor_replace_zero are sensor_power_photovoltaics and/or sensor_power_load_no_var_loads
INFO in forecast: Retrieving data from hass for load forecast using method = mlforecaster
INFO in retrieve_hass: Statistics data retrieval took 0.07 seconds
WARNING in retrieve_hass: Unable to find all the sensors in sensor_replace_zero parameter
WARNING in retrieve_hass: Confirm sure all sensors in sensor_replace_zero are sensor_power_photovoltaics and/or sensor_power_load_no_var_loads
ERROR in forecast: Unable to obtain: 96 lags_opt values from sensor: power load no var loads,
check optimization_time_step/freq and historic_days_to_retrieve/days_to_retrieve parameters
Successful naive-mpc-optim with naive forecaster (identical config, same run)
INFO in retrieve_hass: Statistics data retrieval took 0.08 seconds
INFO in forecast: Retrieving data from hass for load forecast using method = naive
INFO in retrieve_hass: Statistics data retrieval took 0.07 seconds
INFO in web_server: >> Performing naive-mpc-optim...
INFO in optimization: Total value of the Cost function = 2.27
INFO in retrieve_hass: Successfully posted to sensor.p_batt_forecast = -5000.0
INFO in retrieve_hass: Successfully posted to sensor.soc_batt_forecast = 44.32
INFO in retrieve_hass: Successfully posted to sensor.optim_status = Optimal
Persistent sensor_replace_zero Warning
Throughout all testing, the sensor_replace_zero warning fires on every run despite the config containing the correct key names (sensor_power_photovoltaics and sensor_power_load_no_var_loads). This warning persists even when set_zero_min: false is passed at runtime. It is unclear whether this warning is related to the lags failure or is a separate issue.
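One possible reading of this warning, not confirmed against the EMHASS source, is that the entries in sensor_replace_zero are compared with the column names of the retrieved data. In the config above, sensor_replace_zero lists config key names while sensor_linear_interp lists entity IDs. The sketch below uses a hypothetical helper `missing_from_columns` and made-up readings to show how such a membership test would behave for each form.

```python
import pandas as pd


def missing_from_columns(df: pd.DataFrame, names: list[str]) -> list[str]:
    """Return the entries of `names` that are not column labels of `df`."""
    return [name for name in names if name not in df.columns]


# Assumption: retrieved data is keyed by entity ID. Values are illustrative only.
df = pd.DataFrame(
    {
        "sensor.pv_30min_average": [0.0, 120.0],
        "sensor.house_load_30min_average": [450.0, 510.0],
    }
)

key_names = ["sensor_power_photovoltaics", "sensor_power_load_no_var_loads"]
entity_ids = ["sensor.pv_30min_average", "sensor.house_load_30min_average"]

print(missing_from_columns(df, key_names))   # both entries reported missing
print(missing_from_columns(df, entity_ids))  # nothing reported missing
```

If EMHASS does compare against entity IDs, listing the entity IDs in sensor_replace_zero would be a quick test to see whether the warning disappears and whether the lags error changes.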
What Was Ruled Out
- Insufficient data: Sensor has 3.5+ days of statistics, confirmed in Developer Tools → Statistics
- Wrong sensor: Both var_model and sensor_power_load_no_var_loads correctly point to sensor.house_load_30min_average
- High-frequency sensor timeout: Previous attempts using the raw AlphaESS source sensor (sensor.alphaess_current_house_load, updating every ~5 seconds) caused REST API hangs. The 30-min template sensor was created specifically to avoid this.
- JSON syntax errors: Payload parses correctly, confirmed by runtime params appearing in logs
- Model/sensor name mismatch: Model was retrained after each sensor change
- historic_days_to_retrieve too high: Tested with values of 2 and 3, both fail identically
Hypothesis
The mlforecaster lags check may use a different code path than the training call to retrieve the last-window data. It may look up the sensor via the sensor_power_load_no_var_loads config key rather than var_model, and then either retrieve a different sensor or fail to obtain enough resampled rows even though the statistics are present. The naive forecaster succeeds with identical config, which suggests the issue is isolated to the mlforecaster lags retrieval logic.
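To make the hypothesis concrete, here is a minimal sketch of what a last-window check could look like. It is not the EMHASS implementation: the function `last_window`, its parameters, and the column name `load` are assumptions for illustration. It shows two ways such a check can fail even when statistics exist: the expected column is absent from the retrieved frame, or rows are lost after resampling and dropping NaN values.

```python
import numpy as np
import pandas as pd


def last_window(df: pd.DataFrame, column: str, n_lags: int, freq: str = "30min") -> pd.Series:
    """Hypothetical last-window extraction: resample, drop empty slots, keep the final n_lags rows."""
    if column not in df.columns:
        raise KeyError(f"{column!r} not in retrieved data: {list(df.columns)}")
    series = df[column].resample(freq).mean().dropna()
    if len(series) < n_lags:
        raise ValueError(f"Unable to obtain {n_lags} lags, only {len(series)} rows available")
    return series.iloc[-n_lags:]


# Synthetic 2-day history at 30-minute resolution (exactly 96 rows).
idx = pd.date_range("2025-01-01", periods=96, freq="30min", tz="UTC")
full = pd.DataFrame({"load": np.linspace(300.0, 900.0, 96)}, index=idx)

# Same history with a single empty slot.
holed = full.copy()
holed.iloc[10, 0] = np.nan

print(len(last_window(full, "load", 96)))
try:
    last_window(holed, "load", 96)
except ValueError as err:
    print(err)
```

Logging the shape and column names of the frame that reaches the real check would show which of these cases, if either, applies.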
Workaround
Running with load_forecast_method: naive works correctly and produces valid optimisation results.