[WIP] Initial implementation of choice-only models #903
digicosmos86 wants to merge 27 commits into main
…tor functions for choice-only models in ssm-simulator
…ithub.com/lnccbrown/HSSM into 886-implement-the-first-choice-only-model
…ound certain operations
Pull request overview
This PR introduces initial support for choice-only models in HSSM by adding a softmax inverse-temperature likelihood and corresponding model configs, plus updates to HSSM/distribution utilities and new tests.
Changes:
- Add a new analytical likelihood (softmax_inv_temperature) and default configs for 2- and 3-logit variants.
- Extend HSSM and distribution utilities to handle choice-only response shapes and model-building paths.
- Add unit/slow tests covering new configs and basic sampling for choice-only models.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/hssm/likelihoods/analytical.py | Adds softmax_inv_temperature likelihood implementation. |
| src/hssm/modelconfig/_softmax_inv_temperature_config.py | Provides shared default config factory for softmax inverse-temperature models. |
| src/hssm/modelconfig/softmax_inv_temperature_2_config.py | Adds default config entry point for the 2-logit model. |
| src/hssm/modelconfig/softmax_inv_temperature_3_config.py | Adds default config entry point for the 3-logit model. |
| src/hssm/_types.py | Registers new supported model names. |
| src/hssm/config.py | Adds Config.is_choice_only property derived from response fields. |
| src/hssm/hssm.py | Adds choice-only handling for response formatting, missing-data logic, and RV fallback behavior. |
| src/hssm/distribution_utils/dist.py | Adds is_choice_only plumbing for RV signature and distribution logp behavior. |
| src/hssm/data_validator.py | Changes DataValidatorMixin to rely on subclass state rather than its own constructor. |
| tests/test_modelconfig.py | Adds tests for the new softmax inverse-temperature configs. |
| tests/test_config.py | Adds assertions for the new is_choice_only behavior. |
| tests/test_hssm.py | Adds a test for choice-only + deadline behavior in HSSM. |
| tests/slow/test_choice_only.py | Adds slow tests for choice-only sampling paths and likelihood shape checks. |
| .gitignore | Adds ignores for assistant/tooling-related files. |
| logits_scaled, axis=0, keepdims=False | ||
| ) | ||
|
|
||
| return pt.exp(log_prob_choices) |
softmax_inv_temperature is documented/used as a log-likelihood, but it currently returns exp(log_prob_choices) (a probability). make_distribution expects the log-likelihood (logp), and will apply exp() again in the lapse mixture, which will yield incorrect values / potential overflow. Return log_prob_choices (and consider renaming variables/docs accordingly).
| return pt.exp(log_prob_choices) | |
| return log_prob_choices |
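To make the double exponentiation concrete, here is a minimal pure-Python sketch (not the PyTensor code from this PR). The names `log_softmax` and `lapse_mixture_logp`, the uniform lapse probability of 0.5, and the 5% lapse rate are assumptions for illustration. The only thing taken from the comment is that the downstream mixture exponentiates whatever the likelihood returns.

```python
import math


def log_softmax(logits: list[float], beta: float) -> list[float]:
    """Log-probabilities of a softmax over beta * logits (numerically stable)."""
    scaled = [beta * x for x in logits]
    m = max(scaled)
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_norm for s in scaled]


def lapse_mixture_logp(logp: float, p_lapse: float, lapse_prob: float) -> float:
    """Hypothetical lapse mixture that expects a log-likelihood as input."""
    return math.log((1.0 - p_lapse) * math.exp(logp) + p_lapse * lapse_prob)


logp = log_softmax([0.2, -0.1], beta=2.0)[0]  # log-likelihood of choosing option 0
prob = math.exp(logp)  # what the flagged line returns instead

correct = lapse_mixture_logp(logp, p_lapse=0.05, lapse_prob=0.5)
wrong = lapse_mixture_logp(prob, p_lapse=0.05, lapse_prob=0.5)  # exp() applied twice
```

With these inputs, `math.exp(correct)` stays below 1, while `math.exp(wrong)` exceeds 1, which is not a valid probability.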
| data | ||
| 1D array of responses (choices). | ||
| beta | ||
| A scaler for the softmax temperature (0, inf). |
Docstring typo: “A scaler for the softmax temperature” should be “A scalar …”.
| A scaler for the softmax temperature (0, inf). | |
| A scalar for the softmax temperature (0, inf). |
| missing_data: bool | ||
| missing_data_value: float | ||
| is_choice_only: bool | ||
|
|
Removing DataValidatorMixin.__init__ is a breaking change for existing direct instantiations (the repository’s tests/test_data_validator.py constructs DataValidatorMixin(...)). Either restore a minimal/backward-compatible __init__ (even if only for tests) or update the tests and any downstream usage to instantiate a concrete class that sets the required attributes.
| def __init__( | |
| self, | |
| data: pd.DataFrame | None = None, | |
| response: list[str] | None = None, | |
| choices: list[int] | None = None, | |
| n_choices: int | None = None, | |
| extra_fields: list[str] | None = None, | |
| deadline: bool = False, | |
| deadline_name: str | None = None, | |
| missing_data: bool = False, | |
| missing_data_value: float = -999.0, | |
| is_choice_only: bool = False, | |
| ) -> None: | |
| """Initialize DataValidatorMixin with optional arguments. | |
| This minimal initializer is provided for backward compatibility with | |
| direct instantiation (e.g., in tests). Subclasses are free to override | |
| this method and set these attributes themselves. | |
| """ | |
| self.data = data if data is not None else pd.DataFrame() | |
| self.response = response if response is not None else [] | |
| self.choices = choices if choices is not None else [] | |
| # If n_choices is not provided, infer from choices if available. | |
| if n_choices is not None: | |
| self.n_choices = n_choices | |
| else: | |
| self.n_choices = len(self.choices) if self.choices is not None else 0 | |
| self.extra_fields = extra_fields | |
| self.deadline = deadline | |
| self.deadline_name = deadline_name if deadline_name is not None else "deadline" | |
| self.missing_data = missing_data | |
| self.missing_data_value = missing_data_value | |
| self.is_choice_only = is_choice_only |
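The other option the comment mentions, instantiating a concrete class that sets the required attributes, might look like the sketch below. `ValidatorMixinSketch`, `ConcreteModel`, and `check_n_choices` are hypothetical stand-ins, not the actual HSSM classes; only the idea of a stateless mixin that reads attributes owned by its host class comes from the PR.

```python
class ValidatorMixinSketch:
    """Stateless mixin: declares the attributes it reads but never sets them."""

    response: list[str]
    choices: list[int]
    n_choices: int
    is_choice_only: bool

    def check_n_choices(self) -> None:
        # Hypothetical validation that relies on state owned by the host class.
        if self.n_choices != len(self.choices):
            raise ValueError("n_choices does not match the number of choices")


class ConcreteModel(ValidatorMixinSketch):
    """Host class that owns the state, as the HSSM class does in this PR."""

    def __init__(
        self,
        response: list[str],
        choices: list[int],
        is_choice_only: bool = False,
    ) -> None:
        self.response = response
        self.choices = choices
        self.n_choices = len(choices)
        self.is_choice_only = is_choice_only


model = ConcreteModel(response=["response"], choices=[-1, 1], is_choice_only=True)
model.check_n_choices()  # passes: the concrete class set the required state
```

Calling `ValidatorMixinSketch().check_n_choices()` directly raises `AttributeError`, because the bare annotations do not create attributes. Tests that construct the mixin directly would fail in a similar way once its constructor is removed.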
| assert lk_analytical["default_priors"]["beta"] == { | ||
| "name": "HalfNormal", | ||
| "mu": 0.0, | ||
| "sigma": 1.0, | ||
| } |
The test asserts default_priors["beta"] contains a mu key for a HalfNormal prior, but the config (and other model configs) define HalfNormal with only sigma. This makes the test fail and may also be an invalid parameterization depending on the prior backend. Adjust the assertion to match the config’s prior spec (or change the config consistently across models).
| New in 0.2.12: when model is choice-only and has deadline, the response | ||
| is not in the form of c(...). |
The docstring says that for choice-only models with a deadline the response is “not in the form of c(...)”, but the implementation returns c(response, deadline) in that case. Please update the docstring to match the actual behavior (or adjust the behavior if the doc is correct).
| New in 0.2.12: when model is choice-only and has deadline, the response | |
| is not in the form of c(...). | |
| New in 0.2.12: when the model is choice-only and has no deadline, the | |
| response is returned as a single variable name (e.g., ``"choice"``) | |
| instead of in the form ``c(...)``. In all other cases, the response is | |
| returned in ``c(...)`` format. |
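The behavior described in the suggested docstring can be sketched as follows. `format_response` is a hypothetical helper written for this comment, not the method in src/hssm/hssm.py. Its signature, the default `deadline_name`, and the assumption that a choice-only model has exactly one response column are all illustrative.

```python
def format_response(
    response: list[str],
    is_choice_only: bool,
    deadline: bool = False,
    deadline_name: str = "deadline",
) -> str:
    """Build the response side of a model formula (illustrative only)."""
    columns = list(response)
    if deadline:
        columns.append(deadline_name)
    if is_choice_only and not deadline:
        # Single response column: return the bare variable name.
        return columns[0]
    return f"c({', '.join(columns)})"
```

For example, `format_response(["choice"], is_choice_only=True)` returns `"choice"`, while adding `deadline=True` returns `"c(choice, deadline)"`.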
| lapse : optional | ||
| A bmb.Prior object representing the lapse distribution. | ||
| is_choice_only : bool | ||
| Whether the model is a choice-only model. This parameter overrides |
Docstring for is_choice_only is incomplete (ends with “This parameter overrides” without saying what it overrides). Please complete or remove the fragment so the generated docs/readers aren’t left with a partial sentence.
| Whether the model is a choice-only model. This parameter overrides | |
| Whether the model is a choice-only model. |
| data = pd.DataFrame( | ||
| { | ||
| "response": np.random.choice([-1, 1], size=n_samples), |
generate_synthetic_data uses np.random.choice without a fixed seed, making the test output non-deterministic across runs/environments. Consider using a local RNG with a fixed seed (e.g., np.random.default_rng(…)) for reproducibility.
| data = pd.DataFrame( | |
| { | |
| "response": np.random.choice([-1, 1], size=n_samples), | |
| rng = np.random.default_rng(123) | |
| data = pd.DataFrame( | |
| { | |
| "response": rng.choice([-1, 1], size=n_samples), |
| extra_fields: list[np.ndarray] | None = None, | ||
| fixed_vector_params: dict[str, np.ndarray] | None = None, | ||
| params_is_trialwise: list[bool] | None = None, | ||
| is_choice_only: bool = False, | ||
| ) -> type[pm.Distribution]: | ||
| """Make a `pymc.Distribution`. | ||
|
|
make_distribution now accepts is_choice_only, but the flag isn’t propagated when rv is a callable or a string (the internal make_hssm_rv(...) calls still use the default is_choice_only=False). This will build an RV with the wrong output signature for choice-only models unless the RV is provided as a class. Pass is_choice_only=is_choice_only into those make_hssm_rv calls.
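The propagation issue can be shown with a small sketch. `make_rv_sketch` and `make_distribution_sketch` are hypothetical stand-ins for make_hssm_rv and make_distribution. Their signatures and the one-versus-two output widths are assumptions for illustration, not the real HSSM API.

```python
from typing import Callable, Union


def make_rv_sketch(
    simulator: Union[str, Callable], is_choice_only: bool = False
) -> dict:
    """Hypothetical RV factory; records the output width it would use."""
    # Assumption: choice-only RVs emit one value per trial (the choice),
    # other models emit two (reaction time and choice).
    return {"simulator": simulator, "n_outputs": 1 if is_choice_only else 2}


def make_distribution_sketch(rv, is_choice_only: bool = False) -> dict:
    """Hypothetical distribution factory that forwards the flag on every path."""
    if isinstance(rv, str) or callable(rv):
        # Forward the flag explicitly; relying on the default would silently
        # build a two-output RV for a choice-only model.
        return make_rv_sketch(rv, is_choice_only=is_choice_only)
    return rv  # an already-built RV is used as provided


dist = make_distribution_sketch("softmax_inv_temperature_2", is_choice_only=True)
```

Omitting `is_choice_only=is_choice_only` in the inner call leaves `n_outputs` at 2 for both string and callable inputs, which mirrors the mismatch the comment describes.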
… sampling from choice-only models
Superseded by #919
This PR introduces choice-only models. It makes the following changes:
- Adds a softmax_inv_temperature likelihood function for arbitrary numbers of logits in the parameter.
- Adds softmax_inv_temperature_2 and softmax_inv_temperature_3 model configs, and adds these to SupportedModels.
- DataValidatorMixin no longer needs to keep internal state. The state is kept in the HSSM class itself.