Open
Description
cc @imreddy13 @kannon92 @andreyvelich
Introduction
Currently, the JobSet spec (specifically the Pod / Job template) is immutable. If a user needs to change a configuration in a running JobSet, they must delete and recreate the object. This interrupts running workloads and causes the loss of progress.
I would love to gather feedback from the community on making JobSets mutable:
- Have you heard similar requests from users where mutation of JobSets is required?
- What are your thoughts on allowing JobSets to become mutable in general?
- What do you think about an "opportunistic" update strategy (applying changes only during natural restarts)?
Real use case
Here is a real use case from a large user:
- Setup: The user has multiple JobSets running simultaneously and many more scheduled for when capacity becomes available
- Trigger: The user develops a new Pod template (e.g., to fix a bug in the worker container or apply an optimization)
- Current limitation: The user can only apply the new template to newly submitted JobSets. To fix existing ones, they have to delete and recreate them, which kills healthy running JobSets and loses their progress
- Goal: The user wants to update the template for the currently running JobSets as well
- Constraint: The user does not want to interrupt the running Jobs
- Desired behavior: Instead, the user wants to use natural JobSet restarts as an opportunity to change the Pod template. If a Job fails and the JobSet controller recreates it, it should come back up with the new Pod template
Potential solution
One idea is to introduce an updatePolicy for the Pod / Job template, similar to the existing failurePolicy, startupPolicy, and successPolicy. It could include updateStrategies such as:
- Never: (Current behavior) The validating webhook blocks updates to the template
- Opportunistic: The webhook allows updates to the spec. The JobSet controller applies the updated Job template only when recreating the child Jobs during a restart. It does not force the deletion of running Jobs
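To make the two strategies concrete, here is a minimal Python model of the intended semantics. It is only a sketch, not JobSet code: the names `UpdateStrategy`, `JobSetModel`, `update_template`, and `recreate_job` are hypothetical, and the template is reduced to a plain string. It only illustrates that under Never the update is rejected, while under Opportunistic the update is accepted but a Job only picks it up when it is recreated.

```python
from dataclasses import dataclass
from enum import Enum


class UpdateStrategy(Enum):
    # Hypothetical values mirroring the strategies proposed in this issue.
    NEVER = "Never"
    OPPORTUNISTIC = "Opportunistic"


@dataclass
class JobSetModel:
    strategy: UpdateStrategy
    template: str        # stand-in for the Job / Pod template in the spec
    running_jobs: dict   # Job name -> template the Job was created with


def update_template(js: JobSetModel, new_template: str) -> None:
    """Webhook stand-in: reject template changes unless the strategy allows them."""
    if new_template == js.template:
        return  # no-op updates are always fine
    if js.strategy is UpdateStrategy.NEVER:
        raise ValueError("template is immutable when the update strategy is Never")
    # Opportunistic: store the new template, leave running Jobs untouched.
    js.template = new_template


def recreate_job(js: JobSetModel, name: str) -> None:
    """Controller stand-in: a recreated Job is built from the current template."""
    js.running_jobs[name] = js.template


js = JobSetModel(
    UpdateStrategy.OPPORTUNISTIC,
    "template-v1",
    {"worker-0": "template-v1", "worker-1": "template-v1"},
)
update_template(js, "template-v2")  # accepted, nothing is restarted
recreate_job(js, "worker-0")        # e.g. worker-0 failed and is recreated
```

In this model, after the natural restart `worker-0` runs the new template while `worker-1` keeps running the old one until it is recreated, so child Jobs can temporarily run different templates. Whether that mixed state is acceptable is part of what this issue asks feedback on.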