Decouple backup ready condition for snapshot leases and remove support for deprecated fields in etcd status#820
Conversation
715260b to
d0f3163
Compare
|
/retest |
1 similar comment
|
/retest |
|
/assign |
seshachalam-yv
left a comment
There was a problem hiding this comment.
Please remove Cluster Size and Backup Ready.
etcd-druid/api/v1alpha1/etcd.go
Line 53 in d0f3163
etcd-druid/api/v1alpha1/etcd.go
Line 51 in d0f3163
Add FullSnapshotBackupReady and DeltaSnapshotBackupReady as part of print columns
|
@seshachalam-yv Nice find, thanks. Done |
…cated fields in etcd.status
9036d63 to
b0b9725
Compare
|
/test pull-etcd-druid-e2e-kind |
unmarshall
left a comment
There was a problem hiding this comment.
I think we need to pair program this. Current code needs a LOT of changes.
| // +kubebuilder:printcolumn:name="Ready Replicas",type=integer,JSONPath=`.status.readyReplicas`,priority=1 | ||
| // +kubebuilder:printcolumn:name="Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="BackupReady")].status` | ||
| // +kubebuilder:printcolumn:name="Full Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="FullSnapshotBackupReady")].status` | ||
| // +kubebuilder:printcolumn:name="Delta Backup Ready",type=string,JSONPath=`.status.conditions[?(@.type=="DeltaSnapshotBackupReady")].status` |
There was a problem hiding this comment.
Should we call them Delta Backup Ready or Delta Snapshot Backup Ready? Similarly for full snapshot.
| - `AllMembersReady`: indicates readiness of all members of the etcd cluster. | ||
| - `Ready`: indicates overall readiness of the etcd cluster in serving traffic. | ||
| - `BackupReady`: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | ||
| - `FullSnapshotBackupReady`: indicates health of the full snapshot backups, i.e whether full snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. |
There was a problem hiding this comment.
in which situation will this be set to false? If one full snapshot is missed or more than one missed?
| - `Ready`: indicates overall readiness of the etcd cluster in serving traffic. | ||
| - `BackupReady`: indicates health of the etcd backups, i.e., whether etcd backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | ||
| - `FullSnapshotBackupReady`: indicates health of the full snapshot backups, i.e whether full snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. | ||
| - `DeltaSnapshotBackupReady`: indicates health of the delta snapshot backups, i.e whether delta snapshot backups are being taken regularly as per schedule. This condition is applicable only when backups are enabled for the etcd cluster. |
There was a problem hiding this comment.
What is the criteria for this value to be false?
| ### Verify the Etcd cluster | ||
|
|
||
| To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the cluster size, readiness status of its members, and various other attributes. | ||
| To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes. |
There was a problem hiding this comment.
| To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full and Delta backup status and various other attributes. | |
| To obtain information regarding the newly instantiated etcd cluster, perform the following step, which gives details such as the current & ready replicas, readiness status of its members, Full Snapshot and Delta Snapshot backup status and various other attributes. |
| } | ||
|
|
||
| var FullSnapshotBackupReadyCheckResult, DeltaSnapshotBackupReadyCheckResult Result = nil, nil | ||
| for _, result := range b.results { |
There was a problem hiding this comment.
by using the loop variable result you are shadowing the variable result defined at the start of Check method. Please avoid shadowing.
| ) | ||
| fullSnapErr = a.cl.Get(ctx, types.NamespacedName{Name: getFullSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, fullSnapLease) | ||
| incrSnapErr = a.cl.Get(ctx, types.NamespacedName{Name: getDeltaSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, deltaSnapLease) | ||
| fullSnapErr = f.cl.Get(ctx, types.NamespacedName{Name: getFullSnapLeaseName(&etcd), Namespace: etcd.ObjectMeta.Namespace}, fullSnapLease) |
There was a problem hiding this comment.
if there is an error getting the full snapshot lease then is the error being ignored immediately but handled much later? If there is an error you should exit soon with error instead of continuing with the computations.
| } | ||
| } | ||
|
|
||
| func (d *deltaSnapshotBackupReadyCheck) Check(ctx context.Context, etcd druidv1alpha1.Etcd) Result { |
There was a problem hiding this comment.
there is a LOT of code repetition and this code needs to be rewritten.
| condition.AllMembersReadyCheck, | ||
| condition.ReadyCheck, | ||
| condition.BackupReadyCheck, | ||
| condition.FullSnapshotBackupReadyCheck, |
There was a problem hiding this comment.
you have removed BackupReadyCheck here but you have also changed and kept the implementation of BackupReady in check_backup_ready.go. Any reason for that?
| })() | ||
|
|
||
| results := make([]condition.Result, 0, len(ConditionChecks)) | ||
| results := make([]condition.Result, 0, len(ConditionChecks)+1) |
There was a problem hiding this comment.
why is this conditions checks + 1?
| for r := range resultCh { | ||
| results = append(results, r) | ||
| } | ||
| backupReadyChecker := condition.BackupReadyCheck(c.cl, results) |
There was a problem hiding this comment.
i really do not understand this. You removed it from ConditionChecks but introduce this here? why?
|
The Gardener project currently lacks enough active contributors to adequately respond to all PRs.
You can:
/lifecycle rotten |
|
The Gardener project currently lacks enough active contributors to adequately respond to all PRs.
You can:
/close |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@anveshreddy18: The following tests failed, say
Full PR test history. Your PR dashboard. Command help for this repository. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
The Gardener project currently lacks enough active contributors to adequately respond to all PRs.
You can:
/lifecycle stale |
How to categorize this PR?
/area usability monitoring
/kind enhancement cleanup
What this PR does / why we need it:
This PR splits the existing
BackupReadycondition which talks about both full and delta snapshot health as a single condition into two new conditionsFullSnapshotBackupReadyandDeltaSnapshotBackupReady, one for each type of snapshot i.eFullandDeltarespectively. This helps in better understanding of status of each backup type thus helping debug issues and analyze more effectively. The existingBackupReadycondition will still be present, which aggregates both the individual snapshot conditions into one.This PR also removes the support for few fields in etcd status which are in deprecated state for enough time. The fields that are removed are
ServiceName,ClusterSize,UpdatedReplicasandLabelSelector. These fields were essentially taken fromstatefulsetand not necessarily needed to be present underetcd.statushence they were deprecated and this PR stops the support.This PR partially fixes this issue for streamlining etcd status
Which issue(s) this PR fixes:
Special notes for your reviewer:
Release note: