feat: secure kepler's /metrics endpoint with kube rbac proxy#554
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## v1alpha1 #554 +/- ##
============================================
- Coverage 29.80% 28.35% -1.46%
============================================
Files 23 25 +2
Lines 2124 3164 +1040
============================================
+ Hits 633 897 +264
- Misses 1478 2248 +770
- Partials 13 19 +6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
9bd8aad to
521f0d9
Compare
|
@sthaha Unit tests have been added in deployment_test.go. e2e tests have been added. Only change left is moving kube-rbac-proxy-image to the csv which will be added tomorrow asap. |
e75df84 to
45110c4
Compare
| - image: quay.io/sustainable_computing_io/kepler-reboot:v0.0.10 | ||
| name: kepler-reboot | ||
| replaces: kepler-operator.v0.17.0 | ||
| replaces: kepler-operator.v0.0.3-refactorsecurity |
| func newKubeRBACProxyContainer(pmi *v1alpha1.PowerMonitorInternal) corev1.Container { | ||
| return corev1.Container{ | ||
| Name: KubeRBACProxyContainerName, | ||
| Image: "quay.io/brancz/kube-rbac-proxy:v0.19.0", |
There was a problem hiding this comment.
fixed. Image can be added via csv just like the kepler-reboot image
608d711 to
a402e36
Compare
| Image string `json:"image"` | ||
|
|
||
| // +kubebuilder:validation:MinLength=3 | ||
| // +kubebuilder:default="quay.io/brancz/kube-rbac-proxy:v0.19.0" |
There was a problem hiding this comment.
We can remove the default for here?
| } | ||
|
|
||
| pmis := &v1alpha1.PowerMonitorInternalList{} | ||
| err := r.List(ctx, pmis) |
There was a problem hiding this comment.
err:= r.Client.List(ctx,pmis)?
There was a problem hiding this comment.
they both work the same but the precommit checks fail if using r.Client instead of r.List. I believe it's because r.List is more idiomatic than go given the fact client.Client is embedded field. So I am forced by precommit to use r.List.
There was a problem hiding this comment.
It's not ideal ... and consider these comments as nit
| } | ||
|
|
||
| pmis := &v1alpha1.PowerMonitorInternalList{} | ||
| err := r.List(ctx, pmis) |
There was a problem hiding this comment.
err := r.Client.List(ctx,pmis)
There was a problem hiding this comment.
I agree with r.Client.List() but like @KaiyiLiu1234 said, lets fix the pre-commit ( in a totally separate pr) :)
There was a problem hiding this comment.
It's golangci-lint which is expects r.List instead of r.Client.List
a402e36 to
5dbf2f9
Compare
5dbf2f9 to
cd35db5
Compare
| c = c.Watches(&corev1.Secret{}, | ||
| handler.EnqueueRequestsFromMapFunc(r.mapSecretToPowerMonitorRequests), | ||
| builder.WithPredicates( | ||
| predicate.GenerationChangedPredicate{}, |
There was a problem hiding this comment.
should we have GenerationChanged ?
There was a problem hiding this comment.
As in is the preference to switch to ResourceVersionChanged (which should cover all changes)?
There was a problem hiding this comment.
Wait didn't we want to use Generationchanged so we did not have to use Resourceversionchanged which may result in many reconciliations for being to sensitive?
There was a problem hiding this comment.
No, lets keep it this way .. I didn't realize secrets (certs) may be updated externally. Could you please add a comment why GenerationChanged is needed despite of having AnnotationChanged ?
There was a problem hiding this comment.
Added the comment
| } | ||
| config.Authorization.StaticRules = append(config.Authorization.StaticRules, newRule) | ||
| } | ||
| yamlData, _ := yaml.Marshal(&config) |
There was a problem hiding this comment.
lets not ignore (any) error .. can we propagate it or log it some how if it isn't easy, lets file an issue and the address this in a generic way.
There was a problem hiding this comment.
LGTM! Good job @KaiyiLiu1234 !!
Note: minor comments to be be fixed in follow up PRs ..
@KaiyiLiu1234 could you please create tracking issue for all unresolved comments?
| Args: []string{ | ||
| fmt.Sprintf("--secure-listen-address=0.0.0.0:%d", SecurePort), | ||
| fmt.Sprintf("--upstream=http://127.0.0.1:%d", PowerMonitorDSPort), | ||
| fmt.Sprintf("--auth-token-audiences=%s.%s.svc", pmi.Name, pmi.Namespace()), |
There was a problem hiding this comment.
Do we need auth-token-audiences? With this kuberbac expects user token's with audience power-monitor.power-monitor.svc to be present in order to access
There was a problem hiding this comment.
Yes this is important to have. The purpose of the audience is to deliberately limit the token to the intended service (in this case our power-monitor service). We do not want the token to be used for any other purposes.
There was a problem hiding this comment.
Doesn't it make sense to have this if we had more than one service? And we want SA to be for a specific service
There was a problem hiding this comment.
Users can provide many service accounts they want kepler to allow access to. Each service account will need to pass a token (for authorization). Users can have many service accounts for many different target services (not just power-monitor service). This can help with accidental use of a token or with stolen tokens on the user side.
Power Monitor CR has been updated with a Security Mode to protect kepler's /metrics endpoint. When Mode is switched to 'rbac', the power monitor daemonset will be updated with kube rbac proxy sidecar which will enforce which service accounts can access /metrics based on static configuration and clusterrole/role rules. TLS is also enabled using kube rbac proxy and certs produced by openshift. Power Monitor API has been updated to include 'allowedSANames' which will allow users to request certain service accounts to have access. Signed-off-by: Kaiyi Liu <kaliu@redhat.com>
cd35db5 to
44a6e70
Compare
718e00f
into
sustainable-computing-io:v1alpha1
Power Monitor CR has been updated with a Security Mode to protect kepler's /metrics endpoint. When Mode is switched to 'rbac', the power monitor daemonset will be updated with kube rbac proxy sidecar which will enforce which service accounts can access /metrics based on static configuration and clusterrole/role rules. TLS is also enabled using kube rbac proxy and certs produced by openshift. Power Monitor API has been updated to include 'allowedSANames' which will allow users to request certain service accounts to have access.