Skip to content

Conversation

@VinozzZ
Copy link
Contributor

@VinozzZ VinozzZ commented Jul 12, 2022

This PR implements the ability to allow users to configure plugins they need in agent-config resource.

Current workflow:

It creates a new agent config controller to manage the lifecycle of the plugin installation. The controller dispatch jobs to the porter agent to install plugins onto a persistent volume through a pvc (temporary persistent volume claim) managed by the agent config controller. Once the installation is done, the controller sets labels, including all installed plugins and the agent config name, on the persistent volume. Then the controller will create a new pvc. The new pvc will use the hash of all the plugins information as its name and the plain text version of all the plugin information as its label.
When a new agent action is created, it will query using the plugin hash derived from its agent config and mount a plugin pvc using that name.
If no existing plugin pvc exits, the agent action will wait for the pvc to be created before kubernetes schedules it.
When a plugin volumes is no longer needed, the current approach is to allow k8s GC to do the work for us. All the operator is doing when a agent config is deleted is to make sure to remove itself from the pvc's and pv's OwnerReferences value. Once no other resource is referencing a volume, GC will know to delete it permanently.
To make sure we are not deleting any volumes that still could being used, when a volume is being reused by multiple agent config, the controller will make sure to add all agent config into the owner reference before marking the agent config status has complete.

Couple issues that may need to be worked out before this feature is complete:

  • - Since currently, the operator doesn't have the capability to clean up pods after each job has been finished, the above cleanup workflow for plugin volumes only removes the deleted agent config from the plugin volume's owner reference list. we need to validate that when all the pods are deleted, the plugin volumes will be deleted too. Delete temporary resources when agent completes #54

    Resolution: I would like to work on this as a separate PR and just have the current deletion behavior as only removing OwnerReference.

  • - Another issue that the current code has not solved is how to install multiple porter plugins with one agent job. From my discovery, porter currently does not seem to allow installation for multiple plugins with one command. Ability to install multiple plugins defined in one agent config #123

    Resolution: This will be a separate PR in porter to implement this new feature. For now, only kubernetes plugins will be installed. Users will only able to define kubernetes plugin and configure its version and feedURL

  • - We probably should set default plugin configuration in both the installer bundle and the controller code. If someone were to delete the default agent config defined by the installer bundle, the controller can still set a default kubernetes pluin in the code

  • - We need to add integration tests for delete behavior

  • - This feature is also missing document. For example, where all the default values for plugin configs are set Add document for plugin install feature #122

@carolynvs carolynvs added the playground 🙈 Pull request to try stuff out and see if it builds. Don't look! label Aug 5, 2022
@VinozzZ VinozzZ force-pushed the install-plugin-agent branch from 9f74f91 to c9459f5 Compare September 22, 2022 15:13
@VinozzZ VinozzZ force-pushed the install-plugin-agent branch 3 times, most recently from 28795f0 to 2a6432f Compare September 30, 2022 18:42
@codecov
Copy link

codecov bot commented Sep 30, 2022

Codecov Report

Merging #108 (f733fe2) into main (4c2faac) will decrease coverage by 2.71%.
The diff coverage is 68.45%.

@@            Coverage Diff             @@
##             main     #108      +/-   ##
==========================================
- Coverage   78.30%   75.58%   -2.72%     
==========================================
  Files          12       13       +1     
  Lines        1014     1536     +522     
==========================================
+ Hits          794     1161     +367     
- Misses        138      242     +104     
- Partials       82      133      +51     
Flag Coverage Δ
unit-tests 75.58% <68.45%> (-2.72%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
api/v1/agentaction_types.go 44.44% <0.00%> (-22.23%) ⬇️
controllers/porter_resource.go 77.61% <ø> (ø)
api/v1/agentconfig_types.go 58.97% <54.28%> (-14.36%) ⬇️
controllers/agentconfig_controller.go 71.04% <71.04%> (ø)
controllers/agentaction_controller.go 83.29% <84.61%> (+2.38%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@VinozzZ VinozzZ force-pushed the install-plugin-agent branch 2 times, most recently from 99c682c to 0b6a224 Compare September 30, 2022 20:51
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
Signed-off-by: Yingrong Zhao <[email protected]>
@VinozzZ VinozzZ force-pushed the install-plugin-agent branch 3 times, most recently from 493c18f to e2c7bab Compare December 5, 2022 21:30
@VinozzZ VinozzZ force-pushed the install-plugin-agent branch from e2c7bab to db360f2 Compare December 6, 2022 16:57
@VinozzZ VinozzZ marked this pull request as ready for review December 6, 2022 17:44
Copy link
Member

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't have time to do a full review but here is some initial feedback. I can take another review after the logic in reconcile is split up into functions.

return getRetryLabelValue(a.Annotations)
}

func (a *AgentAction) IsAgentConfig() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you document this function and maybe tweak the name to better fit what it does? I don't think it is checking if an action is an agent config. Is it checking if we are running an action on behalf of an agent config?

metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const AnnotationAgentCfgPluginsHash = "agent-config-plugins-hash"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please add godocs for any exported fields and functions? I've made an issue for us to figure out how to enforce that in CI (#131)

var input []byte

for _, p := range c.Plugins {
if p.Name == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a validation error that we should handle. Is that something we look for elsewhere?


type PluginList []Plugin
type Plugin struct {
Name string `json:"name,omitempty" mapstructure:"name,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can annotate this field and any other required fields with a comment so that Kubernetes requires that it is filled out. // +kubebuilder:validation:Required

https://book.kubebuilder.io/reference/markers/crd-validation.html

But I think we are configured to require all fields by default, unless the field is nullable.

I'm mostly bringing it up because of that check you have above that looks if the plugin name is empty which made me think you were running into problems with validation not being strong enough. If you need to require a field, let's use the CRD validation so that we don't need checks like if plugin.Name == "" throughout the implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I did not know this. Thank you for mentioning it.
I think I did the checks in the code is mostly due to issues I ran into with tests. Do you know where in the codebase I can find the logic that all fields are required by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't. I just read it in the kubebuilder documentation that I linked in my last comment above.

return q
}

func (c *AgentConfigSpec) GetPVCName(namespace string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since in the future we may end up mounting more volumes into an agent, how about we make this a bit more specific and whenever we just generally reference a pvc / volume, rename it to be specific to plugins?

ac.Status.PorterResourceStatus = value
}

// GetPVCName returns an string that's the hash using plugins spec and the AgentConfig's namepsace and name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// GetPVCName returns an string that's the hash using plugins spec and the AgentConfig's namepsace and name.
// GetPVCName returns a string that's the hash using plugins spec and the AgentConfig's namespace and name.

{Name: "kubernetes", Version: "v1.0.0", FeedURL: "https://test"},
},
}
assert.Equal(t, "922e7fa0a39ba2abcc6456da47290a00", c.GetPluginsPVCName("default"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feedback we got when Porter generates resources, like when we push the invocation image to the registry first to get the repository digest, that having an easy way to tell it's porter's is helpful.

Would it be possible to prefix the pvc name with porter-DIGEST, so that people can quickly see that they belong to porter? I realize that we have the labels too, this just makes it easier to visually tell while running simple commands with kubectl.


for _, volume := range action.Spec.Volumes {
volumes = append(volumes, volume)
if !action.IsAgentConfig() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here? It's not immediately clear what's going on and an explanation would help a lot. I can't really tell from the code when we mount this volume.

if err != nil {
return ctrl.Result{}, err
}
if readyPVC != nil && tempPVC == nil && !isDeleted(agentCfg) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend moving all of this into multiple functions. If you look over the other reconcile functions in the other controller, I've tried real hard to keep them short and readable because otherwise this function is overwhelming and difficult to test/maintain.

// TODO: once porter has ability to install multiple plugins with one command, we will allow users
// to install multiple plugins. Currently, only the first item defined in the plugin list will be
// installed.
updatedCfg := setDefaultPlugins(agentCfg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-adding comment from last review since I think it was missed:

I recommend moving most of the detailed logic from Reconcile into multiple functions. If you look over the other reconcile implementations in the other controller, I've tried real hard to keep them short/readable, sticking to the high level status transitions, because otherwise this function is overwhelming and difficult to test/maintain.

@VinozzZ VinozzZ requested review from carolynvs and sgettys December 21, 2022 17:55
return porterv1.AgentConfigSpecAdapter{}, errors.Wrapf(err, "cannot retrieve agent configuration %s specified by the agent action", action.Spec.AgentConfig.Name)

}
isCfgReady = err == nil && instCfg.Status.Ready
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Signed-off-by: Yingrong Zhao <[email protected]>
@VinozzZ VinozzZ requested a review from sgettys January 10, 2023 17:22
Copy link
Member

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is a giant feature. I had no idea when we first started that it would require so much careful logic and design to work properly.

Thank you for taking the time to think through all the workflow and edge cases. This is amazing and I'm really excited to get it merged so everyone can use it! 💖

// AgentConfigStatus defines the observed state of AgentConfig
type AgentConfigStatus struct {
PorterResourceStatus `json:",inline"`
// The current status of the .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this comment is incomplete

PorterResourceStatus `json:",inline"`
// The current status of the .
// +kubebuilder:validation:Type=boolean
Ready bool `json:"ready,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I recommend removing omitempty, so that when people look at the status they see all the fields and their current values. As it is, this (hiding ready when false) requires the user to know the schema and that ready defaults to false.

Version string `json:"version,omitempty" mapstructure:"version,omitempty"`
}

// AgentConfigSpecAdapter is a wrapper of AgentConfigSpec with a list representation of plugins configuration.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring seems a bit misleading since it says that the adapter exists to just provide Plugins as a list instead of a map. But based on our previous discussions around the difficulties in defaulting the plugins, and how you are using the original field to protect people from accidentally directly accessing the fields instead of using the getter functions, makes me think that there's a lot more behind why you split this into it's own data structure.

Can you add a bit of that to the doc so that people can follow why this struct exists and how they should use it?

Comment on lines +381 to +392
if p.FeedURL != "" {
plugins = append(plugins, fmt.Sprintf("_%s", cleanURL(p.FeedURL)))
}
if p.URL != "" {
plugins = append(plugins, fmt.Sprintf("_%s", cleanURL(p.URL)))
}
if p.Mirror != "" {
plugins = append(plugins, fmt.Sprintf("_%s", cleanURL(p.Mirror)))
}
if p.Version != "" {
plugins = append(plugins, fmt.Sprintf("_%s", p.Version))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion, but it may be worthwhile for maintenance of this code and the function above to split these lines into a separate function and then use the value for both the label and the hash.

// GetLabels returns a value that is safe to use
// as a label value and represents the plugin configuration used
// to trigger reconciliation.
// labels are restricted to alphanumeric and .-_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's one more restriction on label values, they must be 63 characters or less

require.True(t, apierrors.IsNotFound(err), "expected the agent config was deleted")

// Verify that reconcile doesn't error out after it's deleted
triggerReconcile()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

phew! This is a mega test 💯, looks great and just shows how much logic is tied up in the AgentConfigController!

Comment on lines 325 to 330
porterRepository: ghcr.io/getporter/porter-agent
porterVersion: v1.0.2
serviceAccount: porter-agent
volumeSize: 64Mi
pullPolicy: Always
installationServiceAccount: installation-agent
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we omit all of these and just specify a plugin?

var porterAgentImgRepository = "ghcr.io/getporter/dev/porter-agent-kubernetes"
var porterAgentImgVersion = porterVersion
var (
porterAgentImgRepository = "ghcr.io/getporter/porter-agent"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WOOOOOO! 🎉 I love seeing this.

)

var _ = Describe("AgentConfig delete", func() {
Context("when an existing AgentConfig is delete", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Context("when an existing AgentConfig is delete", func() {
Context("when an existing AgentConfig is deleted", func() {

debug: true
debug-plugins: true
default-secrets: "kubernetes-secrets"
verbosity: "debug"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops! Thanks for fixing that 👍

Signed-off-by: Yingrong Zhao <[email protected]>
Copy link
Member

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Ship it 🚀

@VinozzZ VinozzZ merged commit 78f73e3 into getporter:main Jan 11, 2023
@VinozzZ VinozzZ deleted the install-plugin-agent branch January 11, 2023 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants