
Volumes, Tasks and PipelineResources overlap #1272

@vtereso


Background

Tekton Pipelines/Tasks separate functionality from configuration by acting as reusable components for work. Work is instantiated in the corresponding PipelineRun/TaskRun objects (referred to as Run objects from here on for brevity). Run objects store the work configuration through both PipelineResources and parameters.

Goals

  • Understand the true intent/purpose behind PipelineResources
  • Maybe remove PipelineResources
  • Potentially augment volume handling in Tekton

Introduction

According to the documentation:

PipelinesResources in a pipeline are the set of objects that are going to be used as inputs to a Task and can be output by a Task.

This does not explain their purpose, only their current embodiment.
For the sake of reusability, it makes a lot of sense to structure things in a functional way (e.g. f(x)=y), where Pipelines/Tasks are the functions that ingest configuration. In contrast, PipelineResources seem to fall somewhere in between.

PipelineResource Role

Although each kind of PipelineResource does something different, I believe it is reasonable to consider them syntactic sugar for mounts. In general, mounts are really helpful because they allow some foreign information to be attached to a pod/container.

Without mounts, there would only be two options:

  • Create wrapper images around base images that expect configuration to be injected
  • Have functionality and configuration coupled, which is less than ideal

In actuality, PipelineResources append steps to Tasks and handle volumes. Since Tasks are supposed to be reusable (e.g. the catalog), it seems strange to add more carpentry that manipulates their definition. Stranger still, each PipelineResource does something unique rather than being a normalized operation, which also creates an extensibility problem. PipelineResources would likely be clearer if their responsibilities were divided between appending steps and handling volumes. The orchestration of the volume seems to be the important piece, and the step appending somewhat less so. PipelineResources could be reorganized as Tasks that are composed together.

Let's look at a few different PipelineResources:

Git Resource

Git resource represents a git repository, that contains the source code to be built by the pipeline. Adding the git resource as an input to a Task will clone this repository and allow the Task to perform the required actions on the contents of the repo.
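
As a rough illustration of the behavior described in that quote, a Task might declare a git input along these lines. This is only a sketch that assumes the `tekton.dev/v1alpha1` API; the names `list-source` and `source` and the `ubuntu` image are placeholders, not anything taken from the catalog.

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: list-source            # hypothetical name
spec:
  inputs:
    resources:
      - name: source           # hypothetical; bound to a git PipelineResource by the TaskRun
        type: git
  steps:
    - name: list
      image: ubuntu            # placeholder image
      command: ["ls"]
      # Input resources are expected under /workspace/<resource name>
      args: ["-la", "/workspace/source"]
```

Note that the Task never states how the clone happens. That work is added on its behalf, which is the step-appending behavior discussed above.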

Pull Request Resource

Adding the Pull Request resource as an input to a Task will populate the workspace with a set of files containing generic pull request related metadata such as base/head commit, comments, and labels.

...

Adding the Pull Request resource as an output of a Task will update the source control system with any changes made to the pull request resource during the pipeline.

In just these two use cases (although there are more), PipelineResources do help simplify mounting. However, there is a bit of quirkiness between PipelineResources. Some resources, like GitResources, seem to be used only as inputs, while PullRequestResources can be both inputs and outputs but are unlikely to ever be passed as inputs to other Tasks. In any case, they could very well operate as Tasks rather than being shipped around between Tasks. This sort of behavior is outlined here.
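
To make "operate as Tasks" a little more concrete, here is one way a clone could be written as an ordinary step that shares a volume with the step after it. It is a sketch, not a proposal for the final shape: the Task name, the `alpine/git` and `ubuntu` images, and the `/source` mount path are all assumptions, and the `$(inputs.params.url)` interpolation syntax may differ between releases (older ones used `${...}`).

```yaml
apiVersion: tekton.dev/v1alpha1
kind: Task
metadata:
  name: clone-and-list         # hypothetical name
spec:
  inputs:
    params:
      - name: url
        description: Repository to clone
  volumes:
    - name: source             # scratch volume shared by the steps below
      emptyDir: {}
  steps:
    - name: clone
      image: alpine/git        # assumed image that provides a git binary
      command: ["git"]
      args: ["clone", "$(inputs.params.url)", "/source"]
      volumeMounts:
        - name: source
          mountPath: /source
    - name: list
      image: ubuntu            # placeholder image
      command: ["ls"]
      args: ["-la", "/source"]
      volumeMounts:
        - name: source
          mountPath: /source
```

Everything the git resource would have done is visible in the Task itself, at the cost of repeating the `volumeMounts` stanza in every step that needs the data.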

All of the currently supported PipelineResources can be seen here, a list that is likely to grow, especially with consideration for integrations with the notifications proposal. I think it's important to make a distinction on the overlap of responsibilities between Volumes, Tasks, and PipelineResources before we invest further.

PipelineResources Problems

As mentioned in the background section, Run objects take parameters and PipelineResources.
Since these are distinct objects, PipelineResources need to be created ahead of time (separately), which is an inconvenience.
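
For example, running a Task with a git input currently takes two objects, roughly like the following. This is a sketch against the `tekton.dev/v1alpha1` API; the resource names, the referenced Task `list-source`, and the repository URL are made up for illustration.

```yaml
# Object 1: has to exist before the run is created
apiVersion: tekton.dev/v1alpha1
kind: PipelineResource
metadata:
  name: my-repo                # hypothetical name
spec:
  type: git
  params:
    - name: url
      value: https://github.com/example/repo   # placeholder URL
    - name: revision
      value: master
---
# Object 2: the run, which only points at object 1 by name
apiVersion: tekton.dev/v1alpha1
kind: TaskRun
metadata:
  name: list-source-run        # hypothetical name
spec:
  taskRef:
    name: list-source          # hypothetical Task that declares a git input named "source"
  inputs:
    resources:
      - name: source
        resourceRef:
          name: my-repo
```

Because the TaskRun holds only a reference, nothing in it records what `my-repo` pointed to at the time of the run.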

As a byproduct, there is currently a proposal to allow embedding a PipelineResource into the PipelineRun (although it remains a distinct k8s resource) to address this as well as resource littering. The aforementioned extensibility problem is also a concern. Further, PipelineResources can be tampered with or deleted. This introduces the cookie-cutter/templating problem, where Run objects utilizing PipelineResources (at least as currently implemented) cannot determine whether they have been modified during or between runs.

Potential Solution

Add some logic to the Tekton API to declaratively handle volumes, so that data can be passed between steps/Tasks in a reusable way.

There are multiple ways to do this, but ultimately it comes down to making Tekton much simpler in a few different ways:

  • Simplified yaml
  • Interpolation cut down to only params
  • Reduced load on the reconciler
  • Clear separation of resources and more catalog contributions

This also makes me wonder: if images and other fields could be overridden by Runs, would interpolation be necessary at all? With no interpolation (maybe this is a stretch), tools like Kustomize (not that I am familiar with it) or similar could be the engine that edits resources, rather than it being done internally.
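
As a very rough sketch of what that could look like, a Kustomize overlay can already rewrite a field of a Task with a JSON patch, without Tekton doing any substitution. The file name `task.yaml`, the Task name `build`, the step index, and the replacement image are assumptions made up for this example, and whether this approach scales to real pipelines is an open question.

```yaml
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - task.yaml                  # hypothetical file containing a Task named "build"
patches:
  - target:
      group: tekton.dev
      version: v1alpha1
      kind: Task
      name: build
    patch: |-
      # Replace the image of the first step instead of interpolating it
      - op: replace
        path: /spec/steps/0/image
        value: golang:1.13
```

The patch addresses the step by position, so it is brittle if steps are reordered. That is one of the trade-offs of moving edits out of the controller.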


Labels: area/api (indicates an issue or PR that deals with the API), kind/design (categorizes issue or PR as related to design)
