The future of multi-tenancy #6710
Why does this always come up as the silver bullet? Containerization/Kubernetes is the solution only if you also tightly integrate with that ecosystem. Merely having a simple way to run instances side-by-side on the same machine is not the reason for multi-tenancy (at least not for me). And simply putting stuff in containers doesn't mean it runs well in that environment. So, just to paint the picture: at the height of educast.nrw we had 17 tenants (with hopes to scale further). The main reason we decided to use multi-tenancy back then, and the reason I now also use it at shio solutions for tales.media, is to avoid the overhead that each individual Opencast cluster comes with. And this in multiple ways:
I don't care for multi-tenancy if the overhead is minimal. We don't yet run Tobira, but I imagine it's very low on resource usage and has other properties that are very advantageous for containerized workloads. For Opencast to fall in this category, it would need to have the following properties:
We have run Opencast in containers for a very long time (started in 2.1.x times) and have obviously seen advantages in doing so. But Opencast is not the "ideal" workload for today's cluster solutions. Improving on these properties would make it more reasonable to run single-tenant systems. Now, some comments on the mentioned problems:
Kubernetes' scheduler supports multi-resource scheduling and, with additional components, has explicit support for GPUs. I personally have not yet configured a Kubernetes cluster with GPU support (educast.nrw was shut down before we got to it), but my university runs a managed Kubernetes cluster with GPU workloads. However, non-datacenter GPUs might be a problem (NVIDIA doesn't want you to run them in servers anyway).
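For concreteness, here is a minimal sketch of what a GPU request looks like in a pod spec. This assumes a cluster where the NVIDIA device plugin is installed (which is what advertises the `nvidia.com/gpu` extended resource to the scheduler); the pod name and image are made up:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: encoder-gpu                # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: encode
      image: example.org/opencast-encoder:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: 1        # extended resource exposed by the device plugin
```

The scheduler will only place this pod on a node that actually advertises a free GPU, which is exactly the multi-resource scheduling mentioned above.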
Well, you can make dynamic scaling work, but making use of all available hardware resources is challenging, as mentioned above.
I had some ideas around this as well. Basically, pushing everything forward to Kubernetes jobs. Or something like the GitLab CI systems with multiple implementations for runner environments. In my research, I actually work on a prototype system for cloud and edge-native multimedia workflows. I thought of making this more production-ready after my PhD and having an Opencast operation that pushes workflows to this system 🙈.
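The "push processing to Kubernetes Jobs" idea could look roughly like this. This is only a sketch of the concept, not an existing Opencast feature; the image, names, and arguments are all made up:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: workflow-op-encode-1234    # hypothetical: one Job per workflow operation
spec:
  backoffLimit: 2                  # retry a failed operation twice
  ttlSecondsAfterFinished: 3600    # garbage-collect finished Jobs after an hour
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: encode
          image: example.org/opencast-worker:latest    # hypothetical image
          args: ["encode", "--mediapackage", "1234"]   # hypothetical CLI
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```

The appeal is that Kubernetes then handles scheduling, retries, and cleanup per operation, much like GitLab CI hands individual jobs to interchangeable runners.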
Yes.
No. Kubernetes already has good documentation, but a production-ready Kubernetes cluster is hard regardless. There are way too many environments, from on-prem to cloud to edge, to provide common documentation. I don't see Opencast Matrix giving advice on setting up Kubernetes. We also have to consider that getting familiar with Kubernetes to the point of managing workloads is one thing; actually setting up and maintaining a Kubernetes cluster is a whole different thing. I'm therefore very torn on forcing the whole community in this direction.
The asset manager is actually separated into different folders, and you can now use different S3 buckets. But until recently, assets could be hard-linked between orgs. Having one process connect to different databases sounds somehow wrong to me, but I don't really have strong arguments against it. It's more like not going all the way to getting rid of multi-tenancy.
Moving discussion to
Hello community,
as part of the data model discussion, multi-tenancy came up again. We talked about it a bit and identified a few things that could be improved, but it was also questioned whether we could get rid of it completely (gasp!). Therefore, we decided to get feedback from the community on this.
If you run an Opencast with multi-tenancy, please take the time to read this post and write a reply. In particular, please describe your use case and explain in detail why you use multi-tenancy as opposed to just having multiple Opencasts (potentially in a Kubernetes cluster). Knowing the exact reasons helps us find a good solution for everyone. Note: also consider the XY problem and try to identify your root reasons.
Problems & reasons to get rid of it
Reasons for multi-tenancy
And yet, some adopters use multi-tenancy: usually one tenant per department of your university/organization, or, as an "Opencast as a service" provider, one tenant per organization/university.
Possible paths forward
Containerization and Kubernetes are the obvious solution. There are some potential problems, though (which we need to actually dig into instead of just saying "sounds about right" and giving up on the idea):
But if we don't want to go the full way and lift that logic completely out of Opencast, we could still improve Opencast's implementation of it and fix lots of problems by lifting the multi-tenancy logic up a bit. The core problem is that multi-tenancy is handled at the very bottom, by every piece of code. For example, events have an `organization` column in the DB, and all code handling events potentially has to check that field to filter for events of only a specific tenant. And that's easy to forget.

We could instead have one database (not DBMS!) per tenant, and also one separate folder in storage for each tenant. Then almost all code can pretend there is only one tenant, avoiding multi-tenancy logic scattered everywhere and also vastly improving the isolation between tenants.
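To illustrate the difference, here is a small self-contained Python sketch (using sqlite3 purely for illustration; none of this is Opencast code, and all names are made up) contrasting the shared `organization` column with one database per tenant:

```python
import sqlite3

# --- Status quo: one shared table; every query must remember the filter. ---
shared = sqlite3.connect(":memory:")
shared.execute("CREATE TABLE events (id TEXT, organization TEXT)")
shared.executemany("INSERT INTO events VALUES (?, ?)",
                   [("e1", "tenant_a"), ("e2", "tenant_b")])
# Forgetting "WHERE organization = ?" silently returns other tenants' events:
leaked = shared.execute("SELECT id FROM events").fetchall()

# --- Proposal: one database per tenant; code can pretend it's single-tenant. ---
def db_for(tenant: str) -> sqlite3.Connection:
    """Hypothetical router: hand out a separate database per tenant."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id TEXT)")  # no organization column needed
    return conn

dbs = {t: db_for(t) for t in ("tenant_a", "tenant_b")}
dbs["tenant_a"].execute("INSERT INTO events VALUES ('e1')")
dbs["tenant_b"].execute("INSERT INTO events VALUES ('e2')")
# No filter needed; isolation comes from the connection itself:
only_a = dbs["tenant_a"].execute("SELECT id FROM events").fetchall()
print(len(leaked), only_a)
```

In the first half a single forgotten `WHERE` clause leaks data across tenants; in the second half there is simply nothing to forget, which is the isolation argument made above.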