Rough notes for standing up Dataverse for Notch8: Docker Compose (lab), Kubernetes / Helm, optional GitHub Actions deploys, and a shared learnings table. Extend into a full runbook as you validate each environment.
Reference: Source — github.com/notch8/dataverseup. An active deployment from this approach is at demo-dataverseup.notch8.cloud (seeded demo content for smoke tests).
- Target: Dataverse v6.10 on AWS by April 7, 2026 — functional demo, not necessarily production-hardened.
- Deliverable: Working deployment and documented process + learnings (this document).
See the repository `README.md` — run `docker compose up` after creating `.env` and `secrets/` from the provided examples.
The following sections describe how to install the dataverseup Helm chart from this repository.
Prerequisites: Helm 3, a Kubernetes cluster, kubectl configured, a PostgreSQL database reachable from the cluster (in-cluster or managed), and a StorageClass for any PVCs you enable.
Chart path: charts/dataverseup
Run from the repository root; the wrapper installs or upgrades the chart with `--install`, `--atomic`, `--create-namespace`, and a default `--timeout 30m0s` (Payara's first boot is slow):

```shell
./bin/helm_deploy RELEASE_NAME NAMESPACE
```
Pass extra Helm flags with HELM_EXTRA_ARGS (values file, longer timeout, etc.). If you pass a second --timeout in HELM_EXTRA_ARGS, it overrides the default (Helm uses the last value).
```shell
HELM_EXTRA_ARGS="--values ./your-values.yaml --wait --timeout 45m0s" ./bin/helm_deploy my-release my-namespace
```

- Dataverse (`gdcc/dataverse`) — Payara on port 8080; the Service may expose 80 → target 8080 for Ingress compatibility.
- Optional bootstrap Job (`gdcc/configbaker`) — usually a Helm post-install hook (`bootstrapJob.helmHook: true`). `bootstrapJob.mode: oneShot` runs `bootstrapJob.command` only (default: `bootstrap.sh dev` — FAKE DOI, `dataverseAdmin`, etc.). `bootstrapJob.mode: compose` mirrors local Docker Compose: wait for the API, run configbaker with a writable token file on `emptyDir`, then `apply-branding.sh` and `seed-content.sh` (fixtures baked into a ConfigMap). Tune waits with `bootstrapJob.compose` and allow a longer `bootstrapJob.timeout` when seeding.
- Optional dedicated Solr (`internalSolr`) — a new Solr Deployment/Service in the same release and namespace as Dataverse (not wired into someone else’s shared “cluster Solr”). The default `solrInit.mode` is `standalone`: the Dataverse pod waits for that Solr core before starting. Use `solrInit.mode: cloud` only when Dataverse talks to a SolrCloud + ZooKeeper you operate separately.
- Optional S3 — `awsS3.enabled` mounts AWS credentials and ships the S3 init script.
- Navbar SVG — Enable `brandingNavbarLogos.enabled` so an init container copies `branding/docroot/logos/navbar/logo.svg` from the chart onto `/dv/docroot/logos/navbar/logo.svg` (needs `docrootPersistence` or the chart’s emptyDir docroot fallback). Match `LOGO_CUSTOMIZATION_FILE` in `branding/branding.env` to the web path (e.g. `/logos/navbar/logo.svg`).
- Admin settings (installation name, footer, optional custom header/footer CSS paths) — Edit `branding/branding.env` in the repo. The chart embeds it in the `…-bootstrap-chain` ConfigMap when `bootstrapJob.mode: compose`. The post-install Job runs `apply-branding.sh`, which PUTs those settings via the Dataverse Admin API using the admin token from configbaker.
- Custom HTML/CSS files — Add them under `branding/docroot/branding/` in the repo, set `HEADER_CUSTOMIZATION_FILE`, etc. in `branding.env` to `/dv/docroot/branding/...`, and ship those files into the pod (`extraVolumeMounts`/`configMap` or bake them into an image). The stock chart does not mount the whole `branding/docroot/branding/` tree on the main Deployment; compose only ships `branding.env` and the logo via `brandingNavbarLogos`.
- After `helm upgrade` — The post-install hook does not re-run. To re-apply branding, use `bootstrapJob.compose.postUpgradeBrandingSeedJob` with a Secret holding `DATAVERSE_API_TOKEN`, or run `scripts/apply-branding.sh` locally/cron with `DATAVERSE_INTERNAL_URL` and a token.
The chart does not install PostgreSQL by default. Supply DB settings with extraEnvVars and/or extraEnvFrom (recommended: Kubernetes Secret for passwords).
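One hedged shape for that wiring (the Secret and host names are illustrative; check `charts/dataverseup/values.yaml` for the exact field names):

```yaml
# Illustrative values fragment: DB config via env, password via a Secret.
extraEnvVars:
  - name: DATAVERSE_DB_HOST
    value: postgres.my-ns.svc.cluster.local   # hypothetical in-cluster DB
  - name: DATAVERSE_DB_USER
    value: dataverse
  - name: DATAVERSE_DB_NAME
    value: dataverse
extraEnvFrom:
  - secretRef:
      name: dataverse-app-env                 # holds DATAVERSE_DB_PASSWORD etc.
```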
Enable internalSolr.enabled, solrInit.enabled, keep solrInit.mode: standalone, and supply solrInit.confConfigMap. Leave solrInit.solrHttpBase empty — the chart sets the Solr admin URL to the in-release Service (http://<release>-solr.<namespace>.svc.cluster.local:8983). Point your app Secret at that same host/port and core (see table below). You do not need an existing Solr installation in the cluster.
The local `docker-compose.yml` and this chart both target official Solr 9 (`solr:9.10.1`) and the IQSS Solr conf files vendored under the repo's `config/` (refresh from IQSS `develop` or a release tag as described in the root `README.md`).
| | Docker Compose | Helm (internalSolr + solrInit) |
|---|---|---|
| Solr image pin | `solr:9.10.1` | `internalSolr.image` / `solrInit.image` default `solr:9.10.1` |
| Default core name | `collection1` (see `scripts/solr-initdb/01-ensure-core.sh`) | `dataverse` (`solr-precreate` in `internal-solr-deployment.yaml`) |
| App Solr address | `SOLR_LOCATION=solr:8983` (host:port) | With `internalSolr.enabled`, the chart sets `DATAVERSE_SOLR_HOST`, `DATAVERSE_SOLR_PORT`, `DATAVERSE_SOLR_CORE`, `SOLR_SERVICE_*`, and `SOLR_LOCATION` to the in-release Solr Service and `solrInit.collection` (default `dataverse`). The GDCC ct profile otherwise defaults to host `solr` and core `collection1`, which breaks Kubernetes installs if unset. |
Compose only copies schema.xml and solrconfig.xml into the core after precreate. SolrCloud (solrInit.mode: cloud) still needs a full conf tree or solr-conf.tgz (including lang/, stopwords.txt, etc.) for solr zk upconfig — see Solr prerequisites.
- Standalone (default, with `internalSolr`): the initContainer waits for `/solr/<core>/admin/ping` via `curl`; the default `solr:9.10.1` image is sufficient. This matches launching a solo Solr with the chart instead of consuming a shared cluster Solr Service.
- Cloud / ZooKeeper (optional): set `solrInit.mode: cloud` and `solrInit.zkConnect` when Dataverse uses a SolrCloud you run elsewhere. The same container runs `solr zk upconfig`; use a Solr major version compatible with that cluster. Override `solrInit.image`, `solrInit.solrBin`, and `solrInit.securityContext` if you use a vendor image (e.g. legacy Bitnami).
- Create namespace: `kubectl create namespace <ns>`
- Database: Provision Postgres and a database/user for Dataverse. Note the service DNS name inside the cluster (e.g. `postgres.<ns>.svc.cluster.local`).
- Solr configuration ConfigMap (if using `solrInit`/`internalSolr`): Dataverse needs a full Solr configuration directory for its version — not `schema.xml` alone. Build a ConfigMap whose keys are the files under that conf directory (or a single `solr-conf.tgz` as produced by your packaging process). See Solr prerequisites.
- Application Secret (example name `dataverse-app-env`): Prefer `stringData` for passwords. Include at least the variables the GDCC image expects for JDBC and Solr (mirror what you use in the Docker Compose `.env`). Typical keys include:
  - Database: `DATAVERSE_DB_HOST`, `DATAVERSE_DB_USER`, `DATAVERSE_DB_PASSWORD`, `DATAVERSE_DB_NAME`, `POSTGRES_SERVER`, `POSTGRES_PORT`, `POSTGRES_DATABASE`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `PGPASSWORD`
  - Solr: `SOLR_LOCATION` or `DATAVERSE_SOLR_HOST`/`DATAVERSE_SOLR_PORT`/`DATAVERSE_SOLR_CORE` (match your Solr deployment)
  - Public URL / hostname: `DATAVERSE_URL`, `hostname`, `DATAVERSE_SERVICE_HOST` (used by init scripts and UI)
  - Optional: `DATAVERSE_PID_*` for FAKE DOI (see the default chart comments and container demo docs)
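A sketch of that application Secret (all values illustrative; prefer `stringData` so you can paste plain strings):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: dataverse-app-env        # matches the example name above
type: Opaque
stringData:                      # illustrative values only
  DATAVERSE_DB_HOST: postgres.my-ns.svc.cluster.local
  DATAVERSE_DB_USER: dataverse
  DATAVERSE_DB_PASSWORD: change-me
  DATAVERSE_DB_NAME: dataverse
  SOLR_LOCATION: my-release-solr.my-ns.svc.cluster.local:8983
  DATAVERSE_URL: https://demo-dataverseup.notch8.cloud
```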
- Values file: Start from `charts/dataverseup/values.yaml` and override with a small values file of your own. At minimum for a first install:
  - `persistence.enabled: true` (file store)
  - `extraEnvFrom` pointing at your Secret
  - If using dedicated in-chart Solr: `internalSolr.enabled`, `solrInit.enabled`, `solrInit.confConfigMap`, `solrInit.mode: standalone` (default). Omit `solrInit.solrHttpBase` to use the auto-derived in-release Solr Service URL.
  - `bootstrapJob.enabled: true` for first-time seeding
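Pulled together, a minimal first-install values file might look roughly like this (key names per this section; verify against `charts/dataverseup/values.yaml` before use):

```yaml
# your-values.yaml — minimal first install (sketch, not authoritative)
persistence:
  enabled: true                 # PVC for the file store
extraEnvFrom:
  - secretRef:
      name: dataverse-app-env   # your application Secret
internalSolr:
  enabled: true                 # dedicated in-release Solr
solrInit:
  enabled: true
  mode: standalone              # default; solrHttpBase omitted on purpose
  confConfigMap: solr-conf      # hypothetical ConfigMap with the full Solr conf
bootstrapJob:
  enabled: true                 # first-time seeding
```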
- Lint and render:

```shell
helm lint charts/dataverseup -f your-values.yaml
helm template dataverseup charts/dataverseup -f your-values.yaml > /tmp/manifests.yaml
```

- Install. Using the wrapper (from the repo root):

```shell
HELM_EXTRA_ARGS="--values ./your-values.yaml --wait" ./bin/helm_deploy <release> <namespace>
```

Raw Helm (equivalent shape):

```shell
helm upgrade --install <release> charts/dataverseup -n <ns> -f your-values.yaml --wait --timeout 45m
```
- Smoke tests:
  - `kubectl get pods -n <ns>`
  - Bootstrap job logs (if enabled): `kubectl logs -n <ns> job/...-bootstrap`
  - API: port-forward or Ingress → `GET /api/info/version` should return 200
  - UI login (default bootstrap admin from the configbaker dev profile — change it before any shared environment)
- Helm test (optional): `helm test <release> -n <ns>`
The `.github/workflows/deploy.yaml` job uses the GitHub Environment named by the `environment` workflow input (e.g. `demo`); it must match `ops/<environment>-deploy.tmpl.yaml`. The "Prepare kubeconfig and render deploy values" step runs `envsubst` only for selected secrets (`DB_PASSWORD`, `SMTP_PASSWORD`) and `GITHUB_RUN_ID`. Public URLs, ingress, in-cluster Solr/Dataverse Service DNS, `:SystemEmail` / JavaMail From addresses (demo: both `support@notch8.com`), the S3 bucket name, and Postgres identifiers are plain literals in that file — edit them there when the environment changes (they must match your Helm release/namespace, e.g. `demo-dataverseup`).
Secrets (typical, per Environment): `DB_PASSWORD`, `KUBECONFIG_FILE` (base64), and, for mail, `SMTP_PASSWORD` (SendGrid API key for demo). Demo values hard-code `system_email` and `no_reply_email` to `support@notch8.com` in `ops/demo-deploy.tmpl.yaml` (no `SYSTEM_EMAIL` / `NO_REPLY_EMAIL` secrets required).
Repository or Environment variables (optional):
| Variable | Purpose | Default if unset |
|---|---|---|
| `DEPLOY_TOOLBOX_IMAGE` | Job container image | `dtzar/helm-kubectl:3.9.4` |
| `HELM_CHART_PATH` | Path passed to helm / `bin/helm_deploy` | `./charts/dataverseup` |
| `HELM_APP_NAME` | `app.kubernetes.io/name` for `kubectl rollout status` | `github.event.repository.name` |
| `DEPLOY_ROLLOUT_TIMEOUT` | Rollout wait | `10m` |
| `DEPLOY_BOOTSTRAP_JOB_TIMEOUT` | Bootstrap Job wait | `25m` |
Default Helm release and namespace are <environment>-<repository.name> (e.g. demo-dataverseup). Override with workflow inputs k8s_release_name / k8s_namespace when needed.
For demo, SMTP host, ports, auth flags, and support@notch8.com addresses live only in ops/demo-deploy.tmpl.yaml — the workflow does not pass GitHub Variables for those; change the template (or add ${VAR} placeholders and extend envsubst) if you need per-environment overrides.
Migrating or renaming a release: Update the literals in ops/<environment>-deploy.tmpl.yaml (ingress hosts, dataverse_* / DATAVERSE_* / hostname, solrHttpBase, SOLR_*, DATAVERSE_URL, awsS3.bucketName, DB names, etc.) so they match the new Helm release and namespace; then align Postgres, S3, TLS, and running workloads.
- Chart: set `mail.enabled: true` so `010-mailrelay-set.sh` is included in the ConfigMap. The pod env must include the names the script reads: `system_email`, `mailhost`, `mailuser`, `no_reply_email`, `smtp_password`, `smtp_port`, `socket_port`, `smtp_auth`, `smtp_starttls`, `smtp_type`, `smtp_enabled` (see `ops/demo-deploy.tmpl.yaml` and `scripts/init.d/010-mailrelay-set.sh`). `smtp_enabled` must not be `false`/`0`/`no`, and `system_email` must be non-empty, or the script no-ops.
- GitHub Environment (demo): Add the Secret `SMTP_PASSWORD` (SendGrid API key). `ops/demo-deploy.tmpl.yaml` sets `DATAVERSE_MAIL_SYSTEM_EMAIL`/`DATAVERSE_MAIL_SUPPORT_EMAIL` to `support@notch8.com` and `DATAVERSE_MAIL_MTA_*` for SendGrid (587 + STARTTLS + `apikey` user). Since Dataverse 6.2+, outbound mail uses these MicroProfile settings (SMTP/Email in the Dataverse Guide); `010-mailrelay-set.sh` (Payara JavaMail + `:SystemEmail`) is not sufficient alone. Set `DATAVERSE_MAIL_DEBUG` to `true` in the template temporarily for verbose mail logs (`dataverse.mail.debug`). Restart pods after changing mail env (the MTA session is cached).
- Verify config: After the pod is ready, check the setting with `curl -sS "https://<your-host>/api/admin/settings/:SystemEmail"` (use a superuser API token if the endpoint requires auth on your version). In the UI, use Contact / support mail, Forgot password, or user signup (if enabled) and confirm delivery (and check the provider dashboard / spam folder).
- Logs: `kubectl logs -n <ns> deploy/<release>-dataverseup` (or your deployment name); look for `010-mailrelay`/`asadmin` errors during startup.
- Docker Compose (local): Set the same variable names in `.env` (see the SMTP block in `.env.example`). They are passed through on the dataverse service. Restart dataverse and tail `docker compose logs -f dataverse` during Payara init.
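For orientation, the MicroProfile mail settings above might be wired as container env roughly like this (a sketch only; `ops/demo-deploy.tmpl.yaml` is authoritative for the exact keys, and the STARTTLS/debug flags are omitted here):

```yaml
# Sketch: Dataverse 6.2+ MicroProfile mail env for a SendGrid-style relay.
extraEnvVars:
  - name: DATAVERSE_MAIL_SYSTEM_EMAIL
    value: support@notch8.com
  - name: DATAVERSE_MAIL_MTA_HOST
    value: smtp.sendgrid.net
  - name: DATAVERSE_MAIL_MTA_PORT
    value: "587"
  - name: DATAVERSE_MAIL_MTA_AUTH
    value: "true"
  - name: DATAVERSE_MAIL_MTA_USER
    value: apikey
  - name: DATAVERSE_MAIL_MTA_PASSWORD
    valueFrom:
      secretKeyRef:
        name: dataverse-mail   # hypothetical Secret holding the API key
        key: SMTP_PASSWORD
```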
- If the UI says the message was sent but nothing arrives: `010-mailrelay-set.sh` runs only when Payara starts. Changing `SMTP_PASSWORD` in GitHub without rolling/restarting Dataverse pods leaves the old JavaMail session (or none). After fixing secrets, redeploy or delete pods so init runs again.
- Confirm rendered values: open the generated `ops/<env>-deploy.yaml` (CI artifact, or run `envsubst` locally) and check that `smtp_password` is non-empty (not the literal `${SMTP_PASSWORD}`) and `system_email`/`no_reply_email` are `support@notch8.com` for demo.
- SendGrid: use the full API key as `SMTP_PASSWORD`, `mailuser=apikey`, port 587, and `smtp_auth`/`smtp_type` as in `ops/demo-deploy.tmpl.yaml`. In SendGrid Activity, look for bounces, blocks, and “unauthenticated” sends. Verify the From address (`no_reply_email`) via Single Sender or domain authentication.
- Payara: successful sends often log nothing at default levels. Exec into the Dataverse container and run:

```shell
asadmin --user "$ADMIN_USER" --passwordfile "$PASSWORD_FILE" list-javamail-resources
```

  You should see `mail/notifyMailSession`. If it is missing, init did not configure mail (wrong env, `smtp_enabled`, or a failed script).
- Domain logs (GDCC base image): Payara lives under `/opt/payara/appserver`, not `/opt/payara/glassfish` alone. Typical file: `/opt/payara/appserver/glassfish/domains/domain1/logs/server.log`. Or discover it: `find /opt/payara -name server.log 2>/dev/null`. Much application output also goes to `kubectl logs` on the main container (`dataverseup`).
- Network: from the pod, `nc -zv smtp.sendgrid.net 587` (or your `mailhost`) must succeed if cluster egress allows it.
- Why you don’t see Rails-style “Sent mail” lines: ActionMailer logs each delivery at INFO by default; Dataverse does not. Dataverse 6.2+ uses `dataverse.mail.*`/`DATAVERSE_MAIL_*`; with `DATAVERSE_MAIL_DEBUG=true` you get the supported verbose logging described in the installation guide (set it in `ops/demo-deploy.tmpl.yaml`, then roll pods — disable after debugging).
- Lower-level JavaMail trace (optional): `-Dmail.debug=true` on `JVM_OPTS` still works for raw SMTP wire logs but can leak credentials; prefer `DATAVERSE_MAIL_DEBUG` first.
- Closest thing to a delivery log: SendGrid Activity (accept / bounce / block).
Set ingress.enabled: true, ingress.className to your controller (e.g. nginx, traefik), and hosts/TLS to match your DNS. Payara serves HTTP on 8080; the Service fronts it on port 80 so Ingress backends stay HTTP.
If you terminate TLS or expose the app on a non-default host port, keep DATAVERSE_URL and related hostname settings aligned with the URL users and the app use.
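A hedged ingress fragment under those settings (the host and secret names are illustrative; confirm the schema in `charts/dataverseup/values.yaml`):

```yaml
ingress:
  enabled: true
  className: nginx                          # match your controller
  hosts:
    - host: demo-dataverseup.notch8.cloud   # your DNS name
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: demo-dataverseup-tls      # hypothetical TLS Secret
      hosts:
        - demo-dataverseup.notch8.cloud
```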
The chart embeds the S3 and mail relay scripts from scripts/init.d/ via symlinks under charts/dataverseup/files/init.d/. Edit scripts/init.d/006-s3-aws-storage.sh or scripts/init.d/010-mailrelay-set.sh once; both Compose mounts and Helm ConfigMaps stay aligned. helm package resolves symlink content into the tarball.
Set `initdFromChart.enabled: true` in values to include all `files/init.d/*.sh` in the same ConfigMap (compose parity with mounting `./scripts/init.d`). Keep `INIT_SCRIPTS_FOLDER` (or the image default) pointed at `/opt/payara/init.d`. Review the MinIO and webhook/trigger scripts (repo `scripts/triggers/`, Compose → `/opt/payara/triggers`) before enabling this in a cluster that does not mount those paths.
- Set `awsS3.enabled: true`, `awsS3.existingSecret`, `bucketName`, `region`, and `profile` in values. The IAM principal behind the Secret needs S3 access to that bucket. For Amazon S3, leave `endpointUrl` empty so Payara's init script does not set `custom-endpoint-url` (a regional `https://s3….amazonaws.com` URL there commonly causes upload failures). Set `endpointUrl` only for MinIO or other S3-compatible endpoints.
- Create a generic Secret in the same namespace as the Helm release, before the pods that mount it start. Key names must match `awsS3.secretKeys` (defaults below); the values are the raw file contents of `~/.aws/credentials` and `~/.aws/config`:
  - `credentials` — ini format; the profile block header (e.g. `[default]` or `[my-profile]`) must match `awsS3.profile`.
  - `config` — ini format; for `profile: default` use `[default]` with `region = ...`. For a named profile use `[profile my-profile]` and the same region as `awsS3.region` unless you know you need otherwise.
- Examples (replace `NAMESPACE`, keys, region, and the secret name if you changed `existingSecret`):

```shell
NS=NAMESPACE
kubectl create namespace "$NS" --dry-run=client -o yaml | kubectl apply -f -
kubectl -n "$NS" create secret generic aws-s3-credentials \
  --from-file=credentials="$HOME/.aws/credentials" \
  --from-file=config="$HOME/.aws/config" \
  --dry-run=client -o yaml | kubectl apply -f -
```

Inline `[default]` user (no local files):

```shell
kubectl -n "$NS" create secret generic aws-s3-credentials \
  --from-literal=credentials="[default]
aws_access_key_id = AKIA...
aws_secret_access_key = ...
" \
  --from-literal=config="[default]
region = us-west-2
" \
  --dry-run=client -o yaml | kubectl apply -f -
```

- If you use temporary credentials (assumed role / STS), add `aws_session_token = ...` to the credentials profile. Rotate before expiry or automate renewal.
- After creating or updating the Secret, restart the Dataverse Deployment (or delete its pods) so the volume is remounted. The chart sets `AWS_SHARED_CREDENTIALS_FILE` and `AWS_CONFIG_FILE` to the mounted paths.
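The values side of the S3 setup, as a sketch (bucket and region are illustrative):

```yaml
awsS3:
  enabled: true
  existingSecret: aws-s3-credentials   # the Secret created above
  bucketName: my-dataverse-bucket      # hypothetical bucket
  region: us-west-2
  profile: default
  # endpointUrl left empty for Amazon S3; set it only for MinIO etc.
```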
Note: The Java AWS SDK inside the app may not perform the same assume-role chaining as the AWS CLI from a complex config file. Prefer putting direct user keys or already-assumed temporary keys in the Secret for the app, or use EKS IRSA (service account + role) instead of long-lived keys if your platform supports it.
The Native API returns HTTP 400 with that message when the request reached Dataverse but writing to the configured store failed. This is not a bug in the seed script’s jsonData shape.
With dataverse.files.storage-driver-id=S3 (see scripts/init.d/006-s3-aws-storage.sh):
- IAM — The principal in your `aws-s3-credentials` Secret needs at least `s3:PutObject`, `s3:GetObject`, `s3:DeleteObject`, and `s3:ListBucket` on the target bucket (and the prefixes Dataverse uses). A missing `PutObject` often surfaces exactly as this generic message; the real error is in server logs.
- Bucket and region — `awsS3.bucketName` and `awsS3.region` must match the bucket. For Amazon S3, keep `awsS3.endpointUrl` empty; do not point it at `https://s3.<region>.amazonaws.com` unless you are on a non-AWS S3-compatible store.
- Credentials mounted — After creating or rotating the Secret, restart Dataverse pods so `AWS_SHARED_CREDENTIALS_FILE`/`AWS_CONFIG_FILE` point at the new files.
- Logs — Check the Dataverse Deployment pod logs for nested exceptions (`AccessDenied`, `NoSuchBucket`, SSL, etc.).
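A minimal IAM policy matching the actions listed above might look like this (the bucket name is illustrative; scope `Resource` to your actual bucket and prefixes):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-dataverse-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::my-dataverse-bucket"
    }
  ]
}
```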
- Bump `image.tag`/`Chart.appVersion` together with Dataverse release notes.
- Reconcile the Solr conf ConfigMap when the Solr schema changes.
- When upgrading internal Solr across a major Solr version (e.g. 8 → 9), use a fresh Solr data volume (new PVC or wipe `internalSolr` persistence) so cores are recreated; same idea as Compose (see the root `README.md`).
- After bumping `solrInit`/`internalSolr` images, re-test SolrCloud installs (`solr zk` + collection create) in a non-production cluster if you use `solrInit.mode: cloud`.
- If `bootstrapJob.helmHook` is true, the bootstrap Job runs on post-install only, not on every upgrade (by design).
Append rows as you go (Compose, cluster, CI, etc.):
| Date | Environment | Note |
|---|---|---|
- Running Dataverse in Docker (conceptual parity with container env)
- Application image
- Solr prerequisites