Every Jenkins estate decays the same way: each team copies a Jenkinsfile from the repo next door, mutates it, and within a year you have four hundred snowflakes and no way to roll out a CVE patch without a four-hundred-PR campaign. The fix is a platform, not a template. One versioned shared library owns the pipeline logic, repos call a single entrypoint, and the controller is rebuildable from code. This guide builds that platform end to end: library layout, custom DSL, versioning, self-onboarding folders, JCasC, ephemeral Kubernetes agents, secrets, and unit tests for the Groovy itself.
1. Structure the shared library: vars, src, resources
A Jenkins shared library is a Git repo with a fixed, magic directory layout. Jenkins only recognizes three top-level directories, and each has a distinct role:
pipeline-library/
  vars/                            # global variables -> custom DSL steps
    standardPipeline.groovy        # exposes step standardPipeline(...)
    standardPipeline.txt           # help text shown on the Global Variables Reference page
    dockerBuild.groovy
    notifySlack.groovy
  src/                             # Groovy classes on the classpath (com.kloudvin.ci.*)
    com/kloudvin/ci/
      BuildConfig.groovy
      Semver.groovy
  resources/                       # non-Groovy files, loaded with libraryResource
    com/kloudvin/ci/
      pod-templates/jnlp-maven.yaml
      sonar-project.properties.tmpl
The contract that matters: every .groovy file in vars/ becomes a global step named after the file. vars/standardPipeline.groovy defines a call() method, and pipelines invoke it as standardPipeline { ... }. That is the entire trick behind a custom DSL.
// vars/standardPipeline.groovy
def call(Closure body) {
    // Run the Jenkinsfile's closure against a map so `appName = '...'` lands as a key
    def config = [:]
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body.delegate = config
    body()
    def cfg = new com.kloudvin.ci.BuildConfig(config)
    pipeline {
        agent {
            kubernetes {
                yaml libraryResource("com/kloudvin/ci/pod-templates/jnlp-maven.yaml")
            }
        }
        options {
            timeout(time: cfg.timeoutMinutes, unit: 'MINUTES')
            buildDiscarder(logRotator(numToKeepStr: '30'))
            disableConcurrentBuilds()
        }
        stages {
            stage('Build') { steps { container('maven') { sh 'mvn -B clean package' } } }
            stage('Test') { steps { container('maven') { sh 'mvn -B test' } } }
            stage('Scan') {
                when { expression { cfg.scanEnabled } }
                steps { sonarScan(cfg) }
            }
            stage('Publish') {
                when { branch 'main' }
                steps { dockerBuild(cfg) }
            }
        }
        post {
            always { junit testResults: '**/surefire-reports/*.xml', allowEmptyResults: true }
            failure { notifySlack(status: 'FAILED', config: cfg) }
        }
    }
}
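The handoff from a `standardPipeline { ... }` closure to a config map can be tried outside Jenkins. The following is a minimal plain-Groovy sketch of that delegation pattern; `collectConfig` is a hypothetical helper name used only here, not part of the library.

```groovy
// Plain-Groovy model of a vars/ entrypoint that accepts a config closure.
// `collectConfig` is a hypothetical name for this sketch.
Map collectConfig(Closure body) {
    Map config = [:]
    // Resolve bare names (appName = ...) against the map before the script.
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body.delegate = config
    body()
    return config
}

def config = collectConfig {
    appName = 'payments-api'
    timeoutMinutes = 45
}

println config
```

Each assignment inside the closure becomes a key in the map, which is why a consuming Jenkinsfile can read like a declaration rather than a function call.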
Keep vars/ files thin. They are orchestration glue; anything with real logic (parsing, version math, config validation) belongs in src/ as a unit-testable class. A vars file that grows past ~80 lines is a src class trying to escape.
The BuildConfig class lives in src/ and gives you a typed, validated config object instead of a bag of untyped map keys:
// src/com/kloudvin/ci/BuildConfig.groovy
package com.kloudvin.ci

class BuildConfig implements Serializable {
    String appName
    String registry = 'registry.kloudvin.internal'
    Integer timeoutMinutes = 30
    Boolean scanEnabled = true

    BuildConfig(Map cfg) {
        // Plain statements only: closure calls inside a constructor do not
        // survive the pipeline's CPS transformation.
        if (!cfg.appName) {
            throw new IllegalArgumentException('appName is required')
        }
        this.appName = cfg.appName
        if (cfg.registry) this.registry = cfg.registry
        if (cfg.timeoutMinutes) this.timeoutMinutes = cfg.timeoutMinutes as Integer
        if (cfg.scanEnabled != null) this.scanEnabled = cfg.scanEnabled
    }
}
implements Serializable is not optional. Pipeline state is persisted to disk across restarts and resumed; any object that survives across a sh step or stage boundary must serialize. Forgetting this throws NotSerializableException at the worst possible moment.
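The failure mode is easy to reproduce without a controller. This sketch uses plain java.io serialization as a stand-in for the pipeline's own persistence (an assumption made for illustration); `GoodConfig`, `BadConfig`, and `serialize` are hypothetical names.

```groovy
// Hypothetical stand-ins for this sketch, not library classes.
class GoodConfig implements Serializable {
    private static final long serialVersionUID = 1L
    String appName
}

class BadConfig {            // same shape, but no Serializable marker
    String appName
}

// Write an object to bytes, the step at which a non-serializable value fails.
byte[] serialize(Object o) {
    def bytes = new ByteArrayOutputStream()
    def out = new ObjectOutputStream(bytes)
    out.writeObject(o)
    out.close()
    return bytes.toByteArray()
}

println "GoodConfig bytes: ${serialize(new GoodConfig(appName: 'payments-api')).length}"

def failure = null
try {
    serialize(new BadConfig(appName: 'payments-api'))
} catch (NotSerializableException e) {
    failure = e
}
println "BadConfig rejected: ${failure != null}"
```

Running a check like this in the library's unit tests catches a missing marker long before a controller restart does.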
2. Write custom DSL steps and a one-line Jenkinsfile
The whole point is that consuming repos do not author pipeline logic. Their Jenkinsfile declares intent and nothing else:
// A consuming repo's entire Jenkinsfile
@Library('kloudvin-pipeline@v3') _
standardPipeline {
    appName = 'payments-api'
    timeoutMinutes = 45
    scanEnabled = true
}
That trailing _ after the @Library annotation is required: the annotation must attach to something, and _ is the idiomatic no-op import. Now every supporting step is its own vars/ file, composed by standardPipeline:
// vars/sonarScan.groovy
def call(com.kloudvin.ci.BuildConfig cfg) {
    withSonarQubeEnv('kloudvin-sonar') {
        container('maven') {
            sh "mvn -B sonar:sonar -Dsonar.projectKey=${cfg.appName}"
        }
    }
    timeout(time: 10, unit: 'MINUTES') {
        // waitForQualityGate aborts the build if the gate fails
        waitForQualityGate abortPipeline: true
    }
}

// vars/dockerBuild.groovy
def call(com.kloudvin.ci.BuildConfig cfg) {
    def tag = "${cfg.registry}/${cfg.appName}:${env.GIT_COMMIT.take(12)}"
    container('kaniko') {
        sh """
            /kaniko/executor \
              --context=`pwd` \
              --dockerfile=Dockerfile \
              --destination=${tag} \
              --cache=true
        """
    }
}
This composition is the source of the platform’s power. Patch dockerBuild.groovy once – add a scan, switch the builder, change the registry – and every repo on that library version gets it on the next build, zero PRs to product repos.
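Because one step now builds every image reference, it is also the natural place to reject bad input once. A minimal sketch, assuming a hypothetical `imageRef` helper that could live in src/; the name pattern is a deliberately conservative subset of what registries accept, not the full grammar.

```groovy
// Hypothetical helper: compose the image reference the way dockerBuild does,
// but fail fast on values a registry would likely reject.
String imageRef(String registry, String appName, String commit) {
    // Conservative subset of repository naming: lowercase alphanumerics
    // separated by single '.', '_' or '-'.
    if (!(appName ==~ /[a-z0-9]+([._-][a-z0-9]+)*/)) {
        throw new IllegalArgumentException("invalid image name: ${appName}")
    }
    if (!(commit ==~ /[0-9a-f]{7,40}/)) {
        throw new IllegalArgumentException("not a commit SHA: ${commit}")
    }
    return "${registry}/${appName}:${commit.take(12)}"
}

println imageRef('registry.kloudvin.internal', 'payments-api', '0123456789abcdef0123')
```

A repo that sets `appName = 'Payments API'` then fails at configuration time with a readable message instead of deep inside the image build.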
3. Version the library: tags, trusted vs untrusted
A platform you cannot version is a platform you cannot change safely. The @Library('name@ref') annotation pins to any Git ref: a branch, a tag, or a commit SHA. Use semantic tags and let teams opt into a major line:
| Reference style | Example | When to use |
|---|---|---|
| Floating branch | @Library('lib@main') | Internal platform repos only; you accept breakage |
| Pinned major tag | @Library('lib@v3') | Default for product repos; moving tag tracks v3.x |
| Exact tag | @Library('lib@v3.4.1') | Repos that must freeze, e.g. during a compliance window |
Make v3 a moving tag you re-point to the latest v3.x release, so consumers pin to a major line and pick up backward-compatible fixes automatically:
# Cut a patch and advance the major-line pointer
git tag -a v3.4.2 -m "fix: kaniko cache key"
git tag -f v3 v3.4.2 # move the v3 alias forward
git push origin v3.4.2
git push -f origin v3 # consumers on @v3 get this on next build
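Deciding which release the alias should point at is version math, which makes it a good candidate for the Semver class in src/. The sketch below is one possible shape for that logic; `latestInMajor` is a hypothetical name, and it assumes release tags follow the vMAJOR.MINOR.PATCH form used above.

```groovy
// Hypothetical helper: given tag names, find the release that a moving
// major alias such as "v3" should point at.
String latestInMajor(List<String> tags, int major) {
    def releases = tags.collect { tag ->
        def m = (tag =~ /v(\d+)\.(\d+)\.(\d+)/)
        // matches() anchors the whole string, so "v3" and "main" are skipped
        m.matches() ? [tag: tag, parts: [m.group(1), m.group(2), m.group(3)]*.toInteger()] : null
    }.findAll { it != null && it.parts[0] == major }
    if (releases.isEmpty()) return null
    // Compare numerically so v3.10.0 sorts after v3.9.2.
    return releases.max { a, b ->
        a.parts[1] <=> b.parts[1] ?: a.parts[2] <=> b.parts[2]
    }.tag
}

println latestInMajor(['v3.4.1', 'v3.10.0', 'v3.9.2', 'v4.0.0', 'v3', 'main'], 3)
```

The numeric comparison matters: a plain string sort would rank v3.9.2 above v3.10.0 and move the alias backwards.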
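The security model is the second axis. Global libraries, configured by an admin under Manage Jenkins, run as trusted: they may call internal Jenkins APIs and @Grab dependencies. Folder-level libraries, and libraries whose source a Jenkinsfile declares for itself through the library step, are untrusted and run inside the Groovy sandbox. The rule for a platform:
Configure the golden library as a global trusted library in JCasC with implicit load off. Because the admin owns the retriever, a product repo cannot swap in a fork; the only thing a Jenkinsfile can choose is the ref after the @. Allow default version override governs that choice: with it off, Jenkins rejects any @ref in a Jenkinsfile and every repo runs the default version; with it on, repos can pin @v3 or an exact tag as in the table above, which also means a repo can sit on an older tag until you chase it. Treat anything a repo can self-declare as untrusted and sandboxed.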
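How the default version and the override switch interact is easier to see as a decision function. This is a simplified model of the behavior as I understand it, not the plugin's source; `resolveVersion` and its parameters are hypothetical names.

```groovy
// Simplified model (an assumption, not the plugin's code) of which ref a
// Jenkinsfile ends up loading for a global library.
//   requested      - the ref after '@' in @Library('name@ref'), or null
//   defaultVersion - the admin-configured default, or null
//   allowOverride  - the "allow default version to be overridden" switch
String resolveVersion(String requested, String defaultVersion, boolean allowOverride) {
    if (requested == null) {
        if (defaultVersion == null) {
            throw new IllegalStateException('no version requested and no default configured')
        }
        return defaultVersion
    }
    if (allowOverride) {
        return requested
    }
    throw new IllegalStateException("version override not permitted: ${requested}")
}

println resolveVersion(null, 'v3', false)      // bare @Library('name') uses the default
println resolveVersion('v3.4.1', 'v3', true)   // exact pin needs overrides enabled
```

Under this model, a platform that wants both a default line and frozen exact-tag pins has to leave overrides enabled and police stale pins some other way.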
4. Template multibranch and organization folders for self-onboarding
You do not want to click “New Item” four hundred times. An Organization Folder (GitHub or Bitbucket) scans an org, and for every repo containing a Jenkinsfile it auto-creates a multibranch project – branches and PRs included. Onboarding a repo becomes “add a Jenkinsfile,” nothing more.
Define it in code through the Job DSL seed job so the folder itself is reproducible:
// jobs/seed-org-folder.groovy (Job DSL)
organizationFolder('kloudvin-services') {
    description('Auto-onboards every repo with a Jenkinsfile')
    organizations {
        github {
            repoOwner('kloudvin')
            apiUri('https://api.github.com')
            credentialsId('github-app-kloudvin')
            traits {
                gitHubBranchDiscovery { strategyId(1) }      // branches not also filed as PRs
                gitHubPullRequestDiscovery { strategyId(1) } // origin PRs, merged with the target branch
            }
        }
    }
    projectFactories {
        workflowMultiBranchProjectFactory { scriptPath('Jenkinsfile') }
    }
    orphanedItemStrategy {
        discardOldItems { daysToKeep(7); numToKeep(20) }
    }
    triggers { periodicFolderTrigger { interval('1d') } }
}
The GitHub App credential (github-app-kloudvin) matters at scale: a personal token shares one rate-limit bucket across the whole estate and starves at a few hundred repos, while a GitHub App gets per-installation limits and finer scopes.
5. Manage the controller with JCasC and seed jobs
Configuration-as-Code (the configuration-as-code plugin) renders the controller’s entire configuration from YAML, replacing point-and-click setup. Set CASC_JENKINS_CONFIG to a file, directory, or URL; Jenkins applies it on boot and on a reload from Manage Jenkins -> Configuration as Code -> Reload.
# jenkins.yaml -- the controller, as code
jenkins:
  systemMessage: "KloudVin CI -- managed by JCasC. Do not configure by hand."
  numExecutors: 0 # controller runs no builds; agents only
  authorizationStrategy:
    roleBased:
      roles:
        global:
          - name: "admin"
            permissions: ["Overall/Administer"]
            assignments: ["platform-team"]
  clouds:
    - kubernetes:
        name: "k8s"
        serverUrl: "https://kubernetes.default"
        namespace: "jenkins-agents"
        jenkinsUrl: "http://jenkins.jenkins.svc:8080"
        containerCapStr: "50"
unclassified:
  globalLibraries:
    libraries:
      - name: "kloudvin-pipeline"
        defaultVersion: "v3"
        implicit: false
        allowVersionOverride: true # required for @v3 / @v3.4.1 pins; the remote stays admin-owned
        retriever:
          modernSCM:
            scm:
              git:
                remote: "https://github.com/kloudvin/pipeline-library.git"
                credentialsId: "github-app-kloudvin"
jobs:
  # Job DSL evaluates this script each time JCasC is applied. The path is an
  # assumption: use wherever the config repo's jobs/ directory is mounted.
  - file: /var/jenkins_config/jobs/seed-org-folder.groovy
Bootstrap order is the subtle part: JCasC applies the controller configuration first, then hands each jobs entry to the Job DSL plugin, which evaluates the seed script and creates the org folders. Keep jenkins.yaml and jobs/ in one repo, mount it into the controller pod, and the entire Jenkins is a git revert away from any prior state.
6. Ephemeral agents on Kubernetes: pod template and resource tuning
A pool of static agents accrues state between builds and bills you while idle. The Kubernetes plugin instead launches one pod per build and deletes it on completion. The pod template – referenced earlier via libraryResource – defines the containers a build can container('name') { ... } into:
# resources/com/kloudvin/ci/pod-templates/jnlp-maven.yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: maven
      image: maven:3.9-eclipse-temurin-21
      command: ["sleep"]
      args: ["infinity"]
      resources:
        requests: { cpu: "500m", memory: "1Gi" }
        limits: { cpu: "2", memory: "2Gi" }
    - name: kaniko
      image: gcr.io/kaniko-project/executor:v1.23.2-debug
      command: ["sleep"]
      args: ["infinity"]
      resources:
        requests: { cpu: "500m", memory: "1Gi" }
        limits: { cpu: "1", memory: "2Gi" }
Tuning notes from production:
- Always set requests and limits. Without requests the scheduler bin-packs blindly and overcommitted nodes OOM-kill agents under load; without limits one runaway build starves a node. Setting memory limit == request for build containers avoids eviction surprises (full Guaranteed QoS also needs CPU limit == request on every container in the pod).
- Do not declare your own jnlp container unless you must. The plugin injects the inbound-agent jnlp container automatically; redefining it with the wrong image silently breaks the connection back to the controller.
- containerCap and podRetention. Cap concurrent pods (containerCapStr: "50" above) so a thundering herd cannot DoS the cluster, and set podRetention: never so failed pods do not pile up.
- Idle timeout to zero. With ephemeral pods there is no warm pool to keep: agents scale to zero between builds, and you pay only for active build time.
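Which QoS class a template lands in follows mechanically from its requests and limits, so it can be checked before the template ships. A minimal sketch of the Kubernetes classification rule; `qosClass` is a hypothetical helper, and it compares quantities as plain strings, which only works when requests and limits use the same notation.

```groovy
// Hypothetical helper applying the Kubernetes QoS rule to a list of
// containers, each a map with optional 'requests' and 'limits' maps.
String qosClass(List<Map> containers) {
    boolean anySet = containers.any { c -> c.requests || c.limits }
    if (!anySet) return 'BestEffort'
    boolean guaranteed = containers.every { c ->
        ['cpu', 'memory'].every { r ->
            def limit = c.limits?.get(r)
            // An omitted request defaults to the limit.
            def request = c.requests?.get(r) ?: limit
            limit != null && request == limit
        }
    }
    return guaranteed ? 'Guaranteed' : 'Burstable'
}

// The build containers from the pod template above (the injected jnlp
// container would count too and is left out of this sketch).
def template = [
    [name: 'maven',  requests: [cpu: '500m', memory: '1Gi'], limits: [cpu: '2', memory: '2Gi']],
    [name: 'kaniko', requests: [cpu: '500m', memory: '1Gi'], limits: [cpu: '1', memory: '2Gi']],
]
println qosClass(template)
```

The template as written comes out Burstable, because requests sit below limits; raising the requests to match the limits on every container is what moves it to Guaranteed.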
7. Secure secrets: credentials binding and external Vault
Secrets never live in a Jenkinsfile or a library file. They live in the credentials store, and the platform binds them into the build environment only for the steps that need them, masked in the log. Wrap this in a vars/ step so consumers cannot fumble the binding:
// vars/withRegistryCreds.groovy
def call(Closure body) {
    withCredentials([usernamePassword(
            credentialsId: 'registry-push',
            usernameVariable: 'REG_USER',
            passwordVariable: 'REG_PASS')]) {
        body() // $REG_USER / $REG_PASS exist only here and are masked in logs
    }
}
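Masking is worth understanding before trusting it. The sketch below is a rough model of literal-value masking, written as an assumption about the general behavior rather than the plugin's implementation; `mask` is a hypothetical name.

```groovy
// Rough model of log masking (an assumption, not the plugin's code):
// every literal occurrence of a bound secret is replaced with '****'.
String mask(String line, Collection<String> secrets) {
    return secrets.findAll { it }                 // ignore null or empty values
                  .sort { -it.length() }          // longest first, so substrings do not leak
                  .inject(line) { acc, s -> acc.replace(s, '****') }
}

def secret = 's3cr3t'
println mask("docker login -p ${secret}", [secret])

// A transformed copy of the secret no longer matches the literal value.
def encoded = secret.bytes.encodeBase64().toString()
println mask("auth header: ${encoded}", [secret])
```

The second line prints the encoded value untouched, which is why a step that encodes, splits, or echoes a secret character by character can still leak it into the log.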
For anything beyond low-stakes secrets, do not store them in Jenkins at all – broker them from HashiCorp Vault so rotation happens outside the CI system and Jenkins holds only short-lived leases. The HashiCorp Vault plugin authenticates the controller (AppRole or Kubernetes auth) and injects paths per build:
stage('Deploy') {
    steps {
        withVault(configuration: [vaultUrl: 'https://vault.kloudvin.internal',
                                  vaultCredentialId: 'vault-approle'],
                  vaultSecrets: [[path: 'secret/data/ci/payments',
                                  secretValues: [[envVar: 'DB_PASSWORD', vaultKey: 'db_password']]]]) {
            sh 'deploy --db-pass "$DB_PASSWORD"'
        }
    }
}
Prefer Vault's Kubernetes auth method over a long-lived AppRole secret-id: the ServiceAccount token of the pod that authenticates (the controller, when the plugin brokers the secret) becomes the Vault login, so there is no static credential to leak. Pair it with short-TTL leases so a leaked secret buys an attacker minutes, not months.
8. Test the pipeline code with Jenkins Pipeline Unit
Pipeline logic is code, and untested code in vars/ fails in production at 2am. The JenkinsPipelineUnit framework mocks the pipeline DSL so you can unit-test vars/ steps and src/ classes on plain JVM CI – no Jenkins required. Wire it into a Gradle/Maven build that runs on every library PR:
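// test/com/kloudvin/ci/StandardPipelineSpec.groovy
import com.kloudvin.ci.BuildConfig
import com.lesfurets.jenkins.unit.BasePipelineTest
import org.junit.Before
import org.junit.Test
import static com.lesfurets.jenkins.unit.MethodCall.callArgsToString
import static org.junit.Assert.assertEquals
import static org.junit.Assert.assertTrue

class StandardPipelineSpec extends BasePipelineTest {
    @Before void setUp() {
        super.setUp()
        // register mocks for any DSL step the library calls
        helper.registerAllowedMethod('sh', [String]) { _ -> }
        helper.registerAllowedMethod('libraryResource', [String]) { 'apiVersion: v1' }
        helper.registerAllowedMethod('container', [String, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('withSonarQubeEnv', [String, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('timeout', [Map, Closure]) { _, c -> c() }
        helper.registerAllowedMethod('waitForQualityGate', [Map]) { _ -> }
    }

    @Test void buildConfigRejectsMissingAppName() {
        try {
            new BuildConfig([:])
            assert false : 'expected IllegalArgumentException'
        } catch (IllegalArgumentException e) {
            assertEquals('appName is required', e.message)
        }
    }

    @Test void sonarScanRunsMavenAnalysis() {
        def script = loadScript('vars/sonarScan.groovy')
        // actually invoke the step; loading the script alone exercises nothing
        script.call(new BuildConfig([appName: 'payments-api']))
        // inspect recorded step invocations (printCallStack() dumps them when debugging)
        assertTrue(helper.callStack.findAll { it.methodName == 'sh' }
                .any { callArgsToString(it).contains('sonar:sonar') })
        assertJobStatusSuccess()
    }
}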
Run it in the library repo’s own pipeline so a bad commit never reaches a moving tag:
./gradlew test # JenkinsPipelineUnit specs, fast, no Jenkins
This closes the loop: the library that everything depends on is itself gated by tests before its tag moves.
Verify
Confirm each layer is wired correctly before declaring the platform live:
# 1. JCasC parsed cleanly (no boot errors, config page renders)
curl -s -u "$JENKINS_USER:$JENKINS_TOKEN" \
  https://jenkins.kloudvin.internal/configuration-as-code/ | grep -q "Reload"
# 2. The global library is registered (it is listed on the system configuration page)
curl -s -u "$JENKINS_USER:$JENKINS_TOKEN" \
  "https://jenkins.kloudvin.internal/manage/configure" | grep -q "kloudvin-pipeline"
# 3. Library unit tests are green
./gradlew test --console=plain
- A test repo with the one-line Jenkinsfile builds end to end and produces a tagged image.
- Moving the v3 tag to a new patch causes consuming repos to pick it up on their next build, with no PR to those repos.
- An agent pod appears in kubectl get pods -n jenkins-agents during a build and is gone within the retention window after it completes.
- A secret bound via withCredentials shows as **** in the build log, never plaintext.
Enterprise scenario
A platform team running ~600 microservice repos on a single Jenkins controller hit a hard wall during a Log4Shell-class incident. The vulnerable logging dependency was baked into the Docker build step that every team had copied into its own Jenkinsfile. There was no central step to patch – the “fix” was a 600-repo PR campaign that would have taken weeks while the window stayed open.
The constraint: they could not break in-flight releases, and several regulated repos were frozen under a change-control window and legally could not take the new behavior until their next window. A flag-day forced upgrade was off the table.
They solved it by collapsing the Docker logic into a single dockerBuild step in the shared library and switching every repo to the one-line standardPipeline entrypoint pinned to a moving major tag. The patched builder shipped behind that tag, so unfrozen repos picked it up on their next build automatically – no PRs. Frozen repos stayed safe by pinning an exact tag until their window opened:
// Frozen, change-controlled repos -- pinned to an exact patch, opt in later
@Library('kloudvin-pipeline@v3.4.1') _
standardPipeline { appName = 'ledger-core' }
Crucially, the library was registered in JCasC as a global library with an admin-owned retriever, so no repo could point at a fork and dodge the fix indefinitely. The only thing a Jenkinsfile could choose was a tag in the platform's own repo, which left the platform team a short, searchable list of exact pins to track and drive. What would have been a multi-week, multi-hundred-PR scramble became a single tagged library release plus a short list of frozen repos to follow up. That is the entire economic argument for the platform.