Skip to content

Task Cache


The cache is designed to allow agents to reuse various files during task execution that would normally need to be downloaded from external resources.

For example, when building a Java application using maven or gradle, required libraries are downloaded from artifact repositories (mavencentral). These libraries can be saved on the machine where the agent is running and reused in tasks that require them.

How Caching Works

The foundation of caching for the current agent version, which uses Docker to execute pipeline tasks, is a shared project volume (Docker Volume). Each agent instance (virtual machine running the agent) creates its own volume for a specific project. This volume is mounted to each container launched to execute a particular project task. Each volume can store multiple archives obtained from specific cache configurations.

Cache archives are distinguished by names that are hashes of strings consisting of cache keys. The sequence of keys in the configuration doesn't matter because before generating the hash, the key list is sorted. If no keys are specified for the cache name, the default value default is used.

Example cache configurations:

  • Without cache keys (uses default as the key):
cache:
  paths:
    - .m2/repository/
  • With cache keys maven and build:
cache:
  key:
    - maven
    - build
  paths:
    - .m2/repository/

Task Cache Workflow

When starting each task, the agent checks for available cache for the project. If the required cache archive is found, it's unpacked into the task's working directory. After completing all task scripts, the agent identifies files and folders suitable for caching, packages them into an archive, and stores it in the appropriate volume, replacing the old archive.

Disabling Cache for Specific Tasks

If you define cache globally for the entire pipeline, each job uses the same definition. You can override this behavior per job.

To completely disable it for a job, use an empty list:

job:
  cache: []

You can configure cache separately for a task: if a task specifies cache + paths (paths: keyword with non-empty path array), this configuration overrides the global one for the job. If not specified, the global configuration (if any) is used.

Manual Cache Reset

A manual cache reset button is available only to users with Administrator or higher privileges.

After using this function, the cache will be reset for all new tasks created after pressing the button. Tasks created before using this function will continue using the old cache.

Each manual cache clearing updates the internal cache name. Old cache isn't deleted automatically. You can manually delete these files from the agent storage.

Cache

Configuring Distributed Cache

By default, cache is stored locally on agents. However, there's also distributed cache - a mechanism allowing multiple agents to share the same cache.

Distributed cache configuration connects the agent to object storage (currently s3-compatible storage is supported), which serves as the cache storage.

To configure distributed cache, specify these parameters in application.properties:

Parameter Type Required Description
runner.cache.type string Required Storage type. Currently only s3 is supported. If not specified or another value is provided, cache will be stored locally only
runner.cache.s3.server-address string Required Server address, e.g. https://storage.yandexcloud.net
runner.cache.s3.access-key string Required Access key ID
runner.cache.s3.secret-key string Required Secret key
runner.cache.s3.bucket-name string Required Bucket name. Must be pre-created in object storage
runner.cache.s3.region string Required Server region, e.g. ru-central1
runner.cache.s3.path string Optional Object prefix
runner.cache.s3.shared boolean Optional Determines whether cache stored for a project under a specific key will be shared among different agents. If false, agents with different UUIDs will have different caches. Default is false
runner.cache.s3.max-uploaded-archive-size integer Optional Maximum size of archive uploaded to S3 (in bytes). Default is 5 GB. Must be between 10 MB and 5 GB. If outside these limits, cache will be stored locally

Example application.properties configuration:

# required parameters
runner.cache.type=s3
runner.cache.s3.server-address=https://storage.yandexcloud.net
runner.cache.s3.access-key=ACCESS_KEY
runner.cache.s3.secret-key=SECRET_KEY
runner.cache.s3.bucket-name=BUCKET_NAME
runner.cache.s3.region=ru-central1
# optional parameters
runner.cache.s3.shared=true
runner.cache.s3.path=distributed-runner-cache/shared
# sets maximum cache size uploaded to s3 to 2 GB
runner.cache.s3.max-uploaded-archive-size=26843545608

The cache path follows this format: {serverAddress}/{bucketName}/{path}/runner/{runnerUUID}/project/{projectUUID}/{cacheKey}

If runner.cache.s3.path isn't specified, /{path} is omitted from the path.

If runner.cache.s3.shared = true, /runner/{runnerUUID} is omitted from the path.

The account with specified keys must have read/write/create permissions for objects in the Bucket.

Automatic Translation!

This page was automatically translated. The text may contain inaccuracies