Vulkan Pipeline Cache

Leverage Vulkan pipeline cache to reduce pipeline initialization latency.

1. Introduction

Creating a Vulkan pipeline (graphics or compute) is expensive because it involves:

  • Shader compilation.
  • Pipeline state initialization, especially for graphics pipelines, which cover vertex/fragment processing, tessellation, rasterization, etc.
  • Hardware-specific tuning (register allocation, scheduling, etc.)

In a typical Vulkan compute setup, the pipeline creation call (vkCreateComputePipelines) takes most of the initialization time.

A Vulkan pipeline cache stores the results of pipeline creation so they can be reused. Once the application saves the cache data to a file, that reuse carries over between runs of the application.

2. APIs

To create a pipeline cache object:

VKAPI_ATTR VkResult VKAPI_CALL vkCreatePipelineCache(
    VkDevice device,
    const VkPipelineCacheCreateInfo* pCreateInfo,
    const VkAllocationCallbacks* pAllocator,
    VkPipelineCache* pPipelineCache);

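We need to populate a VkPipelineCacheCreateInfo to create the cache object. initialDataSize and pInitialData carry previously saved cache data; set them to 0 and NULL to start with an empty cache: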

typedef struct VkPipelineCacheCreateInfo {
    VkStructureType sType;
    const void* pNext;
    VkPipelineCacheCreateFlags flags;
    size_t initialDataSize;
    const void* pInitialData;
} VkPipelineCacheCreateInfo;
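
As a rough illustration of where the initial data might come from, the sketch below reads a saved blob from disk and turns it into the size/pointer pair that initialDataSize and pInitialData expect. The names load_cache_blob, InitialData, and as_initial_data are hypothetical helpers invented for this example, not part of Vulkan, and the Vulkan calls appear only in a comment so the helpers build without the SDK.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read a previously saved cache blob from disk.
// Returns an empty vector if the file is missing or unreadable,
// which corresponds to starting with an empty cache.
std::vector<char> load_cache_blob(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Hypothetical size/pointer pair in the shape the create info expects.
struct InitialData {
    std::size_t size;
    const void* data;
};

// An empty blob becomes {0, nullptr}; otherwise point at the blob's bytes.
// The blob must stay alive until vkCreatePipelineCache has returned.
InitialData as_initial_data(const std::vector<char>& blob) {
    if (blob.empty()) return {0, nullptr};
    return {blob.size(), blob.data()};
}

// With the Vulkan headers available, the fields would be filled like this:
//   VkPipelineCacheCreateInfo info{};
//   info.sType           = VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO;
//   info.initialDataSize = init.size;
//   info.pInitialData    = init.data;
//   vkCreatePipelineCache(device, &info, nullptr, &cache);
```

Falling back to an empty cache when the file is missing keeps the first run and later runs on the same code path.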

vkCreateComputePipelines allows us to pass a VkPipelineCache object:

VKAPI_ATTR VkResult VKAPI_CALL vkCreateComputePipelines(
    VkDevice device,
    VkPipelineCache pipelineCache,
    uint32_t createInfoCount,
    const VkComputePipelineCreateInfo* pCreateInfos,
    const VkAllocationCallbacks* pAllocator,
    VkPipeline* pPipelines);

To retrieve the cache contents as binary data so they can be written to disk, we need vkGetPipelineCacheData:

VKAPI_ATTR VkResult VKAPI_CALL vkGetPipelineCacheData(
    VkDevice device,
    VkPipelineCache pipelineCache,
    size_t* pDataSize,
    void* pData);

3. Implementation

A complete implementation using the pipeline cache can be found at: https://github.com/chuzcjoe/CORE/blob/master/vulkan/tests/ComputeGaussianBlurTest.cpp

In my local testing (Mac with M2), creating a Gaussian blur compute pipeline dropped from 10.2 ms to 1.2 ms with the pipeline cache. That is a reduction of roughly 88%, or about 8.5x faster.
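
For readers who want to take a similar measurement, here is a minimal sketch, assuming the creation call is wrapped in a callable and timed with a monotonic clock. time_ms and reduction_percent are hypothetical helpers for this example and are not taken from the linked test.

```cpp
#include <chrono>

// Run `work` once and return the elapsed time in milliseconds.
template <typename Work>
double time_ms(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Relative latency reduction in percent.
double reduction_percent(double before_ms, double after_ms) {
    return (before_ms - after_ms) / before_ms * 100.0;
}

// Hypothetical call site, assuming the Vulkan objects already exist:
//   double ms = time_ms([&] {
//       vkCreateComputePipelines(device, cache, 1, &createInfo, nullptr, &pipeline);
//   });
```

Plugging in the numbers above, reduction_percent(10.2, 1.2) comes out at about 88.2. A single run is noisy, so averaging several cold and warm runs gives a more trustworthy figure.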

4. When the Pipeline Cache May Not Work

There are several real-world scenarios where a Vulkan pipeline cache will not be effective, even if you correctly save and reload it. The key principle is: a pipeline cache is driver- and environment-dependent, and only works when the exact compilation context matches.

For example, in the following cases, the pipeline cache will not reduce latency:

  1. Different GPUs.
  2. Mismatched GPU driver versions.
  3. Different GPU vendors (NVIDIA, Qualcomm, AMD, etc.).
  4. Any change in pipeline state (shaders, descriptors, render pass, rasterization state, etc.).
  5. And more.

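Compatibility is checked strictly. Even on the same device, a saved pipeline cache can become invalid after a change in the device context, such as a driver update. The data returned by vkGetPipelineCacheData begins with a header that records vendorID, deviceID, and pipelineCacheUUID, and these are compared against the matching fields of the current device's VkPhysicalDeviceProperties: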

typedef struct VkPhysicalDeviceProperties {
    uint32_t apiVersion;
    uint32_t driverVersion;
    uint32_t vendorID;
    uint32_t deviceID;
    VkPhysicalDeviceType deviceType;
    char deviceName[VK_MAX_PHYSICAL_DEVICE_NAME_SIZE];
    uint8_t pipelineCacheUUID[VK_UUID_SIZE];
    VkPhysicalDeviceLimits limits;
    VkPhysicalDeviceSparseProperties sparseProperties;
} VkPhysicalDeviceProperties;

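If there is any mismatch between the current device and the metadata stored with the cache, the saved data is treated as incompatible. The cache then starts out empty and pipelines are compiled from scratch, so there is no latency benefit.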
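
An application can run the same comparison itself before handing a saved blob to the driver. The sketch below assumes the version-one header layout (header size, header version, vendorID, deviceID, then the 16-byte pipelineCacheUUID) and a little-endian host. CacheHeader, kUuidSize, and blob_matches_device are hypothetical names for this example; the struct is a local mirror so the code builds without the Vulkan headers.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kUuidSize = 16;  // same value as VK_UUID_SIZE

// Local mirror of the version-one pipeline cache header (assumed layout).
struct CacheHeader {
    std::uint32_t headerSize;     // length of the header in bytes
    std::uint32_t headerVersion;  // 1 == VK_PIPELINE_CACHE_HEADER_VERSION_ONE
    std::uint32_t vendorID;
    std::uint32_t deviceID;
    std::uint8_t  pipelineCacheUUID[kUuidSize];
};
static_assert(sizeof(CacheHeader) == 32, "unexpected padding in CacheHeader");

// Returns true if the blob's header matches the given device identity.
// The three identity arguments would come from VkPhysicalDeviceProperties.
bool blob_matches_device(const std::vector<char>& blob,
                         std::uint32_t vendorID,
                         std::uint32_t deviceID,
                         const std::uint8_t (&uuid)[kUuidSize]) {
    if (blob.size() < sizeof(CacheHeader)) return false;
    CacheHeader h;
    std::memcpy(&h, blob.data(), sizeof(h));  // copy out to avoid unaligned reads
    return h.headerSize >= sizeof(CacheHeader) &&
           h.headerVersion == 1 &&
           h.vendorID == vendorID &&
           h.deviceID == deviceID &&
           std::memcmp(h.pipelineCacheUUID, uuid, kUuidSize) == 0;
}

// Hypothetical call site:
//   VkPhysicalDeviceProperties props;
//   vkGetPhysicalDeviceProperties(physicalDevice, &props);
//   bool ok = blob_matches_device(blob, props.vendorID, props.deviceID,
//                                 props.pipelineCacheUUID);
```

When the check fails, discarding the file and creating an empty cache avoids passing stale data to the driver.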

Author

Joe Chu

Posted on

2026-04-11

Updated on

2026-04-11
