celerity/workflow
v2024-07-22 (draft)
blueprint transform: celerity-2024-07-22
The celerity/workflow
resource type is used to define a workflow that orchestrates the execution of multiple handlers in a blueprint as a series of steps.
Workflows can be deployed to different target environments. Serverless environments will use the cloud provider's workflow service, such as AWS Step Functions, Google Cloud Workflows, or Azure Logic Apps. Containerised and custom server environments will use the Celerity workflow runtime to execute the workflow steps.
Workflows can be used to define complex logic that requires multiple steps to be executed in a specific order. Handlers can be used for states which have the type executeStep
. Workflows can also be used to define error handling and retries for each step to create robust and fault-tolerant applications.
Specification
The specification is the structure of the resource definition that comes under the spec
field of the resource in a blueprint.
The rest of this section lists fields that are available to configure the celerity/workflow
resource, followed by examples of different configurations for the resource type, a section outlining the behaviour in supported target environments, and additional documentation.
The specification for a workflow is influenced by the concept of a workflow as a state machine emphasised by Amazon's States Language created for AWS Step Functions.
startAt (required)
The name of the state used to begin execution of the workflow.
type
string
states (required)
A mapping of state names to state configurations that make up the state machine of the workflow. A state configuration resembles a state in a state machine which describes decisions, transitions or the step to be executed when the state is entered.
type
mapping[string, state]
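As a minimal sketch (with hypothetical resource and state names), a workflow spec only needs a startAt state and a states mapping:
resources:
  orderWorkflow:
    type: "celerity/workflow"
    spec:
      startAt: "processOrder"
      states:
        processOrder:
          type: "executeStep"
          description: "Execute the handler linked to this state."
          next: "finished"
        finished:
          type: "success"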
Annotations
Annotations define additional metadata that can determine the behaviour of the resource in relation to other resources in the blueprint or to add behaviour to a resource that is not in its spec.
celerity/vpc
🔗 celerity/workflow
The following are a set of annotations that determine the behaviour of a workflow in relation to a VPC. Appropriate security groups are managed by the VPC to workflow link.
When a VPC is not defined for the container-backed AWS, Google Cloud and Azure target environments, the default VPC for the account will be used and the annotations in the celerity/workflow resource will apply to it.
VPC annotations and links do not have any effect for serverless environments. Serverless workflows are managed by cloud provider services and do not support customer-defined VPC configurations.
celerity.workflow.vpc.subnetType
The kind of subnet that the workflow should be deployed to.
type
string
allowed values
public
| private
default value
public
- When a VPC links to a workflow, the workflow will be deployed to a public subnet.
celerity.workflow.vpc.lbSubnetType
The kind of subnet that the load balancer for the workflow should be deployed to. This is only relevant when the workflow is deployed to an environment that requires load balancers to interact with workflows via the API.
type
string
allowed values
public
| private
default value
public
- When a VPC links to a workflow, the load balancer for the workflow will be deployed to a public subnet.
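As a sketch of how these annotations are applied, assuming the same metadata.annotations placement used for handlers in the examples later in this document, a workflow that should only be reachable privately could be configured as follows:
orderWorkflow:
  type: "celerity/workflow"
  metadata:
    annotations:
      celerity.workflow.vpc.subnetType: "private"
      celerity.workflow.vpc.lbSubnetType: "private"
  spec:
    startAt: "processOrder"
    states:
      processOrder:
        type: "executeStep"
        next: "finished"
      finished:
        type: "success"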
Outputs
id
The ID of the created workflow in the target environment.
type
string
examples
arn:aws:states:us-east-1:123456789012:stateMachine:Example-StateMachine
(AWS Step Functions)
projects/123456789012/locations/us-central1/workflows/example-workflow
(Google Cloud Workflows)
691802b3-351e-4599-8785-2353ce970613
(Celerity Workflow Runtime)
Data Types
state
FIELDS
type (required)
The type of state for the workflow.
- executeStep - A state that executes a handler.
- pass - A state that passes the input through to the output without doing anything; a pass step can inject extra data into the output.
- parallel - A state that executes multiple steps in parallel. Depending on the target environment, parallel execution may be limited by available compute resources. Read more about the limitations of parallel execution in the Celerity workflow runtime.
- wait - A state that waits for a specific amount of time before transitioning to the next state.
- decision - A state that decides on the next state based on the output of a previous state.
- failure - A state that indicates a specific failure in the workflow; this is a terminal state.
- success - A state that indicates the end of the workflow; this is a terminal state.
field type
string
allowed values
executeStep
| pass
| parallel
| wait
| decision
| failure
| success
description
A human-readable description for the state.
field type
string
inputPath
The input path to use as the root input for the state. This is useful for extracting a subset of the input object to use in the current state. The supported path syntax is described in the Path Syntax section.
When not set, the input to the state is the entire input object from the previous state's output or from the initial input for the workflow.
Input path is only supported for executeStep
, pass
, wait
, decision
, success
and parallel
state types.
field type
string
examples
$.metadata
resultPath
The result path determines the location in the output object where the result of the state will be placed. This is useful for merging results into the input object instead of replacing it. The supported path syntax is described in the Path Syntax section.
The JSON Path Injections section provides more information as to how JSON paths behave in this context.
For example, for an input object:
{
"value": 10
}
A resultPath of $.result
would result in the following output object:
{
"value": 10,
"result": "some result"
}
When not set, the output of the state will be the entire output object passed into the next state.
This field should not be set if the outputPath
field is set.
Result path is only supported for executeStep
, pass
and parallel
state types.
field type
string
examples
$.sum
$.result
outputPath
The output path selects the part of the result to be passed to the next state. This is useful for extracting a subset of the result object to pass to the next state. The supported path syntax is described in the Path Syntax section.
For example, for an input object:
{
"id": "82612fc2-dc3f-4fee-a65b-a8721aec71b1",
"metadata": {
"value": 10
}
}
An outputPath of $.metadata
would result in the following output object:
{
"value": 10
}
When not set, the output of the state will be the entire result object passed into the next state.
This field should not be set if the resultPath
field is set.
Output path is only supported for executeStep
, pass
, wait
, decision
, success
and parallel
state types.
field type
string
examples
$.sum
$.result
payloadTemplate
A template for the payload to be passed into the handler configured to be executed for the current state.
When not set, the entire input object from the previous state's output or from the initial input for the workflow will be passed as the payload to the handler.
A payload template is a mapping structure that can be used to construct a payload. The payload template can contain literal values, nested mappings, arrays, paths to values in the input object and functions applied to input values.
See the Payload Template section for more information on how to construct a payload template.
This field is only supported for executeStep
, pass
, and parallel
state types.
field type
mapping[string, any]
next (conditionally required)
The next state to transition to after the current state is executed.
This field is required for executeStep
, pass
, parallel
and wait
state types if end
is not set to true
.
This field can not be used for failure
, decision
or success
states.
field type
string
end
Marks the state as the end of a workflow or parallel branch.
This is equivalent to adding a success
or failure
terminal state to a workflow or parallel branch.
The field can be used for any state type except for the failure
, decision
or success
states.
field type
boolean
decisions (conditionally required)
A list of decision rules that determine the next state to transition to based on the output of a previous state.
Each decision rule will be evaluated in order until a rule is found where the output from the previous state matches the rule's condition.
This field is required for the decision
state type.
The field can not be used for any other state types.
field type
array[decisionRule]
result
When provided, the result is treated as the output of the state, as if it were the result of a handler execution.
When result
is not set, the output is the input to the pass state that is subject to modification by the resultPath
field.
This field is optional for the pass
state type.
The field can not be used with any other state types.
field type
any
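As an illustrative sketch (state and field names are hypothetical), a pass state can combine result and resultPath to inject a static value into the input object before moving on:
states:
  injectDefaults:
    type: "pass"
    result:
      currency: "USD"
      maxItems: 50
    resultPath: "$.defaults"
    next: "calculateTotals"
Given an input of {"orderId": "123"}, the next state would receive {"orderId": "123", "defaults": {"currency": "USD", "maxItems": 50}}.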
timeout
The maximum amount of time in seconds that the state is allowed to complete execution of a handler.
If the state runs for longer than the timeout, the state will fail with the Timeout
error name.
This field is optional for the executeStep
state type.
The field can not be used with any other state types.
field type
integer
default value
60
waitConfig (conditionally required)
State configuration specific to the wait
state type.
This field is required for the wait
state type, it can not be used with any other state types.
field type
waitConfiguration
failureConfig (conditionally required)
State configuration specific to the failure
state type.
This field is required for the failure
state type, it can not be used with any other state types.
field type
failureConfiguration
parallelBranches (conditionally required)
A list of branches to be executed in parallel.
The actual execution model will vary based on the target environment, see the limitations section for more information on how parallel execution is accomplished in the Celerity workflow runtime.
This field is required for the parallel
state type, it can not be used for any other state types.
The output of each branch will be merged into an array with a single element for each branch. The order of the results in the output array will match the order of the branches defined in this field.
field type
array[parallelBranch]
retry
Retry configuration for the state when the handler execution fails.
This field is only supported for executeStep
and parallel
state types.
field type
array[retryConfiguration]
catch
Configuration to provide fallback states for specific error types.
This field is only supported for executeStep
and parallel
state types.
field type
array[catchConfiguration]
decisionRule
FIELDS
and (conditionally required)
A boolean expression consisting of an array of conditions where all must be true to advance to the state specified in the next
field.
This field is required if the or
, not
, and condition
fields are not set.
field type
array[condition]
or (conditionally required)
A boolean expression consisting of an array of conditions where only one has to be true to advance to the state specified in the next
field.
This field is required if the and
, not
, and condition
fields are not set.
field type
array[condition]
not (conditionally required)
A boolean expression consisting of a single condition for which the negation of the condition must be true to advance to the state specified in the next
field.
This field is required if the and
, or
, and condition
fields are not set.
field type
condition
condition (conditionally required)
A single condition that must be true to advance to the state specified in the next
field.
This field is required if the and
, or
, and not
fields are not set.
field type
condition
next (required)
The next state to transition to if the condition is met.
field type
string
condition
FIELDS
inputs (required)
A list of one or more inputs to be evaluated for the condition, the number of inputs provided will depend on the number of arguments expected by the function
field.
An input can be a path or a literal value.
See the Path Syntax section for more information on how to specify paths.
field type
array[string | integer | float | boolean | array | object | null]
examples
["$.value", 20]
["$.result.id", "someConstant"]
[594.34, false, null]
- "someConstant"
- 20
- mapping:
key1: "value1"
key2: "value2"
- ["$.value", 20]
- [1, 3, 5, 10, 12, 15, 20]
function (required)
The function to be used to evaluate the condition.
Functions can take one or more input arguments and always return a boolean value.
The list of available functions for this version of the workflow resource type can be found in the Condition Functions section.
field type
string
examples
is_present
eq
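To illustrate how decision rules and conditions fit together, the hypothetical decision state below combines a single condition rule with an or rule:
states:
  checkOrder:
    type: "decision"
    decisions:
      - condition:
          function: "is_present"
          inputs: ["$.order.id"]
        next: "processOrder"
      - or:
          - function: "eq"
            inputs: ["$.order.status", "cancelled"]
          - function: "eq"
            inputs: ["$.order.status", "refunded"]
        next: "closeOrder"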
parallelBranch
FIELDS
startAt (required)
The name of the state used to begin execution of the branch.
field type
string
states (required)
A mapping of state names to state configurations that make up a branch of the parallel state. A state configuration resembles a state in a state machine which describes decisions, transitions or the step to be executed when the state is entered.
States in a branch must only have a next
field that points to a state within the same branch.
States outside of a branch can not transition to a state within a branch.
When there is a failure in any branch, due to an unhandled error or by transitioning to a failure state, the entire parallel state will fail.
When a branch transitions to a success state, only that branch will terminate as a result. A success state will pass its input through to the output that may be modified by the inputPath
and outputPath
fields.
State types that are allowed in a parallel branch are executeStep
, pass
, wait
, decision
, failure
, and success
. Nested parallel states are not supported.
State names must be unique within the entire workflow definition so that handlers can be wired up to the correct states via workflow to handler link annotations.
type
mapping[string, state]
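The Examples section below does not include a parallel state, so the following is a hedged sketch of one with two branches; state names and handler wiring are hypothetical:
states:
  analyseDocument:
    type: "parallel"
    parallelBranches:
      - startAt: "extractText"
        states:
          extractText:
            type: "executeStep"
            next: "textExtracted"
          textExtracted:
            type: "success"
      - startAt: "generateThumbnail"
        states:
          generateThumbnail:
            type: "executeStep"
            end: true
    resultPath: "$.analysis"
    next: "storeResults"
The output injected at $.analysis would be a two-element array containing the result of each branch in the order the branches are defined.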
waitConfiguration
FIELDS
seconds (conditionally required)
Number of seconds to wait before transitioning to the next state.
This does not need to be hardcoded, it can be a path to a value in the input object.
This is required if the timestamp
field is not set.
field type
string
examples
10
$.waitTime
timestamp (conditionally required)
The timestamp to wait until before transitioning to the next state. The timestamp must be in the format of the RFC3339 profile for the ISO 8601 standard.
The timestamp does not need to be hardcoded, it can be a path to a value in the input object.
This is required if the seconds
field is not set.
field type
string
examples
2024-07-22T12:00:00Z
2024-07-22T12:00:00.123Z
$.expires
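For example, a wait state can take its duration from the input object rather than hardcoding it (a sketch; the input field name is hypothetical):
states:
  waitBeforeRetry:
    type: "wait"
    waitConfig:
      seconds: "$.retryAfterSeconds"
    next: "retryUpload"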
failureConfiguration
FIELDS
error
The error name to be used for the failure state. This can be a path or a literal value.
field type
string
examples
ResourceNotFound
$.error
cause
A description of the cause of the failure. This can be a path or a literal value.
field type
string
examples
Order not found in the system
$.cause
retryConfiguration
FIELDS
matchErrors (required)
A list of error names that should be matched in order to trigger a retry.
See the errors section for a list of built-in errors that can be matched against.
Custom error names may also be used to match specific errors thrown by handlers.
field type
array[string]
examples
["Timeout", "ResourceNotFound"]
["*"] # Retry on all errors
interval
A positive integer value representing the number of seconds to wait before the first attempt to retry a failed state.
field type
integer
default value
3
maxAttempts
A positive integer value representing the maximum number of attempts to retry a failed state.
A value of 0
is valid for this field; this indicates that the state should not be retried on a specific error.
field type
integer
default value
5
maxDelay
A positive integer value representing the maximum interval in seconds to wait between retries.
field type
integer
jitter
A boolean value that determines whether to apply jitter to the retry interval. This AWS blog post from 2015 provides a good insight into how jitter works: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
The implementation of jitter varies across target environments.
field type
boolean
default value
false
backoffRate
A floating point value representing a multiplier that increases the retry interval on each attempt. This AWS blog post from 2015 provides a good insight into how exponential backoff works: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
field type
float
default value
2.0
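As a sketch of how these fields combine, the retry configuration below waits 5 seconds before the first retry and then backs off at a rate of 2.0, producing delays of roughly 5, 10, 20 and 30 seconds (the last delay is capped by maxDelay) before jitter is applied:
retry:
  - matchErrors: ["Timeout", "HandlerFailed"]
    interval: 5
    maxAttempts: 4
    maxDelay: 30
    jitter: true
    backoffRate: 2.0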
catchConfiguration
FIELDS
matchErrors (required)
A list of error names that should be matched in order to trigger the catch and transition to the fallback state.
See the errors section for a list of built-in errors that can be matched against.
Custom error names may also be used to match specific errors thrown by handlers.
field type
array[string]
examples
["Timeout", "ResourceNotFound"]
["*"] # Retry on all errors
next (required)
The name of the next state to transition to if the error matches any of the errors in the matchErrors
field.
field type
string
resultPath
The path at which to inject the error information into the input object passed to the state defined in the next field.
The JSON Path Injections section provides more information as to how JSON paths behave in this context.
For example, for an input object:
{
"value": 10
}
A resultPath of $.errorInfo
would result in the following output object:
{
"value": 10,
"errorInfo": {
"error": "Some error message",
"cause": "Some cause"
}
}
When not set, the error information will be the entire output object passed into the next state.
field type
string
examples
$.errorInfo
Linked From
celerity/vpc
Depending on the target environment, the workflow may be deployed to a VPC for private access. When deploying to serverless environments such as AWS Step Functions, Google Cloud Workflows, or Azure Logic Apps, the workflow will be managed by the cloud provider and will not be deployed to a VPC.
celerity/handler
When a handler links out to a workflow, it will be configured with permissions and environment variables that enable the handler to trigger the workflow. If a secret store is associated with the handler or the application that it is a part of, the workflow configuration will be added to the secret store instead of environment variables. You can use guides and templates to get an intuition for how to source the configuration and trigger the workflow using the handlers SDK.
Links To
celerity/handler
Handlers are used to execute steps in a workflow. In the context of a workflow, a step is the state of the executeStep
type. The handler is configured with annotations that determine the state in the workflow that will trigger the handler.
celerity/secrets
Secrets can be used to store configuration and sensitive information such as API keys, database passwords, and other credentials that are used by the application. A workflow can link to a secret store to access secrets at runtime; linking an application to a secret store will automatically make secrets accessible to all handlers in the application without having to link each handler to the secret store.
Examples
Simple Workflow
version: 2023-04-20
transform: celerity-2024-07-22
resources:
videoIngestWorkflow:
type: "celerity/workflow"
linkSelector:
byLabel:
application: "videoIngest"
spec:
startAt: "downloadVideo"
states:
downloadVideo:
type: "executeStep"
description: "Download the video from the source URL provided in the input metadata."
inputPath: "$.metadata"
payloadTemplate:
videoUrl: "$.videoUrl"
next: "scanVideo"
scanVideo:
type: "executeStep"
description: "Scans the video for any malicious content."
resultPath: "$.scanResult"
next: "processVideo"
processVideo:
type: "executeStep"
description: "Process the video to create a thumbnail and extract metadata."
inputPath: "$.media"
payloadTemplate:
videoFilePath: "$.downloaded.path"
next: "success"
success:
type: "success"
description: "The video has been successfully processed."
downloadVideoHandler:
type: "celerity/handler"
metadata:
displayName: "Download Video Handler"
annotations:
celerity.handler.workflow.state: "downloadVideo"
labels:
application: "videoIngest"
spec:
handlerName: DownloadVideoHandler-v1
codeLocation: "handlers/video-ingest"
handler: "downloadVideoHandler"
runtime: "nodejs20.x"
memory: 1024
timeout: 60
tracingEnabled: true
scanVideoHandler:
type: "celerity/handler"
metadata:
displayName: "Scan Video Handler"
annotations:
celerity.handler.workflow.state: "scanVideo"
labels:
application: "videoIngest"
spec:
handlerName: ScanVideoHandler-v1
codeLocation: "handlers/video-ingest"
handler: "scanVideoHandler"
runtime: "nodejs20.x"
memory: 2048
timeout: 120
tracingEnabled: true
processVideoHandler:
type: "celerity/handler"
metadata:
displayName: "Process Video Handler"
annotations:
celerity.handler.workflow.state: "processVideo"
labels:
application: "videoIngest"
spec:
handlerName: ProcessVideoHandler-v1
codeLocation: "handlers/video-ingest"
handler: "processVideoHandler"
runtime: "nodejs20.x"
memory: 1024
timeout: 60
tracingEnabled: true
Complex Workflow
version: 2023-04-20
transform: celerity-2024-07-22
resources:
docProcessingWorkflow:
type: "celerity/workflow"
linkSelector:
byLabel:
application: "docProcessor"
spec:
startAt: "fetchDocument"
states:
fetchDocument:
type: "executeStep"
description: "Fetch the document provided in the input data."
inputPath: "$.document"
payloadTemplate:
url: "$.url"
resultPath: "$.downloaded"
retry:
- matchErrors: ["Timeout"]
interval: 5
maxAttempts: 3
jitter: true
backoffRate: 1.5
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
next: "scanDocument"
scanDocument:
type: "executeStep"
description: "Scans the document for any malicious content."
resultPath: "$.scanResult"
retry:
- matchErrors: ["Timeout"]
interval: 5
maxAttempts: 3
jitter: true
backoffRate: 1.5
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
next: "scanResultDecision"
scanResultDecision:
type: "decision"
description: "Decide the next state based on the scan result."
decisions:
- condition:
function: "eq"
inputs: ["$.scanResult", "Clean"]
next: "documentProcessingDecision"
- condition:
function: "eq"
inputs: ["$.scanResult", "Malicious"]
next: "maliciousContentFound"
documentProcessingDecision:
type: "decision"
description: "Decide which document processing step to execute based on the document type."
inputPath: "$.downloaded"
decisions:
- condition:
function: "string_match"
inputs: ["$.path", "*.pdf"]
next: "processPDF"
- condition:
function: "string_match"
inputs: ["$.path", "*.docx"]
next: "processDOCX"
processPDF:
type: "executeStep"
description: "Process the PDF document to extract text and metadata."
inputPath: "$.downloaded"
payloadTemplate:
filePath: "$.path"
resultPath: "$.extractedDataFile"
retry:
- matchErrors: ["Timeout"]
interval: 5
maxAttempts: 3
jitter: true
backoffRate: 1.5
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
next: "waitForProcessing"
processDOCX:
type: "executeStep"
description: "Process the word document to extract text and metadata."
inputPath: "$.downloaded"
payloadTemplate:
filePath: "$.path"
resultPath: "$.extractedDataFile"
retry:
- matchErrors: ["Timeout"]
interval: 5
maxAttempts: 3
jitter: true
backoffRate: 1.5
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
next: "waitForProcessing"
waitForProcessing:
type: "wait"
waitConfig:
seconds: "120"
next: "uploadToSystem"
uploadToSystem:
type: "executeStep"
description: "Upload the extracted data to the system."
retry:
- matchErrors: ["Timeout"]
interval: 5
maxAttempts: 3
jitter: true
backoffRate: 1.5
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
next: "success"
handleError:
type: "executeStep"
description: "Handle any error that occurred during the workflow, persisting status and error information to the domain-specific database."
inputPath: "$.errorInfo"
next: "failureDecision"
failureDecision:
type: "decision"
description: "Choose the failure state to transition to based on the error type."
inputPath: "$.errorInfo"
decisions:
- condition:
function: "eq"
inputs: ["$.error", "DocumentFetchError"]
next: "fetchFailure"
- condition:
function: "eq"
inputs: ["$.error", "DocumentScanError"]
next: "scanFailure"
- or:
- function: "eq"
inputs: ["$.error", "ExtractPDFError"]
- function: "eq"
inputs: ["$.error", "PDFLoadError"]
next: "processPDFFailure"
- or:
- function: "eq"
inputs: ["$.error", "ExtractDOCXError"]
- function: "eq"
inputs: ["$.error", "DOCXLoadError"]
next: "processDOCXFailure"
- condition:
function: "eq"
inputs: ["$.error", "UploadToSystemError"]
next: "uploadToSystemFailure"
success:
type: "success"
description: "The document has been successfully processed."
fetchFailure:
type: "failure"
description: "The document could not be fetched."
failureConfig:
error: "DocumentFetchError"
cause: "The document could not be fetched from the provided URL."
scanFailure:
type: "failure"
description: "An error occurred while scanning the document."
failureConfig:
error: "DocumentScanError"
cause: "An error occurred while scanning the document."
maliciousContentFound:
type: "failure"
description: "Malicious content was found in the document."
failureConfig:
error: "MaliciousContentFound"
cause: "Malicious content was found in the document."
processPDFFailure:
type: "failure"
description: "An error occurred while processing the PDF document."
failureConfig:
error: "PDFProcessingError"
cause: "An error occurred while processing the PDF document."
processDOCXFailure:
type: "failure"
description: "An error occurred while processing the word document."
failureConfig:
error: "DOCXProcessingError"
cause: "An error occurred while processing the word document."
uploadToSystemFailure:
type: "failure"
description: "An error occurred while uploading the extracted data to the system."
failureConfig:
error: "UploadToSystemError"
cause: "An error occurred while uploading the extracted data to the system."
fetchDocumentHandler:
type: "celerity/handler"
metadata:
displayName: "Fetch Document Handler"
annotations:
celerity.handler.workflow.state: "fetchDocument"
labels:
application: "documentProcessor"
spec:
handlerName: FetchDocumentHandler-v1
codeLocation: "handlers/doc-processor"
handler: "fetch_document"
runtime: "python3.12.x"
memory: 1024
timeout: 60
tracingEnabled: true
scanDocumentHandler:
type: "celerity/handler"
metadata:
displayName: "Scan Document Handler"
annotations:
celerity.handler.workflow.state: "scanDocument"
labels:
application: "documentProcessor"
spec:
handlerName: ScanDocumentHandler-v1
codeLocation: "handlers/doc-processor"
handler: "scan_document"
runtime: "python3.12.x"
memory: 1024
timeout: 60
tracingEnabled: true
processPDFHandler:
type: "celerity/handler"
metadata:
displayName: "PDF Processing Handler"
annotations:
celerity.handler.workflow.state: "processPDF"
labels:
application: "documentProcessor"
spec:
handlerName: ProcessPDFHandler-v1
codeLocation: "handlers/doc-processor"
handler: "process_pdf"
runtime: "python3.12.x"
memory: 4096
timeout: 60
tracingEnabled: true
processDOCXHandler:
type: "celerity/handler"
metadata:
displayName: "Word Document Processing Handler"
annotations:
celerity.handler.workflow.state: "processDOCX"
labels:
application: "documentProcessor"
spec:
handlerName: ProcessDOCXHandler-v1
codeLocation: "handlers/doc-processor"
handler: "process_docx"
runtime: "python3.12.x"
memory: 2048
timeout: 60
tracingEnabled: true
systemUploadHandler:
type: "celerity/handler"
metadata:
displayName: "System Upload Handler"
annotations:
celerity.handler.workflow.state: "uploadToSystem"
labels:
application: "documentProcessor"
spec:
handlerName: SystemUploadHandler-v1
codeLocation: "handlers/doc-processor"
handler: "upload_extracted_data"
runtime: "python3.12.x"
memory: 1024
timeout: 30
tracingEnabled: true
Path Syntax
Paths are used to access values in the input object to the current state in the workflow.
When evaluating a path as an input for a condition, a string beginning with $ is interpreted as a path; otherwise, the value is treated as a string literal.
A path points to a field in the input object (typically the previous state's output) that a condition or template will be evaluated against.
The syntax for a path is $.*, where $ represents the root input object and * represents any key in the object.
For example, $.value
would match the value of the key value
in the input object.
Nested keys can also be accessed using the dot notation, for example, $.result.id
would match the value of the id
key in the nested result
object.
The JSONPath syntax is implemented for paths in Celerity workflows.
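As a short worked example, given the hypothetical input object below, $.order.id resolves to "abc-123" and $.order.total resolves to 99.5:
{
  "order": {
    "id": "abc-123",
    "total": 99.5
  }
}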
Errors
There are a built-in set of errors that can be matched against in the retry and catch configuration for a state.
Built-in errors that can be retried include the following:
- * - Matches all errors that can be retried.
- Timeout - The handler execution timed out.
- HandlerFailed - The handler execution failed.
- InvalidResultPath - The result path specified in the state configuration is invalid.
- InvalidOutputPath - The output path specified in the state configuration is invalid.
- BranchesFailed - One or more branches of a parallel state failed.
Built-in errors that can be caught include the following:
- * - Matches all errors that can be caught.
- Timeout - The handler execution timed out.
- HandlerFailed - The handler execution failed.
- InvalidResultPath - The result path specified in the state configuration is invalid. It's important to note that an invalid result path error that occurs in a catcher can not be caught.
- InvalidOutputPath - The output path specified in the state configuration is invalid.
- BranchesFailed - One or more branches of a parallel state failed.
- InvalidInputPath - The input path specified in the state configuration is invalid.
- InvalidPayloadTemplate - The payload template specified in the state configuration is invalid.
- PayloadTemplateFailure - An error occurred while constructing the payload using the payload template, often an issue with a template function call.
- NoDecisionMatched - No decision rule matched in a decision state.
- InvalidCondition - The condition specified in a decision rule is invalid.
These errors will map to the closest corresponding built-in error type in the target environment.
Error Output
When an error is caught by a state, the catcher will yield an object with the following structure in the Celerity workflow runtime:
{
"error": "ErrorName",
"cause": "Error cause message"
}
This can be added to the input object passed to the next state using the resultPath
field.
For example:
catch:
- matchErrors: ["*"]
next: "handleError"
resultPath: "$.errorInfo"
Will result in an object like the following:
{
"otherField": "some value",
"errorInfo": {
"error": "Some error message",
"cause": "Some cause"
}
}
This will be passed into the next state as the input object.
This exact error structure is only available in the Celerity workflow runtime, the error structure will map to the closest corresponding structure in serverless target environments such as AWS Step Functions, Google Cloud Workflows, and Azure Logic Apps.
JSON Path Injections
States and catchers can have a resultPath
field that is used to inject the result of the state or error into the input object passed to the next state.
There are limitations around how JSON paths behave in this context;
JSON path injection only supports injecting values into fields of the root JSON object.
The $.<field>
and $['<field>']
syntax is supported for injecting values into fields of the root object.
The behaviour outlined above is that of the Celerity workflow runtime. In serverless target environments where a cloud service such as AWS Step Functions is used to run the workflow, the behaviour may differ. Where possible, it is best to keep JSON path injections simple, where values are injected into fields in the root object.
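As a hedged sketch of these rules in the Celerity workflow runtime, the following shows injection paths that are supported and one that is not:
# Supported - injects the result into a field on the root object.
resultPath: "$.scanResult"
# Supported - bracket notation for a field on the root object.
resultPath: "$['scanResult']"
# Not supported for injection - targets a nested field below the root object.
# resultPath: "$.report.scanResult"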
Payload Template
The payload template is a mapping structure that can be used to construct a payload to be passed into the handler configured to be executed for the current state.
If a field value starts with $
, the value is treated as a path to a value in the input object to the current state in the workflow.
If a field value starts with func:
, the value is treated as a template function that can take literals or JSON path expressions as arguments. See the Template Functions section for more information on the available template functions.
If neither of the above conditions are met, the value is treated as a literal value.
For example, given an input object:
{
"values": [10, 405, 304, 20, 304, 20],
"flag1": true
}
A payload template of:
payloadTemplate:
value1: "$.values[0]"
restOfValues: "func:remove_duplicates($.values[-5:])"
nestedStructure:
key1: "some value"
key2: 20
flag: "$.flag1"
Will produce a payload object of:
{
"value1": 10,
"restOfValues": [405, 304, 20],
"nestedStructure": {
"key1": "some value",
"key2": 20,
"flag": true
}
}
Template Function Syntax
The syntax for template functions can be expressed in the concise form func:<functionName>(<arg1>, <arg2>, ..., <argN>); the func: prefix indicates that the value is a function call.
<functionName>
must consist of only alphanumeric characters and underscores.
<arg1>, <arg2>, ..., <argN>
can consist of scalar literal values, JSON paths to values in the input object, or other function calls.
Scalar literal values refer to string, integer, float, boolean, or null values.
The formalised grammar for the template function syntax that comes after the func:
prefix is as follows:
function call = name , "(" , function args , ")" ;
function args = [ function arg , { "," , function arg } ] ;
function arg = literal | json path | function call ;
literal = bool literal | null literal | float literal | int literal | string literal ;
json path = ? valid json path ? ;
# Lex tokens
string literal = '"' , string chars , '"' ;
bool literal = "true" | "false" ;
null literal = "null" ;
int literal = [ "-" ] , natural number ;
float literal = [ "-" ] , natural number , "." , natural number ;
natural number = { digit }- ;
string chars = { string char } ;
string char = ? utf-8 char excluding quote ? | escaped quote ;
escaped quote = "\" , '"' ;
name = start name char , name chars ;
name chars = { name char } ;
name char = letter | digit | "_" ;
start name char = letter | "_" ;
letter = ? [A-Za-z] ? ;
digit = ? [0-9] ? ;
The above is a representation of the grammar in Extended Backus-Naur form notation.
JSON paths are parsed using the syntax defined in the JSONPath specification.
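As an illustration of the grammar above, function calls can be nested and mixed with paths and literals; the payload field name here is hypothetical:
payloadTemplate:
  summary: 'func:format("{} unique values", func:len(func:remove_duplicates($.values)))'
With the input from the earlier payload template example, this would produce a summary of "4 unique values".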
Template Functions
The following section describes the functions that can be used in payload templates, the parameters they accept and the return types.
format
This function formats a string using the provided arguments.
The use of {}
in the format string will be replaced with the arguments in order.
Parameters:
string
- The format string.
N variadic arguments - The values to replace {}
in the format string.
Returns:
string
- The formatted string.
Examples:
func:format("Hello, {}!", $.name)
jsondecode
This function decodes a JSON string into an object, array or scalar value.
Parameters:
string
- The JSON string to decode.
Returns:
string | integer | float | boolean | object | array
- The decoded JSON value.
Examples:
func:jsondecode("{\"key\": \"value\"}")
jsonencode
This function encodes a value as a JSON string.
Parameters:
string | integer | float | boolean | object | array
- The value to encode as a JSON string.
Returns:
string
- The JSON encoded string.
Examples:
func:jsonencode($.config)
jsonmerge
This function merges two JSON objects together. Any duplicate keys in the second object will overwrite the keys in the first object.
Parameters:
object - The first object to merge.
object - The second object to merge.
Returns:
object
- The merged object.
Examples:
func:jsonmerge($.object1, $.object2)
b64encode
This function base64 encodes a string.
Parameters:
string
- The string to encode.
Returns:
string
- The base64 encoded string.
Examples:
func:b64encode($.value)
b64decode
This function decodes a base64 encoded string.
Parameters:
string
- The base64 encoded string to decode.
Returns:
string
- The decoded human-readable string.
Examples:
func:b64decode($.encodedValue)
hash
This function hashes some input data using a specified algorithm.
The available hash algorithms are:
- SHA256
- SHA384
- SHA512
MD5 and SHA1 were considered in the original design but were not included due to the insecurity of these algorithms.
Parameters:
string - The input data to hash.
string - The hash algorithm to use.
Returns:
string
- The hashed data as a hex string.
Examples:
func:hash($.value, "SHA256")
list
This function creates a list from a set of positional arguments.
Parameters:
N positional arguments - The values to create a list from.
Returns:
array
- A list of values.
Examples:
func:list(10, $.values[0], 30, $.otherValue, "Some string")
chunk_list
This function splits a list into chunks of a specified size.
Parameters:
array - The list to split into chunks.
integer - The size of each chunk.
Returns:
array
- A 2-dimensional array of chunks.
Examples:
func:chunk_list($.values, 5)
list_elem
This function returns an element from a list at a specific index.
Parameters:
array - The list to extract the element from.
integer - The index of the element to extract.
Returns:
string | integer | float | boolean | object | array | null
- The element at the specified index.
Examples:
func:list_elem($.values, 2)
remove_duplicates
This function removes duplicates from an array of values. The function will carry out deep equality checks for objects and arrays, performance may be significantly impacted when working with large and complex structures.
Parameters:
array
- The array of values to remove duplicates from.
Returns:
array
- An array of unique values.
Examples:
func:remove_duplicates($.values)
contains
This function checks if a value is present in a list or a substring is present in a string. This will carry out deep equality checks for objects and arrays.
Parameters:
string | array - The list or string to check for the value or substring.
string | integer | float | boolean | object | array - The value or substring to check for.
Returns:
boolean
- true
if the value or substring is present, false
otherwise.
Examples:
func:contains($.numbers, 10)
split
This function splits a string into an array of substrings based on a delimiter.
Parameters:
string - The string to split.
string - The delimiter to split the string by.
Returns:
array
- An array of substrings.
Examples:
func:split($.value, ",")
math_rand
This function generates a random number between a minimum and maximum value. The random number generated is an integer and the provided parameters must be integers.
This will not generate a cryptographically secure random number,
math_rand
should not be used in security-sensitive contexts.
Parameters:
integer - The minimum value for the random number. (inclusive)
integer - The maximum value for the random number. (exclusive)
Returns:
integer
- A random number between the minimum and maximum values.
Examples:
func:math_rand(0, 100)
math_add
This function will add 2 numbers together.
Parameters:
integer | float - The first number to add.
integer | float - The second number to add.
Returns:
float
- The result of the addition as a floating point number.
Examples:
func:math_add(10, 20)
math_sub
This function will subtract the second number from the first.
Parameters:
integer | float - The first number to subtract from.
integer | float - The second number to subtract.
Returns:
float
- The result of the subtraction as a floating point number.
Examples:
func:math_sub(20, 10)
math_mult
This function will multiply 2 numbers together.
Parameters:
integer | float - The first number to multiply.
integer | float - The second number to multiply.
Returns:
float
- The result of the multiplication as a floating point number.
Examples:
func:math_mult(10, 20)
math_div
This function will divide a number by another.
Parameters:
integer | float - The first number to divide.
integer | float - The second number to divide by.
Returns:
float
- The result of the division as a floating point number.
Examples:
func:math_div($.result[0].total, 10)
len
This function will return the length of a string or an array.
Parameters:
string | array
- The string or array to get the length of.
Returns:
integer
- The length of the string or array.
Examples:
func:len($.values)
uuid
This function generates a random UUID as per the UUID version 4 specification.
Parameters:
The uuid
function does not take any parameters.
Returns:
string
- An ID in the UUID version 4 format.
Examples:
func:uuid()
nanoid
This function provides a way to generate an ID that is shorter than a UUID while maintaining guarantees of uniqueness.
See the nanoid project repo for more information on how the IDs are generated.
The nanoid
function is not supported in all target environments.
Parameters:
The nanoid
function does not take any parameters.
Returns:
string
- A unique ID in the nanoid format.
Examples:
func:nanoid()
Condition Functions
The following section describes the functions that can be used in conditions for decision states and the parameters they accept. All functions return a boolean value.
is_present
This function checks if the input is present in the input object to the current state in the workflow.
Parameters:
string - The path to the input field to carry out a presence check. This must be a valid path of the $.* form laid out in the Path Syntax section.
is_null
This function checks if the input is null
.
Parameters:
any - The value to check if it is null. This can be a path or a literal value.
is_numeric
This function checks if the input is a numeric value. A numeric value can be an integer or a floating-point number.
Parameters:
any
- The value to check if it is numeric. This can be a path or a literal value.
is_string
This function checks if the input is a string.
Parameters:
any
- The value to check if it is a string. This can be a path or a literal value.
is_boolean
This function checks if the input is a boolean value.
Parameters:
any
- The value to check if it is a boolean. This can be a path or a literal value.
is_timestamp
This function checks if the input is a timestamp. Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
any
- The value to check if it is a timestamp. This can be a path or a literal value.
eq
This function checks if two values are equal.
If the values are not of the same type, the function will always return false
.
For example, 1
is not considered equal to "1"
.
Parameters:
string | integer | float | boolean | null - The first value to compare, can be a path or a literal value.
string | integer | float | boolean | null - The second value to compare, can be a path or a literal value.
lt
This function checks if the first value is less than the second value. The values must be numeric, either integers or floating-point numbers.
Parameters:
integer | float - The first value to compare, can be a path or a literal value.
integer | float - The second value to compare, can be a path or a literal value.
gt
This function checks if the first value is greater than the second value. The values must be numeric, either integers or floating-point numbers.
Parameters:
integer | float - The first value to compare, can be a path or a literal value.
integer | float - The second value to compare, can be a path or a literal value.
lte
This function checks if the first value is less than or equal to the second value. The values must be numeric, either integers or floating-point numbers.
Parameters:
integer | float - The first value to compare, can be a path or a literal value.
integer | float - The second value to compare, can be a path or a literal value.
gte
This function checks if the first value is greater than or equal to the second value. The values must be numeric, either integers or floating-point numbers.
Parameters:
integer | float - The first value to compare, can be a path or a literal value.
integer | float - The second value to compare, can be a path or a literal value.
timestamp_eq
This function checks if two timestamps are equal.
Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
string - The first timestamp to compare, can be a path or a literal value.
string - The second timestamp to compare, can be a path or a literal value.
timestamp_lt
This function checks if the first timestamp is less than the second timestamp.
Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
string - The first timestamp to compare, can be a path or a literal value.
string - The second timestamp to compare, can be a path or a literal value.
timestamp_gt
This function checks if the first timestamp is greater than the second timestamp.
Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
string - The first timestamp to compare, can be a path or a literal value.
string - The second timestamp to compare, can be a path or a literal value.
timestamp_lte
This function checks if the first timestamp is less than or equal to the second timestamp.
Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
string - The first timestamp to compare, can be a path or a literal value.
string - The second timestamp to compare, can be a path or a literal value.
timestamp_gte
This function checks if the first timestamp is greater than or equal to the second timestamp.
Timestamps are strings that must conform to the RFC3339 profile of the ISO 8601 standard.
Parameters:
string - The first timestamp to compare, can be a path or a literal value.
string - The second timestamp to compare, can be a path or a literal value.
regex_match
This function checks if a string matches a regular expression pattern.
Parameters:
string - The string to match against the regular expression pattern, can be a path or a literal value.
string - The regular expression pattern to match against.
Regex matching is not supported in all target environments.
The following environments support regex matching:
- Celerity Workflow Runtime
- Azure Logic Apps (Via inline expressions)
- Google Cloud Workflows (Via text.match_regex)
Regular expression syntax may vary across different target environments; when switching target environments, you will need to ensure that the regular expression syntax is compatible with the new target environment.
string_match
This function checks if a string matches a template using *
as a wildcard character.
The *
wildcard character can match zero or more characters and can be used multiple times in a template.
Examples of valid templates would be:
- * - Matches any string.
- prefix.* - Matches any string that starts with "prefix.".
- *.log - Matches any string that ends with ".log".
- prefix*.log - Matches any string that starts with "prefix" and ends with ".log".
Only the *
character has special meaning in the template, all other characters are treated literally.
If the character *
is required to be matched literally, it must be escaped with a backslash \*
.
If the character \
is required to be matched literally, it must be escaped with another backslash \\
.
If the workflow definition is a part of a blueprint defined in a JSON file, the escaped string \\*
represents *
and \\\\
represents \
.
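As a sketch of these escaping rules in a YAML blueprint (the input paths are hypothetical):
# Matches any string that ends with ".log".
- function: "string_match"
  inputs: ["$.fileName", "*.log"]
# Matches the literal string "a*b" - the asterisk is escaped so it is not treated as a wildcard.
- function: "string_match"
  inputs: ["$.value", "a\\*b"]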
Parameters:
string - The string to match against the template, can be a path or a literal value.
string - The template to match against.
String matching is not supported in all target environments.
The following environments support string matching:
- Celerity Workflow Runtime
- AWS Step Functions
- Azure Logic Apps (Via inline expressions)
Target Environments
Celerity::1
In the Celerity::1 local environment, a workflow is deployed as a containerised version of the Celerity workflow runtime in a local container orchestrator. In the local environment, the runtime is backed by an Apache Cassandra NoSQL database for storing workflow execution state. Links from VPCs to workflows are ignored for this environment as the workflow is deployed to a local container network on a developer or CI machine.
AWS
In the AWS environment, workflows are deployed as a containerised version of the Celerity workflow runtime.
Workflows can be deployed to ECS or EKS backed by Fargate or EC2 using deploy configuration for the AWS target environment.
The workflow runtime requires persistence for storing workflow execution state, this is provided by a set of DynamoDB tables that are created when the workflow is deployed. The DynamoDB tables are provisioned with on-demand capacity mode by default, this can be changed to provisioned capacity mode in the deploy configuration.
DynamoDB was chosen as the persistence layer for the workflow runtime in AWS environments as it provides a scalable and highly available NoSQL database that is more than sufficient for the requirements of the workflow runtime. This also removes the need to manage a relational database cluster for each workflow-based application that is deployed.
ECS
When a workflow is first deployed to ECS, a new cluster is created for the workflow. A service is provisioned within the cluster to run the application. A public-facing application load balancer is created to route traffic to the service, if you require private access to the workflow API, the load balancer can be configured to be internal. When domain configuration is provided and the load balancer is public-facing, an ACM certificate is created for the domain and attached to the load balancer, you will need to verify the domain ownership before the certificate can be used.
The service is deployed with an auto-scaling group that will scale the number of tasks running the workflow based on the CPU and memory usage of the tasks. The auto-scaling group will scale the desired task count with a minimum of 1 task and a maximum of 5 tasks by default. If backed by EC2, the auto-scaling group will scale the number of instances based on the CPU and memory usage of the instances, with a minimum of 1 instance and a maximum of 3 instances by default. Deploy configuration can be used to override this behaviour.
When it comes to networking, ECS services need to be deployed to VPCs; if a VPC is defined in the blueprint and linked to the workflow, it will be used, otherwise the default VPC for the account will be used. The load balancer will be placed in the public subnet by default, but can be configured to be placed in a private subnet by setting the celerity.workflow.vpc.lbSubnetType
annotation to private
. The service for the application will be deployed to a public subnet by default, but can be configured to be deployed to a private subnet by setting the celerity.workflow.vpc.subnetType
annotation to private
.
By default, 2 private subnets and 2 public subnets are provisioned for the associated VPC, the subnets are spread across 2 availability zones, the ECS resources that need to be associated with a subnet will be associated with these subnets, honouring the workflow subnet type defined in the annotations.
The CPU to memory ratio used by default for AWS deployments backed by EC2 is that of the t3.*
instance family. The auto-scaling launch configuration will use the appropriate instance type based on the requirements of the application, these requirements will be taken from the deploy configuration or derived from the handlers configured for the workflow. If the requirements can not be derived, the instance profile will be t3.small
with 2 vCPUs and 2GB of memory.
Fargate-backed ECS deployments use the same CPU to memory ratios allowed for Fargate tasks as per the task definition parameters.
If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set. For an EC2-backed cluster, the task housing the containers that make up the service for the workflow will be deployed with 896MB of memory and 0.8 vCPUs. Less than half of the EC2 instance's memory and CPU is allocated to the task to allow for smooth deployments of new versions of the workflow; this is done by making sure there is enough memory and CPU available to the ECS agent. For a Fargate-backed cluster, the task housing the containers that make up the service for the workflow application will be deployed with 1GB of memory and 0.5 vCPUs.
A sidecar ADOT collector container is deployed with the workflow to collect traces and metrics for the application, this will take up a small portion of the memory and CPU allocated to the application. Traces are always collected for workflow executions, however, they are only collected for handlers when tracing is enabled for the handler.
EKS
When a workflow is first deployed to EKS, a new cluster is created for the workflow unless you specify an existing cluster to use in the deploy configuration.
When using an existing cluster, the cluster must be configured in a way that is compatible with the VPC annotations configured for the workflow as well as the target compute type.
For example, a cluster without a Fargate profile can not be used to deploy a workflow that is configured to use Fargate. Another example would be a cluster with a node group only associated with public subnets not being compatible with a workflow with the celerity.workflow.vpc.subnetType
annotation set to private
.
You also need to make sure there is enough memory and CPU allocated for node group instances to run the application in addition to other workloads running in the cluster.
The cluster is configured with a public endpoint for the Kubernetes API server by default, this can be overridden to be private in the deploy configuration. (VPC links will be required to access the Kubernetes API server when set to private)
For an EKS cluster backed by EC2, a node group is configured with auto-scaling configuration to have a minimum size of 2 nodes and a maximum size of 5 nodes by default. Auto-scaling is handled by the Kubernetes Cluster Autoscaler. The instance type configured for node groups is determined by the CPU and memory requirements defined in the deploy configuration or derived from the handlers of the workflow, if the requirements can not be derived, the instance type will be t3.small
with 2 vCPUs and 2GB of memory.
For an EKS cluster backed by Fargate, a Fargate profile is configured to run the workflow.
Once the cluster is up and running, Kubernetes services are provisioned to run the application, an Ingress service backed by an application load balancer is created to route traffic to the service, if you require private access to the workflow, the load balancer can be configured to be internal. When domain configuration is provided and the load balancer is public-facing, an ACM certificate is created for the domain and attached to the ingress service via annotations, you will need to verify the domain ownership before the certificate can be used.
When it comes to networking, EKS services need to be deployed to VPCs; if a VPC is defined in the blueprint and linked to the workflow, it will be used, otherwise the default VPC for the account will be used.
The ingress service backed by an application load balancer will be placed in the public subnet by default, but can be configured to be placed in a private subnet by setting the celerity.workflow.vpc.lbSubnetType
annotation to private
.
By default, 2 private subnets and 2 public subnets are provisioned for the associated VPC, the subnets are spread across 2 availability zones. For EC2-backed clusters, the EKS node group will be associated with all 4 subnets when celerity.workflow.vpc.subnetType
is set to public
; when celerity.workflow.vpc.subnetType
is set to private
, the node group will only be associated with the 2 private subnets. The orchestrator will take care of assigning one of the subnets defined for the node group.
For Fargate-backed clusters, the Fargate profile will be associated with the private subnets due to the limitations with Fargate profiles. For Fargate, the workflow will be deployed to one of the private subnets associated with the profile.
celerity.workflow.vpc.subnetType
has no effect for Fargate-based EKS deployments, the workflow application will always be deployed to a private subnet.
If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set. For an EC2-backed cluster, the containers that make up the service for the workflow will be deployed with 896MB of memory and 0.8 vCPUs. Less than half of the memory and CPU of the node that hosts the containers is allocated to them to allow for smooth deployments of new versions of the workflow; this is done by making sure there is enough memory and CPU available to the Kubernetes agents. For a Fargate-backed cluster, the pod for the application will be deployed with 1GB of memory and 0.5 vCPUs; for Fargate there is a fixed set of CPU and memory configurations that can be used.
A sidecar ADOT collector container is deployed in the pod with the workflow application to collect traces and metrics for the application; this will take up a small portion of the memory and CPU allocated to the workflow. Traces are always collected for workflow executions; however, they are only collected for handlers when tracing is enabled for the handler.
AWS Serverless
In the AWS Serverless environment, workflows are deployed to AWS Step Functions with AWS Lambda for the handlers.
When tracing is enabled for handlers, an ADOT Lambda layer is bundled with each handler and configured to instrument it to collect traces and metrics. AWS Step Functions traces are collected in AWS X-Ray; Step Functions-specific traces can be collected into tools like Grafana with plugins that use AWS X-Ray as a data source.
Workflows can be deployed to Step Functions using deploy configuration for the AWS Serverless target environment.
Google Cloud
In the Google Cloud environment, workflows are deployed as a containerised version of the Celerity workflow runtime.
Workflows can be deployed to Cloud Run, as well as Google Kubernetes Engine (GKE) in Autopilot or Standard mode using deploy configuration for the Google Cloud target environment.
The workflow runtime requires persistence for storing workflow execution state; this is provided by a set of Cloud Datastore entities that are created when the workflow is deployed.
Cloud Datastore was chosen as the persistence layer for the workflow runtime in Google Cloud environments as it provides a scalable and highly available NoSQL database that is more than sufficient for the requirements of the workflow runtime. This also removes the need to manage a relational database cluster for each workflow-based application that is deployed.
Cloud Run
Cloud Run is a relatively simple environment to deploy workflows to; the workflow is deployed as a containerised application that is fronted by either an internal or external load balancer.
Autoscaling is configured with the autoscaling.knative.dev/minScale and autoscaling.knative.dev/maxScale Cloud Run annotations. The Knative autoscaler will scale the number of instances based on the number of requests and the CPU and memory usage of the instances. By default, the application will be configured to scale with a minimum of 1 instance and a maximum of 5 instances. Deploy configuration can be used to override this behaviour.
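For reference, minScale and maxScale are standard Knative autoscaling annotations set on the Cloud Run service's revision template. The snippet below is a hand-written sketch of roughly what the defaults translate to; the service name is made up and this is not output generated by Celerity.

```yaml
# Sketch of the Knative autoscaling annotations on a Cloud Run service;
# the service name is made up and this is not Celerity-generated output.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: orders-workflow   # hypothetical service name
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"   # default minimum instances
        autoscaling.knative.dev/maxScale: "5"   # default maximum instances
```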
When domain configuration is provided and the load balancer is public-facing and Google-managed, a managed TLS certificate is created for the domain and attached to the load balancer. You will need to verify the domain ownership before the certificate can be used.
For Cloud Run, the workflow will not be associated with a VPC, as defining custom VPCs for Cloud Run applications is not supported. Creating and linking a VPC to the workflow will instead enable the Internal networking mode in the network ingress settings. celerity.workflow.vpc.subnetType has no effect for Cloud Run deployments; the application will always be deployed to a network managed by Google Cloud. Setting celerity.workflow.vpc.lbSubnetType to private will have the same effect as attaching a VPC to the application, making the application load balancer private. Setting celerity.workflow.vpc.lbSubnetType to public will have the same effect as not attaching a VPC to the workflow, making the application load balancer public. The public setting is equivalent to the "Internal and Cloud Load Balancing" ingress setting.
Memory and CPU resources allocated to the workflow can be defined in the deploy configuration; when not defined, memory and CPU will be derived from the handlers configured for the workflow. If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set: the Cloud Run service will be allocated a limit of 1GB of memory and 1 vCPU per instance.
A sidecar OpenTelemetry Collector container is deployed in the service with the workflow to collect traces and metrics; this will take up a small portion of the memory and CPU allocated to the workflow. Traces will always be collected for workflow executions; however, they are only collected for handlers when tracing is enabled for the handler.
GKE
When a workflow is first deployed to GKE, a new cluster is created for the workflow unless you specify an existing cluster to use in the deploy configuration.
When using an existing cluster, the cluster must be configured in a way that is compatible with the VPC annotations configured for the application as well as the target compute type.
When in standard mode, the cluster will be regional with 2 zones for better availability guarantees. A node pool is created with autoscaling enabled; by default, the pool will have a minimum of 1 node and a maximum of 3 nodes per zone. As the cluster has 2 zones, this will be a minimum of 2 nodes and a maximum of 6 nodes overall. The cluster autoscaler is used to manage scaling and to choose the appropriate instance type given the requirements of the workflow service. The minimum and maximum number of nodes can be overridden in the deploy configuration.
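As a rough guide, the sketch below shows how an override of the standard mode node pool sizing might look in deploy configuration. The key names are hypothetical placeholders, and the comments show how per-zone values translate into overall cluster size for a 2-zone regional cluster.

```yaml
# Hypothetical sketch - key names are placeholders, not a documented schema.
targetEnvironment: gcloud
workflow:
  compute:
    nodePool:
      minNodesPerZone: 1   # 1 per zone x 2 zones = 2 nodes minimum overall
      maxNodesPerZone: 3   # 3 per zone x 2 zones = 6 nodes maximum overall
```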
When in autopilot mode, Google manages scaling, security and node pools. Based on memory and CPU limits applied at the pod level, appropriate node instance types will be selected and scaled automatically. There is no manual autoscaling configuration when running in autopilot mode. GKE Autopilot is priced per pod request rather than per provisioned infrastructure; depending on the nature of your workloads, it could be both a cost-effective and convenient way to run your applications. Read more about autopilot mode pricing.
When domain configuration is provided and the load balancer that powers the Ingress service is public-facing and Google-managed, a managed TLS certificate is created for the domain and attached to the Ingress object. You will need to verify the domain ownership before the certificate can be used.
When it comes to networking, a GKE cluster is deployed as a private cluster; the nodes that the pods for the workflow run on only use internal IP addresses, isolating them from the public internet. The control plane has both internal and external endpoints; the external endpoint can be disabled from the Google Cloud/Kubernetes side. When celerity.workflow.vpc.lbSubnetType is set to public, an Ingress service is provisioned using an external Application Load Balancer; when it is set to private, an Ingress service is provisioned using an internal Application Load Balancer. The Ingress service is used to route traffic to the service running the workflow.
celerity.workflow.vpc.subnetType has no effect for GKE clusters; the application will always be deployed to a private network, and the workflow API is exposed through the Ingress service if the ingress is configured to be public.
If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set: limits of 1GB of memory and 0.5 vCPUs will be set for the pods that run the workflow.
The OpenTelemetry Operator is used to configure a sidecar collector container for the workflow to collect traces and metrics. Traces will always be collected for workflow executions, however, they are only collected for handlers when tracing is enabled for the handler.
Google Cloud Serverless
In the Google Cloud Serverless environment, workflows are deployed as Google Cloud Workflows with Google Cloud Functions for the handlers.
For tracing, the built-in Google Cloud metrics and tracing offerings will be used to collect traces and metrics for the handlers. Traces and metrics can be collected into tools like Grafana with plugins that use Google Cloud Trace as a data source. Logs and metrics are captured out of the box for the Cloud Workflows and will be collected in Google Cloud Logging and Monitoring. You can export logs and metrics to other tools like Grafana with plugins that use Google Cloud Logging and Monitoring as a data source.
Workflows can be deployed to Google Cloud Workflows using deploy configuration for the Google Cloud Serverless target environment.
Azure
In the Azure environment, workflows are deployed as a containerised version of the Celerity workflow runtime.
Workflows can be deployed to Azure Container Apps or Azure Kubernetes Service (AKS) using deploy configuration for the Azure target environment.
The workflow runtime requires persistence for storing workflow execution state; this is provided by Azure Cosmos DB resources that are created when the workflow is deployed. The workflow runtime uses the Cassandra API for Cosmos DB.
Cosmos DB was chosen as the persistence layer for the workflow runtime in Azure environments as it provides a scalable and highly available NoSQL database that is more than sufficient for the requirements of the workflow runtime. This also removes the need to manage a relational database cluster for each workflow-based application that is deployed.
Container Apps
Container Apps is a relatively simple environment to deploy applications to; the workflow is deployed as a containerised application that is fronted by either external HTTPS ingress or internal ingress.
Autoscaling is determined based on the number of concurrent HTTP requests for public APIs; for private workflow APIs, KEDA-supported CPU and memory scaling triggers are used. By default, the scale definition is set to scale from 1 to 5 replicas per revision; this can be overridden in the deploy configuration.
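For context, the snippet below sketches the kind of scale block this maps to on the underlying Container App (as it appears under the Container App template's scale section). The rule name and CPU threshold are illustrative, and the exact rules Celerity configures may differ.

```yaml
# Illustrative sketch of a Container App scale block - the rule name and
# CPU threshold are made up; the rules Celerity configures may differ.
scale:
  minReplicas: 1   # default minimum replicas per revision
  maxReplicas: 5   # default maximum replicas per revision
  rules:
    - name: cpu-utilisation
      custom:
        type: cpu            # KEDA CPU scaler
        metadata:
          type: Utilization
          value: "75"
```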
When domain configuration is provided and the workflow API is configured to be public-facing, a managed TLS certificate is provisioned and attached to the Container App's HTTP ingress configuration. You will need to verify the domain ownership before the certificate can be used.
Container Apps will not be associated with a private network by default; a VNet is automatically generated for you, and generated VNets are publicly accessible over the internet. Read about networking for Container Apps. When you define a VPC and link it to the workflow, a custom VNet will be provisioned and the workflow will be deployed to either a private or public subnet based on the celerity.workflow.vpc.subnetType annotation, defaulting to a public subnet if not specified. When celerity.workflow.vpc.lbSubnetType is set to public, a public HTTPS ingress is provisioned for the workflow API; when set to private, an internal HTTP ingress is provisioned for the workflow. Azure's built-in zone redundancy is used to ensure high availability of the workflow API.
Memory and CPU resources allocated to the workflow can be defined in the deploy configuration; when not defined, memory and CPU will be derived from the handlers configured for the workflow. If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set: the Container App service will be allocated a limit of 1GB of memory and 0.5 vCPU per instance in the consumption plan, see allocation requirements.
The OpenTelemetry Data Agent is used to collect traces and metrics for the application. Traces will always be collected for workflow executions, however, they are only collected for handlers when tracing is enabled for the handler.
AKS
When a workflow is first deployed to AKS, a new cluster is created for the workflow unless you specify an existing cluster to use in the deploy configuration.
When using an existing cluster, it must be configured in a way that is compatible with the VPC annotations configured for the workflow as well as the target compute type.
The cluster is created across 2 availability zones for better availability guarantees. Best effort zone balancing is used with Azure VM Scale Sets. The cluster is configured with an autoscaler with a minimum of 2 nodes and a maximum of 5 nodes distributed across availability zones as per Azure's zone balancing. The default node size is Standard_D4d_v5 with 4 vCPUs and 16GB of memory; this size is chosen because of the minimum requirements for system node pools and because, in the default configuration, a single node pool is shared by the system and user workloads. If the CPU or memory requirements of the application mean the default node size would not be able to comfortably run 2 instances of the workflow application, a larger node size will be selected. The minimum and maximum node count along with the node size can be overridden in the deploy configuration.
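The sketch below illustrates what such an override might look like; the key names are hypothetical placeholders rather than a documented Celerity schema, and the node size shown is simply one step up from the default.

```yaml
# Hypothetical sketch - key names are placeholders, not a documented schema.
targetEnvironment: azure
workflow:
  compute:
    nodePool:
      minNodes: 2
      maxNodes: 5
      nodeSize: Standard_D8d_v5   # larger than the Standard_D4d_v5 default
```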
When domain configuration is provided and the load balancer that powers the Ingress service is public-facing, a TLS certificate is provisioned with Let's Encrypt via cert-manager for the domains associated with the Ingress resources in Kubernetes. The certificate is used by the Ingress object to terminate TLS traffic. You will need to verify the domain ownership before the certificate can be used.
When it comes to networking, the workflow will be deployed with the overlay network model in a public network as per the default AKS access mode. Read about private and public clusters for AKS.
When you define a VPC and link it to the workflow, the application will be deployed as a private cluster using the VNET integration feature of AKS, where the control plane will not be made available through a public endpoint. The celerity.workflow.vpc.subnetType annotation has no effect for AKS deployments, as the networking model for Azure's managed Kubernetes offering is different from other cloud providers; all services running on a cluster are private by default and are exposed to the internet through a load balancer or ingress controller. When celerity.workflow.vpc.lbSubnetType is set to public, an Ingress service is provisioned using the nginx ingress controller, which uses an external Azure Load Balancer under the hood. When celerity.workflow.vpc.lbSubnetType is set to private, the nginx ingress controller is configured to use an internal Azure Load Balancer; read more about the private ingress controller.
Memory and CPU resources allocated to the workflow pod can be defined in the deploy configuration; if not specified, memory and CPU will be derived from the handlers configured for the application. If memory and CPU are not defined in the deploy configuration and can not be derived from the handlers, some defaults will be set: the pod that runs the workflow will be allocated a limit of 1GB of memory and 0.5 vCPUs.
The OpenTelemetry Operator is used to configure a sidecar collector container for the workflow application to collect traces and metrics. Traces will always be collected for workflow executions, however, they are only collected for handlers when tracing is enabled for the handler.
Azure Serverless
In the Azure Serverless environment, workflows are deployed as Azure Logic Apps with Azure Functions for the handlers.
Azure Monitor Metrics and Azure Monitor Logs can be used as sources for traces, logs and metrics for the Logic App. This data can be exported to other tools like Grafana with plugins that use Azure Monitor as a data source. When it comes to the Azure Functions that power the handlers, traces and metrics go to Application Insights by default, from which you can export logs, traces and metrics to other tools like Grafana with plugins that use Azure Monitor as a data source. OpenTelemetry for Azure Functions is also supported for some languages; you can use the deploy configuration to enable OpenTelemetry for Azure Functions.
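A minimal sketch of enabling OpenTelemetry for the handler functions via deploy configuration is shown below; the key names are assumed placeholders, not the documented schema, so check the deploy configuration reference for the exact setting.

```yaml
# Hypothetical sketch - the key used to enable OpenTelemetry for Azure
# Functions is a placeholder; check the deploy configuration reference.
targetEnvironment: azure-serverless
workflow:
  handlers:
    openTelemetry: true   # opt the handler functions into OpenTelemetry
```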
Workflows can be deployed to Azure Logic Apps using deploy configuration for the Azure Serverless target environment.
⚠️ Limitations
Celerity Workflows do not support the full set of features that each cloud provider's workflow service provides. The Celerity runtime will provide a subset of features that are common across all cloud providers and are compatible with the Celerity workflow runtime.
- Parallel execution of steps is supported; however, parallel execution is limited by available compute resources when opting to deploy to an environment that runs the Celerity workflow runtime to execute workflows. In the Celerity workflow runtime, parallel execution is achieved by running multiple handlers concurrently using the tokio runtime. tokio is configured to make use of multiple threads, but it is primarily an asynchronous runtime with a concurrency model optimised for efficiently using a single core to carry out I/O-bound tasks. Using parallel execution for CPU-bound tasks may not have the desired effect when using the Celerity workflow runtime.
- Only handlers can be executed as steps in a workflow, not containers or third-party services; a workflow needs to be able to run end-to-end within the Celerity workflow runtime process when deployed to a containerised or custom server environment.
- Nested workflows are not supported. A workflow can only contain handlers as steps; to get around this, you can trigger another workflow from a handler used for one of the steps and wait for the result.
- Mapping or iterating over a list of items as a part of a previous state's output is not supported in the Celerity workflow runtime. You will have to operate on the list of items within the application code of a handler that is executed as a step in the workflow.
- Celerity workflows only support handlers of the celerity/handler resource type for the executeStep state type; see the sketch after this list for an example. For integrations with cloud provider services, you will either need to use the cloud provider's workflow service or use a handler that can interact with the cloud provider's API. Celerity provides a library of handler templates for service integrations for cloud providers and other third-party services; see Workflow Integrations for more information.
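To make the last point concrete, the sketch below shows a workflow whose executeStep states are backed by celerity/handler resources. The resource names are made up, and the handler reference field (handler) and transition field (next) are assumed names used for illustration only; they may not match the exact state schema.

```yaml
# Illustrative sketch - the "handler" and "next" state fields are assumed
# names for illustration and may differ from the exact state schema.
resources:
  fetchInvoiceHandler:
    type: celerity/handler
    spec:
      # handler configuration omitted for brevity

  invoiceWorkflow:
    type: celerity/workflow
    spec:
      startAt: fetchInvoice
      states:
        fetchInvoice:
          type: executeStep
          handler: fetchInvoiceHandler   # assumed reference field
          next: archiveInvoice           # assumed transition field
        archiveInvoice:
          type: executeStep
          handler: archiveInvoiceHandler # another celerity/handler (not shown)
          # terminal state configuration omitted
```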