Temporal architecture
This guide assumes knowledge of the following introductory concepts: Workflow, Activity, Worker, Task Queues, and Event History. If any of them are new to you, read Understanding Temporal first.
Temporal is a tool that abstracts away a lot of difficulty for anyone managing applications that need to be resilient to failures. This guide provides you with a high-level overview of how Temporal works under the hood.
Application Structure Recap
Before diving into the architecture, let's recap what you're responsible for as a developer.
You own the Temporal Application code. That means writing the Activity Definition, the Workflow Definition, and the code that configures and starts the Workers, which coordinate with a Temporal Service to execute your Workflow and Activity code.
As part of the Temporal Platform, we also have the Temporal Service, which runs the Temporal Server with its database and optional components, and is responsible for orchestrating execution. A Temporal application gains its durability, scalability, and reliability from the support provided by the Temporal Service.
Overall architecture diagram
To understand how Temporal works, start with the high-level architecture view, which shows how the different layers interact with each other. Then you can take a deep dive into each layer.
Client application
A Temporal Client is one part of your Temporal application and is provided by a Temporal SDK. It offers a set of APIs to communicate with a Temporal Service. You instantiate and use it in your own application code to start and manage Workflow Executions.
In the wider platform, there are three kinds of clients that talk to the Temporal Server:
- Temporal's command-line interface (CLI)
- Temporal's web-based user interface (Web UI)
- A Temporal Client embedded into the applications you run
Consider the example of an order processing system.
A Temporal Client lets you:
- Start a Workflow Execution, for example, when a customer places an order.
- Signal a Workflow Execution to update the order if the customer changes their shipping address.
- Query a Workflow Execution to retrieve the current status of the order.
- List Workflow Executions to view all orders being processed.
- Get the result of a Workflow Execution to retrieve the final outcome of the order processing.
In short, Clients start Workflows by sending requests that initiate new Workflow Executions. They can Query a running Workflow to read its current state synchronously, and Signal it to send an asynchronous message that influences its behavior. Clients also manage Workflow Executions by canceling, terminating, or describing them.
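The difference between these operations can be sketched with a small in-memory stand-in. This is not the Temporal SDK API: `ToyClient`, `ToyExecution`, and every method name below are hypothetical, and the `complete` helper stands in for the work a real Worker would do. The sketch only shows the semantics of the order example: a Signal changes state, a Query reads it, and the result is available once the execution finishes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ToyExecution:
    """In-memory stand-in for one Workflow Execution (hypothetical)."""
    workflow_id: str
    shipping_address: str
    status: str = "RUNNING"
    result: Optional[str] = None


class ToyClient:
    """Hypothetical client; a real Temporal Client talks to the Temporal Service."""

    def __init__(self) -> None:
        self._executions: Dict[str, ToyExecution] = {}

    def start_workflow(self, workflow_id: str, shipping_address: str) -> str:
        # Start: create a new execution, e.g. when a customer places an order.
        if workflow_id in self._executions:
            raise ValueError(f"workflow {workflow_id!r} already started")
        self._executions[workflow_id] = ToyExecution(workflow_id, shipping_address)
        return workflow_id

    def signal_update_address(self, workflow_id: str, new_address: str) -> None:
        # Signal: a message that changes the execution's state; returns nothing.
        self._executions[workflow_id].shipping_address = new_address

    def query_status(self, workflow_id: str) -> str:
        # Query: a read-only look at the current state.
        return self._executions[workflow_id].status

    def list_workflows(self) -> List[str]:
        # List: enumerate all known executions.
        return sorted(self._executions)

    def get_result(self, workflow_id: str) -> Optional[str]:
        # Result: None while the execution is still running.
        return self._executions[workflow_id].result

    def complete(self, workflow_id: str, result: str) -> None:
        # Stand-in for a Worker finishing the Workflow (not a client operation).
        execution = self._executions[workflow_id]
        execution.status = "COMPLETED"
        execution.result = result
```

In a real application, each of these calls would be a request sent through the Temporal Service rather than a dictionary update.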
The Temporal Server
The Temporal Server is the heart of the platform. It's responsible for orchestrating Workflow Execution, maintaining state, and ensuring reliability.
The Temporal Server consists of a frontend and multiple backend services, plus a database as a required external component. A Temporal Server may also include some optional components, such as Elasticsearch for advanced search visibility or Grafana for creating operational dashboards for observability.
Frontend Service
From your application's perspective, the Frontend Service is the Temporal endpoint your Temporal Client talks to. Your client sends requests (start Workflow, Signal, Query, etc.) to the Frontend, and the Frontend forwards them to the appropriate backend services.
The Frontend handles rate limiting, authorization, validation, and internal routing of requests to the right subsystem. Clients never communicate directly with backend services or Workers, so everything goes through this unified entry point.
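As a rough illustration of that pipeline, here is a toy model of the four steps in order. Everything in it is an assumption made for the sketch: the `ToyFrontend` class, the fixed request budget standing in for a rate limiter, the caller allow-list standing in for authorization, and the `ROUTES` table. It does not reflect how the real Frontend Service is implemented.

```python
from dataclasses import dataclass
from typing import Dict, Set


class RequestRejected(Exception):
    """Raised when the toy frontend refuses a request."""


# Hypothetical routing table: request kind -> backend subsystem.
ROUTES: Dict[str, str] = {
    "start_workflow": "history",
    "signal_workflow": "history",
    "poll_task_queue": "matching",
}


@dataclass
class ToyFrontend:
    allowed_callers: Set[str]  # stand-in for an authorization policy
    max_requests: int          # fixed budget standing in for a rate limiter
    seen: int = 0

    def handle(self, caller: str, request: Dict[str, str]) -> str:
        # 1. Rate limiting: every request counts against the budget.
        if self.seen >= self.max_requests:
            raise RequestRejected("rate limit exceeded")
        self.seen += 1
        # 2. Authorization: only known callers may proceed.
        if caller not in self.allowed_callers:
            raise RequestRejected("caller not authorized")
        # 3. Validation: the request must name a known kind and a namespace.
        kind = request.get("kind")
        if kind not in ROUTES or not request.get("namespace"):
            raise RequestRejected("invalid request")
        # 4. Routing: return the backend that should receive the request.
        return ROUTES[kind]
```

The point of the single entry point is that every check runs before any backend service sees the request.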
Temporal Service Backend
The backend consists of several specialized services, including the History, Matching, and Worker Services, plus a Persistence Layer.
History Service
The History Service is the part of Temporal that keeps track of everything that happens in each Workflow. It writes an ordered Event History to the database so Temporal always knows what has happened. This includes when a Workflow started, when Activities ran, when Signals were received, when timers fired, and when the Workflow completed or failed.
The Event History (every event or step of each Workflow) is the key to making your application reliable and crash-proof. When an error or failure happens in your app, Temporal recreates the Workflow's state by reading the Event History and replaying each step. That's why determinism is important in your Workflow code, which must make the same decisions on every replay, and why idempotency is important in Activities, which may be retried.
The History Service is a central part of how Temporal provides durable execution. It persists all Workflow Execution state, including the Event History, any mutable state, and internal task queues like timers, transfers, replication, and visibility/indexing.
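The replay idea can be sketched in a few lines. This is a simplified model, not Temporal's implementation: the real Event History records many more event types, and `ReplayContext`, `SimulatedCrash`, `order_workflow`, and the two activity functions are hypothetical names. The sketch shows the core mechanism: on replay, results already in the history are returned without re-running the Activity, and a mismatch between the code and the history is reported as non-determinism.

```python
from typing import Callable, List, Optional, Tuple


class SimulatedCrash(Exception):
    """Stands in for the worker process dying mid-workflow."""


class ReplayContext:
    def __init__(self, history: List[Tuple[str, str]],
                 crash_after: Optional[int] = None) -> None:
        self.history = history         # persisted list of (activity name, result)
        self._cursor = 0               # position in the history during replay
        self._remaining = crash_after  # new activities allowed before the crash

    def execute_activity(self, name: str, fn: Callable[..., str], *args) -> str:
        if self._cursor < len(self.history):
            # Replay: reuse the recorded result instead of running the activity.
            recorded_name, result = self.history[self._cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic workflow: history has {recorded_name!r}, "
                    f"code asked for {name!r}")
        else:
            # New work: run the activity and record its result.
            if self._remaining == 0:
                raise SimulatedCrash("worker process died")
            result = fn(*args)
            self.history.append((name, result))
            if self._remaining is not None:
                self._remaining -= 1
        self._cursor += 1
        return result


calls = {"charge": 0, "ship": 0}  # counts real activity executions


def charge(order_id: str) -> str:
    calls["charge"] += 1
    return f"charged:{order_id}"


def ship(order_id: str) -> str:
    calls["ship"] += 1
    return f"shipped:{order_id}"


def order_workflow(ctx: ReplayContext, order_id: str) -> List[str]:
    # Deterministic: the same inputs always request the same activities in order.
    receipt = ctx.execute_activity("charge", charge, order_id)
    tracking = ctx.execute_activity("ship", ship, order_id)
    return [receipt, tracking]
```

Running `order_workflow` once with `ReplayContext(history, crash_after=1)` raises `SimulatedCrash` after the charge is recorded. Running it again with a fresh `ReplayContext(history)` over the same list completes the order, and `calls["charge"]` stays at 1 because the charge result comes from the history.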
Matching Service
The Matching Service manages the Task Queues and Task Queue partitions. This is where Tasks are dispatched to their respective queues before being picked up by the corresponding Workers. Worker polling also takes place here: Workers poll a Task Queue to ask for work, and that polling determines how many more Tasks are sent to each Worker.
When a Workflow Task or Activity Task needs to be done, the Matching Service finds an appropriate Worker polling the queue and hands off the Task.
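A toy version of that hand-off is shown below. It is a sketch under several assumptions: `ToyMatching` and its method names are hypothetical, there is a single partition per queue, and `poll` returns `None` immediately when nothing is pending instead of holding the request open the way a real poll would. A Task goes straight to a waiting poller if there is one, and otherwise sits in the backlog until a Worker polls.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, List, Optional, Tuple


class ToyMatching:
    def __init__(self) -> None:
        # Task Queue name -> Tasks waiting for a Worker.
        self._backlog: DefaultDict[str, Deque[str]] = defaultdict(deque)
        # Task Queue name -> Workers waiting for a Task.
        self._waiting: DefaultDict[str, Deque[str]] = defaultdict(deque)
        # Record of every (worker, task) hand-off, in order.
        self.assignments: List[Tuple[str, str]] = []

    def add_task(self, queue: str, task: str) -> None:
        if self._waiting[queue]:
            # A Worker is already polling this queue: hand the Task off directly.
            worker = self._waiting[queue].popleft()
            self.assignments.append((worker, task))
        else:
            # Nobody is polling: keep the Task until a Worker asks for it.
            self._backlog[queue].append(task)

    def poll(self, queue: str, worker_id: str) -> Optional[str]:
        if self._backlog[queue]:
            task = self._backlog[queue].popleft()
            self.assignments.append((worker_id, task))
            return task
        # Nothing pending: remember the poller so the next Task goes to it.
        self._waiting[queue].append(worker_id)
        return None
```

Because Workers only receive Tasks in response to a poll, a Worker that is busy or down simply stops asking, and no Task is pushed to it.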
Worker Service
The Worker Service handles the background functionality that keeps the Temporal Service running smoothly. It runs the internal system Workflows that Temporal needs for its own operations, such as scheduled maintenance jobs, cleanup, replication, archival of old Event Histories, and visibility indexing. It is different from your application Workers, so you won't need to interact with this layer directly.
Persistence Layer
The Persistence Layer is the database you configure to store Workflow state, Event History, and Task Queues. You can choose between Cassandra, MySQL, PostgreSQL, and SQLite. It also stores metadata about your Workflows and any other data that must be recorded to ensure durable execution. This persistence enables recovery and replay: if anything fails, Temporal can reconstruct the exact state of a Workflow from its Event History.
External services
External services are anything outside of your own application code and outside Temporal itself. Examples include third‑party APIs, databases that store customer data, or messaging systems.
Inside the Temporal Service, the main services (Frontend, History, Matching, and the internal Worker Service) are set up so they can be scaled independently. You can run many copies of Frontend and Matching, and the History Service spreads its work across "shards," so each part can grow based on load.
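To make the idea of shards concrete, the sketch below maps each Workflow Execution to one of a fixed number of shards by hashing its identifiers. The function name `shard_for`, the key format, and the use of SHA-256 are assumptions chosen for illustration and are not Temporal's actual scheme. What matters is the property: the same Workflow always lands on the same shard, and different Workflows spread across shards.

```python
import hashlib


def shard_for(namespace_id: str, workflow_id: str, num_shards: int) -> int:
    """Return a shard index in [0, num_shards) for a Workflow Execution.

    Hypothetical scheme: hash "namespace/workflow" and take it modulo the
    shard count. The result is stable across processes and restarts.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    key = f"{namespace_id}/{workflow_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards
```

With a mapping like this, each History Service instance can own a subset of shards, so adding instances spreads the load without any two instances handling the same Workflow at once.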
Temporal can also integrate with Elasticsearch to provide advanced search and visibility capabilities for Workflow Executions. Without it, you're limited to basic filtering. Grafana can be used to enable operational dashboards and monitoring for the Temporal Service itself.
Infrastructure
This is your own infrastructure, where the code for your Workers, Workflows, Activities, Signals, Updates, and Queries runs. It is also where you scale up the number of Workers to increase how many Workflows can run simultaneously.
Your Workflow code is the orchestration layer that defines the structure of your application. It needs to be deterministic so Temporal can help your app survive through process crashes, outages, and other failures.
Your Activities are where the actual work happens: invoking tools, making API requests, using third-party services. These can be as unpredictable and non-deterministic as needed.
That's what makes Temporal so valuable for long-running Workflows. Consider a Workflow that involves multiple steps like:
- Calling an endpoint
- Waiting on customer input
- Triggering an event based on that input
- Calling a third-party service
- Updating the database
- Sending info to the customer
- Sending info to an internal user
- Calling a different endpoint
Each of these steps is a place where a failure or outage could cause issues. That's what is meant by durable execution in Temporal: any of these steps could fail, and the Workflow will recreate the chain of events leading up to the incident and then continue forward from that point.
Failures
Temporal Failures are representations (in the SDKs and Event History) of various types of errors that occur in the system.
Failure handling is an essential part of development. For more information, including the difference between application-level and platform-level failures, see Handling Failure From First Principles. For the practical application of those concepts in Temporal, see Failure Handling in Practice.
In languages that throw (or raise) errors (or exceptions), what you throw from a Workflow determines what fails. Throwing an error that is not a Temporal Failure fails the Workflow Task, and the Task is retried until it succeeds. Throwing a Temporal Failure, or letting one propagate from a Temporal call (such as an Activity Failure from an Activity call), fails the Workflow Execution. For more information, see Application Failure.
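The two outcomes can be modeled with a short sketch. None of this is SDK code: `ToyTemporalFailure` stands in for the SDK's Failure types, `run_workflow` is a hypothetical driver, and the attempt cap exists only to keep the example finite, whereas the text above describes the Task being retried until it succeeds. The attempt number passed to the workflow function simulates a bug being fixed by a later deployment.

```python
from typing import Callable, Optional, Tuple


class ToyTemporalFailure(Exception):
    """Stand-in for a Temporal Failure (hypothetical, not an SDK class)."""


def run_workflow(workflow_fn: Callable[[int], str],
                 max_task_attempts: int = 5) -> Tuple[str, Optional[str], int]:
    """Return (status, detail, attempts) for a toy Workflow Execution."""
    for attempt in range(1, max_task_attempts + 1):
        try:
            return ("COMPLETED", workflow_fn(attempt), attempt)
        except ToyTemporalFailure as failure:
            # A Temporal Failure fails the whole Workflow Execution: no retry.
            return ("FAILED", str(failure), attempt)
        except Exception:
            # Any other error fails only this Workflow Task, which is retried.
            continue
    # Cap reached (sketch only): the execution is still open, not failed.
    return ("RUNNING", None, max_task_attempts)


def buggy_then_fixed(attempt: int) -> str:
    # Simulates a code bug that a deployment fixes before the third attempt.
    if attempt < 3:
        raise ValueError("bug in workflow code")
    return "ok"


def business_failure(attempt: int) -> str:
    # Simulates a deliberate, application-level failure.
    raise ToyTemporalFailure("payment declined")
```

In this model, `run_workflow(buggy_then_fixed)` completes on the third attempt, while `run_workflow(business_failure)` fails on the first. That mirrors the practical consequence: an unexpected bug leaves the Workflow open so a fix can be deployed, while a deliberate Failure ends the execution.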
