Scaling self-hosted n8n: Choosing your execution mode
Discover when to use queue mode vs. the default main process for scaling your self-hosted n8n instance. Make the right architectural choice for reliability.
The decision to self-host an automation platform like n8n is often driven by compelling needs: complete data sovereignty, protection of sensitive credentials, cost control at scale, and limitless customization. It places the full power of the platform within your infrastructure, which is a significant advantage for organizations with strict data privacy requirements or unique integration challenges.
However, this freedom comes with responsibility, particularly for architectural planning. An n8n instance that performs perfectly with a few dozen executions per day can quickly become a bottleneck when faced with production-level loads of thousands of events per hour. The most critical, yet often overlooked, decision that dictates your ability to scale is the choice of execution mode.
This is not just a minor configuration tweak; it is a fundamental architectural choice. In this article, we will break down the two primary execution modes for self-hosted n8n: the default main process and the scalable queue mode. We will explore the trade-offs and provide clear, practical criteria to help you select the architecture that ensures your automation solution is both powerful and resilient.
The default: Understanding main process execution
By default, a self-hosted n8n instance operates in what can be described as a monolithic model, using the main execution mode. In this configuration, a single Node.js process is responsible for handling every task. It serves the web interface, manages user authentication, listens for incoming webhooks, and directly executes every workflow that is triggered. This simplicity is its greatest strength, especially during initial setup and development.
For small teams, internal tool-building, or low-volume processes, this mode is perfectly adequate. The resource footprint is minimal, and the entire setup can run comfortably in a single Docker container, making it incredibly easy to deploy and manage. There are no external dependencies like message brokers to worry about, which lowers the initial operational complexity. You can get a functional automation environment up and running in minutes, which is ideal for prototyping and validating ideas.
However, this simplicity becomes a liability under load. Since one process does everything, it is a single point of failure. A computationally intensive workflow, such as a large data batch job, can consume all available CPU and memory, starving other operations. This means incoming webhook-triggered workflows might be delayed or time out completely, creating a poor experience for users and integrated systems. There is no mechanism for horizontal scaling; you can only vertically scale by giving the single process more resources, which has practical and financial limits. This mode is not designed for high-concurrency or mission-critical reliability.
The scalable alternative: Queue mode architecture
For production environments that demand high performance and reliability, n8n offers queue mode. This mode transforms n8n from a monolithic application into a distributed, event-driven system. The architecture decouples the triggering of a workflow from its actual execution, which is the cornerstone of building resilient systems. Instead of the main process executing jobs directly, it acts as a lightweight API endpoint that simply accepts incoming requests (like webhooks) and places them as "jobs" onto a message queue.
This setup introduces several distinct components that work in concert. The Main Process continues to serve the UI and manage workflows, but its primary execution-related task is to enqueue jobs. The Message Queue, typically powered by a fast in-memory database like Redis, acts as a persistent buffer for these jobs. Finally, one or more dedicated Worker Processes run separately, continuously polling the queue for new jobs to execute. These workers are stateless and independent, focused solely on processing workflows.
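As a concrete sketch, queue mode is enabled through environment variables shared by the main process and every worker. The variable names below follow the n8n documentation at the time of writing; verify them against the docs for your n8n version before relying on them:

```shell
# Shared by the main process and all workers; names per the n8n docs.
export EXECUTIONS_MODE=queue            # enqueue jobs instead of executing inline
export QUEUE_BULL_REDIS_HOST=redis      # hostname of the Redis message broker
export QUEUE_BULL_REDIS_PORT=6379
export QUEUE_BULL_REDIS_PASSWORD=...    # only if your Redis requires auth

# Main process: serves the UI and API, accepts webhooks, enqueues jobs.
n8n start

# On each worker machine or container (same environment as above):
n8n worker --concurrency=10             # poll the queue, run up to 10 jobs at once
```

Every worker started this way against the same Redis instance joins the pool automatically, which is what makes horizontal scaling a matter of launching more containers.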
This separation provides immense benefits. Scalability becomes horizontal; if your workload increases, you simply add more worker processes. This can even be done dynamically with container orchestration platforms like Kubernetes. Reliability is greatly improved. If a worker process crashes while executing a complex workflow, it does not affect the main instance or any other worker. The job can be requeued and attempted again, often automatically. This architecture also enables patterns like Dead-Letter Queues (DLQs), where jobs that repeatedly fail can be shunted aside for manual inspection without halting the entire system.
Key decision criteria: When to switch to queue mode
Migrating from main to queue mode is a significant architectural step. It is not a matter of "if" but "when" the benefits of scalability and resilience outweigh the costs of increased complexity. Making the decision requires evaluating your specific needs across several key axes. This is less about a single threshold and more about understanding the operational profile of your automation tasks.
Execution volume and concurrency
The most obvious driver is load. Are you processing a handful of workflows per hour, or are you handling thousands of incoming API calls and webhook events? The main process can typically handle low to moderate, steady traffic. However, it struggles with spiky, unpredictable loads. If your business relies on handling a sudden influx of orders from a marketing campaign or synchronizing data from a high-frequency source, the main process will likely drop events or become unresponsive. Queue mode is explicitly designed for this, absorbing massive spikes by letting jobs pile up in the queue, ensuring no event is lost. Workers then process this backlog at a sustainable pace.
Workflow complexity and duration
Consider the nature of your workflows. A workflow that validates a form submission and sends a Slack notification takes milliseconds. A workflow that queries a data warehouse, transforms a million records, and uploads them to a remote FTP server can run for hours. In main mode, that long-running batch job will block the execution of every other workflow. This "head-of-line blocking" is a critical failure point. In queue mode, the long-running job is just one message among many. While it occupies one worker, other workers remain free to process the short, latency-sensitive workflows, ensuring that a user-facing webhook response is not delayed by a back-office data sync.
Reliability and operational requirements
How critical is your automation? If a workflow fails to run, is it a minor inconvenience or a major business disruption? For mission-critical processes, such as order processing or infrastructure alerts, the resilience of queue mode is non-negotiable. The decoupling of components means a failure in one part of the system is less likely to cause a total outage. Furthermore, the operational overhead of queue mode, while higher, brings more robust capabilities. It necessitates proper DevOps practices, including infrastructure as code, monitoring of queue depth and worker health, and centralized logging. This investment in operational maturity is essential for any system that the business depends on.
Common pitfalls in queue mode implementation
Successfully implementing queue mode is more than just changing a configuration variable. It requires a shift in mindset and a deliberate approach to system design. Across projects, we have observed several common pitfalls that can undermine the very benefits you seek to achieve. Avoiding them is key to a successful, scalable deployment.
First, teams often under-provision the message broker. Redis is incredibly fast, but it is not a magic box. It requires adequate memory and CPU, and it must be configured for persistence if you want to survive restarts without losing jobs. A bottlenecked Redis instance will make your entire system slow, regardless of how many workers you have. Second is a lack of observability. Without proper monitoring, you are flying blind. You must track key metrics: queue depth (how many jobs are waiting?), job processing latency (how long does a job take?), and the error rate of workers. Dashboards and alerts for these metrics are not optional; they are essential for managing a distributed system.
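As an illustration, a queue-depth check like the following captures the two signals worth alerting on: absolute depth and sustained growth. The helper and its thresholds are hypothetical, and in a real deployment the samples would come from your monitoring stack rather than being passed in directly:

```python
def check_queue_health(depth_samples, max_depth):
    """Evaluate recent queue-depth samples against two simple alert rules.

    Fires when the queue is too deep (workers cannot keep up) or when
    depth grows monotonically (ingestion is outpacing processing).
    """
    alerts = []
    if depth_samples[-1] > max_depth:
        alerts.append("queue_depth_exceeded")
    growing = all(b > a for a, b in zip(depth_samples, depth_samples[1:]))
    if len(depth_samples) >= 3 and growing:
        alerts.append("queue_depth_growing")
    return alerts

if __name__ == "__main__":
    # A queue that is both over the limit and steadily climbing.
    print(check_queue_health([120, 340, 910], max_depth=500))
    # ['queue_depth_exceeded', 'queue_depth_growing']
    print(check_queue_health([40, 12, 7], max_depth=500))   # []
```

Wiring checks like this into dashboards and alerts is the minimum observability a distributed n8n deployment needs.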
Another frequent mistake is failing to design for statelessness. Workers can be terminated and replaced at any time, so workflows cannot rely on data stored on the local filesystem of a specific worker or the main process. State must be externalized to a database, a cache, or shared storage. Finally, many implementations lack a clear strategy for handling failed jobs. What happens when a workflow fails after three retries? Without a Dead-Letter Queue (DLQ) pattern, that job might be lost forever. A robust implementation routes these terminal failures to a separate queue for manual analysis and potential reprocessing.
- Under-provisioning the message queue (e.g., Redis)
- Lack of monitoring for queue depth and worker health
- Designing stateful workflows that rely on local storage
- No Dead-Letter Queue (DLQ) for handling terminal failures
- Ignoring network latency between workers and dependencies
- Inconsistent resource allocation for worker processes
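The retry-then-DLQ pattern from the list above can be sketched in a few lines. This is an illustrative pattern, not n8n's internal retry logic: a job is attempted a bounded number of times, and a terminal failure is tagged for a dead-letter queue instead of being silently dropped.

```python
def process_with_retries(job, handler, max_attempts=3):
    """Attempt a job up to max_attempts times; on terminal failure,
    return it tagged for the dead-letter queue instead of losing it."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"status": "ok", "result": handler(job), "attempts": attempt}
        except Exception as exc:
            last_error = str(exc)            # remember why the attempt failed
    # All retries exhausted: route to the DLQ with full failure context.
    return {"status": "dead_letter", "job": job,
            "attempts": max_attempts, "error": last_error}

if __name__ == "__main__":
    calls = {"n": 0}

    def flaky(job):
        """Fails twice, then succeeds: a transient upstream error."""
        calls["n"] += 1
        if calls["n"] < 3:
            raise RuntimeError("transient upstream error")
        return f"processed {job}"

    print(process_with_retries("order-42", flaky))          # succeeds on attempt 3
    print(process_with_retries("order-43", lambda j: 1/0))  # routed to the DLQ
```

The key design choice is that the dead-letter record carries the original job and the last error, so an operator can inspect and replay it later.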
Conclusion
Choosing the right execution mode for a self-hosted n8n instance is a critical decision that balances simplicity against the demands of production. The default main mode is an excellent choice for development, testing, and low-volume applications due to its simplicity and minimal overhead. However, as your execution volume, concurrency, and reliability requirements grow, it inevitably becomes a bottleneck.
Transitioning to queue mode marks the evolution of your automation platform into a mature, resilient, and scalable system. It introduces the operational overhead of managing a distributed architecture but pays significant dividends in performance and high availability. The decision to switch should be driven by a clear-eyed assessment of your workflow characteristics, business criticality, and your team's DevOps capabilities.
Choosing the right architecture for your self-hosted automation platform is a foundational decision with long-term consequences. If you are designing an automation architecture in your company, the AutomationNex.io team will gladly share its experience from n8n deployments in the context of your technology stack.