How nf_conntrack Overflow Causes Intermittent UDP Tracker Downtime with Docker
A subtle Linux kernel resource exhaustion silently drops UDP packets when running a BitTorrent tracker behind Docker bridge networking. No application error, no socket counter — just intermittent timeouts and a self-recovery cycle. Here is how to diagnose it, fix it, and make sure the fix survives a reboot.
Introduction
When you run a UDP BitTorrent tracker behind Docker bridge networking, the Linux kernel creates conntrack (connection tracking) entries for UDP flows that pass through Docker's NAT layer. Under sustained tracker load those entries accumulate faster than they expire, the conntrack table fills up, and the kernel starts silently dropping packets.
The result is intermittent UDP timeouts with a characteristic self-recovery cycle: the table fills, a probe gets dropped, entries expire, the table drains, the next probe succeeds, and the cycle repeats. The application log is completely silent. No error, no counter, no warning — just unexplained timeout spikes on your uptime monitor.
This post documents the mechanism behind the problem, how to diagnose it, the fix, and a reboot-persistence trap that trips many operators.
Our Experience: Repeated Incidents Across Two Demos
First Demo — DigitalOcean (2024–2025)
The first occurrence was on the original torrust/torrust-demo hosted on DigitalOcean. UDP uptime on newTrackon had been fluctuating and eventually dropped to around 60 % at peak. The investigation is documented in torrust/torrust-demo#26.
The kernel journal confirmed the diagnosis: nf_conntrack: table full, dropping packet messages, alongside 20 million+ early_drop events on CPU 3. After increasing nf_conntrack_max, UDP uptime on newTrackon recovered to 99.2 %.
A few months later, in June 2025, the same DigitalOcean server filled the conntrack table
again (uptime back down to about 90 %, with fresh nf_conntrack: table full, dropping packet messages and tens of millions of early_drop events on CPU 3). The follow-up investigation in torrust/torrust-demo#72 tried to go further than just raising the ceiling and disable conntrack for the tracker port
altogether using NOTRACK rules. As the Alternative Approaches section below describes in detail, that
attempt failed in our Docker setup — even after switching the tracker to --network=host mode — and ultimately required restoring a server backup. We kept
the sysctl tuning and migrated the demo to Hetzner shortly afterwards.
New Tracker Demo — Hetzner (2026)
In April 2026 we migrated the Torrust Tracker Demo to Hetzner and resized the server from a CCX23 (4 vCPU, 16 GB RAM) to a CCX33 (8 vCPU, 32 GB RAM) to improve performance. The opposite happened: UDP uptime the day after the resize was 83.9 %, down from 92.2 % before the resize.
As we explain in the symptom section below, a larger server can make things worse: more processing power means more requests per second, which fills the conntrack table faster and increases the drop rate.
Investigation (tracked in torrust/torrust-tracker-demo#21) found nf_conntrack_count = nf_conntrack_max = 262144 — the table
completely full — with 2478 "table full" messages in dmesg.
The fix was applied on 2026-04-20 (see torrust/torrust-tracker-demo PR #22) with all three parameters and the module pre-load, and we are monitoring newTrackon for recovery data. Since the fix we have seen no new "table full" messages in dmesg and zero IPv4 UdpRcvbufErrors. The fix held across a server reboot and at peak load (~750 UDP req/s, ~2 000 HTTP req/s). Before the fix, UDP uptime had been as low as 83.9 % on the day the conntrack table first filled (262 144 / 262 144 entries).
The Symptom
If you run a UDP tracker and observe any of the following on an uptime monitor such as newTrackon, you may be hitting conntrack exhaustion:
- UDP availability drops intermittently to 60–90 % while the HTTP tracker stays healthy.
- Outages are self-recovering — they resolve without any operator intervention, typically within seconds to a few minutes.
- You cannot reproduce the problem by sending a single announce manually; it only appears under sustained load.
- There is nothing relevant in the tracker application log, the Docker logs, or
netstat/ss socket counters.
- Restarting the tracker or Docker has no lasting effect — the problem returns once load resumes.
- Upgrading the server to a larger instance (more CPU, more RAM) makes things worse because the tracker can now handle more requests per second, which fills the conntrack table faster.
Why It's Hard to Diagnose
The standard places you look for dropped packets do not show this problem:
- Application log: the tracker process never sees the dropped packet. The kernel drops it before it reaches the socket.
- Socket receive-buffer drops:
ss -u -s and netstat -su show socket-level drops, not kernel-level conntrack drops. They will not increment.
- Firewall logs: iptables/ufw log rules fire on packets that reach the firewall. A packet dropped by the conntrack subsystem before the firewall never appears in those logs.
- Docker logs: Docker has no visibility into kernel packet drops.
The primary evidence is in dmesg and conntrack counters in /proc/sys/net/netfilter/.
# Look for the telltale message:
dmesg | grep -i conntrack
# nf_conntrack: table full, dropping packet
# Check current fill level:
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max
# If count equals max, the table is full right now.
When we investigated the second occurrence on Hetzner, we found nf_conntrack_count = nf_conntrack_max = 262144 — the table was completely full at the moment of inspection — and 2478 "table full" drop messages in dmesg.
The Mechanism: Docker DNAT and Conntrack
How Docker Publishes UDP Ports
When you publish a UDP port in Docker (-p 6969:6969/udp), Docker installs a
DNAT (Destination Network Address Translation) rule in iptables. This rule
rewrites the destination address of every inbound packet from the host's public IP to the
container's private bridge IP.
NAT requires connection tracking. The kernel must remember which packets were rewritten so it can apply the reverse translation to outbound replies. For each new UDP "flow" (unique source IP + source port combination), the kernel creates a conntrack entry.
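To make the mechanism concrete, here is the shape of the rule involved. The rule text below is a captured example: the container IP 172.17.0.2 and the port are placeholders that will differ on your host (inspect the real rule with sudo iptables -t nat -S DOCKER).

```shell
# Example DNAT rule as printed by `iptables -t nat -S DOCKER` for a
# `-p 6969:6969/udp` publish (container IP 172.17.0.2 is illustrative):
rule='-A DOCKER ! -i docker0 -p udp -m udp --dport 6969 -j DNAT --to-destination 172.17.0.2:6969'

# The rewrite target is what forces the kernel to create a conntrack entry:
# it must remember this mapping to un-NAT the container's reply.
echo "$rule" | grep -o 'to-destination [0-9.]*:[0-9]*'
```

Every distinct client IP/port that hits this rule gets its own conntrack entry, which is exactly what accumulates under tracker load.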
How Entries Accumulate Under UDP Tracker Load
Unlike TCP, UDP has no handshake. The kernel cannot know when a UDP exchange is "finished", so each entry persists until a configurable timeout expires:
- One-way (unreplied) UDP: default timeout is 30 seconds.
- Bidirectional (replied) UDP: default timeout is 120 seconds.
A BitTorrent tracker announce is a request–response exchange, so entries are classified as bidirectional with the 120-second timeout. Each unique client IP/port pair that sends an announce holds a conntrack entry for two full minutes.
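You can see this classification directly in conntrack -L output: unreplied flows carry an [UNREPLIED] flag, replied flows do not. The two sample lines below are inlined with made-up IPs so the pipeline runs anywhere; on a live host, pipe sudo conntrack -L -p udp into the same awk.

```shell
# Two sample `conntrack -L -p udp` lines (IPs are illustrative). The third
# field is the remaining timeout: 115 s on the replied (bidirectional) flow,
# 25 s on the unreplied one-way flow.
sample='udp      17 115 src=203.0.113.5 sport=51413 dst=172.17.0.2 dport=6969 src=172.17.0.2 sport=6969 dst=203.0.113.5 dport=51413 [ASSURED] mark=0 use=1
udp      17 25 src=198.51.100.9 sport=40000 dst=172.17.0.2 dport=6969 [UNREPLIED] src=172.17.0.2 sport=6969 dst=198.51.100.9 dport=40000 mark=0 use=1'

# Count replied vs. unreplied flows:
echo "$sample" | awk '/\[UNREPLIED\]/ {u++; next} {r++}
  END {printf "replied=%d unreplied=%d\n", r+0, u+0}'
```

On a busy tracker the replied bucket dominates, which is why the 120-second stream timeout is the one that matters.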
The Calculation
The minimum conntrack table size needed to handle your request rate without dropping packets is:
minimum_table_size = requests_per_second × udp_stream_timeout_seconds
With default settings (udp_timeout_stream = 120 s) and a table size of 262
144 entries:
- Maximum safe request rate = 262 144 ÷ 120 ≈ 2 184 requests/s
That sounds large, but BitTorrent clients re-announce every 30–60 minutes from a rotating pool of ports. A tracker with tens of thousands of active torrents, each with dozens of peers, easily exceeds this rate at peak times.
Reducing the stream timeout to 15 seconds multiplies the effective capacity by 8× without changing the table size:
- 262 144 ÷ 15 ≈ 17 476 requests/s at the default table size
Combining a larger table with a shorter timeout gives significant headroom even on a busy public tracker.
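The capacity figures above are plain integer division, which you can reproduce (or redo for your own table size and timeout) in the shell:

```shell
# Maximum sustainable request rate = table size / stream timeout
echo $((262144 / 120))   # default table, default timeout: 2184 req/s
echo $((262144 / 15))    # default table, 15 s timeout:    17476 req/s
echo $((1048576 / 15))   # larger table + short timeout:   69905 req/s
```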
The Fix: Three Kernel Parameters
Create (or edit) /etc/sysctl.d/99-conntrack.conf with the following content
(the deployed version for the Torrust Tracker Demo is at server/etc/sysctl.d/99-conntrack.conf):
# Raise the table ceiling.
# Default 65536–262144 is too small under tracker load.
net.netfilter.nf_conntrack_max = 1048576
# Reduce UDP stream timeout.
# Default 120 s; a tracker announce round-trip completes in milliseconds.
net.netfilter.nf_conntrack_udp_timeout_stream = 15
# Reduce one-way UDP timeout.
# Default 30 s.
net.netfilter.nf_conntrack_udp_timeout = 10
Apply the settings immediately without rebooting:
sudo sysctl --system
# or apply only this file:
sudo sysctl -p /etc/sysctl.d/99-conntrack.conf
Verify that the new values are active:
sysctl net.netfilter.nf_conntrack_max
sysctl net.netfilter.nf_conntrack_udp_timeout_stream
sysctl net.netfilter.nf_conntrack_udp_timeout
Size nf_conntrack_max from your actual request rate using the formula in the previous section. Raising the table ceiling increases kernel memory usage (roughly 300–400 bytes per entry). At nf_conntrack_max = 1 048 576 that is ≈ 384 MB of kernel memory reserved for the conntrack table — trivial on a 32 GB server, but worth budgeting for on a 1–2 GB VPS.
Don't Forget the Hash Table
When you raise nf_conntrack_max by an order of magnitude, the hash bucket count does not auto-scale. The default is around 65 536
buckets; if you keep that while raising the ceiling to 1 048 576, every lookup walks long
collision chains and table operations degrade from O(1) toward O(n). The recommended ratio
is roughly nf_conntrack_max / 4 to nf_conntrack_max / 8.
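For the table size used in this post, the ratio works out as follows:

```shell
# Bucket counts implied by the max/4 .. max/8 rule of thumb
max=1048576
echo "max/4 = $((max / 4)) buckets"
echo "max/8 = $((max / 8)) buckets"
```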
You can tune buckets with the nf_conntrack_buckets sysctl (writeable in the
initial network namespace) or set the module parameter hashsize for early-boot
consistency.
# Runtime (sysctl): 262144 buckets pairs well with nf_conntrack_max = 1048576
sudo sysctl -w net.netfilter.nf_conntrack_buckets=262144
# Persistent (sysctl)
echo 'net.netfilter.nf_conntrack_buckets = 262144' | sudo tee /etc/sysctl.d/98-conntrack-buckets.conf
# Optional early-boot module option (equivalent bucket size)
echo 'options nf_conntrack hashsize=262144' | sudo tee /etc/modprobe.d/nf_conntrack.conf
# Verify
sysctl net.netfilter.nf_conntrack_buckets
cat /sys/module/nf_conntrack/parameters/hashsize
Reduced Timeouts Are Global
The nf_conntrack_udp_timeout* values are kernel-wide — they apply to every
UDP flow on the host, not only to tracker traffic. A 15-second stream timeout is
appropriate for request–response protocols like a BitTorrent tracker, DNS resolver, or
QUIC server, but it can be aggressive for long-lived UDP services such as WireGuard,
IPsec, VoIP/SIP gateways, or long-running game servers. If you co-host such services,
either keep the default 120 s or use NOTRACK rules (see the Alternative Approaches section) to exempt them from connection tracking
entirely.
The Reboot Persistence Trap
This is where many operators get burned: you apply the fix, it works perfectly, you reboot the server, and the problem silently comes back.
The net.netfilter.nf_conntrack_* sysctl keys only exist after the nf_conntrack kernel module has been loaded. The module is loaded by Docker
when Docker starts. However, systemd applies sysctl configuration at boot before Docker runs — so when systemd reads /etc/sysctl.d/99-conntrack.conf, the keys do not exist yet and the settings
are silently skipped.
The fix is to instruct the kernel to pre-load the module during boot:
echo "nf_conntrack" | sudo tee /etc/modules-load.d/conntrack.conf
With this in place, the module is loaded early in the boot sequence, the sysctl keys exist
when systemd applies sysctl.d, and the settings take effect before Docker
starts.
Without /etc/modules-load.d/conntrack.conf, the settings will not survive a reboot even though sysctl --system confirms they are active on the running system.
After the next reboot, verify both that the module is loaded and that the values are correct:
lsmod | grep nf_conntrack
sysctl net.netfilter.nf_conntrack_max
sysctl net.netfilter.nf_conntrack_udp_timeout_stream
Alternative Approaches: Avoid the Problem Entirely
Tuning conntrack raises the ceiling, but the most fundamental fix is to stop creating conntrack entries for tracker traffic in the first place. There are three approaches worth knowing about, in order of how invasive they are.
1. Host Networking (--network=host)
Running the tracker container with --network=host bypasses Docker's bridge
and DNAT layer entirely. The tracker binds directly to the host network namespace, so no
NAT rewrite happens and no conntrack entry is created for incoming UDP packets.
This is what many high-volume public trackers do. Trade-offs: you lose Docker's network
isolation between containers, port mappings (-p host:container) are ignored,
and the container can collide with any other process listening on the same port on the
host.
# docker-compose.yml — host networking for the UDP tracker
services:
tracker:
image: torrust/tracker:latest
network_mode: host
# 'ports:' are ignored when network_mode: host
# The tracker binds to 0.0.0.0:6969 on the host directly.
2. NOTRACK on the Tracker Port
If you want to keep bridge networking for isolation, you can tell the kernel to skip
connection tracking for traffic on the tracker port using a rule in the raw table. Modern Ubuntu / Debian uses iptables-nft under the
hood, so the cleanest way to express these rules is directly in nftables. Add
the following to /etc/nftables.conf:
# /etc/nftables.conf — disable conntrack for the UDP tracker ports
table inet raw {
chain prerouting {
type filter hook prerouting priority raw;
udp dport { 6868, 6969 } notrack
}
chain output {
type filter hook output priority raw;
udp sport { 6868, 6969 } notrack
}
}
Apply and persist across reboots:
sudo systemctl enable nftables # crucial — without this the rules are not loaded at boot
sudo systemctl restart nftables
sudo nft list ruleset | grep notrack
For comparison, the equivalent classic iptables form is:
sudo iptables -t raw -A PREROUTING -p udp --dport 6969 -j NOTRACK
sudo iptables -t raw -A OUTPUT -p udp --sport 6969 -j NOTRACK
# IPv6
sudo ip6tables -t raw -A PREROUTING -p udp --dport 6969 -j NOTRACK
sudo ip6tables -t raw -A OUTPUT -p udp --sport 6969 -j NOTRACK
With NOTRACK, packets bypass conntrack and the table never grows from tracker
traffic. The catch is significant: NAT requires conntrack, so once you
stop tracking these packets, Docker's automatic DNAT for the published port no longer
works.
This is exactly what we hit on the DigitalOcean demo: we applied the nftables rules above, confirmed they were active (conntrack -S showed early_drop = 0), and immediately UDP announces from newTrackon and from our own tracker_checker client started timing out. HTTP kept working. Switching the tracker container to network_mode: host (per torrust/torrust-demo#27) did not fix it either, and we eventually had to restore a server backup. A secondary problem we observed: even with port-level NOTRACK, internal Docker traffic to the tracker (statsd on 8125, healthchecks, the index calling the tracker over 127.0.0.1) was still being tracked because those flows go through the loopback / bridge interfaces, not through the public DNAT path.
The takeaway is that NOTRACK is most useful with macvlan or with a bare-metal
install that does not rely on Docker's DNAT/iptables rules. With host networking, many
setups do not need NOTRACK at all. In a typical multi-container Docker Compose
setup it is fragile and hard to get right.
If you take the nftables route, remember to run sudo systemctl enable nftables. We hit a case where the rules in /etc/nftables.conf were syntactically valid and present on disk, but nft list ruleset came back empty after a reboot because the nftables service was not enabled.
3. macvlan Network Driver
The macvlan driver gives the container its own MAC address and IP on the physical LAN. Packets reach the container without NAT, so no conntrack entries are created on the host for tracker traffic. This preserves container isolation but requires more involved network setup (a parent interface in promiscuous mode, an IP plan, and a host that is allowed to claim multiple MACs — which rules out most cloud providers that filter on the upstream switch).
For most tracker deployments that can tolerate sharing the host network namespace, --network=host is usually the simplest and most efficient choice.
Monitoring and Verification
After applying the fix, use these commands to confirm that the table is no longer
exhausting. The conntrack CLI is not installed by default on most distributions;
install it first:
# Debian / Ubuntu
sudo apt-get install -y conntrack
# RHEL / Fedora / Rocky / Alma
sudo dnf install -y conntrack-tools
# Current fill level (watch for count approaching max)
watch -n5 'echo "count: $(cat /proc/sys/net/netfilter/nf_conntrack_count) / max: $(cat /proc/sys/net/netfilter/nf_conntrack_max)"'
# Cross-check: nf_conntrack_count should match the number of entries listed by the conntrack tool
sudo conntrack -L 2>/dev/null | wc -l
cat /proc/sys/net/netfilter/nf_conntrack_count
# Drop messages since boot
dmesg | grep -c "table full"
# Conntrack statistics per CPU (early_drop column indicates table pressure)
sudo conntrack -S
# One-liner for the drop count across all CPUs
sudo conntrack -S | awk '{for (i=1;i<=NF;i++) if ($i ~ /^early_drop=/) { split($i,a,"="); sum += a[2] } } END {print "total early_drop:", sum+0}'
The conntrack -S output includes an early_drop counter per CPU. A
non-zero value means the kernel had to evict entries early to make room — a leading indicator
of exhaustion before packets start dropping. If this counter is growing, you need a larger table
or shorter timeouts.
On the first Torrust demo, we observed 20 million+ early_drop events on CPU 3
before the fix. After increasing nf_conntrack_max and adjusting the timeouts, the
counter stabilized at zero.
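A minimal fill-level check you can drop into a cron job might look like the sketch below. The helper name check_fill and the 80 % threshold are our choices, not anything standard:

```shell
# check_fill COUNT MAX — warn when the conntrack table is >= 80 % full
check_fill() {
  pct=$((100 * $1 / $2))
  if [ "$pct" -ge 80 ]; then
    echo "WARNING: conntrack table at ${pct}% ($1/$2)"
  else
    echo "OK: conntrack table at ${pct}% ($1/$2)"
  fi
}

# On a live host, feed it the real counters:
#   check_fill "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" \
#              "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"
check_fill 262144 262144   # the state we found on the Hetzner demo
check_fill 95000 1048576   # comfortable headroom after the fix
```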
Set an alert on nf_conntrack_count / nf_conntrack_max > 0.8. At 80 % fill, entries are still being accepted; at 100 % they are being dropped. Catching it at 80 % gives you time to react without customer-facing impact.
Independent Documentation
This is not unique to Torrust. The ftorrent/open README — a comprehensive guide to running the Aquatic tracker in Docker — covers
the same problem in its "Kernel tuning for bridge networking" section. That guide
documents the same nf_conntrack_max, nf_conntrack_udp_timeout, and nf_conntrack_udp_timeout_stream fixes, and extends them with two additional
parameters: net.core.rmem_max / rmem_default to size UDP socket
receive buffers, and net.core.netdev_max_backlog to prevent softirq drops
when Docker's veth pair adds per-packet overhead. It also covers the same
reboot-persistence trap (pre-loading the nf_conntrack module) and provides matching
monitoring commands.
Any UDP service that receives sustained traffic through Docker bridge networking and Docker's DNAT layer is susceptible. BitTorrent trackers happen to be a high-frequency case because every peer re-announces periodically, generating a constant stream of short request–response exchanges.
Further Reading
The resources below independently document the same conntrack problem and cover related topics for anyone running a public tracker with Docker.
- Running Aquatic in Docker: A Complete Guide to Public BitTorrent and WebTorrent Trackers — A detailed guide to deploying the Aquatic tracker (a Rust implementation of all three BitTorrent tracker protocols) in hardened Docker containers. Covers conntrack tuning, UDP socket buffers, NIC backlog, Docker bridge networking security, container hardening with dropped capabilities and custom seccomp profiles, IPv6 dual-stack, and reverse proxy setup. The "Kernel tuning for bridge networking" section is directly relevant to this post.
- torrust/torrust-demo#26 — The GitHub issue tracking our first encounter with this problem on the DigitalOcean demo.
Includes the kernel journal output showing
nf_conntrack: table full, dropping packet and the initial fix.
- torrust/torrust-demo#72 — The follow-up issue from June 2025 documenting the second occurrence on the same DigitalOcean
droplet, the failed attempt to disable conntrack with
nftables NOTRACK rules (with and without --network=host), and the localhost-tracking gotcha that affects multi-container Docker setups. Closely related to torrust/torrust-demo#27 (Docker network configuration) and torrust/torrust-demo#78 (the backup restore that followed).
- torrust/torrust-tracker-demo#21 — The issue tracking the second occurrence on the Hetzner tracker demo, along with PR #22 which added the sysctl settings and the conntrack module pre-load to the deployer.
Related Posts on This Blog
- The New Torrust Tracker Demo Is Live — Introduces the Hetzner tracker demo that was affected by the conntrack overflow described in this post. Covers the deployment architecture, observability stack, and goals for the demo.
- Deploying the Torrust Tracker Demo with the Torrust Tracker Deployer — Step-by-step walkthrough of deploying the Hetzner tracker demo using the Torrust Tracker Deployer, including the sysctl and kernel module changes introduced to fix this conntrack issue.
- How to Run a UDP Tracker Behind a Floating IP on Ubuntu — A companion networking guide covering policy routing, Docker IPv6 networking, and SNAT for correct UDP reply paths — closely related problems to the one described here.
- Submitting Trackers to newTrackon — How to register your tracker with newTrackon, the uptime-monitoring service used throughout this post to detect and confirm the conntrack problem.
- Visualize Tracker Metrics with Prometheus and Grafana — How to set up the observability stack used to monitor the Torrust Tracker. Pairs well with the monitoring commands in this post for a complete view of tracker health.
- Containerizing Rust Applications — Best practices for building and running Torrust services in Docker, including the bridge networking configuration that makes conntrack tuning necessary.
Official Documentation
- Linux kernel: Netfilter Conntrack Sysfs variables — The authoritative reference for every
nf_conntrack_* sysctl parameter, including the default values for nf_conntrack_udp_timeout (30 s), nf_conntrack_udp_timeout_stream (120 s), and nf_conntrack_max.
- Docker Engine: Port publishing and mapping — Explains how Docker uses NAT, PAT, and masquerading to forward traffic to published container ports, and the role of iptables firewall rules in that process.
- Docker Engine: Docker with iptables — Documents the custom iptables chains Docker creates (including the
DOCKER chain in the nat table for port-mapping) and notes that packets in the DOCKER-USER chain have already been DNAT-rewritten — confirming why the conntrack extension is required to match the original IP/port.
- Docker Engine: Packet filtering and firewalls — Overview of Docker's firewall rule model for bridge networks, including masquerading and the interaction with external firewall tools.
- Docker Engine: Bridge network driver — Covers how Docker's default bridge network works, including IP masquerading and port publishing to host addresses.
Lessons
- The application log is not enough. For kernel-level drops, check
dmesg and /proc/sys/net/netfilter/.
- A larger server can make conntrack exhaustion worse, not better. More throughput fills the table faster if the table size is unchanged.
- Always pre-load the module. Without
/etc/modules-load.d/conntrack.conf, the sysctl settings will not survive a reboot.
- This affects any UDP service behind Docker bridge networking at non-trivial request rates — not just BitTorrent trackers. DNS resolvers, game servers, VoIP services, and QUIC-based applications are equally vulnerable.
- Reducing UDP timeouts is safe for request–response protocols. A BitTorrent announce completes in milliseconds. The default 120-second stream timeout exists for stateful protocols; for stateless UDP services, shorter timeouts are appropriate and dramatically increase effective table capacity.
- Monitor conntrack fill level proactively. An alert at 80 % gives you time to respond before packets start dropping.
- Resize the hash table when you raise the ceiling.
nf_conntrack_max and the bucket count (hashsize) are independent. Raising one without the other turns O(1) lookups into O(n) chain walks.
- Consider eliminating the problem instead of tuning around it. --network=host, NOTRACK rules, and the macvlan driver all remove conntrack from the path entirely. Sysctl tuning is the right call when you need bridge networking; otherwise it is treating a symptom.
- NOTRACK is harder than it looks in a multi-container Docker setup. A port-level rule does not catch flows that traverse loopback or the Docker bridge (statsd, healthchecks, container-to-container traffic), and disabling tracking on a NAT-published port breaks Docker's DNAT. We tried it twice on the DigitalOcean demo and reverted both times — see torrust/torrust-demo#72.