Part 2. L4 Load Balancer bind/unbind and connection lifecycle
We dissect which connection-state changes bind, unbind, and connection draining actually cause in an L4 load balancer, grounded in the TCP lifecycle.
Series: The Complete Guide to Graceful Drain
A 7-part series. You are reading Part 2.
- Part 1. Why does the service crash if you simply remove the server?
- Part 2. L4 Load Balancer bind/unbind and connection lifecycle (current)
- Part 3. Internal operation of Connection Draining (TCP perspective)
- Part 4. Why drain fails in a WebSocket environment
- Part 5. Non-disruptive deployment strategy in a VM + Nginx environment
- Part 6. Analysis of actual failure cases (SRE perspective)
- Part 7. Operation checklist and verification method
To properly apply an L4 Load Balancer, Connection Draining, Graceful Shutdown, and Zero Downtime Deployment in practice, you must first understand what bind/unbind changes at the packet level. This article organizes the connection lifecycle around the L4 -> Nginx -> App -> WebSocket structure.
Versions assumed
- Linux Kernel 5.15+
- Nginx 1.25+
- HAProxy 2.8+ or equivalent L4 equipment
- JVM 21
1. What bind/unbind actually changes
Key takeaways
- `bind` registers the server as a backend candidate that can receive new connections.
- `unbind` blocks only new connections; existing connections can either be kept alive or forcibly disconnected.
- What matters in operation is not the unbind itself but the policy applied after it (`drain` vs. hard close).
Detailed description
From an L4 perspective, bind/unbind is a change in the set of routing destinations.
- bind: Add VM to hash/round robin target pool
- unbind: Remove VM from target pool
- drain: Stop routing new TCP 3-way handshakes (SYN) to the VM targeted for removal.
In other words, unbind is a control plane operation, and the actual existence of a failure is determined by how the existing session is handled in the data plane.
- drain on: existing sessions are allowed to finish with a normal FIN-based termination.
- drain off / forced shutdown: the likelihood of RST occurring increases.
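As a concrete example, HAProxy's runtime API can move a server into drain while keeping established connections alive. This is a minimal sketch: the backend name `web_back`, server name `vm-a`, and socket path are hypothetical, and only the command composition is shown as a testable helper.

```shell
# Compose the HAProxy runtime-API command that drains a server
# (blocks new connections while keeping established ones).
hap_drain_cmd() {
  # $1: backend name, $2: server name -- both hypothetical examples
  printf 'set server %s/%s state drain\n' "$1" "$2"
}

# In operation, the command is sent to the admin socket, e.g.:
#   hap_drain_cmd web_back vm-a | socat stdio /var/run/haproxy.sock
```

The same runtime API can later report the server's admin state, which is what the deployment pipeline should poll rather than assuming drain completed.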
Practical tips
- Always fix the maintenance sequence as `unbind -> drain -> terminate`.
- During drain, simultaneously verify that `new connections` is 0 and `active connections` is decreasing.
- Make the drain status check API a mandatory step in the deployment pipeline.
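That pipeline check can be sketched as a small shell helper. The `:443` port and the `ss -ant`-style input format are assumptions; the parsing is separated from the live command so the logic can be dry-run on sample data.

```shell
# Decide whether drain is complete from `ss -ant`-style output:
# drain is done once no ESTABLISHED connection on :443 remains.
drain_complete() {
  # $1: socket listing (state in column 1, local address in column 4)
  local established
  established=$(printf '%s\n' "$1" | awk '$1 == "ESTAB" && $4 ~ /:443$/' | wc -l)
  [ "$established" -eq 0 ]
}

# Live usage (sketch): drain_complete "$(ss -ant)" && echo "safe to terminate"
```

A pipeline would loop on this with a timeout, so a hung keepalive connection cannot block the deployment forever.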
Common Mistakes
- Terminating the VM immediately after unbinding.
- Judging drain as complete just because the health check fails.
- Forgetting that each device names its drain option differently, so the operation runbook must be device-specific.
2. Connection lifecycle: From SYN to FIN
Key takeaways
- Normal termination is based on `FIN`; abnormal termination mostly shows up as `RST` or a timeout.
- At L4, `Connection Draining` ultimately means "block new SYN + wait for existing ESTABLISHED connections to drain out".
Detailed description
Connection flow for a typical request:
- Client -> L4: `SYN`
- L4 -> Nginx (backend VM): forward `SYN`
- After the 3-way handshake completes: `ESTABLISHED`
- Data exchange (HTTP keepalive or WebSocket)
- At the end: `FIN -> ACK -> FIN -> ACK`
The problem comes after unbinding.
- New `SYN`s go to other VMs,
- but existing `ESTABLISHED` connections remain on the old VM.
- If that VM is forcibly shut down, connections end with RST/timeout instead of FIN, and client errors explode.
Practical tips
- Group drain observation metrics into 3 axes: `new`, `active`, and `reset`.
- Don't rely on the L4 log alone; also inspect ESTABLISHED/CLOSE_WAIT on the VM with `ss -ant`.
Common Mistakes
- Watching only the HTTP request success rate and missing the increase in TCP resets.
- Underestimating drain time by assuming keepalive connections behave like short-lived requests.
```shell
# Check per-server connection states
ss -ant | awk 'NR==1 || /:443/'
# Track reset patterns (kernel counters)
netstat -s | egrep -i 'reset|failed|retrans'
```
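To catch a reset spike during drain, comparing before/after snapshots of the kernel counter is usually enough. A minimal sketch, assuming `netstat -s`-style text (the exact counter wording varies by kernel version):

```shell
# Extract the "connection resets received" counter from `netstat -s`-style
# output so a before/after delta can be computed around a drain window.
reset_count() {
  printf '%s\n' "$1" | awk '/connection resets received/ {print $1; exit}'
}

# Sketch of a delta check (threshold 100 is an arbitrary example):
#   before=$(reset_count "$(netstat -s)")
#   ... drain ...
#   after=$(reset_count "$(netstat -s)")
#   [ $((after - before)) -lt 100 ] || echo "reset spike during drain"
```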
3. Architecture diagram: bind/unbind lifecycle
Key takeaways
- The key turning point is the `BOUND -> DRAINING -> UNBOUND` state transition.
- During this window, the application must prepare for Graceful Shutdown.
Detailed description
```
[Before]
Client -> L4 -> VM-A (new + existing)
Client -> L4 -> VM-B (new + existing)

[After unbind VM-A + drain]
Client -> L4 -X-> VM-A (new blocked)
Client -> L4 ---> VM-B (new accepted)
Existing VM-A connections remain until FIN/timeout
```
Practical tips
- Standardizing LB status as `BOUND`, `DRAINING`, `DETACHED` in internal documents reduces communication errors between operators.
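The standardized states can also be enforced mechanically in tooling. A minimal sketch that rejects transitions other than `BOUND -> DRAINING -> DETACHED` (plus re-binding); the state names follow the internal convention suggested above, not any LB API.

```shell
# Allow only the documented LB state transitions; anything else (e.g. jumping
# straight from BOUND to DETACHED without draining) is rejected.
valid_transition() {
  case "$1->$2" in
    'BOUND->DRAINING'|'DRAINING->DETACHED'|'DETACHED->BOUND') return 0 ;;
    *) return 1 ;;
  esac
}
```

A deployment script can call this guard before each step and log the timestamped transition, which also fixes the "abstract drain records" mistake below.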
Common Mistakes
- Recording the drain state only abstractly, without logging transition events (timestamps/metrics).
4. L4 vs L7 Load Balancer Selection Criteria
Key takeaways
- L4 excels at TCP session stability and raw performance; L7 excels at request-level control.
- With many long-lived WebSocket connections, an L4-centric design stays simple, but the drain policy must be stricter.
Detailed description
| Item | L4 Load Balancer | L7 Load Balancer |
|---|---|---|
| Control unit | TCP connection | HTTP request/stream |
| Strengths | Low overhead, high throughput | Routing/header/cookie-based control |
| Weaknesses | Limited request-level policies | Higher proxy overhead |
| WebSockets | Favorable for keeping long-lived sessions | Upgrade handling quality matters |
| Failure symptoms | RST/timeout centered | 5xx/timeout centered |
Practical tips
- `WebSocket Load Balancing` is generally built as L4-only or a mixed L4 + Nginx (L7) structure.
- `Zero Downtime Deployment` is possible only when L4 draining and Nginx/App `Graceful Shutdown` are performed together.
Common Mistakes
- Adding excessive layers and increasing latency even though L7 features are not needed.
- Assuming L4 alone is safe and omitting app-level graceful handling.
Operational Checklist
- Did you switch the target VM to `unbind + drain` before starting maintenance?
- Did you confirm `new connections = 0` during drain?
- Is the `reset count` not spiking compared to normal?
- Did you check the changes in `ESTABLISHED`, `CLOSE_WAIT`, and `TIME_WAIT` on the VM?
- Do the app graceful timeout and the LB drain timeout not conflict?
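The socket-state items in the checklist can be captured with one helper. A sketch that tallies states from `ss -ant`-style output (a header line on the first row is assumed), so a before/after drain comparison is a simple diff:

```shell
# Tally socket states (ESTAB, CLOSE-WAIT, TIME-WAIT, ...) from `ss -ant`
# output, skipping the header row, sorted for stable comparison.
state_summary() {
  printf '%s\n' "$1" | awk 'NR > 1 {count[$1]++} END {for (s in count) print s, count[s]}' | sort
}

# Live usage (sketch): state_summary "$(ss -ant)"
```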
Summary
bind/unbind is not a mere device setting; it is control over connection lifetimes. Real Graceful Shutdown and Zero Downtime Deployment hold only when the shutdown sequence of L4, Nginx, and the app is aligned around Connection Draining.
Next episode preview
In the next part, we deeply analyze the internal operation of Connection Draining from the perspective of TCP state transitions (SYN, FIN, RST, TIME_WAIT, CLOSE_WAIT).
Reference links
- RFC 9293: Transmission Control Protocol
- HAProxy Documentation - Connection Handling
- Nginx WebSocket Proxying
- Blog: Queue Backpressure Patterns
Series navigation
- Previous post: Part 1. Why does the service crash if you simply remove the server?
- Next post: Part 3. Internal operation of Connection Draining (TCP perspective)