Part 2. @Scheduled in action: The price of simplicity and the multi-instance pitfalls


Versions used

Java 21
Spring Boot 3.3.x
Spring Batch 5.2.x
Quartz 2.3.x
PostgreSQL 15
OpenSearch 2.x

1) Problem statement

@Scheduled has the lowest implementation cost. It runs directly inside a Spring Boot application and has virtually no learning curve, so most teams start with it. However, as soon as the number of service instances grows from one to two, the same cron job almost inevitably runs twice at the same time.

The important operational questions are simple.

Is it safe to execute this task more than once?
Can the execution time be longer than the schedule interval?
When an instance restarts after a failure, can the new run overlap with the previous one?

@Scheduled itself does not answer these questions. The answers come from locking, idempotency, and an execution history design.

2) Summary of key concepts

Advantages of `@Scheduled`

The code is easy to read and quick to adopt.
It uses the same DI and transaction model as the rest of the application context.
It is efficient for a small team operating roughly one to five tasks.

Operating Limits

There is no concept of a leader in a clustered environment; every instance fires the schedule independently.
No misfire policy (for example, catching up after a run was missed) is provided by default.
Execution history, retries, and concurrency control must be implemented yourself.
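
Since execution history has to be built by hand, a thin wrapper around each scheduled task is one way to start. The following is a minimal sketch in plain Java, not a definitive implementation: `JobHistoryRecorder` and `JobRun` are hypothetical names, and the in-memory list only stands in for a real history table.

```java
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper: records one history row per scheduled run.
// The in-memory list stands in for a real history table.
class JobHistoryRecorder {

    record JobRun(String jobName, Instant startedAt, Instant endedAt,
                  String status, String errorMessage) {}

    private final List<JobRun> runs = new CopyOnWriteArrayList<>();

    // Runs the task and stores SUCCESS or FAILED together with timestamps.
    // The exception is caught so that the failure is recorded; a real job
    // might also rethrow it or raise an alert.
    void runAndRecord(String jobName, Runnable task) {
        Instant startedAt = Instant.now();
        try {
            task.run();
            runs.add(new JobRun(jobName, startedAt, Instant.now(), "SUCCESS", null));
        } catch (RuntimeException e) {
            runs.add(new JobRun(jobName, startedAt, Instant.now(), "FAILED", e.getMessage()));
        }
    }

    // Returns an immutable snapshot of the recorded runs.
    List<JobRun> runs() {
        return List.copyOf(runs);
    }

    public static void main(String[] args) {
        JobHistoryRecorder recorder = new JobHistoryRecorder();
        recorder.runAndRecord("settlement", () -> {});
        recorder.runAndRecord("settlement", () -> {
            throw new IllegalStateException("downstream timeout");
        });
        recorder.runs().forEach(System.out::println);
    }
}
```

In a scheduled method, the body would be passed as the `Runnable`, so every tick leaves a row behind whether it succeeded or failed.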

Lock Strategy Comparison (Summary)

Strategy	Advantages	Disadvantages	Recommended situation
DB lock (`FOR UPDATE`/advisory lock)	No additional infrastructure required, strong consistency	Increased DB load	Operations already centered on an RDBMS
Redis lock (`SET NX PX`)	Fast lock acquisition, suitable for distributed environments	TTL and network partitions must be considered	High-frequency scheduling, multiple apps
No lock + idempotency	Simple to implement, highly fault tolerant	Duplicate execution itself still happens	Aggregation/synchronization without side effects
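
The "no lock + idempotency" row accepts that a task may run twice and makes the second run harmless. A minimal sketch of that idea in plain Java follows; `IdempotentSettlement` and `SettlementKey` are hypothetical names, and the in-memory set is only a stand-in for a unique constraint such as UNIQUE(order_id, settlement_date) in the database.

```java
import java.time.LocalDate;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an in-memory set plays the role of a
// UNIQUE(order_id, settlement_date) constraint on the settlement table.
class IdempotentSettlement {

    record SettlementKey(long orderId, LocalDate settlementDate) {}

    private final Set<SettlementKey> settled = ConcurrentHashMap.newKeySet();

    // Returns true if this call performed the settlement, false if an earlier
    // (possibly concurrent) execution already did. Set.add is atomic here,
    // just as a unique index lets only one INSERT for the same key succeed.
    boolean settle(long orderId, LocalDate settlementDate) {
        return settled.add(new SettlementKey(orderId, settlementDate));
    }

    int settledCount() {
        return settled.size();
    }

    public static void main(String[] args) {
        IdempotentSettlement service = new IdempotentSettlement();
        LocalDate day = LocalDate.of(2024, 1, 15);
        System.out.println(service.settle(42L, day)); // first execution settles
        System.out.println(service.settle(42L, day)); // duplicate execution is a no-op
    }
}
```

The duplicate run still costs CPU and I/O, which is the disadvantage the table names, but it cannot produce a second settlement for the same key.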

The transaction isolation level usually starts at READ COMMITTED. If the job re-reads rows with the same condition inside one transaction and needs a stable result, combine it with REPEATABLE READ or a lock hint.


3) Code examples

Example A: `@Scheduled` + distributed lock wrapper

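import java.time.Duration;
import java.time.LocalDate;
import java.util.UUID;
import lombok.RequiredArgsConstructor;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@RequiredArgsConstructor
public class SettlementScheduler {

    private final DistributedLockService lockService;
    private final SettlementService settlementService;

    // Fires every 5 minutes on every instance; the lock decides which one runs.
    @Scheduled(cron = "0 */5 * * * *")
    public void run() {
        String lockKey = "batch:settlement:5m";
        // Unique per run, so that only the owner can release the lock.
        String lockToken = UUID.randomUUID().toString();

        // The TTL must exceed the worst-case run time, or the lock can expire mid-run.
        if (!lockService.tryLock(lockKey, lockToken, Duration.ofMinutes(4))) {
            return; // another instance holds the lock for this tick
        }

        try {
            settlementService.processPending(LocalDate.now());
        } finally {
            lockService.unlock(lockKey, lockToken);
        }
    }
}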

Example B: Redis lock implementation core

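import java.time.Duration;
import java.util.List;
import lombok.RequiredArgsConstructor;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.data.redis.core.script.RedisScript;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class RedisDistributedLockService implements DistributedLockService {

    // Compare-and-delete in one atomic step. A separate GET followed by DEL
    // could delete a lock that expired and was re-acquired by another instance.
    private static final RedisScript<Long> UNLOCK_SCRIPT = new DefaultRedisScript<>(
            "if redis.call('get', KEYS[1]) == ARGV[1] then "
                    + "return redis.call('del', KEYS[1]) else return 0 end",
            Long.class);

    private final StringRedisTemplate redisTemplate;

    @Override
    public boolean tryLock(String key, String token, Duration ttl) {
        // Set-if-absent with an expiry: succeeds only if the key does not exist yet.
        Boolean result = redisTemplate.opsForValue().setIfAbsent(key, token, ttl);
        return Boolean.TRUE.equals(result);
    }

    @Override
    public void unlock(String key, String token) {
        // Deletes the key only if it still holds this caller's token.
        redisTemplate.execute(UNLOCK_SCRIPT, List.of(key), token);
    }
}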
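
Why the lock carries both a token and a TTL can be checked without Redis. The sketch below is an in-memory stand-in, offered for illustration only: `InMemoryTtlLock` is a hypothetical class, and the current time is passed in explicitly so that expiry can be simulated deterministically.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a Redis lock, for illustration only.
// Time is passed in explicitly so that expiry can be simulated.
class InMemoryTtlLock {

    private record Entry(String token, Instant expiresAt) {}

    private final Map<String, Entry> entries = new HashMap<>();

    // Like set-if-absent with an expiry: succeeds only if the key is absent
    // or its previous entry has already expired.
    synchronized boolean tryLock(String key, String token, Duration ttl, Instant now) {
        Entry current = entries.get(key);
        if (current != null && current.expiresAt().isAfter(now)) {
            return false;
        }
        entries.put(key, new Entry(token, now.plus(ttl)));
        return true;
    }

    // Compare-and-delete as one atomic step: only the token of the current
    // owner may release the lock.
    synchronized boolean unlock(String key, String token) {
        Entry current = entries.get(key);
        if (current != null && current.token().equals(token)) {
            entries.remove(key);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        InMemoryTtlLock lock = new InMemoryTtlLock();
        Instant t0 = Instant.parse("2024-01-15T00:00:00Z");
        String key = "batch:settlement:5m";

        lock.tryLock(key, "instance-A", Duration.ofMinutes(4), t0);
        // A overruns its TTL, so B acquires the lock 5 minutes later.
        lock.tryLock(key, "instance-B", Duration.ofMinutes(4), t0.plus(Duration.ofMinutes(5)));
        // A finishes late; its unlock must not release B's lock.
        System.out.println(lock.unlock(key, "instance-A")); // false
        System.out.println(lock.unlock(key, "instance-B")); // true
    }
}
```

The token protects the second owner from a late `unlock`, but it does not stop the first owner from still running after its TTL expired. With a real Redis, the comparison and the delete also have to execute atomically, for example in a Lua script, for the same guarantee to hold.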

Example C: SQL using DB advisory lock (PostgreSQL)

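-- Hash the job name into a key and try to take a session-level advisory lock
SELECT pg_try_advisory_lock(hashtext('batch:settlement:5m')) AS acquired;

-- Release after processing completes (on the same session that acquired the lock)
SELECT pg_advisory_unlock(hashtext('batch:settlement:5m'));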

Example D: Keyset-based failed execution history query

SELECT id, job_name, started_at, ended_at, error_code
FROM batch_job_execution
WHERE job_name = 'settlement'
  AND status = 'FAILED'
  AND id > :last_id
ORDER BY id ASC
LIMIT 100;
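
The query returns one page; the caller is expected to feed the last `id` of each page back in as `:last_id`. A minimal sketch of that loop in plain Java follows. `FailedRunPager` and `Execution` are hypothetical names, and an in-memory list stands in for the `batch_job_execution` table so the paging logic can be followed without a database.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the keyset loop around the query above.
// An in-memory list stands in for the batch_job_execution table.
class FailedRunPager {

    record Execution(long id, String jobName, String status) {}

    private final List<Execution> table;

    FailedRunPager(List<Execution> table) {
        this.table = table;
    }

    // Equivalent of: WHERE job_name = ? AND status = 'FAILED' AND id > ?
    //                ORDER BY id ASC LIMIT ?
    List<Execution> fetchPage(String jobName, long lastId, int limit) {
        return table.stream()
                .filter(e -> e.jobName().equals(jobName))
                .filter(e -> e.status().equals("FAILED"))
                .filter(e -> e.id() > lastId)
                .sorted(Comparator.comparingLong(Execution::id))
                .limit(limit)
                .toList();
    }

    // Walks all pages: the last id of each page becomes :last_id of the next.
    List<Long> collectFailedIds(String jobName, int pageSize) {
        List<Long> ids = new ArrayList<>();
        long lastId = 0L; // assumes ids are positive
        while (true) {
            List<Execution> page = fetchPage(jobName, lastId, pageSize);
            if (page.isEmpty()) {
                break;
            }
            page.forEach(e -> ids.add(e.id()));
            lastId = page.get(page.size() - 1).id();
        }
        return ids;
    }

    public static void main(String[] args) {
        FailedRunPager pager = new FailedRunPager(List.of(
                new Execution(1, "settlement", "FAILED"),
                new Execution(2, "settlement", "COMPLETED"),
                new Execution(4, "settlement", "FAILED")));
        System.out.println(pager.collectFailedIds("settlement", 1));
    }
}
```

Unlike OFFSET paging, each page starts from an indexed `id` comparison, so the cost of a page does not grow with how far the reader has already advanced.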

4) Real-world failure/operational scenarios

Situation: A settlement batch on a 5-minute cycle took 4 minutes 50 seconds to finish, and the heartbeat that renews the lock was interrupted by a GC pause of about 20 seconds. The Redis TTL expired, another instance re-executed the same task, and the result was duplicate settlement.

Cause:

The lock TTL was set from the average execution time only.
Long stop-the-world pauses (G1 mixed GC) were not considered in the design.
There was no business-level idempotency constraint such as UNIQUE(order_id, settlement_date).

Response:

Recalculate the TTL as P99 execution time + GC slack time (e.g. 4m50s -> 8m).
If the heartbeat fails, stop the job immediately and persist a flag that prevents re-entry.
Add a unique index to the domain table to enforce idempotency.
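
The TTL rule in the first response item can be written down as a small calculation. This is a sketch under stated assumptions: `LockTtlCalculator` is a hypothetical helper, the percentile uses the nearest-rank definition, and the sample durations in `main` are made up for illustration rather than taken from the incident.

```java
import java.time.Duration;
import java.util.Arrays;

// Hypothetical helper: derives a lock TTL from observed run durations.
class LockTtlCalculator {

    // Nearest-rank P99: the smallest sample such that at least 99% of all
    // samples are less than or equal to it.
    static Duration p99(Duration[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        Duration[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Integer ceiling of 0.99 * n, as a 1-based rank.
        int rank = (99 * sorted.length + 99) / 100;
        return sorted[rank - 1];
    }

    // TTL = P99 execution time + slack for GC pauses.
    static Duration lockTtl(Duration[] samples, Duration gcMargin) {
        return p99(samples).plus(gcMargin);
    }

    public static void main(String[] args) {
        // Made-up samples: 100 runs taking 1s, 2s, ..., 100s.
        Duration[] samples = new Duration[100];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = Duration.ofSeconds(i + 1);
        }
        System.out.println(lockTtl(samples, Duration.ofSeconds(30)));
    }
}
```

Feeding the calculator from recorded execution history, rather than from a one-off measurement, keeps the TTL in step with how the job actually behaves as data volume grows.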

5) Design Checklist

Have you specified a strategy (locking or idempotency) to prevent duplicate execution across multiple instances?
Did you calculate the lock TTL from P99 + GC margin rather than from the average?
Is it safe to re-run the same time window when a failure occurs?
Is the schedule execution history (success/failure/duration) saved?
Have you deliberately chosen the DB isolation level and lock hints (SKIP LOCKED, etc.)?
Have you measured the impact of JVM heap usage and GC pauses on the schedule interval?

6) Summary

@Scheduled is very powerful for small, simple batch jobs, but it cannot guarantee stability in a cluster without additional design. Locking alone is not enough; it must be combined with idempotency, execution history, and failure recovery before a batch job is ready for production.

7) Next episode preview

In the next part, we will cover Quartz. From a practical perspective, we will look at cluster mode, JobStore (RAM vs JDBC), misfire policy, and the criteria for choosing Quartz when operating hundreds to thousands of schedules.

Series: Complete Guide to Spring Boot Batch Strategies
