Part 2. @Scheduled in action: The price of simplicity and the multi-instance pitfalls
This part lays out, from an architectural perspective, the duplicate-execution, distributed-lock, and failure-recovery trade-offs that arise when @Scheduled runs in a production environment.
Series: Complete Guide to Spring Boot Batch Strategies
12 parts in total. You are reading Part 2.
- Part 1. Nature and classification of batches: schedule, event, manual, bulk, near-real-time
- Part 2. @Scheduled in action: The price of simplicity and the multi-instance pitfalls (current)
- Part 3. Quartz cluster architecture: JobStore, Misfire, large-scale schedule management
- Part 4. Spring Batch core: Chunk, transaction boundary, restartable job design
- Part 5. Spring Batch extension: Trade-off between Partition and Multi-threaded Step
- Part 6. Manual batch strategy: REST triggers, Admin UI, parameter reprocessing, rollback
- Part 7. DB bulk read strategy: OFFSET/LIMIT limits and Keyset, ID Range, Covering Index
- Part 8. OpenSearch/Elasticsearch batch strategy: Scroll, Search After, PIT, Bulk, Rollover
- Part 9. Batches in a distributed environment: Leader Election, Kubernetes CronJob, and lock strategy comparison
- Part 10. Performance optimization: Batch Size, Commit Interval, JVM Memory, Backpressure
- Part 11. Failure response architecture: Partial Failure, Poison Data, DLQ, Retry, Idempotence
- Part 12. Integrated reference architecture and final selection guide

Versions used
- Java 21
- Spring Boot 3.3.x
- Spring Batch 5.2.x
- Quartz 2.3.x
- PostgreSQL 15
- OpenSearch 2.x
1) Problem statement
@Scheduled has the lowest implementation cost. It runs directly inside a Spring Boot application, and there is virtually no learning curve, so most teams start with it. However, as soon as the number of service instances grows from 1 to 2, the same cron fires on both instances at the same time, and the job runs twice unless something prevents it.
The important question in operations is simple.
- Is it safe to execute this task repeatedly?
- Can the execution time be longer than the schedule interval?
- When restarting after an instance failure, can previous work overlap?
@Scheduled itself does not answer these questions. The answers come from how locking, idempotency, and execution history are designed.
2) Summary of key concepts
Advantages of @Scheduled
- The code is readable and quick to adopt.
- It uses the same DI and transaction model as the rest of the application context.
- It is efficient for a small team operating roughly 1 to 5 tasks.
Operating Limits
- There is no concept of a leader in a cluster environment; every instance fires the schedule independently.
- A misfire policy (e.g. catching up after a run was missed) is not provided by default.
- Execution history, retries, and concurrency control must be implemented yourself.
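As an illustration of the last limit, the sketch below wraps a task and records one history entry per run. It is a minimal example under stated assumptions, not a recommended design: `JobRunRecorder` and `JobRun` are hypothetical names, and the in-memory list stands in for a history table in a real application.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: records one history entry per scheduled run. */
final class JobRunRecorder {

    /** One row of execution history (a database row in a real system). */
    static final class JobRun {
        final String jobName;
        final Instant startedAt;
        final Duration elapsed;
        final String status;       // "SUCCESS" or "FAILED"
        final String errorMessage; // null on success

        JobRun(String jobName, Instant startedAt, Duration elapsed,
               String status, String errorMessage) {
            this.jobName = jobName;
            this.startedAt = startedAt;
            this.elapsed = elapsed;
            this.status = status;
            this.errorMessage = errorMessage;
        }
    }

    private final List<JobRun> history = Collections.synchronizedList(new ArrayList<>());

    /** Runs the task and records the outcome; a failure is recorded instead of rethrown. */
    void run(String jobName, Runnable task) {
        Instant startedAt = Instant.now();
        long startNanos = System.nanoTime();
        String status = "SUCCESS";
        String error = null;
        try {
            task.run();
        } catch (RuntimeException e) {
            status = "FAILED";
            error = e.toString();
        }
        Duration elapsed = Duration.ofNanos(System.nanoTime() - startNanos);
        history.add(new JobRun(jobName, startedAt, elapsed, status, error));
    }

    /** Returns a snapshot copy of the recorded runs. */
    List<JobRun> history() {
        return new ArrayList<>(history);
    }
}
```

A scheduled method could call `recorder.run("settlement", () -> ...)` so that success, failure, and elapsed time are captured for every cycle.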
Lock Strategy Comparison (Summary)
| Strategy | Advantages | Disadvantages | Recommended Situation |
|---|---|---|---|
| DB lock (FOR UPDATE/advisory lock) | No additional infrastructure required, strong consistency | Increased DB load | Operations already centered on an RDBMS |
| Redis lock (SET NX PX) | Fast lock acquisition, suitable for distributed environments | TTL and network partitions must be considered | High-frequency scheduling, multiple apps |
| No lock + idempotency | Simple to implement, highly fault tolerant | Duplicate execution itself still occurs | Aggregation/synchronization without side effects |
The transaction isolation level usually starts at READ COMMITTED. If the job re-reads rows with the same condition inside one transaction and expects the same result, use REPEATABLE READ or add a lock hint.
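The "No lock + idempotency" row can be pictured with a small in-memory model. This is only a sketch: `IdempotentSettlementStore` is a hypothetical class, and the map stands in for a table protected by a unique constraint such as UNIQUE(order_id, settlement_date). Duplicate runs still happen, but only the first one has an effect.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a table with UNIQUE(order_id, settlement_date). */
final class IdempotentSettlementStore {

    private final Map<String, Long> settledAmounts = new ConcurrentHashMap<>();

    /** Returns true if this call created the settlement, false if it already existed. */
    boolean settleOnce(long orderId, LocalDate settlementDate, long amount) {
        String key = orderId + "|" + settlementDate;
        // putIfAbsent is atomic, comparable to an INSERT that a unique index rejects on conflict.
        return settledAmounts.putIfAbsent(key, amount) == null;
    }

    /** Number of distinct settlements stored. */
    int count() {
        return settledAmounts.size();
    }
}
```

If two instances both call `settleOnce(42L, date, 1000L)`, one gets `true` and the other gets `false`, and the store holds a single settlement.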
Execution flow diagram

3) Code example
Example A: @Scheduled + distributed lock wrapper
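import java.time.Duration;
import java.time.LocalDate;
import java.util.UUID;
import lombok.RequiredArgsConstructor;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@RequiredArgsConstructor
public class SettlementScheduler {
    private final DistributedLockService lockService;
    private final SettlementService settlementService;

    // Fires every 5 minutes on every instance; only the lock holder does the work.
    @Scheduled(cron = "0 */5 * * * *")
    public void run() {
        String lockKey = "batch:settlement:5m";
        // A unique token lets unlock() release only the lock this run acquired.
        String lockToken = UUID.randomUUID().toString();
        // The TTL must exceed the worst-case run time (see section 4).
        if (!lockService.tryLock(lockKey, lockToken, Duration.ofMinutes(4))) {
            return; // another instance holds the lock; skip this cycle
        }
        try {
            settlementService.processPending(LocalDate.now());
        } finally {
            lockService.unlock(lockKey, lockToken);
        }
    }
}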
Example B: Redis lock implementation core
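import java.time.Duration;
import java.util.List;
import lombok.RequiredArgsConstructor;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class RedisDistributedLockService implements DistributedLockService {
    // Compare and delete in one atomic step. A separate GET followed by DEL could
    // delete a lock that expired and was re-acquired by another instance in between.
    private static final DefaultRedisScript<Long> UNLOCK_SCRIPT = new DefaultRedisScript<>(
            "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) "
                    + "else return 0 end",
            Long.class);

    private final StringRedisTemplate redisTemplate;

    @Override
    public boolean tryLock(String key, String token, Duration ttl) {
        // Equivalent to SET key token NX PX ttl
        Boolean result = redisTemplate.opsForValue().setIfAbsent(key, token, ttl);
        return Boolean.TRUE.equals(result);
    }

    @Override
    public void unlock(String key, String token) {
        redisTemplate.execute(UNLOCK_SCRIPT, List.of(key), token);
    }
}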
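To see why unlock compares the token before deleting, the following in-memory model mimics SET NX PX with a token-checked release. It is a sketch for illustration and tests only: `InMemoryLockModel` is a hypothetical class, and the explicit `now` parameter is an assumption that makes TTL expiry easy to simulate.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** In-memory model of SET NX PX plus token-checked release; not for production use. */
final class InMemoryLockModel {

    private static final class Entry {
        final String token;
        final Instant expiresAt;

        Entry(String token, Instant expiresAt) {
            this.token = token;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> locks = new HashMap<>();

    /** Acquires the lock if it is free or its TTL has passed at time 'now'. */
    synchronized boolean tryLock(String key, String token, Duration ttl, Instant now) {
        Entry current = locks.get(key);
        if (current != null && current.expiresAt.isAfter(now)) {
            return false; // still held by another owner
        }
        locks.put(key, new Entry(token, now.plus(ttl)));
        return true;
    }

    /** Releases only when the stored token matches; returns whether a lock was removed. */
    synchronized boolean unlock(String key, String token) {
        Entry current = locks.get(key);
        if (current != null && current.token.equals(token)) {
            locks.remove(key);
            return true;
        }
        return false;
    }
}
```

If instance A's lock expires and instance B acquires it, A's late `unlock` call finds B's token and removes nothing, so B keeps its lock.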
Example C: SQL using DB advisory lock (PostgreSQL)
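-- Convert the job name to a hash and use it as the advisory lock key.
-- Advisory locks taken this way are session-level: they are held until unlocked
-- or until the session ends, so run both statements on the same connection.
SELECT pg_try_advisory_lock(hashtext('batch:settlement:5m')) AS acquired;
-- Release after processing completes.
SELECT pg_advisory_unlock(hashtext('batch:settlement:5m'));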
Example D: Keyset-based failed execution history query
SELECT id, job_name, started_at, ended_at, error_code
FROM batch_job_execution
WHERE job_name = 'settlement'
AND status = 'FAILED'
AND id > :last_id
ORDER BY id ASC
LIMIT 100;
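The query above returns one page; the caller is expected to feed the last ID it saw back in as `:last_id`. A minimal sketch of that loop follows. `KeysetPager` and the `fetchPage` function are hypothetical, and the code assumes positive IDs returned in ascending order, as the ORDER BY clause implies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical driver loop for a keyset-paginated query. */
final class KeysetPager {

    /**
     * Repeatedly calls fetchPage(lastId, pageSize), which is assumed to return IDs
     * greater than lastId in ascending order, until a short page signals the end.
     */
    static List<Long> readAll(BiFunction<Long, Integer, List<Long>> fetchPage, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<Long> all = new ArrayList<>();
        long lastId = 0L; // assumes IDs start above 0
        while (true) {
            List<Long> page = fetchPage.apply(lastId, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // short (or empty) page: nothing left to read
            }
            lastId = page.get(page.size() - 1); // cursor is the last ID seen
        }
    }
}
```

Because each page starts from the last ID rather than an offset, the cost of a page does not grow with how far the scan has progressed.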
4) Real-world failure/operational scenarios
Situation: A settlement batch on a 5-minute cycle normally finished in 4 minutes 50 seconds. A GC pause of about 20 seconds interrupted the lock-renewal heartbeat, the Redis TTL expired, and another instance re-executed the same task, producing duplicate settlements.
Cause:
- The lock TTL was set based only on the average execution time.
- Long stop-the-world pauses (G1 mixed GC) were not considered in the design.
- There was no business-level idempotency constraint (UNIQUE(order_id, settlement_date)).
Response:
- Recalculate the TTL as P99 execution time + GC slack time (e.g. 4m50s -> 8m).
- On heartbeat failure, stop the operation immediately and persist a flag that prevents re-entry.
- Add a unique index to the domain table to enforce idempotency.
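The first response item can be sketched as a small calculation over measured run times. This is an illustration under assumptions the article does not spell out: `LockTtlCalculator` is a hypothetical class, the percentile uses the nearest-rank method, and the GC slack is a value the operator chooses.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: derives a lock TTL from observed run times. */
final class LockTtlCalculator {

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static Duration percentile(List<Duration> samples, int p) {
        if (samples.isEmpty() || p < 1 || p > 100) {
            throw new IllegalArgumentException("need at least one sample and 1 <= p <= 100");
        }
        List<Duration> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        // Integer ceiling of p * n / 100 gives the 1-based rank.
        int rank = (p * sorted.size() + 99) / 100;
        return sorted.get(rank - 1);
    }

    /** TTL = P99 run time + slack for GC pauses. */
    static Duration lockTtl(List<Duration> samples, Duration gcSlack) {
        return percentile(samples, 99).plus(gcSlack);
    }
}
```

With only a handful of samples the P99 collapses to the slowest observed run, so the result is only as good as the measurement window behind it.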
5) Design Checklist
- Have you specified a strategy (locking or idempotency) to prevent duplicate execution across multiple instances?
- Did you calculate the lock TTL from P99 plus a GC margin rather than from the average?
- Is it safe to re-run the same time window after a failure?
- Is the schedule execution history (success/failure/elapsed time) saved?
- Have you deliberately chosen the DB isolation level and lock hint (SKIP LOCKED, etc.)?
- Have you measured the impact of JVM heap usage and GC pauses on the schedule interval?
6) Summary
@Scheduled is very effective for small, simple batches, but it cannot guarantee stability in a cluster without additional design. Locking alone is not enough; idempotency, execution history, and failure recovery must be combined to make a batch production-ready.
7) Next episode preview
The next part covers Quartz. From a practical perspective, it summarizes cluster mode, JobStore (RAM vs JDBC), Misfire policy, and the criteria for choosing Quartz when operating hundreds to thousands of schedules.