Part 5. Spring Batch Extension: Tradeoff between Partition and Multi-threaded Step


Versions used

Java 21
Spring Boot 3.3.x
Spring Batch 5.2.x
Quartz 2.3.x
PostgreSQL 15
OpenSearch 2.x

1) Problem statement

When a throughput bottleneck appears, teams almost instinctively add threads. But parallelizing a batch job is not a simple acceleration button. As the degree of parallelism grows, the following problems grow with it.

Ordering guarantees break down
Lock contention increases
The reprocessing range after a failure widens
Memory usage rises sharply

So the parallelization method should be chosen for "a structure that can be recovered after a failure" rather than for "maximum TPS".

2) Summary of key concepts

Comparison of parallelization methods

Method	Advantages	Disadvantages	Suitable data
Multi-threaded Step	Simple to implement, quick to apply	Reader and Writer must be thread-safe	Independent records where ordering matters little
Partition Step	Fewer collisions because each worker owns a separate range	Partition design is complex	Bulk data that can be sharded by ID range
Remote Chunking	Well suited to horizontal scaling	Requires messaging infrastructure and adds operational complexity	Very large volumes, long-running processing
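The "Reader and Writer must be thread-safe" requirement for a Multi-threaded Step can be illustrated without any framework. The following is a minimal sketch, assuming a hypothetical `SimpleReader` interface that stands in for a stateful item reader; it is not Spring Batch's own API. It wraps a non-thread-safe reader so that only one thread advances it at a time, which is the same idea behind Spring Batch's `SynchronizedItemStreamReader`.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

public class SynchronizedReaderSketch {

    /** Hypothetical minimal reader contract: returns null when the input is exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** Wraps a non-thread-safe reader so that only one thread advances it at a time. */
    static final class SynchronizedReader<T> implements SimpleReader<T> {
        private final SimpleReader<T> delegate;

        SynchronizedReader(SimpleReader<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T read() {
            return delegate.read();
        }
    }

    /** Builds a non-thread-safe, iterator-backed reader over 0..count-1. */
    static SimpleReader<Integer> iteratorReader(int count) {
        Iterator<Integer> it = IntStream.range(0, count).boxed().iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drains the reader with several threads and counts how often each item was delivered. */
    static Map<Integer, Integer> drain(SimpleReader<Integer> reader, int threads)
            throws InterruptedException {
        Map<Integer, Integer> deliveries = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                Integer item;
                while ((item = reader.read()) != null) {
                    deliveries.merge(item, 1, Integer::sum);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return deliveries;
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleReader<Integer> safe = new SynchronizedReader<>(iteratorReader(10_000));
        Map<Integer, Integer> deliveries = drain(safe, 8);
        boolean exactlyOnce = deliveries.values().stream().allMatch(c -> c == 1);
        System.out.println("items=" + deliveries.size() + ", exactlyOnce=" + exactlyOnce);
    }
}
```

Synchronizing the reader keeps reads correct but serializes them, and it does nothing for ordering: items are still processed in whatever order the threads pick them up.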

Partitioning criteria

ID Range: The simplest option and easy to restart.
Hash Partition: Distributes keys evenly, but debugging is harder.
Time Window: Well suited to log-style data.
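For comparison with the ID range approach, the other two criteria can be sketched as small pure functions. This is an illustrative sketch only; the class, record, and method names are hypothetical and not part of Spring Batch.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

public class PartitionKeySketch {

    /** Half-open time window [from, to). */
    record TimeWindow(LocalDateTime from, LocalDateTime to) {}

    /** Hash partition: maps a key to a partition index in [0, gridSize). */
    static int hashPartition(long key, int gridSize) {
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(Long.hashCode(key), gridSize);
    }

    /** Time window partition: splits [from, to) into consecutive windows of the given length. */
    static List<TimeWindow> timeWindows(LocalDateTime from, LocalDateTime to, Duration length) {
        if (length.isZero() || length.isNegative()) {
            throw new IllegalArgumentException("length must be positive");
        }
        List<TimeWindow> windows = new ArrayList<>();
        LocalDateTime cursor = from;
        while (cursor.isBefore(to)) {
            LocalDateTime next = cursor.plus(length);
            // The last window is clipped so it never extends past the requested end.
            windows.add(new TimeWindow(cursor, next.isBefore(to) ? next : to));
            cursor = next;
        }
        return windows;
    }

    public static void main(String[] args) {
        System.out.println("key 42 -> partition " + hashPartition(42L, 8));

        LocalDateTime dayStart = LocalDateTime.of(2024, 1, 1, 0, 0);
        timeWindows(dayStart, dayStart.plusDays(1), Duration.ofHours(6))
                .forEach(System.out::println);
    }
}
```

With hash partitioning, the rows of one partition are scattered across the whole ID space, which is why tracing a failed record back to its partition takes more effort than with a contiguous range.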

[Figure: Parallel processing diagram]


3) Code examples

Example A: ID Range Partitioner

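import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.jdbc.core.JdbcTemplate;

public class IdRangePartitioner implements Partitioner {

    private final JdbcTemplate jdbcTemplate;

    public IdRangePartitioner(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }

        // COALESCE guarantees a non-null result even when no PENDING rows exist (both values are then 0).
        long minId = jdbcTemplate.queryForObject("SELECT COALESCE(MIN(id),0) FROM orders WHERE status='PENDING'", Long.class);
        long maxId = jdbcTemplate.queryForObject("SELECT COALESCE(MAX(id),0) FROM orders WHERE status='PENDING'", Long.class);

        // Width of each ID range; the +1 ensures the ranges cover maxId with at most gridSize partitions.
        long targetSize = ((maxId - minId) / gridSize) + 1;
        Map<String, ExecutionContext> result = new HashMap<>();

        long start = minId;
        long end = start + targetSize - 1;
        int partition = 0;
        while (start <= maxId) {
            // Each worker receives an inclusive [startId, endId] range through its ExecutionContext.
            ExecutionContext context = new ExecutionContext();
            context.putLong("startId", start);
            context.putLong("endId", Math.min(end, maxId));
            result.put("partition" + partition, context);
            start += targetSize;
            end += targetSize;
            partition++;
        }
        return result;
    }
}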
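Example A splits the ID range into equal widths, which balances work only when PENDING rows are spread evenly across IDs. When they are not, boundaries can be weighted by row counts instead. The following is a minimal sketch of that idea, assuming per-bucket row counts are already available (for example from an earlier run or a `GROUP BY` over ID buckets); the `Bucket` and `Range` records are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class WeightedRangeSketch {

    /** Row count observed for one contiguous ID bucket [startId, endId]. */
    record Bucket(long startId, long endId, long rowCount) {}

    /** Inclusive ID range assigned to one partition, with its expected row count. */
    record Range(long startId, long endId, long rowCount) {}

    /** Greedily merges consecutive buckets until each range holds roughly total / gridSize rows. */
    static List<Range> split(List<Bucket> buckets, int gridSize) {
        if (gridSize <= 0) {
            throw new IllegalArgumentException("gridSize must be positive");
        }
        long total = buckets.stream().mapToLong(Bucket::rowCount).sum();
        long target = Math.max(1, (total + gridSize - 1) / gridSize); // rows per partition, rounded up

        List<Range> ranges = new ArrayList<>();
        boolean open = false;
        long rangeStart = 0;
        long rangeEnd = 0;
        long rows = 0;
        for (Bucket b : buckets) {
            if (!open) {
                rangeStart = b.startId();
                open = true;
            }
            rows += b.rowCount();
            rangeEnd = b.endId();
            if (rows >= target) {
                // Enough rows collected: close this range and start a new one.
                ranges.add(new Range(rangeStart, rangeEnd, rows));
                rows = 0;
                open = false;
            }
        }
        if (open) {
            ranges.add(new Range(rangeStart, rangeEnd, rows)); // leftover buckets
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<Bucket> buckets = List.of(
                new Bucket(1, 1000, 100),
                new Bucket(1001, 2000, 100),
                new Bucket(2001, 3000, 5000),
                new Bucket(3001, 4000, 4800));
        split(buckets, 2).forEach(System.out::println);
    }
}
```

In this sample an equal-width split into two partitions would give IDs 1 to 2000 only 200 rows and IDs 2001 to 4000 the remaining 9,800, while the weighted split produces ranges of 5,200 and 4,800 rows.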

Example B: Partition Step Settings

@Bean
public Step masterStep(JobRepository jobRepository,
                       Step workerStep,
                       Partitioner partitioner,
                       TaskExecutor taskExecutor) {
    return new StepBuilder("masterStep", jobRepository)
        .partitioner("workerStep", partitioner)
        .step(workerStep)
        .gridSize(8)
        .taskExecutor(taskExecutor)
        .build();
}
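Conceptually, the partition step in Example B fans the partitions out to a task executor and tracks each one separately, which is what keeps the restart range narrow. The following framework-free sketch imitates that behavior with a plain `ExecutorService`; it is a simplified model under that assumption, not how Spring Batch is implemented, and `IdRange` and `RangeWorker` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionFanOutSketch {

    record IdRange(long startId, long endId) {}

    /** Hypothetical worker: processes one range and throws on failure. */
    interface RangeWorker {
        void process(IdRange range) throws Exception;
    }

    /** Runs every partition on the pool and returns the names of the partitions that failed. */
    static List<String> run(Map<String, IdRange> partitions, RangeWorker worker, int poolSize)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Map<String, Future<?>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, IdRange> e : partitions.entrySet()) {
                futures.put(e.getKey(), pool.submit(() -> {
                    worker.process(e.getValue());
                    return null;
                }));
            }
            List<String> failed = new ArrayList<>();
            for (Map.Entry<String, Future<?>> f : futures.entrySet()) {
                try {
                    f.getValue().get();
                } catch (ExecutionException ex) {
                    // Only this partition needs to be rerun; the others are already done.
                    failed.add(f.getKey());
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, IdRange> partitions = new LinkedHashMap<>();
        partitions.put("partition0", new IdRange(1, 100));
        partitions.put("partition1", new IdRange(101, 200));
        partitions.put("partition2", new IdRange(201, 300));
        partitions.put("partition3", new IdRange(301, 400));

        // Simulated failure in exactly one range.
        RangeWorker flaky = range -> {
            if (range.startId() == 201) {
                throw new IllegalStateException("simulated failure");
            }
        };

        System.out.println("failed partitions: " + run(partitions, flaky, 4));
    }
}
```

A restart then only has to resubmit the failed entries, which is the property that a single multi-threaded step over the whole data set does not give you as directly.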

Example C: Partition target query SQL

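-- :start_id and :end_id are bound from the partition's startId and endId values.
-- With a plain LIMIT and no offset, repeated execution returns the same rows unless
-- processed rows leave the PENDING status or a keyset condition is added.
SELECT id, customer_id, total_amount
FROM orders
WHERE status = 'PENDING'
  AND id BETWEEN :start_id AND :end_id
ORDER BY id ASC
LIMIT :chunk_size;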

Example D: Index and lock contention mitigation

CREATE INDEX idx_orders_status_id ON orders (status, id);

-- Each worker skips rows already locked by another worker to minimize contention
SELECT id
FROM orders
WHERE status = 'PENDING'
  AND id BETWEEN :start_id AND :end_id
ORDER BY id
LIMIT :chunk_size
FOR UPDATE SKIP LOCKED;

4) Real-world failure/operational scenarios

Situation: Order aggregation was parallelized across 16 partitions, but duplicate rows for the same (order_id, stat_date) appeared in the result table.

Cause:

The Reader was split by partition range, but the Writer wrote to a shared aggregate key.
The result table had no unique index, so the race condition showed up directly as duplicate rows.
Under REPEATABLE READ, transactions stayed open for a long time, which caused lock waits to spike.

Action:

Add UNIQUE(order_id, stat_date) to the result table.
Change the Writer to INSERT ... ON CONFLICT DO UPDATE.
Size partitions by historical data volume (weighted partitions) instead of splitting the range into equal parts.
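The first two actions work together: the unique key makes a duplicate impossible, and the upsert makes a repeated write harmless. The following sketch shows the shape of such a statement and simulates its effect in memory. The table name `order_stats` and the column `total_amount` are hypothetical; only the (order_id, stat_date) key comes from the scenario above, and the in-memory map is a stand-in for the table, not a database test.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class IdempotentWriterSketch {

    // PostgreSQL upsert; requires a unique index on (order_id, stat_date).
    // Table and value column names are assumptions for illustration.
    static final String UPSERT_SQL = """
            INSERT INTO order_stats (order_id, stat_date, total_amount)
            VALUES (?, ?, ?)
            ON CONFLICT (order_id, stat_date)
            DO UPDATE SET total_amount = EXCLUDED.total_amount
            """;

    record StatKey(long orderId, LocalDate statDate) {}

    /** In-memory stand-in for a table with UNIQUE(order_id, stat_date). */
    static final class StatTable {
        private final Map<StatKey, Long> rows = new ConcurrentHashMap<>();

        /** Upsert: a second write for the same key replaces the value instead of adding a row. */
        void upsert(StatKey key, long totalAmount) {
            rows.put(key, totalAmount);
        }

        int rowCount() {
            return rows.size();
        }
    }

    /** Lets several writers store the same keys concurrently, as overlapping partitions would. */
    static StatTable writeConcurrently(int writers, int keys) throws InterruptedException {
        StatTable table = new StatTable();
        LocalDate statDate = LocalDate.of(2024, 1, 1);
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            pool.execute(() -> {
                for (long orderId = 1; orderId <= keys; orderId++) {
                    table.upsert(new StatKey(orderId, statDate), orderId * 10);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("writers did not finish in time");
        }
        return table;
    }

    public static void main(String[] args) throws InterruptedException {
        StatTable table = writeConcurrently(16, 100);
        System.out.println("rows after 16 concurrent writers: " + table.rowCount());
    }
}
```

Replacing the value is only idempotent when each writer computes the full aggregate for its key; if writers contribute partial sums, the update clause would need to accumulate instead, and reruns would then need their own deduplication.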

5) Design Checklist

Is it clear whether the purpose of parallelization is throughput or latency?
Does the partition key guarantee conflict-free write boundaries?
Is the Reader/Writer thread-safe?
When a partition fails, can it be recovered by restarting only a narrow range?
Have you estimated JVM heap usage (chunk size × thread count × object size)?
Are duplicate writes blocked by both a unique index and an idempotent writer?
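The heap item in the checklist is simple arithmetic and worth running before raising the thread count. The following is a rough sketch; the per-item size is an assumed figure that would have to be measured for real data, and the result is a lower bound because readers, processors, and writers may each hold their own copies.

```java
public class HeapEstimateSketch {

    /** Estimated bytes held by in-flight chunks: chunk size x thread count x bytes per item. */
    static long estimateBytes(int chunkSize, int threads, long bytesPerItem) {
        // multiplyExact fails loudly on overflow instead of returning a wrong number.
        return Math.multiplyExact(Math.multiplyExact((long) chunkSize, (long) threads), bytesPerItem);
    }

    public static void main(String[] args) {
        int chunkSize = 1_000;
        int threads = 8;
        long bytesPerItem = 2_048; // assumed average object size, measure for real data

        long estimate = estimateBytes(chunkSize, threads, bytesPerItem);
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.printf("in-flight chunks: %.1f MiB%n", estimate / (1024.0 * 1024.0));
        System.out.printf("share of max heap: %.2f%%%n", 100.0 * estimate / maxHeap);
    }
}
```

With these sample numbers the estimate is 16,384,000 bytes, about 15.6 MiB; doubling either the chunk size or the thread count doubles it.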

6) Summary

Parallel processing buys speed, but a weak design amplifies failures. Design the partition boundaries, write idempotency, and restart range first, and only then raise the thread count; that order is what yields performance you can actually operate.

7) Next episode preview

The next part covers strategies for running batch jobs manually. We design a "system in which operators can intervene safely", including REST triggers, execution from an Admin UI, reprocessing with parameters, and rollback/audit tracking.