Part 8. OpenSearch/Elasticsearch Batch Strategy: Scroll, Search After, PIT, Bulk, Rollover
This part summarizes the design criteria and operational trade-offs for OpenSearch batch jobs, covering both bulk querying and mass indexing from a practical perspective.
Series: Mastering Spring Boot Batch Strategies
12 parts in total. You are currently reading Part 8.
- Part 1. Nature and classification of batches: schedule, event, manual, bulk, near-real-time
- Part 2. @Scheduled in action: The price of simplicity and the multi-instance pitfalls
- Part 3. Quartz cluster architecture: JobStore, Misfire, large-scale schedule management
- Part 4. Spring Batch core: Chunk, transaction boundary, restartable job design
- Part 5. Spring Batch extension: Trade-offs between Partition and Multi-threaded Step
- Part 6. Manual batch strategy: REST triggers, Admin UI, parameter reprocessing, rollback
- Part 7. DB bulk read strategy: OFFSET/LIMIT limits and Keyset, ID Range, Covering Index
- Part 8. OpenSearch/Elasticsearch batch strategy: Scroll, Search After, PIT, Bulk, Rollover (current)
- Part 9. Batches in distributed environments: Leader Election, Kubernetes CronJob, and lock strategy comparison
- Part 10. Performance optimization: Batch Size, Commit Interval, JVM Memory, Backpressure
- Part 11. Failure response architecture: Partial Failure, Poison Data, DLQ, Retry, Idempotence
- Part 12. Integrated reference architecture and final selection guide

Versions used
- Java 21
- Spring Boot 3.3.x
- Spring Batch 5.2.x
- Quartz 2.3.x
- PostgreSQL 15
- OpenSearch 2.x
1) Problem statement
Search batches fail in different ways than DB batches. A database is organized around transactions, whereas in OpenSearch the state of segments, refreshes, and shards determines performance and consistency. The following problems are especially common during a full reindex:
- Cluster memory pressure caused by leaked Scroll contexts
- Missing or duplicated documents in Search After caused by a poorly designed sort key
- Cascading 429 (Too Many Requests) responses caused by an oversized bulk setting
- A sharp drop in indexing throughput caused by misuse of the refresh policy
2) Summary of key concepts
Comparison of query methods
| Method | Advantages | Disadvantages | Recommended situation |
|---|---|---|---|
| Scroll API | A familiar model for bulk sequential reads | High cost of keeping the context open; resources are held for a long time | Short-lived batch extraction |
| Search After | Low state maintenance cost, excellent scalability | Sort key design required | Continuous bulk processing |
| PIT + Search After | Consistent snapshot + scalability | PIT lifetime management required | Default recommendation for production batches |
Scroll vs Search After comparison table
| Item | Scroll | Search After |
|---|---|---|
| Server-side state | High (scroll context) | Low (client-held cursor) |
| Long-running batch stability | Risk of context expiration | Relatively stable |
| Sort requirements | Relatively simple | Stable sort key required |
| Resource impact | Holds memory and file handles | Relatively lightweight |
| Recommended for modern operations | Limited | Default choice (especially combined with PIT) |
Indexing strategy essentials
- Tune bulk requests by payload bytes rather than by number of documents (e.g. 5~15MB).
- Increase or temporarily disable refresh_interval during bulk indexing.
- Control segment/shard size with index rollover (max_age, max_docs, max_size).
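As a rough illustration of sizing bulk requests by bytes rather than by document count, the sketch below groups documents into NDJSON bodies that stay under a byte budget. The function name `bulk_batches`, the `id` field, and the budget value are assumptions made for this example, not part of any client library.

```python
import json
from typing import Iterable, Iterator


def bulk_batches(docs: Iterable[dict], index: str, max_bytes: int) -> Iterator[str]:
    """Yield NDJSON bulk bodies of at most max_bytes (UTF-8),
    unless a single document alone is larger than the budget."""
    lines: list[str] = []
    size = 0
    for doc in docs:
        # Assumption: every document carries its own "id" field.
        action = json.dumps({"index": {"_index": index, "_id": str(doc["id"])}})
        source = json.dumps(doc, ensure_ascii=False)
        pair = action + "\n" + source + "\n"
        pair_bytes = len(pair.encode("utf-8"))
        # Flush first if adding this pair would exceed the byte budget.
        if lines and size + pair_bytes > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_bytes
    if lines:
        yield "".join(lines)
```

With a budget of `10 * 1024 * 1024`, each yielded string could be sent as one `_bulk` request body; a document larger than the budget is still emitted on its own rather than dropped.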
Pipeline diagram

3) Code example
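Example A: PIT + Search After query DSL
The PIT creation call below uses the Elasticsearch endpoint; on OpenSearch (2.4 and later) the equivalent is POST /products/_search/point_in_time?keep_alive=2m. Omit search_after on the first request, then pass the sort values of the last hit on each following request.
POST /products/_pit?keep_alive=2m
POST /_search
{
  "size": 1000,
  "pit": {
    "id": "${pit_id}",
    "keep_alive": "2m"
  },
  "sort": [
    { "updated_at": "asc" },
    { "_shard_doc": "asc" }
  ],
  "search_after": ["2026-03-03T00:00:00Z", 120341],
  "query": {
    "range": {
      "updated_at": {
        "gte": "2026-03-01T00:00:00Z"
      }
    }
  }
}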
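The client side of this query is a simple loop: send a page request, process the hits, and feed the `sort` values of the last hit back as the next `search_after`. The sketch below shows that loop against an in-memory stand-in, so `search_page` and `make_fake_search` are hypothetical helpers for illustration; a real implementation would issue the search request above and delete the PIT when finished.

```python
from typing import Callable, Iterator, Optional

# A page function takes (search_after, size) and returns a list of hits.
SearchPage = Callable[[Optional[list], int], list]


def scan(search_page: SearchPage, size: int = 1000) -> Iterator[dict]:
    """Iterate over all hits by chaining search_after cursors."""
    cursor: Optional[list] = None  # no search_after on the first request
    while True:
        hits = search_page(cursor, size)
        if not hits:
            return
        yield from hits
        # The sort values of the last hit become the next cursor.
        cursor = hits[-1]["sort"]


def make_fake_search(docs: list) -> SearchPage:
    """In-memory stand-in for the search call, sorted by (updated_at, id)."""
    ordered = sorted(docs, key=lambda d: (d["updated_at"], d["id"]))

    def search_page(search_after: Optional[list], size: int) -> list:
        hits = [{"_source": d, "sort": [d["updated_at"], d["id"]]} for d in ordered]
        if search_after is not None:
            hits = [h for h in hits if tuple(h["sort"]) > tuple(search_after)]
        return hits[:size]

    return search_page
```

Because the cursor lives entirely on the client, a worker can persist it and resume after a restart, which is the main operational difference from a server-side scroll context.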
Example B: Bulk API NDJSON
POST /products_v3/_bulk
{ "index": { "_id": "1001" } }
{ "product_id": 1001, "name": "Keyboard", "price": 49000, "updated_at": "2026-03-03T01:02:03Z" }
{ "index": { "_id": "1002" } }
{ "product_id": 1002, "name": "Mouse", "price": 29000, "updated_at": "2026-03-03T01:02:04Z" }
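A bulk request can return HTTP 200 while individual items inside it failed, so the response body has to be inspected item by item. The sketch below separates items worth retrying (429 and 5xx) from items that failed permanently; the function name `split_bulk_result` and the retry classification are assumptions for this example.

```python
def split_bulk_result(response: dict) -> tuple[list[str], list[str]]:
    """Return (retryable_ids, failed_ids) from a parsed bulk response body."""
    retryable: list[str] = []
    failed: list[str] = []
    if not response.get("errors"):
        return retryable, failed  # every item succeeded
    for item in response["items"]:
        # Each item has a single key naming the action (index, create, update, delete).
        result = next(iter(item.values()))
        status = result["status"]
        if status == 429 or status >= 500:
            retryable.append(result["_id"])  # rejected or transient: retry later
        elif status >= 400:
            failed.append(result["_id"])  # e.g. mapping error: retrying will not help
    return retryable, failed
```

Retrying only the rejected items, instead of resending the whole batch, keeps the retry traffic proportional to the actual rejections.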
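Example C: Source DB incremental query SQL
SELECT id, name, price, updated_at
FROM products
-- Composite keyset cursor: rows sharing the last updated_at are not skipped.
-- :last_synced_id (id of the last row already synced) is a parameter added for this fix.
WHERE (updated_at, id) > (:last_synced_at, :last_synced_id)
ORDER BY updated_at, id
LIMIT 1000;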
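To see why an incremental cursor needs a tiebreaker such as id next to updated_at, the sketch below pages through the same in-memory rows twice: once with a cursor on the timestamp alone and once with a composite (updated_at, id) cursor. The row data and function names are made up for the illustration.

```python
def _key(row: dict) -> tuple:
    return (row["updated_at"], row["id"])


def read_with_timestamp_cursor(rows: list, limit: int) -> list:
    """Page with WHERE updated_at > :last_ts (single-column cursor)."""
    seen, last_ts = [], ""
    while True:
        page = sorted((r for r in rows if r["updated_at"] > last_ts), key=_key)[:limit]
        if not page:
            return seen
        seen += [r["id"] for r in page]
        last_ts = page[-1]["updated_at"]


def read_with_composite_cursor(rows: list, limit: int) -> list:
    """Page with WHERE (updated_at, id) > (:last_ts, :last_id)."""
    seen, last = [], ("", 0)
    while True:
        page = sorted((r for r in rows if _key(r) > last), key=_key)[:limit]
        if not page:
            return seen
        seen += [r["id"] for r in page]
        last = _key(page[-1])


rows = [
    {"id": 1, "updated_at": "2026-03-03T01:02:03Z"},
    {"id": 2, "updated_at": "2026-03-03T01:02:03Z"},
    {"id": 3, "updated_at": "2026-03-03T01:02:03Z"},  # same timestamp, next page
    {"id": 4, "updated_at": "2026-03-03T01:02:04Z"},
    {"id": 5, "updated_at": "2026-03-03T01:02:04Z"},
]
print(read_with_timestamp_cursor(rows, limit=2))  # [1, 2, 4, 5] -> row 3 is lost
print(read_with_composite_cursor(rows, limit=2))  # [1, 2, 3, 4, 5]
```

The same reasoning applies to the Search After sort key on the search side: whenever a page boundary falls inside a group of equal timestamps, a single-column cursor skips the rest of that group.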
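Example D: Rollover Policy
The policy below uses Elasticsearch ILM syntax. OpenSearch manages rollover through ISM policies (PUT _plugins/_ism/policies/<policy_id>), where the rollover conditions are named min_size, min_index_age, and min_doc_count.
PUT _ilm/policy/products-rollover
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "40gb",
            "max_age": "7d",
            "max_docs": 50000000
          }
        }
      }
    }
  }
}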
4) Real-world failure/operational scenarios
Situation: In a nightly reindexing job, the bulk size was set to 50MB and 8 workers indexed concurrently with refresh=true on every request. After 10 minutes the number of 429 responses spiked and indexing lag accumulated, so the batch took 6 times longer to complete.
Cause:
- With 8 workers each sending 50MB, up to about 400MB of bulk payload was in flight at once, more than the shards' write buffers could absorb.
- Forcing a refresh on every request produced many small segments, so segment merge costs rose sharply.
- Because the Search After sort key was the single updated_at column, documents sharing the same timestamp were skipped.
Improvements:
- Reduce the bulk payload to around 10MB and lower concurrency from 8 to 3.
- During the batch, set refresh_interval=30s and perform one manual refresh after completion.
- Add a tiebreaker to the sort key: (updated_at, _id) or (updated_at, _shard_doc).
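One way to make the refresh handling hard to forget is to wrap the bulk phase so that the relaxed interval is always restored, even if the batch fails midway. The sketch below assumes two hypothetical callables supplied by the caller: `put_settings`, which wraps `PUT /<index>/_settings`, and `refresh`, which wraps `POST /<index>/_refresh`.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def relaxed_refresh(
    put_settings: Callable[[dict], None],
    refresh: Callable[[], None],
    bulk_interval: str = "30s",
) -> Iterator[None]:
    """Relax refresh_interval for the duration of a bulk load, then restore it."""
    put_settings({"index": {"refresh_interval": bulk_interval}})
    try:
        yield
    finally:
        # A null value resets the setting to the index default.
        put_settings({"index": {"refresh_interval": None}})
        # One explicit refresh so the indexed documents become searchable.
        refresh()
```

A job would then run its bulk loop inside `with relaxed_refresh(put_settings, refresh): ...`. Passing `bulk_interval="-1"` would disable periodic refresh entirely for the duration of the load.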
5) Design Checklist
- Is PIT + Search After, rather than Scroll, the default choice for bulk reads?
- Does the Search After sort key guarantee uniqueness and stability?
- Is bulk size controlled by bytes rather than by document count?
- Is there a backoff and concurrency-reduction strategy for 429 responses and rejections?
- Is the refresh policy matched to the indexing mode (bulk load vs. normal operation)?
- Are index rollover and shard size targets managed as operational indicators?
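For the backoff item in the checklist, a minimal sketch could combine exponential backoff with jitter and a simple rule that halves worker concurrency after a rejection. The names `send_with_backoff` and `adjust_concurrency`, the delay values, and the halving rule are all assumptions for illustration; `send` stands for any callable that performs one bulk request and returns its HTTP status.

```python
import random
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry on 429 with exponential backoff and full jitter.
    Returns True on a 2xx status, False on other errors or when attempts run out."""
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return 200 <= status < 300
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jitter spreads out retrying workers
    return False


def adjust_concurrency(current: int, rejected: bool, minimum: int = 1, maximum: int = 8) -> int:
    """Halve the worker count after a rejection, otherwise grow it by one."""
    if rejected:
        return max(minimum, current // 2)
    return min(maximum, current + 1)
```

Backing off per request only delays the retry; lowering concurrency is what actually reduces the load on the cluster, so the two are meant to be used together.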
6) Summary
An OpenSearch batch is not a matter of simple API calls but of cluster resource control. The practical defaults are PIT + Search After for reads, and small bulk requests + controlled refresh + a rollover policy for indexing.
7) Next part preview
The next part covers batch strategies for distributed environments: preventing duplicate execution across multiple instances, leader election, Kubernetes CronJob vs. in-app batches, and a comparison of DB/Redis/Zookeeper locks, all from an architectural perspective.