Apache Doris Tops JSONBench in Cold Queries and Data Quality

Apache Doris 3.1 achieves first place in the JSONBench benchmark, outperforming MongoDB by 164x and PostgreSQL by over 1,000x in cold query performance.

NewDecoded

Published Nov 6, 2025

Database Dominates Industry-Standard Benchmark

Apache Doris has claimed the top spot in JSONBench, the industry-standard benchmark for semi-structured JSON data analytics, with its latest 3.1 release. The database ranked first in both cold query performance and data quality, achieving a score of 1.57 in cold query tests while accurately importing 100% of the benchmark dataset. This marks a significant achievement in the competitive landscape of JSON analytics, where cold queries represent the toughest challenge because they require reading massive files directly from disk without cached data.

The performance gap between Apache Doris and competing solutions is substantial. In cold query scenarios, Doris delivered speeds approximately 164 times faster than MongoDB, over 1,000 times faster than PostgreSQL, and nearly 2 times faster than Elasticsearch. The system also secured second place in hot query performance, trailing only ClickHouse, the benchmark's original developer.

Technical Architecture Drives Performance

The breakthrough performance stems from deep optimizations to the VARIANT data type that ship in Apache Doris 3.1. Key innovations include sparse subcolumns that store only frequently accessed JSON keys in columnar format, schema templates for standardizing subcolumn data types, and enhanced column pruning with path indexing. These features work together to minimize I/O operations and metadata overhead while ensuring consistent index performance.

Apache Doris employs an efficient I/O path that loads only the necessary data through path-level column pruning and late materialization. The database supports JSON path indexes, including ZoneMap and BloomFilter, with predicate pushdown, which enables file-level pruning and speeds up filter evaluation. Combined with a vectorized execution engine and intelligent caching strategies, the architecture achieves sub-second query response times even on datasets containing over 1 billion JSON records.
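
To make the ideas in this section concrete, here is a minimal Python sketch of two of them: keeping common JSON paths as dense columns while rare paths go to a sparse fallback, and using a ZoneMap-style min/max summary to skip whole segments. It is a toy model under stated assumptions, not Apache Doris's actual implementation. The function names (`flatten`, `build_segment`, `count_greater_than`), the `min_ratio` threshold, the use of key frequency within a batch as the selection rule, and the sample events are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Any


def flatten(doc: dict, prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into dotted paths: {"a": {"b": 1}} -> {"a.b": 1}."""
    out: dict[str, Any] = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        else:
            out[path] = value
    return out


def build_segment(docs: list[dict], min_ratio: float = 0.5) -> dict:
    """Shred one batch of documents into a toy columnar segment.

    Assumption: a path becomes a dense column when it appears in at least
    `min_ratio` of the rows; every other path stays in a per-row sparse map.
    """
    rows = [flatten(d) for d in docs]
    counts = Counter(path for row in rows for path in row)
    dense = {p for p, c in counts.items() if c / len(rows) >= min_ratio}
    columns = {p: [row.get(p) for row in rows] for p in dense}
    sparse = [{p: v for p, v in row.items() if p not in dense} for row in rows]
    # ZoneMap-style summary: (min, max) per dense column.
    # Assumes the values under one path share a comparable type.
    zone_map = {}
    for path, values in columns.items():
        present = [v for v in values if v is not None]
        zone_map[path] = (min(present), max(present))
    return {"columns": columns, "sparse": sparse, "zone_map": zone_map}


def count_greater_than(segments: list[dict], path: str, threshold) -> tuple[int, int]:
    """Count rows where `path` > threshold; also report how many segments were read."""
    matched = scanned = 0
    for seg in segments:
        zone = seg["zone_map"].get(path)
        if zone is None:
            # Not a dense column in this segment: fall back to the sparse maps.
            scanned += 1
            matched += sum(1 for row in seg["sparse"]
                           if path in row and row[path] > threshold)
        elif zone[1] > threshold:
            # The max exceeds the threshold, so some row may match: read one column.
            scanned += 1
            matched += sum(1 for v in seg["columns"][path]
                           if v is not None and v > threshold)
        # Otherwise max <= threshold: no row can match, so the segment is skipped.
    return matched, scanned


# Hypothetical event data, split into two segments by time.
old_events = [
    {"kind": "commit", "time_us": 100, "commit": {"collection": "post"}},
    {"kind": "commit", "time_us": 200, "commit": {"collection": "like"}},
    {"kind": "identity", "time_us": 300, "identity": {"handle": "x"}},
]
new_events = [
    {"kind": "commit", "time_us": 1000},
    {"kind": "commit", "time_us": 1100},
]
segments = [build_segment(old_events), build_segment(new_events)]
print(count_greater_than(segments, "time_us", 500))  # (2, 1)
```

In this sketch the filter on `time_us` reads only one of the two segments, because the first segment's maximum (300) rules it out before any column data is touched. It also reads only the `time_us` column rather than whole documents, which is the intuition behind path-level column pruning. The rarely seen `identity.handle` path never becomes a column of its own, which keeps the per-segment column count and metadata small.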

Real-World Cost and Efficiency Advantages

Beyond raw performance metrics, Apache Doris offers significant operational benefits. The system reduces cold query I/O costs by over 60% compared to Elasticsearch under similar workloads, while requiring only half the storage space needed by Elasticsearch and one-third of PostgreSQL's storage footprint. This efficiency translates directly to reduced infrastructure costs for organizations handling large-scale JSON analytics.

The storage-compute separation architecture enables strong performance even in large-scale S3 or HDFS-based deployments, allowing organizations to scale resources independently based on workload requirements. These advantages position Apache Doris as a compelling replacement for Elasticsearch, MongoDB, and PostgreSQL in use cases involving log analytics, event data processing, and behavioral analytics.

Decoded Take

Apache Doris's dominance of JSONBench signals a broader shift in the database market toward specialized systems that excel at specific workloads rather than general-purpose solutions. While MongoDB built its reputation on JSON document handling and Elasticsearch on search capabilities, Apache Doris demonstrates that purpose-built columnar storage with intelligent indexing can deliver superior performance for analytical workloads on semi-structured data. This benchmark result validates the architectural decisions made by the Apache Doris team and positions the database as a serious contender in the rapidly growing market for real-time analytics platforms. For organizations evaluating their data infrastructure, the combination of dramatic performance improvements and substantial cost reductions makes Apache Doris increasingly difficult to ignore, particularly as JSON data continues to proliferate across enterprise systems.
