Apache Iceberg

Iceberg is a high-performance format for huge analytic tables. Iceberg brings the reliability and simplicity of SQL tables to big data, while making it possible for engines like Spark, Trino, Flink, Presto, and Hive to safely work with the same tables, at the same time.

  • Expressive SQL
  • Full Schema Evolution
  • Hidden Partitioning
  • Time Travel and Rollback
  • Data Compaction
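
As a rough illustration of hidden partitioning, the sketch below derives a day partition from a timestamp column, so writers never supply the partition value and readers filter only on the timestamp. It is a toy model in plain Python, not the Iceberg API; `day_transform`, `write`, and `scan` are hypothetical names.

```python
from collections import defaultdict
from datetime import date, datetime


def day_transform(ts: datetime) -> date:
    # Partition value derived from the timestamp column; writers never supply it.
    return ts.date()


def write(events):
    # Group rows by the derived partition value.
    partitions = defaultdict(list)
    for ts, payload in events:
        partitions[day_transform(ts)].append((ts, payload))
    return partitions


def scan(partitions, start: datetime, end: datetime):
    # The filter is expressed on the timestamp column only. It is mapped to
    # day values so partitions that cannot match are skipped entirely.
    touched = [d for d in sorted(partitions) if start.date() <= d <= end.date()]
    rows = [r for d in touched for r in partitions[d] if start <= r[0] < end]
    return touched, rows


events = [
    (datetime(2024, 1, 1, 9), "a"),
    (datetime(2024, 1, 2, 10), "b"),
    (datetime(2024, 1, 3, 11), "c"),
]
partitions = write(events)
touched, rows = scan(partitions, datetime(2024, 1, 2), datetime(2024, 1, 3))
```

Here the scan never opens the 2024-01-01 partition, even though the query mentions no partition column.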

Features

  • Use SQL tables for big data
  • Work with the same tables simultaneously using engines like Spark, Trino, Flink, Presto, Hive, Impala, StarRocks, Doris, and Pig
  • Capture metadata information on the state of datasets as they change over time
  • Partition large tables into smaller parts to speed up read and load times
  • Run reproducible queries on the same table snapshot
  • Reset tables to their previous state to easily walk back errors
  • Enable ACID transactions at scale, allowing concurrent writers to work in tandem
  • Track changes to a table over time
  • Query historical data and verify changes between updates
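
The snapshot-related features above (reproducible queries, rollback, querying historical data) can be pictured with a small toy model. This is only a sketch in plain Python under the assumption that every commit produces an immutable snapshot; `ToyTable` and its methods are hypothetical and are not part of any Iceberg library.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy table that keeps every committed snapshot (not the Iceberg API)."""

    snapshots: list = field(default_factory=list)  # each snapshot is a tuple of rows
    current: int = -1  # index of the current snapshot

    def commit(self, rows):
        # A commit never mutates old snapshots; it appends a new one.
        self.snapshots.append(tuple(rows))
        self.current = len(self.snapshots) - 1
        return self.current

    def scan(self, snapshot_id=None):
        # Passing an explicit snapshot id is a "time travel" read.
        sid = self.current if snapshot_id is None else snapshot_id
        return list(self.snapshots[sid])

    def rollback(self, snapshot_id):
        # Rollback only moves the current pointer; history stays intact.
        self.current = snapshot_id


table = ToyTable()
s0 = table.commit([("a", 1)])
s1 = table.commit([("a", 1), ("b", 2)])
```

Calling `table.scan(s0)` always returns the same rows regardless of later commits, and `table.rollback(s0)` walks the table back without discarding `s1`.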

Apache Iceberg V3

Variant

Data Types

  • Primitive types - Basic atomic data types that cannot be broken down into simpler types. Ex - Integer, String
  • Structured types - Composed of other types, with a fixed schema. Ex - List, Map
  • Semi-structured types - Flexibility to handle complex, hierarchical data structures such as JSON. Ex - Variant

Components of the variant data type

  • Metadata - Enhanced metadata for type definitions in support of pruning
  • Encoding - Binary encoding of semi-structured data into metadata and values
  • Shredding - Materialization of elements in the data to hidden columns
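
To make the idea of shredding concrete, here is a minimal sketch that pulls selected fields out of JSON records into their own columns and keeps the remainder in a residual column. It is a toy model only: the real format uses a binary encoding rather than JSON text, and `shred` and the sample records are hypothetical.

```python
import json


def shred(records, paths):
    """Split JSON records into per-field columns plus a residual column."""
    columns = {p: [] for p in paths}
    residual = []
    for raw in records:
        obj = json.loads(raw)
        for p in paths:
            # Missing fields become None so every column keeps one slot per row.
            columns[p].append(obj.pop(p, None))
        residual.append(obj)  # whatever was not shredded stays semi-structured
    return columns, residual


records = [
    '{"account_id": 12345, "price": 10, "note": "x"}',
    '{"account_id": 7, "price": 3}',
]
columns, residual = shred(records, ["account_id", "price"])

# A filter on a shredded field only needs to read that one column.
matches = [i for i, v in enumerate(columns["account_id"]) if v == 12345]
```

Because `account_id` lives in its own column, a filter on it does not need to parse the full record, which is the intuition behind the columnar and pushdown benefits listed in the next section.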

Variant benefits

  • Performance and Cost
    • Columnar storage
    • Predicate pushdown
    • Statistics
  • Flexible Schema
    • No predefined schema needed
    • Evolves automatically
  • Efficient storage
    • Shredded format
    • Compression
  • Schema navigation
    • Dot notation
    • Nested access

Deletion Vectors

  • Row level deletes
  • Speeds up write operations
  • Storage optimization
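
A deletion vector can be thought of as the set of row positions in one data file that should be skipped at read time. The sketch below uses a Python set as a stand-in for a compressed bitmap; the function name and sample data are illustrative, not the Iceberg API.

```python
def apply_deletion_vector(rows, deleted_positions):
    """Return the live rows of one data file, skipping deleted positions."""
    return [row for pos, row in enumerate(rows) if pos not in deleted_positions]


data_file = ["r0", "r1", "r2", "r3"]  # rows in file order; the file is never rewritten
deletion_vector = {1, 3}              # positions marked as deleted

live = apply_deletion_vector(data_file, deletion_vector)
```

The delete itself only writes the small position set, which is why it is cheap on the write path.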

Write modes in Apache Iceberg

  • Copy on write
  • Merge on read
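
The trade-off between the two modes can be sketched as follows: copy on write pays at write time by rewriting the file, while merge on read pays at read time by applying recorded deletes. This is a simplified model in plain Python; all function names are hypothetical.

```python
def is_refund(row):
    return row["kind"] == "refund"


def delete_copy_on_write(data_file, predicate):
    """Rewrite the whole file without the matching rows."""
    return [row for row in data_file if not predicate(row)]


def delete_merge_on_read(data_file, predicate):
    """Leave the file as is and return the positions to skip at read time."""
    return {pos for pos, row in enumerate(data_file) if predicate(row)}


def read_merge_on_read(data_file, deleted_positions):
    """Readers merge the data file with its deletes on every scan."""
    return [row for pos, row in enumerate(data_file) if pos not in deleted_positions]


data_file = [
    {"id": 1, "kind": "sale"},
    {"id": 2, "kind": "refund"},
    {"id": 3, "kind": "sale"},
]

cow_file = delete_copy_on_write(data_file, is_refund)    # new 2-row file
mor_deletes = delete_merge_on_read(data_file, is_refund)  # original file untouched
mor_view = read_merge_on_read(data_file, mor_deletes)
```

Both paths give readers the same result; they differ only in where the work happens.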

[Figure: write modes in Apache Iceberg]

Row level deletes in Iceberg

[Figure: row-level deletes in Iceberg]

Row lineage

  • Write specification
    • Writes record level row lineage information
    • Inherits row lineage information from metadata
  • Efficient reads
    • New metadata columns
    • Avoid full scans
    • _row_id metadata column for row identity
  • Streamlined operations
    • On by default
    • Record state information lives with the data
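
One way to picture "inherits row lineage information from metadata" is sketched below. It assumes a scheme in which each new data file is given a starting row ID from a table-level counter, and a row without a stored `_row_id` derives it from that starting ID plus its position in the file. The names `first_row_id` and `next_row_id` and both functions are illustrative assumptions, not a definitive rendering of the specification.

```python
def assign_first_row_ids(next_row_id, file_row_counts):
    """Give each new data file a first_row_id and advance the table counter."""
    first_ids = []
    for count in file_row_counts:
        first_ids.append(next_row_id)
        next_row_id += count  # reserve one ID per row in the file
    return first_ids, next_row_id


def inherited_row_id(first_row_id, position, stored_row_id=None):
    # A value already stored with the row wins; otherwise the ID is derived
    # from file-level metadata and the row's position in the file.
    if stored_row_id is not None:
        return stored_row_id
    return first_row_id + position


# Two new files with 3 and 2 rows, starting from a table counter of 100.
first_ids, next_id = assign_first_row_ids(100, [3, 2])
```

Under this assumption nothing extra has to be written per row for new data, and a row that is later rewritten can carry its original ID with it.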

Queries

SELECT event_details:price::int as price
FROM tbl_sales_events
WHERE event_details:account_id::int = 12345
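
Read as plain logic, the query above extracts a field from the semi-structured `event_details` column, casts it to an integer, and filters on another extracted field. The sketch below mimics that in Python with JSON strings standing in for variant values; the sample rows and the `extract_int` helper are made up for illustration.

```python
import json

# Hypothetical sample data; each row holds one semi-structured value.
tbl_sales_events = [
    {"event_details": '{"account_id": "12345", "price": "19"}'},
    {"event_details": '{"account_id": 777, "price": 5}'},
    {"event_details": '{"account_id": 12345}'},
]


def extract_int(raw, key):
    """Mimic `col:key::int`: look up a field and cast it, keeping missing as None."""
    value = json.loads(raw).get(key)
    return None if value is None else int(value)


prices = [
    extract_int(row["event_details"], "price")
    for row in tbl_sales_events
    if extract_int(row["event_details"], "account_id") == 12345
]
```

The third row matches the filter but has no `price` field, so it yields `None`, the analogue of a SQL NULL.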