At its re:Invent conference, Amazon announced updates to Amazon Simple Storage Service (S3) that make it the first cloud object store with fully managed support for Apache Iceberg.
Amazon S3 Tables deliver up to 3 times faster query performance and up to 10 times more transactions per second (TPS), along with automated table maintenance, so S3 users can store and manage tabular data at any scale.
S3 Tables are purpose-built for managing Apache Iceberg tables in data lakes and offer faster query performance and higher TPS than general-purpose S3 buckets.
Amazon S3 Metadata is another new service that streamlines data discovery by automatically capturing queryable object metadata, including custom metadata supplied through object tags.
S3 Metadata captures system-defined details such as object size and source and integrates that metadata into S3 Tables, which keeps the view of the data up to date and accelerates analytics across data lakes.
Customers using S3 Tables can create dedicated table buckets that streamline the storage and querying of tabular data in fully managed Iceberg tables.
The service also offers advanced Iceberg features such as row-level transactions, queryable snapshots via time travel, and schema evolution.
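To make the idea of queryable snapshots concrete, here is a minimal toy sketch in plain Python. It is not Iceberg's or S3 Tables' implementation or API; the `ToyTable` class and its `commit` and `scan` methods are hypothetical names invented for illustration. It only shows the general pattern: every commit produces a new immutable snapshot, and a reader can ask for an older snapshot instead of the latest one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Toy append-only table that keeps one immutable snapshot per commit.

    Hypothetical illustration only; not how Iceberg stores data.
    """
    snapshots: list = field(default_factory=list)

    def commit(self, rows):
        """Append rows, record a new snapshot, and return its snapshot id."""
        previous = self.snapshots[-1] if self.snapshots else ()
        self.snapshots.append(previous + tuple(rows))
        return len(self.snapshots) - 1

    def scan(self, snapshot_id=None):
        """Read the latest snapshot, or an older one for time travel."""
        if not self.snapshots:
            return []
        if snapshot_id is None:
            snapshot_id = len(self.snapshots) - 1
        return list(self.snapshots[snapshot_id])


table = ToyTable()
first = table.commit([{"id": 1, "city": "Seattle"}])
# Schema evolution in miniature: later rows carry a new optional column.
second = table.commit([{"id": 2, "city": "Dublin", "region": "EU"}])

current_rows = table.scan()                    # sees both commits
rows_at_first = table.scan(snapshot_id=first)  # sees only the first commit
```

Reading with `snapshot_id=first` returns the table as it stood after the first commit, which is the essence of a time-travel query; older rows simply lack the later `region` column.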
S3 Metadata also enables customers to enrich their data by adding custom metadata with object tags and querying it with SQL, which supports effective storage optimization and improved data preparation for analytics.
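As a rough illustration of querying object metadata with SQL, the sketch below uses Python's built-in `sqlite3` module as a stand-in. The table name `object_metadata`, its columns (`object_key`, `size_bytes`, `project_tag`), and the sample rows are all assumptions made for this example; the article does not describe the actual S3 Metadata schema or query engine.

```python
import sqlite3

# In-memory database standing in for a queryable metadata table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_metadata ("
    "object_key TEXT, size_bytes INTEGER, project_tag TEXT)"
)

# Hypothetical sample rows: system-defined size plus a custom tag value.
conn.executemany(
    "INSERT INTO object_metadata VALUES (?, ?, ?)",
    [
        ("logs/2024/app.parquet", 5_000_000, "analytics"),
        ("logs/2024/tiny.parquet", 12_000, "analytics"),
        ("images/cat.png", 2_500_000, "marketing"),
    ],
)

# Find large objects carrying a given custom tag, biggest first.
large_objects = conn.execute(
    "SELECT object_key, size_bytes FROM object_metadata "
    "WHERE project_tag = ? AND size_bytes > ? "
    "ORDER BY size_bytes DESC",
    ("analytics", 1_000_000),
).fetchall()

conn.close()
```

A query like this combines a system-defined attribute (size) with a custom tag, which is the kind of filter that can guide storage optimization or the selection of inputs for an analytics job.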
The release of Amazon S3 Tables aims to address the challenge of managing large-scale tabular data, which customers typically store in Apache Parquet, a file format optimized for data queries, and which can span billions of files.
Andy Warfield, VP of storage and distinguished engineer at AWS, says the system removes the overhead of managing data stores on top of objects, so customers can shift their focus back to working with their data.