AWS Glue 5.0 supports fine-grained access control (FGAC) based on policies defined in AWS Lake Formation, giving you control over data lake resources at the table, column, and row levels.
Lake Formation, a data lake management service, allows you to define fine-grained access controls through grant and revoke statements and automatically enforce those policies using compatible engines.
With AWS Glue 5.0, Lake Formation permissions are enforced on each Spark job that AWS Glue runs.
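As a rough illustration of per-job enforcement, the sketch below builds the keyword arguments for the boto3 `glue.create_job` call with FGAC switched on. It rests on assumptions that are not stated in the text above: the job parameter name `--enable-lakeformation-fine-grained-access`, the worker type, and the worker count should all be checked against the current AWS Glue documentation, and the job name, role ARN, and script path are hypothetical.

```python
def build_fgac_job_definition(name, role_arn, script_location):
    """Return keyword arguments for glue.create_job with Lake Formation FGAC enabled."""
    return {
        "Name": name,
        "Role": role_arn,
        "GlueVersion": "5.0",
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Assumed parameter name; verify it in the AWS Glue documentation.
            "--enable-lakeformation-fine-grained-access": "true",
        },
        # Assumed sizing; check the minimum the service requires for FGAC jobs.
        "WorkerType": "G.1X",
        "NumberOfWorkers": 4,
    }


def create_job(definition):
    """Submit the definition. boto3 is imported lazily so the builder has no AWS dependency."""
    import boto3

    return boto3.client("glue").create_job(**definition)


if __name__ == "__main__":
    # Hypothetical values, printed only; nothing is sent to AWS here.
    print(build_fgac_job_definition(
        "fgac-demo-job",
        "arn:aws:iam::123456789012:role/GlueFgacDemoRole",
        "s3://example-bucket/scripts/fgac_demo.py",
    ))
```

Keeping the payload builder separate from the API call makes the job definition easy to review or unit test before anything is created in an account.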
To enable Lake Formation FGAC for AWS Glue 5.0 jobs, create a standard Data Catalog table, register its data location with Lake Formation, and grant table permissions through Lake Formation.
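The registration and grant steps can be sketched with the boto3 Lake Formation client. This is a minimal example under stated assumptions, not a prescribed procedure: the S3 ARN, role ARNs, database name, and table name are hypothetical, and the table is assumed to already exist in the Data Catalog.

```python
def build_register_request(s3_arn, role_arn):
    """Keyword arguments for lakeformation.register_resource."""
    return {"ResourceArn": s3_arn, "RoleArn": role_arn}


def build_table_grant(principal_arn, database, table, permissions=("SELECT",)):
    """Keyword arguments for lakeformation.grant_permissions on a whole table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": list(permissions),
    }


def register_and_grant(register_request, grant_request):
    """Send both requests. boto3 is imported lazily so the builders stay dependency-free."""
    import boto3

    client = boto3.client("lakeformation")
    client.register_resource(**register_request)
    client.grant_permissions(**grant_request)


if __name__ == "__main__":
    # Hypothetical identifiers, printed only; nothing is sent to AWS here.
    print(build_register_request(
        "arn:aws:s3:::example-bucket/orders",
        "arn:aws:iam::123456789012:role/LakeFormationDataAccessRole",
    ))
    print(build_table_grant(
        "arn:aws:iam::123456789012:role/GlueFgacDemoRole", "sales_db", "orders"
    ))
```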
You can create PySpark jobs in AWS Glue to process input data, configure FGAC on the tables with row- and column-based filters, and limit read access to specific columns using Lake Formation permissions.
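One way to express a combined row and column restriction is a Lake Formation data cells filter, granted to the role that runs the job. The sketch below builds the request payloads for the boto3 `create_data_cells_filter` and `grant_permissions` calls. The account ID, table, filter name, columns, and filter expression are hypothetical, and whether a data cells filter or a plain column-level grant fits better depends on the use case.

```python
def build_data_cells_filter(catalog_id, database, table, filter_name,
                            row_expression, columns):
    """Keyword arguments for lakeformation.create_data_cells_filter."""
    return {
        "TableData": {
            "TableCatalogId": catalog_id,
            "DatabaseName": database,
            "TableName": table,
            "Name": filter_name,
            # Rows are kept only when this expression evaluates to true.
            "RowFilter": {"FilterExpression": row_expression},
            # Only these columns are visible through the filter.
            "ColumnNames": list(columns),
        }
    }


def build_filter_grant(principal_arn, filter_request):
    """Keyword arguments for lakeformation.grant_permissions on the filter."""
    data = filter_request["TableData"]
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "DataCellsFilter": {
                "TableCatalogId": data["TableCatalogId"],
                "DatabaseName": data["DatabaseName"],
                "TableName": data["TableName"],
                "Name": data["Name"],
            }
        },
        "Permissions": ["SELECT"],
    }


def apply_filter(filter_request, grant_request):
    """Send both requests. boto3 is imported lazily so the builders stay dependency-free."""
    import boto3

    client = boto3.client("lakeformation")
    client.create_data_cells_filter(**filter_request)
    client.grant_permissions(**grant_request)
```

For example, `build_data_cells_filter("123456789012", "sales_db", "orders", "emea_only", "region = 'EMEA'", ["order_id", "region", "amount"])` describes a filter that exposes three columns and only the rows whose `region` is `EMEA`.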
FGAC is enforced on reads made through Spark SQL and Spark DataFrames; for AWS Glue notebooks, configure Lake Formation FGAC through the console.
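The two read paths look roughly like this in a PySpark job. This is a sketch with hypothetical database, table, and column names. The `spark` argument is assumed to be the job's existing `SparkSession`, and the columns and rows actually returned depend on the Lake Formation permissions granted to the job's role.

```python
def select_statement(database, table, columns):
    """Build a simple SELECT over a Data Catalog table."""
    column_list = ", ".join(columns)
    return f"SELECT {column_list} FROM {database}.{table}"


def read_with_sql(spark, database, table, columns):
    """Read through Spark SQL using the session's sql() method."""
    return spark.sql(select_statement(database, table, columns))


def read_with_dataframe(spark, database, table, columns):
    """Read through the DataFrame API using table() and select()."""
    return spark.table(f"{database}.{table}").select(*columns)


if __name__ == "__main__":
    # Inside a Glue job, a session is typically obtained like this.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    read_with_sql(spark, "sales_db", "orders", ["order_id", "region"]).show()
    read_with_dataframe(spark, "sales_db", "orders", ["order_id", "region"]).show()
```

Both helpers take the session as a parameter, so the same code can run in a job script or a notebook cell without change.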
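AWS Glue 5.0 governs access by running each job under two Spark profiles: a user profile, which runs the job's own code, and a system profile, whose driver and executors handle the stages that read Lake Formation-protected tables.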
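Enabling Lake Formation FGAC in AWS Glue jobs also affects compatibility: jobs that previously relied on AWS Glue DynamicFrames, or on other operations that cannot be delegated to the system profile, may need to be rewritten with Spark DataFrames or Spark SQL.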
Because the policies are defined once in Lake Formation, AWS Glue 5.0 handles FGAC permissions consistently with other integrated services, notably Amazon Athena, Amazon EMR, and Amazon Redshift Spectrum.
Through Lake Formation permissions, AWS Glue 5.0 simplifies granular access control to data lake resources at the table, column, and row levels.