Amazon EMR Studio Features

Amazon EMR Studio is a web-based IDE for data preparation, visualization, collaboration, and debugging on Amazon EMR clusters

Administrators can link EMR Studio to cluster templates through AWS Service Catalog, which controls the cluster configurations each user can launch

Multiple EMR Studios can be created to manage access to clusters across different VPCs

EMR Studio integrates with Amazon EMR on EC2 and Amazon EMR on EKS clusters; EKS clusters must be connected through managed endpoints

Trusted identity propagation is supported only when EMR Studio is set up with IAM Identity Center and the feature is also enabled on the connected clusters
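
The dependency between the authentication mode and trusted identity propagation can be checked before a Studio is created. The following is a minimal sketch, not a definitive implementation: the helper name with_identity_settings and its arguments are hypothetical, and the field names (AuthMode, TrustedIdentityPropagationEnabled, IdcInstanceArn) are assumed to match the CreateStudio request as exposed by boto3, so verify them against the current API reference. It does not check whether the connected clusters have the feature enabled.

```python
def with_identity_settings(base_request, use_identity_center,
                           propagate_identity=False, idc_instance_arn=None):
    """Return a copy of base_request with authentication fields added.

    base_request is a dict holding the other CreateStudio parameters
    (name, VPC, subnets, roles, and so on), supplied by the caller.
    """
    # Per the statement above, propagation needs IAM Identity Center.
    if propagate_identity and not use_identity_center:
        raise ValueError(
            "trusted identity propagation requires IAM Identity Center"
        )

    request = dict(base_request)  # leave the caller's dict untouched
    # Assumption: "SSO" is the AuthMode value for IAM Identity Center.
    request["AuthMode"] = "SSO" if use_identity_center else "IAM"

    if propagate_identity:
        request["TrustedIdentityPropagationEnabled"] = True
        if idc_instance_arn is not None:
            request["IdcInstanceArn"] = idc_instance_arn
    return request
```

The returned dict could then be passed as keyword arguments to an EMR client call; that step is left out here so the sketch runs without AWS credentials.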

Workspaces and endpoints use FIPS 140 validated cryptographic modules for encryption in transit, supporting regulated workloads

EMR Studio is compatible with Amazon EMR versions 5.32.0+ (5.x) and 6.2.0+ (6.x), but some features require later versions for full support
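
The version floor above can be expressed as a small check. This is a sketch under stated assumptions: the function name studio_compatible is hypothetical, release labels are assumed to follow the emr-X.Y.Z form, and major lines other than 5.x and 6.x return None because the statement above does not cover them. It says nothing about which individual features need a later release.

```python
import re

# Minimum release per major line, taken from the statement above.
MIN_RELEASE = {5: (5, 32, 0), 6: (6, 2, 0)}


def studio_compatible(release_label):
    """Return True or False for 5.x and 6.x labels, None for other majors."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", release_label)
    if match is None:
        raise ValueError(f"unrecognized release label: {release_label!r}")

    version = tuple(int(part) for part in match.groups())
    minimum = MIN_RELEASE.get(version[0])
    if minimum is None:
        return None  # major line not covered by the statement above
    # Tuples compare element by element, so (5, 32, 0) > (5, 4, 0).
    return version >= minimum
```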

Known limitations include lack of support for certain Jupyter magic commands, SparkMagic on EKS, and issues with kernel startup and idle kernel cleanup

Notebooks may show errors if not connected to a cluster, and some Spark UI links may not function without manual regeneration

Service limits include a maximum of 100 EMR Studios per AWS account, 5 subnets per Studio, 5 IAM Identity Center groups per Studio, and 100 users per Studio
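
A planned setup can be compared against these limits before any resources are created. The sketch below is illustrative only: the LIMITS keys and the limit_violations helper are hypothetical names, and the numbers are copied from the list above rather than read from the service.

```python
# Limits copied from the list above; the key names are illustrative.
LIMITS = {
    "studios_per_account": 100,
    "subnets_per_studio": 5,
    "identity_center_groups_per_studio": 5,
    "users_per_studio": 100,
}


def limit_violations(counts):
    """Return one message for every planned count that exceeds its limit.

    counts maps the LIMITS keys to planned values; missing keys count as 0.
    """
    problems = []
    for name, maximum in LIMITS.items():
        value = counts.get(name, 0)
        if value > maximum:
            problems.append(f"{name}: {value} exceeds limit of {maximum}")
    return problems
```

An empty list means the plan fits within the listed limits; for example, a plan with 6 subnets and 40 users would yield a single message about subnets_per_studio.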