How to set up Amazon EMR Studio

Setting up Amazon EMR Studio requires an AWS account, an S3 bucket for notebook backups, and a VPC with up to five subnets for cluster connectivity

Create an EMR Studio service role that Amazon EMR can assume, and attach a permissions policy to it that covers network interface management, S3 access, and AWS KMS if the bucket is encrypted
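
As a rough illustration of the service role step, the sketch below builds the two JSON documents involved: a trust policy that lets Amazon EMR assume the role, and a permissions policy. The action list is a deliberately small, illustrative subset and is not the complete policy that AWS documents; the function name, bucket name, and key ARN are hypothetical, and a real policy would normally scope the EC2 statement more tightly than "*".

```python
import json

# Trust policy: lets the Amazon EMR service assume the service role.
SERVICE_ROLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "elasticmapreduce.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def build_service_role_policy(bucket, kms_key_arn=None):
    """Return an illustrative (incomplete) permissions policy as a dict.

    bucket and kms_key_arn are placeholders supplied by the caller.
    """
    statements = [
        {   # Network interfaces used to connect Workspaces to clusters.
            "Sid": "NetworkInterfaces",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
            ],
            "Resource": "*",  # assumption: narrow this in a real policy
        },
        {   # Object-level access to the backup bucket.
            "Sid": "BackupObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
        },
        {   # Bucket-level listing.
            "Sid": "BackupBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{bucket}",
        },
    ]
    if kms_key_arn:
        # Only needed when the bucket is encrypted with a KMS key.
        statements.append(
            {
                "Sid": "BucketEncryptionKey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
                "Resource": kms_key_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(build_service_role_policy("example-bucket"), indent=2))
```

The KMS statement is appended only when a key ARN is passed, mirroring the "if using encrypted buckets" condition above.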

Set up user permissions by creating an EMR Studio user role and attaching session policies that define user/group rights within the Studio

In IAM Identity Center mode, allow sts:AssumeRole and sts:SetContext in the user role's trust policy, and assign session policies to users and groups for fine-grained access
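
The trust policy for the user role might look like the following minimal sketch. It only shows the two STS actions named above with Amazon EMR as the trusted service; the variable name is an illustrative choice.

```python
import json

# Trust policy for the EMR Studio user role (IAM Identity Center mode).
# Amazon EMR assumes the role and sets the user's context on the session.
USER_ROLE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "elasticmapreduce.amazonaws.com"},
            "Action": ["sts:AssumeRole", "sts:SetContext"],
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(USER_ROLE_TRUST_POLICY, indent=2))
```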

In IAM authentication mode, control access with IAM policies and attribute-based access control (ABAC), granting actions such as elasticmapreduce:CreateStudioPresignedUrl
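
One way to combine the presigned URL action with ABAC is to allow it only when a tag on the Studio matches a tag on the calling principal. The sketch below assumes a tag key named "team" (hypothetical), a placeholder account ID, and the elasticmapreduce:ResourceTag condition key; check the condition keys your setup supports before relying on it.

```python
import json


def build_studio_access_policy(region, account_id, tag_key="team"):
    """Allow opening a Studio only when its tag matches the caller's tag.

    tag_key is a hypothetical tag name chosen for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "OpenTaggedStudios",
                "Effect": "Allow",
                "Action": ["elasticmapreduce:CreateStudioPresignedUrl"],
                "Resource": f"arn:aws:elasticmapreduce:{region}:{account_id}:studio/*",
                "Condition": {
                    "StringEquals": {
                        # IAM policy variable, resolved at request time.
                        "elasticmapreduce:ResourceTag/" + tag_key:
                            "${aws:PrincipalTag/" + tag_key + "}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # 111122223333 is a placeholder account ID.
    print(json.dumps(build_studio_access_policy("us-east-1", "111122223333"), indent=2))
```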

Define IAM policies for user actions such as workspace creation, cluster management, and Git repository access; cluster-level permissions control data access
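
To make the grouping of user actions concrete, the sketch below organizes a few Amazon EMR actions by capability and turns each group into a policy statement. The selection is illustrative and incomplete, the group names are made up for this example, and "*" as the resource is an assumption that a real policy would usually tighten.

```python
import json

# Illustrative subsets of EMR actions, grouped by what the user can do.
USER_ACTIONS = {
    "Workspaces": [
        "elasticmapreduce:CreateEditor",
        "elasticmapreduce:DescribeEditor",
        "elasticmapreduce:ListEditors",
        "elasticmapreduce:StartEditor",
        "elasticmapreduce:StopEditor",
        "elasticmapreduce:DeleteEditor",
    ],
    "Clusters": [
        "elasticmapreduce:ListClusters",
        "elasticmapreduce:DescribeCluster",
        "elasticmapreduce:AttachEditor",
        "elasticmapreduce:DetachEditor",
    ],
    "GitRepositories": [
        "elasticmapreduce:CreateRepository",
        "elasticmapreduce:ListRepositories",
        "elasticmapreduce:LinkRepository",
        "elasticmapreduce:UnlinkRepository",
    ],
}


def build_user_policy(capabilities):
    """Build a policy containing one statement per requested capability."""
    statements = [
        {
            "Sid": name,
            "Effect": "Allow",
            "Action": USER_ACTIONS[name],
            "Resource": "*",  # assumption: scope down in a real policy
        }
        for name in capabilities
    ]
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # A basic user who may use Workspaces and Git repositories only.
    print(json.dumps(build_user_policy(["Workspaces", "GitRepositories"]), indent=2))
```

Keeping each capability in its own statement makes it easy to produce a narrower policy for basic users and a broader one for users who also manage clusters.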

Optionally, create custom security groups to manage network traffic for Workspaces and clusters; defaults are used if not specified

Create the EMR Studio via the AWS CLI or console, specifying settings like name, S3 location, authentication mode, VPC, subnets, and security groups
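
For the CLI route, the settings listed above map onto options of the aws emr create-studio command. The sketch below only assembles the argument list and does not call AWS; the helper name and every ID, role name, and bucket in the usage example are placeholders. It passes both security group IDs explicitly and assumes the user role is needed only when the authentication mode is SSO (IAM Identity Center).

```python
import shlex


def build_create_studio_command(name, auth_mode, vpc_id, subnet_ids, service_role,
                                workspace_sg, engine_sg, s3_location, user_role=None):
    """Return the argv list for `aws emr create-studio` (nothing is executed)."""
    if auth_mode not in ("SSO", "IAM"):
        raise ValueError("auth_mode must be 'SSO' or 'IAM'")
    if not 1 <= len(subnet_ids) <= 5:
        raise ValueError("a Studio takes between one and five subnets")
    if auth_mode == "SSO" and not user_role:
        raise ValueError("SSO mode needs a user role")

    argv = [
        "aws", "emr", "create-studio",
        "--name", name,
        "--auth-mode", auth_mode,
        "--vpc-id", vpc_id,
        "--subnet-ids", *subnet_ids,
        "--service-role", service_role,
        "--workspace-security-group-id", workspace_sg,
        "--engine-security-group-id", engine_sg,
        "--default-s3-location", s3_location,
    ]
    if user_role:
        argv += ["--user-role", user_role]
    return argv


if __name__ == "__main__":
    # All values below are placeholders.
    command = build_create_studio_command(
        name="example-studio",
        auth_mode="IAM",
        vpc_id="vpc-0example",
        subnet_ids=["subnet-0exampleA", "subnet-0exampleB"],
        service_role="ExampleStudioServiceRole",
        workspace_sg="sg-0exampleWorkspace",
        engine_sg="sg-0exampleEngine",
        s3_location="s3://example-bucket/studio/",
    )
    print(shlex.join(command))
```

Building the list first and printing it with shlex.join makes it possible to review the exact command before running it, for example with subprocess.run(command, check=True).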

Assign users or groups to the Studio using the console or CLI, setting session policies and managing permissions as needed
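
Assigning a user or group from the CLI corresponds to aws emr create-studio-session-mapping. As with Studio creation, this sketch only builds the argument list; the helper name, Studio ID, identity name, and policy ARN are placeholders.

```python
import shlex


def build_session_mapping_command(studio_id, identity_name, identity_type,
                                  session_policy_arn):
    """Return the argv list that maps a user or group to a Studio."""
    if identity_type not in ("USER", "GROUP"):
        raise ValueError("identity_type must be 'USER' or 'GROUP'")
    return [
        "aws", "emr", "create-studio-session-mapping",
        "--studio-id", studio_id,
        "--identity-name", identity_name,
        "--identity-type", identity_type,
        "--session-policy-arn", session_policy_arn,
    ]


if __name__ == "__main__":
    # All values below are placeholders.
    command = build_session_mapping_command(
        studio_id="es-EXAMPLE",
        identity_name="data-science-team",
        identity_type="GROUP",
        session_policy_arn="arn:aws:iam::111122223333:policy/ExampleSessionPolicy",
    )
    print(shlex.join(command))
```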

After setup, users can access and use EMR Studio for interactive analytics, batch jobs, and collaborative data science