To spin up an EMR cluster, follow the steps below.
Step 1: Create an Amazon S3 bucket for storage, or use HDFS as the storage
layer underneath the EMR cluster. In this example I use S3 as my storage.
Select S3 under “Storage” in the “Services” tab and click Create bucket.
Step 2: This is how you create a bucket. Enter the bucket name, which may
contain only lowercase letters, numbers, hyphens, and periods (no uppercase
letters or other special characters). Select the region where you want to host
the bucket. If you want to copy the settings from an existing bucket, select
that bucket, then proceed to the next step.
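
If you prefer to script this step, here is a minimal Python sketch. The
validator covers only a subset of the S3 naming rules (length, allowed
characters, no adjacent periods), and the helper names is_valid_bucket_name and
create_bucket are my own, not part of any AWS library. The create_bucket part
assumes boto3 is installed and AWS credentials are configured.

```python
import re

# Subset of the general-purpose S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits, hyphens and periods only, starting and
# ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic naming checks above."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    # Two adjacent periods are not allowed.
    return ".." not in name


def create_bucket(name: str, region: str) -> None:
    """Create the bucket with boto3 (assumes configured AWS credentials)."""
    import boto3  # imported lazily so the validator works without boto3

    if not is_valid_bucket_name(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    s3 = boto3.client("s3", region_name=region)
    if region == "us-east-1":
        # us-east-1 is the default region and takes no LocationConstraint.
        s3.create_bucket(Bucket=name)
    else:
        s3.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )
```

For example, is_valid_bucket_name("my-emr-bucket") passes, while "My-Bucket"
and "bucket_name" are rejected before any request is sent to AWS.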
Step 3: In this step you set the bucket properties. If you want to enable and
configure properties such as versioning and tags, do it here. You can also
enable and configure logging, such as server access logs and object-level
logs, in this step. Explore these properties further in the AWS documentation.
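
The same properties can be set from code. The sketch below assumes boto3 and
configured credentials; build_tagging and configure_bucket are illustrative
names of my own, and the bucket name and tags you pass in are up to you.

```python
def build_tagging(tags: dict) -> dict:
    """Convert a plain dict into the TagSet structure S3 expects."""
    return {
        "TagSet": [{"Key": key, "Value": value} for key, value in sorted(tags.items())]
    }


def configure_bucket(bucket: str, tags: dict) -> None:
    """Enable versioning and apply tags (assumes configured AWS credentials)."""
    import boto3  # imported lazily so build_tagging works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    s3.put_bucket_tagging(Bucket=bucket, Tagging=build_tagging(tags))
```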
Step 4: In this “set-permissions” step you grant the necessary permissions to
the existing admin user, or you can give access to another AWS account that is
present in the IAM groups, with READ and WRITE permissions for objects and
object permissions. Leave the remaining two properties shown in the picture at
their default settings, as recommended by AWS.
Step 5: In this step, review the settings you have provided. Verify
everything and then click Create Bucket.
Step 6: Next, click EC2 under the Services tab. On the left-hand side, click
Key Pairs, which opens the tab shown in the picture. Then click Create Key
Pair and provide a name for the key. Save the key file on your local machine
if you want to SSH into the EC2 machines later. Steps 1 through 6 are the
prerequisites for spinning up the EMR cluster.
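
Here is a small sketch of creating and saving a key pair from Python instead of
the console. It assumes boto3 and configured credentials. The helper names
save_private_key and create_key_pair are my own, and the key name and path are
whatever you choose. The file is written as owner read-only because SSH
clients usually refuse private keys that other users can read.

```python
import os


def save_private_key(key_material: str, path: str) -> None:
    """Write the private key to a new file that only the owner can read."""
    # O_EXCL makes this fail instead of overwriting an existing key file.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags, 0o400)
    with os.fdopen(fd, "w") as handle:
        handle.write(key_material)


def create_key_pair(key_name: str, path: str) -> None:
    """Create an EC2 key pair and store its private key locally."""
    import boto3  # imported lazily so save_private_key works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_key_pair(KeyName=key_name)
    # AWS returns the private key only once, in the KeyMaterial field.
    save_private_key(response["KeyMaterial"], path)
```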
Step 7: Now the EMR cluster setup itself begins. Select EMR under the
Services tab and click “Create Cluster”.
Step 8: Enter the necessary details and specify the S3 folder by clicking the
folder symbol, which displays all the S3 buckets associated with the account.
If you want to manually select the Hadoop services to install, click the
“Advanced Options” button. There, select all the services you want and then
proceed to the next step.
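
For reference, the details entered in this step map roughly onto a single
boto3 run_job_flow request. This is only a sketch under assumptions: the
release label, instance type, and instance count are placeholder values I
picked, the default EMR roles (EMR_DefaultRole and EMR_EC2_DefaultRole) are
assumed to exist in your account, and build_cluster_request and create_cluster
are hypothetical helper names.

```python
def build_cluster_request(name, log_bucket, key_name, applications, instance_count=3):
    """Assemble the keyword arguments for an EMR run_job_flow call."""
    return {
        "Name": name,
        # S3 folder where EMR writes its logs (the bucket from Step 1).
        "LogUri": f"s3://{log_bucket}/logs/",
        "ReleaseLabel": "emr-6.15.0",  # placeholder release; pick your own
        # Services to install, e.g. ["Hadoop", "Spark", "Hive"].
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # placeholder instance type
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": instance_count,
            "Ec2KeyName": key_name,  # the key pair from Step 6
            # Keep the cluster alive after startup instead of auto-terminating.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def create_cluster(request: dict) -> str:
    """Submit the request and return the new cluster ID."""
    import boto3  # imported lazily so the builder works without boto3

    emr = boto3.client("emr")
    return emr.run_job_flow(**request)["JobFlowId"]
```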
Step 9: Once you hit the “Create cluster” button at the bottom, cluster
creation starts, as shown in the picture. Sit back and relax; it takes a
while until the status says “Running”.
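
If you created the cluster from a script, you can wait for it in code rather
than watching the console. Below is a minimal polling sketch. It expects an
object with a describe_cluster method shaped like the boto3 EMR client; the
function name wait_for_cluster and the poll_seconds and max_polls defaults are
my own choices. boto3 also ships a built-in waiter,
emr.get_waiter("cluster_running"), which does much the same job.

```python
import time

READY_STATES = {"RUNNING", "WAITING"}
FAILED_STATES = {"TERMINATING", "TERMINATED", "TERMINATED_WITH_ERRORS"}


def wait_for_cluster(emr, cluster_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll the cluster state until it is ready, fails, or we give up."""
    for _ in range(max_polls):
        response = emr.describe_cluster(ClusterId=cluster_id)
        state = response["Cluster"]["Status"]["State"]
        if state in READY_STATES:
            return state
        if state in FAILED_STATES:
            raise RuntimeError(f"cluster {cluster_id} ended in state {state}")
        # Still STARTING or BOOTSTRAPPING: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(f"cluster {cluster_id} not ready after {max_polls} polls")
```

With a real client you would call it as
wait_for_cluster(boto3.client("emr"), cluster_id).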
Step 10: Once the cluster is ready, it looks like this and you are all set to
use it. To access the EC2 instances through the CLI, follow the AWS
documentation, which describes how to set up the SSH tunneling you need.
Note: You may get an error when you try to SSH to the EC2 instances through
the CLI. To fix it, define inbound and outbound rules on the instances'
security group: enable the “All TCP” rule in the inbound rules for your IP
range, or leave it open to everyone, which is less secure. Similarly, define
outbound rules and enable the “All TCP” rule there as well. How you customize
these rules depends entirely on your requirements.
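
As a rough sketch, an inbound rule like the one in the note can also be added
with boto3. The helper names tcp_rule and allow_inbound are hypothetical, and
the security group ID and CIDR range must come from your own setup. The
default port range mirrors the “All TCP” rule; passing 22 for both ports
narrows it to SSH only.

```python
import ipaddress


def tcp_rule(cidr: str, from_port: int = 0, to_port: int = 65535) -> dict:
    """Build one IpPermissions entry allowing TCP from an IPv4 CIDR range."""
    # Raises ValueError if `cidr` is not a valid IPv4 network.
    network = ipaddress.IPv4Network(cidr)
    if not 0 <= from_port <= to_port <= 65535:
        raise ValueError("invalid port range")
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": str(network)}],
    }


def allow_inbound(group_id: str, rule: dict) -> None:
    """Attach the rule to a security group (assumes configured credentials)."""
    import boto3  # imported lazily so tcp_rule works without boto3

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=[rule])
```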