0. AWS
Q. What is AWS?
- Amazon Web Services (AWS) is a cloud provider.
- AWS provides servers and services that can be used on demand and scaled easily.
Q. SSO?
Single Sign-On: one set of credentials is used to log in to many applications or accounts.
Q. AWS Cloud Use Cases?
AWS is used for:
- Enterprise IT,
- Data analytics
- Website hosting
- Gaming
- Storage
- Mobile
- Social Apps
Q. AWS Well-Architected Tool?
The AWS Well-Architected Tool provides a consistent approach for reviewing your architecture against AWS best practices.
Q. Pricing in AWS?
It is based on 3 factors.
1) Compute (CPU, resources, memory)
2) Storage
3) Outbound data transfer (like downloading from S3)
Q. How to Choose region?
1) Compliance with data governance and legal rules.
2) Proximity to Customers
3) Pricing
4) Available Services
5) Latency: how long requests take to travel between your users and the Region
Q. Edge location use?
Used for caching data close to users.
For example, when a user in India accesses a file stored in the US East Region for the first time, the file is fetched from the US East Region. If any other user from India then tries to access the same file, it is served directly from the caching server in India.
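The caching behaviour described above can be sketched as a tiny simulation. This is only an illustration of "first request goes to the origin, later requests are served locally"; the class name `EdgeCache` and the file name are hypothetical, and none of this is an actual AWS API.

```python
class EdgeCache:
    """Toy model of an edge location that caches objects from an origin Region."""

    def __init__(self, origin):
        self.origin = origin      # dict standing in for storage in the origin Region
        self.cache = {}           # objects already copied to the edge location
        self.origin_fetches = 0   # number of times we had to go back to the origin

    def get(self, key):
        if key not in self.cache:
            # Cache miss: fetch from the origin Region and keep a local copy.
            self.cache[key] = self.origin[key]
            self.origin_fetches += 1
        return self.cache[key]


us_east_origin = {"report.pdf": b"file contents"}   # hypothetical file
india_edge = EdgeCache(us_east_origin)

first = india_edge.get("report.pdf")    # first user in India: fetched from the origin
second = india_edge.get("report.pdf")   # next user in India: served from the edge cache
print(india_edge.origin_fetches)
```

Both requests return the same bytes, but only the first one reaches the origin.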
Q. Maximum limits set by AWS for roles and users in an account?
Default IAM quotas per account:
1000 roles,
5000 users,
300 groups,
10 managed policies attached to each IAM user or role,
10 groups that an IAM user can be a member of.
For the adjustable quotas, you can raise a ticket with the support team and ask for an increase.
1. IAM
Q. IAM
- Identity and Access Management
- enables you to manage the access to AWS services and resources securely.
- IAM Service usage is FREE!
- IAM is a Global service.
Q. Features of IAM:
- Securely manage services and resources.
- Creates IAM principals such as Users, Groups and Roles.
- Allow/Deny access via IAM Policy.
- Supports Identity Federation.
- Provides Multi-Factor Authentication.
- Provides Policy Simulator.
Q. How can users access AWS?
AWS can be accessed in 3 ways:
- AWS Management Console (a password is used to log in)
- AWS Command Line Interface (access keys)
- AWS Software Development Kit, or SDK (access keys)
Q. What is a Role in IAM?
A role is not permanently linked to one person, user, or service. Instead, a trusted user or service assumes the role, which grants temporary permissions to access the required services.
Q. What are policies in IAM?
There are 2 types of managed policies.
1) AWS managed policies (created and maintained by AWS)
2) Customer managed policies (created by us, and we have to manage them)
Policies are:
1. A policy is an entity in AWS that, when attached to an identity or resource, defines its permissions.
2. A policy contains permissions that determine whether a request is allowed or denied.
3. Policies are stored in AWS as JSON documents.
4. A policy is attached either to principals (users, roles, or groups) or to resources (e.g., an S3 bucket).
Q. IAM policy consists of?
A policy consists of:
1) Version: The policy language version; always include “2012-10-17”
2) Id: Identifier for the policy (optional); it helps to quickly recognize what the policy is for
3) Statement: one or more required
A statement consists of:
1) Sid: Statement identifier (optional)
2) Effect: Whether the statement allows or denies access (Allow or Deny)
3) Principal: The account/user/role to which this policy applies (an ARN is used here); used in resource-based policies
4) Action: Set of actions the policy allows or denies
5) Resource: List of resources to which the actions apply
6) Condition: Conditions under which the policy is in effect (optional)
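The elements listed above fit together as in the following minimal sketch, which builds a policy document as a Python dictionary and serializes it to JSON. The account ID, user name, bucket name, and the `Id`/`Sid` values are placeholders chosen for illustration, not values from a real account.

```python
import json

# Hypothetical resource-based policy; every name and ID below is a placeholder.
policy = {
    "Version": "2012-10-17",                 # policy language version
    "Id": "ExampleBucketPolicy",             # optional policy identifier
    "Statement": [
        {
            "Sid": "AllowReadFromOneUser",   # optional statement identifier
            "Effect": "Allow",               # Allow or Deny
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/example-user"},
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-bucket/*"],
            # Only applies when the request is made over HTTPS.
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

An identity-based policy would look the same without the `Principal` element, since the principal is the identity the policy is attached to.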
Q. IAM Terminologies:
- Resources: The things on which actions can be taken. Example: EC2 instances are resources, so someone who starts or stops an instance is acting on a resource.
- Principals: The things that can take action. Principals act, and they act upon resources. They include users and roles; groups are collections of users. These are also called identities.
- Users: IAM users are entities created in AWS that have permissions to perform some actions.
- Group: A collection of IAM users.
- Role: An identity with permissions that can be assumed by a trusted user or service.
Q. What is ARN?
ARN: Amazon Resource Name
It is a unique identifier for each resource created in AWS.
It is helpful when we write policies and rules.
It is automatically assigned to the resources we create.
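An ARN follows the general layout `arn:partition:service:region:account-id:resource`. The sketch below splits an ARN into those parts; the helper `parse_arn` and the example ARNs are made up for illustration and are not part of any AWS SDK.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    # Split into at most 6 parts so colons inside the resource part are preserved.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


# Placeholder ARNs: IAM is a global service and S3 bucket ARNs omit
# region and account, so some fields are empty strings.
user_arn = parse_arn("arn:aws:iam::123456789012:user/example-user")
bucket_arn = parse_arn("arn:aws:s3:::example-bucket")
print(user_arn)
print(bucket_arn)
```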
Q. Default deny in IAM?
An IAM user or IAM group has no permissions to access AWS resources by default. This is called “default deny” or “implicit deny”.
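The default deny can be illustrated with a simplified evaluation function: a request is allowed only if some statement allows it and no statement explicitly denies it. This is a rough sketch, not the real IAM evaluation engine; the function name `is_allowed` is hypothetical, and wildcard matching is approximated with Python's `fnmatchcase`.

```python
from fnmatch import fnmatchcase


def is_allowed(statements, action, resource):
    """Return True only if some statement allows the request and none denies it."""
    allowed = False
    for stmt in statements:
        matches = (
            any(fnmatchcase(action, pattern) for pattern in stmt["Action"])
            and any(fnmatchcase(resource, pattern) for pattern in stmt["Resource"])
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return False      # an explicit deny always wins
        allowed = True        # remember the allow, keep looking for a deny
    return allowed            # no matching allow means default (implicit) deny


# Placeholder statements for illustration.
statements = [
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["arn:aws:s3:::example-bucket/*"]},
    {"Effect": "Deny", "Action": ["s3:DeleteObject"], "Resource": ["*"]},
]
obj = "arn:aws:s3:::example-bucket/a.txt"

print(is_allowed([], "s3:GetObject", obj))                # no policies at all
print(is_allowed(statements, "s3:GetObject", obj))        # matched by the Allow
print(is_allowed(statements, "s3:DeleteObject", obj))     # matched by the Deny
```

A new user with no policies attached corresponds to the empty statement list, so every request falls through to the default deny.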
Q. How to create/import 1000 users at a time in AWS?
By bulk-importing a CSV file using AWS SSO (Single Sign-On), we can create 1000 users at a time.
Q. What is MFA in IAM?
AWS Multi-Factor Authentication (MFA) is a simple best practice that adds an extra layer of protection on top of your user name and password.
Q. What are identity-based policies?
Policies that you can attach to a principal (or identity) such as an IAM user, role or group.
These policies control what actions that identity can perform, on which resources, and under what conditions.
Q. Resource-based policies?
Policies that you attach to a resource, such as an S3 bucket, an IAM role (its trust policy), or a Lambda function.
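Q. How are policies different from permissions boundaries?
A permissions boundary is a policy that sets the maximum permissions that identity-based policies can grant to a user or role.
Example:
If you have an administrator policy on your role but the permissions boundary only gives access to DynamoDB, then you are effectively an admin for DynamoDB only.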
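One way to picture the example above is as a set intersection: an action is effective only when both the identity-based policy and the permissions boundary allow it. The sketch below uses plain action names without wildcards; the function `effective_permissions` and the action lists are illustrative assumptions, not an AWS API.

```python
def effective_permissions(identity_policy_actions, boundary_actions):
    """An action is effective only if the policy AND the boundary both allow it."""
    return set(identity_policy_actions) & set(boundary_actions)


# Hypothetical broad policy attached to the role.
identity_policy = {
    "dynamodb:GetItem",
    "dynamodb:PutItem",
    "s3:GetObject",
    "ec2:StartInstances",
}
# Hypothetical boundary that only covers DynamoDB.
boundary = {"dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"}

effective = effective_permissions(identity_policy, boundary)
print(sorted(effective))
```

Note that `dynamodb:Query` is not effective either: the boundary allows it, but the identity policy never granted it.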
Q. What is Access Analyzer in IAM?
With IAM Access Analyzer you can analyze:
1) Which resources in your account (for example S3 buckets or IAM roles) are shared with an external entity.
2) Whether your policies are valid and follow IAM best practices.
2. EC2
Q. EC2
- Elastic Compute Cloud.
- Infrastructure as a Service (IaaS).
- Virtual servers to run applications.
- Scalable computing capacity in the AWS Cloud (increase/decrease as required).
- Scale up or down to handle changing requirements and reduce the need to forecast traffic.
Q. Types of EC2 instances?
- General purpose. (t2, t3, t4, m4, m5) - balance between compute, memory, and networking
- Compute optimized. (c4, c5, c6)
- Memory optimized. (r4, r5, r6, u, x1, x2, z1)
- Storage optimized. (d2, d3, h1, i3, im, is)
- Accelerated computing (f1, g2, g3, g4…)
Q. Instance purchasing options?
Amazon EC2 provides the following purchasing options:
On-Demand Instances – Pay by the second for the instances that you launch.
Savings Plans – Commit to a consistent amount of usage for a term of 1 or 3 years.
Reserved Instances – Reserve instances for a term of 1 or 3 years.
Spot Instances – Request unused EC2 capacity at a reduced price.
Dedicated Hosts – Pay for a physical host that is fully dedicated to running your instances.
Dedicated Instances – Pay by the hour for instances that run on single-tenant hardware.
Capacity Reservations – Reserve capacity for your EC2 instances in a specific Availability Zone for any duration.
Q. Type of scaling recommended for Amazon databases?
Auto Scaling
Q. Different types of load balancers?
- Application Load Balancer,
- Network Load Balancer,
- Gateway Load balancer,
- Classic load balancer.
Q. How can you auto-delete snapshots?
The AWS Ops Automator solution is used, which handles snapshots automatically.
Q. EC2 storage volumes?
- Storage volumes for temporary data, known as instance store volumes. The data is lost when you stop, hibernate, or terminate your instance.
Ex: used for testing, POCs
- Persistent storage volumes for your data using Amazon Elastic Block Store (Amazon EBS), known as Amazon EBS volumes.
3. S3
Q. S3
Simple Storage Service
Q. Default storage class in S3?
S3 Standard.
Q. Can a bucket have different objects in different storage class?
Yes
Q. Amazon S3 is which type of service?
Object storage
Q. For which types of data does S3 offer encryption?
- Data in flight (in transit)
- Data at rest
Q. Are S3 objects accessible from any Region?
True (given the right permissions).
Q. S3 storage classes:
1. Standard
2. Standard-IA
3. Intelligent-Tiering
4. One Zone-IA
5. Glacier Instant retrieval
6. Glacier Flexible retrieval
7. Glacier Deep archive.
Q. Copy a file from an EC2 instance to S3?
- SSH into the EC2 instance, then run:
>aws s3 cp abc.txt s3://my-bucket-1b
Q. Copy a file from S3 to an EC2 instance?
>aws s3 cp s3://my-bucket-1b/xyz.txt .
4. VPC
Q. VPC main components?
VPC: Networking layer for EC2.
Subnet: A logical partition of an IP network into smaller segments, each with its own range of IP addresses.
CIDR: Classless Inter-Domain Routing, a method for allocating IP addresses.
Route Tables: Sets of rules, called routes, that are used to determine where network traffic is directed.
Internet Gateway: A gateway that you attach to your VPC to enable communication between resources in your VPC and the internet.
VPC Endpoints: Allow you to connect to AWS services over a private network instead of the public internet. No Internet Gateway, NAT device, or VPN connection is needed, and instances do not require public IP addresses to communicate with resources in the service.
Q. Host ID in IPv4?
IPv4 address: 172.16.47.5 (assuming a /16 prefix)
172.16 -- network prefix
47.5 -- host ID
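The split between network prefix and host ID can be checked with Python's standard `ipaddress` module. This sketch assumes a /16 prefix length, which is what makes the first two octets the network part; with a different prefix the split would move.

```python
import ipaddress

# Assumption: the address sits in a /16 network.
iface = ipaddress.ip_interface("172.16.47.5/16")

# Network part: the address with all host bits set to zero.
network_address = iface.network.network_address

# Host part: keep only the bits that are NOT covered by the prefix.
host_bits = int(iface.ip) & int(iface.network.hostmask)
host_id = ipaddress.ip_address(host_bits)

print("network:", network_address)
print("host id:", host_id)
```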
Q. IP address calculation in CIDR?
10.0.0.0/24 -> 256 IP addresses (10.0.0.0 – 10.0.0.255)
Every IPv4 address has 32 bits
Host bits = 32 - CIDR prefix = 32 - 24 = 8
No. of IPs = 2^8 = 256
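The same arithmetic can be reproduced with Python's standard `ipaddress` module. The helper name `address_count` is only for illustration; it computes 2^(32 - prefix) by hand and cross-checks the result against the library's own count.

```python
import ipaddress


def address_count(cidr: str) -> int:
    """Number of IP addresses in a CIDR block, computed as 2 ** host_bits."""
    network = ipaddress.ip_network(cidr)
    host_bits = network.max_prefixlen - network.prefixlen   # e.g. 32 - 24 = 8
    count = 2 ** host_bits
    # Cross-check the manual calculation against the library's own value.
    assert count == network.num_addresses
    return count


network = ipaddress.ip_network("10.0.0.0/24")
first_ip, last_ip = network[0], network[-1]

print(address_count("10.0.0.0/24"))
print(first_ip, "-", last_ip)
```

The count here is the raw size of the block, matching the calculation above.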
Q. IPv4 vs IPv6
| IPv4 | IPv6 |
|---|---|
| 1) Format is 32-bit, 4 groups of up to 3 decimal digits | 1) Format is 128-bit, 8 groups of 4 hexadecimal digits |
| 2) Default and required for all VPCs | 2) Optional |
| 3) VPC CIDR block size can be from /16 to /28 | 3) VPC CIDR block size fixed at /56 |
| 4) Subnet CIDR block size can be from /16 to /28 | 4) Subnet CIDR block size fixed at /64 |
| 5) You can choose the private IPv4 CIDR block for your VPC | 5) |
Q. NACL vs Security Group?
A Security Group (SG) works at the instance level; a Network ACL (NACL) works at the subnet level.
5. Lambda
6. Glue
Q. Access S3 for a Glue job?
An IAM role with access to S3, along with policies such as AWSGlueServiceRole, AWSGlueServiceNotebookRole, and AWSGlueConsoleFullAccess.
Q. Job type (where does the job run)?
Spark, Spark Streaming, Python shell
Q. Glue version?
Spark 3.1, Python 3 (Glue version 3.0)
Q. Script stored at?
In S3 bucket in specific folder
Q. Temporary directory in Glue job?
Location where temporary results are saved.
Q. Incremental load implementation?
Under Advanced properties, enable Job bookmark.
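The idea behind a bookmark is that each run remembers how far the previous run got and processes only newer data. The plain-Python sketch below illustrates that idea with a "last modified" high-water mark; it is a conceptual illustration only, and the function `run_incremental` and the record layout are hypothetical rather than how Glue implements bookmarks internally.

```python
def run_incremental(records, bookmark):
    """Process only records newer than the bookmark and return the new bookmark."""
    new_records = [r for r in records if r["modified"] > bookmark]
    if new_records:
        # Advance the bookmark to the newest record seen in this run.
        bookmark = max(r["modified"] for r in new_records)
    return new_records, bookmark


# Hypothetical source data; "modified" stands in for a timestamp.
source = [{"id": 1, "modified": 100}, {"id": 2, "modified": 200}]

first_batch, bookmark = run_incremental(source, bookmark=0)    # first run: everything
source.append({"id": 3, "modified": 300})                       # new data arrives
second_batch, bookmark = run_incremental(source, bookmark)      # second run: only id 3

print([r["id"] for r in first_batch], [r["id"] for r in second_batch], bookmark)
```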
Q. Monitoring options?
Job metrics (enable creation of CloudWatch metrics)
Continuous logging
Spark UI (enable Spark UI for monitoring)
Security, Script libraries, job parameters?
Q. Library path?
Specify S3 bucket folder/key path.
Q. Dependent jars path?
Specify S3 bucket folder/key path.
Q. Referenced files path?
Specify S3 bucket folder/key path.
Q. Worker type?
G.1X (for memory-intensive jobs): max 299 workers
G.2X (for ML transformations): max 149 workers
Capacity is measured in Data Processing Units (DPUs).
Q. Max number of concurrent runs allowed for this job?
Max concurrency (default 1), error is returned when threshold reached.
Q. Number of retries?
Specify how many times the job should be retried if it fails.
Q. Job timeout in a Glue job?
Job execution time limit. Default is 2880 min (48 hrs).
Q. Delay notification threshold (in min)?
If the job runs longer than the specified time, Glue sends a notification via CloudWatch.
Q. Catalog options?
Use Glue Catalog as Hive meta store.
Q. Choose Data source?
The data source has to be crawled, and its schema table should exist in a Glue Data Catalog database.
Q. Choose data target?
Create tables in data target.
Use tables in the data Catalog and update your data target.
Data store: S3, JDBC (select connection)