AWS Solutions architect 3 - Compute
Server-Based Compute (EC2) Fundamentals
Architecture
graph LR A[EC2 host] B[Instance Store volume] C[EC2 instance] D[EBS] E[AMI] F[CloudWatch] G[Bucket] H(ENI - Elastic Network interface) A --> B; B --> C; C --> A; C --> D; E --> C; C --> F; D --> G; C -- security group--> H;
- EC2 instance states
  - main states
    - running (the only state in which you are billed for the instance)
    - stopped
  - intermediate states (you are not billed for these)
    - pending
    - stopping
    - terminating (the instance is being deleted)
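The billing rule in the list above can be sketched in Python. This is only an illustration: `InstanceState` and `is_billed` are hypothetical names, not part of any AWS SDK, and the sketch ignores separately charged resources such as attached EBS storage.

```python
from enum import Enum


class InstanceState(Enum):
    """EC2 lifecycle states (illustrative enum, not an SDK type)."""
    PENDING = "pending"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"
    SHUTTING_DOWN = "shutting-down"
    TERMINATED = "terminated"


def is_billed(state: InstanceState) -> bool:
    """Per the notes, instance usage is only billed while running."""
    return state is InstanceState.RUNNING


# Collect every state that incurs instance charges under this rule.
billed = [s.value for s in InstanceState if is_billed(s)]
print(billed)
```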
Types/sizes
- EC2 instances are grouped into families
  - general purpose
  - compute optimized
  - memory optimized
  - accelerated computing
- Types
  - T2 and T3: low-cost, provide burst capability
  - M5: for general workloads
  - C4: provides more capable CPU
  - X1 and R4: optimized for large amounts of fast memory
  - I3: deliver fast IO
  - P2, G3 and F1: deliver GPUs and FPGAs
- sizes
  - nano
  - micro
  - small
  - medium
  - large
  - xlarge
  - 2xlarge
  - larger
- special cases
  - ‘a’: use AMD CPUs
  - ‘A’: Arm-based
  - ‘n’: higher speed networking
  - ‘d’: NVMe storage
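As a rough illustration of how a type name combines family, generation, special-case letters and size, here is a small parser. The regular expression and the helper names (`parse_instance_type`, `describe`) are assumptions made for this sketch; it only handles simple names such as `m5d.xlarge` and is not an official naming grammar.

```python
import re
from typing import List, NamedTuple


class InstanceType(NamedTuple):
    family: str
    generation: int
    attributes: str
    size: str


# Assumed shape: family letters, generation digits, optional attribute
# letters, a dot, then the size (e.g. "c5n.2xlarge").
_PATTERN = re.compile(r"^([a-z]+?)(\d+)([a-z]*)\.([0-9a-z]+)$")

# Lowercase special-case letters taken from the list above.
ATTRIBUTE_MEANINGS = {
    "a": "AMD CPUs",
    "n": "higher speed networking",
    "d": "NVMe storage",
}


def parse_instance_type(name: str) -> InstanceType:
    """Split a name like 'm5d.xlarge' into its parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised instance type: {name!r}")
    family, generation, attributes, size = match.groups()
    return InstanceType(family, int(generation), attributes, size)


def describe(name: str) -> List[str]:
    """Return the meaning of each known special-case letter in the name."""
    attributes = parse_instance_type(name).attributes
    return [ATTRIBUTE_MEANINGS[a] for a in attributes if a in ATTRIBUTE_MEANINGS]


print(parse_instance_type("m5d.xlarge"), describe("m5d.xlarge"))
```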
Storage architecture
- Elastic Block Store (EBS): storage service that creates and manages volumes based on 4 underlying storage types
- Volumes: persistent, can be attached to and detached from EC2 instances, and are replicated within a single AZ.
  - types
    - mechanical
      - sc1: lowest cost, infrequent access, can’t be a boot volume
      - st1: low cost, throughput intensive, can’t be a boot volume
    - solid state
      - gp2: default, balance of IOPS and MiB/s, with a burst pool and IOPS that scale per GB
      - io1: highest performance, can adjust size and IOPS separately
- To protect against AZ failure, EBS snapshots (to S3) can be used. Snapshot data is replicated across AZs in the region and can (optionally) be copied internationally
EBS Snapshots
- EBS snapshots: a point-in-time backup of an EBS volume stored in S3
- Initial snapshot: full copy of the volume
- Future snapshots only store data changed since the last snapshot
- Uses
- move/copy instances between AZs
- It is recommended to power off the instance, or flush the disk, before creating a snapshot
- snapshots can be copied between regions, shared, and automated using DLM (Data Lifecycle Manager)
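The "full first, incremental afterwards" behaviour can be modelled with a toy block device. This is a conceptual sketch only: the `Volume` class and `restore` function are hypothetical and do not reflect how EBS stores snapshot data internally.

```python
from typing import Dict, List, Set


class Volume:
    """Toy block device: maps block index -> contents."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.snapshots: List[Dict[int, bytes]] = []
        self._dirty: Set[int] = set()

    def write(self, index: int, data: bytes) -> None:
        """Write a block and remember that it changed."""
        self.blocks[index] = data
        self._dirty.add(index)

    def snapshot(self) -> Dict[int, bytes]:
        """Store only the blocks written since the previous snapshot."""
        delta = {i: self.blocks[i] for i in sorted(self._dirty)}
        self.snapshots.append(delta)
        self._dirty.clear()
        return delta


def restore(snapshots: List[Dict[int, bytes]]) -> Dict[int, bytes]:
    """Rebuild the volume by replaying snapshots oldest to newest."""
    blocks: Dict[int, bytes] = {}
    for delta in snapshots:
        blocks.update(delta)
    return blocks


vol = Volume()
vol.write(0, b"boot")
vol.write(1, b"data-v1")
first = vol.snapshot()    # initial snapshot: every written block
vol.write(1, b"data-v2")
second = vol.snapshot()   # later snapshot: only the changed block
```

Restoring from the full chain (`restore(vol.snapshots)`) reproduces the current volume contents, which is why a new volume in another AZ can be created from the latest snapshot.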
Security Groups
- Security groups are software firewalls that can be attached to network interfaces
- Each has inbound and outbound rules (rules, from-to)
- They have a hidden implicit/default deny rule, but cannot explicitly deny traffic
- Stateful
- Can reference AWS resources, other security groups and even themselves
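A minimal sketch of the allow-only evaluation described above: traffic passes if any rule matches, and everything else falls through to the implicit deny. The `Rule` dataclass and `allows` function are hypothetical names for illustration. The sketch covers CIDR sources only and does not model statefulness (automatic return traffic) or rules that reference other security groups.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    protocol: str    # e.g. "tcp"
    from_port: int   # start of the allowed port range
    to_port: int     # end of the allowed port range
    cidr: str        # allowed source (inbound) or destination (outbound)


def allows(rules: List[Rule], protocol: str, port: int, address: str) -> bool:
    """Return True if any rule matches; otherwise the implicit deny applies."""
    ip = ipaddress.ip_address(address)
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and ip in ipaddress.ip_network(rule.cidr)
        for rule in rules
    )


# Example inbound rules: HTTPS from anywhere, SSH only from inside the VPC range.
inbound = [
    Rule("tcp", 443, 443, "0.0.0.0/0"),
    Rule("tcp", 22, 22, "10.0.0.0/16"),
]
print(allows(inbound, "tcp", 22, "203.0.113.7"))
```

There is no way to express a deny rule in this model, which mirrors the point that security groups can only allow.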
Instance metadata
- Instance metadata: data relating to the instance that can be accessed from within the instance itself, using any utility capable of making HTTP requests, at the following URLs

Address (IP must be memorized) | Description |
---|---|
http://169.254.169.254/latest/meta-data/ | Latest metadata |
http://169.254.169.254/latest/meta-data/ami-id | AMI id |
http://169.254.169.254/latest/meta-data/instance-id | Instance id |
http://169.254.169.254/latest/meta-data/instance-type | Instance type |

- A way for scripts and applications running on EC2 to get visibility of data they would otherwise need API calls for
- The metadata can provide the current external IPv4 address of the instance, which is not configured on the instance itself but provided by the Internet gateway in the VPC. It also provides the AZ the instance was launched in and the security groups applied to the instance. In the case of spot instances, it also provides the approximate time the instance will terminate
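The following sketch shows how a script might build and query metadata URLs using only the standard library. The helper names are hypothetical. Note the hyphen in the `meta-data` path segment. `fetch_metadata` assumes the token-based IMDSv2 flow and only works from inside an EC2 instance, so it is defined here but not called.

```python
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest"


def metadata_url(path: str = "") -> str:
    """Build the URL for a metadata category such as 'instance-id'."""
    return f"{METADATA_BASE}/meta-data/{path.lstrip('/')}"


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value. Only reachable from inside an instance."""
    # IMDSv2: request a short-lived session token first, then present it.
    token_request = urllib.request.Request(
        f"{METADATA_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()
    request = urllib.request.Request(
        metadata_url(path),
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()


print(metadata_url("instance-id"))
```

On an instance, `fetch_metadata("placement/availability-zone")` would be the kind of call that returns the AZ mentioned above.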
Server-Based Compute (EC2) Intermediate
AMI
- AMI (Amazon Machine Image): used to build instances
- Stores snapshots of EBS volumes, permissions and a block device mapping, which configures how the instance OS sees the attached volumes
- AMIs can be shared, free or paid, and can be copied to other AWS regions
- Steps
- Configure instance (source instance and attached EBS volumes are configured with any required software and configuration)
- Create image (snapshots are created from volumes, AMI references snapshots, permissions and block device mappings)
- Launch instance -> new instance
- With appropriate launch permissions, instances can be created from an AMI. EBS volumes are created using snapshots as the source, and an EC2 instance is created using the block device mapping to reference its new volumes
Bootstrap
- Bootstrapping is a process where instructions are executed on an instance during its launch process. Bootstrapping is used to configure the instance, perform software installation, and add application configuration.
- In EC2, user data can be used to run shell scripts (Bash or PowerShell) or run cloud-init directives
graph TD A(S3 or Github) B(OS repository) C(User script) D[AMI - OS + baked components] E[Instance + user datascript] F[Final instance] A --> C; B --> C; C --> E; D --> E; E --> F;
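As a rough sketch of the bootstrapping flow above, the helper below assembles a Bash user-data script from a package list and extra commands. `build_user_data` is a hypothetical helper, and the use of `yum` assumes an Amazon Linux style AMI. The script starts with `#!` so that it is treated as a shell script rather than as cloud-init directives.

```python
from typing import List


def build_user_data(packages: List[str], commands: List[str]) -> str:
    """Assemble a Bash user-data script (hypothetical helper)."""
    lines = ["#!/bin/bash", "set -e"]
    if packages:
        # Assumes a yum-based AMI such as Amazon Linux.
        lines.append("yum install -y " + " ".join(packages))
    lines.extend(commands)
    return "\n".join(lines) + "\n"


script = build_user_data(["httpd"], ["systemctl enable --now httpd"])
print(script)
```

The resulting string is what would be supplied as user data at launch. Components that rarely change are better baked into the AMI, leaving only the variable parts to the script.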
ENI, IP, and DNS
private instance (only communicates inside the VPC)
- allocated an ip-x.x.x.x.ec2.internal DNS name - only works inside AWS
- private IP allocated when launching instances
- An ENI is assigned an IP
public instance (has a public IP address)
- Elastic IPs are static. When allocated, they replace the normal public IP, which is deallocated
- The public DNS resolves to the public address externally, but the private address internally
- a public IPv4 address can be allocated. This is allocated when the machine starts and deallocated when it stops
Instance roles
- EC2 instance roles: IAM roles that can be assumed by EC2 using an intermediary called an instance profile. An instance profile is either created automatically when using the console UI or manually when using the CLI. It’s a container for the role that is associated with an EC2 instance.
- The instance profile allows applications on the EC2 instance to access the credentials from the role using instance metadata
graph TD A[Permission policy] B[IAM role] C[EC2] D[Credentials] A -- STS assume role --> B; B -- instance metadata provide temp access --> C; C -- credentials can be used to access AWS --> D;
- Order of credentials
  - Command line options: `aws [command] --profile [profile name]`
  - Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
  - AWS CLI credentials file: `aws configure`, not recommended for production environments
  - Container credentials: IAM roles associated with AWS Elastic Container Service (ECS) Task Definitions. Recommended for ECS environments
  - Instance profile credentials: IAM roles associated with Amazon Elastic Compute Cloud (EC2) via Instance Profiles -> temporary credentials, recommended for EC2 environments
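The precedence order above amounts to "try each source in turn and stop at the first hit". The sketch below models that idea. `resolve`, `from_environment` and the provider chain are hypothetical names, the key values are fake placeholders, and the real CLI and SDK lookups involve more sources and details than shown here.

```python
from typing import Callable, Dict, List, Optional, Tuple

Credentials = Dict[str, str]
Provider = Tuple[str, Callable[[], Optional[Credentials]]]

ENV_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def from_environment(env: Dict[str, str]) -> Optional[Credentials]:
    """Return credentials from environment variables if both keys are set."""
    if "AWS_ACCESS_KEY_ID" in env and "AWS_SECRET_ACCESS_KEY" in env:
        return {key: env[key] for key in ENV_KEYS if key in env}
    return None


def resolve(providers: List[Provider]) -> Tuple[str, Credentials]:
    """Try each provider in order; the first one with credentials wins."""
    for name, provider in providers:
        credentials = provider()
        if credentials is not None:
            return name, credentials
    raise LookupError("no credentials found")


# Fake values for illustration only.
fake_env = {"AWS_ACCESS_KEY_ID": "AKIAEXAMPLE", "AWS_SECRET_ACCESS_KEY": "example-secret"}
chain: List[Provider] = [
    ("command line", lambda: None),
    ("environment", lambda: from_environment(fake_env)),
    ("credentials file", lambda: None),
    ("container", lambda: None),
    ("instance profile", lambda: {"AWS_ACCESS_KEY_ID": "ASIAEXAMPLE"}),
]
source, _ = resolve(chain)
print(source)
```

One practical consequence: long-lived keys left in environment variables on an instance take priority over the instance profile's temporary credentials.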
Server-Based Compute (EC2) advanced
EBS Volume and Snapshot Encryption
- Volume encryption uses EC2 host hardware to encrypt data at rest and in transit between EBS and EC2 instances
- Encryption generates a data encryption key (DEK) from a customer master key (CMK) in each region
- Snapshots of that volume are encrypted with the same DEK, as are any volumes created from that snapshot
graph TD A[KMS] B[EC2 host] C[Encrypted DEKs] D[EC2 instance] A -- plaintext DEKs in EC2 memory --> B; B --> D; C -- decrypted via CMK from host--> A; C --> D;
EC2 instance and OS see plaintext data as normal -> no performance impact
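The DEK/CMK relationship is a form of envelope encryption, which the toy below walks through. To stay dependency-free it uses XOR as a stand-in cipher, which is NOT real encryption, and all names are hypothetical. It only shows which key protects what.

```python
import os
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy cipher for illustration only. NOT real encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


cmk = os.urandom(32)                  # master key: stays inside the key service
dek = os.urandom(32)                  # data encryption key for one volume
encrypted_dek = xor_bytes(dek, cmk)   # stored alongside the volume

# What is written to disk is encrypted with the DEK, not the CMK.
ciphertext = xor_bytes(b"volume block", dek)

# On attach: the host asks the key service to decrypt the DEK, keeps the
# plaintext DEK in memory, and uses it for reads and writes.
plaintext_dek = xor_bytes(encrypted_dek, cmk)
recovered = xor_bytes(ciphertext, plaintext_dek)
print(recovered)
```

Because a snapshot carries the same encrypted DEK, a volume created from it can be read by any host that is allowed to decrypt that DEK with the CMK.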
EBS optimization, enhanced networking, placement groups
EBS optimization
- Legacy non-EBS-optimized instances used a shared networking path for data and storage communications
- EBS-optimized mode, which was historically optional and is now the default, adds separate optimized paths for storage and traditional data networking.
- Consistent utilization
- Required feature to support higher performance storage
Enhanced networking
- Traditional virtual networking: an EC2 host arranges access for n virtual machines to 1 physical network card (software multitasking = slow)
- Enhanced networking: uses SR-IOV, which allows a single physical network card to appear as multiple physical devices. Each instance can be given 1 of these (fake) physical devices (faster transfer rates, lower CPU usage, lower latency).
- EC2 delivers via Elastic Network Adapter (ENA)
- EC2 delivers via Intel 82599 Virtual Function (VF) Interface
Placement Groups
- Cluster PG: instances are placed next to each other in just 1 availability zone, all in the same place
- Partition PG: instances are placed next to each other within a partition, and partitions can be split across availability zones (e.g. part in one AZ, part in the other)
- Spread PG: similar to partition, max of 7 instances per AZ.
  - each instance occupies a partition
  - each instance has an isolated fault domain
  - great for mail servers, domain controllers, file servers and application HA pairs.
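The per-AZ limit for spread placement groups can be expressed as a simple check. This is a sketch under the 7-per-AZ figure quoted above. `can_place` and the instance ids are hypothetical, and the real placement decision is made by EC2, not by client code.

```python
from collections import Counter
from typing import Dict

MAX_SPREAD_INSTANCES_PER_AZ = 7  # limit quoted in the notes


def can_place(current: Dict[str, str], az: str) -> bool:
    """current maps instance id -> AZ for instances already in the group."""
    in_az = Counter(current.values())[az]
    return in_az < MAX_SPREAD_INSTANCES_PER_AZ


# Seven instances already placed in one AZ (made-up ids).
group = {f"i-{n}": "us-east-1a" for n in range(7)}
print(can_place(group, "us-east-1a"), can_place(group, "us-east-1b"))
```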
EC2 billing models
Spot and Spot Fleet
- Spot instances allow consumption of spare AWS capacity for a given instance type and size in a specific AZ
  - provided as long as your bid price is above the spot price
  - if the spot price exceeds your bid, instances are terminated with a 2 minute warning
- Spot fleets are containers for “capacity needs”
  - you specify pools of certain types/sizes, aiming for a given “capacity”. A minimum percentage of on-demand can be set to ensure the fleet is always active.
- perfect for non-critical workloads, burst workloads, and consistent non-critical jobs that can tolerate interruptions without impacting functionality
- not suitable for long-running workloads that require stability and cannot tolerate interruptions
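The interruption rule can be illustrated with a tiny simulation over a price history. The prices are made up and `first_interruption` is a hypothetical helper. It only models "the spot price rose above what you are willing to pay", not the 2 minute warning itself.

```python
from typing import List, Optional


def first_interruption(max_price: float, spot_prices: List[float]) -> Optional[int]:
    """Return the first period in which the spot price exceeds max_price."""
    for period, price in enumerate(spot_prices):
        if price > max_price:
            return period
    return None  # never interrupted in this price history


# Made-up hourly spot prices in USD.
prices = [0.031, 0.029, 0.035, 0.052, 0.030]
print(first_interruption(0.04, prices))
```

A workload that checkpoints its progress can simply resume once capacity is available again, which is why interruption-tolerant jobs fit this model.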
Reserved Instances
- Reserved instances: lock in a reduced rate for a 1 or 3 year term
- Zonal reserved instances: include capacity reservation
- Commitment: cost even if instances aren’t launched
- Reserved = long running, understood and consistent workloads
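Because a reservation is paid for whether or not the instance runs, it only wins above a certain utilization. The arithmetic below sketches that break-even point. The hourly rates are invented for illustration and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760


def yearly_cost_on_demand(hourly_rate: float, utilization: float) -> float:
    """On-demand cost: pay only for the fraction of the year actually used."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def yearly_cost_reserved(effective_hourly_rate: float) -> float:
    """Reserved cost: the commitment is paid for every hour of the year."""
    return effective_hourly_rate * HOURS_PER_YEAR


def break_even_utilization(on_demand_rate: float, reserved_rate: float) -> float:
    """Utilization above which the reservation is the cheaper option."""
    return reserved_rate / on_demand_rate


on_demand = 0.10   # made-up USD per hour
reserved = 0.06    # made-up effective USD per hour
threshold = break_even_utilization(on_demand, reserved)
print(f"reserved is cheaper above {threshold:.0%} utilization")
```

With these example rates, an instance that runs less than about 60% of the time is cheaper on demand, which matches the "long running, understood and consistent workloads" guidance.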
Dedicated Hosts
- Dedicated hosts: EC2 hosts for a given type and size that can be dedicated to you
- The number of instances that can run on the host is fixed, depending on the type and size
- An on-demand or reserved fee is charged for the dedicated host
- Used when software is licensed per core/CPU and not compatible with running within a shared cloud environment
Serverless compute (Lambda)
APIs and microservices
Microservices architecture: the inverse of a monolithic architecture; components are separated into microservices and operate independently.
- A microservice does 1 thing - and does it well
- Operations, updates and scaling can be done on a per-microservice basis
- By contrast, a monolith scales inflexibly: either increasing or decreasing the instance size or duplicating the instance
- Microservices operate as independent applications: allow direct communication between components and users
API (Application Programming Interface) is an interface accessed (consumed) by another service (REST/SOAP, JSON/XML)
- API endpoints are locations that allow API interaction; an endpoint hosts 1 or more APIs and makes them available on a network (public or private)
- static (abstracted from what the code is doing) -> lower risk changes
Serverless and event-driven architectures
- Serverless compute
- when using an event-driven architecture, a system operates around “events” that represent an action or a change of state
- efficient because events are generated and pushed, rather than being polled (traditional polling = compute always on, scales poorly)
- principles
- Back-end as a Service (BaaS) -> use 3rd party services where possible rather than running your own (auth, Cognito, DynamoDB)
- Function as a Service (FaaS) -> application logic
Lambda essentials
- Lambda: FaaS -> Functions = code which runs in a runtime
- Functions are invoked by events, perform actions for up to 15 minutes, and terminate
- Stateless (each run is clean, lack of persistence)
- Parts
- Runtime environment (prebuilt, e.g. Python)
- Function code
- Execution role (access to AWS -> temporary security credentials available via STS)
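- Lambda billing: duration was originally metered in 100 ms increments (now 1 ms); the free tier covers 1 million requests and 400,000 GB-seconds per month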
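A minimal sketch of the "function code" part for the Python runtime: Lambda calls a handler with the triggering event and a context object. The event field `name` is an assumption for this example, and the response shape (`statusCode` and `body`) is the one expected when the function sits behind an API Gateway proxy integration.

```python
import json


def handler(event, context):
    """Entry point invoked once per event; nothing is kept between runs."""
    name = event.get("name", "world")  # 'name' is a made-up event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }


# Local invocation for illustration; in Lambda the service supplies both arguments.
print(handler({"name": "ec2"}, None))
```

Since each run is stateless, anything that must survive between invocations belongs in an external store such as DynamoDB or S3.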
API gateway essentials
- API Gateway: managed API endpoint service
- Can be used to create, publish, monitor and secure APIs “as a service”
- Can use other AWS services for compute (FaaS, IaaS) as well as to store and recall data
Step functions
- Step Functions: serverless visual workflow service that provides state machines. A state machine can orchestrate other AWS services with simple logic, branching and parallel execution, and it maintains state.
- Workflow steps are known as states, and they can perform work via tasks
- Step Functions allows for long-running serverless workflows
- A state machine can be defined using Amazon States Language (ASL), which is JSON-based
- Longest runtime allowable on a state machine: up to 1 year
graph LR A[state machine is executed from another service or component] B[Lambda] subgraph AWS step functions C(Start) D[Lambda] E[Lambda] F(Manual approval) G(approved) H(rejected) end I[API gateway] J[Email] A --> B; B --> C; C --> D; D -- generate email --> J; J -- send email --> I; I --> E; E -- update states --> F; F --> G; F --> H;
Container-based compute and microservices
Docker
- Container: package that contains an application, libraries, and file systems required to run it
- Run on a container engine that generally runs within a single OS, such as Linux
- Provide the isolation benefits of virtualization but are more lightweight, allowing faster starts and denser packing on a host
- Example from EC2 machine
```bash
# Install and start Docker on Amazon Linux 2
sudo amazon-linux-extras install docker
sudo service docker start
# Let ec2-user run docker without sudo (log out and back in to apply)
sudo usermod -a -G docker ec2-user
# Fetch the sample application
sudo yum install git
git clone https://github.com/linuxacademy/content-aws-csa2019.git
cd content-aws-csa2019/lesson_files/03_compute/Topic5_Containers/Docker/
# Build the image, then run it with container port 80 published on the host
docker build -t containercat .
docker images --filter reference=containercat
docker run -t -i -p 80:80 containercat
# Push the image to Docker Hub (replace YOUR_USER and IMAGEID)
docker login --username YOUR_USER
docker images
docker tag IMAGEID YOUR_USER/containercat
docker push YOUR_USER/containercat
```
ECS
- A managed container engine: it allows Docker containers to be deployed and managed within AWS environments
- ECS can use infrastructure clusters (for backing infrastructure):
- Based on EC2
graph TD subgraph ECS A[Cluster manager] B[Placement engine] end subgraph EC2 C[EC2-1] D[EC2-2] end E(Image) A --> C; B -- task definition is used to create ECS task--> D; E -- stored and retrieved from registry --> D;
- Based on Fargate (service that encapsulates clusters)
graph TD subgraph ECS A[Cluster manager] B[Placement engine] end C[Fargate service] D(Image) A --> C; B -- task definition is used to create ECS task--> C; D -- stored and retrieved from registry --> C;
- Tips
- Cluster: logical collection of ECS resources
- Task definition: defines your application, similar to a dockerfile but for running containers in ECS
- Container definition: inside a task definition, a container definition defines the individual containers used by a Task. It controls the CPU and memory each container has, in addition to port mappings for the container
- Task: a single running copy of any containers defined by a task definition. 1 working copy of an application (e.g. DB and web containers)
- Service: services allow task definitions to be scaled by adding additional tasks. Defined Minimum and Maximum values
- Registry: storage for container images (e.g. Amazon Elastic Container Registry (ECR), Docker Hub). Used to download images to create containers
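To make the relationship between a task definition and its container definitions concrete, here is a sketch of the JSON-like structure as a Python dictionary. The family name, container name and resource values are made up, and the image reuses the `YOUR_USER/containercat` placeholder from the Docker example. Only a few commonly used fields are shown.

```python
import json
from typing import List

task_definition = {
    "family": "containercat",                   # made-up task family name
    "containerDefinitions": [
        {
            "name": "web",                      # made-up container name
            "image": "YOUR_USER/containercat",  # pulled from a registry
            "cpu": 256,                         # CPU units for this container
            "memory": 512,                      # memory in MiB
            "portMappings": [
                {"containerPort": 80, "hostPort": 80, "protocol": "tcp"}
            ],
        }
    ],
}


def total_memory(definition: dict) -> int:
    """Sum the memory requested by every container in the task."""
    return sum(c["memory"] for c in definition["containerDefinitions"])


def exposed_ports(definition: dict) -> List[int]:
    """List the container ports published by the task."""
    return [
        mapping["containerPort"]
        for container in definition["containerDefinitions"]
        for mapping in container.get("portMappings", [])
    ]


print(json.dumps(task_definition, indent=2))
```

A task is one running copy of everything listed under `containerDefinitions`, and a service keeps the desired number of such tasks running.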