AWS Solutions Architect 8 - Application, Analytics, and Operations
Application Integration
Simple Notification Service (SNS)
Publisher/subscriber-based service
- SNS coordinates and manages the sending and delivery of messages
- messages sent to a topic are distributed to its subscribers
- SNS integrates with many AWS services
- using SNS, CloudWatch can notify admins of important alerts
Components
Topic: isolated configuration for SNS, including permissions
- messages (<= 256 KB) are sent to the topic
- subscribers to that topic receive messages
Subscriber: endpoints that receive messages for a topic
- HTTP(S)
- email and email-json
- SQS (messages can be fanned out to one or more queues)
- mobile phone push notifications
- Lambda functions
- SMS (cell phone message)
Publisher: entity that publishes/sends messages to topics
- application
- AWS services, including S3 (S3 events), CloudWatch, CloudFormation…
graph TD subgraph private VPC G[CloudFormation] H[EC2] I[ASG] end subgraph public Internet J((API HTTP/S)) end subgraph SNS A(Publisher) B[Topic] C[Subscriber] end D(mail) E[SQS Queue] F(Lambda) K(cell phone message) A --> B; B -- send message --> C; C --> J; J --> B; G --> B; H --> B; I --> B; C --> D; C --> E; C --> F; C --> K;
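A minimal CLI sketch of the publish/subscribe flow above (topic name, account ID, e-mail address, and region are placeholder assumptions):
  # create a topic, subscribe an email endpoint, and publish a message
  aws sns create-topic --name demo-topic --region us-east-1
  aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:demo-topic \
    --protocol email --notification-endpoint admin@example.com
  aws sns publish --topic-arn arn:aws:sns:us-east-1:123456789012:demo-topic \
    --subject "test" --message "this is a test notification"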
Simple Queue Service (SQS)
SQS provides fully managed, highly available message queues for inter-process/server/service messaging
- used mainly to create decoupled architecture
- messages are added to a queue and retrieved via polling
Polling types
- short polling: available messages are returned ASAP. May return 0 messages; increases the number of API calls
- long polling: waits for messages up to a given WaitTimeSeconds (more efficient)
- more efficient: fewer empty API calls/responses
Queue types
- Standard (distributed, scalable to a nearly unlimited message volume)
- FIFO (messages are delivered exactly once and in order. Throughput limited to ~3000 messages per second with batching or ~300 without, by default) (see the CLI sketch below)
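A minimal CLI sketch of creating both queue types (queue names and attribute values are illustrative assumptions; note the mandatory .fifo suffix):
  # standard queue with long polling enabled queue-wide (20 s wait)
  aws sqs create-queue --queue-name demo-queue \
    --attributes ReceiveMessageWaitTimeSeconds=20
  # FIFO queue with content-based deduplication
  aws sqs create-queue --queue-name demo-queue.fifo \
    --attributes FifoQueue=true,ContentBasedDeduplication=true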
Elements
SQS messages
- (<= 256 KB); can link to data in S3 for larger payloads
- when a message is polled, it is hidden from the queue
- it can be deleted when processing is done
- if not deleted, after the VisibilityTimeout it returns to the queue
Queues can be configured with a maxReceiveCount, allowing messages that repeatedly fail processing to be moved to a dead-letter queue (see the sketch below)
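A sketch of wiring a dead-letter queue through a RedrivePolicy (queue URL, DLQ ARN, and the maxReceiveCount value are assumptions; the DLQ must already exist):
  aws sqs set-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
    --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:sqstest-dlq\",\"maxReceiveCount\":\"5\"}"}'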
Lambda functions can be invoked based on messages on a queue, offering better scaling and faster response than Auto Scaling groups for any messages that can be processed quickly
graph LR subgraph front-end autoscaling group A[Instances] end B[Queue] subgraph worker autoscaling group C[Instances] end D((Assets S3 bucket)) A -- front-end pool --> B; B -- worker pool --> C; A --> D; D --> C;
Commands on AWS CLI
- get configuration
  aws sqs get-queue-attributes \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
    --attribute-names All
- send message
  aws sqs send-message \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
    --message-body "this is a payload"
- short poll queue
  # remember messages are not deleted right away
  aws sqs receive-message \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
- long poll queue
  # if there are no messages, it will wait up to 10 seconds before returning
  aws sqs receive-message --wait-time-seconds 10 \
    --max-number-of-messages 10 \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
  # alternative
  aws sqs receive-message --region us-east-1 \
    --wait-time-seconds 10 \
    --max-number-of-messages 10 \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
- remove message
  # make a short poll, get the ReceiptHandle from the result
  aws sqs delete-message \
    --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
    --receipt-handle "aReceiptHandleGoesHere"
Elastic Transcoder
- Elastic Transcoder: AWS service that allows you to convert media files from an input format to one or more output formats. It’s delivered as a service, and you are billed a per-minute charge
Pipeline: a queue for jobs; it stores source and destination settings, notifications, security, and other high-level settings. Jobs are processed in the order they are added, as resources allow
- can send notifications as jobs progress through various states. These might notify an administrator or initiate further event-driven processing
Job: defines the input object and up to 30 output objects/formats. Jobs are added to a pipeline in the same region and use the buckets defined in the pipeline for input/output
Presets contain transcoding settings and can be applied to jobs to ensure output compatible with various devices, such as iPhones, tablets, or other form factors
graph TD subgraph pipeline A[Job-1] B[Job-2] end subgraph job C[input] D[output-1] E[output-2] end F[presets] G[SNS] A --> C; C --> D; C --> E; F --> E; B --> G;
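A hedged sketch of submitting a job to an existing pipeline (pipeline ID, S3 keys, and the preset ID are placeholder assumptions):
  aws elastictranscoder create-job \
    --pipeline-id 1111111111111-abcde1 \
    --input '{"Key":"input/video.mp4"}' \
    --outputs '[{"Key":"output/video-720p.mp4","PresetId":"1351620000001-000010"}]'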
Analytics
Athena
Amazon Athena: interactive query service that utilizes schema-on-read, allowing you to run ad-hoc SQL-like queries on data from a range of sources. Results are returned in seconds, and you are billed per query based on the amount of data scanned, plus the existing S3 storage costs
- serverless query engine capable of reading a wide range of data formats from S3. Uses a schema-on-read approach to allow SQL-like queries against non-relational data (e.g. map attributes)
- Athena can query many forms of structured, semi-structured, and unstructured data in S3
- Athena can be used to query various AWS logs, including VPC Flow Logs and ELB logs
- Tables are defined in a data catalog and are applied on read. Athena allows SQL queries against data stored on S3 through these schema-on-read tables
- No data is modified by Athena, and output can be sent to visualization tools
Used for ad-hoc queries with SQL syntax similar to a relational DB, while the data stays on S3
graph TD subgraph S3 A(XML) B(JSON) C(CSV/TSV) D(AVRO) E(ORC) F(PARQUET) G(Source data) end subgraph tables H[schema] I[table] J[schema] end K(Amazon Athena) A --> G; B --> G; C --> G; D --> G; E --> G; F --> G; G --> H; H --> I; I --> J; J --> K;
Configuration example
- DDL code
  CREATE EXTERNAL TABLE IF NOT EXISTS default.vpc_flow_logs (
    version int,
    account string,
    interfaceid string,
    sourceaddress string,
    destinationaddress string,
    sourceport int,
    destinationport int,
    protocol int,
    numpackets int,
    numbytes bigint,
    starttime int,
    endtime int,
    action string,
    logstatus string
  )
  PARTITIONED BY (dt string)
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '
  LOCATION 's3://{your_log_bucket}/AWSLogs/{account_id}/vpcflowlogs/us-east-1/'
  TBLPROPERTIES ("skip.header.line.count"="1");
- Query example
  SELECT day_of_week(from_iso8601_timestamp(dt)) AS day,
    dt,
    interfaceid,
    sourceaddress,
    destinationport,
    action,
    protocol
  FROM vpc_flow_logs
  WHERE action = 'REJECT' AND protocol = 6
  ORDER BY sourceaddress
  LIMIT 100;
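A sketch of submitting the same kind of query through the CLI (database name and results bucket are assumptions; Athena writes the result set to the given S3 location):
  aws athena start-query-execution \
    --query-string "SELECT sourceaddress, action FROM vpc_flow_logs WHERE action = 'REJECT' LIMIT 10" \
    --query-execution-context Database=default \
    --result-configuration OutputLocation=s3://my-athena-results-bucket/
  # fetch the results once the execution finishes
  aws athena get-query-results --query-execution-id <id-returned-above>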
EMR
Amazon Elastic MapReduce: tool for large-scale parallel processing of big data and other data workloads.
- based on the Apache Hadoop framework and delivered as a managed cluster using EC2 instances
- EMR is used for huge-scale log analysis, indexing, machine learning, and other large-scale applications
Structure
- Master node: manages the cluster
- manages HDFS naming, distributes workloads, monitors health
- log into it via SSH
- if the master node fails, the cluster fails
- EMR clusters: have 0 or more core nodes, which are managed by the master node
- run tasks, manage data for HDFS
- if they fail, it can cause cluster instability
- Task nodes (optional)
- execute tasks, but have no involvement with important cluster functions -> can be spot instances
- if a task node fails, a core node starts the task on another task/core node
- data can be input/output from S3; intermediate data can be stored using HDFS in the cluster or EMRFS backed by S3
Used for workloads that scale to virtually unlimited load levels while processing petabytes' worth of data
graph LR A((S3)) subgraph EMR B[Master node] C(Core node) D(Task node) end A --> B; B --> A; B --> C; B --> D; C --> B; C --> D; D --> B; D --> C;
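A minimal sketch of launching a managed cluster (cluster name, release label, instance type/count, and key pair are assumptions; --use-default-roles assumes the default EMR roles already exist):
  aws emr create-cluster --name "demo-cluster" \
    --release-label emr-5.36.0 \
    --applications Name=Hadoop Name=Spark \
    --instance-type m5.xlarge --instance-count 3 \
    --ec2-attributes KeyName=my-key \
    --use-default-roles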
Kinesis and Firehose
Kinesis: scalable and resilient streaming service for AWS. It is designed to ingest large amounts of data from hundreds, thousands, or even millions of producers. Used for analytics
- consumers can access a rolling window of that data, or it can be stored in persistent storage or database products
- processes real-time data and can store it
Use Kinesis instead of SQS when:
you need many producers and many consumers
a rolling window of data is required
consumers must independently read the same data window (a queue does not allow that)
graph LR A[Producers] B[Kinesis essentials] C[Kinesis Data Firehose] subgraph consumers D[instances] E[lambda] end A --> B; B --> C; B --> D; B --> E;
Kinesis essentials
- Kinesis Stream
- can be used to collect, process, and analyze a large amount of incoming data
- public service accessible from inside VPCs or from the Internet by an unlimited number of producers
- include storage for all incoming data with a 24 hour default window, which can be increased to 7 days for an additional charge
- data records are added by producers and read by consumers
- Kinesis Shard
- added to streams to allow them to scale
- a stream starts with at least 1 shard, which allows 1 MiB/s of ingestion and 2 MiB/s of consumption
- shards can be added or removed from streams
- Kinesis Data Record
- basic entity written to and read from Kinesis streams; a data record can be up to 1 MB in size (see the CLI sketch below)
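A sketch of the producer/consumer flow (stream name, partition key, and shard id are assumptions; AWS CLI v2 expects --data to be base64 encoded):
  aws kinesis create-stream --stream-name demo-stream --shard-count 1
  # producer side: write a record (base64 of "example payload")
  aws kinesis put-record --stream-name demo-stream \
    --partition-key user-42 --data "ZXhhbXBsZSBwYXlsb2Fk"
  # consumer side: get an iterator for the shard, then read records from it
  aws kinesis get-shard-iterator --stream-name demo-stream \
    --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON
  aws kinesis get-records --shard-iterator <iterator-from-previous-call>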
Kinesis data firehose
- accepts records from a data source and persists them directly into S3, Redshift…
- records can be transformed before being stored, so you avoid managing extra hardware
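A sketch of pushing a record into an existing delivery stream (the delivery stream name is an assumption; AWS CLI v2 expects the Data blob base64 encoded):
  aws firehose put-record --delivery-stream-name demo-delivery-stream \
    --record '{"Data":"ZXhhbXBsZSBwYXlsb2Fk"}'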
Redshift
Redshift: petabyte-scale data warehousing solution (target data store for long-term, strategic analysis)
column-based database designed for analytical workloads
used for OLAP (column-based storage, optimized for retrieval and analytics)
multiple databases become source data to be ingested into a data warehouse solution such as Redshift
Load and unload
- from/to S3
- backups can be performed to S3
- various AWS services, such as Kinesis, can load data into Redshift
graph LR subgraph redshift A(Leader node) B[Compute node -slices] C[Compute node -slices] end D[S3] A --> B; A --> C; A -- data load --> D; D -- data unload --> A;
Components
- Leader node: receives the query, plans it, and distributes work to the compute nodes
- Compute node: executes its portion of the distributed query
- Slices: partitions of a compute node where the query portions are actually executed
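A sketch of provisioning a small cluster (identifier, node type, credentials, and database name are assumptions); loading and unloading then happen with SQL COPY/UNLOAD statements run against the cluster:
  aws redshift create-cluster --cluster-identifier demo-dwh \
    --node-type dc2.large --cluster-type multi-node --number-of-nodes 2 \
    --master-username awsuser --master-user-password 'ChangeMe1234' \
    --db-name analytics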
Logging and Monitoring
CloudWatch
- CloudWatch: service that provides near real-time monitoring of AWS products (metrics repository)
- you can import custom metric data in near real-time from some AWS services and on-premises platforms
- data retention is based on granularity
- 1h metrics -> 455 days
- 5m metrics -> 63 days
- 1m metrics -> 15 days
- metrics can be configured with alarms that can take actions
- data can be presented as a dashboard
graph LR A[AWS] B[custom on premises] subgraph CloudWatch C(metrics) D((alarm)) E((dashboard)) end F[SNS] G[AutoScaling] H[Client] A --> C; B --> C; C --> D; C --> E; D --> F; D --> G; E --> H;
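A sketch of publishing a custom metric and alarming on it (namespace, metric name, threshold, and SNS topic ARN are assumptions):
  aws cloudwatch put-metric-data --namespace "Custom/App" \
    --metric-name QueueDepth --value 42 --unit Count
  aws cloudwatch put-metric-alarm --alarm-name high-queue-depth \
    --namespace "Custom/App" --metric-name QueueDepth \
    --statistic Average --period 300 --evaluation-periods 2 \
    --threshold 100 --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:demo-topic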
Logs
CloudWatch Logs: provides functionality to store, monitor, and access logs from multiple systems. Answers: what happened?
metric filter: pattern-matches text in all log events in all log streams of the group it is created in, creating a metric
- can be used to analyze logs and create metrics (e.g. SSH failed logins)
log group: container for log streams. Controls retention, monitoring and access control
log stream: sequence of log events with the same source
graph LR subgraph log group A[log stream] B[log stream] C[log stream] end D(log events) E(metric filter) F[Alarm] D --> C; C --> E; E --> F;
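A sketch of the SSH failed-login metric filter mentioned above (log group name, filter name, and metric namespace are assumptions):
  aws logs put-metric-filter --log-group-name /var/log/secure \
    --filter-name ssh-failed-logins \
    --filter-pattern '"Failed password"' \
    --metric-transformations metricName=SSHFailedLogins,metricNamespace=SSHMonitor,metricValue=1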
CloudWatch Agent installation on an EC2 machine
## download
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
## install
sudo rpm -U ./amazon-cloudwatch-agent.rpm
## execute the CloudWatch agent wizard:
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
## config example, it will generate a json file
# On which OS are you planning to use the agent?
# 1. linux
# Are you using EC2 or On-Premises hosts?
# 1. EC2
# Which user are you planning to run the agent?
# 1. root
# Do you want to turn on StatsD daemon?
# 2. no
# Do you want to monitor metrics from CollectD?
# 2. no
# Do you want to monitor any host metrics? e.g. CPU, memory, etc.
# 1. yes
# Do you want to monitor cpu metrics per core? Additional CloudWatch
# charges may apply.
# 2. no
# Do you want to add ec2 dimensions (ImageID, InstanceID, InstanceType,
# AutoScalingGroupName) into all of your metrics if the info is available?
# 2. no
# Would you like to collect your metrics at high resolution (sub-minute
# resolution)? This enables sub-minute resolution for all metrics,
# but you can customize for specific metrics in the output json file.
# 2. 10s
# Which default metrics config do you want?
# 1. Basic
## start it
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config -m ec2 \
-c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
Monitor the custom agent on the AWS console
- Navigate to CloudWatch -> Metrics in the left-hand menu
- In the Custom Namespaces section, click CWAgent. If you don’t see it yet, wait a few more minutes.
- In the Metrics section, click host to view the metrics.
CloudTrail
- CloudTrail: governance, compliance, risk management, and auditing service (Who did it?) that records account activity within an AWS account
- any action taken by users, roles or AWS services are recorded here
- trails (regional) can be created, giving more control over logging and allowing events to be stored in S3 and CloudWatch Logs
- Components
- CloudTrail events: activity recorded (90 days by default: called “event history”)
- management events: events that log control plane operations (e.g. logins, configuring security)
- data events (e.g. object-level events in S3)
- events -> cloudtrail -> S3 and CloudWatch logs
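A sketch of creating a multi-region trail delivering to S3 (trail and bucket names are assumptions; the bucket needs a policy allowing CloudTrail to write to it):
  aws cloudtrail create-trail --name demo-trail \
    --s3-bucket-name my-cloudtrail-bucket --is-multi-region-trail
  aws cloudtrail start-logging --name demo-trail
  # query the 90-day event history
  aws cloudtrail lookup-events --max-results 5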
VPC Flow logs
- VPC Flow Logs: capture metadata about the traffic flowing in and out of network interfaces within a VPC (account-id, protocol, bytes, log status…) -> they do not sniff traffic contents
- can be placed on a specific network interface within a VPC
- will capture only metadata about the traffic, not its contents
- you may define filters (in doubt, select all)
- attached to:
VPC
subnet
network interface
graph LR A[CloudWatch logs] B[S3] subgraph VPC C((Flow logs)) end C --> A; C --> B;
Flow logs don't capture some traffic: traffic to the Amazon DNS server, Windows license activation, DHCP traffic, and traffic to the VPC router
- Example
- Flow logs -> Create flow log
- Verify it has active status, and copy the destination field hyperlink
- check the policies
- Go to CloudWatch, and create a log group
- Generate filter, test pattern
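The same setup as a CLI sketch (VPC ID, log group name, and IAM role ARN are assumptions; the role must allow delivery to CloudWatch Logs):
  aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-0abc1234 \
    --traffic-type ALL \
    --log-group-name demo-vpc-flow-logs \
    --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role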
Operations
CloudWatch events
- CloudWatch events: near real time visibility of changes that happen within an AWS account
- using rules you can match against certain events within an account and deliver those events to a number of supported targets
- AWS services are either natively supported as event sources and deliver events directly, or allow event pattern matching against CloudTrail events
- additional rules support scheduled events as sources, allowing a cron-style function for periodically passing events to targets (see the sketch below)
- event target examples: EC2, Lambda, Step Functions state machines, SNS topics, SQS queues
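A sketch of a scheduled rule invoking a Lambda target (rule name, schedule, and function ARN are assumptions; the function also needs a resource policy allowing events to invoke it):
  aws events put-rule --name five-minute-tick \
    --schedule-expression "rate(5 minutes)"
  aws events put-targets --rule five-minute-tick \
    --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:demo-fn"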
KMS
AWS Key Management Service (KMS): provides regional secure key management and encryption and decryption services
- FIPS 140-2 Level 2 validated service -> certain aspects support Level 3 (CloudHSM via custom key stores)
KMS Customer Master Keys (CMKs): created in a region and never leave the region or KMS
- they can encrypt data up to 4 KB
- have key policies and can be used to create other keys
CMK type         | Can view | Can manage | Dedicated to my account
Customer managed | Yes      | Yes        | Yes
AWS managed CMK  | Yes      | No         | Yes
AWS owned CMK    | No       | No         | No

Encrypt/decrypt

operation  | data amount         | you supply                   | it returns
encrypt    | up to 4 KB with CMK | data, specify the key to use | ciphertext
decrypt    | up to 4 KB          | ciphertext                   | plain text
re-encrypt | up to 4 KB with CMK | ciphertext, new key          | new ciphertext

Data Encryption Key (DEK): generated using a CMK, can be used to encrypt/decrypt data of any size
- encrypted DEK and encrypted data can be stored together
- KMS is used to decrypt the DEK, which can then decrypt the data
Commands
aws kms create-key --description "KMS DEMO CMK"
aws kms create-alias --target-key-id XXX \
--alias-name "alias/kmsdemo" --region us-east-1
echo "this is a secret message" > topsecret.txt
aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
--output text --query CiphertextBlob
aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
--output text --query CiphertextBlob | base64 --decode > topsecret.encrypted
aws kms decrypt --ciphertext-blob fileb://topsecret.encrypted \
--output text --query Plaintext | base64 --decode
# for re-encrypt: Plaintext is for local use,
# CiphertextBlob is for other systems -> envelope encryption
aws kms generate-data-key --key-id KEYID --key-spec AES_256 --region us-east-1
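A hedged envelope-encryption sketch continuing from generate-data-key (assumes jq and openssl are available; file names are illustrative): the plaintext data key encrypts the data locally and is then discarded, while its encrypted copy is stored next to the data.
  aws kms generate-data-key --key-id KEYID --key-spec AES_256 > datakey.json
  # plaintext key: use locally, never store it
  jq -r .Plaintext datakey.json | base64 --decode > datakey.bin
  # encrypted key: safe to store alongside the data
  jq -r .CiphertextBlob datakey.json | base64 --decode > datakey.encrypted
  # encrypt a large file locally with the plaintext key, then delete the key
  openssl enc -aes-256-cbc -in bigfile.dat -out bigfile.dat.enc -pass file:./datakey.bin
  rm datakey.bin datakey.json
  # later: ask KMS to decrypt the stored data key, then decrypt the file
  aws kms decrypt --ciphertext-blob fileb://datakey.encrypted \
    --output text --query Plaintext | base64 --decode > datakey.bin
  openssl enc -d -aes-256-cbc -in bigfile.dat.enc -out bigfile.dat -pass file:./datakey.bin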
Deployment
Elastic Beanstalk
Elastic Beanstalk (EB): Platform as a Service. You deploy code and, with very few modifications, the service will provision the infrastructure on your behalf
- handles provisioning, monitoring, autoscaling, load balancing and software updating
- supports many languages and platforms (Java, .NET, Node.js, PHP, Ruby, Python, Go, Docker, Apache, IIS, Nginx, Tomcat)
Elastic Beanstalk application architecture (the application is actually a Beanstalk container, which stores components as environments)
EB environment - worker tier
EB environment - web server
Swap URL
graph LR A(data) subgraph EBE B[ELB] C[Autoscaling] D[instance] end E(database) A --> B; B --> C; C --> D; D --> E; E --> D;
Patterns and anti-patterns
- YES: provision environment for an application with little admin overhead
- YES: if you use 1 of the supported languages and can add EB-specific config
- NO: if you want low-level infrastructure control
- NO: if you need chef support
Deployment options:
- All at once: an updated application version is deployed to all instances. Quick and simple, but not recommended for production deployments
- Rolling: splits instances into batches and deploys 1 batch at a time
- Rolling with additional batch: as above, but provisions a new batch, deploying and testing before removing the old batch (immutable)
- Blue/green: maintain 2 environments, deploy to one, and swap the CNAME (see the CLI sketch below)
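A sketch using the EB CLI plus one AWS CLI call for the CNAME swap (application/environment names and the platform string are assumptions; run from the project directory):
  eb init demo-app --platform python-3.8 --region us-east-1
  eb create demo-env      # provisions the ELB, Auto Scaling group, and instances
  eb deploy               # pushes the current application version
  # blue/green: swap the CNAMEs of the two environments
  aws elasticbeanstalk swap-environment-cnames \
    --source-environment-name demo-env --destination-environment-name demo-env-green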
OpsWorks
OpsWorks: implementation of the Chef configuration management and deployment platform
- moves away from the low-level configurability of CloudFormation, but not as far as Elastic Beanstalk
- lets you create a stack of resources with layers and manage resources as a unit
Stack components
- Stacks
- a unit of managed infrastructure
- can be used per application or per platform
- could use stacks for development, staging or production environments
- Layers
- comparable to application tiers within a stack
- e.g. database layer, application layer, proxy layer
- recipes are generally associated with layers and configure what to install on instances in that layer
- Instances
- instances are EC2 instances associated with a layer
- configured as 24/7, load based, or time based
- Apps
- apps are deployed to layers from a source code repo or S3
- actual deployment happens using recipes on a layer
- other recipes are run when deployments happen, potentially to reconfigure other instances
- Recipes
- setup: executed on an instance when first provisioned
- configure: executed on all instances when instances are added or removed
- deploy and undeploy: executed when apps are added or removed
- shutdown: executed when instance is shut down, but before stopped
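A rough CLI sketch of driving a stack (all names, ARNs, and IDs are placeholder assumptions; the service role and instance profile must already exist):
  aws opsworks create-stack --name demo-stack --stack-region us-east-1 \
    --service-role-arn arn:aws:iam::123456789012:role/aws-opsworks-service-role \
    --default-instance-profile-arn arn:aws:iam::123456789012:instance-profile/aws-opsworks-ec2-role
  # deploy an app to an existing stack (runs the layer's deploy recipes)
  aws opsworks create-deployment --stack-id <stack-id> --app-id <app-id> \
    --command '{"Name":"deploy"}'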