AWS Solutions Architect 8 - Application, Analytics, and Operations

Application Integration

Simple Notification Service (SNS)

  • Publisher/subscriber-based service

    • SNS coordinates and manages the sending and delivery of messages
    • messages sent to a topic (base) are distributed to subscribers
    • SNS integrates with many AWS services
    • using SNS, CloudWatch can notify admins of important alerts
  • Components

    • Topic: isolated configuration for SNS, including permissions

      • messages (<= 256 KB) are sent to a topic
      • subscribers to that topic receive messages
    • Subscriber: endpoints that receive messages for a topic

      • HTTP(S)
      • email and email-json
      • SQS (messages are added to one or more queues)
      • mobile phone push notifications
      • Lambda functions
      • SMS (cell phone message)
    • Publisher: entity that publishes/sends messages to topics

      • application
      • AWS services, including S3 (S3 events), CloudWatch, CloudFormation…
          graph TD
      
      subgraph private VPC
        G[CloudFormation]
        H[EC2]
        I[ASG]
      end
      
      subgraph public Internet
        J((API HTTP/S))
      end
      
      subgraph SNS
        A(Publisher)
        B[Topic]
        C[Subscriber]
      end
      
      D(mail)
      E[SQS Queue]
      F(lambda)
      K[cell phone message]
      
      A --> B;
      B -- send message --> C;
      C --> J;
      J --> B;
      G --> B;
      H --> B;
      I --> B;
      
      C --> D;
      C --> E;
      C --> F;
      C --> K;
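  • CLI example: a minimal sketch of the topic/subscriber flow above; the topic name, account ID, and email address are placeholders

      # create a topic (returns the topic ARN)
      aws sns create-topic --name demo-topic --region us-east-1
      # subscribe an email endpoint (the recipient must confirm the subscription)
      aws sns subscribe \
      --topic-arn arn:aws:sns:us-east-1:123456789012:demo-topic \
      --protocol email \
      --notification-endpoint admin@example.com
      # publish a message (<= 256 KB); SNS fans it out to every subscriber
      aws sns publish \
      --topic-arn arn:aws:sns:us-east-1:123456789012:demo-topic \
      --subject "test" \
      --message "this is a payload"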

Simple Queue Service (SQS)

  • SQS provides fully managed, highly available message queues for inter-process/server/service messaging

    • used mainly to create decoupled architecture
    • messages are added to a queue and retrieved via polling
  • Polling types

    • short polling: available messages are returned ASAP. May return 0 messages, increasing the number of API calls
    • long polling: waits for messages up to a given WaitTimeSeconds
    • long polling is more efficient: fewer empty API calls/responses
  • Queue types

    • Standard (distributed, scalable to nearly unlimited message volume)
    • FIFO (messages are delivered once only. Throughput limited to ~3,000 messages per second with batching or ~300 without, by default)

  • Elements

    • SQS messages

      • messages (<= 256 KB); data can be linked from S3 for larger payloads
      • When a message is polled, it is hidden in the queue
        • it can be deleted when processing is done
        • if not deleted, after the VisibilityTimeout it may return to the queue
    • Queues can be configured with maxReceiveCount, allowing messages that repeatedly fail processing to be moved to a dead-letter queue (see the redrive-policy sketch at the end of the CLI commands below)

    • Lambda functions can be invoked based on messages on a queue, offering better scaling and faster response than Auto Scaling groups for any messages that can be processed quickly

          graph LR
      
        subgraph front-end autoscaling group
        A[Instances]
      end
      
      B[Queue]
      
      subgraph worker autoscaling group
        C[Instances]
      end
      
      D((Assets S3 bucket))
      
      A -- front-end pool--> B;
      B -- worker pool --> C;
      
      A --> D;
      D --> C;
  • Commands on AWS CLI

    • get configuration
      aws sqs get-queue-attributes \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --attribute-names All
    • send message
      aws sqs send-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --message-body "this is a payload"
    • short poll queue
      # remember messages are not deleted right away
      aws sqs receive-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
    • long poll queue
      # if there are no messages, it waits up to 10 seconds before returning
      aws sqs receive-message --wait-time-seconds 10 \
      --max-number-of-messages 10 \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
      # alternative, with an explicit region
      aws sqs receive-message --region us-east-1 \
      --wait-time-seconds 10 \
      --max-number-of-messages 10 \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
    • remove message
      # make a short poll, get the ReceiptHandle from the result
      aws sqs delete-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --receipt-handle "aReceiptHandleGoesHere"
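    • configure a dead-letter queue (a hedged sketch, as referenced above; the queue names and ARN are placeholders)

      # create the dead-letter queue and look up its ARN
      aws sqs create-queue --queue-name sqstest-dlq
      aws sqs get-queue-attributes \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest-dlq \
      --attribute-names QueueArn
      # attach a redrive policy: after 5 failed receives a message moves to the DLQ
      aws sqs set-queue-attributes \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --attributes '{"RedrivePolicy":"{\"deadLetterTargetArn\":\"arn:aws:sqs:us-east-1:123456789012:sqstest-dlq\",\"maxReceiveCount\":\"5\"}"}'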

Elastic Transcoder

  • Elastic Transcoder: AWS service that allows you to convert media files from an input format to one or more output formats. It’s delivered as a service, and you are billed a per-minute charge
    • Pipeline: a queue for jobs; it stores source and destination settings, notifications, security, and other high-level settings. Jobs are processed in the order they are added, as resources allow

      • can send notifications as jobs progress through various states. These might notify an administrator or initiate further event-driven processing
    • Job: defines the input object and up to 30 output objects/formats. Jobs are added to a pipeline in the same region and use buckets defined in the pipeline for input/output

    • Presets contain transcoding settings and can be applied to jobs to ensure output compatible with various devices, such as iPhones, tablets, or other form factors

          graph TD
      
      subgraph pipeline
        A[Job-1]
        B[Job-2]
      end
      
      subgraph job
        C[input]
        D[output-1]
        E[output-2]
      end
      
      F[presets]
      G[SNS]
      
      A --> C;
      C --> D;
      C --> E;
      F --> E;
      B --> G;
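    • CLI example: a hedged sketch of submitting a job to an existing pipeline; the pipeline ID, preset ID, and object keys are placeholders

      # create a transcoding job; input/output keys are relative to the buckets
      # configured on the pipeline
      aws elastictranscoder create-job \
      --pipeline-id 1111111111111-abcde1 \
      --input Key=inputs/source-video.mp4 \
      --outputs Key=outputs/video-720p.mp4,PresetId=1351620000001-000010
      # check job progress (Submitted / Progressing / Complete / Error)
      aws elastictranscoder read-job --id 2222222222222-abcde2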

Analytics

Athena

  • Amazon Athena: interactive query service that utilizes schema on read, allowing you to run ad-hoc SQL-like queries on data from a range of sources. Results are returned in seconds, and you are billed only for the compute time used and any existing storage costs

    • Serverless query engine capable of reading a wide range of data formats from S3. Uses a schema-on-read approach to allow SQL-like queries against non-relational data (map attributes)
    • Athena can query many forms of structured, semi-structured, and unstructured data in S3
    • Athena can be used to query various AWS logs, including VPC Flow Logs and ELB logs
    • Tables are defined in a data catalog and are applied on read. Athena allows SQL queries against data stored on S3, through schema-on-read tables
    • No data is modified by Athena, and output can be sent to visualization tools
  • Used for ad-hoc, interactive queries with SQL syntax similar to a relational DB

        graph TD
    
      subgraph S3
        A(XML)
        B(JSON)
        C(CSV/TSV)
        D(AVRO)
        E(ORC)
        F(PARQUET)
        G(Source data)
      end
    
      subgraph tables
        H[schema]
        I[table]
        J[schema]
      end
    
      K(Amazon Athena)
    
      A --> G;
      B --> G;
      C --> G;
      D --> G;
      E --> G;
      F --> G;
    
      G --> H;
      H --> I;
      I --> J;
      J --> K;
  • Configuration example

    • DDL code
      CREATE EXTERNAL TABLE IF NOT EXISTS default.vpc_flow_logs (
      version int,
      account string,
      interfaceid string,
      sourceaddress string,
      destinationaddress string,
      sourceport int,
      destinationport int,
      protocol int,
      numpackets int,
      numbytes bigint,
      starttime int,
      endtime int,
      action string,
      logstatus string
      )
      PARTITIONED BY (dt string)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ' '
      LOCATION 's3://{your_log_bucket}/AWSLogs/{account_id}/vpcflowlogs/us-east-1/'
      TBLPROPERTIES ("skip.header.line.count"="1");
    • Query example
      SELECT day_of_week(from_iso8601_timestamp(dt)) AS day,
      dt,
      interfaceid,
      sourceaddress,
      destinationport,
      action,
      protocol
      FROM vpc_flow_logs
      WHERE action = 'REJECT' AND protocol = 6
      ORDER BY sourceaddress
      LIMIT 100;
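    • Run from the CLI (a hedged sketch; the results bucket is a placeholder)

      # start the query asynchronously (returns a QueryExecutionId)
      aws athena start-query-execution \
      --query-string "SELECT dt, sourceaddress, action FROM vpc_flow_logs WHERE action = 'REJECT' LIMIT 10;" \
      --query-execution-context Database=default \
      --result-configuration OutputLocation=s3://your-query-results/
      # poll for status, then fetch the rows once the state is SUCCEEDED
      aws athena get-query-execution --query-execution-id QUERY_EXECUTION_ID
      aws athena get-query-results --query-execution-id QUERY_EXECUTION_ID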

EMR

  • Amazon Elastic MapReduce (EMR): tool for large-scale parallel processing of big data and other large data workloads.

    • Based on the Apache Hadoop framework and delivered as a managed cluster using EC2 instances
    • EMR is used for huge-scale log analysis, indexing, machine learning, and other large-scale applications
  • Structure

    • Master node: manages the cluster
      • manages HDFS naming, distributes workloads, monitors health
      • log into it via SSH
      • if the master node fails, the cluster fails
    • EMR clusters: have 0 or more core nodes, which are managed by the master node
      • run tasks, manage data for HDFS
      • if they fail, it can cause cluster instability
    • Task nodes (optional)
      • execute tasks, but have no involvement with important cluster functions -> can be spot instances
      • if a task node fails, a core node restarts the task on another task/core node
    • data can be input/output from S3, intermediate data can be stored using HDFS in the cluster or EMRFS using S3
  • Used as a data-processing repository, scaling to nearly unlimited load levels while storing petabytes' worth of data

        graph LR
    
      A((S3))
    
      subgraph EMR
        B[Master node]
        C(Core node)
        D(Task node)
      end
    
      A --> B;
      B --> A;
      B --> C;
      B --> D;
      C --> B;
      C --> D;
      D --> B;
      D --> C;
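  • CLI example: a hedged sketch of launching a small cluster with 1 master and 2 core nodes; the key pair, release label, and log bucket are placeholders

      aws emr create-cluster \
      --name "demo-cluster" \
      --release-label emr-6.10.0 \
      --applications Name=Hadoop Name=Spark \
      --instance-type m5.xlarge \
      --instance-count 3 \
      --ec2-attributes KeyName=my-key \
      --use-default-roles \
      --log-uri s3://my-emr-logs/
      # check cluster state (STARTING / WAITING / TERMINATED)
      aws emr describe-cluster --cluster-id j-XXXXXXXXXXXXX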

Kinesis and Firehose

  • Kinesis: scalable and resilient streaming service for AWS. It is designed to ingest large amounts of data from hundreds, thousands, or even millions of producers. Used for analytics

    • consumers can access a rolling window of that data, or it can be stored in persistent storage or database products
    • processes real-time data and can also store it
  • Use Kinesis instead of SQS when:

    • you need many producers and many consumers

    • rolling window of data

    • consumers can independently read the same data window (a queue does not allow that)

          graph LR
      A[Producers]
      B[Kinesis essentials]
      C[Kinesis Data Firehose]
      
      subgraph consumers
        D[instances]
        E[lambda]
      end
      
      A --> B;
      B --> C;
      B --> D;
      B --> E;
  • Kinesis essentials

    • Kinesis Stream
      • can be used to collect, process, and analyze large amounts of incoming data
      • public service accessible from inside VPCs or from the Internet by an unlimited number of producers
      • include storage for all incoming data with a 24 hour default window, which can be increased to 7 days for an additional charge
      • data records are added by producers and read by consumers
    • Kinesis Shard
      • added to streams to allow them to scale
      • a stream starts with at least 1 shard, which allows 1 MiB/s of ingestion and 2 MiB/s of consumption
      • shards can be added or removed from streams
    • Kinesis Data Record
      • basic entity written to and read from Kinesis streams, data record can be up to 1 MB in size
  • Kinesis Data Firehose

    • accepts records from a data source and persists them directly into S3, Redshift…
    • records can be transformed before being stored, avoiding the need for extra processing hardware
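  • CLI example: a minimal stream sketch; the stream name and payload are placeholders

      # create a stream with a single shard (1 MiB/s in, 2 MiB/s out)
      aws kinesis create-stream --stream-name demo-stream --shard-count 1
      # write a data record; the partition key determines the target shard
      aws kinesis put-record \
      --stream-name demo-stream \
      --partition-key user-42 \
      --data "$(echo -n 'hello kinesis' | base64)"
      # read: get a shard iterator, then fetch records from it
      aws kinesis get-shard-iterator \
      --stream-name demo-stream \
      --shard-id shardId-000000000000 \
      --shard-iterator-type TRIM_HORIZON
      aws kinesis get-records --shard-iterator "ITERATOR_FROM_PREVIOUS_CALL"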

Redshift

  • Redshift: petabyte-scale data warehousing solution (target data storage), used for long-term, strategic querying and reporting

    • column-based database designed for analytical workloads

    • used for OLAP (column based databases, for retrieval and analytics)

    • multiple databases become source data to be injected into a data warehouse solution such as Redshift

    • Load and unload

      • from/to S3
      • backups can be performed to S3
      • various AWS services such as Kinesis can inject data into Redshift
          graph LR
      
      subgraph redshift
        A(Leader node)
        B[Compute node -slices]
        C[Compute node -slices]
      end
      
      D[S3]
      
      A --> B;
      A --> C;
      A -- data unload --> D;
      D -- data load --> A;
  • Components

    • Leader node: receives the query and distributes work to the compute nodes
    • Compute node: runs its portion of the distributed query
    • Slices: partitions within a compute node where the query portions are actually executed
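  • CLI example: a hedged sketch of provisioning a small cluster; identifiers and the password are placeholders (data load/unload then happens with SQL COPY/UNLOAD statements run against the cluster)

      # provision a 2-node dc2.large cluster (the leader node is added automatically)
      aws redshift create-cluster \
      --cluster-identifier demo-warehouse \
      --node-type dc2.large \
      --number-of-nodes 2 \
      --master-username awsadmin \
      --master-user-password 'ChangeMe123!' \
      --db-name analytics
      # check availability before connecting a SQL client
      aws redshift describe-clusters --cluster-identifier demo-warehouse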

Logging and Monitoring

CloudWatch

  • CloudWatch: service that provides near real-time monitoring of AWS products (metrics repository)
    • you can import custom metric data in real time from some AWS services and on-premises platforms
    • data retention is based on granularity
      • 1h metrics -> 455 days
      • 5m metrics -> 63 days
      • 1m metrics -> 15 days
    • metrics can be configured with alarms that can take actions
    • data can be presented as a dashboard
          graph LR
      
      A[AWS]
      B[custom on premises]
      
      subgraph CloudWatch
        C(metrics)
        D((alarm))
        E((dashboard))
      end
      
      F[SNS]
      G[AutoScaling]
      H[Client]
      
      A --> C;
      B --> C;
      C --> D;
      C --> E;
      D --> F;
      D --> G;
      E --> H;
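  • CLI example: a hedged sketch of publishing a custom metric and alarming on it; the namespace, metric, and SNS topic ARN are placeholders

      # publish a custom metric data point (e.g. from a script or an on-premises host)
      aws cloudwatch put-metric-data \
      --namespace "Custom/App" \
      --metric-name QueueDepth \
      --value 42
      # alarm when the metric averages above 100 for two 5-minute periods,
      # notifying an SNS topic (which could page an admin or trigger scaling)
      aws cloudwatch put-metric-alarm \
      --alarm-name queue-depth-high \
      --namespace "Custom/App" \
      --metric-name QueueDepth \
      --statistic Average \
      --period 300 \
      --evaluation-periods 2 \
      --threshold 100 \
      --comparison-operator GreaterThanThreshold \
      --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts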

Logs

  • CloudWatch Logs: provides functionality to store, monitor, and access logs from multiple systems. What happened?

    • metric filters: a pattern matched against the text of all log events in all log streams of whichever group it is created in, creating a metric

      • can be used to analyze logs and create metrics (e.g. SSH failed logins); see the CLI sketch at the end of this section
    • log group: container for log streams. Controls retention, monitoring and access control

    • log stream: sequence of log events with the same source

          graph LR
      
      subgraph log group
        A[log stream]
        B[log stream]
        C[log stream]
      end
      
      D(log events)
      E(metric filter)
      F[Alarm]
      
      D --> C;
      C --> E;
      E --> F;
  • CloudWatch Agent installation on an EC2 machine

    ## download
    wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
    ## install
    sudo rpm -U ./amazon-cloudwatch-agent.rpm
    ## execute the CloudWatch agent wizard:
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

    ## config example, it will generate a json file
    # On which OS are you planning to use the agent?
    # 1. linux
    # Are you using EC2 or On-Premises hosts?
    # 1. EC2
    # Which user are you planning to run the agent?
    # 1. root
    # Do you want to turn on StatsD daemon?
    # 2. no
    # Do you want to monitor metrics from CollectD?
    # 2. no
    # Do you want to monitor any host metrics? e.g. CPU, memory, etc.
    # 1. yes
    # Do you want to monitor cpu metrics per core? Additional CloudWatch
    # charges may apply.
    # 2. no
    # Do you want to add ec2 dimensions (ImageID, InstanceID, InstanceType,
    # AutoScalingGroupName) into all of your metrics if the info is available?
    # 2. no
    # Would you like to collect your metrics at high resolution (sub-minute
    # resolution)? This enables sub-minute resolution for all metrics,
    # but you can customize for specific metrics in the output json file.
    # 2. 10s
    # Which default metrics config do you want?
    # 1. Basic

    ## start it
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
    -a fetch-config -m ec2 \
    -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
  • Monitor a custom agent on the AWS console

    1. Navigate to CloudWatch -> Metrics in the left-hand menu
    2. In the Custom Namespaces section, click CWAgent. If you don’t see it yet, wait a few more minutes.
    3. In the Metrics section, click host to view the metrics.
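  • Metric filter CLI sketch (referenced above); the log group name and pattern are placeholders

      # create a metric filter on an existing log group; every matching log event
      # increments the custom metric by 1
      aws logs put-metric-filter \
      --log-group-name /var/log/secure \
      --filter-name ssh-failed-logins \
      --filter-pattern '"Failed password"' \
      --metric-transformations metricName=SSHFailedLogins,metricNamespace=Custom/Security,metricValue=1
      # an alarm can then be placed on the resulting metric, as in the CloudWatch section above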

CloudTrail

  • CloudTrail: governance, compliance, risk management, and auditing service (Who did it?) that records account activity within an AWS account
    • any action taken by users, roles, or AWS services is recorded here
    • trails (which are regional) can be created, giving more control over logging and allowing events to be stored in S3 and CloudWatch Logs
  • Components
    • CloudTrail events: activity recorded (90 days by default: called “event history”)
      • management events: events that log control plane activity (e.g. logins, configuring security)
      • data events (e.g. object-level events in S3)
  • events -> CloudTrail -> S3 and CloudWatch Logs
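  • CLI example: a hedged sketch of creating a trail and querying recent activity; names are placeholders and the S3 bucket needs a CloudTrail bucket policy

      # create a trail that delivers events to an existing S3 bucket, then start logging
      aws cloudtrail create-trail --name demo-trail --s3-bucket-name my-cloudtrail-bucket
      aws cloudtrail start-logging --name demo-trail
      # query the 90-day event history (who did it?)
      aws cloudtrail lookup-events \
      --lookup-attributes AttributeKey=EventName,AttributeValue=ConsoleLogin \
      --max-results 5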

VPC Flow logs

  • VPC flow logs: capture metadata about the traffic flowing in and out of network interfaces within a VPC (account-id, protocol, bytes, log status…) -> they do not sniff traffic content
    • can be placed on a specific network interface within a VPC
    • capture only metadata about the traffic, never the payload itself
    • you may define filters (if in doubt, select All)
  • attached to:
    • VPC

    • subnet

    • network interface

          graph LR
      
      A[CloudWatch logs]
      B[S3]
      
      subgraph VPC
        C((Flow logs))
      end
      
      C --> A;
      C --> B;

Flow logs don’t capture some traffic: traffic to/from the Amazon DNS server, Windows license activation, DHCP traffic, and traffic to the VPC router

  • Example
    1. Flow logs -> Create flow log
    2. Verify it has active status, and copy the destination field hyperlink
    3. check the policies
    4. Go to CloudWatch and create a log group
    5. Generate filter, test pattern
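  • CLI equivalent of the console steps above (a hedged sketch; the VPC ID, log group, and IAM role are placeholders)

      # capture metadata for all traffic in a VPC and deliver it to CloudWatch Logs;
      # the role must allow the flow logs service to write to the log group
      aws ec2 create-flow-logs \
      --resource-type VPC \
      --resource-ids vpc-0abc1234 \
      --traffic-type ALL \
      --log-group-name vpc-flow-logs \
      --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role
      # confirm the flow log is ACTIVE
      aws ec2 describe-flow-logs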

Operations

CloudWatch events

  • CloudWatch events: near real-time visibility of changes that happen within an AWS account
    • using rules you can match against certain events within an account and deliver those events to a number of supported targets
    • AWS services are either natively supported as event sources and deliver events directly, or allow event pattern matching against CloudTrail events
    • additional rules support scheduled events as sources, allowing a cron-style function for periodically passing events to targets
    • event target examples: EC2, Lambda, Step Functions state machines, SNS topics, SQS queues
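  • CLI example: a hedged sketch of a scheduled rule invoking a Lambda target every 5 minutes; the rule name and function ARN are placeholders, and the function needs a resource policy allowing events.amazonaws.com to invoke it

      # create a scheduled (cron-style) rule
      aws events put-rule \
      --name every-5-minutes \
      --schedule-expression "rate(5 minutes)"
      # attach a Lambda function as the target
      aws events put-targets \
      --rule every-5-minutes \
      --targets "Id"="1","Arn"="arn:aws:lambda:us-east-1:123456789012:function:housekeeping"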

KMS

  • AWS Key Management Service (KMS): provides regional secure key management and encryption and decryption services

    • FIPS 140-2 Level 2 validated service -> certain aspects support Level 3 (CloudHSM via custom key stores)
  • KMS Customer Master Keys (CMKs): created in a region and never leave the region or KMS

    • they can encrypt data up to 4 KB
    • have key policies and can be used to create other keys
    | CMK type         | Can view | Can manage | Dedicated to my account |
    |------------------|----------|------------|-------------------------|
    | Customer managed | Yes      | Yes        | Yes                     |
    | AWS managed CMK  | Yes      | No         | Yes                     |
    | AWS owned CMK    | No       | No         | No                      |
  • Encrypt/decrypt

    | operation  | data amount           | you supply              | it returns     |
    |------------|-----------------------|-------------------------|----------------|
    | encrypt    | up to 4 KB with a CMK | data, the key to use    | ciphertext     |
    | decrypt    | up to 4 KB            | ciphertext              | plaintext      |
    | re-encrypt | up to 4 KB with a CMK | ciphertext, the new key | new ciphertext |
  • Data Encryption Key (DEK): generated using a CMK; can be used to encrypt/decrypt data of any size

    • the encrypted DEK and the encrypted data can be stored together
    • KMS is used to decrypt the DEK, which can then decrypt the data
  • Commands

    aws kms create-key --description "KMS DEMO CMK"
    aws kms create-alias --target-key-id XXX \
    --alias-name "alias/kmsdemo" --region us-east-1
    echo "this is a secret message" > topsecret.txt
    aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
    --output text --query CiphertextBlob
    aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
    --output text --query CiphertextBlob | base64 --decode > topsecret.encrypted
    aws kms decrypt --ciphertext-blob fileb://topsecret.encrypted \
    --output text --query Plaintext | base64 --decode
    # generate-data-key returns a Plaintext key (use it locally, then discard)
    # and a CiphertextBlob (store it with the data) -> envelope encryption
    aws kms generate-data-key --key-id KEYID --key-spec AES_256 --region us-east-1

Deployment

Elastic Beanstalk

  • Elastic Beanstalk (EB): Platform as a Service. You deploy code and, with very few modifications, the service provisions the infrastructure on your behalf

    • handles provisioning, monitoring, autoscaling, load balancing and software updating
    • supports many languages and platforms (Java, .NET, NodeJS, PHP, Ruby, Python, Go, Docker, Apache, IIS, nginx, Tomcat)
  • Elastic Beanstalk application architecture (which is actually a Beanstalk container that stores components as environments)

    • EB environment - worker tier

    • EB environment - web server

    • Swap URL

          graph LR
      
      A(data)
      
      subgraph EBE
        B[ELB]
        C[Autoscaling]
        D[instance]
      end
      
      E(database)
      
      A --> B;
      B --> C;
      C --> D;
      D --> E;
      E --> D;
  • Patterns and anti-patterns

    • YES: provision environment for an application with little admin overhead
    • YES: if you use 1 of the supported languages and can add EB-specific config
    • NO: if you want low-level infrastructure control
    • NO: if you need chef support
  • Deployment options:

    • All at once: an updated application version is deployed to all instances. Quick and simple, but not recommended for production deployments
    • Rolling: splits instances into batches and deploys 1 batch at a time
    • Rolling with additional batch: as above, but provisions a new batch, deploying and testing before removing the old batch (immutable)
    • Blue/green: maintain 2 environments, deploy and swap CNAME
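  • EB CLI example: a hedged sketch using the separate EB CLI (installed as awsebcli); the application, environment, and platform names are placeholders

      # initialise the application in the project directory
      eb init demo-app --platform python-3.8 --region us-east-1
      # create an environment: provisions the ELB, Auto Scaling group, and instances
      eb create demo-env
      # deploy the current code version using the configured deployment policy
      # (all at once, rolling, ...)
      eb deploy
      # check health and open the environment URL
      eb status
      eb open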

OpsWorks

  • OpsWorks: implementation of the Chef configuration manager and deployment platform

    • moves away from the low-level configurability of CloudFormation, but not as far as Elastic Beanstalk
    • lets you create a stack of resources with layers and manage resources as a unit
  • Stack components

    • Stacks
      • a unit of managed infrastructure
      • can use one per application or per platform
      • could use stacks for development, staging or production environments
    • Layers
      • comparable to application tiers within a stack
      • e.g. database layer, application layer, proxy layer
      • recipes are generally associated with layers and configure what to install on instances in that layer
    • Instances
      • instances are EC2 instances associated with a layer
      • configured as 24/7, load based, or time based
    • Apps
      • apps are deployed to layers from a source code repo or S3
      • actual deployment happens using recipes on a layer
      • other recipes are run when deployments happen, potentially to reconfigure other instances
    • Recipes
      • setup: executed on an instance when first provisioned
      • configure: executed on all instances when instances are added or removed
      • deploy and undeploy: executed when apps are added or removed
      • shutdown: executed when an instance is shut down, but before it is stopped