Libraries required

Asyncio: CLI interpreter

  • Install

    asyncio ships with the Python standard library (3.4+), so no separate install is needed
  • Use (starts an asyncio REPL that supports top-level await)

    python -m asyncio

HTTPX: Async GET

  • Install

    python -m pip install httpx
  • Use (similar to requests)

    """
    Make the HTTP call and print the response
    (run inside the asyncio REPL so top-level await works)
    """
    import httpx

    async with httpx.AsyncClient() as client:
        response = await client.get('https://ifconfig.co/json')
        print(response)
        # <Response [200 OK]>

Pydantic: response data validation

Validate the JSON body of the response so we can build typed access to it.
If the API returns something unexpected, we want to raise a friendly exception instead of an AttributeError.

  • Install

    python -m pip install pydantic
  • Use

    """
    Integrate typing: validate response.json()
    - Check the result exists and matches the expected types
    """
    from ipaddress import IPv4Address

    from pydantic import BaseModel


    class IfConfig(BaseModel):
        ip: IPv4Address
        ip_decimal: int


    ifconfig = IfConfig.parse_obj(response.json())
    print(ifconfig)
    # IfConfig(ip=IPv4Address('78.153.21.75'), ip_decimal=1807729179)

Creation of an API client

  • API client code

    from ipaddress import IPv4Address

    from httpx import AsyncClient
    from pydantic import BaseModel, ValidationError

    IFCONFIG_URL = 'https://ifconfig.co/'


    class IfConfig(BaseModel):
        ip: IPv4Address
        ip_decimal: int


    # client inherits from AsyncClient
    class IfConfigClient(AsyncClient):
        def __init__(self):
            super().__init__(base_url=IFCONFIG_URL)

        # async call
        async def get_ifconfig(self):
            response = await self.get('json')

            # validation
            try:
                ifconfig = IfConfig.parse_obj(response.json())
            except ValidationError:
                print('Something went wrong!')
                raise

            return ifconfig
  • API call

    from ifconfig import IfConfigClient

    async with IfConfigClient() as client:
        ifconfig = await client.get_ifconfig()

    print(ifconfig)
    # IfConfig(ip=IPv4Address('78.153.21.75'), ip_decimal=1807729179)

API test with mocks

  • Use
    from ipaddress import IPv4Address
    from unittest.mock import patch

    import ifconfig

    mock_ip = IPv4Address('78.153.21.75')
    mock_ip_decimal = 1807729179
    mock_ifconfig = ifconfig.IfConfig(ip=mock_ip, ip_decimal=mock_ip_decimal)


    # patching an async method yields an AsyncMock; run with an async-aware
    # test runner such as pytest-asyncio
    @patch.object(ifconfig.IfConfigClient, 'get_ifconfig', return_value=mock_ifconfig)
    async def test_ifconfig_processing(get_ifconfig):
        async with ifconfig.IfConfigClient() as client:
            result = await client.get_ifconfig()

        assert result == mock_ifconfig
        get_ifconfig.assert_awaited_once()

What is the Python Black formatter

Black is the uncompromising Python code formatter. By using it, you agree to cede control over the minutiae of hand-formatting. In return, Black gives you speed, determinism, and freedom from pycodestyle nagging about formatting. You will save time and mental energy for more important matters.

Install Black

python -m pip install black

Format

# format a file
python -m black my_dummy_code_file.py
# format a directory
python -m black my_dummy_code_directory/

Check files for formatting

  • Check which Python files in the current directory would be reformatted, without modifying them yet

    python -m black --check .
  • Check what changes Black proposes

    black --check --diff my_dummy_code_file.py

Change number of characters per line

# default black value is 88
black -l 60 my_dummy_code_file.py

Obtain file object

  • Open and close

    """
    Returns a file object opened in a certain mode
    :param file: relative or absolute path to the file, including the extension
    :type file: str
    :param mode: a string that indicates what you want to do with the file
    :type mode: str

    :returns: a file object, with a file-oriented API to an underlying resource
    :rtype: obj
    """
    my_file_object = open(file, mode)
    # clean up
    my_file_object.close()
    Mode                 Representation
    Read                 "r"
    Append               "a"
    Write                "w"
    Create               "x"
    Text mode            "t"
    Binary mode          "b"
    Read + (over)write   "r+" or "w+"
    Read + append        "a+"
    • example
      # The relative path is "dummy_file.txt"; open it for writing in binary mode
      open('dummy_file.txt', 'wb')
      # Open a file inside a folder
      open('dummy_folder/dummy_file.txt')

Read a file

  • Read

    my_file_object = open('dummy_folder/dummy_file.txt', 'r')
    print(my_file_object.mode) # Output: 'r'
    print(my_file_object.read()) # read returns a str
    • File object attributes:
      • name: the name of the file.
      • closed: True if the file is closed. False otherwise.
      • mode: the mode used to open the file.
  • Readline

    """
    Reads one line of the file until it reaches the end of that line.
    A trailing newline character (\n) is kept in the string.

    :param size: optional, the maximum number of characters that you want to read.
    :type size: int

    :returns: one line of the file
    :rtype: str
    """
    file_object.readline(size)
    • example
      file_object = open('dummy_file.txt', 'r')
      print(file_object.readline())
      file_object.close()
  • Readlines

    """
    Reads the file and returns a list of strings.
    Each element of the list is a line of the file.
    A trailing newline character (\n) is kept in each string.

    :param hint: optional, limits the total number of characters read.
    :type hint: int

    :return: a list of strings, representing the whole file
    :rtype: list
    """
    file_object.readlines()
    • example
      file_object = open('dummy_file.txt', 'r')
      # iterate on the list
      for line in file_object.readlines():
          print(line)
      file_object.close()

Create a file

Creates the file so you can write to it dynamically. If it already exists, you will get a FileExistsError.

file_object = open(filename, 'x')

Write on a file

  • write

    """
    Writes to a file, either appending (mode 'a') or overwriting (mode 'w'),
    depending on the mode the file object was opened with.

    :param my_new_content: the content you want to write
    :type my_new_content: str

    :return: the number of characters written
    :rtype: int
    """
    file_object.write(my_new_content)
    • example
    # append
    file_object = open('dummy_file.txt', 'a')
    file_object.write('\nthis is a new line')
    file_object.close()
    # overwrite
    file_object = open('dummy_file.txt', 'w')
    file_object.write('This is new content')
    file_object.close()

Delete a file

import os

os.remove(file_name)

Context manager

Structure which handles closing the file for you

with open(file_name, file_mode) as file_object:
    # do something with file_object
    print(file_object.readlines())
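
For clarity, the with statement above is roughly equivalent to this try/finally pattern (a minimal sketch using the same file_name and file_mode placeholders):

file_object = open(file_name, file_mode)
try:
    # do something with file_object
    print(file_object.readlines())
finally:
    # always runs, even if an exception was raised
    file_object.close()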

Handle exceptions

Exception           Cause
FileNotFoundError   A file or directory is requested but doesn't exist
PermissionError     Trying to run an operation without the adequate access rights
IsADirectoryError   A file operation is requested on a directory
  • Try/except
    try:
        # Try to run this block of code
    except <type_of_exception>:
        # Python stops the process and jumps to this block
    finally:
        # Do this after running the code, even if an exception was raised
    • example
      f = None
      try:
          f = open('names.txt')
      except FileNotFoundError:
          print('The file does not exist')
      finally:
          if f:
              f.close()

Application Integration

Simple Notification Service (SNS)

  • Publisher/subscriber base service

    • SNS coordinates and manages the sending and delivery of messages
    • messages sent to a topic (base) are distributed to subscribers
    • SNS integrates with many AWS services
    • using SNS, CloudWatch can notify admins of important alerts
  • Components

    • Topic: isolated configuration for SNS, including permissions

      • messages (<= 256 KB) are sent to topic
      • subscribers to that topic receive messages
    • Subscriber: endpoints that receive messages for a topic

      • HTTP(S)
      • email and email-json
      • SQS (the message is added to one or more queues)
      • mobile phone push notifications
      • Lambda functions
      • SMS (cell phone message)
    • Publisher: entity that publishes/sends messages to queues

      • application
      • AWS services, including S3 (S3 events), CloudWatch, CloudFormation…
          graph TD
      
      subgraph private VPC
        G[CloudFormation]
        H[EC2]
        I[ASG]
      end
      
      subgraph public Internet
        J((API HTTP/S))
      end
      
      subgraph SNS
        A(Publisher)
        B[Topic]
        C[Subscriber]
      end
      
      D(mail)
      E[SQS Queue]
      F(lambda)
      K[cell phone message]
      
      A --> B;
      B -- send message --> C;
      C --> J;
      J --> B;
      G --> B;
      H --> B;
      I --> B;
      
      C --> D;
      C --> E;
      C --> F;
      C --> K;
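
As a quick illustration of the publisher side, a minimal boto3 sketch (the topic ARN below is a placeholder, not a real resource):

    import boto3

    sns = boto3.client('sns', region_name='us-east-1')

    # publish once; SNS fans the message out to every subscriber of the topic
    response = sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:my-demo-topic',  # placeholder ARN
        Subject='Demo notification',
        Message='This message is distributed to all subscribers',
    )
    print(response['MessageId'])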

Simple Queue Service (SQS)

  • SQS provides fully managed, highly available message queues for inter-process/server/service messaging

    • used mainly to create decoupled architecture
    • messages are added to a queue and retrieved via polling
  • Polling types

    • short polling: available messages are returned ASAP; may return 0 messages, with an increased number of API calls
    • long polling: waits for messages for a given WaitTimeSeconds (more efficient: fewer empty API calls/responses)
  • Queue types

    • Standard (distributed, scalable to nearly unlimited message volume)
    • FIFO (messages are delivered exactly once, in order. Throughput limited by default to ~3,000 messages per second with batching or ~300 without)

  • Elements

    • SQS messages

      • (<= 256 KB) can link data from S3 for larger payloads
      • When a message is polled, it is hidden in the queue
        • it can be deleted when the process is done
        • if not deleted, after a VisibilityTimeout it may return to the queue
    • Queues can be configured with maxReceiveCount, allowing messages that repeatedly fail processing to be moved to a dead-letter queue

    • Lambda functions can be invoked based on messages on a queue, offering better scaling and faster response than Auto Scaling groups for any messages that can be processed quickly

          graph LR
      
      subgraph autoscaling group
        A[Instances]
      end
      
      B[Queue]
      
      subgraph autoscaling group
        C[Instances]
      end
      
      D((Assets S3 bucket))
      
      A -- front-end pool--> B;
      B -- worker pool --> C;
      
      A --> D;
      D --> C;
  • Commands on AWS CLI

    • get configuration
      aws sqs get-queue-attributes \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --attribute-names All
    • send message
      aws sqs send-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --message-body "this is a payload"
    • short poll queue
      # remember messages are not deleted right away
      aws sqs receive-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
    • long poll queue
      # if there are no messages, it will wait up to 10 seconds before returning
      aws sqs receive-message --wait-time-seconds 10 \
      --max-number-of-messages 10 \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
      # alternative
      aws sqs --region us-east-1 receive-message \
      --wait-time-seconds 10 \
      --max-number-of-messages 10 \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest
    • remove message
      # make a short poll, get the ReceiptHandle from the result
      aws sqs delete-message \
      --queue-url https://sqs.us-east-1.amazonaws.com/123456789012/sqstest \
      --receipt-handle "aReceiptHandleGoesHere"
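
The same queue operations can be scripted; a minimal boto3 sketch (the queue URL is the same placeholder used above):

    import boto3

    sqs = boto3.client('sqs', region_name='us-east-1')
    queue_url = 'https://sqs.us-east-1.amazonaws.com/123456789012/sqstest'  # placeholder

    # send
    sqs.send_message(QueueUrl=queue_url, MessageBody='this is a payload')

    # long poll: wait up to 10 seconds for up to 10 messages
    messages = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=10,
    ).get('Messages', [])

    # delete each message once it has been processed
    for message in messages:
        print(message['Body'])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message['ReceiptHandle'])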

Elastic Transcoder

  • Elastic Transcoder: AWS service that allows you to convert media files from an input format to one or more output formats. It’s delivered as a service, and you are billed a per-minute charge
    • Pipeline: a queue for jobs; it stores source and destination settings, notifications, security, and other high-level settings. Jobs are processed in the order they are added, as resources allow

      • can send notifications as jobs progress through various states. These might notify an administrator or initiate further event-driven processing
    • Job: defines the input object and up to 30 output objects/formats. Jobs are added to a pipeline in the same region and use buckets defined in the pipeline for input/output

    • Presets contain transcoding settings and can be applied to jobs to ensure output compatible with various devices, such as iPhones, tablets, or other form factors

          graph TD
      
      subgraph pipeline
        A[Job-1]
        B[Job-2]
      end
      
      subgraph job
        C[input]
        D[output-1]
        E[output-2]
      end
      
      F[presets]
      G[SNS]
      
      A --> C;
      C --> D;
      C --> E;
      F --> E;
      B --> G;
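
A hedged boto3 sketch of submitting a job to an existing pipeline (the pipeline ID, object keys and preset ID below are placeholders to adapt):

    import boto3

    transcoder = boto3.client('elastictranscoder', region_name='us-east-1')

    # submit a job: one input object, one output rendered with a preset
    response = transcoder.create_job(
        PipelineId='1111111111111-abcde1',          # placeholder pipeline ID
        Input={'Key': 'uploads/source-video.mp4'},  # object in the pipeline's input bucket
        Outputs=[
            {
                'Key': 'outputs/video-720p.mp4',
                'PresetId': '1351620000001-000010',  # assumed generic 720p system preset
            },
        ],
    )
    print(response['Job']['Id'])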

Analytics

Athena

  • Amazon Athena: interactive query service that utilizes schema on read, allowing you to run ad-hoc SQL-like queries on data from a range of sources. Results are returned in seconds, and you are billed only for the compute time used and any existing storage costs

    • Serverless query engine capable of reading a wide range of data formats from S3. Uses a schema-on-read approach to allow SQL-like queries against non-relational data (mapping attributes at read time)
    • Athena can query many forms of structured, semi-structured, and unstructured data in S3
    • Athena can be used to query various AWS logs, including flow logs and ELB logs
    • Tables are defined in a data catalog and are applied on read. Athena allows SQL queries against data stored on S3, through the schema-on-read tables
    • No data is modified by Athena, and output can be sent to visualization tools
  • Used for ad-hoc, interactive queries against data in S3, without loading it into a database first

        graph TD
    
      subgraph S3
        A(XML)
        B(JSON)
        C(CSV/TSV)
        D(AVRO)
        E(ORC)
        F(PARQUET)
        G(Source data)
      end
    
      subgraph tables
        H[schema]
        I[table]
        J[schema]
      end
    
      K(Amazon Athena)
    
      A --> G;
      B --> G;
      C --> G;
      D --> G;
      E --> G;
      F --> G;
    
      G --> H;
      H --> I;
      I --> J;
      J --> K;
  • Configuration example

    • DDL code
      CREATE EXTERNAL TABLE IF NOT EXISTS default.vpc_flow_logs (
      version int,
      account string,
      interfaceid string,
      sourceaddress string,
      destinationaddress string,
      sourceport int,
      destinationport int,
      protocol int,
      numpackets int,
      numbytes bigint,
      starttime int,
      endtime int,
      action string,
      logstatus string
      )
      PARTITIONED BY (dt string)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ' '
      LOCATION 's3://{your_log_bucket}/AWSLogs/{account_id}/vpcflowlogs/us-east-1/'
      TBLPROPERTIES ("skip.header.line.count"="1");
    • Query example
      SELECT day_of_week(from_iso8601_timestamp(dt)) AS
      day,
      dt,
      interfaceid,
      sourceaddress,
      destinationport,
      action,
      protocol
      FROM vpc_flow_logs
      WHERE action = 'REJECT' AND protocol = 6
      order by sourceaddress
      LIMIT 100;
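
The same kind of query can also be launched programmatically; a minimal boto3 sketch (the results bucket is a placeholder):

    import time

    import boto3

    athena = boto3.client('athena', region_name='us-east-1')

    execution = athena.start_query_execution(
        QueryString="SELECT dt, sourceaddress, action FROM vpc_flow_logs WHERE action = 'REJECT' LIMIT 10",
        QueryExecutionContext={'Database': 'default'},
        ResultConfiguration={'OutputLocation': 's3://your-athena-results-bucket/'},  # placeholder
    )
    query_id = execution['QueryExecutionId']

    # wait until the query has finished before fetching results
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)['QueryExecution']['Status']['State']
        if state in ('SUCCEEDED', 'FAILED', 'CANCELLED'):
            break
        time.sleep(1)

    if state == 'SUCCEEDED':
        results = athena.get_query_results(QueryExecutionId=query_id)
        for row in results['ResultSet']['Rows']:
            print([field.get('VarCharValue') for field in row['Data']])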

EMR

  • Amazon Elastic MapReduce: tool for large-scale parallel processing of big data and other data workloads.

    • Based on the Apache Hadoop framework and delivered as a managed cluster using EC2 instances
    • EMR is used for huge-scale log analysis, indexing, machine learning, and other large-scale applications
  • Structure

    • Master node: manages the cluster
      • manages HDFS naming, distributes workloads, monitors health
      • log into it via SSH
      • if the master node fails, the cluster fails
    • EMR clusters: have 0 or more core nodes, which are managed by the master node
      • run tasks, manage data for HDFS
      • if they fail, it can cause cluster instability
    • Task nodes (optional)
      • execute tasks, but have no involvement with important cluster functions -> can be spot instances
      • if task node fails, a core node starts the task on another task/core node
    • data can be input/output from S3, intermediate data can be stored using HDFS in the cluster or EMRFS using S3
  • Used as a data-processing platform, scaling to effectively unlimited load levels while storing petabytes worth of data

        graph LR
    
      A((S3))
    
      subgraph EMR
        B[Master node]
        C(Core node)
        D(Task node)
      end
    
      A --> B;
      B --> A;
      B --> C;
      B --> D;
      C --> B;
      C --> D;
      D --> B;
      D --> C;

Kinesis and Firehose

  • Kinesis: scalable and resilient streaming service for AWS. It is designed to ingest large amounts of data from hundreds, thousands, or even millions of producers. Used for analytics

    • consumers can access a rolling window of that data, or it can be stored in persistent storage or database products
    • processes real-time data, and can store it
  • Use Kinesis instead of SQS when

    • you need many producers and many consumers

    • rolling window of data

    • multiple consumers can independently read the same data window (a queue does not allow that)

          graph LR
      A[Producers]
      B[Kinesis essentials]
      C[Kinesis Data Firehose]
      
      subgraph consumers
        D[instances]
        E[lambda]
      end
      
      A --> B;
      B --> C;
      B --> D;
      B --> E;
  • Kinesis essentials

    • Kinesis Stream
      • can be used to collect, process and analyze a large amount of incoming data
      • public service accessible from inside VPCs or from the Internet by an unlimited number of producers
      • includes storage for all incoming data with a 24-hour default window, which can be increased to 7 days for an additional charge
      • data records are added by producers and read by consumers
    • Kinesis Shard
      • added to streams to allow them to scale
      • a stream starts with at least 1 shard, which allows 1 MiB/s of ingestion and 2 MiB/s of consumption
      • shards can be added or removed from streams
    • Kinesis Data Record
      • the basic entity written to and read from Kinesis streams; a data record can be up to 1 MB in size
  • Kinesis data firehose

    • accepts records from a data source and persists them directly into S3, Redshift…
    • records can be transformed before storing them, so you avoid running extra hardware
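
A minimal boto3 sketch of a producer and a simple consumer against a stream (the stream name is a placeholder):

    import json

    import boto3

    kinesis = boto3.client('kinesis', region_name='us-east-1')
    stream = 'demo-stream'  # placeholder

    # producer: the partition key determines which shard receives the record
    kinesis.put_record(
        StreamName=stream,
        Data=json.dumps({'event': 'click', 'user': 42}).encode(),
        PartitionKey='user-42',
    )

    # consumer: read from the start of the first shard's rolling window
    shard_id = kinesis.describe_stream(StreamName=stream)['StreamDescription']['Shards'][0]['ShardId']
    iterator = kinesis.get_shard_iterator(
        StreamName=stream, ShardId=shard_id, ShardIteratorType='TRIM_HORIZON'
    )['ShardIterator']
    for record in kinesis.get_records(ShardIterator=iterator)['Records']:
        print(record['Data'])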

Redshift

  • Redshift: petabyte-scale data warehousing solution (target data storage), used for long-term, strategic analysis

    • column-based database designed for analytical workloads

    • used for OLAP (column based databases, for retrieval and analytics)

    • multiple databases become source data to be injected into a data warehouse solution such as Redshift

    • Load and unload

      • from/to S3
      • backups can be performed to S3
      • various AWS services such as Kinesis can inject data into Redshift
          graph LR
      
      subgraph redshift
        A(Leader node)
        B[Compute node -slices]
        C[Compute node -slices]
      end
      
      D[S3]
      
      A --> B;
      A --> C;
      A -- data load --> D;
      D -- data unload --> A;
  • Components

    • Leader node: receives the query and distributes it to the compute nodes
    • Compute node: runs its portion of the distributed query
    • Slices: partitions of a compute node where the queries are actually executed

Logging and Monitoring

CloudWatch

  • CloudWatch: service that provides near real-time monitoring of AWS products (a metrics repository)
    • you can also import custom metric data in near real-time from some AWS services and on-premises platforms
    • data retention is based on granularity
      • 1h metrics -> 455 days
      • 5m metrics -> 63 days
      • 1m metrics -> 15 days
    • metrics can be configured with alarms that can take actions
    • data can be presented as a dashboard
          graph LR
      
      A[AWS]
      B[custom on premises]
      
      subgraph CloudWatch
        C(metrics)
        D((alarm))
        E((dashboard))
      end
      
      F[SNS]
      G[AutoScaling]
      H[Client]
      
      A --> C;
      B --> C;
      C --> D;
      C --> E;
      D --> F;
      D --> G;
      E --> H;
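
Custom metrics can be pushed from anywhere with API access; a minimal boto3 sketch (the namespace and metric name are placeholders):

    import boto3

    cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

    # publish a single custom data point; alarms and dashboards can then be built on it
    cloudwatch.put_metric_data(
        Namespace='Custom/Demo',  # placeholder namespace
        MetricData=[
            {
                'MetricName': 'QueueDepth',
                'Dimensions': [{'Name': 'Environment', 'Value': 'staging'}],
                'Value': 42.0,
                'Unit': 'Count',
            },
        ],
    )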

Logs

  • CloudWatch Logs: provides functionality to store, monitor and access logs from multiple systems. Answers "what happened?"

    • metric filters: a pattern matched against the text of all log events in all log streams of whichever group the filter is created in, producing a metric

      • can be used to analyze logs and create metrics (e.g. SSH failed logins); see the boto3 sketch at the end of this section
    • log group: container for log streams. Controls retention, monitoring and access control

    • log stream: sequence of log events with the same source

          graph LR
      
      subgraph log group
        A[log stream]
        B[log stream]
        C[log stream]
      end
      
      D(log events)
      E(metric filter)
      F[Alarm]
      
      D --> C;
      C --> E;
      E --> F;
  • CloudWatch Agent installation on an EC2 machine

    ## download
    wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
    ## install
    sudo rpm -U ./amazon-cloudwatch-agent.rpm
    ## execute the CloudWatch agent wizard:
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

    ## config example, it will generate a json file
    # On which OS are you planning to use the agent?
    # 1. linux
    # Are you using EC2 on On-Premises hosts?
    # 1. EC2
    # Which user are you planning to run the agent?
    # 1. root
    # Do you want to turn on StatsD daemon?
    # 2. no
    # Do you want to monitor metrics from CollectD?
    # 2. no
    # Do you want to monitor any host metrics? e.g. CPU, memory, etc.
    # 1. yes
    # Do you want to monitor cpu metrics per core? Additional CloudWatch
    # charges may apply.
    # 2. no
    # Do you want to add ec2 dimensions (ImageID, InstanceID, InstanceType,
    # AutoScalingGroupName) into all of your metrics if the info is available?
    # 2. no
    # Would you like to collect your metrics at high resolution (sub-minute
    # resolution)? This enables sub-minute resolution for all metrics,
    # but you can customize for specific metrics in the output json file.
    # 2. 10s
    # Which default metrics config do you want?
    # 1. Basic

    ## start it
    sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
    -a fetch-config -m ec2 \
    -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json -s
  • Monitor a custom agent on the AWS console

    1. Navigate to CloudWatch -> Metrics in the left-hand menu
    2. In the Custom Namespaces section, click CWAgent. If you don’t see it yet, wait a few more minutes.
    3. In the Metrics section, click host to view the metrics.
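
A metric filter like the SSH example above can also be created programmatically; a hedged boto3 sketch (the log group and names are placeholders):

    import boto3

    logs = boto3.client('logs', region_name='us-east-1')

    # turn failed SSH logins found in a log group into a custom metric
    logs.put_metric_filter(
        logGroupName='/var/log/secure',  # placeholder log group
        filterName='ssh-failed-logins',
        filterPattern='"Failed password"',
        metricTransformations=[
            {
                'metricName': 'SSHFailedLogins',
                'metricNamespace': 'Custom/Security',
                'metricValue': '1',
            },
        ],
    )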

CloudTrail

  • CloudTrail: governance, compliance, risk management and auditing service (answers "who did it?") that records account activity within an AWS account
    • any action taken by users, roles or AWS services is recorded here
    • trails are regional and can be created to give more control over logging, allowing events to be stored in S3 and CloudWatch Logs
  • Components
    • CloudTrail events: activity recorded (90 days by default: called “event history”)
      • management events: events that log control-plane operations (e.g. logging in, configuring security)
      • data events (e.g. object-level events in S3)
  • events -> cloudtrail -> S3 and CloudWatch logs
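
Recent management events from the 90-day event history can be queried directly; a minimal boto3 sketch:

    import boto3

    cloudtrail = boto3.client('cloudtrail', region_name='us-east-1')

    # who (or what) has been creating S3 buckets recently?
    events = cloudtrail.lookup_events(
        LookupAttributes=[{'AttributeKey': 'EventName', 'AttributeValue': 'CreateBucket'}],
        MaxResults=10,
    )
    for event in events['Events']:
        print(event['EventTime'], event.get('Username'), event['EventName'])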

VPC Flow logs

  • VPC flow logs: capture metadata about the traffic flowing in and out of network interfaces within a VPC (account-id, protocol, bytes, log status…) -> they do not sniff traffic content
    • can be placed on a specific network interface within a VPC
    • capture only metadata about the traffic, never the payload itself
    • you may define filters (in doubt, select all)
  • attached to:
    • VPC

    • subnet

    • network interface

          graph LR
      
      A[CloudWatch logs]
      B[S3]
      
      subgraph VPC
        C((Flow logs))
      end
      
      C --> A;
      C --> B;

Flow logs don’t capture some traffic: Amazon DNS server, Windows license activation, DHCP traffic, VPC router

  • Example
    1. Flow logs -> Create flow log
    2. Verify it has active status, and copy the destination field hyperlink
    3. check the policies
    4. Go to CloudWatch, and create a log group
    5. Generate filter, test pattern

Operations

CloudWatch events

  • CloudWatch events: near real time visibility of changes that happen within an AWS account
    • using rules you can match against certain events within an account and deliver those events to a number of supported targets
    • AWS services are either natively supported as event sources and deliver events directly, or allow event pattern matching against CloudTrail events
    • additional rules support scheduled events as sources, allowing a cron-style function for periodically passing events to targets
    • event target examples: EC2, Lambda, Step Functions state machines, SNS topics, SQS queues

KMS

  • AWS Key Management Service (KMS): provides regional secure key management and encryption and decryption services

    • FIPS 140-2 Level 2 validated service -> certain aspects support Level 3 (CloudHSM via custom key stores)
  • KMS Customer Master Keys (CMK): created in a region and never leave the region or KMS

    • they can encrypt data up to 4 KB
    • have key policies and can be used to create other keys
    CMK type           Can view   Can manage   Dedicated to my account
    Customer managed   Yes        Yes          Yes
    AWS managed CMK    Yes        No           Yes
    AWS owned CMK      No         No           No
  • Encrypt/decrypt

    operation    data amount           you supply                     it returns
    encrypt      up to 4 KB with CMK   data, specify the key to use   ciphertext
    decrypt      up to 4 KB            ciphertext                     plain text
    re-encrypt   up to 4 KB with CMK   ciphertext, new key            new ciphertext
  • Data Encryption Key (DEK): generated using a CMK, can be used to encrypt/decrypt data of any size

    • the encrypted DEK and the encrypted data can be stored together
    • KMS is used to decrypt the DEK, which can then decrypt the data
  • Commands

    aws kms create-key --description "KMS DEMO CMK"
    aws kms create-alias --target-key-id XXX \
    --alias-name "alias/kmsdemo" --region us-east-1
    echo "this is a secret message" > topsecret.txt
    aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
    --output text --query CiphertextBlob
    aws kms encrypt --key-id KEYID --plaintext file://topsecret.txt \
    --output text --query CiphertextBlob | base64 --decode > topsecret.encrypted
    aws kms decrypt --ciphertext-blob fileb://topsecret.encrypted \
    --output text --query Plaintext | base64 --decode
    # generate a data key: use the Plaintext locally,
    # store the CiphertextBlob alongside the data -> envelope encryption
    aws kms generate-data-key --key-id KEYID --key-spec AES_256 --region us-east-1
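
The DEK flow above (envelope encryption) as a minimal boto3 sketch; the key alias is a placeholder and the cryptography package's Fernet cipher is used purely as an example local cipher:

    import base64

    import boto3
    from cryptography.fernet import Fernet

    kms = boto3.client('kms', region_name='us-east-1')

    # 1. ask KMS for a data encryption key under the CMK (alias is a placeholder)
    dek = kms.generate_data_key(KeyId='alias/kmsdemo', KeySpec='AES_256')
    plaintext_key = dek['Plaintext']       # use locally, never store
    encrypted_key = dek['CiphertextBlob']  # store next to the data

    # 2. encrypt data of any size locally with the plaintext key
    cipher = Fernet(base64.urlsafe_b64encode(plaintext_key))
    ciphertext = cipher.encrypt(b'a payload much larger than 4 KB could go here')

    # 3. later: ask KMS to decrypt the stored DEK, then decrypt the data locally
    restored_key = kms.decrypt(CiphertextBlob=encrypted_key)['Plaintext']
    print(Fernet(base64.urlsafe_b64encode(restored_key)).decrypt(ciphertext))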

Deployment

Elastic Beanstalk

  • Elastic Beanstalk (EB): Platform as a Service. You deploy code and, with very few modifications, the service provisions the infrastructure on your behalf

    • handles provisioning, monitoring, autoscaling, load balancing and software updating
    • supports many languages and platforms (Java, .NET, NodeJS, PHP, Ruby, Python, go, Docker, apache, IIS, nginx, Tomcat)
  • Elastic Beanstalk application architecture (an application is actually a Beanstalk container, which stores components as environments)

    • EB environment - worker tier

    • EB environment - web server

    • Swap URL

          graph LR
      
      A(data)
      
      subgraph EBE
        B[ELB]
        C[Autoscaling]
        D[instance]
      end
      
      E(database)
      
      A --> B;
      B --> C;
      C --> D;
      D --> E;
      E --> D;
  • Patterns and anti-patterns

    • YES: provision environment for an application with little admin overhead
    • YES: if you use 1 of the supported languages and can add EB-specific config
    • NO: if you want low-level infrastructure control
    • NO: if you need Chef support
  • Deployment options:

    • All at once: an updated application version is deployed to all instances. Quick and simple, but not recommended for production deployments
    • Rolling: splits instances into batches and deploys 1 batch at a time
    • Rolling with additional batch: as above, but provisions a new batch, deploying and testing before removing the old batch (immutable)
    • Blue/green: maintain 2 environments, deploy and swap CNAME

OpsWorks

  • OpsWorks: implementation of the Chef configuration management and deployment platform

    • moves away from the low-level configurability of CloudFormation, but not as far as Elastic Beanstalk
    • lets you create a stack of resources with layers and manage resources as a unit
  • Stack components

    • Stacks
      • a unit of managed infrastructure
      • can use one per application or per platform
      • could use stacks for development, staging or production environments
    • Layers
      • comparable to application tiers within a stack
      • e.g. database layer, application layer, proxy layer
      • recipes are generally associated with layers and configure what to install on instances in that layer
    • Instances
      • instances are EC2 instances associated with a layer
      • configured as 24/7, load based, or time based
    • Apps
      • apps are deployed to layers from a source code repo or S3
      • actual deployment happens using recipes on a layer
      • other recipes are run when deployments happen, potentially to reconfigure other instances
    • Recipes
      • setup: executed on an instance when first provisioned
      • configure: executed on all instances when instances are added or removed
      • deploy and undeploy: executed when apps are added or removed
      • shutdown: executed when an instance is shut down, but before it is stopped

Load balancing and auto-scaling

Fundamentals

  • Load balancing: distribute incoming connections across a group of servers or services
  • ELB: Elastic Load Balancing is a service which provides a set of highly available and scalable load balancers
    • Classic
    • Application (ALB)
    • Network (NLB)
  • Structure
    • a node is placed in each AZ the Load Balancer is active in

    • each node gets 1/N of the traffic (N represents the number of nodes)

    • historically each node only balances instances in its own AZ

    • cross-zone load balancing allows each node to distribute traffic to all instances

    • an ELB can be either

      • public: accepts traffic from the Internet
      • internal: only accessible inside a VPC, often used between tiers
    • ELB accepts traffic via listeners using protocols and ports

      • it can terminate HTTPS at this point -> handles encryption/decryption, reducing CPU usage on instances
          graph TD
      
      A[Client]
      B[Node]
      C[Node]
      
      A --> B
      A --> C

Types

  • Classic Load Balancers (CLB)

    • oldest type; nowadays it should be avoided

      • supports Layer 4 (TCP, SSL) and some HTTP/S features
      • it is not a true Layer 7 device, so no real HTTP/S handling
      • 1 SSL certificate per CLB - can get expensive for complex projects
      • can offload SSL connections - HTTPS to the load balancer and HTTP to the instance (lower CPU and admin overhead on instances)
      • can be associated with Auto Scaling groups
      • a DNS A record is used to connect to the CLB
    • health check

      • configured to check the health of any attached services
      • if a problem is detected, incoming connections won't be routed to the instance until it returns to a healthy state
      • CLB health checks can be TCP, HTTP, HTTPS and SSL based, on ports 1-65535
        • with HTTP/S checks, a HTTP/S path can be tested
          graph LR
      
      A[Route53]
      B[ELB]
      
      subgraph zoneA
        C[instance]
        D[instance]
      end
      
      subgraph zoneB
        E[instance]
        F[instance]
      end
      
      A --> B;
      B --> C;
      B --> D;
      B --> E;
      B --> F;
  • Application Load Balancers (ALB)

    • operates at Layer 7 of the OSI model

    • recommended nowadays as the default LB for VPCs (better performance, usually cheaper)

    • content rules can direct certain traffic to specific target groups

      • host-based rules: route traffic based on the host used
      • path-based rules: route traffic based on URL path
    • ALBs support EC2, ECS, EKS, Lambda, HTTP, HTTP/2 and WebSockets -> these targets can be integrated with a firewall (WAF)

    • use an ALB if you need to use containers or microservices

    • targets -> target groups -> content rules (targets can be in multiple target groups)

    • an ALB can host multiple SSL certificates using SNI

          graph LR
      
      A(Internet)
      B[ALB]
      C[listener and rule]
      D[listener and rule]
      
      subgraph zoneA
        E[instance]
        F[instance]
      end
      
      G[instance]
      
      subgraph zoneB
        H[instance]
        I[instance]
      end
      
      A --> B;
      B --> C;
      B --> D;
      C --> E;
      C --> F;
      C --> G;
      D --> G;
      D --> H;
      D --> I;
  • Network Load Balancers (NLB)

    • newest type of load balancer, operates at Layer 4

    • benefits vs ALB

      • supports other protocols
      • supports target groups
      • less latency (no processing above Layer 4 is required)
      • IP addressable - static address
      • best load balancing performance within AWS
      • source IP address preservation - packets arrive unchanged
      • targets can be addressed using IP address
          graph LR
      
      A[NLB]
      B[TCP 8080 - listener and rule]
      C[TCP 80 - listener and rule]
      
      subgraph TGroup1
        D[IP-1]
        E[IP-2]
      end
      
      subgraph TGroup2
        F[IP-3]
        G[IP-4]
      end
      
      A --> B;
      A --> C;
      B --> D;
      B --> E;
      C --> F;
      C --> G;

Templates and configurations

  • Configure attributes on EC2 instances via launch templates and launch configuration
  • Configurations (what should be provisioned)
    • AMI to use for EC2 launch
    • instance type
    • storage
    • key pair
    • IAM role
    • user data
    • purchase options
    • network configuration
    • security groups
  • Templates: address some weaknesses of legacy launch configurations and add the following features
    • versioning and inheritance
    • tagging
    • more advanced purchasing options
    • new instance features, including:
      • elastic graphics
      • T2/T3 unlimited settings
      • placement groups
      • capacity reservations
      • tenancy options
  • Launch templates should be used over launch configurations where possible
  • Neither can be edited after creation: a new version of the template or a new launch configuration should be created

Auto scaling groups

  • Auto Scaling groups use launch configurations or launch templates and allow automatic scale-out or scale-in based on configurable metrics (usually paired with ELBs)

  • Define ‘minimum size’, ‘desired capacity’ and ‘maximum capacity’

    • can be configured to use multiple AZs to improve high availability
    • unhealthy instances are terminated and recreated
    • ELB health checks or EC2 status checks can be used
  • metrics such as CPU utilization or network transfer can be used either to scale out or scale in using scaling policies (see the sketch after this section)

    • scaling type
      • manual
      • scheduled
      • dynamic
    • cooldowns can be defined to ensure rapid in/out events don’t occur
    • scaling policies
      • simple
      • step scaling
      • target tracking
  • extras required for stress tests

    sudo amazon-linux-extras install epel -y
    sudo yum install stress -y
    stress --cpu 2 --timeout 30000
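
A target tracking policy from the scaling policies listed above can be attached with boto3; a minimal sketch (the group and policy names are placeholders):

    import boto3

    autoscaling = boto3.client('autoscaling', region_name='us-east-1')

    # keep the group's average CPU around 50%; the ASG name is a placeholder
    autoscaling.put_scaling_policy(
        AutoScalingGroupName='demo-asg',
        PolicyName='cpu-target-50',
        PolicyType='TargetTrackingScaling',
        TargetTrackingConfiguration={
            'PredefinedMetricSpecification': {'PredefinedMetricType': 'ASGAverageCPUUtilization'},
            'TargetValue': 50.0,
        },
    )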

VPN and Direct Connect

VPC VPN (IPsec)

  • VPC Virtual Private Networks provide a software-based secure connection between a VPC and on-premises networks
  • Components
    • Virtual Private Cloud (VPC)
    • Virtual Private Gateway (VPG) attached to a VPC
    • Customer Gateway (CGW) - configuration for on-premises router
    • VPN connection (using 1 or 2 IPsec tunnels)
  • Best practices
    • use dynamic VPNs (which use BGP) where possible

    • connect both tunnels to your CGW - VPC VPN is HA by design

    • where possible, use 2 VPN connections and 2 CGWs

    • speed (without Direct Connect Partner): 1 or 10 Gbps

          graph LR
      
      subgraph VPC
        A((VPC router))
        B((VPG))
        C((Tunnel endpoint A))
        D((Tunnel endpoint B))
      end
      
      subgraph datacenter
        H((CGW-1))
        I((CGW-2))
      end
      
      E(VPN-1);
      F(VPN-2);
      
      A --> B;
      B --> C;
      B --> D;
      C --> E;
      C --> F;
      D --> E;
      D --> F;
      E --> H;
      F --> I;

Direct Connect architecture

  • Direct Connect (DX): physical connection between your network and AWS either directly via a cross-connect and a customer router at a DX location or via DX partner
  • Dedicated connections are ordered via AWS and use single-mode fiber, running either 1 Gbps using 1000BASE-LX or 10 Gbps using 10GBASE-LR
  • Virtual Interfaces (VIF) run on top of a DX
    • public VIFs can access AWS public services such as S3 only

    • private VIFs are used to connect into VPCs. DX is not highly available or encrypted

          graph LR
      
      subgraph AWS cloud
        A((VGW on VPC))
        B[S3]
        C[DynamoDB]
      end
      
      subgraph DX location
        D[AWS direct connect]
        E((Customer or partner router))
      end
      
      subgraph Customer
        F((Customer on-premises router))
      end
      
      A --> D;
      B --> C;
      C --> D;
      D --> E;
      E -- private VIF --> F;
      E -- public VIF --> F;

Direct Connect vs. VPN

  • VPN
    • urgent need - can be deployed in minutes
    • cost constrained - cheap and economical
    • low end or consumer hardware - DX requires BGP
    • encryption required
    • flexibility to change locations
    • highly available options available
    • short-term connectivity (DX generally has a minimum term due to the physical transit connections required) - not applicable if you are in a DX location because then it's almost on demand
  • Direct Connect
    • higher throughput
    • consistent performance (throughput)
    • consistent low latency
    • large amounts of data - cheaper than VPN for higher volumes
    • no contention with existing Internet connection
  • Both
    • VPN as a cheaper HA option for DX
    • VPN as an additional layer HA (in addition to 2 DX)
    • if some form of connectivity is needed immediately, VPN provides it before the DX connection is live
    • can be used to add encryption over the top of a DX (public VIF/VPN)

Snow*

  • Methods for moving large amounts of data quickly into or out of AWS

  • No need to worry about writing code or about the speed or data allocation of your Internet, VPN or DX connection

  • You log a job and receive an empty device, or one full of the data requested -> perform the copy with your usual tooling and ship the device back

  • Types

    • Snowball
      • for inbound or outbound jobs
      • log a job and an empty device, or a device with your data, is shipped
      • ideal for TB or PB data transfers - 50 TB or 80 TB capacity per Snowball
      • 1 Gbps (RJ45 1000BASE-T) or 10 Gbps (LR/SR optics) using an SFP
      • data encryption using KMS
      • generally used from 10 TB -> 10 PB (the economical range)
      • larger jobs or multiple locations can use multiple Snowballs
      • end-to-end process time is low for the amount of data: week(s)
    • Snowball Edge
      • includes both storage and compute
      • larger capacity
      • 10 Gbps (RJ45), 10/25 Gbps (SFP), 45/50/100 Gbps (QSFP+)
      • compute can be used for local instances or lambda functionality
      • 3 versions
        • edge storage optimized: 80 TB, 24 vCPUs and 32 GiB RAM
        • edge compute optimized: 100 TB + 7.68 TB NVMe, 52 vCPUs and 208 GiB RAM
        • edge compute optimized with GPU: as above, with a GPU equivalent to a P3 EC2 instance
    • Snowmobile
      • portable storage data center within a shipping container on a semi-truck
      • available in certain areas via special order from AWS
      • used when a single location needs to transfer 10 PB+
      • each Snowmobile can transfer up to 100 PB
      • not economical for sub-10 PB transfers or where multiple locations are required
      • situated on site and connected to your data center for the duration of the transfer

Data and DB migration

Storage Gateway

  • Hybrid storage service that allows you to migrate data into AWS, extending your on-premises storage capacity using AWS
  • Types
    • File gateway (files are stored as objects in S3 and can be accessed directly from S3)
    • Volume gateway (provides block storage in S3 with point-in-time backups as EBS snapshots)
    • Tape gateway (VTL)

AWS DMS

  • AWS Database Migration Service: service to migrate relational databases; it can migrate to and from any location with network connectivity to AWS

    • compatible with a broad range of DB sources, including Oracle, MS SQL, MySQL, MariaDB, PostgreSQL, MongoDB, Aurora and SAP

    • data can be synced to most of the above engines, as well as Redshift, S3 and DynamoDB

    • you can also use the Schema Conversion Tool (AWS SCT) to transform between different database engines as part of a migration

          graph LR
      
      A((source))
      
      subgraph AWS DMS
        B(Source Endpoint)
        C[Replication task on replication instance]
        D(Destination Endpoint)
      end
      
      E((Target))
      
      A --> B;
      B --> C;
      C --> D;
      D --> E;
  • with DMS, at a high level, you provision a replication instance, define source and destination endpoints that point at the source and target databases, and create a replication task. DMS handles the rest, and you can continue using your database while the process runs. DMS is useful in a number of scenarios:

    • scaling database resources up or down without downtime
    • migrating databases from on-premises to AWS, from AWS to on-premises, or from/to other cloud platforms
    • moving data between different DB engines, including schema conversion
    • partial/subset data migration
    • migration with little to no admin overhead, as a service

Identity Federation and SSO

IDF

  • Identity Federation (IDF) is an architecture where identities of an external identity provider (IDP) are recognized. Single sign-on (SSO) is where the credentials of an external identity are used to allow access to local systems (e.g. AWS)

  • Types

    • Cross-account roles: a remote account (IDP) is allowed to assume a role and access your account's resources (see the sketch at the end of this section)
    • SAML 2.0 IDF: an on-premises or AWS-hosted directory service instance is configured to allow Active Directory users to log in to the AWS console
    • Web Identity Federation: IDPs such as Google, Amazon and Facebook are allowed to assume roles and access resources in your account
  • Cognito and the Secure Token Service (STS) are used for IDF

  • A federated identity is verified using an external IDP; by proving the identity (using a token or assertion of some kind), it is allowed to swap that ID for temporary AWS credentials by assuming a role

  • SAML 2.0 structure

        graph LR
    
      subgraph AWS cloud
        D[AWS SSO endpoint]
        E[STS]
        F[Role]
        G[AWS console]
      end
    
      subgraph enterprise identity provider
        A[enterprise desktop]
        B[IDP -ADSF]
        C(Identity Store)
      end
    
      A -- login ADFS portal --> B;
      B -- authenticate store --> C;
      C -- authenticated store --> B;
      B -- token assertion SAML --> A;
      A -- login AWS console --> G;
      A -- post SAML assertion--> D;
      D -- redirect AWS temp credentials --> A;
      D --> E;
      D --> F;
  • Web identity structure

        graph LR
    
      A[application Web or mobile]
      B(user)
      C((Google ID provider))
    
      subgraph AWS cloud
        D[STS temp credentials]
        E[Amazon Cognito]
        F[Amazon DynamoDB]
      end
    
      A -- anonymous usage --> B;
      A -- redirect for login --> C;
      C -- token --> A;
    
      C --> D;
      C -- token exchange Google/Cognito --> E;
      C -- resource access Temp Credentials --> F;
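
For the cross-account role type listed above, a minimal boto3 sketch of swapping an identity for temporary credentials (the role ARN is a placeholder):

    import boto3

    sts = boto3.client('sts')

    # assume a role in another account and receive temporary credentials
    assumed = sts.assume_role(
        RoleArn='arn:aws:iam::123456789012:role/CrossAccountReadOnly',  # placeholder
        RoleSessionName='federated-demo-session',
    )
    credentials = assumed['Credentials']

    # use the temporary credentials to access the other account's resources
    s3 = boto3.client(
        's3',
        aws_access_key_id=credentials['AccessKeyId'],
        aws_secret_access_key=credentials['SecretAccessKey'],
        aws_session_token=credentials['SessionToken'],
    )
    print(s3.list_buckets()['Buckets'])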

When to use Identity Federation

  • How to use IDF + when to use IDF
    • Enterprise access to AWS resources
      • users/staff have existing pool of identities
      • you need those identities to be used across all enterprise systems, including AWS
      • access to AWS resources using SSO
      • potentially 10,000+ users, more than IAM can handle
      • you might have an ID team within your business
    • Mobile and web apps
      • mobile/webapp requires access to AWS resources
      • you need a certain level of guest access (and extra access once logged in)
      • customers have other identities (Google, Twitter, Facebook…) and you need to use those
      • you do not want credentials stored in your application
      • could be millions or more users - beyond the capabilities of IAM
      • customers might have multiple 3rd-party logins, but they represent 1 real person
    • Centralized Identity Management (AWS accounts)
      • <1000 AWS accounts in an organization
      • need a central store of IDs - either IAM or an external provider
      • role switching is used from an ID account into member accounts