AWS Global Infrastructure And Management

Introduction to AWS Global Infrastructure, Route 53, CloudFront, Scalability, Disaster Recovery and Infrastructure Management

Hitesh Sahu

Mon Sep 29 2025

AWS Global Infrastructure

Global infrastructure ensures high availability if a single data center fails.

  • Serves customers in 245 countries and territories

  • AVAILABILITY ZONE (AZ)

    84 AZs: each is a group of data centers with redundant power, compute, and data within a Region.

    • AZs in a Region are physically separated by many kilometers, yet all sit within about 100 km (60 miles) of each other to keep latency low.
    • Deploy the app redundantly in at least 2 AZs in a Region for Disaster Recovery (DR) planning
  • REGION

    26 Regions: each is a group of AZs in a geographically isolated location, placed near high traffic demand.

    • Regions are connected to each other over fiber optics
    • Factors for choosing a Region
      • Data compliance: data does not leave a Region unless you explicitly move or replicate it
      • Pricing: AWS costs less in the USA than in Brazil because of the tax structure.
      • Features: some services are not available in every Region, e.g. quantum computing.
      • Proximity: closer to the customer = lower latency
  • AWS Wavelength

    Deploy AWS services and infrastructure at the edge of 5G networks

    • Users on 5G reach applications at the edge location with low latency
    • High-bandwidth, secure connection back to the Region
  • AWS Local Zone

    Extend the AWS network to place compute close to users for low-latency applications, e.g. low-latency game play

    • Extend a VPC to more locations by creating a subnet in the Local Zone
    • Lets a Region serve latency-sensitive users who are far from its AZs

Out of Regions

EDGE LOCATION

Cache data at a nearby location to reduce latency

  • 216 locations around the world
  • Preprocess data
  • Transcode data in advance

AMAZON CLOUDFRONT (Amazon CDN)

Local cache of data closer to the customer. Can serve data, video, etc. at low latency

  • Global service that uses the global edge locations
  • DDoS protection through Shield and WAF integration
  • Supports HTTPS with a certificate
  • Cached content can have a TTL
  • Great for distributing S3 bucket contents across the globe

CloudFront Caching

A cache policy defines the minimum, maximum, and default TTL

  • Goal is to maximize cache hits to minimize requests and load on the origin
  • TTL ranges from 0 to 1 year; cached objects can be removed early with the CreateInvalidation API
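To make the interplay of the three TTL values concrete, here is a minimal sketch of a simplified model: when the origin sends a `max-age`, the edge clamps it between the minimum and maximum TTL, and when the origin sends nothing, the default TTL applies. The function name `effective_ttl` and its parameters are illustrative assumptions, not a CloudFront API, and real CloudFront behavior has more cases (`s-maxage`, `Expires`, `no-cache`).

```python
from typing import Optional


def effective_ttl(origin_max_age: Optional[int],
                  min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Seconds an edge would keep an object in this simplified model."""
    if origin_max_age is None:
        # Origin sent no caching header: use the default, never below the minimum.
        return max(default_ttl, min_ttl)
    # Origin sent a max-age: clamp it into [min_ttl, max_ttl].
    return max(min_ttl, min(origin_max_age, max_ttl))


ONE_DAY = 86_400
ONE_YEAR = 31_536_000

print(effective_ttl(None, 0, ONE_DAY, ONE_YEAR))    # 86400 (default applies)
print(effective_ttl(10, 60, ONE_DAY, ONE_YEAR))     # 60 (raised to the minimum)
print(effective_ttl(10**9, 0, ONE_DAY, ONE_YEAR))   # 31536000 (capped at the maximum)
```

A higher minimum TTL raises the cache hit ratio at the price of serving staler content, which is the trade-off the cache policy controls.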

CloudFront caches content at the edge location based on:

1. Header

  • CloudFront controls TTL using:
    • Cache-Control header
    • Expires header
  • CloudFront header settings can be configured to:
    1. All

      Forward all request headers to the origin:

      • No caching
      • TTL = 0
    2. Whitelist

      Forward only whitelisted request headers to the origin

      • Caching based on the values of the specified headers
    3. None

      Forward only the default CloudFront headers, no request headers

      • No caching based on request headers => best cache hit ratio
    4. Origin Custom Header:

      CloudFront adds a constant custom header to every request sent to the origin

2. Session Cookies & Query String Parameters

  • Cookies are key-value pairs, like headers
  • CloudFront settings can be configured to:
    1. Default:

      Don't pass any cookies/query strings to the origin.

      • Caching is independent of cookies/query strings
    2. Whitelist

      Send selected cookies/query strings to the origin

      • Caching based on the values of those cookies/query strings
    3. All

      Forward all cookies/query strings

      • Caching based on all cookies/query strings
      • Worst caching
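The whitelist idea for headers, cookies, and query strings can be sketched as a cache-key function: only whitelisted values take part in the key, so requests that differ elsewhere share one cached object. This is an illustrative model, not CloudFront's real key format; `cache_key` and its parameters are hypothetical names.

```python
from urllib.parse import urlsplit, parse_qsl


def cache_key(url, headers, cookies,
              header_allow=(), cookie_allow=(), query_allow=()):
    """Build a cache key from the path plus whitelisted request values only."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    lowered = {name.lower(): value for name, value in headers.items()}

    key = [parts.path]
    key += [f"q:{n}={query[n]}" for n in sorted(query_allow) if n in query]
    key += [f"h:{n.lower()}={lowered[n.lower()]}"
            for n in sorted(header_allow) if n.lower() in lowered]
    key += [f"c:{n}={cookies[n]}" for n in sorted(cookie_allow) if n in cookies]
    return "|".join(key)


request = dict(
    url="https://example.com/img.png?size=large&utm=abc",
    headers={"Accept-Language": "en", "User-Agent": "x"},
    cookies={"session": "1"},
)

# "None"/"Default" setting: every variant of the request shares one object.
print(cache_key(**request))  # /img.png

# Whitelist setting: one cached object per distinct whitelisted value.
print(cache_key(**request, header_allow=["Accept-Language"], query_allow=["size"]))
# /img.png|q:size=large|h:accept-language=en
```

Forwarding everything would put `User-Agent`, `utm`, and `session` into the key as well, which is why the "All" setting gives the worst cache hit ratio.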

CloudFront Origins

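OAI (Origin Access Identity)

Special CloudFront identity (not an IAM role) that the S3 bucket policy grants access to, so the bucket is reachable only through CloudFront. Origin Access Control (OAC) is the newer replacement.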

CloudFront Origin Group

High availability with failover.

  • If the primary origin fails, requests are sent to the secondary origin
  • Works with EC2 & S3 buckets

CloudFront Multi Origin

Cache behaviors direct each request to an origin (EC2 or S3) based on its path pattern
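As a rough illustration of path-pattern routing, the sketch below checks behaviors in order and falls back to a default origin when nothing matches. The behavior list and origin names are hypothetical, and `fnmatchcase` only approximates CloudFront's `*` and `?` wildcards (it also accepts `[...]` sets, which CloudFront path patterns do not).

```python
from fnmatch import fnmatchcase

# Hypothetical cache behaviors, evaluated top to bottom: (path pattern, origin).
BEHAVIORS = [
    ("/api/*", "ec2-origin"),
    ("/static/*", "s3-origin"),
]
DEFAULT_ORIGIN = "s3-origin"  # plays the role of the default (*) behavior


def pick_origin(path: str) -> str:
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN


print(pick_origin("/api/users/42"))    # ec2-origin
print(pick_origin("/static/app.css"))  # s3-origin
print(pick_origin("/index.html"))      # s3-origin (default behavior)
```

Because the first match wins, more specific patterns belong above broader ones.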

Origin Types:

1. S3 Bucket:

Connect to an S3 bucket to cache its data or distribute it across Regions

  • For distributing files and caching them at the edge
  • As an ingress to speed up file uploads, paired with S3 Transfer Acceleration
  • Great for static content; for real-time dynamic content use S3 Cross-Region Replication

| CloudFront | S3 Cross-Region Replication |
| --- | --- |
| Uses the global edge network | Must be set up for each Region |
| Content is cached for the TTL | Content is updated in near real time |
| Static content cached all around the globe | Dynamic content at low latency for selected Regions |

2. Custom Origin

Any HTTP endpoint used as an origin

  • Can be: EC2, S3 website, or ALB
  • To enable the HTTP endpoint:
    • The EC2 instance / ELB must be public
    • The security group must allow the public IPs of all edge locations
    • For an S3 website, static website hosting must be enabled on the bucket

CloudFront Geo Restriction

Blacklist/whitelist countries from accessing content, using a third-party GeoIP database

CloudFront Signed URL/Cookie

A signed URL gives access to a single file, while a signed cookie can give access to multiple files because the cookie is reused across requests

  • Set the URL expiration
  • Set the IP range that can access the content
  • Works for both S3 & HTTP origins
  • Can filter by IP, path, expiration
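The IP, path, and expiration filters map onto the custom policy document that gets signed. The sketch below only builds that JSON policy; signing it with the RSA private key is left out because it needs a crypto library. The function name, the `now` parameter, and the example distribution URL are assumptions for illustration.

```python
import json
import time


def build_custom_policy(resource_url, expires_in_seconds, source_cidr=None, now=None):
    """Return a compact CloudFront custom policy as a JSON string (unsigned)."""
    now = int(time.time()) if now is None else now
    # Expiration: the URL stops working after this epoch timestamp.
    condition = {"DateLessThan": {"AWS:EpochTime": now + expires_in_seconds}}
    if source_cidr is not None:
        # Optional IP filter: only clients in this CIDR range may use the URL.
        condition["IpAddress"] = {"AWS:SourceIp": source_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


# Hypothetical distribution URL; the trailing * covers every file under /private/.
print(build_custom_policy(
    "https://example.cloudfront.net/private/*",
    expires_in_seconds=3600,
    source_cidr="203.0.113.0/24",
))
```

The wildcard in `Resource` is what lets one policy cover a whole path, which is also how a signed cookie grants access to multiple files.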

CloudFront Signed URL vs S3 Pre-Signed URL

| CloudFront Signed URL | S3 Pre-Signed URL |
| --- | --- |
| Works for S3 & HTTP origins | Works with S3 only |
| Can be cached at the edge location | Signed with an IAM key, with an expiry time |
| Allows filtering based on IP, path, expiration | Direct sharing of a file using the signer's IAM credentials |

Signed URL Keys

1. Public Key & Key Group

Upload RSA public keys to CloudFront and group them in a key group

  • A key group can contain up to 5 keys
  • Any IAM user with the required permissions can create keys
  • New way of managing keys

2. CloudFront Key Pairs

Public & private key pair created by the root user

  • Old way of creating key pairs for signed URLs
  • Only the root user can generate the key pair

CloudFront Price Classes

  1. Price Class All:

    All edge locations worldwide, best performance but most expensive

  2. Price Class 200:

    Most regions, excluding the most expensive ones

  3. Price Class 100:

    Only the least expensive regions

S3 Transfer Acceleration

Speed up global uploads and downloads of files to and from an S3 bucket

  • Files travel through a nearby edge location and then over the AWS network to the S3 bucket
  • Speeds up long-distance file uploads

Global Accelerator

Use the AWS network to speed up application delivery across the globe

  • Intelligent routing to lower latency: consistent performance
  • Health checks on endpoints drive automated failover
  • Failover in less than 1 minute
  • DDoS protection through AWS Shield
  • Works with: ALB, NLB, Elastic IP, EC2

Problem with the current Internet model

  • The conventional internet slows down application delivery across the globe
  • The AWS network improves speed using its global infrastructure

Use Case:

  • Global Accelerator is a good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP addresses or deterministic, fast regional failover. Both Global Accelerator and CloudFront integrate with AWS Shield for DDoS protection.
  • Use Global Accelerator to distribute live sports results with low latency

Global Accelerator vs CloudFront

Unicast IP

Each server holds its own IP address

Anycast IP Address

All servers hold the same IP address & the client is routed to the nearest server

  • Anycast IP sends traffic to the nearest edge location
  • Global Accelerator makes use of Anycast IP

AWS OUTPOSTS

  • Used for hybrid cloud (private + public cloud)
  • Private mini Region for a customer, installed on the customer's premises
  • Isolated AWS capacity for specific use cases
  • Uses the same AWS infrastructure and services as the AWS cloud to simplify operations.
  • The customer's data center is responsible for physical security

Scalability

Scalability means the capacity to handle higher load by scaling hardware

Scaling Group

  • Vertical Scalability:

    Increase size of instance by upgrading hardware

    • bound by hardware limit
  • Horizontal Scalability:

    Increase number of instances.

    • Distributed System
    • Auto Scaling Group
    • Load Balancer

Availability

Run the app in at least 2 AZs to survive the loss of a data center

  • The secondary AZ can be passive or active
  • Auto Scaling Group across multiple AZs
  • Load Balancer across multiple AZs

Disaster Recovery (DR)

Types of Disaster Recovery

  • Traditional: on-premises data center to on-premises data center
  • Hybrid: on-premises data center to cloud
  • Full Cloud: cloud to cloud

Tips:

  • Back up regularly: EBS snapshots, RDS backups, S3, lifecycle policies, Cross-Region Replication
  • Use highly available resources: Route 53, EFS, ElastiCache, Direct Connect or Site-to-Site VPN
  • Replicate data: Storage Gateway, multi-Region replication
  • Automate as much as you can: CloudFormation, Elastic Beanstalk, AWS Lambda, CloudWatch
  • Chaos-test the infrastructure

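RPO: Recovery Point Objective

How much data loss is acceptable after a disaster, measured as the time between the last recoverable backup and the disaster

RTO: Recovery Time Objective

How much downtime is acceptable between the disaster and recovery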
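Both objectives are time spans, so an incident can be checked against them with simple date arithmetic. The sketch below uses made-up timestamps and hypothetical targets (1 hour RPO, 4 hours RTO) purely as an illustration.

```python
from datetime import datetime, timedelta


def measure_recovery(last_backup, disaster, service_restored):
    """Return (data-loss window, downtime) for one incident."""
    data_loss_window = disaster - last_backup   # compare against the RPO
    downtime = service_restored - disaster      # compare against the RTO
    return data_loss_window, downtime


# Hypothetical incident timeline.
last_backup = datetime(2025, 1, 1, 2, 0)
disaster = datetime(2025, 1, 1, 2, 45)
service_restored = datetime(2025, 1, 1, 5, 15)

loss, downtime = measure_recovery(last_backup, disaster, service_restored)
print(loss, loss <= timedelta(hours=1))          # 0:45:00 True  (RPO met)
print(downtime, downtime <= timedelta(hours=4))  # 2:30:00 True  (RTO met)
```

Shortening the backup interval lowers the achievable RPO; keeping more of the system pre-provisioned lowers the achievable RTO.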

RPO & RTO

| Single Region, Single AZ | Single Region, Multi AZ | Multi Region Active-Passive (RW-R) | Multi Region Active-Active (RW-RW) |
| --- | --- | --- | --- |
| Easy setup | Simple setup | Difficult | More difficult |
| Low availability | High availability | High availability | High availability |
| High global latency | High global latency | Low global read latency but high global write latency | Low read & write latency globally |

Disaster Strategy

Disaster Plan

| Multi AZ Deployment | Multi Region Deployment | Read Replica |
| --- | --- | --- |
| High Availability | Disaster Recovery | Scalability |

1. Backup & Restore

  • Recreate the infrastructure after a disaster
  • Easy to set up & less expensive
  • Recover from snapshots or Snowmobile

2. Pilot Light

A small version of the app always runs in the cloud, with the database ready to go

  • Only for critical workloads

3. Warm Standby

Full system up & running at minimum size, able to scale to full size to handle a disaster

  • Costlier, but lower RPO and RTO

4. Multi-Site Active/Active

Full system at full scale, ready to go

  • Very low RTO but expensive

ROUTE 53

Highly available and scalable managed Domain Name System (DNS) web service

  • Translates domain names to the IP addresses of sites
  • Buy and manage domains in AWS, e.g. $12/year
  • Only AWS service with a 100% availability SLA

DNS Terminology

  • Domain Registrar: GoDaddy, Route 53
  • Zone File: holds the DNS records
  • Name Server (NS): server that resolves DNS queries
  • Root: the trailing . at the end of api.www.example.com.
  • Top Level Domain (TLD): .com in api.www.example.com
  • Second Level Domain (SLD): example.com in api.www.example.com
  • Subdomain: www.example.com in api.www.example.com
  • Domain Name: api.www.example.com
  • Fully Qualified Domain Name (FQDN): api.www.example.com. (in http://api.www.example.com. the http:// prefix is the protocol, not part of the FQDN)
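The terminology can be checked by splitting a name on its dots. This is a naive sketch that assumes a single-label TLD such as .com (it would mislabel names under multi-label suffixes like .co.uk); `dns_parts` is a hypothetical helper name.

```python
def dns_parts(fqdn: str) -> dict:
    """Split a fully qualified domain name into the parts named above."""
    labels = fqdn.rstrip(".").split(".")  # drop the root dot, then split labels
    return {
        "root": ".",
        "tld": "." + labels[-1],
        "sld": ".".join(labels[-2:]),
        "subdomain": ".".join(labels[-3:]),
        "domain_name": ".".join(labels),
    }


for part, value in dns_parts("api.www.example.com.").items():
    print(f"{part:12} {value}")
# root         .
# tld          .com
# sld          example.com
# subdomain    www.example.com
# domain_name  api.www.example.com
```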

Common Records:

  • A Record:

    Hostname -> IPv4 address

  • AAAA Record:

    Hostname -> IPv6 address

  • CNAME: works only for non-root domains

    Hostname -> hostname.

  • Alias Record: works with root & non-root domains

    Hostname -> AWS resource

    • Free of charge
    • Route 53 automatically recognizes changes in the resource's IP addresses
    • Always of type A/AAAA (IPv4/IPv6)
    • Native health check capability
    • TTL is set automatically by AWS; we can't set it
    • Target can be:
      • ELB, CloudFront
      • S3 website
      • API Gateway, Global Accelerator
      • VPC interface endpoint

Hosted Zone

Container for the records that route traffic to a domain & its subdomains

  • $0.50 per hosted zone per month
  1. Public Hosted Zone:

    Holds records that route traffic on the Internet

    • Anyone can query it to resolve public resources
  2. Private Hosted Zone:

    Holds records that route traffic inside a VPC (private domain names)

    • Resolves private cloud resources

Record TTL

How long clients cache a DNS record, which reduces load on the DNS server

  • Low TTL = more requests to Route 53
  • High TTL = clients may hold outdated records
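The trade-off can be put into numbers with a back-of-the-envelope model: a caching client looks a record up at most once per TTL window, and after a record change it may keep the old answer for up to one TTL. The helper name `max_lookups_per_day` is an assumption for illustration.

```python
import math

SECONDS_PER_DAY = 86_400


def max_lookups_per_day(ttl_seconds: int) -> int:
    """Upper bound on lookups per day from one caching client for one record."""
    return math.ceil(SECONDS_PER_DAY / ttl_seconds)


for ttl in (60, 300, 86_400):
    print(f"TTL {ttl:>6}s: up to {max_lookups_per_day(ttl):>4} lookups/day, "
          f"stale for up to {ttl}s after a change")
# TTL     60s: up to 1440 lookups/day, stale for up to 60s after a change
# TTL    300s: up to  288 lookups/day, stale for up to 300s after a change
# TTL  86400s: up to    1 lookups/day, stale for up to 86400s after a change
```

A common approach is to lower the TTL ahead of a planned record change and raise it again afterwards.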

Route 53 Health Check 🩺

Route traffic only to healthy resources

  • Checks the health of public resources
  • Helps with automated DNS failover

1. Monitor an Endpoint

Send HTTP requests to estimate the health of a resource

  • About 15 global health checkers send requests and expect a 2XX or 3XX response code
  • Failure threshold (default 3)
  • Protocols: HTTP/S, TCP
  • Health check interval: 30 sec by default (10 sec at higher cost)
  • Checkers can parse the first 5120 bytes of the response
  • The resource must allow requests from the Route 53 health checkers

2. Monitor Other Health Checks

Combine the results of child health checks

  • Can monitor up to 256 child health checks
  • Can use AND, OR, NOT to define the condition
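One way to picture the AND/OR/NOT combinations is as a threshold over the child results: AND needs every child healthy, OR needs at least one, and NOT inverts a result. The functions below are an illustrative model with hypothetical names, not Route 53 API calls.

```python
def calculated_health(children, threshold):
    """Healthy when at least `threshold` child health checks are healthy."""
    return sum(children) >= threshold  # True counts as 1, False as 0


def all_healthy(children):   # AND
    return calculated_health(children, len(children))


def any_healthy(children):   # OR
    return calculated_health(children, 1)


def inverted(status):        # NOT
    return not status


# Hypothetical child check results: web tier, API tier, database.
children = [True, True, False]

print(all_healthy(children))            # False
print(any_healthy(children))            # True
print(calculated_health(children, 2))   # True (at least 2 of 3)
print(inverted(all_healthy(children)))  # True
```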

3. Monitor CloudWatch Alarms

Use CloudWatch metrics to set a CloudWatch alarm that the health check follows

  • Checks the health of private resources

Routing Policy

How Route 53 responds to DNS queries

1. Simple Routing Policy:

Route traffic to a single resource

  • If the record returns more than one value, the client chooses one at random
  • 🩺 No health check

2. Weighted Routing Policy: Load Balance

Direct traffic to endpoints in proportion to the weight assigned to each record

  • 🩺 Can be associated with a health check
  • Weight of 0 = no traffic
  • All weights 0 = equal traffic to all records
  • Weights do not need to sum to 100
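The weight rules reduce to one formula: a record's share of traffic is its weight divided by the sum of all weights, with the all-zero case treated as an equal split. The sketch below uses hypothetical record names and is an illustration of the arithmetic, not Route 53 code.

```python
def traffic_shares(weights: dict) -> dict:
    """Map each record to its expected fraction of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        # All weights 0: every record gets an equal share.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(traffic_shares({"blue": 3, "green": 1}))   # {'blue': 0.75, 'green': 0.25}
print(traffic_shares({"blue": 10, "green": 0}))  # {'blue': 1.0, 'green': 0.0}
print(traffic_shares({"blue": 0, "green": 0}))   # {'blue': 0.5, 'green': 0.5}
```

Setting a new version's weight to a small value and raising it step by step is a common way to shift traffic gradually.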

3. Latency Routing Policy: Minimize latency

Direct traffic to the resource with the lowest latency for the user

  • 🩺 Can be associated with a health check

4. Failover Routing Policy: Help Disaster Recovery

Route traffic to the healthy resource, based on health checks

  • 🩺 Can be associated with a health check

5. Geolocation DNS

Routing based on the location of the user, e.g. to localize an app

  • 🩺 Can be associated with a health check
  • Default record in case no match is found

6. Geoproximity

Routing based on the location of the user & the resources

  • Shift traffic to a resource with a bias value (-99 to +99)
  • Positive bias (1 to 99): more traffic
  • Negative bias (-1 to -99): less traffic

7. Multi Value

Can return multiple resources, filtered by health check

  • Can return up to 8 records
  • 🩺 Can be associated with a health check
  • Returns only healthy instances
  • The client picks one of the returned values
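A multi-value answer can be modeled as "filter out unhealthy records, then return at most eight of the rest". The sketch below is an assumption-laden illustration: the record map, the `rng` parameter, and the random selection are hypothetical, and the IPs come from a documentation address range.

```python
import random


def multivalue_answer(records, max_records=8, rng=random):
    """Return up to `max_records` healthy values, chosen at random."""
    healthy = [value for value, is_healthy in records.items() if is_healthy]
    return rng.sample(healthy, min(max_records, len(healthy)))


# Hypothetical records: ten healthy endpoints and two failing ones.
records = {f"192.0.2.{i}": True for i in range(1, 11)}
records["192.0.2.50"] = False
records["192.0.2.51"] = False

answer = multivalue_answer(records)
print(len(answer))                       # 8
print(all(records[ip] for ip in answer)) # True

# The client then picks one of the returned values, e.g. at random.
print(random.choice(answer) in records)  # True
```

This gives basic client-side load spreading, but it is not a substitute for a load balancer.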

Traffic Flow

Visual editor to create complex routing trees

  • Configurations are saved as traffic flow policies
  • Can apply different routing rules in the editor
  • Policies can be versioned

Infrastructure Management

Systems Manager (SSM)

Manage EC2 instances at scale

OpsWorks:

Configuration management service that provides managed instances of Chef and Puppet.

  • Managed Chef & Puppet to perform server setup & repetitive tasks
  • Alternative to AWS SSM

AWS Resource Access Manager (AWS RAM)

Helps you securely share AWS resources within your organization or organizational units (OUs) and with other AWS accounts.

  • You can also share resources with IAM roles and IAM users.