AWS Global Infrastructure And Management
Introduction to AWS Global Infrastructure, Route 53, CloudFront, Scalability, Disaster Recovery and Infrastructure Management

Hitesh Sahu
AWS Global Infrastructure
Global infrastructure ensures high availability if one data center fails.
- Available in 245 countries
AVAILABILITY ZONE (AZ)
84 AZs. Each is a group of data centers with redundant compute power and data within a Region.
- AZs in a Region are physically separated, but stay within about 100 km (60 miles) of each other to keep latency low.
- Deploy the app redundantly in at least 2 AZs in a Region for disaster recovery (DR) planning
REGION:
26 Regions. Each is a group of AZs placed near high traffic demand in a geographically isolated location.
- Regions are connected to each other over fiber-optic links
- Factors for choosing a Region
- Data compliance: data does not leave a Region without explicit permission to export it
- Pricing: AWS costs less in the USA than in Brazil due to tax structure.
- Features: some features might not be available in every Region, e.g. quantum computing.
- Proximity: close to the customer = low latency
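The four factors above can be read as two hard filters (compliance, feature availability) followed by a ranking (proximity, then price). A minimal sketch under that assumption; the `RegionOption` type, the region names and all the numbers are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class RegionOption:
    name: str
    compliant: bool        # data is allowed to reside in this Region
    has_feature: bool      # the required service is offered here
    latency_ms: float      # measured latency to the main customer base
    relative_price: float  # 1.0 = baseline price


def pick_region(options):
    """Drop Regions that fail compliance/feature checks, then prefer low latency, then low price."""
    eligible = [r for r in options if r.compliant and r.has_feature]
    if not eligible:
        return None
    return min(eligible, key=lambda r: (r.latency_ms, r.relative_price))


options = [
    RegionOption("region-a", True, True, 120.0, 1.0),
    RegionOption("region-b", True, True, 35.0, 1.4),
    RegionOption("region-c", False, True, 10.0, 1.1),  # closest, but not compliant
]
best = pick_region(options)
print(best.name)
```

How latency is weighed against price is a business decision; the tuple ordering here is only one possible choice.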
AWS Wavelength
Deploy AWS services & infrastructure at the edge of 5G networks
- Users on 5G reach a nearby edge location, which gives them low latency
- High-bandwidth & secure connection to the Region
AWS Local Zone
Extends the AWS network so compute runs close to users for low-latency applications, e.g. low-latency gameplay
- Extend a VPC to more locations by creating a subnet and assigning it to the Local Zone
- Extends a Region's reach for latency-sensitive users
Out of Regions
EDGE LOCATION
Stores data in a nearby location to reduce latency
- 216 locations around the world
- Preprocess data
- Transcode Data in advance
AMAZON CLOUDFRONT (Amazon CDN)
Local cache of data closer to the customer. Can serve data, video, etc. at low latency
- Global service that makes use of the global edge locations
- DDoS protection with Shield & WAF integration
- Allows HTTPS with a certificate
- Content can have a TTL
- Great for distributing S3 bucket contents across the globe
CloudFront Caching
A cache policy defines the minimum, maximum & default TTL
- Goal is to maximize cache hits to minimize requests & load on the origin
- TTL: 0 to 1 year; invalidate cached objects using the CreateInvalidation API
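How the three TTL values of a cache policy interact can be sketched as a clamp: the origin's `max-age` is bounded by the minimum and maximum TTL, and the default TTL applies when the origin sends no caching header. This is a simplified model, not CloudFront's full rule set; the function name and the default values (0, 1 day, 1 year, in seconds) are assumptions for illustration:

```python
def effective_ttl(origin_max_age, min_ttl=0, default_ttl=86_400, max_ttl=31_536_000):
    """Return the number of seconds an object stays cached at the edge (simplified model).

    origin_max_age: value of the origin's Cache-Control max-age, or None if absent.
    """
    # No caching header from the origin: fall back to the default TTL.
    candidate = default_ttl if origin_max_age is None else origin_max_age
    # Clamp the result into the [min_ttl, max_ttl] window of the cache policy.
    return max(min_ttl, min(candidate, max_ttl))


print(effective_ttl(None))             # origin sent no header
print(effective_ttl(60, min_ttl=300))  # origin asked for less than the minimum
print(effective_ttl(10**9))            # origin asked for more than the maximum
```

A higher minimum TTL raises the cache hit ratio at the price of staler content, which is where invalidation comes in.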
CloudFront caches content at the edge location based on
1. Header
- CloudFront controls TTL using:
- Cache-Control header
- Expires header
- CF header settings can be configured to:
- All
Forward all request headers to the origin:
- No caching
- TTL = 0
- Whitelist
Forward whitelisted request headers to the origin
- Caching based on values in the specified headers
- None
Forward only default CF headers, don't forward any request header
- No caching based on request headers => best caching performance
- Origin Custom Header:
CF adds a constant custom header to every request sent to the origin
2. Session Cookies & Query String Parameters
- Cookies are key-value pairs, like headers
- CF settings can be configured to:
- Default:
Don't pass any cookies/query strings to the origin.
- Caching is independent of cookies/query strings
- Whitelist
Send some cookies/query strings to the origin
- Caching based on values in the whitelisted cookies/query strings
- All
Forward all cookies/query strings
- Caching based on all cookies/query strings
- Worst caching performance
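The whitelist idea for headers, cookies and query strings can be pictured as building a cache key: only whitelisted values become part of the key, so requests that differ elsewhere share one cached object. The sketch below is a toy model with hypothetical function names, not CloudFront's internal implementation:

```python
def _select(values, allowed, case_insensitive=False):
    """Keep only whitelisted key/value pairs, in a stable order."""
    norm = str.lower if case_insensitive else (lambda s: s)
    allowed_set = {norm(a) for a in allowed}
    return tuple(sorted((norm(k), v) for k, v in values.items() if norm(k) in allowed_set))


def cache_key(path, headers=None, cookies=None, query=None,
              header_whitelist=(), cookie_whitelist=(), query_whitelist=()):
    """Build a cache key from the path plus whitelisted headers, cookies and query parameters."""
    return (
        path,
        _select(headers or {}, header_whitelist, case_insensitive=True),  # header names ignore case
        _select(cookies or {}, cookie_whitelist),
        _select(query or {}, query_whitelist),
    )


wl = ["Accept-Language"]
a = cache_key("/home", headers={"Accept-Language": "en", "User-Agent": "x"}, header_whitelist=wl)
b = cache_key("/home", headers={"Accept-Language": "en", "User-Agent": "y"}, header_whitelist=wl)
c = cache_key("/home", headers={"Accept-Language": "de", "User-Agent": "x"}, header_whitelist=wl)
print(a == b)  # same key: User-Agent is not whitelisted
print(a == c)  # different key: Accept-Language is whitelisted
```

The fewer values enter the key, the more requests map to the same cached object, which is why "None" caches best and "All" caches worst.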
CloudFront Origins
OAI (Origin Access Identity)
Special CloudFront identity (not an IAM role) that the distribution uses to access a private S3 bucket
CloudFront Origin Group
High availability with failover.
- If the primary origin fails, requests are sent to the secondary origin
- Works with EC2 & S3 buckets
CloudFront Multi Origin
Cache behaviors route requests to different origins (EC2 or S3) based on path pattern
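Both ideas, path-pattern routing and origin-group failover, can be sketched in a few lines. The behavior list, the origin names and the set of status codes that trigger failover are assumptions chosen for illustration; origins are modeled as plain callables returning `(status, body)`:

```python
from fnmatch import fnmatchcase

# Hypothetical cache behaviors, evaluated in order; the first matching pattern wins.
BEHAVIORS = [("/api/*", "alb-origin"), ("/images/*", "s3-origin")]
DEFAULT_ORIGIN = "s3-origin"


def origin_for(path, behaviors=BEHAVIORS, default=DEFAULT_ORIGIN):
    """Pick the origin whose path pattern matches, else the default behavior's origin."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    return default


def fetch_with_failover(primary, secondary, failover_codes=frozenset({500, 502, 503, 504})):
    """Call the primary origin; retry on the secondary if the status is a failover code."""
    status, body = primary()
    if status in failover_codes:
        return secondary()
    return status, body


print(origin_for("/api/users/1"))
print(origin_for("/index.html"))
print(fetch_with_failover(lambda: (503, ""), lambda: (200, "from secondary")))
```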
Origin Types:
1. S3 Bucket:
Connect to an S3 bucket to cache data or distribute it across regions
- For distributing files & caching them at the edge
- As ingress to speed up file uploads, paired with Transfer Acceleration
- Great for static content; for real-time dynamic content use S3 cross-region replication
CloudFront | S3 cross-region replication |
---|---|
Uses the global edge network | Must be set up for each region |
Cached for the TTL | Real time |
Static content cached all around the globe | Dynamic content at low latency for selected regions |
2. Custom Origin
HTTP End Point to
- Can be:
EC2, S3 website or ALB
- To enable an HTTP endpoint:
- EC2 instance/ELB must be public
- Security group must allow the public IPs of all edge locations
- S3 bucket must have static website hosting enabled
CloudFront Geo Restriction
Blacklist/whitelist countries from accessing content, using a third-party GeoIP database
CloudFront Signed URL/Cookie
A signed URL gives access to a single file, while a signed cookie can give access to multiple files because the cookie can be reused
- Set URL expiration
- Set the IP range that can access the content URL
- Works for both S3 & HTTP URLs
- Can filter by IP, path, expiration
CloudFront Signed URL | S3 Pre-Signed URL |
---|---|
Works for S3 & HTTP origins | Works with S3 only |
Can be cached at the edge location | Signed with an IAM key, with an expiry time |
Allows filtering based on IP, geolocation, expiration | Direct sharing of a file using the signer's IAM credentials |
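The expiration and IP-range restrictions of a CloudFront signed URL live in a small JSON policy document that is then signed with the private key. The sketch below only builds that policy and applies CloudFront's URL-safe base64 variant; the RSA signing step is left out because it needs a crypto library. The URL, timestamp and CIDR range are placeholders, and the function names are made up for this example:

```python
import base64
import json


def build_custom_policy(resource_url, expires_epoch, source_ip_cidr=None):
    """Return the custom-policy JSON (compact, no whitespace) for a signed URL."""
    condition = {"DateLessThan": {"AWS:EpochTime": int(expires_epoch)}}
    if source_ip_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


def cloudfront_b64(data: bytes) -> str:
    """Base64-encode, then swap the characters that are not URL-safe (+ = /) for - _ ~."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


policy = build_custom_policy("https://example.cloudfront.net/videos/*", 1767225600, "203.0.113.0/24")
print(policy)
print(cloudfront_b64(policy.encode("utf-8")))
```

In a real signed URL the encoded policy travels as a query parameter next to the signature and the public key ID.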
Signed URL Keys
1. Public Key & Key Group
Any user can add RSA public keys and group them in a key group
- A key group can contain up to 5 keys
- Any IAM user can create keys
- New way of generating keys
2. CloudFront Key Pairs
Public & private key pair created by the root user
- Old way of creating a key pair for signed URLs
- Only the root user can generate the key pair
CloudFront Price Class
- Price Class All:
All edge locations worldwide, best performance but most expensive
- Price Class 200:
Most regions, excludes the most expensive locations
- Price Class 100:
Only the least expensive regions
S3 Transfer Acceleration
Speeds up global uploads/downloads of files to and from an S3 bucket
- Files are received at an edge location and then carried to the S3 bucket over the AWS network
- Speeds up the file upload process
Global Accelerator
Uses the AWS network to speed up application delivery across the globe
- Intelligent routing to lower latency: consistent performance
- Performs health checks to provide automated failover
- Failover in less than 1 minute
- DDoS protection through AWS Shield
- Works with:
ALB, NLB, Elastic IP, EC2
Problem with current Internet model
- The public internet slows down application delivery across the globe
- The AWS network improves speed using its global infrastructure
Use Case:
- Global Accelerator is a good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP addresses or deterministic, fast regional failover. Both Global Accelerator and CloudFront integrate with AWS Shield for DDoS protection.
- Use Global Accelerator to provide a low latency way to distribute live sports results
Unicast IP
One server holds one IP address
Anycast IP Address
All servers hold the same IP address & the client is routed to the nearest server
- Anycast IP sends traffic to the nearest edge location
- Global Accelerator makes use of Anycast IP
AWS OUTPOSTS
- Used for hybrid cloud (private + public cloud)
- Private mini Region for a customer, on the customer's own premises
- Isolated AWS instance for specific use cases
- Uses the same AWS infrastructure and services as the AWS cloud to simplify operations.
- The customer's data center is responsible for physical security
Scalability
Scalability means the capacity to handle higher load by scaling resources
- Vertical Scalability:
Increase the size of an instance by upgrading hardware
- Bound by hardware limits
- Horizontal Scalability:
Increase the number of instances.
- Distributed System
- Auto Scaling Group
- Load Balancer
Availability
Run the app in at least 2 AZs to survive the loss of a data center
- The secondary AZ can be passive or active
- Auto Scaling Group in AZ
- Load Balancer in multi AZ
Disaster Recovery DR
Types of Disaster Recovery
- Traditional: on-premises to on-premises data center
- Hybrid: on-premises data center to cloud
- Full Cloud: cloud to cloud
Tips:
- Back up regularly: EBS snapshots, RDS backups, S3, lifecycle policies, cross-region replication
- Use highly available resources: Route 53, EFS, ElastiCache, Direct Connect or Site-to-Site VPN
- Replicate data: Storage Gateway, multi-region replication
- Automate as much as you can: CloudFormation, Elastic Beanstalk, AWS Lambda, CloudWatch
- Chaos-test the infrastructure
RPO: Recovery Point Objective
How much data loss is acceptable after a disaster (time between the last backup and the disaster)
RTO: Recovery Time Objective
How much downtime is acceptable before recovery (time between the disaster and recovery)
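Both objectives are time spans measured around the moment of the disaster, so checking them is simple arithmetic. A small sketch with made-up timestamps and targets; the function name is hypothetical:

```python
from datetime import datetime, timedelta


def check_objectives(last_backup, disaster, recovered, rpo, rto):
    """Compare actual data loss and downtime against the RPO and RTO targets."""
    data_loss = disaster - last_backup  # everything written after the last backup is lost
    downtime = recovered - disaster     # time the service was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


result = check_objectives(
    last_backup=datetime(2024, 1, 1, 0, 0),
    disaster=datetime(2024, 1, 1, 3, 30),
    recovered=datetime(2024, 1, 1, 4, 15),
    rpo=timedelta(hours=4),
    rto=timedelta(minutes=30),
)
print(result)
```

Here 3.5 hours of data are lost (within a 4 hour RPO), but 45 minutes of downtime misses a 30 minute RTO.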
Single Region Single AZ | Single Region Multi AZ | Multi Region Active-Passive (RW-R) | Multi Region Active-Active (RW-RW) |
---|---|---|---|
Easy setup | Simple setup | Difficult | Most difficult |
Low availability | High availability | High availability | High availability |
High global latency | High global latency | Low global read latency but high global write latency | Low read & write latency globally |
Disaster Strategy
Multi AZ Deployment | Multi Region Deployment | Read Replica |
---|---|---|
High Availability | Disaster Recovery | Scalability |
1. Backup & Restore
- Recreate infrastructure after a disaster
- Easy to set up & less expensive
- Recover from snapshots or Snowmobile
2. Pilot Light
A small version of the app always runs in the cloud, with the database ready to go
- Only for critical workloads
3. Warm Standby
Full system up & running at minimum size, able to scale to full size to handle a disaster
- Costly but smaller RPO
4. Multi-Site Active/Active
Full system at full scale, ready to go
- Very low RTO but expensive
ROUTE 53
Highly available and scalable managed Domain Name System (DNS) web service
- Translates a website name to the IP address of the site
- Buy and manage domains in AWS ($12/year)
- Only AWS service with a 100% availability SLA
DNS Terminology
- Domain Registrar: GoDaddy, Route 53
- Zone File: holds DNS records
- Name Server (NS): server that resolves DNS queries
- Root: the trailing . at the end of api.www.example.com.
- Top Level Domain (TLD): .com in api.www.example.com
- Second Level Domain (SLD): example.com in api.www.example.com
- Subdomain: www.example.com in api.www.example.com
- Domain Name: api.www.example.com
- Fully Qualified Domain Name (FQDN): api.www.example.com. (with the trailing dot; http:// is the protocol, not part of the name)
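The terms above map directly onto the dot-separated labels of a name, read right to left. A naive sketch that splits the example name; `split_fqdn` is a made-up helper and it does not handle multi-label suffixes such as `.co.uk`:

```python
def split_fqdn(fqdn):
    """Break a fully qualified domain name into the parts named in DNS terminology."""
    labels = fqdn.rstrip(".").split(".")  # drop the root dot, then split into labels
    if len(labels) < 2 or "" in labels:
        raise ValueError(f"not a usable domain name: {fqdn!r}")
    return {
        "root": ".",
        "tld": "." + labels[-1],
        "sld": ".".join(labels[-2:]),
        "subdomain": ".".join(labels[-3:]) if len(labels) >= 3 else None,
        "domain_name": ".".join(labels),
    }


parts = split_fqdn("api.www.example.com.")
for name, value in parts.items():
    print(f"{name}: {value}")
```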
Common Records:
- A Record:
Host name -> IPv4 address
- AAAA Record:
Host name -> IPv6 address
- CNAME: works for non-root domains only
Host name -> host name.
- Alias Record: works with root & non-root domains
Host name -> AWS resource
- Free of charge
- Route 53 automatically recognizes changes in the resource's IP address
- Always of type A/AAAA (IPv4/IPv6)
- Native health check capability
- TTL is set automatically by AWS; we can't set the TTL
- Target can be:
- ELB, CloudFront
- S3 website
- API Gateway, Global Accelerator
- VPC interface endpoint
Hosted Zone
Container for records that route traffic to a domain & its subdomains
- $0.50 per hosted zone per month
Public Hosted Zone:
Holds records to route traffic on the Internet
- Anyone can query it to resolve public resources
Private Hosted Zone:
Holds records to route traffic within a VPC (private domain names)
- Resolves private cloud resources
Record TTL
Period for which clients cache a DNS record, to reduce load on the DNS server
- Low TTL = more requests to Route 53, but changes reach clients quickly
- High TTL = fewer requests, but clients may hold outdated records
Route 53 Health Check 🩺
Routes traffic only to healthy endpoints, based on health checks of the resource
- Checks the health of public resources
- Helps with automated DNS failover
1. Monitor an Endpoint
Send HTTP requests to assess the health of a resource
- About 15 health checkers send requests and look for a 2XX or 3XX response code
- Failure threshold (default 3)
- Protocols:
HTTP/S, TCP
- Interval for health checks:
30 sec by default (10 sec at higher cost)
- Checker can parse the first 5120 bytes of the response
- Resource must allow incoming requests from the Route 53 health checkers
2. Monitor Other Health Checks
Combine the results of child health checks
- Can monitor up to 256 child health checks
- Can use AND, OR, NOT to define the condition
3. Monitor CloudWatch Alarms
Use CloudWatch metrics to set a CloudWatch alarm that the health check monitors
- Checks the health of private resources
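The threshold rule for endpoint checks and the way child checks are combined can both be modeled in a few lines. This is a toy simulation with hypothetical function names, not the Route 53 API: an endpoint flips state only after `threshold` consecutive results that contradict its current state, and a parent check is healthy when at least `min_healthy` children are healthy (AND = all children, OR = at least one):

```python
def is_healthy_response(status_code):
    """A 2XX or 3XX response counts as a passed check."""
    return 200 <= status_code < 400


def endpoint_status(status_codes, threshold=3, healthy=True):
    """Replay observed status codes and return the final health state."""
    streak = 0  # consecutive results that disagree with the current state
    for code in status_codes:
        ok = is_healthy_response(code)
        if ok == healthy:
            streak = 0
        else:
            streak += 1
            if streak >= threshold:
                healthy = ok
                streak = 0
    return healthy


def calculated_check(children, min_healthy):
    """Parent check: healthy if at least `min_healthy` child checks are healthy."""
    return sum(bool(c) for c in children) >= min_healthy


print(endpoint_status([200, 500, 500]))       # two failures are below the threshold
print(endpoint_status([500, 500, 500]))       # three failures in a row flip the state
print(calculated_check([True, False, True], min_healthy=3))  # AND over three children
print(calculated_check([True, False, True], min_healthy=1))  # OR over three children
```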
Routing Policy
How Route 53 responds to DNS queries
1. Simple Routing Policy:
Route traffic to a single resource
- If the DNS server returns more than one value, the client chooses one at random
- 🩺 No health check
2. Weighted Routing Policy: Load Balance
Direct traffic to endpoints based on the weight assigned to each record, to balance load
- 🩺 Can be associated with health checks
- Weight of 0 = no traffic
- All weights 0 = equal traffic to all records
- Weights do not need to sum to 100
3. Latency Routing Policy: Minimize latency
Direct traffic to the resource with the lowest latency for the user
- 🩺 Can be associated with health checks
4. Failover Routing Policy: Helps disaster recovery
Route traffic to a healthy resource based on health checks
- 🩺 Can be associated with health checks
5. Geolocation
Routing based on the location of the user, to localize an app
- 🩺 Can be associated with health checks
- Default record in case no match is found
6. Geoproximity
Routing based on the location of the user & the resource
- Shift traffic to a resource based on a bias value (-99 to +99)
- Positive bias (1 to 99): more traffic
- Negative bias (-1 to -99): less traffic
7. Multi Value
Can return multiple resources based on health checks
- Can return up to 8 resources
- 🩺 Can be associated with health checks
- Returns only healthy instances
- The client picks a record from the multiple values returned
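For the weighted routing policy above, the expected share of traffic per record is its weight divided by the sum of all weights, which is why the weights need not add up to 100. A short sketch of that arithmetic, including the two special cases from the list (a single weight of 0, and all weights 0); the record names are placeholders:

```python
def traffic_shares(weights):
    """Map record name -> expected fraction of DNS responses under weighted routing."""
    total = sum(weights.values())
    if total == 0:
        # All weights are 0: every record is returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


print(traffic_shares({"blue": 70, "green": 20, "canary": 10}))
print(traffic_shares({"blue": 3, "green": 1, "drained": 0}))
print(traffic_shares({"blue": 0, "green": 0}))
```

Setting one record's weight to 0 is a common way to drain traffic from it without deleting the record.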
Traffic Flow
Visual editor to create complex routing trees
- Can be saved as a traffic flow policy
- Can apply different rules in the editor
- Policies can be versioned
Infrastructure Management
Systems Manager (SSM)
Manage EC2 instances at scale
OpsWorks:
Configuration management service that provides managed instances of Chef and Puppet.
- Managed Chef & Puppet to perform server setup & repetitive tasks
- Alternative to AWS SSM
AWS Resource Access Manager (AWS RAM)
Helps you securely share AWS resources within an organization or organizational units (OUs) and with AWS accounts.
- You can also share resources with IAM Roles and IAM Users.