Some notes so far from learning S3 and CloudFront on AWS
S3 General
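- Object-based storage (some might call it blob storage): an object can be any bytes and is immutable once created, so an update replaces the whole object.
- Durability comes from storing duplicate copies across the network
- Object size can be anything from 0 bytes to 5 TB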
- No limit on storage per account
- Buckets are flat (folders are just key prefixes), and bucket names are unique across all of AWS
- Buckets can be accessed via URL
- A successful upload to S3 returns an HTTP 200 status code
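To make the "flat bucket, accessed via URL" idea concrete, here is a minimal sketch that builds a virtual-hosted-style URL for an object. The helper name `object_url`, the bucket name and the key are made up for illustration; the URL only works if the object is actually readable by the caller.

```python
from urllib.parse import quote


def object_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Build a virtual-hosted-style URL for an object (hypothetical helper)."""
    # Slashes stay as they are: they are part of the key, not real folders.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Bucket and key below are invented example values.
print(object_url("my-example-bucket", "photos/2018/cat picture.jpg", "eu-west-1"))
```

The "folder" photos/2018/ never has to be created; it exists only as a prefix of the key.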
S3 SLA
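- Designed for 99.99% availability (Standard storage class)
- The SLA guarantees 99.9% availability
- Designed for 99.999999999% (11 nines) durability. As far as I can tell this is a design target; the SLA itself only covers availability.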
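The percentages are easier to compare as plain numbers. This is a rough back-of-the-envelope sketch, assuming a 365-day year and treating durability as an independent yearly loss probability per object (a simplification); the function names are my own.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours per year a service may be unavailable at a given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def expected_objects_lost_per_year(object_count: int, durability_percent: float) -> float:
    """Expected number of objects lost per year at a given durability."""
    return object_count * (1 - durability_percent / 100)


print(f"99.99% availability: {downtime_hours_per_year(99.99):.2f} hours down per year")
print(f"99.9% availability:  {downtime_hours_per_year(99.9):.2f} hours down per year")

lost = expected_objects_lost_per_year(10_000_000, 99.999999999)
print(f"10 million objects: about one lost every {1 / lost:,.0f} years")
```

So the gap between "designed for" and the availability guarantee is roughly 0.88 versus 8.76 hours a year, and 11 nines works out to around one lost object in 10,000 years for every 10 million stored.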
S3 Behaviours
- S3 gives you read-after-write consistency: when you create new content you can read it back right away.
- Updating or deleting existing objects used to be only eventually consistent, because the change took time to propagate across the duplicated copies. Since December 2020 S3 is strongly consistent for overwrites and deletes as well.
- There are different storage classes, including Infrequent Access (IA) and Glacier. The trade-offs are availability, minimum storage duration and cost.
- Objects are key-value storage: a key (the name), a value (the data) and metadata, including your own custom metadata. An object also has a version ID if you enable versioning, and an ACL (access control list).
S3 Versioning
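- Once versioning is turned on it cannot be turned off again, only suspended.
- In a versioned bucket, a delete adds a ‘delete marker’ as the newest version instead of removing the data. Older versions can still be fetched by version ID.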
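The delete marker behaviour is easier to see in a toy model. The sketch below is an in-memory stand-in written for these notes, not the S3 API: the class `VersionedBucket` and its method names are invented, and real version IDs are opaque strings rather than `v1`, `v2`.

```python
import itertools
from typing import Optional


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket (not the real S3 API)."""

    def __init__(self) -> None:
        self._versions = {}  # key -> list of (version_id, body); body None = delete marker
        self._ids = itertools.count(1)

    def put(self, key: str, body: bytes) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key: str) -> str:
        # A plain delete removes nothing; it stacks a delete marker on top.
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, None))
        return version_id

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        versions = self._versions.get(key, [])
        if version_id is None:
            # Latest version wins; a delete marker makes the object look gone.
            return versions[-1][1] if versions else None
        for vid, body in versions:
            if vid == version_id:
                return body
        return None

    def delete_version(self, key: str, version_id: str) -> None:
        # Deleting a specific version (including a delete marker) is permanent.
        self._versions[key] = [v for v in self._versions.get(key, []) if v[0] != version_id]


bucket = VersionedBucket()
first = bucket.put("notes.txt", b"draft")
marker = bucket.delete("notes.txt")
print(bucket.get("notes.txt"))         # latest version is the delete marker
print(bucket.get("notes.txt", first))  # the old version is still there
bucket.delete_version("notes.txt", marker)
print(bucket.get("notes.txt"))         # removing the marker "undeletes" the object
```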
Life Cycle Management
- Life Cycle Management means setting up rules that transition objects to another storage class (Infrequent Access, Glacier, etc.) after a set time.
- A Life Cycle rule can also have an expiration action, which deletes the object.
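A lifecycle rule is just a small configuration document. The sketch below builds one as a Python dict in the shape that, as far as I know, boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. The helper name `lifecycle_rule`, the `logs/` prefix and the day counts are assumptions for illustration.

```python
import json


def lifecycle_rule(prefix: str, ia_after_days: int, glacier_after_days: int,
                   expire_after_days: int) -> dict:
    """Build one lifecycle rule: Standard -> IA -> Glacier -> delete (hypothetical helper)."""
    if not (ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError("days must increase: IA < Glacier < expiration")
    return {
        "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},  # only objects whose key starts with this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},  # the expiry (delete) action
    }


# Example values only: move logs to IA after 30 days, Glacier after 90, delete after a year.
config = {"Rules": [lifecycle_rule("logs/", 30, 90, 365)]}
print(json.dumps(config, indent=2))
```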
S3 Cross-Region
- Copies objects from one bucket to another in different regions
- Requires versioning turned on
- Does not replicate objects that already existed before replication was turned on, so decide upfront whether you need it.
- Used to place the assets closer to the user.
- Delete markers are replicated, but deleting the delete markers does not replicate… (circa 2018 — need to verify this myself still)
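For reference, a replication setup can be sketched as a config dict. This follows the newer rule schema as I understand it, where delete marker replication is an explicit per-rule setting; treat the exact field set as an assumption to check against the current docs. The helper name, role ARN and bucket name are placeholders.

```python
def replication_config(role_arn: str, destination_bucket: str,
                       replicate_delete_markers: bool = False) -> dict:
    """Build a one-rule replication config (hypothetical helper, schema assumed)."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = all objects
                "DeleteMarkerReplication": {
                    "Status": "Enabled" if replicate_delete_markers else "Disabled"
                },
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


# Placeholder account ID, role and bucket names.
config = replication_config(
    "arn:aws:iam::123456789012:role/example-replication-role",
    "my-example-bucket-replica",
)
print(config["Rules"][0]["DeleteMarkerReplication"])
```

Both the source and the destination bucket need versioning enabled before a config like this would be accepted.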
S3 => CloudFront
- Used to reduce the object retrieval time
- Origin: location of the objects the CDN will distribute; can be S3, a load balancer, EC2, another server, etc.
- Edge Location: AWS server where the object will be cached
- Distribution: the name for a CDN configuration, i.e. the collection of edge locations serving a particular origin, e.g. a Web distribution for HTTP/S
- Edge locations are not read-only; they can accept uploads too
- TTL (time to live) is how long an object stays cached at an edge location
- Cached objects can be invalidated, which forces the next request to go to the origin
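The interplay of origin, edge cache, TTL and invalidation can be sketched with a toy cache. This is a simplified in-memory model for these notes, not how CloudFront is implemented; `EdgeCache` and its methods are invented names, and the clock is injected so the example runs without waiting.

```python
from typing import Callable


class EdgeCache:
    """Toy model of one edge location caching objects from an origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float, clock: Callable[[], float]) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # path -> (expires_at, body)
        self.origin_hits = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: served from the edge
        # Miss or expired: go back to the origin and cache the result for one TTL.
        self.origin_hits += 1
        body = self._fetch(path)
        self._entries[path] = (now + self._ttl, body)
        return body

    def invalidate(self, path: str) -> None:
        # Drop the cached copy so the next request is forced to the origin.
        self._entries.pop(path, None)


now = [0.0]  # fake clock, in seconds
cache = EdgeCache(lambda p: b"body of " + p.encode(), ttl_seconds=60, clock=lambda: now[0])

cache.get("/logo.png")
cache.get("/logo.png")          # second request is a hit
print(cache.origin_hits)        # one trip to the origin so far
now[0] = 61                     # TTL has passed
cache.get("/logo.png")
cache.invalidate("/logo.png")   # forced refresh before the TTL runs out
cache.get("/logo.png")
print(cache.origin_hits)
```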
S3 Security
- All buckets are private by default
- Bucket policies control access on the bucket itself, on a per-bucket basis
- IAM policies let you control access for a user in a way that is not tied to one particular bucket, e.g. read from buckets, write to buckets
- Access Control Lists (ACLs) work at a finer level and can control access down to individual objects.
- You can log bucket activity (into a separate bucket)
- You can set up event notifications that fire when bucket activities are performed, e.g. when a new object is added
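Per-bucket access is usually written as a JSON policy document. Here is a minimal sketch of a read-only bucket policy built as a Python dict; the helper name `read_only_policy`, the bucket name and the user ARN are made-up examples, and a real policy should be checked carefully before it is attached to a bucket.

```python
import json


def read_only_policy(bucket: str, principal_arn: str) -> dict:
    """Allow one principal to read every object in one bucket (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnly",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                # "/*" targets the objects in the bucket, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# Placeholder account ID, user and bucket name.
policy = read_only_policy("my-example-bucket", "arn:aws:iam::123456789012:user/alice")
print(json.dumps(policy, indent=2))
```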
S3 Security / Encryption
- In transit, the objects are protected by SSL/TLS
- At rest, there are some encryption options you can enable
- SSE-S3: S3 manages the encryption keys for you
- SSE-KMS: uses the Key Management Service (KMS), a separate AWS service, to protect and manage your keys
- SSE-C (customer-provided keys): you supply your own key with each request and S3 uses it to encrypt and decrypt, so key management is entirely up to you.
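In practice the three server-side options differ only in a few request parameters on upload. The sketch below builds the extra keyword arguments that, to my knowledge, boto3's `put_object` takes for each mode; the helper function names and the KMS key alias are invented, and nothing here talks to AWS.

```python
import os


def sse_s3_args() -> dict:
    """SSE-S3: S3 picks and manages the key."""
    return {"ServerSideEncryption": "AES256"}


def sse_kms_args(kms_key_id: str) -> dict:
    """SSE-KMS: encrypt under a key held in KMS."""
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}


def sse_c_args(key: bytes) -> dict:
    """SSE-C: you send a 256-bit key with the request; S3 does not keep it."""
    if len(key) != 32:
        raise ValueError("SSE-C needs a 32-byte (256-bit) key")
    return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": key}


# Example values only; the same SSE-C key must be sent again to read the object back.
print(sse_s3_args())
print(sse_kms_args("alias/my-example-key"))
print(sorted(sse_c_args(os.urandom(32))))
```

Any of these dicts would be merged into the upload call alongside the bucket, key and body.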
S3 Transfer Acceleration
- Lets you transfer data faster (lower latency)
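- Uses AWS’s own infrastructure: uploads go to a nearby edge location and travel over the AWS network to the bucket
- Costs extra per GB transferred, and it is not always worth it, so check.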
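Using acceleration mostly comes down to pointing uploads at a different endpoint once the feature is enabled on the bucket. A small sketch of that endpoint, with a made-up helper name and bucket name:

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration endpoint for a bucket (hypothetical helper)."""
    # Buckets with dots in the name cannot use Transfer Acceleration.
    if "." in bucket:
        raise ValueError("bucket names containing dots cannot use Transfer Acceleration")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


print(accelerate_endpoint("my-example-bucket"))
```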
S3 Static Website
- Host an entire static website (HTML, CSS, JavaScript, images; no server-side code)
- Requires no web server or virtual machine of your own
- Scales automatically
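Turning a bucket into a website is mostly a matter of naming an index page and an error page. The dict below is a minimal sketch in the shape I believe boto3's `put_bucket_website` expects as `WebsiteConfiguration`; the helper name and the file names are assumptions.

```python
def website_config(index_document: str = "index.html",
                   error_document: str = "error.html") -> dict:
    """Build a static website configuration for a bucket (hypothetical helper)."""
    return {
        "IndexDocument": {"Suffix": index_document},  # served for "directory" requests
        "ErrorDocument": {"Key": error_document},     # served when a key is not found
    }


print(website_config())
```

The objects still need to be publicly readable for visitors to see the site, since buckets start out private.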