High availability (HA) here basically means replication across 3 AZs, with up to 15 read replicas.
AWS Databases
- RDS
- Aurora
- ElastiCache
- DynamoDB
- S3
- DocumentDB (MongoDB)
- Neptune
- fully managed graph database; a popular graph dataset would be a social network
- Keyspaces (Apache Cassandra)
- QLDB (Quantum Ledger Database)
- A ledger is a book recording financial transactions
- Fully Managed, Serverless, High available, Replication across 3 AZ
- Timestream
- time-series db
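To make the Neptune bullet concrete, here is a minimal sketch of the kind of question a graph database answers on a social network: "who are the friends of my friends?". It uses plain Python sets, not Neptune or any of its query languages, and the names and the `friends_of_friends` helper are made up for illustration.

```python
# Toy social network as an adjacency map: person -> set of direct friends.
# All names are hypothetical sample data.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol", "erin"},
    "erin": {"dave"},
}


def friends_of_friends(graph, person):
    """Return people exactly two hops away (not the person, not direct friends)."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        # Collect everyone each direct friend is connected to.
        two_hops |= graph.get(friend, set())
    # Drop the person themselves and anyone who is already a direct friend.
    return two_hops - direct - {person}


print(friends_of_friends(friends, "alice"))
```

A graph database is built so that this kind of multi-hop traversal stays cheap as the dataset grows, which is why highly connected data is the typical use case.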
Snow Family
Highly secure, portable devices to collect and process data at the edge, and migrate data into and out of [[AWS]]. Managed with OpsHub.
- Data migration: Snowcone, Snowball Edge, Snowmobile
- Edge computing: Snowcone, Snowball Edge
Snowball Edge (data transfer)
- physical data transport solution: TBs/PBs (in or out of AWS)
- pay per data transfer job
- block storage and S3-compatible object storage
- storage optimized
- 80TB HDD
- compute optimized
- 42TB HDD
- 28TB NVMe
Snowcone & Snowcone SSD
- Small, portable computing, anywhere, rugged & secure, withstands harsh environments
- Snowcone HDD 8TB
- Snowcone SSD 14TB
- must provide your own battery / cables
- can be sent back to AWS offline, or connect it to the internet and use AWS DataSync to send data
Snowmobile (truck)
- transfer exabytes of data (1 EB = 1,000 PB = 1,000,000 TB)
- each Snowmobile has 100PB of capacity (use multiple in parallel)
- high security: temperature controlled, GPS, 24/7 video surveillance
- better than Snowball if you transfer more than 10PB
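The capacities above can be turned into a rough sizing rule. The sketch below is only an illustration built from the numbers in these notes: it uses raw capacities (usable capacity is an assumption and may be lower), and the `suggest_snow_device` function and its thresholds are hypothetical, not an AWS tool.

```python
import math

# Raw capacities from the notes, in TB (decimal units: 1 PB = 1000 TB).
SNOWCONE_HDD_TB = 8
SNOWCONE_SSD_TB = 14
SNOWBALL_EDGE_STORAGE_TB = 80
SNOWMOBILE_TB = 100 * 1000  # 100 PB per Snowmobile


def suggest_snow_device(data_tb: float) -> str:
    """Suggest a device type and count for `data_tb` terabytes of data."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    if data_tb > 10 * 1000:
        # More than 10 PB: the notes recommend Snowmobile over Snowball.
        return f"{math.ceil(data_tb / SNOWMOBILE_TB)} x Snowmobile"
    if data_tb <= SNOWCONE_HDD_TB:
        return "1 x Snowcone HDD"
    if data_tb <= SNOWCONE_SSD_TB:
        return "1 x Snowcone SSD"
    # Everything in between: one or more storage-optimized Snowball Edge devices.
    count = math.ceil(data_tb / SNOWBALL_EDGE_STORAGE_TB)
    return f"{count} x Snowball Edge Storage Optimized"


print(suggest_snow_device(200))        # a few hundred TB
print(suggest_snow_device(1_000_000))  # 1 EB
```

For 1 EB the sketch suggests ten Snowmobiles, which matches the "use multiple in parallel" note.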
Usage Process
- Request Snowball devices from the AWS console for delivery
- Install the snowball client / AWS OpsHub on your servers
- Connect the snowball to your servers and copy files using the client
- Ship back the device when you’re done (goes to the right AWS facility)
- Data will be loaded into an S3 bucket
- Snowball is completely wiped
Edge Computing
Process data while it’s being created at an edge location.
Use cases
- Preprocess data
- Machine Learning at the edge
- transcoding media streams
Snow Family - all can run EC2 instances & AWS Lambda functions (using AWS IoT Greengrass)
- Snowcone & Snowcone SSD (smaller)
- Snowball Edge - compute optimized
- Snowball Edge - storage optimized
OpsHub
Software to manage your Snow Family devices
- unlocking and configuring single or clustered devices
- transferring files
- launching and managing instances running on Snow Family devices
- monitor device metrics
- launch compatible AWS services on your devices (EC2, DataSync, NFS)
Snowball into Glacier
Snowball cannot import to Glacier directly
Snowball (import) -> S3 (lifecycle policy) -> Amazon Glacier
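The S3-to-Glacier step is an ordinary S3 lifecycle rule. A minimal sketch, assuming boto3 is installed and credentials are configured: the rule ID, prefix, and bucket name are hypothetical, and `glacier_lifecycle_config` / `apply_lifecycle` are helper names invented for this note.

```python
def glacier_lifecycle_config(prefix: str = "", days: int = 1) -> dict:
    """Build a lifecycle configuration that transitions objects to Glacier."""
    return {
        "Rules": [
            {
                "ID": "snowball-import-to-glacier",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = glacier_lifecycle_config(prefix="snowball-import/", days=1)
print(config)
# apply_lifecycle("my-import-bucket", config)  # hypothetical bucket name
```

Building the dictionary separately from the API call keeps the rule easy to inspect before anything is sent to AWS.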
FSx
Launch third-party high-performance file systems on AWS as a fully managed service
- Windows File Server (Windows)
- supports SMB, NTFS
- can be mounted on Linux EC2
- Active Directory integration, ACLs, user quotas
- supports Microsoft’s Distributed File System Namespaces
- can be accessed from your on-premises infra (VPN, DX)
- can be configured to be multi-AZ
- data is backed up daily to S3
- FSx for Lustre (Linux Cluster)
- ML, HPC
- video processing, financial modeling, electronic design automation
- seamless integration with [[aws-s3|S3]]
- scratch FS (temporary storage)
- persistent FS (long-term storage)
- NetApp ONTAP
- supports NFS, SMB, iSCSI
- move workloads running on ONTAP or NAS to AWS
- point-in-time instantaneous cloning
- OpenZFS
- NFS compatible
- Up to 1,000,000 IOPS with < 0.5ms latency
- snapshots, compression, low-cost
- point-in-time instantaneous cloning
Storage Gateway
Bridge between on-premises data and cloud data (hybrid cloud storage).
Use cases
- disaster recovery
- backup & restore
- tiered storage
- on-premises cache & low-latency file access
Types
- S3 File
- NFS/SMB protocol (AD)
- Most recently used data is cached in the file gateway
- supports S3 Standard, S3 Standard-IA, S3 One Zone-IA, S3 Intelligent-Tiering
- bucket access using IAM roles for each file gateway
- FSx File
- native access to Amazon FSx for Windows File Server
- Local cache for frequently accessed data
- Windows native compatibility (SMB, NTFS, AD)
- Volume
- Block storage using iSCSI protocol backed by S3
- backed by EBS snapshots
- cached volume - to store the most frequently accessed results locally for low-latency access while storing the full volume with all results in its S3 service bucket
- Tape - iSCSI VTL
- Hardware Appliance
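The "local cache in front of S3" idea behind the file gateway and cached volumes can be sketched as a read-through cache. This is a toy model, not how Storage Gateway is implemented: the least-recently-used eviction policy is an assumption, a dict stands in for the S3-backed data, and `CachedGateway` is a made-up class name.

```python
from collections import OrderedDict


class CachedGateway:
    """Toy read-through cache: full data in `backing`, hot items kept locally."""

    def __init__(self, backing: dict, capacity: int):
        self.backing = backing      # stands in for the full data set in S3
        self.capacity = capacity    # how many items the local cache may hold
        self.cache = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            # Cache hit: serve locally and mark as most recently used.
            self.cache.move_to_end(key)
            self.hits += 1
            return self.cache[key]
        # Cache miss: fetch from the backing store, then keep a local copy.
        self.misses += 1
        value = self.backing[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            # Evict the least recently used item (assumed policy).
            self.cache.popitem(last=False)
        return value


gw = CachedGateway({"a": 1, "b": 2, "c": 3}, capacity=2)
for key in ["a", "b", "a", "c"]:
    gw.read(key)
print(list(gw.cache), gw.hits, gw.misses)
```

Reads that hit the local cache avoid a round trip to the cloud, which is where the low-latency access in the use cases comes from.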
Transfer Family
- A fully-managed service for file transfers into and out of Amazon S3 or Amazon EFS using the FTP protocol
- Supported Protocols
- AWS Transfer for FTP (File Transfer Protocol (FTP))
- AWS Transfer for FTPS (File Transfer Protocol over SSL (FTPS))
- AWS Transfer for SFTP (SSH File Transfer Protocol (SFTP))
- Managed infrastructure, Scalable, Reliable, Highly Available (multi-AZ)
- Pay per provisioned endpoint per hour + data transfers in GB
- Store and manage users’ credentials within the service
- Integrate with existing authentication systems (Microsoft Active Directory, LDAP, Okta, Amazon Cognito, custom)
- Usage: sharing files, public datasets, CRM, ERP, …
DataSync
- one-time or infrequent large data transfer
- Move large amounts of data to and from:
- On-premises / other cloud to AWS (NFS, SMB, HDFS, S3 API…) – needs agent
- AWS to AWS (different storage services) – no agent needed
- Can synchronize to:
- Amazon S3 (any storage classes – including Glacier)
- Amazon EFS
- Amazon FSx (Windows, Lustre, NetApp, OpenZFS…)
- Replication tasks can be scheduled hourly, daily, weekly
- File permissions and metadata are preserved (NFS POSIX, SMB…)
- One agent task can use up to 10 Gbps; a bandwidth limit can be set
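The 10 Gbps figure gives a quick way to estimate how long a network transfer takes, and when a Snow device might be the better option. A back-of-the-envelope sketch: it uses decimal units (1 TB = 10^12 bytes), ignores protocol overhead, and the `transfer_hours` helper and its `utilization` parameter are assumptions made for illustration.

```python
def transfer_hours(data_tb: float, gbps: float = 10.0, utilization: float = 1.0) -> float:
    """Hours needed to move `data_tb` terabytes over a `gbps` link."""
    if data_tb < 0 or gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid data size, link speed, or utilization")
    bits = data_tb * 1e12 * 8                        # terabytes -> bits
    seconds = bits / (gbps * 1e9 * utilization)      # effective bits per second
    return seconds / 3600


# 100 TB at the full 10 Gbps, and at a 1 Gbps bandwidth limit.
print(round(transfer_hours(100), 1))
print(round(transfer_hours(100, gbps=1), 1))
```

At the full 10 Gbps, 100 TB takes roughly 22 hours; with a 1 Gbps bandwidth limit the same job takes a little over nine days.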
Storage Comparison
- S3: Object Storage
- S3 Glacier : Object Archival
- EBS volumes: Network storage for one EC2 instance at a time
- Instance Storage: Physical storage for your EC2 instance (high IOPS)
- EFS: Network File System for Linux instances, POSIX filesystem
- FSx for Windows: Network File System for Windows servers
- FSx for Lustre: High Performance Computing Linux file system
- FSx for NetApp ONTAP: High OS Compatibility
- FSx for OpenZFS: Managed ZFS file system
- Storage Gateway: S3 & FSx File Gateway, Volume Gateway (cached & stored), Tape Gateway
- Transfer Family: FTP, FTPS, SFTP interface on top of Amazon S3 or Amazon EFS
- DataSync: Schedule data sync from on-premises to AWS, or AWS to AWS
- Snowcone / Snowball / Snowmobile: to move large amounts of data to the cloud, physically
- Database: for specific workloads, usually with indexing and querying
Quick Catchups
- Amazon Aurora is a MySQL and PostgreSQL-compatible relational database. It features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across 3 AZs.
- Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets.