AWS Storage

High availability (HA) basically means replication across 3 AZs, with up to 15 read replicas.

AWS Databases

  • RDS
  • Aurora
  • ElastiCache
  • DynamoDB
  • S3
  • DocumentDB (MongoDB)
  • Neptune
    • fully managed graph database
    • a popular graph dataset would be a social network
  • Keyspaces (Apache Cassandra)
  • QLDB (Quantum Ledger Database)
    • A ledger is a book recording financial transactions
    • Fully managed, serverless, highly available, replication across 3 AZs
  • Timestream
    • time-series db

Snow Family

Highly secure, portable devices to collect and process data at the edge, and to migrate data into and out of [[AWS]]. Managed with OpsHub.

  • Data migration: Snowcone, Snowball Edge, Snowmobile
  • Edge computing: Snowcone, Snowball Edge

Snowball Edge (data transfer)

  • physical data transport solution: TBs / PBs (in or out of AWS)
  • pay per data transfer job
  • block storage and s3-compatible object storage
  • storage optimized
    • 80TB HDD
  • compute optimized
    • 42TB HDD
    • 28TB NVMe

Snowcone & Snowcone SSD

  • Small, portable computing, anywhere, rugged & secure, withstands harsh environments
  • Snowcone HDD 8TB
  • Snowcone SSD 14TB
  • must provide your own battery / cables
  • can be shipped back to AWS offline, or connected to the internet to send data with AWS DataSync

Snowmobile (truck)

  • transfer exabytes of data (1 EB = 1000PB = 1000000TB)
  • each Snowmobile has 100PB of capacity (use multiple in parallel)
  • high security: temperature controlled, GPS, 24/7 video surveillance
  • better than Snowball if you transfer more than 10PB
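
The capacity figures in these notes can be turned into a rough device picker. This is only a sketch: the function name and the idea of dividing raw capacity are my own assumptions, usable capacity on a real device is lower than the raw figure, and the only thresholds used are the ones listed above (8TB / 14TB Snowcone, 80TB Snowball Edge storage optimized, 100PB Snowmobile, 10PB crossover).

```python
import math

# Raw capacities from the notes, in TB (decimal units: 1 PB = 1000 TB).
SNOWCONE_HDD_TB = 8
SNOWCONE_SSD_TB = 14
SNOWBALL_EDGE_STORAGE_TB = 80
SNOWMOBILE_TB = 100 * 1000           # 100 PB per truck
SNOWMOBILE_THRESHOLD_TB = 10 * 1000  # above 10 PB, Snowmobile is the better fit


def suggest_snow_device(data_tb: float) -> tuple[str, int]:
    """Return (device name, number of devices) for a dataset of data_tb terabytes.

    Hypothetical helper for illustration; it ignores usable-capacity overhead.
    """
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    if data_tb <= SNOWCONE_HDD_TB:
        return ("Snowcone", 1)
    if data_tb <= SNOWCONE_SSD_TB:
        return ("Snowcone SSD", 1)
    if data_tb <= SNOWMOBILE_THRESHOLD_TB:
        count = math.ceil(data_tb / SNOWBALL_EDGE_STORAGE_TB)
        return ("Snowball Edge Storage Optimized", count)
    return ("Snowmobile", math.ceil(data_tb / SNOWMOBILE_TB))


if __name__ == "__main__":
    for size in (5, 12, 200, 250_000):
        print(size, "TB ->", suggest_snow_device(size))
```

For example, 200TB does not fit on two 80TB devices, so the sketch rounds up to three Snowball Edge devices.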

Usage Process

  1. Request Snowball devices from the AWS console for delivery
  2. Install the snowball client / AWS OpsHub on your servers
  3. Connect the snowball to your servers and copy files using the client
  4. Ship back the device when you’re done (goes to the right AWS facility)
  5. Data will be loaded into an S3 bucket
  6. Snowball is completely wiped

Edge Computing

Process data while it’s being created on an edge location.

Use cases

  • Preprocess data
  • Machine Learning at the edge
  • transcoding media streams

Snow Family - all can run EC2 instances & AWS Lambda functions (using AWS IoT Greengrass)

  • Snowcone & Snowcone SSD (smaller)
  • Snowball Edge - compute optimized
  • Snowball Edge - storage optimized

OpsHub

Software to manage your Snow Family devices

  • unlocking and configuring single or clustered devices
  • transferring files
  • launching and managing instances running on Snow Family devices
  • monitor device metrics
  • launch compatible AWS services on your devices (EC2, DataSync, NFS)

Snowball into Glacier

Snowball cannot import to Glacier directly

Snowball (import) -> S3 (life cycle policy) -> Amazon Glacier
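
The lifecycle step in the middle of that chain could look roughly like the sketch below. The rule ID, the prefix, and the number of days are placeholders I made up; `put_bucket_lifecycle_configuration` is the boto3 S3 client call for setting lifecycle rules, and `boto3` is imported lazily so the config builder can be inspected without AWS credentials.

```python
def glacier_lifecycle_config(prefix: str = "", days: int = 1) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class after `days` days.

    The rule ID is an arbitrary name chosen for this example.
    """
    if days < 0:
        raise ValueError("days must be >= 0")
    return {
        "Rules": [
            {
                "ID": "snowball-import-to-glacier",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # "" applies to the whole bucket
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Attach the lifecycle configuration to an existing bucket."""
    import boto3  # needs AWS credentials configured in the environment

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # Only builds and prints the config; nothing is sent to AWS here.
    print(glacier_lifecycle_config(prefix="snowball-import/", days=1))
```

Calling `apply_lifecycle("my-import-bucket", config)` (bucket name hypothetical) would then let S3 handle the move to Glacier once the Snowball import has landed.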

FSx

Launch third-party high-performance file systems on AWS as a fully managed service

  • Windows File Server (Windows)
    • supports SMB, NTFS
    • can be mounted on Linux EC2
    • Active Directory integration, ACLs, user quotas
    • supports Microsoft’s Distributed File System Namespaces
    • can be accessed from your on-premises infra (VPN, DX)
    • can be configured to be multi-AZ
    • data is backed up daily to S3
  • FSx for Lustre (Linux Cluster)
    • ML, HPC - video processing, financial modeling, electronic design automation
    • seamless integration with [[aws-s3|S3]]
    • scratch FS (temporary storage)
    • persistent FS (long-term storage)
  • NetApp ONTAP
    • supports NFS, SMB, iSCSI
    • move workloads running on ONTAP or NAS to AWS
    • point-in-time instantaneous cloning
  • OpenZFS
    • NFS compatible
    • Up to 1000000 IOPS with < 0.5ms latency
    • snapshots, compression, low-cost
    • point-in-time instantaneous cloning

Storage Gateway

Bridge between on-premises data and cloud data (hybrid cloud storage).

Use cases

  • disaster recovery
  • backup & restore
  • tiered storage
  • on-premises cache & low-latency file access

Types

  • S3 File
    • NFS/SMB protocol (AD)
    • Most recently used data is cached in the file gateway
    • supports S3 Standard, S3 Standard-IA, S3 One Zone-IA, S3 Intelligent-Tiering
    • bucket access using IAM roles for each file gateway
  • FSx
    • native access to Amazon FSx for Windows File Server
    • Local cache for frequently accessed data
    • windows native compatibility (SMB, NTFS, AD)
  • Volume
    • Block storage using iSCSI protocol backed by S3
    • backed by EBS snapshots
    • cached volume - keeps the most frequently accessed data locally for low-latency access, while the full volume is stored in S3
  • Tape - iSCSI VTL
  • Hardware Appliance
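
To make the "local cache in front of S3" idea concrete, here is a toy model. It is not how Storage Gateway is implemented: the class name is made up, a plain dict stands in for the S3 bucket, and the least-recently-used eviction policy is my assumption, since the notes only say that recently or frequently used data is kept locally.

```python
from collections import OrderedDict


class CachedVolumeSketch:
    """Toy gateway: every write goes to the backing store, and a small
    LRU cache keeps the most recently used items close by."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.backing_store = {}     # stands in for the S3 bucket
        self.cache = OrderedDict()  # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def write(self, key: str, data: bytes) -> None:
        self.backing_store[key] = data  # the full data set always lives here
        self._remember(key, data)

    def read(self, key: str) -> bytes:
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        self.misses += 1
        data = self.backing_store[key]   # slow path: fetch from "S3"
        self._remember(key, data)
        return data

    def _remember(self, key: str, data: bytes) -> None:
        self.cache[key] = data
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used


if __name__ == "__main__":
    volume = CachedVolumeSketch(capacity=2)
    for name in ("a", "b", "c"):
        volume.write(name, name.encode())
    volume.read("a")  # miss: "a" was evicted when "c" was written
    volume.read("a")  # hit: now served from the local cache
    print(volume.hits, volume.misses, list(volume.cache))
```

The point of the sketch is the split between the two paths: reads that hit the cache stay on-premises, while misses go back to the cloud copy.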

Transfer Family

  • A fully managed service for file transfers into and out of Amazon S3 or Amazon EFS using file transfer protocols (FTP, FTPS, SFTP)
  • Supported Protocols
    • AWS Transfer for FTP (File Transfer Protocol (FTP))
    • AWS Transfer for FTPS (File Transfer Protocol over SSL (FTPS))
    • AWS Transfer for SFTP (SSH File Transfer Protocol (SFTP))
  • Managed infrastructure, Scalable, Reliable, Highly Available (multi-AZ)
  • Pay per provisioned endpoint per hour + data transfers in GB
  • Store and manage users’ credentials within the service
  • Integrate with existing authentication systems (Microsoft Active Directory, LDAP, Okta, Amazon Cognito, custom)
  • Usage: sharing files, public datasets, CRM, ERP, …

DataSync

  • one-time or infrequent large data transfer
  • Move large amounts of data to and from:
    • On-premises / other cloud to AWS (NFS, SMB, HDFS, S3 API…) – needs agent
    • AWS to AWS (different storage services) – no agent needed
  • Can synchronize to:
    • Amazon S3 (any storage classes – including Glacier)
    • Amazon EFS
    • Amazon FSx (Windows, Lustre, NetApp, OpenZFS…)
  • Replication tasks can be scheduled hourly, daily, weekly
  • File permissions and metadata are preserved (NFS POSIX, SMB…)
  • One agent task can use up to 10 Gbps; you can set up a bandwidth limit
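
The 10 Gbps figure gives a quick feel for how long a transfer takes. The sketch below is back-of-the-envelope arithmetic only: the function name is made up, it uses decimal units (1 TB = 10^12 bytes), and it assumes the link is fully used with no protocol overhead, so real transfers will be slower.

```python
def transfer_hours(size_tb: float, bandwidth_gbps: float = 10.0) -> float:
    """Ideal transfer time in hours for size_tb terabytes over a link
    of bandwidth_gbps gigabits per second (no overhead, full utilisation)."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    bits = size_tb * 1e12 * 8                # terabytes -> bits
    seconds = bits / (bandwidth_gbps * 1e9)  # gigabits/s -> bits/s
    return seconds / 3600


if __name__ == "__main__":
    # Compare the full 10 Gbps with a 1 Gbps bandwidth limit.
    for gbps in (10, 1):
        print(f"100 TB at {gbps} Gbps: {transfer_hours(100, gbps):.1f} h")
```

Under these assumptions 100TB takes about 22 hours at the full 10 Gbps, and roughly ten times longer if the task is limited to 1 Gbps.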

Storage Comparison

  • S3: Object Storage
  • S3 Glacier: Object Archival
  • EBS volumes: Network storage for one EC2 instance at a time
  • Instance Storage: Physical storage for your EC2 instance (high IOPS)
  • EFS: Network File System for Linux instances, POSIX filesystem
  • FSx for Windows: Network File System for Windows servers
  • FSx for Lustre: High Performance Computing Linux file system
  • FSx for NetApp ONTAP: High OS Compatibility
  • FSx for OpenZFS: Managed ZFS file system
  • Storage Gateway: S3 & FSx File Gateway, Volume Gateway (cache & stored), Tape Gateway
  • Transfer Family: FTP, FTPS, SFTP interface on top of Amazon S3 or Amazon EFS
  • DataSync: Schedule data sync from on-premises to AWS, or AWS to AWS
  • Snowcone / Snowball / Snowmobile: to move large amounts of data to the cloud, physically
  • Database: for specific workloads, usually with indexing and querying

Quick Catchups

  • Amazon Aurora is a MySQL and PostgreSQL-compatible relational database. It features a distributed, fault-tolerant, self-healing storage system that auto-scales up to 128TB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across 3 AZs.
  • Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets.
Licensed under CC BY-NC-SA 4.0