SORT 2025

Introduction

chef, puppet and ansible as cloud tools

glacier deep archive, like glacier, is s3 compatible?

distributed notes

h3 on distributed data

storing data in a distributed way
+ Hadoop Distributed File System (HDFS)
+ Apache Cassandra

accessing data stored in a distributed way
+ Hadoop
+ Spark
+ Apache Kafka?

kubernetes: how to use GPUs, CUDA
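One common way to give a pod a GPU is to request the `nvidia.com/gpu` resource in its limits. The sketch below only prints such a manifest. It assumes the cluster already runs the NVIDIA device plugin; the pod name `cuda-smoke-test` and the image tag are placeholders, not something from these notes.

```shell
# Print a pod manifest that requests one NVIDIA GPU.
# Assumptions: the NVIDIA device plugin is installed on the cluster;
# the pod name and the CUDA image tag are illustrative placeholders.
gpu_pod_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}

# Show the manifest; to apply it for real, pipe it into kubectl:
#   gpu_pod_manifest | kubectl apply -f -
gpu_pod_manifest
```

If the pod is scheduled and `nvidia-smi` prints a device table in the pod logs, the container can see the GPU and CUDA workloads should be able to use it.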

microk8s ctr image import myimage.tar (tarball from docker save; docker export writes a container filesystem, not an image)
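A rough sketch of the full round trip, assuming the tarball comes from `docker save` (which writes an image archive, whereas `docker export` writes a container filesystem). The tag `myimage:latest` is a placeholder. By default the script only prints the commands; set `DRY_RUN=0` to run them.

```shell
# Move a locally built image into the MicroK8s containerd store.
# Assumption: "myimage:latest" is a placeholder tag for a local image.
IMAGE="${IMAGE:-myimage:latest}"
TARBALL="${TARBALL:-myimage.tar}"

# Print each command unless DRY_RUN=0 is set, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

import_image() {
  run docker save -o "$TARBALL" "$IMAGE"     # write the image archive
  run microk8s ctr image import "$TARBALL"   # load it into containerd
  run microk8s ctr images ls                 # check that it is listed
}

import_image
```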

kubectl
+ kubectl apply on yaml files
+ kubectl create deployment
+ kubectl scale deployment
+ kubectl expose deployment
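The kubectl commands above could be strung together roughly like this. The deployment name `web`, the `nginx` image, the replica count and `manifest.yaml` are all made-up placeholders. The script prints the commands by default; set `DRY_RUN=0` to run them against a real cluster.

```shell
# Print each command unless DRY_RUN=0 is set, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Assumptions: "web", "nginx" and manifest.yaml are placeholders.
deploy_demo() {
  run kubectl apply -f manifest.yaml                           # declarative: apply a yaml file
  run kubectl create deployment web --image=nginx              # imperative: create a deployment
  run kubectl scale deployment web --replicas=3                # change the number of pods
  run kubectl expose deployment web --port=80 --type=NodePort  # put a service in front of it
  run kubectl get deployments,services                         # inspect the result
}

deploy_demo
```

`kubectl apply` is the declarative route (the yaml file is the source of truth); the `create` / `scale` / `expose` commands are the imperative equivalents for quick experiments.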

for distributed:
+ rancher
+ docker swarm

Apache Kafka (messaging thing?)

terraform on cloud computing.

distributed datasets: Hadoop Distributed File System (HDFS), Apache HBase, Apache Cassandra (all based on bigtable?)

mapreduce (and apache hadoop, an implementation)

apache spark (builds off mapreduce, better?)
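The map / shuffle / reduce idea can be illustrated on one machine with a plain shell pipeline. This is only an analogy for the programming model, not how Hadoop or Spark are actually invoked; the function name `wordcount` is made up for the example.

```shell
# Word count as a single-machine analogy for MapReduce.
wordcount() {
  tr -cs '[:alpha:]' '\n' |              # map: emit one word (key) per line
    tr '[:upper:]' '[:lower:]' |         # normalise keys to lowercase
    sort |                               # shuffle: bring identical keys together
    uniq -c |                            # reduce: count each group of keys
    awk 'NF == 2 { print $2, $1 }'       # format as "word count", drop blank keys
}

printf 'to be or not to be\n' | wordcount
# be 2
# not 1
# or 1
# to 2
```

In a real cluster the map and reduce steps run in parallel on many nodes and the sort is a distributed shuffle, but the data flow has the same shape.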

CAP theorem: only 2 of 3: consistency, availability, partition tolerance.

google cloud h3

Microsoft h3

microsoft snowflake

Amazon

ec2 stuff
+ Amazon Machine Images (AMI)
+ Storage
  * instance storage
  * simple storage service (s3)
    1. object storage. non-filesystem
  * elastic file system (efs)
    1. designed to be used with multiple ec2 instances
  * elastic block store (ebs)
    1. can have snapshot backups
    2. paired with ec2 instances
    3. designed for one instance, but can now be used by multiple?
    4. options:
       1) Cold HDD (sc1)

Aws:
+ part on elastic compute cloud (ec2)
+ Elastic beanstalk: meta load balancer and instances
+ Lambda. Functions as a service
+ Athena page?
+ Simple storage service (S3)
+ Elastic block store
+ Ec2 instance store
+ Glacier
+ Elastic file system
+ Databases: SimpleDB; DynamoDB; DocumentDB; rds
+ Data warehouses: redshift
+ Data lakes page
+ Page on graph database in amazon
+ Elasticsearch
+ Amazon glue
+ Amazon aurora
+ Machine learning: sagemaker; rekognition; lex
+ Tools: IAM

aws:
+ page on elastic compute cloud (ec2)
+ page on load balancers (beanstalk)

aws
+ aws ec2
+ aws ec2 describe-instances
+ aws s3
+ aws s3 ls s3://mybucket

amazon ec2:
+ Identity and Access Management (IAM)
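A small sketch of the `aws ec2` and `aws s3` commands from the list above, with a couple of commonly used options added. The `--query` expression, the file `notes.txt` and the bucket name are illustrative assumptions. The script prints the commands by default; set `DRY_RUN=0` to run them with configured credentials.

```shell
# Print each command unless DRY_RUN=0 is set, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

# Assumptions: s3://mybucket and notes.txt are placeholders.
aws_tour() {
  # List instance ids and their state as a table (JMESPath query).
  run aws ec2 describe-instances \
    --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
    --output table
  run aws s3 ls                                     # list all buckets
  run aws s3 ls s3://mybucket                       # list objects in one bucket
  run aws s3 cp notes.txt s3://mybucket/notes.txt   # upload a local file
}

aws_tour
```

Which of these calls succeed depends on the IAM permissions attached to the credentials in use.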