
Top 50 AWS Solutions Architect Associate Interview Questions

Scenario-based questions covering all four SAA-C03 exam domains — secure, resilient, high-performing, and cost-optimized architectures on AWS.

01

What is the AWS Shared Responsibility Model and how does it differ for EC2 vs RDS?

The Shared Responsibility Model divides security between AWS and the customer. AWS is responsible for security OF the cloud (hardware, infrastructure, physical security). The customer is responsible for security IN the cloud (data, IAM, OS patches, network config, encryption).

  • EC2 (IaaS) — customer manages OS patching, security groups, application security, and data encryption.
  • RDS (Managed service) — AWS manages OS and database engine patching. Customer manages data, IAM access, and network security.
02

What is the difference between IAM identity-based policies and resource-based policies?

  • Identity-based policies — attached to IAM users, groups, or roles. Define what actions the identity can perform on which resources.
  • Resource-based policies — attached directly to a resource (e.g., S3 bucket policy, KMS key policy, Lambda resource policy). Define who can access the resource.
  • Cross-account access requires BOTH the identity-based policy (in the source account) AND the resource-based policy (in the target account) to allow the action.
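
The cross-account rule above can be sketched with a toy evaluator. This is a simplified illustration, not the real IAM evaluation engine: it ignores explicit Deny, Condition blocks, and SCPs, and the account ID, role name, and bucket name are hypothetical.

```python
from fnmatch import fnmatchcase

ROLE_ARN = "arn:aws:iam::111111111111:role/app-role"       # hypothetical source-account role
BUCKET_OBJECTS = "arn:aws:s3:::example-shared-bucket/*"    # hypothetical target-account bucket

identity_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [BUCKET_OBJECTS]}
    ],
}

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": ROLE_ARN},
            "Action": ["s3:GetObject"],
            "Resource": [BUCKET_OBJECTS],
        }
    ],
}


def allows(policy, principal, action, resource):
    """Return True if any Allow statement matches (toy matcher only)."""
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        # Identity-based policies carry no Principal; resource-based ones do.
        stmt_principal = stmt.get("Principal", {}).get("AWS", principal)
        if (
            stmt_principal == principal
            and any(fnmatchcase(action, a) for a in stmt["Action"])
            and any(fnmatchcase(resource, r) for r in stmt["Resource"])
        ):
            return True
    return False


def cross_account_allowed(identity, resource_policy, principal, action, resource):
    """Cross-account access needs an Allow on both sides."""
    return allows(identity, principal, action, resource) and allows(
        resource_policy, principal, action, resource
    )
```

Removing the statement from either policy makes `cross_account_allowed` return False, which is the behaviour the bullet describes.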
03

What is an IAM Permission Boundary and when would you use it?

An IAM Permission Boundary is an advanced feature that sets the maximum permissions an IAM entity (user or role) can have. Even if a policy grants broader permissions, the boundary limits what is actually allowed. Use it to safely delegate permission management — for example, allow developers to create IAM roles for their applications without being able to grant themselves admin access.
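
One way to picture the boundary is as a set intersection. The sketch below assumes permissions can be modelled as plain sets of action names, which ignores wildcards, explicit denies, and resource-based policies; the action lists are illustrative.

```python
def effective_permissions(identity_allows, boundary_allows):
    """An action is effective only if both the policy and the boundary allow it."""
    return identity_allows & boundary_allows


# Hypothetical role created by a developer, and the boundary an admin requires on it.
developer_role_policy = {"s3:GetObject", "dynamodb:PutItem", "iam:CreateUser"}
permission_boundary = {"s3:GetObject", "s3:PutObject", "dynamodb:PutItem"}

effective = effective_permissions(developer_role_policy, permission_boundary)
```

Here `iam:CreateUser` drops out because the boundary does not include it, and `s3:PutObject` is not granted because the boundary alone never grants anything.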

04

A company needs EC2 instances to access S3 securely without storing credentials. What is the best solution?

Attach an IAM Role to the EC2 instance with the required S3 permissions. The EC2 instance automatically retrieves temporary credentials via the Instance Metadata Service (IMDS). This is the AWS best practice — never store long-term access keys on EC2 instances. The credentials rotate automatically and expire.
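
As a rough sketch, the role needs two documents: a trust policy that lets the EC2 service assume it, and a permissions policy for S3. The helper names and the bucket name below are assumptions made for illustration.

```python
import json


def ec2_trust_policy():
    """Trust policy: allows the EC2 service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_read_policy(bucket_name):
    """Permissions policy: read-only access to objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_read_policy("example-bucket"), indent=2))
```

The role is then attached to the instance through an instance profile, and the SDK on the instance picks up the temporary credentials without any keys in code or config files.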

05

What is the difference between a VPC Security Group and a Network ACL?

  • Security Group — stateful firewall at the instance/ENI level. Return traffic is automatically allowed. Rules are allow-only (no deny). All rules are evaluated.
  • Network ACL — stateless firewall at the subnet level. Must explicitly allow both inbound and outbound traffic. Supports both allow and deny rules. Rules are evaluated in order by rule number (lowest first).
  • Best practice: use both for defense in depth. Security Groups for instance-level control; NACLs for subnet-level blocking (e.g., blocking a specific IP range).
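
The difference in rule evaluation can be illustrated with a small simulation. It is a sketch only: it matches on source IP alone (no ports or protocols), does not model statefulness, and the rule numbers and CIDR ranges are made up.

```python
import ipaddress


def nacl_decision(rules, source_ip):
    """NACL style: check rules in ascending rule-number order; first match wins."""
    ip = ipaddress.ip_address(source_ip)
    for _number, cidr, action in sorted(rules):
        if ip in ipaddress.ip_network(cidr):
            return action
    return "deny"  # implicit final rule denies anything unmatched


def sg_decision(allowed_cidrs, source_ip):
    """Security group style: allow-only rules, any match permits the traffic."""
    ip = ipaddress.ip_address(source_ip)
    if any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs):
        return "allow"
    return "deny"


# Hypothetical NACL: block one range, then allow everything else.
nacl_rules = [
    (200, "0.0.0.0/0", "allow"),
    (100, "203.0.113.0/24", "deny"),
]
```

Because rule 100 is evaluated before rule 200, traffic from 203.0.113.0/24 is denied even though a later rule would allow it. A security group has no way to express that deny.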
06

What is the difference between a VPC Gateway Endpoint and an Interface Endpoint?

  • Gateway Endpoint — free. Supports only S3 and DynamoDB. Added as a route in the route table. Traffic stays on the AWS network without internet or NAT.
  • Interface Endpoint (AWS PrivateLink) — hourly cost + data processing fee. Supports most AWS services. Creates an ENI with a private IP in your subnet. More flexible — works across VPCs and on-premises via Direct Connect/VPN.
  • Use Gateway Endpoints for S3 and DynamoDB to save cost. Use Interface Endpoints for all other services.
07

What is the difference between VPC Peering and AWS Transit Gateway?

  • VPC Peering — direct one-to-one connection between two VPCs. NOT transitive (A↔B and B↔C does not mean A↔C). Works across accounts and regions. No bandwidth limit. No additional cost beyond data transfer.
  • Transit Gateway — hub-and-spoke model. One Transit Gateway connects thousands of VPCs and on-premises networks. Supports transitive routing. Simplifies complex network topologies. Hourly attachment cost.
  • Use VPC Peering for simple, few-VPC setups. Use Transit Gateway for large-scale, complex multi-VPC architectures.
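
A quick count shows why peering stops scaling. Because peering is not transitive, full connectivity between n VPCs needs one connection per pair, while a hub-and-spoke design needs one attachment per VPC. The function names are illustrative.

```python
def peering_connections_for_full_mesh(vpc_count):
    """Every pair of VPCs needs its own peering connection: n * (n - 1) / 2."""
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count):
    """Each VPC attaches once to the central Transit Gateway."""
    return vpc_count


for n in (3, 10, 50):
    print(n, peering_connections_for_full_mesh(n), transit_gateway_attachments(n))
```

With 10 VPCs that is 45 peering connections versus 10 attachments; with 50 VPCs it is 1,225 versus 50.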
08

How do you encrypt an existing unencrypted RDS database?

You cannot enable encryption on an existing unencrypted RDS instance directly. The process is:

  1. Take a snapshot of the unencrypted DB instance.
  2. Copy the snapshot and enable encryption (using a KMS key) during the copy.
  3. Restore a new DB instance from the encrypted snapshot.
  4. Update your application connection string to point to the new encrypted DB.
  5. Delete the old unencrypted DB instance.
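
Steps 1 to 3 could be scripted roughly as follows. This is a sketch that assumes a boto3-style RDS client is passed in by the caller (for example `boto3.client("rds")`); the snapshot naming scheme is hypothetical, and the cut-over and cleanup steps are left to the operator.

```python
def encrypt_rds_instance(rds, source_id, target_id, kms_key_id):
    """Create an encrypted copy of an unencrypted DB instance via snapshots."""
    plain_snapshot = f"{source_id}-plain-snap"          # hypothetical naming scheme
    encrypted_snapshot = f"{source_id}-encrypted-snap"  # hypothetical naming scheme

    # Step 1: snapshot the unencrypted instance and wait until it is available.
    rds.create_db_snapshot(
        DBInstanceIdentifier=source_id, DBSnapshotIdentifier=plain_snapshot
    )
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=plain_snapshot)

    # Step 2: copy the snapshot; supplying a KMS key encrypts the copy.
    rds.copy_db_snapshot(
        SourceDBSnapshotIdentifier=plain_snapshot,
        TargetDBSnapshotIdentifier=encrypted_snapshot,
        KmsKeyId=kms_key_id,
    )
    rds.get_waiter("db_snapshot_available").wait(
        DBSnapshotIdentifier=encrypted_snapshot
    )

    # Step 3: restore a new instance from the encrypted snapshot.
    rds.restore_db_instance_from_db_snapshot(
        DBInstanceIdentifier=target_id, DBSnapshotIdentifier=encrypted_snapshot
    )
    rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier=target_id)
    return target_id
```

Passing the client in keeps the function easy to exercise with a stub before running it against a real account.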
09

What is the difference between AWS KMS Customer Managed Keys (CMK) and AWS Managed Keys?

  • AWS Managed Keys — created and managed by AWS for use with specific services (e.g., aws/s3, aws/rds). Free. Automatic annual rotation. You cannot manage or delete them.
  • Customer Managed Keys (CMK) — you create, manage, and control. Optional automatic rotation (annual by default, with a configurable rotation period) and on-demand rotation. Can define key policies and grants. Can be used across services and accounts. $1/month per key, plus API request charges.
  • Use CMKs when you need audit trails, cross-account access, custom rotation, or the ability to disable/delete the key.
10

What is the difference between AWS Secrets Manager and AWS Systems Manager Parameter Store?

  • Secrets Manager — designed for secrets (DB credentials, API keys). Supports automatic rotation via Lambda. Charges per secret per month ($0.40). Cross-account access supported.
  • Parameter Store — stores configuration data and secrets. Standard tier is free (up to 10,000 parameters). Advanced tier supports larger values and policies. No built-in automatic rotation.
  • Use Secrets Manager when you need automatic rotation. Use Parameter Store for non-sensitive config data or when cost is a concern.
11

What is the difference between Multi-AZ RDS and RDS Read Replicas?

  • Multi-AZ — synchronous replication to a standby in a different AZ. Automatic failover on primary failure (typically 1–2 min). In a Multi-AZ DB instance deployment the standby cannot serve read traffic. Purpose: high availability and disaster recovery.
  • Read Replicas — asynchronous replication. Can serve read traffic to offload the primary. Up to 15 replicas for RDS MySQL, MariaDB, and PostgreSQL (5 for Oracle and SQL Server), and 15 for Aurora. Can be in different regions. Can be promoted to standalone DB. Purpose: read scalability and performance.
  • Multi-AZ = HA. Read Replicas = performance. You can use both simultaneously.
12

What are the four Disaster Recovery strategies in order of cost and RTO?

  • Backup & Restore — cheapest. Backup to S3/Glacier. Restore from scratch. RTO: hours. RPO: hours.
  • Pilot Light — core services (DB) always running. Scale up app servers on disaster. RTO: 10–30 min. RPO: minutes.
  • Warm Standby — scaled-down but fully functional copy always running. Scale up to full capacity. RTO: minutes. RPO: seconds.
  • Multi-Site Active/Active — full capacity in multiple regions simultaneously. Route 53 routes to both. RTO: seconds. RPO: near zero. Most expensive.
13

What is the difference between Amazon SQS Standard and FIFO queues?

  • Standard Queue — at-least-once delivery (duplicates possible), best-effort ordering, nearly unlimited throughput. Use for high-volume, order-insensitive workloads.
  • FIFO Queue — exactly-once processing, strict first-in-first-out ordering within a message group, up to 300 messages/second without batching or 3,000 with batching. Use for order-sensitive workloads (financial transactions, inventory updates).
  • SQS Dead Letter Queue (DLQ) — messages that fail processing N times (maxReceiveCount) are moved to a DLQ for debugging. Works with both Standard and FIFO.
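
The dead-letter behaviour can be simulated in memory. The sketch below is not the SQS API: it models only the receive count and `maxReceiveCount`, and the handler and message values are invented for illustration.

```python
from collections import deque


def process_with_dlq(messages, handler, max_receive_count):
    """Retry each message until it succeeds or has been received max_receive_count times."""
    queue = deque((message, 0) for message in messages)
    processed, dead_letter = [], []
    while queue:
        message, receives = queue.popleft()
        receives += 1
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if receives >= max_receive_count:
                dead_letter.append(message)       # give up: park it for debugging
            else:
                queue.append((message, receives))  # becomes visible again for a retry
    return processed, dead_letter


def flaky_handler(message):
    """Hypothetical consumer that can never process the 'bad' message."""
    if message == "bad":
        raise ValueError("cannot process message")


processed, dead_letter = process_with_dlq(["a", "bad", "b"], flaky_handler, 3)
```

The poison message is attempted three times and then lands in `dead_letter`, while the healthy messages are processed normally.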
14

What is the SNS fan-out pattern and when would you use it?

The SNS fan-out pattern involves publishing a single message to an SNS topic, which then delivers it to multiple SQS queues (or other subscribers) simultaneously. Use it when you need to process the same event in multiple ways in parallel — for example, an order placed event that needs to trigger inventory update, email notification, and analytics processing independently and concurrently.
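
A minimal in-memory model of fan-out looks like this. It is a conceptual sketch, not the SNS or SQS API; the `Topic` class, queue names, and event fields are hypothetical.

```python
class Topic:
    """Toy topic: every subscribed queue receives its own copy of each message."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, name):
        self.queues[name] = []
        return self.queues[name]

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(dict(message))  # independent copy per subscriber


orders = Topic()
inventory_queue = orders.subscribe("inventory")
email_queue = orders.subscribe("email")
analytics_queue = orders.subscribe("analytics")

orders.publish({"event": "order_placed", "order_id": 42})
```

The publisher sends once, and each consumer drains its own queue at its own pace, so a slow analytics job cannot delay the email notification.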

15

What is the difference between Amazon CloudFront and AWS Global Accelerator?

  • CloudFront — CDN. Caches content at 400+ edge locations. Best for HTTP/HTTPS content delivery (static assets, APIs, video). Reduces origin load via caching. Supports Lambda@Edge.
  • Global Accelerator — no caching. Routes TCP/UDP traffic through the AWS global network to the nearest healthy endpoint. Provides 2 static anycast IPs. Best for non-HTTP workloads, gaming, IoT, or when you need static IPs.
  • Both improve global performance, but CloudFront caches; Global Accelerator proxies.
16

What are the Route 53 routing policies and when would you use each?

  • Simple — single resource. No health checks on the record itself.
  • Weighted — distribute traffic by percentage. Use for A/B testing or gradual blue/green deployments.
  • Latency — route to the region with lowest latency for the user. Use for global apps.
  • Failover — active-passive DR. Route to primary; failover to secondary if health check fails.
  • Geolocation — route based on user's geographic location. Use for content localization or compliance.
  • Geoproximity — route based on resource location with optional bias to shift traffic. Use with Traffic Flow.
  • Multi-Value — return multiple healthy records. Basic load balancing with health checks.
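
For weighted routing, each record receives a share of traffic equal to its weight divided by the sum of all weights. The sketch below assumes at least one non-zero weight, and the record names are illustrative.

```python
def traffic_share(weights):
    """Return the expected fraction of traffic each weighted record receives."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical blue/green rollout: send 10% of traffic to the new stack.
shares = traffic_share({"blue": 90, "green": 10})
```

Shifting the weights to 50/50 and then 0/100 completes the rollout without touching the application.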
17

What is the difference between EC2 Spot Instances, Reserved Instances, and Savings Plans?

  • Spot Instances — up to 90% discount. Use spare AWS capacity. Can be interrupted with 2-minute warning. Use for fault-tolerant batch jobs, big data, CI/CD.
  • Reserved Instances — up to 72% discount. 1 or 3 year commitment. Standard RI: specific instance type/region. Convertible RI: can change instance type. Use for steady-state workloads.
  • Savings Plans — 1 or 3 year commitment to a steady hourly spend. Compute Savings Plans: up to 66% discount, flexible across EC2, Lambda, Fargate, any region. EC2 Instance Savings Plans: up to 72% discount, specific instance family in a region.
18

What is the difference between EBS gp2 and gp3 volumes?

  • gp2 — IOPS tied to volume size (3 IOPS/GB, up to 16,000 IOPS). Burst credits for small volumes. Cost: $0.10/GB/month.
  • gp3 — baseline 3,000 IOPS and 125 MB/s throughput regardless of size. IOPS and throughput can be provisioned independently (up to 16,000 IOPS, 1,000 MB/s). 20% cheaper than gp2 ($0.08/GB/month).
  • Always prefer gp3 for new volumes — better performance, lower cost, and no burst credit complexity.
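
The numbers above can be turned into a small comparison. The prices are the ones quoted in this answer and vary by region. The 100 IOPS floor for gp2 is an added assumption based on AWS's documented minimum. Costs are kept in cents to avoid floating-point noise.

```python
GP2_CENTS_PER_GB = 10   # $0.10/GB/month as quoted above
GP3_CENTS_PER_GB = 8    # $0.08/GB/month as quoted above
GP3_BASELINE_IOPS = 3_000


def gp2_baseline_iops(size_gb):
    """gp2 baseline: 3 IOPS per GB, floored at 100 and capped at 16,000."""
    return max(100, min(16_000, 3 * size_gb))


def monthly_cost_cents(size_gb, cents_per_gb):
    """Storage-only monthly cost; extra provisioned gp3 IOPS are not included."""
    return size_gb * cents_per_gb


size = 500
print(gp2_baseline_iops(size), monthly_cost_cents(size, GP2_CENTS_PER_GB))
print(GP3_BASELINE_IOPS, monthly_cost_cents(size, GP3_CENTS_PER_GB))
```

For a 500 GB volume, gp2 gives 1,500 baseline IOPS for $50 a month, while gp3 gives 3,000 IOPS for $40. gp2 only matches the gp3 baseline at 1,000 GB.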
19

What is the difference between Amazon EBS, EFS, and S3?

  • EBS (Elastic Block Store) — block storage. Normally attached to one EC2 instance at a time (Multi-Attach exists only for io1/io2 volumes). Stays in one AZ. Like a hard drive. Use for OS volumes, databases, applications requiring low-latency block I/O.
  • EFS (Elastic File System) — managed NFS file system. Mountable on multiple EC2 instances simultaneously across multiple AZs. Scales automatically. Use for shared file storage, CMS, home directories.
  • S3 (Simple Storage Service) — object storage. Accessed via HTTP API. Unlimited capacity. Not mountable as a file system natively. Use for backups, static assets, data lakes, media storage.
20

What is Amazon Aurora and how does it differ from standard RDS?

Amazon Aurora is an AWS-built relational database compatible with MySQL and PostgreSQL. Key differences from standard RDS:

  • Performance — AWS claims up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL.
  • Storage — auto-scales in 10GB increments up to 128TB. 6 copies of data across 3 AZs.
  • Read Replicas — up to 15 Aurora Replicas that share the cluster's storage volume, so replica lag is typically low. Failover to a replica in under 30 seconds.
  • Aurora Global Database — primary region + up to 5 secondary regions with <1 second replication lag.
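  • Aurora Serverless v2 — scales compute in fine-grained increments (0.5 ACU). Can pause to zero ACUs when idle on engine versions that support auto-pause; otherwise it scales down to a configured minimum.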
21

What is Amazon DynamoDB DAX and when should you use it?

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for DynamoDB. It reduces read latency from single-digit milliseconds to microseconds. DAX is a write-through cache — writes go to both DAX and DynamoDB. Use DAX when your application is read-heavy, requires microsecond latency, or you want to reduce DynamoDB read costs. DAX is NOT suitable for strongly consistent reads or write-heavy workloads.
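
The write-through behaviour can be shown with a toy cache in front of a dictionary that stands in for the table. This is a conceptual sketch, not the DAX client: there is no TTL, eviction, or consistency handling, and the class and key names are made up.

```python
class WriteThroughCache:
    """Toy write-through cache: writes hit the store and the cache together."""

    def __init__(self, backing_store):
        self.store = backing_store
        self.cache = {}
        self.store_reads = 0  # counts reads that had to go to the table

    def put(self, key, value):
        self.store[key] = value  # the write goes to the table...
        self.cache[key] = value  # ...and to the cache in the same operation

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: no table read
        self.store_reads += 1
        value = self.store[key]
        self.cache[key] = value
        return value


table = {"user#1": "Ada"}
dax_like = WriteThroughCache(table)
dax_like.get("user#1")
dax_like.get("user#1")
dax_like.put("user#2", "Grace")
dax_like.get("user#2")
```

After these calls the table has been read only once, which is where the latency and read-cost savings for read-heavy workloads come from.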

22

What is the difference between ElastiCache Redis and Memcached?

  • Redis — supports complex data structures (lists, sets, sorted sets, hashes), persistence (RDB/AOF), pub/sub, Lua scripting, cluster mode (horizontal scaling), Multi-AZ with automatic failover, read replicas. Use for session management, leaderboards, real-time analytics.
  • Memcached — simpler, multi-threaded, no persistence, no replication, no failover. Use for simple caching of objects where you need maximum throughput and multi-threading.
  • Choose Redis for most use cases — it is more feature-rich. Choose Memcached only if you need multi-threading and simplicity.
23

What is the difference between AWS Direct Connect and Site-to-Site VPN?

  • Direct Connect — dedicated private physical connection. Consistent bandwidth and latency. Does NOT encrypt by default (add VPN over Direct Connect for encryption). Takes weeks to provision. Use for large data transfers, compliance, or latency-sensitive workloads.
  • Site-to-Site VPN — encrypted IPSec tunnel over the public internet. Quick to set up (minutes). Subject to internet variability. Lower cost. Use for smaller offices or as a backup for Direct Connect.
  • Best practice: use Direct Connect as primary + Site-to-Site VPN as failover backup.
24

What is Amazon API Gateway and what are the differences between REST API and HTTP API?

  • REST API — full feature set: usage plans, API keys, request/response transformation, caching, WAF integration, resource policies. Higher cost.
  • HTTP API — ~70% cheaper, lower latency, simpler. Supports Lambda and HTTP backends, JWT authorizers, CORS. No usage plans or request transformation. Best for most Lambda-backed APIs.
  • WebSocket API — real-time two-way communication. Use for chat apps, live dashboards, multiplayer games.
  • Use HTTP API by default unless you need REST API-specific features like usage plans or request transformation.
25

What is AWS CloudFormation and what is a Stack?

AWS CloudFormation is an Infrastructure as Code (IaC) service that lets you define AWS resources in JSON or YAML templates. A Stack is a collection of AWS resources managed as a single unit — create, update, or delete all resources together. Key features:

  • Change Sets — preview changes before applying them to a stack.
  • StackSets — deploy stacks across multiple AWS accounts and regions from a single template.
  • Drift Detection — identify resources that have been manually changed outside CloudFormation.
  • Rollback — automatically rolls back on failure to the last known good state.
26

What is the difference between horizontal and vertical scaling, and which does AWS prefer?

  • Vertical scaling (scale up) — increase the size of an existing instance (more CPU, RAM). Has limits. Requires downtime for EC2.
  • Horizontal scaling (scale out) — add more instances to distribute load. No practical limits. No downtime. Handled by Auto Scaling Groups.
  • AWS prefers horizontal scaling — it is more resilient, cost-effective, and aligns with the cloud-native design principle of designing for failure.
27

What is an Auto Scaling Group and what are the scaling policy types?

An Auto Scaling Group (ASG) maintains a desired number of EC2 instances, automatically replacing unhealthy ones and scaling based on demand. Scaling policy types:

  • Target Tracking — maintain a specific metric value (e.g., keep CPU at 50%). Simplest and recommended.
  • Step Scaling — scale by a specific amount based on CloudWatch alarm thresholds. More granular control.
  • Scheduled Scaling — scale at specific times (e.g., scale up every Monday at 8am for business hours).
  • Predictive Scaling — uses ML to forecast demand and proactively scale. Best for cyclical traffic patterns.
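
Target tracking behaves roughly like a proportional controller. The formula below is a common approximation rather than the exact algorithm AWS runs, and the function name and parameters are illustrative.

```python
import math


def target_tracking_desired_capacity(
    current_capacity, current_metric, target_metric, min_size, max_size
):
    """Estimate the capacity that would bring the average metric back to target."""
    raw = math.ceil(current_capacity * current_metric / target_metric)
    return max(min_size, min(max_size, raw))  # never leave the group's size limits


# 4 instances averaging 80% CPU with a 50% target: scale out.
scale_out = target_tracking_desired_capacity(4, 80, 50, min_size=1, max_size=10)
# 4 instances averaging 25% CPU with a 50% target: scale in.
scale_in = target_tracking_desired_capacity(4, 25, 50, min_size=1, max_size=10)
```

In this model the group grows from 4 to 7 instances in the first case and shrinks to 2 in the second.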
28

What is the S3 Intelligent-Tiering storage class and when should you use it?

S3 Intelligent-Tiering automatically moves objects between access tiers as access patterns change. There are no retrieval fees and no minimum storage duration, although a small per-object monitoring and automation charge applies. Objects that have not been accessed for 30 consecutive days move to the Infrequent Access tier, and after 90 days to the Archive Instant Access tier; an object moves back to the Frequent Access tier when it is accessed again. Use it for data with unknown or unpredictable access patterns where you want automatic cost optimization without operational overhead.

29

What is AWS Lambda Provisioned Concurrency and when should you use it?

Provisioned Concurrency pre-initializes a specified number of Lambda execution environments, keeping them warm and ready to respond immediately. This eliminates cold starts — the latency caused by Lambda initializing a new execution environment. Use Provisioned Concurrency for latency-sensitive applications (APIs, real-time processing) where cold start delays are unacceptable. Note: Provisioned Concurrency has an additional cost beyond standard Lambda pricing.

30

What is the difference between AWS Elastic Beanstalk and AWS CloudFormation?

  • Elastic Beanstalk — PaaS. You provide application code; AWS handles infrastructure (EC2, ALB, Auto Scaling, RDS). Opinionated, less control. Best for developers who want to deploy quickly without infrastructure expertise.
  • CloudFormation — IaC. You define every resource in a template. Full control over all AWS resources. More complex but more flexible. Best for infrastructure teams managing complex, multi-service architectures.
  • Elastic Beanstalk actually uses CloudFormation under the hood to provision its resources.
31

A company needs to migrate 50TB of on-premises data to S3. The internet connection is slow. What is the best solution?

Use AWS Snowball Edge (Storage Optimized). It is a physical device with 80TB of usable storage, so 50TB fits on a single device. AWS ships the device to your location, you load the data, and ship it back. AWS then imports the data into S3. This avoids slow internet transfers. For very large migrations (roughly 10PB and up), AWS Snowmobile (a shipping container that holds up to 100PB) is designed for the job. For smaller amounts (up to 8TB), use AWS Snowcone.
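
A back-of-the-envelope calculation shows why the network is not an option here. The 100 Mbps link speed is an assumption for illustration; the scenario only says the connection is slow.

```python
def transfer_days(size_tb, link_mbps, utilization=1.0):
    """Days needed to push size_tb (decimal terabytes) over a link of link_mbps."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400


ideal = transfer_days(50, 100)          # link fully dedicated to the migration
realistic = transfer_days(50, 100, 0.8)  # assuming 80% usable bandwidth
print(f"{ideal:.1f} days ideal, {realistic:.1f} days at 80% utilization")
```

Even with the whole link dedicated to the migration, 50TB takes more than six weeks, compared with roughly a week for a shipped device.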

32

What is the most cost-effective architecture for a web application with unpredictable traffic spikes?

A serverless architecture: Route 53 → CloudFront → API Gateway → Lambda → DynamoDB (On-Demand mode). This architecture:

  • Scales automatically from zero to millions of requests with no capacity planning.
  • Costs almost nothing when idle: request charges drop to zero, leaving only small fixed items such as stored data and the Route 53 hosted zone.
  • CloudFront caches responses at the edge, reducing Lambda invocations.
  • DynamoDB On-Demand scales instantly without pre-provisioning.
  • No EC2 instances to manage, patch, or pay for when idle.
33

What is AWS Cost Explorer and how does it differ from AWS Budgets?

  • Cost Explorer — visualization and analysis tool. Shows historical spending, identifies trends, provides rightsizing recommendations for EC2, and forecasts future costs. Use it to understand where your money is going.
  • AWS Budgets — proactive alerting tool. Set thresholds for cost, usage, reservation coverage, or Savings Plans utilization. Sends alerts via email or SNS when you exceed or are forecasted to exceed thresholds. Use it to prevent bill surprises.
  • Use both together: Cost Explorer to analyze and optimize, Budgets to monitor and alert.
34

What is Amazon Kinesis Data Streams and when would you use it over SQS?

  • Kinesis Data Streams — real-time streaming. Multiple consumers can read the same data simultaneously. Data retained for 24 hours by default, extendable up to 365 days. Ordered within a shard. Throughput: 1 MB/s in, 2 MB/s out per shard.
  • SQS — message queue. Each message consumed by one consumer (unless fan-out via SNS). Messages deleted after processing. Better for decoupling microservices.
  • Use Kinesis when multiple consumers need to process the same stream (analytics + archiving + ML simultaneously). Use SQS for task queues where each message is processed once.
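
The per-shard limits above translate directly into a sizing estimate. This sketch considers only the MB/s limits quoted here; real sizing also has to account for per-shard record-count limits and for how consumers share read throughput.

```python
import math

INGEST_MB_PER_SHARD = 1.0  # 1 MB/s in per shard, as quoted above
READ_MB_PER_SHARD = 2.0    # 2 MB/s out per shard, as quoted above


def shards_needed(ingest_mb_per_s, read_mb_per_s):
    """Smallest shard count that satisfies both the write and the read rate."""
    return max(
        1,
        math.ceil(ingest_mb_per_s / INGEST_MB_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )


# Hypothetical stream: 5 MB/s written, 14 MB/s read in total by all consumers.
print(shards_needed(5, 14))
```

In this example the read side is the bottleneck, so the stream needs 7 shards rather than the 5 that ingest alone would suggest.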
35

What is AWS Step Functions and what are the two workflow types?

AWS Step Functions is a serverless orchestration service that coordinates multiple AWS services into workflows using state machines. Two workflow types:

  • Standard Workflows — long-running (up to 1 year), exactly-once execution, full execution history. Use for business-critical workflows, order processing, data pipelines.
  • Express Workflows — high-volume, short-lived (up to 5 minutes), at-least-once execution, lower cost. Use for IoT data ingestion, streaming data processing, high-frequency event processing.
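
A state machine is defined in Amazon States Language, a JSON format. The sketch below builds a minimal two-step definition as a Python dictionary and checks that every transition points at a real state. The Lambda ARNs and state names are hypothetical.

```python
import json

order_workflow = {
    "Comment": "Minimal order-processing sketch",
    "StartAt": "ReserveInventory",
    "States": {
        "ReserveInventory": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111111111111:function:reserve-inventory",
            "Next": "ChargePayment",
        },
        "ChargePayment": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:111111111111:function:charge-payment",
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}


def dangling_transitions(definition):
    """Return transition targets (StartAt or Next) that name no defined state."""
    states = definition["States"]
    targets = [definition["StartAt"]]
    targets += [state["Next"] for state in states.values() if "Next" in state]
    return [target for target in targets if target not in states]


if __name__ == "__main__":
    print(json.dumps(order_workflow, indent=2))
    print(dangling_transitions(order_workflow))
```

The same definition can be used for either workflow type; whether it runs as Standard or Express is chosen when the state machine is created.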
36

What is Amazon ECS and what is the difference between EC2 launch type and Fargate launch type?

  • EC2 launch type — you provision and manage EC2 instances in the cluster. More control over instance type, OS, and networking. You pay for EC2 instances whether containers are running or not.
  • Fargate launch type — serverless. AWS manages the underlying infrastructure. You define CPU and memory per task. Pay only for resources containers use. No cluster management.
  • Use EC2 launch type for cost optimization at scale or when you need specific instance types (GPU). Use Fargate for simplicity, variable workloads, or when you want zero infrastructure management.
37

What is Amazon CloudWatch and what are its main components?

  • Metrics — time-series data points from AWS services and custom applications. Default metrics (CPU, network) are free; detailed monitoring (1-minute intervals) costs extra.
  • Alarms — trigger actions (SNS notification, Auto Scaling, EC2 action) when a metric crosses a threshold. States: OK, ALARM, INSUFFICIENT_DATA.
  • Logs — centralized log storage. Log Groups contain Log Streams. Use CloudWatch Logs Insights for SQL-like queries.
  • Dashboards — customizable visualizations of metrics across services and regions.
  • Events/EventBridge — respond to state changes in AWS resources or schedule automated actions (cron).
38

What is AWS Systems Manager Session Manager and why is it preferred over a Bastion Host?

  • Session Manager provides browser-based or CLI shell access to EC2 instances without opening inbound ports, managing SSH keys, or maintaining a Bastion Host.
  • No need to open port 22 (SSH) or 3389 (RDP) in security groups — reduces attack surface.
  • Session start and end events are recorded in CloudTrail, and session activity (commands and output) can optionally be streamed to S3 or CloudWatch Logs for auditing.
  • Works for instances in private subnets with no internet access (via VPC endpoint for SSM).
  • Bastion Hosts require managing EC2 instances, SSH keys, and open inbound ports — more operational overhead and security risk.
39

What is Amazon S3 Object Lock and what are the two retention modes?

S3 Object Lock implements WORM (Write Once Read Many) protection, preventing objects from being deleted or overwritten for a specified retention period. Two modes:

  • Compliance Mode — no user, including the root account, can delete or overwrite the object until the retention period expires. Cannot be shortened. Use for strict regulatory compliance (SEC, FINRA).
  • Governance Mode — users with special IAM permissions (s3:BypassGovernanceRetention) can override or delete. More flexible. Use for internal data protection policies.
  • Legal Hold — a separate control rather than a third retention mode. Independent of any retention period. Prevents deletion until the hold is removed. No expiry date.
40

What is the difference between AWS CloudTrail and Amazon CloudWatch?

  • CloudTrail — records WHO did WHAT and WHEN. Logs API calls made to AWS services (console, CLI, SDK): management events by default, with data events such as S3 object-level operations available as an opt-in. Used for security auditing, compliance, and forensic investigation. Answers: "Who deleted that S3 bucket?"
  • CloudWatch — monitors HOW resources are performing. Collects metrics, logs, and events. Used for operational monitoring, alerting, and auto-scaling. Answers: "Is my CPU too high?"
  • Use both together: CloudWatch for operational health, CloudTrail for security and compliance auditing.
41

What is Amazon GuardDuty and what data sources does it analyze?

Amazon GuardDuty is an intelligent threat detection service that uses machine learning and threat intelligence to identify malicious activity and unauthorized behavior in your AWS account. It analyzes:

  • AWS CloudTrail Events — detects unusual API calls, unauthorized deployments.
  • VPC Flow Logs — detects unusual network traffic, port scanning, data exfiltration.
  • DNS Logs — detects communication with known malicious domains.
  • S3 Data Events — detects unusual S3 access patterns (optional).
  • GuardDuty requires no agents, no software to deploy. Enable with one click. Findings sent to Security Hub and EventBridge for automated remediation.
42

What is Amazon Cognito and what is the difference between User Pools and Identity Pools?

  • User Pools — user directory for authentication. Handles sign-up, sign-in, MFA, password policies, social identity providers (Google, Facebook, Apple), and SAML federation. Returns JWT tokens (ID, access, refresh).
  • Identity Pools (Federated Identities) — provides temporary AWS credentials (via STS) to authenticated or unauthenticated users to access AWS services directly (S3, DynamoDB, API Gateway).
  • Common pattern: User Pool authenticates the user → Identity Pool exchanges the JWT for temporary AWS credentials → user accesses AWS resources directly.
43

What is AWS WAF and what types of rules can you create?

AWS WAF (Web Application Firewall) protects web applications from common exploits. It works with CloudFront, ALB, API Gateway, and AppSync. Rule types:

  • IP Set Rules — allow or block requests from specific IP addresses or CIDR ranges.
  • Geo Match Rules — block or allow requests from specific countries.
  • Rate-Based Rules — automatically block IPs that exceed a request rate threshold (DDoS protection).
  • Managed Rule Groups — pre-built rules from AWS or AWS Marketplace (OWASP Top 10, SQL injection, XSS, bot control).
  • String/Regex Match Rules — inspect request components (URI, headers, body, query string) for specific patterns.
44

What is Amazon Redshift and when would you use it instead of RDS?

  • Redshift — columnar storage, massively parallel processing (MPP) data warehouse. Optimized for OLAP (Online Analytical Processing) — complex queries over large datasets (petabytes). Use for business intelligence, reporting, and analytics.
  • RDS — row-based storage optimized for OLTP (Online Transaction Processing) — frequent reads/writes of individual records. Use for application databases.
  • Redshift Spectrum — query data directly in S3 without loading it into Redshift. Use for data lake analytics.
  • Rule of thumb: RDS for your application database; Redshift for your analytics warehouse.
45

What is Amazon EMR and when would you use it?

Amazon EMR (Elastic MapReduce) is a managed big data platform that runs Apache Hadoop, Spark, Hive, HBase, Flink, and other frameworks on EC2 clusters. Use EMR for:

  • Large-scale data processing and transformation (ETL).
  • Machine learning at scale using Spark MLlib.
  • Log analysis and clickstream analytics.
  • Genomics and scientific computing.
  • EMR stores data in S3 (EMRFS) or HDFS. Use Spot Instances for task nodes to reduce cost by up to 90%.
46

What is Amazon Athena and what are its key characteristics?

Amazon Athena is a serverless interactive query service that analyzes data directly in Amazon S3 using standard SQL. Key characteristics:

  • Serverless — no infrastructure to manage. Pay per query ($5 per TB of data scanned).
  • Supports multiple formats — CSV, JSON, ORC, Avro, Parquet. Use columnar formats (Parquet, ORC) and partitioning to reduce data scanned and lower costs.
  • Integrates with AWS Glue Data Catalog for schema management.
  • Use for ad-hoc queries on S3 data lakes, log analysis, and cost-effective analytics without loading data into a database.
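
Since billing is per terabyte scanned, the effect of columnar formats is easy to estimate. The 10% scan fraction below is an assumed figure for a query that touches a few columns of a Parquet table; actual savings depend on the data and the query.

```python
PRICE_PER_TB_USD = 5.0  # per-TB price quoted above


def athena_query_cost(tb_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Cost of one query given the amount of data it scans."""
    return tb_scanned * price_per_tb


dataset_tb = 2.0
full_scan_cost = athena_query_cost(dataset_tb)          # e.g. CSV: every byte is read
pruned_scan_cost = athena_query_cost(dataset_tb * 0.1)  # assumed: 10% of the bytes read
print(full_scan_cost, pruned_scan_cost)
```

Under these assumptions the same query drops from about $10 to about $1, which is why partitioning and Parquet or ORC are the first optimizations to reach for.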
47

What is AWS Glue and what is the Glue Data Catalog?

  • AWS Glue — serverless ETL (Extract, Transform, Load) service. Automatically discovers data schemas (Glue Crawlers), generates ETL code (Python/Scala), and runs ETL jobs on a managed Spark environment.
  • Glue Data Catalog — central metadata repository. Stores table definitions, schemas, and partition information for data in S3, RDS, Redshift, and other sources. Used by Athena, EMR, and Redshift Spectrum as a unified metadata store.
  • Use Glue for data lake ETL pipelines. Use the Data Catalog as the single source of truth for your data schemas.
48

What are the differences between the AWS Elastic Beanstalk deployment policies?

  • All at once — deploys to all instances simultaneously. Fastest. Causes downtime. Use for dev/test.
  • Rolling — deploys in batches. Reduced capacity during deployment. No downtime. Slower.
  • Rolling with additional batch — launches new instances first, then deploys in batches. Maintains full capacity. No downtime.
  • Immutable — launches a new set of instances with the new version. Zero downtime. Fastest rollback. Most expensive.
  • Blue/Green — not a deployment policy setting but a technique: deploy to a separate environment, then swap CNAMEs. Zero downtime. Easy rollback. Use for production.
49

What is Amazon CloudFront Origin Access Control (OAC) and why is it important?

Origin Access Control (OAC) is the recommended way to restrict S3 bucket access so that only CloudFront can read from it — preventing users from bypassing CloudFront and accessing S3 directly. With OAC:

  • The S3 bucket policy allows only the specific CloudFront distribution to access objects.
  • The S3 bucket can remain private (Block Public Access enabled).
  • Users must go through CloudFront, ensuring WAF rules, signed URLs, and caching policies are enforced.
  • OAC replaces the older Origin Access Identity (OAI) and supports all S3 regions, SSE-KMS encryption, and dynamic requests.
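
The bucket policy that goes with OAC typically grants read access to the CloudFront service principal and pins it to one distribution with a source-ARN condition. The sketch below builds such a policy. The bucket name, account ID, and distribution ID are placeholders.

```python
import json


def oac_bucket_policy(bucket_name, distribution_arn):
    """Bucket policy allowing reads only from one CloudFront distribution."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
            }
        ],
    }


policy = oac_bucket_policy(
    "example-static-site",
    "arn:aws:cloudfront::111111111111:distribution/EXAMPLEDISTID",
)

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

Because the condition names a single distribution, a different CloudFront distribution, even one in the same account, cannot read from the bucket.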
50

What is the difference between AWS Config and AWS CloudTrail?

  • CloudTrail — records API activity (who did what, when). Answers: "Who changed this security group?" Event-based audit log.
  • AWS Config — records resource configuration state over time. Answers: "What did this security group look like last Tuesday?" Evaluates compliance against Config Rules and can trigger auto-remediation via SSM Automation.
  • Use CloudTrail for security auditing and forensics. Use Config for compliance, configuration history, and drift detection.
