DevOps Interview Questions
By: zigmoid
Posted on: 07/10/2025
2️⃣ What are your day-to-day activities as a DevOps Engineer?
- Morning stand-ups — you know, pretending you know what you’re doing 😆
- Monitoring infra: checking dashboards, alerts.
- Managing CI/CD pipelines: debugging failed builds because someone pushed broken YAML at 2 AM.
- Writing/maintaining IaC (Terraform, CloudFormation).
- Patching servers, rotating secrets, yelling at Jenkins.
- Automating repetitive tasks with Ansible/Bash/Python.
- Reviewing logs, scaling clusters, keeping K8s happy.
- Helping devs deploy code that worked on their machine.
- Firefighting incidents — PagerDuty is your frenemy.
3️⃣ What are NAT Gateways?
- NAT = Network Address Translation.
- In AWS or cloud infra, a NAT Gateway lets instances in a private subnet access the internet (for updates, repo pulls, etc.) without exposing them directly to incoming traffic.
- They translate private IP → public IP for outbound traffic.
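Here is a rough sketch of wiring that up with the AWS CLI. The subnet and route table IDs are hypothetical placeholders you pass in, and the `AWS` variable is just a convenience added here so the command can be swapped out when testing; treat it as an outline, not a production script.

```bash
#!/usr/bin/env bash
# Sketch: give a private subnet outbound internet access through a NAT Gateway.
# AWS can be overridden (e.g. with a stub) for testing; this is an assumption
# of the sketch, not an AWS CLI feature.
AWS=${AWS:-aws}

create_nat_route() {
  local public_subnet_id="$1" private_route_table_id="$2"
  local eip_id nat_id

  # 1. Allocate an Elastic IP: the public address outbound traffic is translated to.
  eip_id=$($AWS ec2 allocate-address --domain vpc \
    --query AllocationId --output text) || return 1

  # 2. Create the NAT Gateway inside a public subnet.
  nat_id=$($AWS ec2 create-nat-gateway --subnet-id "$public_subnet_id" \
    --allocation-id "$eip_id" \
    --query NatGateway.NatGatewayId --output text) || return 1

  # 3. Wait until the gateway is ready to carry traffic.
  $AWS ec2 wait nat-gateway-available --nat-gateway-ids "$nat_id" || return 1

  # 4. Send the private subnet's default route through the NAT Gateway.
  $AWS ec2 create-route --route-table-id "$private_route_table_id" \
    --destination-cidr-block 0.0.0.0/0 --nat-gateway-id "$nat_id"
}
```

Call it as `create_nat_route <public-subnet-id> <private-route-table-id>`. There is no inbound path: nothing on the internet can open a connection to the private instances through the gateway.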
4️⃣ What are the prerequisites for upgrading a K8s cluster?
- Backup everything: etcd snapshots, manifests, secrets.
- Check version compatibility: upgrade one minor version at a time and review the release notes for removed or deprecated APIs.
- Upgrade `kubectl` & client tools first.
- Upgrade master nodes before worker nodes.
- Test on a staging cluster.
- Drain nodes properly.
- Make sure all addons/CRDs are compatible.
- Plan rollback.
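A couple of these checks are easy to script. The sketch below assumes a kubeadm-style cluster, where skipping minor versions is not supported; the function names are made up for illustration.

```bash
#!/usr/bin/env bash
# Returns 0 when going from version $1 to version $2 stays within one minor
# release (e.g. 1.27.x -> 1.28.y). Accepts versions with or without a "v".
can_upgrade() {
  local cur="${1#v}" tgt="${2#v}"
  local cur_major="${cur%%.*}" tgt_major="${tgt%%.*}"
  local cur_rest="${cur#*.}" tgt_rest="${tgt#*.}"
  local cur_minor="${cur_rest%%.*}" tgt_minor="${tgt_rest%%.*}"

  [ "$cur_major" = "$tgt_major" ] || return 1
  local step=$(( $tgt_minor - $cur_minor ))
  # Downgrades (negative step) and skipped minors (step > 1) are rejected.
  [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

# Evict pods from a node before upgrading it; DaemonSet pods stay put.
drain_node() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Let the scheduler place pods on the node again after the upgrade.
uncordon_node() {
  kubectl uncordon "$1"
}
```

For example, `can_upgrade v1.27.3 v1.28.0 && drain_node worker-1` only drains the (hypothetical) node `worker-1` when the version jump is a single minor step.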
5️⃣ What is a Pod Disruption Budget (PDB) in K8s?
- A PDB limits how many pods can be voluntarily disrupted at once.
- Voluntary disruptions include node drains and cluster upgrades.
- E.g. `maxUnavailable: 1` → at most one pod may be down at a time; use `minAvailable: 1` to always keep at least one pod running.
- Helps maintain service availability during maintenance.
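A minimal sketch of what such a budget looks like, wrapped in a small shell helper so it can be piped straight into `kubectl`. The `app` label and the `-pdb` name suffix are assumptions for illustration; match the selector to whatever labels your pods actually carry.

```bash
#!/usr/bin/env bash
# Print a PodDisruptionBudget manifest for pods labelled app=<name>.
# $1 = app label value, $2 = maxUnavailable (defaults to 1).
pdb_manifest() {
  local app="$1" max_unavailable="${2:-1}"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${app}-pdb
spec:
  maxUnavailable: ${max_unavailable}
  selector:
    matchLabels:
      app: ${app}
EOF
}
```

Usage: `pdb_manifest web 1 | kubectl apply -f -`. With this in place, `kubectl drain` evicts the matching pods one at a time instead of all at once.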
6️⃣ Shell script for factorial:
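```bash
#!/bin/bash
# Compute the factorial of a non-negative integer read from stdin.
echo "Enter a number:"
read -r num
# Reject anything that is not a non-negative integer.
if ! [[ $num =~ ^[0-9]+$ ]]; then
  echo "Please enter a non-negative integer." >&2
  exit 1
fi
num=$((10#$num))  # force base 10 so input like "08" is not read as octal
fact=1
for (( i = 2; i <= num; i++ )); do
  fact=$(( fact * i ))
done
# Bash integers are 64-bit, so results overflow for inputs above 20.
echo "Factorial of $num is $fact"
```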
7️⃣ Tell me about VPC structure setup in your project.
- Multiple VPCs for dev, staging, prod.
- Public & private subnets across multiple AZs.
- Internet Gateway for public subnets.
- NAT Gateway in a public subnet, giving private subnets outbound internet access.
- Route tables: public subnets → IGW, private subnets → NAT.
- Security groups & NACLs.
- Peering for cross-VPC communication.
8️⃣ CI/CD pipeline & security tools integrated?
- Git → Jenkins/GitLab → Docker build → push to ECR/ACR → deploy to K8s.
- Stages: lint, test, build, scan, deploy.
- Security tools:
  - Snyk/Trivy for container image scanning.
  - SonarQube for code quality.
  - HashiCorp Vault for secrets.
  - Static code analysis.
  - Secrets detection (like git-secrets).
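As one example of the scan stage, here is a sketch of a gate that fails the pipeline when Trivy reports serious findings. The `TRIVY` override variable and the function name are assumptions added for testability; the `--severity` and `--exit-code` flags are standard Trivy options.

```bash
#!/usr/bin/env bash
# TRIVY can be overridden (e.g. with a stub) when testing the pipeline script.
TRIVY=${TRIVY:-trivy}

# Fail the stage if the image has HIGH or CRITICAL vulnerabilities.
scan_image() {
  local image="$1"
  # --exit-code 1 makes trivy return non-zero when matching findings exist.
  if $TRIVY image --severity HIGH,CRITICAL --exit-code 1 "$image"; then
    echo "scan passed: $image"
  else
    echo "scan failed or found HIGH/CRITICAL issues: $image" >&2
    return 1
  fi
}
```

In a Jenkinsfile this would sit between the build and push steps, so a vulnerable image never reaches the registry.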
9️⃣ How do you manage them?
- Pipelines as code (Jenkinsfile/GitLab CI).
- Version-controlled scripts.
- RBAC for pipeline access.
- Rotate creds, use Vault.
- Automated rollback on failure.
- Dashboards for status.
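The "automated rollback" bit can be as simple as the sketch below, assuming the service runs as a Kubernetes Deployment. The function name, the default timeout, and the `KUBECTL` override are illustrative choices, not a standard.

```bash
#!/usr/bin/env bash
# KUBECTL can be overridden (e.g. with a stub) when testing.
KUBECTL=${KUBECTL:-kubectl}

# Wait for a Deployment rollout to finish; undo it if it fails or times out.
# $1 = deployment name, $2 = timeout (defaults to 120s).
deploy_or_rollback() {
  local deployment="$1" timeout="${2:-120s}"
  if $KUBECTL rollout status "deployment/$deployment" --timeout="$timeout"; then
    echo "rollout of $deployment succeeded"
  else
    echo "rollout of $deployment failed, rolling back" >&2
    # Revert to the previous ReplicaSet revision.
    $KUBECTL rollout undo "deployment/$deployment"
    return 1
  fi
}
```

The function still returns non-zero after a rollback, so the pipeline stage is marked failed and somebody gets notified.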
🔟 Rough pipeline script for microservices arch (pseudo Jenkinsfile):
```groovy
pipeline {
    agent any
    stages {
        stage('Checkout') {
            steps {
                // Placeholder repository URL.
                git 'https://repo.url'
            }
        }
        stage('Build') {
            steps {
                // Compile and package the service with Maven.
                sh 'mvn clean package'
            }
        }
        stage('Docker Build & Push') {
            steps {
                // Build the image, tag it for the registry, and push it.
                sh '''
                    docker build -t myapp:latest .
                    docker tag myapp:latest repo/myapp:latest
                    docker push repo/myapp:latest
                '''
            }
        }
        stage('Deploy to K8s') {
            steps {
                // Apply the Deployment manifest to the cluster.
                sh 'kubectl apply -f k8s/deployment.yaml'
            }
        }
    }
}
```
1️⃣1️⃣ What is multi-stage Docker build?
- Multiple `FROM` instructions in a Dockerfile.
- Compile in one stage, copy only the final build/artifact to the final image.
- Reduces image size.
- E.g. build a Go binary in `golang:alpine` → copy to `scratch` or `alpine`.
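To make that concrete, here is a sketch of a helper that writes a two-stage Dockerfile for a Go service. The paths (`/src`, `/out/app`) and the stage name `build` are arbitrary choices for illustration, and it assumes the Go module lives at the root of the build context.

```bash
#!/usr/bin/env bash
# Write a two-stage Dockerfile for a Go service into the given directory.
write_multistage_dockerfile() {
  local dir="${1:-.}"
  cat > "$dir/Dockerfile" <<'EOF'
# Stage 1: compile with the full Go toolchain.
FROM golang:alpine AS build
WORKDIR /src
COPY . .
# CGO_ENABLED=0 yields a static binary that can run on scratch.
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: ship only the compiled binary.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
EOF
}
```

Run `write_multistage_dockerfile .` and then `docker build -t myapp .`; the final image contains the binary but none of the compiler or source files from the first stage.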
1️⃣2️⃣ What are manifest files?
- YAML or JSON files that define K8s resources: Deployments, Services, ConfigMaps, etc.
- They describe the desired state.
- Example: `deployment.yaml` declares how many replicas, containers, env vars, volumes.
1️⃣3️⃣ What is Ansible Vault?
- Encrypts sensitive data: passwords, API keys.
- Encrypt a file with `ansible-vault encrypt secrets.yml`.
- Decrypt at runtime or when editing.
- Keeps secrets out of plain text.
1️⃣4️⃣ How to make a K8s cluster highly available?
- Multiple master nodes spread across AZs.
- External etcd cluster with odd number of members.
- Load balancer in front of API servers.
- Worker nodes spread across AZs.
- Use anti-affinity rules for pods.
- Backups for etcd.
1️⃣5️⃣ Monitoring tools & common pod errors:
- Tools: Prometheus, Grafana, ELK/EFK, Alertmanager, Datadog.
- Alerts on CPU, memory, pod restarts.
- Common pod headaches:
  - CrashLoopBackOff → bad configs, failed init containers.
  - ImagePullBackOff → wrong image tag, missing creds.
  - Pending → insufficient node resources.
  - OOMKilled → container ran out of memory.
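For each of those, there is usually one `kubectl` command that gives the next clue. The little lookup below is a sketch of a personal cheat sheet, not an official mapping; the function name is made up.

```bash
#!/usr/bin/env bash
# Print the kubectl command that usually gives the next clue for a pod status.
# $1 = pod status, $2 = pod name.
triage_hint() {
  local status="$1" pod="$2"
  case "$status" in
    CrashLoopBackOff)
      # Logs of the previous (crashed) container instance.
      echo "kubectl logs $pod --previous" ;;
    ImagePullBackOff|ErrImagePull|Pending|OOMKilled)
      # Events and last container state show pull errors, scheduling
      # failures, and the OOMKilled termination reason.
      echo "kubectl describe pod $pod" ;;
    *)
      # Fall back to recent events in the current namespace.
      echo "kubectl get events --sort-by=.lastTimestamp" ;;
  esac
}
```

For example, `triage_hint CrashLoopBackOff web-1` prints `kubectl logs web-1 --previous`.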
1️⃣6️⃣ Terraform script for VPC (rough):
```hcl
provider "aws" {
  region = "ap-south-1"
}

resource "aws_vpc" "prod_vpc" {
  cidr_block = "10.0.0.0/16"
}

# Public subnet: instances get a public IP at launch.
resource "aws_subnet" "public" {
  vpc_id                  = aws_vpc.prod_vpc.id
  cidr_block              = "10.0.1.0/24"
  map_public_ip_on_launch = true
}

resource "aws_internet_gateway" "gw" {
  vpc_id = aws_vpc.prod_vpc.id
}

# Default route for the public subnet goes through the Internet Gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.prod_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.gw.id
  }
}

resource "aws_route_table_association" "a" {
  subnet_id      = aws_subnet.public.id
  route_table_id = aws_route_table.public.id
}
```
1️⃣7️⃣ How many objects can an S3 bucket store?
- Unlimited.
- Seriously, AWS will gladly take your money for trillions of objects.
1️⃣8️⃣ IAM Roles and Policies?
- Roles: Temporary permissions, assumed by users/services.
- Policies: JSON docs that define what actions are allowed or denied.
- Policies attach to roles, users, or groups.
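A quick sketch of how the two fit together: a trust policy says who may assume the role, and a permissions policy says what the role may do. The helper name and the role name `app-role` below are hypothetical.

```bash
#!/usr/bin/env bash
# Print a trust policy that lets an AWS service assume a role.
# $1 = service principal (defaults to EC2).
trust_policy() {
  local service="${1:-ec2.amazonaws.com}"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "${service}" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Usage (role name is a placeholder):
#   trust_policy ec2.amazonaws.com > trust.json
#   aws iam create-role --role-name app-role \
#     --assume-role-policy-document file://trust.json
#   aws iam attach-role-policy --role-name app-role \
#     --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```

The role itself holds no long-lived credentials; whoever assumes it gets temporary ones from STS.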
1️⃣9️⃣ What are artifacts?
- Build outputs: binaries, Docker images, JARs, config packages.
- Stored for deploy/reuse.
- E.g. `.jar` file from Maven build → uploaded to Nexus/Artifactory.
2️⃣0️⃣ SATS and DATS?
- Trick one — are you asking about ‘Stateful Application Tests’ and ‘Data Application Tests’?
- Or SAT = System Acceptance Test, DAT = Data Acceptance Test?
- Or you mean System Acceptance Testing (SAT) and Design Acceptance Testing (DAT) — both are QA/validation phases.
2️⃣1️⃣ How do you find errors in pipelines?
- Logs. Lots of logs.
- Jenkins/GitLab has logs for each stage.
- Debug with `echo` or `set -x`.
- Look at failed step’s stdout/stderr.
- Use pipeline notifications & Slack/Webhooks.
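For shell-based stages, a tiny wrapper like the sketch below makes the failing step obvious in the log. `run_step` is a made-up helper name, not a Jenkins or GitLab feature.

```bash
#!/usr/bin/env bash
# Run one pipeline step with command tracing and report which step failed.
# $1 = step name, remaining arguments = the command to run.
run_step() {
  local name="$1"
  shift
  echo "==> $name"
  # The subshell keeps `set -x` from leaking into later steps;
  # the trace itself is written to stderr.
  if ( set -x; "$@" ); then
    echo "OK: $name"
  else
    local rc=$?
    echo "FAILED: $name (exit code $rc)" >&2
    return "$rc"
  fi
}
```

Something like `run_step build mvn clean package && run_step test mvn verify` stops at the first failing step and passes its exit code back to the CI runner.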
2️⃣2️⃣ What are Ansible Roles?
- Pre-structured way to organize playbooks.
- Roles = reusable units: tasks, vars, handlers, templates, files.
- Example:
```
roles/
  nginx/
    tasks/main.yml
    handlers/main.yml
    templates/nginx.conf.j2
```
- Makes your playbooks modular, reusable, DRY.
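If you want to lay out a role skeleton by hand, a sketch like this does it. The function name and the choice of sub-directories are illustrative; `ansible-galaxy init <role>` generates a fuller skeleton for you.

```bash
#!/usr/bin/env bash
# Create the conventional directory skeleton for an Ansible role.
# $1 = role name, $2 = base directory (defaults to ./roles).
scaffold_role() {
  local role="$1" base="${2:-roles}" d
  for d in tasks handlers templates files vars; do
    mkdir -p "$base/$role/$d"
  done
  # Ansible picks up main.yml from these directories automatically,
  # so start each one as an empty YAML document.
  for d in tasks handlers vars; do
    printf '%s\n' '---' > "$base/$role/$d/main.yml"
  done
}
```

After `scaffold_role nginx`, a playbook can pull the role in with `roles: [nginx]` once the tasks are filled in.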