In an e-commerce platform, the checkout service is a critical component responsible for processing orders. This service must handle tasks like validating shopping cart items, calculating prices, managing inventory, and processing payments.
Due to seasonal peaks (e.g., Black Friday, holiday sales), the checkout service needs to be scalable, resilient, and responsive. In this case study, we’ll analyze how to determine the optimal number of pods for the checkout service to handle both regular and peak traffic loads, ensuring smooth customer experiences and minimizing latency during high-demand periods.

Requirements for the Checkout Service

High Availability and Fault Tolerance

  • The service should be distributed across multiple nodes to avoid a single point of failure.
  • It should handle failures seamlessly to maintain a smooth user experience.

Scalability

  • The system should be able to scale up during peak demand (e.g., sale events) and scale down during off-peak periods to save costs.

Low Latency

  • As the checkout process directly affects user satisfaction, it must respond quickly, ideally under 200 ms for 95% of transactions.

High Throughput

  • It should be able to handle a high number of concurrent users during peak times without degradation in performance.

Service Level Objectives (SLOs)

  • Response time <200 ms for 95% of requests.
  • Availability >99.9% for the checkout service.

Step 1: Profiling the Workload

Normal Traffic Load:

  • During regular hours, the checkout service handles around 500 requests per second.

Peak Traffic Load:

  • During peak sale events, traffic can increase by 10x, reaching up to 5000 requests per second.

Average Resource Consumption Per Request:

  • CPU Usage: 0.2 CPU cores
  • Memory Usage: 250 MB
  • Latency target: <200 ms

Load testing shows that each pod can process up to 50 requests per second under the typical configuration without exceeding the latency target or its CPU limits.
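Per-pod capacity is tied to the CPU and memory requests each pod is granted, and the HPA's utilization target (introduced below) is computed relative to those requests. A sketch of the resources section of the checkout container spec, assuming each pod is sized for roughly 50 requests/second; the specific values here are illustrative assumptions, not measured figures:

  resources:
    requests:
      cpu: "1"        # illustrative: headroom for ~50 req/s at a 60% utilization target
      memory: "1Gi"   # illustrative: per-pod working set plus buffer
    limits:
      cpu: "2"
      memory: "2Gi"

In practice these values would be derived from the load tests above, since setting requests too low makes utilization-based autoscaling trigger too late.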

Step 2: Estimating Pod Count

Base Pod Count Calculation

Normal Load Requirement

Pods required = normal requests per second ÷ requests per pod per second = 500 / 50 = 10 pods.

Peak Load Requirement

Pods required = peak requests per second ÷ requests per pod per second = 5000 / 50 = 100 pods.

Establishing Pod Autoscaling Ranges

  • Minimum Pods: 10 (for normal traffic)
  • Maximum Pods: 100 (for peak traffic)

By configuring Kubernetes' Horizontal Pod Autoscaler (HPA), we allow the checkout service to dynamically adjust the number of pods between these values based on actual CPU usage and traffic.

Step 3: Implementing Autoscaling

Kubernetes’ Horizontal Pod Autoscaler (HPA) can scale the number of pods based on observed metrics, typically CPU utilization, or custom metrics such as request rate or latency.

Autoscaling Configuration for the Checkout Service

CPU-Based Autoscaling

  • Set target CPU utilization to 60%. This threshold ensures that as traffic rises, the number of pods will increase to maintain optimal CPU usage.

Latency-Based Autoscaling (if supported)

  • Use custom metrics if available to trigger autoscaling when request latency approaches 200 ms, adding more pods to reduce latency.

Example HPA Configuration

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-service
  minReplicas: 10
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
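If a metrics pipeline such as the Prometheus Adapter exposes request latency as a custom pod metric, a Pods-type entry can be added alongside the CPU metric. A sketch, assuming a custom metric named http_request_duration_p95_ms is available; the metric name and its availability are assumptions that depend on your adapter configuration:

  - type: Pods
    pods:
      metric:
        name: http_request_duration_p95_ms   # assumed custom metric exposed via an adapter
      target:
        type: AverageValue
        averageValue: "150"                  # scale out before the 200 ms SLO is at risk

The HPA takes the highest replica count suggested by any configured metric, so the latency metric acts as a safety net on top of CPU-based scaling.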

Step 4: Ensuring High Availability

To improve availability and prevent single-point failures:

Pod Distribution Across Zones

  • Ensure that pods are spread across multiple availability zones to minimize the impact of zone-specific outages.
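One way to express this zone spreading, assuming nodes carry the standard topology.kubernetes.io/zone label, is a topology spread constraint in the Deployment's pod template (a sketch of one option, not the only mechanism):

  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway   # prefer spreading, but do not block scheduling
    labelSelector:
      matchLabels:
        app: checkout-service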

Pod Disruption Budget (PDB)

  • Configure PDBs to maintain a minimum number of available pods during rolling updates or node failures.

Pod Anti-Affinity

  • Use pod anti-affinity to spread pods across different nodes and prevent them from clustering on a single node.
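A sketch of preferred pod anti-affinity in the Deployment's pod template, which asks the scheduler to avoid co-locating checkout pods on the same node without hard-blocking placement when nodes are scarce:

  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname   # spread across distinct nodes
          labelSelector:
            matchLabels:
              app: checkout-service

Preferred (soft) anti-affinity is usually safer than the required (hard) form here, since 100 pods at peak may exceed the number of available nodes.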

Example Pod Disruption Budget

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  minAvailable: 80%
  selector:
    matchLabels:
      app: checkout-service

Step 5: Monitoring and Optimization

Monitoring Metrics:

  • Use monitoring tools (e.g., Prometheus, Grafana) to track CPU, memory, and latency metrics for the checkout service.
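As a sketch of how the 200 ms / p95 SLO could be watched in Prometheus, assuming the service exports a standard http_request_duration_seconds histogram (the metric and label names are assumptions that depend on your instrumentation):

  groups:
  - name: checkout-slo
    rules:
    - alert: CheckoutP95LatencyHigh
      expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{service="checkout-service"}[5m])) by (le)) > 0.2
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Checkout p95 latency above 200 ms"

An alert like this also gives an early signal that the HPA thresholds in Step 3 need revisiting.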

Autoscaling Adjustment:

  • Periodically review the scaling thresholds and adjust HPA configurations based on observed traffic patterns and resource usage.

Cost Optimization:

  • Evaluate usage patterns to fine-tune resource requests and limits, balancing cost with performance.
By using Kubernetes autoscaling, resource requests, and high-availability configurations, the e-commerce checkout service can dynamically adjust its pod count to handle variable traffic loads, from normal operation to peak sales.

This approach not only ensures that the checkout service meets latency and availability requirements but also helps in managing costs efficiently by scaling down during off-peak times.

This scalable, responsive architecture makes the checkout service resilient and ensures a seamless shopping experience for users, even during high-demand events.