Pinterest's big data infrastructure runs frameworks such as MapReduce™, Spark™, and Flink™ on Hadoop™ YARN™, across multiple clusters on AWS that use Auto Scaling Groups.
Challenges included limited IP address availability, the cost of running parallel clusters, and the risk of switching applications over to new clusters.
The migration strategy was to migrate in place by introducing a new Auto Scaling Group (ASG) with the desired configuration.
Hadoop Control Center (HCC) was introduced to automate cluster operations and simplify scaling processes like scale-in and scale-out.
HCC streamlines cluster administration and provides tools for resizing ASGs, monitoring nodes, reporting, and managing Hadoop-related tasks.
HCC's architecture consists of a main manager node and worker nodes in each VPC, and it automates the scale-in process to make it more efficient.
HCC's decommissioning process involves queues, instance selection based on container count, and monitoring to prevent data loss during scale-in.
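
The selection-and-queue idea above can be illustrated with a minimal Python sketch. It is not HCC's actual code: the `Node` type, the function names, and the `is_safe` check are hypothetical, and it assumes that "based on container count" means nodes with the fewest running containers are decommissioned first.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable, List


@dataclass(frozen=True)
class Node:
    """Hypothetical view of a cluster node as seen by the scaler."""
    instance_id: str
    running_containers: int


def select_for_decommission(nodes: Iterable[Node], count: int) -> List[Node]:
    """Pick `count` nodes, preferring those with the fewest running containers.

    Assumption: the least busy nodes are the cheapest to drain.
    Ties are broken by instance id so the result is deterministic.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    ranked = sorted(nodes, key=lambda n: (n.running_containers, n.instance_id))
    return ranked[:count]


def drain_once(queue: Deque[Node], is_safe: Callable[[Node], bool]) -> List[Node]:
    """Make one pass over the decommission queue.

    Nodes that pass the safety check are returned as ready to terminate.
    Nodes that fail it go back on the queue for a later pass.
    """
    ready = []
    for _ in range(len(queue)):
        node = queue.popleft()
        if is_safe(node):
            ready.append(node)
        else:
            queue.append(node)
    return ready


if __name__ == "__main__":
    nodes = [Node("i-a", 5), Node("i-b", 0), Node("i-c", 2)]
    queue = deque(select_for_decommission(nodes, 2))
    # Hypothetical safety check: a node is safe once it runs no containers.
    ready = drain_once(queue, lambda n: n.running_containers == 0)
    print([n.instance_id for n in ready], [n.instance_id for n in queue])
```

In a real system the safety check would also have to cover whatever data the node still holds, which is the monitoring step the summary mentions; the sketch leaves that to the caller.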
HCC integrates with AWS APIs to resize ASGs directly, reducing the dependency on Terraform for ASG size changes.
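
As a rough illustration of resizing an ASG through the AWS API rather than Terraform, the sketch below uses two operations of the boto3 Auto Scaling client, `describe_auto_scaling_groups` and `set_desired_capacity`. The function name `resize_asg` and the choice to clamp the target to the group's configured bounds are assumptions made for this example, not details taken from HCC.

```python
def resize_asg(client, asg_name: str, desired: int) -> int:
    """Set an ASG's desired capacity, clamped to its MinSize/MaxSize.

    `client` is expected to behave like boto3.client("autoscaling").
    Returns the capacity that was actually requested.
    """
    resp = client.describe_auto_scaling_groups(AutoScalingGroupNames=[asg_name])
    groups = resp["AutoScalingGroups"]
    if not groups:
        raise LookupError(f"Auto Scaling Group not found: {asg_name}")
    group = groups[0]

    # Keep the request inside the group's bounds; the API rejects
    # values outside MinSize..MaxSize.
    target = max(group["MinSize"], min(desired, group["MaxSize"]))

    # Skip the API call when the group is already at the target size.
    if target != group["DesiredCapacity"]:
        client.set_desired_capacity(
            AutoScalingGroupName=asg_name,
            DesiredCapacity=target,
            HonorCooldown=False,
        )
    return target


# Example usage (requires boto3 and AWS credentials; the group name is made up):
#   import boto3
#   resize_asg(boto3.client("autoscaling"), "example-hadoop-workers", 120)
```

Passing the client in as an argument keeps the function easy to exercise with a stub, which matters for a tool that changes cluster size without a Terraform plan to review.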
Future capabilities for HCC include ingesting AWS events, node rotation, bad node detection, and automated node remediation.
HCC aims to streamline cluster operations, automate scaling processes, and simplify cluster management for Pinterest's big data infrastructure.