Deploy Spark/ML Jobs on Kubernetes Using Docker
Apache Spark is a data processing framework that can quickly perform complex processing tasks on very large datasets and distribute those tasks across multiple computers, either on its own or in tandem with other distributed computing tools. Its speed, ease of use, and support for diverse data sources make it a popular choice.
Kubernetes provides you with a portable, extensible, open-source framework to run distributed systems resiliently. It takes care of scaling and failover for your application, provides deployment patterns, and more.
Docker is an open platform for developing, shipping, and running applications. It lets you separate your applications from your infrastructure so you can deliver software quickly. It supports fast, consistent delivery of applications and responsive deployment and scaling, while remaining lightweight.
Together, these three tools form a resilient ecosystem in which Spark/ML jobs run on reliable cluster infrastructure.