"Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters."
How to create a Spark cluster instance in the KASI Cloud
- Choose a Spark cluster template
- Use the default settings in most cases
- Choose a flavor, set the number of slaves (minions), then Create
Connect to the master node and run some basic scripts
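Once logged in to the master node, a first script can be as small as the sketch below. It is a minimal sanity check, not part of the original material: it starts a session, prints the Spark version, and compares a distributed sum of squares against its closed form. The names `sum_of_squares`, `expected_sum_of_squares`, and the app name `kasi-basic-check` are illustrative, and the sketch assumes that `pyspark` is importable on the master node and that the master URL is supplied by the cluster's own configuration or by `spark-submit`.

```python
"""Minimal PySpark sanity check (illustrative sketch, not a KASI-provided script)."""

try:
    from pyspark.sql import SparkSession
except ImportError:  # assumption: pyspark exists on the cluster nodes
    SparkSession = None


def expected_sum_of_squares(n):
    """Closed form of 0^2 + 1^2 + ... + (n-1)^2, used as a reference value."""
    return (n - 1) * n * (2 * n - 1) // 6


def sum_of_squares(spark, n):
    """Compute the same sum with Spark by distributing range(n) as an RDD."""
    rdd = spark.sparkContext.parallelize(range(n))
    return rdd.map(lambda k: k * k).sum()


def main(n=1000):
    if SparkSession is None:
        print("pyspark is not importable here; run this on the master node.")
        return
    # No .master(...) call: the master URL is left to spark-submit or the
    # cluster defaults, because the KASI setup is not specified in this document.
    spark = SparkSession.builder.appName("kasi-basic-check").getOrCreate()
    try:
        print("Spark version:", spark.version)
        result = sum_of_squares(spark, n)
        print("distributed:", result, "expected:", expected_sum_of_squares(n))
    finally:
        spark.stop()


if __name__ == "__main__":
    main()
```

Saved under a name such as `basic_check.py` (a hypothetical file name), the script could be launched with `spark-submit basic_check.py`. Matching numbers on the last line indicate that the job ran end to end.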
Introduction to Apache Spark
- Using Apache Spark for Scientific Research: Basic Concepts and Scientific Examples
- Jupyter Notebooks: csv-to-parquet, SDSS, HR4
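
For orientation, the csv-to-parquet step named above typically reduces to a read followed by a write. The sketch below is an assumption about what such a conversion looks like in PySpark, not a copy of the notebook. The function names and the path handling are hypothetical, `header=True` and `inferSchema=True` assume a CSV file with a header row, and on a cluster both paths are assumed to sit on storage that every worker can reach.

```python
"""CSV to Parquet with PySpark (illustrative sketch, not the notebook itself)."""
import os
import sys


def parquet_path_for(csv_path):
    """Derive an output path by swapping the file extension for .parquet."""
    root, _ext = os.path.splitext(csv_path)
    return root + ".parquet"


def csv_to_parquet(spark, csv_path, parquet_path):
    """Read a CSV file into a DataFrame and write it back out as Parquet."""
    # header/inferSchema assume a header row; inferSchema costs an extra pass.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    df.write.mode("overwrite").parquet(parquet_path)
    return df.columns


def main(argv):
    if len(argv) != 2:
        print("usage: csv_to_parquet.py <input.csv>")
        return
    from pyspark.sql import SparkSession  # imported lazily; needs pyspark

    spark = SparkSession.builder.appName("csv-to-parquet").getOrCreate()
    try:
        out = parquet_path_for(argv[1])
        columns = csv_to_parquet(spark, argv[1], out)
        print("wrote", out, "with columns", columns)
    finally:
        spark.stop()


if __name__ == "__main__":
    main(sys.argv)
```

Spark writes Parquet as a directory of part files, so the output path names a directory rather than a single file.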