Hi Susan,

I am just starting out with machine learning in Spark, and this guide was a great introduction. Really great walkthrough!

Just wanted to add something I came across on the web: since Spark 2.0, you can get a SparkContext more concisely by creating a SparkSession:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

SparkSession essentially folds SparkConf, SparkContext, and SQLContext into one unified API. The underlying SparkContext is still available as spark.sparkContext.

Thanks again for the article!

