Spark Structured Streaming Series – Deep dive sessions

We are planning to host a series of three sessions covering Spark Structured Streaming in greater detail. The objective is to familiarize users with the design and development of streaming applications using Kafka and Spark.

Seating for each session is limited to 50 participants.

Agenda (Hands-on Session) – Basics – Part One

• Introduction to Structured Streaming

• Unified batch and streaming API

• Defining a source – file source, socket source

• Defining a sink – memory, console, and file sinks

• Typed vs. untyped APIs (Dataset vs. DataFrame)

• Defining a schema

• Spark SQL and streaming

• Aggregations – max, avg, etc. (see the first sketch after this agenda)

• Reading from a Kafka topic

• Writing to a Kafka topic (see the second sketch after this agenda)
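
To give a feel for the hands-on material, here is a minimal sketch of the basic topics above, assuming Spark 2.3 (the version used in the linked repos) and Scala; the input path, schema, and column names are illustrative placeholders, not the session's actual exercise:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object BasicStreamingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("structured-streaming-basics")
      .master("local[*]")
      .getOrCreate()

    // Define the schema explicitly; streaming file sources do not infer schemas by default.
    val schema = new StructType()
      .add("userId", StringType)
      .add("amount", DoubleType)

    // File source: read CSV files as they arrive in a directory (path is a placeholder).
    val events = spark.readStream
      .schema(schema)
      .csv("/tmp/streaming-input")

    // Untyped (DataFrame) aggregation using Spark SQL functions.
    val totals = events
      .groupBy("userId")
      .agg(max("amount").as("max_amount"), avg("amount").as("avg_amount"))

    // Console sink: print each micro-batch; complete mode is required for this aggregation.
    val query = totals.writeStream
      .outputMode("complete")
      .format("console")
      .start()

    query.awaitTermination()
  }
}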
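
And a sketch of the Kafka portion, assuming the spark-sql-kafka-0-10 connector is on the classpath; the broker address, topic names, and checkpoint path are placeholders:

import org.apache.spark.sql.SparkSession

object KafkaStreamingSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("structured-streaming-kafka")
      .master("local[*]")
      .getOrCreate()

    // Read from a Kafka topic; records arrive as binary key/value columns.
    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // placeholder broker
      .option("subscribe", "input-topic")                  // placeholder topic
      .load()

    // Cast key and value to strings before further processing.
    val messages = input.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")

    // Write back to another Kafka topic; the Kafka sink requires a checkpoint location.
    val query = messages.writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("topic", "output-topic")                     // placeholder topic
      .option("checkpointLocation", "/tmp/kafka-sink-checkpoint")
      .start()

    query.awaitTermination()
  }
}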

Speaker:

Vishnu Viswanath

Data Engineer, MediaMath

Hands-on session instructions:

https://github.com/soniclavier/bigdata-notebook/tree/master/spark_23/spark-kafka-docker
https://github.com/soniclavier/spark-kafka
