by the airisDATA Team
At our recent Scala + Spark SQL Workshop we held introductory workshops for Scala and Apache Spark. A number of questions came up, and I have summarized many of the answers here along with some additional resources. The Scala presentation is here. Additional materials from the meetup will be posted soon. Check out both the NJ Hadoop / Big Data Meetup and the NJ Data Science Spark Meetup for more great workshops and talks.
Github Resources
Setup Resources
Download Apache Spark 1.6 (Pre-built with Hadoop)
Setup
Make sure you have the Java 7 or Java 8 JDK installed for your platform (Mac, Linux or Windows). Then you’ll need to download Scala 2.10.x, SBT and Spark. I also recommend downloading Maven. See our article on DZone about setting up a developer machine.
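A quick way to confirm the tools are reachable is to check each one on the PATH. This is only a minimal sketch for Mac or Linux: the function name check_tools is my own illustrative choice, and it assumes you have added the bin directories for Scala, SBT, Maven and Spark to your PATH (spark-shell will show as missing otherwise).

```shell
# Report which of the given commands can be found on the PATH.
# check_tools is a hypothetical helper name, not part of any of these tools.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The tools named in the setup notes above.
check_tools java scala sbt mvn spark-shell || echo "Install the missing tools or fix your PATH before continuing."
```

Anything reported as missing is either not installed or installed in a directory that is not on your PATH.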
Solving Local Spark Issues
export SPARK_MASTER_IP=127.0.0.1
export SPARK_LOCAL_IP=127.0.0.1
export SCALA_HOME=~/Downloads/scala-2.10.6
export PATH=$PATH:$SCALA_HOME/bin
On Windows, use SET instead of export, separate PATH entries with ; rather than :, and reference variables as %SCALA_HOME% rather than $SCALA_HOME.
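If Spark still misbehaves locally, it is worth confirming that the settings actually reached your current shell. The following is a small sketch for Mac or Linux, assuming the printenv command is available; the helper name check_env is hypothetical.

```shell
# Print each exported variable, or flag it when it is unset or empty.
# check_env is a hypothetical helper name used only for this sketch.
check_env() {
  status=0
  for name in "$@"; do
    if value=$(printenv "$name") && [ -n "$value" ]; then
      echo "$name=$value"
    else
      echo "$name is not set"
      status=1
    fi
  done
  return "$status"
}

# The variables from the export lines above.
check_env SPARK_MASTER_IP SPARK_LOCAL_IP SCALA_HOME || echo "Re-run the export lines in this shell, or add them to your shell profile."
```

Because export only affects the shell it runs in, a variable that shows as not set in a new terminal usually means the lines need to go into your shell profile.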
Training Resources
- Scala for Java programmers – http://www.scala-lang.org/docu/files/ScalaTutorial.pdf – a very quick 13-page PDF that covers quite a lot; I’d say it is essential reading if you have never seen Scala
- Scala by example – http://www.scala-lang.org/docu/files/ScalaByExample.pdf – by Martin Odersky, some notes are similar to both of his classes on Coursera
- 12 steps to learn Scala – http://www.artima.com/scalazine/articles/steps.html – excellent introduction and quick reading (1-2 hrs)!!! – Kristina made Anki flashcards out of this material
- Effective Scala – http://twitter.github.io/effectivescala/ – nice and concise! Many good tips about good programming practices.
- Scala Collections – http://www.scala-lang.org/docu/files/collections-api/collections.html – this is a must read for anybody who wants to understand the main reason behind
- Scala cheatsheet – http://docs.scala-lang.org/cheatsheets/ – good and bad ways of using the syntax
- The Scala School – https://twitter.github.io/scala_school/ – teaches Scala as a new language and is more extensive. I have not read it, but it seems like a good reference if you get stuck on something. It is based on lectures given at Twitter to their own engineers.
Books
Scala by Example by Odersky – most material is from or similar to material covered in both of his classes on Coursera.
Scala Overview by Odersky et al.
Programming in Scala, First Edition by Odersky.
Structure and Interpretation of Computer Programs – a classic computer science text that teaches some advanced programming concepts using Lisp and the basis of Martin Odersky’s Coursera class. Formerly the MIT standard text on programming.
A big list of Scala books linked at Scala-Lang
Tutorials
Scala for Java Programmers
Scala Tutorial
Effective Scala (Twitter)
Scala Tour
E-Books
Books at Lightbend (Typesafe)
AtomicScala (sample)
Scala Koans/Exercises
Scala Exercises
Scala Koans
Resources
Scala Roundup for Java Engineers
Scala Info at StackOverflow
Scala Cheatsheets
Scala Notes
Cake Solutions Blog
Scala School (Twitter)
Functional Programming in Scala
How to Learn Scala
Scala Lang Overviews
Learning Scala in Small Bites
Online Free Courses – Scala
Functional Programming with Scala
Reactive Programming with Scala
Online Free Courses – Spark
Big Data Analysis with Spark
Distributed Machine Learning with Spark
Introduction to Spark
Spark Fundamentals
Data Science / Engineering Spark
CS100
CS190