Rock The JVM

Rock The JVM – Spark Optimization with Scala

Information:

MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English | Duration: 27 Lessons (9h) | Size: 1.44 GB

Go fast or go home.

Learn the ins and outs of Spark and get the best out of your code.
In this course, we cut the weeds at the root. We dive deep into Spark and understand how it works under the hood. We'll see that we have incredible leverage, IF we write intelligent code, and you will do exactly that. You will learn 20+ optimization techniques and strategies. Each of them individually can give at least a 2x performance boost for your jobs (some of them even 10x), and I show it on camera.

You'll understand Spark internals and how Spark works behind the scenes

You'll be able to predict in advance if a job will take a long time

You'll diagnose performance problems in the Spark UI

You'll write smart joins with no shuffles

You'll organize your data intelligently so expensive operations are no longer a problem

You'll use RDD capabilities for bespoke, high-performance jobs

You'll leverage the JVM for performance-critical applications

You'll save hours of computation in this course alone (let alone in prod!)

Plus some extra perks:

You'll have access to all the code I write on camera (~1400 LOC)

You'll be invited to our private Slack room where I'll share the latest updates, discounts, talks, conferences, and recruitment opportunities

(soon) You'll have access to the takeaway slides

(soon) You'll be able to download the videos for offline viewing

Deep understanding of Spark internals so you can predict job performance

stage & task decomposition

reading query plans before jobs run

reading DAGs while jobs are running

performance differences between the different Spark APIs

packaging and deploying a Spark app

configuring Spark in 3 different ways

DataFrame and Spark SQL Optimizations

understanding join mechanics and why they are expensive

writing broadcast joins, or what to do when you join a large and a small DataFrame

writing pre-join optimizations: column pruning, pre-partitioning

bucketing for fast access

fixing data skews, "straggling" tasks and OOMs
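
As a rough illustration of two of the DataFrame topics above (a broadcast join between a large and a small DataFrame, plus column pruning before the join), here is a minimal sketch. It is not code from the course: the `BroadcastJoinSketch` object, the table contents and the column names are hypothetical, and it assumes Spark SQL is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

// Hypothetical example object; names and data are made up for illustration.
object BroadcastJoinSketch {
  def run(): Seq[(Int, String)] = {
    val spark = SparkSession.builder()
      .appName("broadcast-join-sketch")
      .master("local[*]")
      // Turn off automatic broadcasting so the explicit hint below is
      // what requests the broadcast.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    try {
      // Stand-in for the "large" side of the join.
      val orders = Seq((1, "AT", 10.0), (2, "BE", 5.0), (3, "AT", 7.5))
        .toDF("orderId", "countryCode", "amount")
      // The small lookup table that is cheap to ship to every executor.
      val countries = Seq(("AT", "Austria"), ("BE", "Belgium"))
        .toDF("countryCode", "countryName")

      val joined = orders
        .select("orderId", "countryCode")            // column pruning before the join
        .join(broadcast(countries), "countryCode")   // broadcast hint on the small side
        .select("orderId", "countryName")

      joined.explain() // prints the physical plan so the join strategy can be inspected

      joined.as[(Int, String)].collect().sortBy(_._1).toSeq
    } finally {
      spark.stop()
    }
  }
}
```

Calling `BroadcastJoinSketch.run()` returns the joined rows sorted by order id; the plan printed by `explain()` is where you would check whether the large side is still being shuffled.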

Optimizing RDDs

using broadcast joins "manually"

cogrouping RDDs in multi-way joins

fixing data skews

writing optimizations that Spark doesn't generate for us

Optimizing key-value RDDs, as most useful transformations need them

using the different _byKey methods intelligently

reusing JVM objects for when performance is critical and even a few seconds count

using the powerful iterator-to-iterator pattern for arbitrary efficient processing
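
The iterator-to-iterator pattern mentioned above can be sketched in plain Scala, without Spark on the classpath. The idea is to write the per-partition logic as a function from `Iterator` to `Iterator`, so that records are processed lazily, one at a time, and the partition is never materialized in memory. This is an illustrative sketch rather than the course's code: the `Reading` type, the `parseValid` function and the comma-separated input format are assumptions.

```scala
import scala.util.Try

// Hypothetical example object; the record type and input format are assumed.
object IteratorToIteratorSketch {
  final case class Reading(sensor: String, value: Double)

  // Iterator in, iterator out: flatMap on an Iterator is lazy, so each line
  // is parsed only when the consumer asks for the next element, and
  // malformed lines are dropped without building an intermediate collection.
  def parseValid(lines: Iterator[String]): Iterator[Reading] =
    lines.flatMap { line =>
      line.split(",") match {
        case Array(sensor, raw) =>
          Try(raw.trim.toDouble).toOption.map(value => Reading(sensor.trim, value))
        case _ =>
          None
      }
    }
}
```

With an RDD of strings (here a hypothetical `lines: RDD[String]`), a function of this shape could be passed to `lines.mapPartitions(IteratorToIteratorSketch.parseValid)`. Calling something like `.toList` on the incoming iterator inside the function would break the pattern, because it forces the whole partition into memory at once.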

This course is for Scala and Spark programmers who need to improve the run time of their jobs. If you've never done Scala or Spark, this course is not for you.

Authors: Udemy

Date: 2020

Upload Date: 9/15/2020 3:08:31 AM

Format: MP4




Language: English





This website is licensed under CC BY-NC-SA 4.0.