What are the new features of Apache Spark 2.4, which will be released in 2018?

This article summarizes material presented at the Apache Spark Meetup held at Adobe Systems Inc. on September 19, 2018.

The upcoming Apache Spark 2.4 release is the fifth in the 2.x series. This article provides an overview of the key features and enhancements in Apache Spark 2.4.

  • The new scheduling model (Barrier Scheduling) enables users to properly embed distributed deep learning training into Spark stages to simplify the distributed training workflow.
  • Added 35 higher-order functions for array/map operations in Spark SQL.
  • Added a new built-in Avro data source based on Databricks' spark-avro module.
  • PySpark adds an eager evaluation mode for all operations, which helps with teaching and debugging.
  • Spark on Kubernetes now supports PySpark and R, as well as client mode.
  • Various enhancements to Structured Streaming, for example stateful operators in continuous processing.
  • Various performance improvements to built-in data sources, for example Parquet nested schema pruning.
  • Support for Scala 2.12.
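
As a rough illustration of the higher-order functions item in the list above, the sketch below runs three of the new Spark SQL built-ins (transform, filter, aggregate) through PySpark. It assumes PySpark 2.4 or later is installed locally. The names QUERIES, run_queries and main are made up for this example and are not part of Spark.

```python
# Sketch only: Spark SQL higher-order functions (available from Spark 2.4).
# Each query is table-free, so no input data is needed.
QUERIES = {
    # Apply a lambda to every element: [1, 2, 3] -> [2, 3, 4]
    "transform": "SELECT transform(array(1, 2, 3), x -> x + 1) AS result",
    # Keep only elements matching a predicate: [1, 2, 3, 4] -> [2, 4]
    "filter": "SELECT filter(array(1, 2, 3, 4), x -> x % 2 = 0) AS result",
    # Fold the array into one value, starting from 0: [1, 2, 3] -> 6
    "aggregate": "SELECT aggregate(array(1, 2, 3), 0, (acc, x) -> acc + x) AS result",
}


def run_queries(spark):
    """Run every query on an existing SparkSession; return {name: result}."""
    return {name: spark.sql(sql).first()["result"] for name, sql in QUERIES.items()}


def main():
    # Imported here so the module loads even where PySpark is not installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: a single local core is enough
        .appName("hof-sketch")       # hypothetical application name
        .getOrCreate()
    )
    try:
        for name, result in run_queries(spark).items():
            print(name, result)
    finally:
        spark.stop()
```

Calling main() starts a local session, prints one line per query, and stops the session. Writing the lambdas in SQL keeps the array logic inside Spark's own expression engine, so no Python UDF is needed for these cases.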
