How to install the standalone version of Spark in a Linux environment without using Hadoop

Big data continues to heat up, and if you are not familiar with a few of its core components it is hard to even join the conversation. At the very least you should be able to talk about Hadoop, HDFS, MapReduce, YARN, Kafka, Spark, ZooKeeper, and Neo4j; these are the basics for holding your own.

There are already plenty of detailed introductions to Spark on the Internet, so I won't repeat them here. Instead, let's walk through installing and briefly using the standalone version of Spark.

0. Install the JDK. Since my machine already has a JDK, I can skip this step. The JDK needs no introduction; it is indispensable whenever you work with Java or Scala.

ubuntu@VM-0-15-ubuntu:~$ java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)
ubuntu@VM-0-15-ubuntu:~$
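
If your machine does not have a JDK yet, it is easy to add one. This is a minimal sketch assuming an apt-based system such as Ubuntu; adjust the package manager for other distributions:

# Install OpenJDK 8 (any JDK 8 works for Spark 2.2.x)
sudo apt-get update
sudo apt-get install -y openjdk-8-jdk
# Verify the installation
java -version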

1. You don't actually need to install Hadoop; you just need to pick a suitable Spark build. You also don't need to install Scala separately, because Spark ships with a Scala shell. Download a release from the official Spark website. For an environment without Hadoop, spark-2.2.1-bin-hadoop2.7 (a build that already bundles the Hadoop client libraries) is a good choice; extract it as follows:

ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc$ ll
total 196436
drwxrwxr-x 3 ubuntu ubuntu 4096 Feb 2 19:57 ./
drwxrwxr-x 9 ubuntu ubuntu 4096 Feb 2 19:54 ../
drwxrwxr-x 13 ubuntu ubuntu 4096 Feb 2 19:58 spark-2.2.1-bin-hadoop2.7/
-rw-r--r-- 1 ubuntu ubuntu 200934340 Feb 2 19:53 spark-2.2.1-bin-hadoop2.7.tgz
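
For reference, the download and extraction can also be done entirely from the command line. This is a rough sketch; the archive URL is my assumption for where old releases live, so substitute a mirror of your choice:

# Download the Spark 2.2.1 build bundled with Hadoop 2.7 client libraries
wget https://archive.apache.org/dist/spark/spark-2.2.1/spark-2.2.1-bin-hadoop2.7.tgz
# Extract the tarball into the current directory
tar -xzf spark-2.2.1-bin-hadoop2.7.tgz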

2. Spark provides both Python and Scala shells. Here I will use the Scala shell, as follows:

ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ bin/spark-shell 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
18/02/02 20:12:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/02/02 20:12:16 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 172.17.0.15 instead (on interface eth0)
18/02/02 20:12:16 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Spark context Web UI available at http://172.17.0.15:4040
Spark context available as 'sc' (master = local[*], app id = local-1517573538209).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.1
      /_/
Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.
scala>
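
As the startup log hints, you can quiet the console before doing anything else. This is just an optional convenience, typed at the same scala> prompt:

scala> // Suppress INFO/WARN messages so only errors are printed
scala> sc.setLogLevel("ERROR")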

To perform simple operations:

scala> val lines = sc.textFile("README.md")
lines: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1] at textFile at <console>:24
scala> lines.count()
res0: Long = 103
scala> lines.first()
res1: String = # Apache Spark
scala> :quit
ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$
ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ wc -l README.md 
103 README.md
ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ head -n 1 README.md 
# Apache Spark
ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$
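
Building on the line count above, here is a minimal sketch of the classic word count, run in a fresh spark-shell session from the same directory. The exact numbers you see will depend on the contents of README.md:

scala> // Read the file and split each line into words
scala> val words = sc.textFile("README.md").flatMap(_.split(" "))
scala> // Pair each word with 1, then sum the counts per word
scala> val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)
scala> // Print the five most frequent words
scala> wordCounts.sortBy(_._2, ascending = false).take(5).foreach(println)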

Now let's take a look at the web UI. In a browser on another machine (for example, your Windows desktop), open http://ip:4040, replacing ip with the server's address.

OK, this article only covers a simple installation; we will introduce Spark in more depth later.

Summary

The above is the full content of this article. I hope it offers some reference value for your study or work. Thank you for your support of 123WORDPRESS.COM.

