Big data continues to heat up, and if you are not familiar with a few big data components, you don't even have a catchphrase to show off. At the very least, you should be able to talk about Hadoop, HDFS, MapReduce, YARN, Kafka, Spark, ZooKeeper and Neo4j. These are essential skills for showing off. There are plenty of detailed introductions to Spark on the Internet; just search for them. Here, let's walk through installing the stand-alone version of Spark and using it briefly.

0. Install the JDK. Since I already have a JDK on my machine, I can skip this step. The JDK needs no introduction; it is indispensable when using Java or Scala.

    ubuntu@VM-0-15-ubuntu:~$ java -version
    openjdk version "1.8.0_151"
    OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
    OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

1. You don't necessarily need to install Hadoop; you just need to choose a suitable Spark package. You don't need to download Scala either, because Spark comes with a Scala shell by default. Go to the official Spark website to download it. In an environment without Hadoop, you can choose spark-2.2.1-bin-hadoop2.7 and then unzip it. After extraction, the directory looks like this:

    ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc$ ll
    total 196436
    drwxrwxr-x  3 ubuntu ubuntu      4096 Feb 2 19:57 ./
    drwxrwxr-x  9 ubuntu ubuntu      4096 Feb 2 19:54 ../
    drwxrwxr-x 13 ubuntu ubuntu      4096 Feb 2 19:58 spark-2.2.1-bin-hadoop2.7/
    -rw-r--r--  1 ubuntu ubuntu 200934340 Feb 2 19:53 spark-2.2.1-bin-hadoop2.7.tgz

2. Spark provides both a Python shell and a Scala shell. Next, I will use the Scala shell, as follows:

    ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ bin/spark-shell
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    18/02/02 20:12:16 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    18/02/02 20:12:16 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 172.17.0.15 instead (on interface eth0)
    18/02/02 20:12:16 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    Spark context Web UI available at http://172.17.0.15:4040
    Spark context available as 'sc' (master = local[*], app id = local-1517573538209).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.2.1
          /_/

    Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
    Type in expressions to have them evaluated.
    Type :help for more information.

    scala>

Now perform a few simple operations: load README.md as an RDD of lines, count the lines, and read the first one.

    scala> val lines = sc.textFile("README.md")
    lines: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1] at textFile at <console>:24

    scala> lines.count()
    res0: Long = 103

    scala> lines.first()
    res1: String = # Apache Spark

    scala> :quit

The results match what wc and head report from the ordinary shell:

    ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ wc -l README.md
    103 README.md
    ubuntu@VM-0-15-ubuntu:~/taoge/spark_calc/spark-2.2.1-bin-hadoop2.7$ head -n 1 README.md
    # Apache Spark

Let's take a look at the web UI. While the shell is running, open http://ip:4040 in a browser (on Windows, for example), where ip is the address of the machine running Spark.

OK, this article only covers a simple installation; we will continue to introduce Spark in more depth later.
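As a small follow-up to the README.md session above, here is a minimal sketch of two slightly larger operations that can be typed into the same spark-shell. It assumes the shell was started from the Spark directory so that README.md resolves, and it relies on the `sc` context that the shell creates. The names `sparkLines`, `mentions`, `words`, `wordCounts` and `top10` are illustrative and do not come from the original session.

```scala
// Run inside bin/spark-shell; `sc` is the SparkContext the shell provides.
val lines = sc.textFile("README.md")

// filter is a lazy transformation: nothing is read yet.
val sparkLines = lines.filter(line => line.contains("Spark"))

// count() is an action: it triggers the read and the filter.
val mentions = sparkLines.count()
println(s"Lines mentioning Spark: $mentions")

// Word count: split each line on whitespace and drop empty tokens.
val words = lines.flatMap(line => line.split("\\s+")).filter(word => word.nonEmpty)

// Pair each word with 1, then sum the ones per word.
val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)

// Bring the ten most frequent words back to the driver and print them.
val top10 = wordCounts.sortBy(pair => pair._2, ascending = false).take(10)
top10.foreach(println)
```

Each statement is kept on one line so it can be entered at the `scala>` prompt one at a time. The exact counts depend on the README.md shipped with your Spark package, so no output is shown here.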