# Spark-learning **Repository Path**: compasslebin_admin/Spark-learning ## Basic Information - **Project Name**: Spark-learning - **Description**: No description available - **Primary Language**: Scala - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2020-07-31 - **Last Updated**: 2023-11-13 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README ## spark-learning 学习spark基础理论知识 ## 本地运行Spark方法 - 下载spark安装包 - 解压spark安装包 - 进入spark解压目录下,运行: ```bash $ bin/spark-shell ``` - 在命令行提示符下拷贝以下代码并查看执行结果 ```scala import scala.math.random val tasks = 10 val n = tasks * 100000 val count = sc.parallelize(1 until n, tasks).map { i => val x = random * 2 - 1 val y = random * 2 - 1 if (x*x + y*y <= 1) 1 else 0 }.reduce(_ + _) println("Pi is roughly " + 4.0 * count / n ) ``` ## 分布式运行Spark方法 ### 搭建hadoop集群 Hadoop YARN/HDFS配置文件参考:conf/hadoop目录 ### 配置Spark客户端,并启动spark history server - Spark客户端配置文件参考:conf/spark目录 - 启动spark history server: sbin/start-history-server.sh ### 将spark-shell运行在yarn client或cluster模式 - yarn client模式:bin/spark-shell --master yarn --deploy-mode client - yarn cluster:bin/spark-shell --master yarn --deploy-mode cluster