September 14, 2024
Install Spark
Download the release from the official site and unpack the archive.
Configure the JAVA_HOME environment variable and add Spark to the system PATH by appending the following to ~/.bashrc:
export SPARK_HOME=/data/download/spark-3.5.1-bin-hadoop3
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/TencentKona-8.0.17-402
export PATH=$PATH:$JAVA_HOME/bin
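A quick sanity check that the new variables take effect; a minimal sketch, assuming the paths above (the version flags only print banners):

# Reload the profile in the current shell
source ~/.bashrc
# Both variables should resolve to the install paths
echo $SPARK_HOME
echo $JAVA_HOME
# Both binaries should now be found on PATH
java -version
spark-shell --version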
spark-shell
Start the Spark shell:

spark-shell

Word count program test:
val hFile = sc.textFile("hdfs://localhost:9000/user/wjrtest/input/capacity-scheduler.xml")
val wc = hFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
wc.take(5)
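The same word count can also run non-interactively, which makes re-testing easier. A sketch that feeds the snippet above to spark-shell on stdin; local[2] and the HDFS path are just this setup's values, adjust as needed:

spark-shell --master local[2] <<'EOF'
// Same pipeline as above; take(5) pulls a small sample back to the driver
val hFile = sc.textFile("hdfs://localhost:9000/user/wjrtest/input/capacity-scheduler.xml")
val wc = hFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
wc.take(5).foreach(println)
EOF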
pyspark
Run the pyspark shell:

pyspark

Starting pyspark failed with a segmentation fault.
Raise the core file size limit so that a core dump gets written:
ulimit -c unlimited

Run pyspark again to trigger the segmentation fault; this time a core file is produced.
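Where the dump lands depends on the kernel's core_pattern; by default it is a core.<pid> file in the working directory. A quick check (the PID suffix 17222 is just this run's value, yours will differ):

# Confirm the limit is now unlimited
ulimit -c
# Kernel template for core file names and locations
cat /proc/sys/kernel/core_pattern
# Find the dump and see which executable produced it
ls -lh core.*
file core.17222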
Debug with gdb:
gdb -c core
(gdb) bt

The output is as follows:
(base) [root@xxx-tencentos /data/download/spark-3.5.1-bin-hadoop3]# gdb -c core.17222
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.tl2
Copyright (C) 2013 Free Software Foundation, Inc.
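With only -c core, gdb has no symbols, so bt tends to show raw addresses. Loading the crashing binary together with the core gives readable frames; presumably that binary is the Python interpreter pyspark launched, so a sketch under that assumption:

# Point gdb at the interpreter plus the dump (adjust the python path if needed)
gdb $(which python) core.17222
(gdb) bt
# List the shared libraries mapped at crash time; the segfaulting .so often stands out
(gdb) info sharedlibrary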