Troubleshoot Environment - TheLadders/pipeline GitHub Wiki
- It's likely that you have old, unused containers left over from each `docker run` command. These don't get garbage collected automatically, as Docker assumes you may want to start them again.
- Use the following command to clean them out (a broader cleanup is sketched just below):

```
docker rm `docker ps -aq`
```
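If disk space is the concern, stopped containers are usually only part of the story; a minimal cleanup sketch using standard Docker CLI filters (adjust to your setup):

```
# Remove only stopped (exited) containers
docker rm $(docker ps -aq -f status=exited)

# Remove dangling (untagged) images left behind by pulls and rebuilds
docker rmi $(docker images -q -f dangling=true)
```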
NOTE: If Docker fills up the root partition of your VM, the Docker daemon might not start, in which case running any `docker` command will tell you that the daemon is not running.
- Confirm that you are out of disk space using `df -l`
- Blow away the Docker working dirs with `sudo rm -rf /var/lib/docker`
- Pull the pipeline image again and start over. (To confirm the space was actually reclaimed, see the sketch below.)
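A quick before/after check of Docker's disk usage, using standard tools (`/var/lib/docker` is Docker's default data directory, as above):

```
# How much space Docker's data directory is consuming
sudo du -sh /var/lib/docker

# Free space on local filesystems, human-readable
df -lh
```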
- Make sure you've run `eval "$(docker-machine env pipelinebythebay)"`
```
java.nio.channels.ClosedChannelException
    at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:416)
    at org.apache.spark.streaming.kafka.KafkaUtils$$anonfun$createDirectStream$2.apply(KafkaUtils.scala:416)
    at scala.util.Either.fold(Either.scala:97)
    at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:415)
    at com.bythebay.pipeline.spark.streaming.StreamingRatings$.main(StreamingRatings.scala:39)
    at com.bythebay.pipeline.spark.streaming.StreamingRatings.main(StreamingRatings.scala)
```
- You likely have not started your services using `bythebay-start.sh`, or there was an issue starting your Spark Master and Worker services. A quick way to see which services are actually up is sketched below.
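One way to narrow this down is to list the JVM processes inside the container and poke the Spark Master UI; a minimal sketch, assuming the standard Spark standalone process names and the default master UI port (8080):

```
# Kafka, Master, and Worker should all appear in the list of running JVMs
jps -l

# The Spark Master UI shows registered workers and their cores/memory
curl -s http://localhost:8080 | head
```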
```
Caused by: java.io.FileNotFoundException: datasets/dating/ratings.csv (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at scala.io.Source$.fromFile(Source.scala:90)
    at scala.io.Source$.fromFile(Source.scala:75)
    at scala.io.Source$.fromFile(Source.scala:53)
    at com.bythebay.pipeline.akka.feeder.FeederActor.initData(FeederActor.scala:34)
    at com.bythebay.pipeline.akka.feeder.FeederActor.<init>(FeederActor.scala:23)
```
- You likely have not run `bythebay-config.sh` or `bythebay-setup.sh`, as the required datasets have not been uncompressed. A quick check is sketched below.
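To confirm, check for the file named in the stack trace relative to the directory the feeder runs from (the path comes from the error above; where the compressed archive lives is an assumption):

```
# The feeder expects this file relative to its working directory
ls -lh datasets/dating/ratings.csv

# If only compressed archives are present here, re-run the setup/config script
ls datasets/dating/
```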
- Run the following to repair your busted boot2docker:

```
macosx-laptop$ sudo /Library/Application\ Support/VirtualBox/LaunchDaemons/VirtualBoxStartup.sh restart
```
More docs here
- Re-run the following, including the `-v` flag:

```
macosx-laptop$ boot2docker stop
macosx-laptop$ boot2docker destroy
macosx-laptop$ boot2docker -v --memory=8192 --disksize=20000 init
macosx-laptop$ boot2docker up
```
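Once `boot2docker up` completes, it's worth confirming the VM is healthy and wiring the current shell to it; a minimal sketch using standard boot2docker commands:

```
macosx-laptop$ boot2docker status                # should print "running"
macosx-laptop$ eval "$(boot2docker shellinit)"   # export DOCKER_HOST and friends
macosx-laptop$ docker info                       # the daemon inside the VM should answer
```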
- You likely need to remove an existing directory and re-initialize boot2docker:

```
macosx-laptop$ rm -rf /Users/<user-name>/.boot2docker/certs/boot2docker-vm/
macosx-laptop$ boot2docker stop
macosx-laptop$ boot2docker destroy
macosx-laptop$ boot2docker -v --memory=8192 --disksize=20000 init
macosx-laptop$ boot2docker up
```
```
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
```
- You likely have not configured your VM environment with enough cores to run the Spark jobs.
- Also, check that `spark-defaults.conf` has the following (the same limits can be passed at submit time, as sketched below):

```
spark.executor.cores=2
spark.cores.max=2
```
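For a one-off check, the same limits can be supplied via `--conf` on `spark-submit`, which overrides `spark-defaults.conf` for that submission; a minimal sketch (the assembly jar path is hypothetical, the class name comes from the stack trace above):

```
spark-submit \
  --class com.bythebay.pipeline.spark.streaming.StreamingRatings \
  --conf spark.executor.cores=2 \
  --conf spark.cores.max=2 \
  target/scala-2.10/streaming-ratings-assembly.jar   # hypothetical jar path
```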