Maven packages from spark.jars.packages aren't loaded into the executors' classpath #59
Comments
Hi @BenMizrahiPlarium, one question to clarify your setup: do you configure …
Hi @jahstreet, I did the setup using the config section in the Livy create-session request. In other use cases I think the jar artifacts downloaded from Maven are shared via HDFS and available to both the driver and the workers; in this use case they are only available in the driver's local Maven repository. I can see that the artifacts are downloaded, but they are not available on the executors' classpath. If you have any ideas, that would be very helpful :)
Currently running into the same issue.
The jar is also not uploaded to … Passing the http link to the file in the …
I really think it's because Spark has no shared file system between the workers and the driver. As a workaround for now I solved it with gcsfuse: I mount a Google Cloud Storage bucket in the driver and the workers, point the local Maven repository of both to that folder, and add the folder to the Spark extra classpath. So finally the bucket is mounted at /etc/google on both the driver and the executors, with spark.driver.extraClassPath=/etc/google/jars/*, and it works. My problem with that is that it isn't temporary: it persists across all sessions, and it isn't isolated per user.
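For reference, a minimal sketch of what that workaround might look like as Spark configuration, assuming the bucket is mounted at /etc/google on every node. The spark.executor.extraClassPath entry and the spark.jars.ivy cache location are assumptions added here for illustration, not something confirmed above:

# assumed: keep the Ivy cache for downloaded packages on the shared mount
spark.jars.ivy=/etc/google
spark.driver.extraClassPath=/etc/google/jars/*
# assumed: executors need the same extra classpath entry as the driver
spark.executor.extraClassPath=/etc/google/jars/*

The point of the sketch is that the classpath entry has to resolve to the same files on every node, which is exactly what the shared mount provides.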
I understood from the documentation that I set mine …
This is the relevant snippet from the driver configmap. I can also see from the driver and executor logs that both of them download the jar files:
…
but the ones passed by the …
I don't think spark.kubernetes.file.upload.path is related to external Maven dependencies; it's related to the Spark application jar being uploaded to S3. Spark downloads Maven dependencies into the local Maven repository and uses them on the classpath. As far as I can see, the jars are loaded into the driver and everything done by the driver works fine, but when the actual task is executed on one of the executors, the task fails with a ClassNotFoundException, which means the jar isn't available on the executor's classpath.
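To make the distinction above concrete, a hedged sketch of the two kinds of configuration being discussed; the bucket path is hypothetical and only illustrates that the two properties serve different purposes:

# used by spark-submit on Kubernetes to upload local application files (e.g. the app jar); bucket name is hypothetical
spark.kubernetes.file.upload.path=s3a://some-bucket/spark-uploads
# resolves Maven coordinates through Ivy into the local repository (e.g. /root/.ivy2/jars)
spark.jars.packages=org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1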
Hi
I'm having an issue loading Maven package dependencies while using SparkMagic and this Helm chart for Livy and Spark on Kubernetes.
In the Spark config (passed in the Livy create-session request; see the sketch after this message) I set:
spark.jars.packages=org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1
The dependencies are downloaded into /root/.ivy2/jars but aren't included in the Spark classpath, and when trying to execute an action I get the following error:
21/01/05 11:22:15 INFO DAGScheduler: ShuffleMapStage 1 (take at :30) failed in 0.197 s due to Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 17, 10.4.187.11, executor 2): java.lang.ClassNotFoundException: org.apache.spark.sql.kafka010.KafkaSourceRDDPartition
Do you have any suggestions?
Thanks
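For context, a minimal sketch of how this configuration is typically passed to Livy when creating the session (an earlier comment mentions the config section of the Livy create-session request). The endpoint, the kind field, and the conf map follow Livy's REST API; the host name is hypothetical:

POST http://livy-server:8998/sessions   (host name is hypothetical)
{
  "kind": "pyspark",
  "conf": {
    "spark.jars.packages": "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1"
  }
}

SparkMagic builds this request from its session configuration, so the same conf map can equally be supplied from the notebook side (e.g. via its %%configure magic).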