
java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html #68

Open
Ethanhack opened this issue Jul 18, 2023 · 3 comments

Comments

@Ethanhack

Hello authors, I run into a confusing error when I try to run:
df.write.format("tfrecord").save("hdfs://***/a")

java.lang.ClassNotFoundException: Failed to find data source: tfrecord. Please find packages at http://spark.apache.org/third-party-projects.html

What confuses me is that when I remove the spark-mllib dependency, the error goes away. Is there any conflict between mllib and spark-tfrecord? I would appreciate it if you could help with my problem. Thanks again!

My dependency settings:

<java.version>1.8</java.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.version>2.11.12</scala.version>
<scala.binary.version>2.11</scala.binary.version>
<spark.version>2.2.0</spark.version>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_${scala.binary.version}</artifactId>
        <version>2.4.3</version>
    </dependency>

@Ethanhack
Author

My spark-tfrecord dependency:

    <dependency>
        <groupId>com.linkedin.sparktfrecord</groupId>
        <artifactId>spark-tfrecord_2.11</artifactId>
        <version>0.2.6</version>
    </dependency>

@junshi15
Contributor

I am not aware of any conflict with spark-mllib.
Your error seems to suggest Spark was not able to find the spark-tfrecord jar file.
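
For reference, a minimal sketch of one way to make the data source visible at runtime is to resolve the package coordinates when the session is built (spark-submit --packages achieves the same thing), assuming the artifact is reachable from a Maven repository:

import org.apache.spark.sql.SparkSession

// Sketch only: ask Spark to resolve spark-tfrecord at startup so the
// "tfrecord" data source lands on the driver and executor classpaths.
val spark = SparkSession.builder()
  .appName("tfrecord-write-example")  // hypothetical app name
  .config("spark.jars.packages",
    "com.linkedin.sparktfrecord:spark-tfrecord_2.11:0.2.6")
  .getOrCreate()

// Any DataFrame works; spark.range is just a placeholder.
val df = spark.range(10).toDF("id")
df.write.format("tfrecord").save("hdfs://***/a")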

@mizhou-in
Contributor

Hi @Ethanhack, please use the correct Spark version as shown in the README.md. You may want to use Spark 2.4; we don't support Spark 2.2.

Version 0.1.x targets Spark 2.3 and Scala 2.11
Version 0.2.x targets Spark 2.4 and both Scala 2.11 and 2.12
Version 0.3.x targets Spark 3.0 and Scala 2.12
Version 0.4.x targets Spark 3.2 and Scala 2.12
Version 0.5.x targets Spark 3.2 and Scala 2.13
Version 0.6.x targets Spark 3.4 and both Scala 2.12 and 2.13
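
As an illustration, with the Scala 2.11 properties quoted above, the coordinates would need to line up roughly as follows (a sketch only; it keeps spark-tfrecord 0.2.6 and raises the Spark artifacts to a 2.4.x version):

<scala.binary.version>2.11</scala.binary.version>
<!-- spark-tfrecord 0.2.x targets Spark 2.4, so the Spark artifacts should be 2.4.x too -->
<spark.version>2.4.3</spark.version>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>com.linkedin.sparktfrecord</groupId>
        <artifactId>spark-tfrecord_${scala.binary.version}</artifactId>
        <version>0.2.6</version>
    </dependency>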
