Getting ClassCastException while trying to save file in avro format in spark


By : Jay Hancock
Date : October 17 2020, 08:10 AM
Indeed, you have to define an Avro schema, or use an external library such as avro4s to derive it from your case class.
Using native Avro:
code :
val schema = "{\"type\":\"record\",\"name\":\"TrafficSchema\",\"namespace\":\"your.project.namespace\",\"fields\":[{\"name\":\"str\",\"type\":\"string\"},{\"name\":\"i\",\"type\":\"int\"},{\"name\":\"i1\",\"type\":\"int\"},{\"name\":\"i2\",\"type\":\"int\"},{\"name\":\"fl\",\"type\":\"float\"}]}"

val trafficSchema = new Schema.Parser().parse(schema)
import com.sksamuel.avro4s.AvroSchema

val trafficSchema: Schema = AvroSchema[TrafficSchema]
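Either way, the resulting schema can then be handed to the Avro data source when writing. A minimal sketch, assuming a DataFrame df whose columns match TrafficSchema and the Avro data source on the classpath (the output path is just an example):
code :
// "avroSchema" forces the writer to use the supplied schema instead of
// the one derived from the DataFrame's own types.
df.write
  .format("avro")
  .option("avroSchema", trafficSchema.toString)
  .save("/tmp/traffic-avro")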



How to save a spark rdd to an avro file


By : user3376746
Date : March 29 2020, 07:55 AM
The asker was trying to save an RDD to a file in Avro format, and eventually found a workaround:
code :
import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.{AvroJob, AvroKeyOutputFormat}
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapreduce.Job

// Register the Avro schema of the generated record class on the Hadoop job.
val job = Job.getInstance(spark.hadoopConfiguration)
AvroJob.setOutputKeySchema(job, PageViewEvent.SCHEMA$)

val output = s"/avro/${date.toString(dayFormat)}"
rmr(output) // helper that deletes the output path if it already exists

// Wrap each record in an AvroKey; the key class handed to
// saveAsNewAPIHadoopFile must be AvroKey, not the bare record class.
rdd.coalesce(64)
  .map(x => (new AvroKey(x._1), x._2))
  .saveAsNewAPIHadoopFile(
    output,
    classOf[AvroKey[PageViewEvent]],
    classOf[NullWritable],
    classOf[AvroKeyOutputFormat[PageViewEvent]],
    job.getConfiguration)
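On current Spark versions the same write is much simpler through the DataFrame API. A minimal sketch with a hypothetical Event case class (not the asker's PageViewEvent), assuming the Avro data source is on the classpath:
code :
import org.apache.spark.sql.SparkSession

case class Event(page: String, views: Int) // hypothetical record type

val spark = SparkSession.builder().appName("avro-write").getOrCreate()
import spark.implicits._

val events = Seq(Event("home", 3), Event("about", 1)).toDS()
events.write.format("avro").save("/tmp/events-avro")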

Resolve spark-avro error = Failed to load class for data source: com.databricks.spark.avro


By : Huang Xiao
Date : March 29 2020, 07:55 AM
"sbt package" will not include your dependencies in the resulting jar; try sbt-assembly instead, which builds a fat jar that bundles them.
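A minimal sketch of the sbt side; the version numbers below are examples and should be adapted to your Spark and Scala versions:
code :
// build.sbt -- mark Spark itself as "provided" so the assembly jar
// only bundles the extra data source, not Spark.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % "2.4.5" % "provided",
  "com.databricks"   %% "spark-avro" % "4.0.0"
)

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

Build with sbt assembly and pass the resulting fat jar to spark-submit.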

Spark: Reading avro file without com.databricks.spark.avro


By : Badhrinath Srinivasa
Date : March 29 2020, 07:55 AM
First of all, it's not --package, it's --packages.
Secondly, the version in the coordinate seems to be incomplete.
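For reference, a full coordinate has the form group:artifact_scalaVersion:version. A sketch with an example version (adapt it to your setup):
code :
// Launch the shell with the complete Maven coordinate, e.g.:
//   spark-shell --packages com.databricks:spark-avro_2.11:4.0.0
// Then the data source resolves:
val df = spark.read.format("com.databricks.spark.avro").load("/path/to/file.avro")
df.printSchema()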

Avro: ClassCastException while serializing / deserializing a file that contains an Enum value


By : Artur Chyży
Date : March 29 2020, 07:55 AM
I believe the main issue here is that $ has a special meaning in Java class names; less importantly, package names are typically lowercase.
So you should at least edit the namespaces to remove the $.
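For illustration, a hypothetical enum schema with a clean lowercase namespace and no $ in it:
code :
import org.apache.avro.Schema

// Hypothetical schema: "com.example.model" rather than something like
// "com.example.Outer$Colors", which the $ would make invalid as a
// generated Java package/class name.
val enumSchema = new Schema.Parser().parse(
  """{"type":"enum","name":"Color","namespace":"com.example.model","symbols":["RED","GREEN","BLUE"]}"""
)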

How to use spark-avro package to read avro file from spark-shell?


By : user260710
Date : March 29 2020, 07:55 AM
tl;dr Spark 2.4.x+ provides built-in support for reading and writing Apache Avro data, but the spark-avro module is external and not included in spark-submit or spark-shell by default, so make sure you use the same Scala version (e.g. 2.12) for the spark-shell binary and the --packages coordinate.
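A minimal sketch; the coordinate version below is an example and should match your Spark release:
code :
// Launch, matching the shell's Scala version (2.12 here):
//   spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.5
// The built-in module registers the short "avro" format name:
val df = spark.read.format("avro").load("/path/to/data.avro")
df.write.format("avro").save("/path/to/out")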