
APACHE-SPARK QUESTIONS

Median / quantiles within PySpark groupBy
I guess you don't need it anymore, but I'll leave it here for future generations (i.e. me next week when I forget).
TAG : apache-spark
Date : November 25 2020, 03:01 PM , By : Bob Wiltion
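Since Spark 3.1 the direct route is `percentile_approx` inside the aggregation, e.g. `df.groupBy("key").agg(F.percentile_approx("value", 0.5))` (on older versions, `F.expr("percentile_approx(value, 0.5)")` does the same). The per-group computation it approximates is easy to sketch in plain Python (data here is made up):

```python
from statistics import median

# Toy stand-in for a PySpark groupBy().agg(percentile_approx(...)):
# compute the median of "value" within each "key" group.
rows = [
    ("a", 1.0), ("a", 2.0), ("a", 9.0),
    ("b", 4.0), ("b", 6.0),
]

groups = {}
for key, value in rows:
    groups.setdefault(key, []).append(value)

medians = {key: median(values) for key, values in groups.items()}
print(medians)  # {'a': 2.0, 'b': 5.0}
```

Note that `DataFrame.approxQuantile` computes the same quantiles, but over a whole column rather than per group.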
Spark DF to Tableau TDE
I fixed this by placing all the required jars (the ones included in the package werneckpaiva:spark-to-tableau:0.1.0) in the Spark bin folder and referencing those jars when launching the job.
TAG : apache-spark
Date : November 24 2020, 03:01 PM , By : Hari Mohanty
Why am I not able to override num-executors option with spark-submit?
I am trying to override Spark properties such as num-executors when submitting the application with spark-submit, but the new values are not taking effect.
TAG : apache-spark
Date : November 23 2020, 03:01 PM , By : Luis Ch
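The usual cause is Spark's property precedence: values set on SparkConf in the application code win over spark-submit flags (`--num-executors` maps to `spark.executor.instances`), which in turn win over spark-defaults.conf; dynamic allocation (`spark.dynamicAllocation.enabled`) can also override a fixed executor count. A toy sketch of that resolution order (the dict contents are invented):

```python
# Spark resolves a property from several sources; highest priority wins:
#   1. values set on SparkConf in application code,
#   2. flags passed to spark-submit (e.g. --num-executors),
#   3. spark-defaults.conf.
# So --num-executors on the command line cannot override a value
# the application sets itself.
def resolve(prop, code_conf, submit_flags, defaults):
    for source in (code_conf, submit_flags, defaults):
        if prop in source:
            return source[prop]
    return None

value = resolve(
    "spark.executor.instances",
    {"spark.executor.instances": "2"},   # hard-coded in the app
    {"spark.executor.instances": "10"},  # --num-executors 10
    {"spark.executor.instances": "4"},   # spark-defaults.conf
)
print(value)  # prints 2: the in-code setting wins, the flag is ignored
```

If the flag is being ignored, check the application code for a `SparkConf.set` (or `config(...)` on the session builder) that pins the same property.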
Spark SQL UNION - ORDER BY column not in SELECT
I'm doing a UNION of two temp tables and trying to ORDER BY a column, but Spark complains that the column I am ordering by cannot be resolved. Is this a bug, or am I missing something?
TAG : apache-spark
Date : November 17 2020, 03:01 PM , By : Ramya
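The ORDER BY applies to the result of the whole UNION, so it can only reference columns that appear in the SELECT list; adding the sort column to both branches (or wrapping the union in a subquery and ordering outside it) resolves the error. The same pattern, demonstrated with sqlite3 purely so the snippet is runnable (table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (name TEXT, ts INTEGER);
    CREATE TABLE t2 (name TEXT, ts INTEGER);
    INSERT INTO t1 VALUES ('b', 2);
    INSERT INTO t2 VALUES ('a', 1);
""")

# The ORDER BY runs over the combined result, so the sort column (ts)
# must be selected in both branches of the union.
rows = conn.execute("""
    SELECT name, ts FROM t1
    UNION ALL
    SELECT name, ts FROM t2
    ORDER BY ts
""").fetchall()
print(rows)  # [('a', 1), ('b', 2)]
```

If the sort column should not appear in the final output, wrap the union in a subquery, order in the outer query, and drop the column in the outermost SELECT.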
Are recompiled-from-source classes in Spark jars breaking sbt's merge?
Attempting to create a fat jar with sbt fails with a merge error. If that's the wrong approach, what's a cleaner way?
TAG : apache-spark
Date : November 16 2020, 03:01 PM , By : Patrik Bobor
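A common way out is an explicit merge strategy for sbt-assembly that discards the conflicting META-INF entries, combined with marking the Spark dependencies as "provided" so they are not bundled into the fat jar at all. A build.sbt sketch (assumes the sbt-assembly plugin; `sparkVersion` is a placeholder, and the patterns should match whichever files actually collide):

```scala
// build.sbt -- requires the sbt-assembly plugin
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"

assembly / assemblyMergeStrategy := {
  case PathList("META-INF", _ @ _*) => MergeStrategy.discard
  case _                            => MergeStrategy.first
}
```

Since Spark supplies its own classes at runtime, excluding them from the assembly avoids the duplicate-class merge conflicts in the first place.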
spark access first n rows - take vs limit
This is because predicate pushdown is currently not supported in Spark; see this very good answer. Actually, take(n) should take a really long time as well. I just tested it, however, and got the same results as you did.
TAG : apache-spark
Date : November 15 2020, 03:01 PM , By : Amit Pawar
Spark Dataset selective recompute
Turns out I needed to either use job-server or the IgniteRDD support from Apache Ignite.
TAG : apache-spark
Date : November 13 2020, 03:01 PM , By : Charmaine Wooden
Spark Memory Usage Concentrated on Driver / Master
I ended up solving this issue. I had made an incorrect assertion in stating the problem: there was a collect statement at the beginning of the Spark program.
TAG : apache-spark
Date : November 07 2020, 03:01 PM , By : user7451551
How to set Unmodifiable collection serializer of Kryo in Spark code
In case anybody else faces this issue, here is the solution: I got it working with the javakaffee kryo-serializers library, pulled in as a Maven dependency.
TAG : apache-spark
Date : November 05 2020, 03:01 PM , By : Jan
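For reference, the Maven coordinates of that library (the version shown is only an example; check for the latest release):

```xml
<dependency>
    <groupId>de.javakaffee</groupId>
    <artifactId>kryo-serializers</artifactId>
    <version>0.45</version>
</dependency>
```

Its UnmodifiableCollectionsSerializer is then registered with Kryo from a custom KryoRegistrator, whose class name is passed to Spark via the spark.kryo.registrator property.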
get latest schema for partitionned parquet dataframe
I am writing this in PySpark; it should be applicable to other languages.
TAG : apache-spark
Date : November 04 2020, 03:01 PM , By : bernice agyei
How to export data from Cassandra to BigQuery
We decided to move 5 years of data from Apache Cassandra to Google BigQuery. The problem was not just transferring the data or export/import; the issue was the very old Cassandra! After extensive research, we planned the migration.
TAG : apache-spark
Date : November 02 2020, 03:01 PM , By : user7450289
saveAsTable ends in failure in Spark-yarn cluster environment
The way to get rid of the problem is to provide the "path" option prior to the save operation, i.e. df.write.option("path", ...).saveAsTable(...).
TAG : apache-spark
Date : October 29 2020, 04:01 PM , By : Gork
How to get non-null sorted ascending data from Spark DataFrame?
I load the data into data frames where one of the columns is zipCode (String type). How can I get the non-null values of that column in ascending order in Scala? Many thanks in advance.
TAG : apache-spark
Date : October 15 2020, 08:10 PM , By : user2175301
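In Scala this is just a null filter plus an ascending sort, e.g. df.filter($"zipCode".isNotNull).orderBy(asc("zipCode")), or equivalently df.na.drop(Seq("zipCode")).sort("zipCode"). The underlying logic, sketched in plain Python with invented data:

```python
# Stand-in for: df.filter(col("zipCode").isNotNull()).orderBy(asc("zipCode"))
zip_codes = ["30301", None, "10001", None, "60601"]

non_null_sorted = sorted(z for z in zip_codes if z is not None)
print(non_null_sorted)  # ['10001', '30301', '60601']
```

Note that zipCode is a String column, so the sort is lexicographic; that is fine for fixed-width zip codes but would misorder variable-width numbers.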
How to catch a casting issue in SparkSQL
You can register a UDF that catches the error and call it in place of a plain cast.
TAG : apache-spark
Date : October 14 2020, 02:21 PM , By : Viktoria Dukova
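The UDF body is just a guarded cast: attempt the conversion and return None (NULL) on failure, so bad rows can be flagged or logged instead of erroring out. A plain-Python sketch of the function you would wrap with F.udf (the name is invented; newer Spark versions also offer try_cast in SQL):

```python
# Guarded cast: return None instead of raising when the value
# cannot be converted to an integer.
def safe_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

print([safe_int(v) for v in ["42", "oops", None, "7"]])  # [42, None, None, 7]
```

In PySpark you would register it as, for example, F.udf(safe_int, IntegerType()) and apply it to the column in place of cast().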