Spark

Great Reads

by Sacha Barber
Looking at Spark/Cassandra working together
by Sacha Barber
Looking at Spark/Cassandra working together
by Sacha Barber
Examination of Apache Spark Databricks platform on Azure
by Nick Veld
Jupyter + HDFS + YARN + Spark and only one open port using NGINX

Latest Articles

by Nick Veld
Jupyter + HDFS + YARN + Spark and only one open port using NGINX
by MehreenTahir
This article gives a gentle introduction and a quick getting-started guide to Apache Spark for .NET for big data analytics.
by Sacha Barber
Examination of Apache Spark Databricks platform on Azure
by Mallanagouda Patil
This article helps you set up Apache Spark on Windows in easy steps.

All Articles

20 Jan 2016 by Sacha Barber
Looking at Spark/Cassandra working together
10 Feb 2016 by Sacha Barber
Looking at Spark/Cassandra working together
15 May 2018 by Sacha Barber
Examination of Apache Spark Databricks platform on Azure
22 Apr 2021 by Nick Veld
Jupyter + HDFS + YARN + Spark and only one open port using NGINX
4 Feb 2017 by Mallanagouda Patil
This article helps you set up Apache Spark on Windows in easy steps.
5 May 2019 by Kornfeld Eliyahu Peter
Impala and Spark are two separate SQL engines for use with Hadoop, and one cannot use features from the other. So no: if you use Impala there is no Spark, and if you use Spark there is no Impala...
26 Mar 2017 by Emmanuel Portelli
I am trying to implement an FPGrowth algorithm using Spark's MLlib but do not know how to proceed. I have seen multiple examples, but they do not include cross-validation, where a data set is split into training and test sets. // Recommendation engine can be per league// "Ligue 1"// "Bundesliga "...
30 May 2017 by Richard MacCutchan
Go to www.google.com
9 Jul 2017 by Smartguy3k
Hi, I am trying to run a few Spark commands using SparkR (from a local R GUI). To set up the Spark cluster on EC2, I used most of the commands from https://edgarsdatalab.com/2016/08/25/setup-a-spark-2-0-cluster-r-on-aws/ with small modifications to install the latest versions. All I was...
3 Apr 2018 by Member 13760762
I want to save data in libsvm format using Python, so I chose PySpark for the task. But the data I saved was not in libsvm format. Here is my code: from pyspark.mllib.util import MLUtils from pyspark.mllib.regression import LabeledPoint d = c.map(lambda line:...
23 Apr 2018 by Fares hussein
Hi, I have a Dataset of Track.class and I want to merge all tracks that fall within the same time interval, for example 5 minutes; i.e., any track that starts within 5 minutes after another track ends should be treated as the same track. It looks like a fusion task. My input: ...
16 Jun 2018 by Richard MacCutchan
The data contains an item that is not a number, so you need to strip that out of your list before trying to convert.
5 May 2019 by Jackie Lloyd
Could somebody please help me with this query :). We use Impala to query data, with Sentry to restrict access to data at column level. We use Spark to write code to query data stored in files. My understanding is that Sentry roles cannot control access at column level when used with Spark....
5 Dec 2020 by Member 15012716
Hi guys, I have a master calendar running from 1991 to 2050. Now I want to create a new field weeknumberasyear where if w1 contains
5 Dec 2020 by Richard MacCutchan
Take a look at Week Numbers according to ISO8601. It should be fairly easy to modify it to any day of the week.
16 Mar 2021 by Member 12645291
I want to create 2 data frames out of the below list:- results = [ {'type': 'check_datatype', 'kwargs': {'table': 'cars', 'columns': ['car_id','index'], 'd_type': 'str'}, 'datasource_path': '/cars_dataset_ok/', ...
26 Apr 2021 by Muhanned Shahada
I just bought a new MacBook M1 and I am having a hard time setting up PySpark on macOS. I would appreciate any help, as I am completely new to Mac. I have followed many online tutorials and instructions but for some...
26 Apr 2021 by ak1996
Hi, I am trying to read a Hive table through Spark, but Spark is giving the error below. Here is the spark-submit command I used: spark-submit --master yarn --deploy-mode cluster --jars PATH_TO_APPJAR,path/hive-jdbc-1.2.1.spark2.hdp.1.jar ...
20 Jun 2021 by EBRAHIM ANGOLKAR
So there are match_id, batsman, and batsman_runs columns. The batsman_runs column holds the number of runs the batsman scored on a ball, such as 0, 1, 2, 3, 4, or 6. I need to find which batsman scored the most runs in every match in...
4 Oct 2021 by Richard MacCutchan
The title says it all: Apache Spark™ - Unified Analytics Engine for Big Data
29 Aug 2022 by Silpa Silpa
Input: 1. Given below input Partner data in comma separated file. (file name - Partner.csv). 2. Given List of Invalid Party Ids. Requirement: 1 Read the input Partner csv file and Invalid party ids list as Dataframes using spark APIs....
29 Aug 2022 by Sandeep Mewara
Here is an example of what you are looking for: Spark DataFrame Where Filter | Multiple Conditions - Spark by {Examples} package com.sparkbyexamples.spark.dataframe import org.apache.spark.sql.{Row, SparkSession} import...
17 Jan 2023 by Maxwell Corner
So I have this spark dataframe with following schema: ``` root |-- id: string (nullable = true) |-- elements: struct (nullable = true) | |-- created: string (nullable = true) | |-- id: string (nullable = true) | |-- items: array...
27 May 2023 by Member 8840306
I am new to Spark. I want to use the function "take(3)" to display 3 rows of my CSV file. It collects the records correctly: raw_data = sc.textFile("day.csv") raw_data.collect() Quote: ['Name,Abbreviation,Numeric,Numeric-2',...
1 Dec 2019 by MehreenTahir
This article gives a gentle introduction and a quick getting-started guide to Apache Spark for .NET for big data analytics.
12 May 2021 by OriginalGriff
If you wrote that code, you should be able to very simply modify it - it's a less complicated task. And since that code is copy'n'pasted from here: Spark WordCount example - Java Developer Zone I don't believe a word of what you are saying. ...
16 Jun 2018 by LearningSpark
Hi all, I am new to the big data world and need your help. Here is my question: I am reading data from a text file (1,2,3,4,4,4,4) var file=sc.textFile("file:///home/cloudera/MyData/Lab1/numbers.txt") var number=file.flatMap(line=>line.split(",")) var...
12 May 2021 by ND Gaming
I wrote a Spark program to count words, but now I want to count letters instead. Can anyone please tell me what I have to change in the word-counting code to count letters instead? What I have tried: Here is the code to count...
21 Apr 2022 by Palkin Jangra
I have a housing dataset in which I have both categorical and numerical variables. Out of this dataset I created another dataset of numeric_attributes only in which I have numeric_attributes in an array. Dataset - Array values. Numeric_attributes...
27 May 2023 by Richard MacCutchan
Look at the error message: Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases. 23/05/27 22:29:16 ERROR Executor: Exception in task 0.0 in stage...