Hadoop

Great Reads

Hadoop For .Net & AngularJS Developers

by Bert O Neill

Query Hadoop using Microsoft oriented technologies (C#, SSIS, SQL Server, Excel etc.)

Hadoop Beginners Guide - How to Install

by Fazlur Rahman

Step by step procedure to install Hadoop 2.7.3 version on Ubuntu 16.04 operating system

Implementing Joins in Hadoop Map-Reduce

by Suffyan Asad

How to implement Joins in Hadoop Map-Reduce applications during Reduce and Map phases

Applying Lambda Architecture on Azure

by Vladimir Dorokhov

Design and development of a simple analytics system using Lambda Architecture principles and the Microsoft Azure cloud

Latest Articles

Hadoop Beginners Guide - How to Install

by Fazlur Rahman

Step by step procedure to install Hadoop 2.7.3 version on Ubuntu 16.04 operating system

AWS Analyze Big Data with Terraform

by YegorDovganich

Following 'Infrastructure as Code' rules, we build a real project sample from scratch that deploys an EMR cluster and runs a Hive script on it. It covers the 'Analyze Big Data with Hadoop' project from the AWS 'Learn to Build' section.

Big Data

by Mahsa Hassankashi

It is almost everything about big data.

Working with Big Data on Alibaba Cloud

by Michael_Churchman

Alibaba Cloud offers a range of Big Data solutions. This article outlines them and explains which types of Big Data services on the Alibaba Cloud align with various workloads.

All Articles

Hadoop

How do I correct this indentation error?

11 Jul 2021 by Abhijit Dare

p="foo foo quux labs foo barquux".split() d={} s=[] count=1 for x in p: if x not in s: d.update({x:count}) s.append(x) else: d[x]+=1 print(d) What I have tried: Hello In this program, I intend to count the occurrence of each...

Technologies to learn to switch

10 May 2015 by Afzaal Ahmad Zeeshan

Hello Rajasekhar, I would give you an overview of two paths that you are looking at. First one is the new one that you want to move yourself into. Second path is the one you are already on. So, coming to the first one. If you seriously want to switch your career field, from one to...

hadoop installation in windows 7 64 bit

3 Oct 2015 by Afzaal Ahmad Zeeshan

Setting the JAVA_HOME variable is the first step for any application or program that requires the JDK. There are many tutorials already available, but I will try to provide ones that meet your needs and are standards-based. Installing the JDK Software and Setting JAVA_HOME[^] (From...
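Before changing any settings, it can help to see what is actually wrong with the current value. The following is a minimal Python sketch, not part of the linked tutorial: the helper name `java_home_problem` is hypothetical, and it assumes a JDK keeps its launcher under `bin/` as `java` (Unix-like systems) or `java.exe` (Windows).

```python
import os
from pathlib import Path


def java_home_problem(env=None):
    """Return a description of what is wrong with JAVA_HOME, or None if it looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    home = Path(java_home)
    if not home.is_dir():
        return f"JAVA_HOME points to a missing directory: {home}"
    # Assumption: the JDK launcher lives in bin/ as java or java.exe.
    if not any((home / "bin" / name).is_file() for name in ("java", "java.exe")):
        return f"no java launcher found under {home / 'bin'}"
    return None


if __name__ == "__main__":
    print(java_home_problem() or "JAVA_HOME looks usable")
```

A common mistake this catches is pointing JAVA_HOME at the `bin` directory itself rather than at the JDK root.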

How to select the right hadoop distribution?

20 Jul 2017 by Ailsa Harvey

Hi Buddy. How do we choose the right Hadoop distribution from the numerous options that would serve our purpose? Not all Hadoop distributions have the same components (but they all include Hadoop’s core capabilities). Thanks. What I have tried: I have tried to choose the...

What is Python code to use sqoop and flume import jobs?

3 Aug 2021 by Amel Hadfi

I've been able to use Sqoop & Flume import commands perfectly fine on Ubuntu terminal. But right now, I'm trying to do so on Jupyter notebook. 1) How can I import from MySQL to HDFS using Sqoop command on Jupyter notebook? 2) what is Flume...

.NET to Hadoop Connection using Kerberos Ticket

26 Mar 2016 by Amit Kumar Tiwari

.NET to Hadoop connection using Keytab file

Loaded library lib-native-libhadoop.so.1.0.0 might have disabled stack guard. How to resolve it?

25 Jul 2018 by anjitaa

Loaded library lib-native-libhadoop.so.1.0.0 might have disabled stack guard. How to resolve it? What I have tried: I have tried loaded library lib-native-libhadoop I think 1.0.0 might have disabled stack guard

In mapreduce how to sort intermediate output based on values?

27 Jul 2018 by anjitaa

"The MapReduce sort the intermediate data(between mapper and reducer phase) by key by default. If we want the data should be sort based on value, then we need secondary sorting. There are 2 approaches to fulfill the same. 1. If reducers will get all the value for a particular key and buffer...

Explain the word count implementation via hadoop framework?

28 Jul 2018 by anjitaa

What do you understand by Word Count implementation via Hadoop framework? Explain in detail What I have tried: I am not able to implement the Word Count implementation via the Hadoop framework?

Why not a single point of contact, only namenode can be used for handling all read/write requests in HDFS?

31 Jul 2018 by anjitaa

" No, it is not feasible given the distributed architecture of HDFS. If ‘n’ no of clients process read/write requests simultaneously, then it will increase overhead on Namenode.To avoid these bottlenecks, a distributed system of a computing architecture in master-slave fashion is proposed. "

How to enable trash/recycle bin in hadoop?

16 Aug 2018 by anjitaa

"To enable the trash feature and to set the time delay for the trash removal in Hadoop, we have to edit the fs.trash.interval property in core-site.xml to the delay (and this has to be in minutes). Ex: if you want users to have 10 hours (600 minutes) to restore a deleted file, you should specify...

How to configure hadoop to reuse JVM for mappers?

20 Aug 2018 by anjitaa

"To configure Hadoop to reuse JVM for mappers, we just need to add entry in the configuration file: $HADOOP_HOME/conf/mapred-site.xml mapred.job.reuse.jvm.num.tasks -1 We need to specify a number value how many times the JVM is to be reused...

How Do I Configure Eclipse For Hadoop In Linux .. Can Anyone Suggest Me A Link To Download A Eclipse Plugin

22 Sep 2015 by anto_bernad

It's urgent, guys. I need to know how to configure Eclipse for Hadoop in Linux. Can anyone suggest a link to download an Eclipse plugin?

In mapreduce how to sort intermediate output based on values?

27 Jul 2018 by Bansal himani

How to sort intermediate output based on values in MapReduce ? What I have tried: How to sort intermediate output based on values in MapReduce?

Explain the word count implementation via hadoop framework?

28 Jul 2018 by Bansal himani

"Word Count Implementation will be as follows: For ex: Input File 1 contains data: “This is December Month.” Input File 2 contains data: “December is the last month of the year.” Step 1: Mapper will generate the following below output: Input File 1 output ...

How to enable trash/recycle bin in hadoop?

16 Aug 2018 by Bansal himani

How can I enable Trash/Recycle Bin in Hadoop? What I have tried: I was not able to enable Trash/Recycle Bin in Hadoop

Problem with Hadoop library Java compilation

28 May 2018 by Bata Omou

I am trying to compile and execute this Java MapReduce code locally in Eclipse, but this problem shows up. Please help: where is the issue? This is the error that shows up: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where...

Problem with Hadoop library Java compilation

28 May 2018 by Bata Omou

Yes, thank you, the problem really was that I had not set an output path. But it still shows another error about the native Hadoop library, and another one: 2018-05-28 16:27:24,687 WARN [main] util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load native-hadoop...

How to optimize query running on hive

12 Dec 2020 by BedantBiswal

Below is my query which takes around 5k mappers and 1k reducers and time taken is around 2.2 hours to finish. Any scope of optimization in here? What I have tried: SELECT sum(B.item_net_amount) net_amount, sum(B.item_gross_amount) gross_amount,...

Hadoop For .Net & AngularJS Developers

29 Dec 2015 by Bert O Neill

Query Hadoop using Microsoft oriented technologies (C#, SSIS, SQL Server, Excel etc.)

Is there a clear procedure to install HDInsight on a Windows platform?

6 Mar 2015 by BillWoodruff

http://azure.microsoft.com/en-...

Cloudera hadoop - daemon process not running

4 Mar 2016 by Chendur Srinivasan

I'm self-learning Hadoop and started off by installing Cloudera QuickStart on a VMware Workstation running CentOS. I was under the impression that the QuickStart VM has most of the configurations predefined. Do I need to set up any other configurations to set up the data and name nodes? Reason being...

How do I correct this indentation error?

15 Jul 2021 by Dasisqo

format your code here Pythoniter - Pretty Python Online Formatter[python code formatter]

How to select the right hadoop distribution?

20 Jul 2017 by Eshika Roy

It all depends on your work and working environment. There are three widely used distributions. Cloudera: choose it when you need support from Cloudera; they charge for the service; partially open source. Hortonworks: fully open source and user friendly (processing speed slow if you...

How to enable WASB on hadoop 2.7.1

20 Jul 2017 by Eshika Roy

First you need to follow some steps to enable WASB on Hadoop:
• Create an account on Windows Azure.
• Then take the service.
• Then implement Hadoop.
Follow this for a better understanding:...

Beginners Guide - Introduction of Big Data & Hadoop

21 Jan 2017 by Fazlur Rahman

What is Big Data, and how was Hadoop introduced to overcome the problems associated with Big Data?

Hadoop Beginners Guide - How To Setup Developer Environment

13 Feb 2017 by Fazlur Rahman

Step by step procedure to install NetBeans on Ubuntu 16.04 operating system with Hadoop 2.7.3 version. This may work for any other versions of Hadoop and Ubuntu.

Hadoop Beginners Guide - How to Install

22 May 2022 by Fazlur Rahman

Step by step procedure to install Hadoop 2.7.3 version on Ubuntu 16.04 operating system

How do I setup a Hadoop source code as a project in Eclipse?

17 Apr 2015 by Flowra white

I want to open hadoop source code as a project in Eclipse for the purpose of developing and studying.

How to select the right hadoop distribution?

5 May 2016 by George Jonsson

Here you can find information about different distributions: Welcome to Apache™ Hadoop®![^] Here you have a discussion forum for Hadoop: Discuss Hadoop[^] I guess your specific choice depends on your requirements.

BigDL – Scale-out Deep Learning on Apache Spark Cluster

12 Apr 2017 by Intel

BigDL is a distributed deep learning library for Apache Spark. With BigDL, users can write their deep learning applications as standard Spark programs, which can run directly on top of existing Spark or Hadoop clusters.

If I access data via spark, can I control database table access at column level with impala

5 May 2019 by Jackie Lloyd

Could somebody please help me with this query :). We use Impala to query data, with Sentry to restrict access to data at column level. We use Spark to write code to query data stored in files. My understanding is that Sentry roles cannot control access at column level when used with Spark....

Reading an excel sheet in hadoop using mapreduce and filtering a column in it?

29 May 2017 by Jayaprakash Manchi

For example, Let me explain it in detail. https://i.stack.imgur.com/DIlIT.png Like this data will be there in excel sheet as shown above with n number of rows typically huge data. Now we need to filter the column status with output as in different excel sheets or in same workbook as given...

Problem with Hadoop library Java compilation

28 May 2018 by Jochen Arndt

Quote: the line error 63 is about the output format: FileOutputFormat.setOutputPath(conf, new Path(args[1])); and the error message is java.lang.ArrayIndexOutOfBoundsException So there is no second command line argument present when executing the application. You have to execute the...
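The failure described here (indexing a command-line argument that was never passed) is usually avoided by checking the argument count up front. The thread's code is Java; the following is only a Python analogue of that guard, with a hypothetical helper name. Note that Python's argument list includes the program name at index 0, unlike Java's `args`.

```python
def parse_paths(argv):
    """Return (input_path, output_path), or raise a clear error if either is missing."""
    if len(argv) < 3:  # argv[0] is the program name in Python
        raise ValueError(f"usage: {argv[0]} <input path> <output path>")
    return argv[1], argv[2]


print(parse_paths(["job.py", "in/", "out/"]))  # ('in/', 'out/')
```

Failing early with a usage message is easier to diagnose than an index error raised deep inside job setup.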

The problem on processing jobs on Yarn of the Pseudo-Distributed Hadoop

23 Sep 2015 by Justin Zh.

Hi, all! Here is some information: Windows 10 with VMware 12; Ubuntu 14.04.3 LTS with VMware tools; JDK1.8.0_60; HADOOP-2.7.1. It works perfectly when I try to process the job on HDFS of the Pseudo-Distributed Hadoop (without Yarn, and the job is done in several seconds). Once I have set...

Microsoft HDInsight Emulator for Windows Azure installation via WPI 5.0 returns installation not successful error

31 Dec 2014 by kadriu

If you have JDK 8.x installed, uninstall and install JDK 7.x. This worked for me.

Loaded library lib-native-libhadoop.so.1.0.0 might have disabled stack guard. How to resolve it?

25 Jul 2018 by kasliwal aayush

"This error could be due to wrong JDK package. Hadoop runs on 64 bit ..so try to uninstall 32bit JDK and install 64 bit JDK8 Please add following variables to .bashrc environment file, export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_ROOT/lib/native export...

If I access data via spark, can I control database table access at column level with impala

5 May 2019 by Kornfeld Eliyahu Peter

Impala and Spark are two separate SQL engines for use with Hadoop... One can not use features from the other!!! So, no if you use Impala there is no Spark, if you use Spark there is no Impala...

Spark scala-count even numbers from file

16 Jun 2018 by LearningSpark

Hi All, I am new to the Big Data world and need your help. Here is my question: I am reading data from a txt file (1,2,3,4,4,4,4) var file=sc.textFile("file:///home/cloudera/MyData/Lab1/numbers.txt") var number=file.flatMap(line=>line.split(",")) var...

How to enable WASB on hadoop 2.7.1

15 Sep 2017 by Leviya bl

Hi, here is the step-by-step process[^]. WASB is automatically enabled in HDInsight clusters. But you can also mount a blob storage account manually to a [^]Hadoop Administration instance that lives anywhere as long as it has Internet access to the blob storage. Here are the steps: I assume...

Big Data MapReduce Hadoop Scala on Ubuntu Linux by Maven intellj Idea

12 Sep 2017 by Mahsa Hassankashi

This article is the most complete essay about big data from scratch to practical.

Big Data

3 Apr 2019 by Mahsa Hassankashi

It is almost everything about big data.

Debugging Hadoop HDFS using IntelliJ IDEA on Linux

27 Dec 2015 by Mallanagouda Patil

This article helps set up a debug environment for the Hadoop framework on Ubuntu Linux using IntelliJ IDEA

How to select the right hadoop distribution?

6 May 2016 by Mankuji87

Hi Ailsa, here are some helpful links. I hope they will help you:
Spoilt for Choice – How to choose the right Big Data / Hadoop Platform?[^]
How to Choose a Hadoop Distribution - For Dummies[^]
How to Choose the Right Hadoop Distribution?[^]
Top 3 Hadoop distributions, which is right for...

Microsoft HDInsight Emulator for Windows Azure installation via WPI 5.0 returns installation not successful error

30 Dec 2014 by Mansoor Alikhan K

Microsoft HDInsight Emulator for Windows Azure installation via WPI 5.0 returns installation not successful: fatal error. Error logs are here: === Verbose logging started: 06/Dec/14 19:16:34 Build type: SHIP UNICODE 5.00.7601.00 Calling process: C:\Program Files\Microsoft\Web Platform...

hadoop installation in windows 7 64 bit

3 Oct 2015 by Mehdi Gholam

Start here : http://harishshan.blogspot.co.uk/2014/10/install-hadoop-251-on-windows-7-64bit.html[^]

Is there a clear procedure to install HDInsight on a Windows platform?

6 Mar 2015 by Mehdi_S

Hi, I have been trying to install HDInsight on a Windows platform but without success. I'm wondering if there is a clear procedure to install it, which version of Windows it is compatible with, and if there is a direct link to download it (without using the web platform installer). Thank you...

Getting a error while using code "hadoop fs -mkdir /in"

21 Sep 2014 by Member 11097824

While installing Hadoop on Windows 8.1 Pro, when I was ready to run MapReduce, I got this error message: unable to make directory. Further errors are listed below. -mkdir: java.net.URISyntaxException: Illegal character in hostname at index...

Is it good to use hadoop with mysql in android app

3 Jul 2015 by Member 11402033

I am trying to manage the database of an Android app. Would it be good to use Hadoop with a MySQL database for the Android app?

Sockettimeout error while running mapreduce job

16 Feb 2015 by Member 11456117

We are getting some warnings in our MapReduce job while reading and writing data from the datanode, although they do not abort the job. This error comes up at several places in the job. It looks like an issue with timeout variables in the hdfs-site.xml and hbase-site.xml files. What timeout values should...

Hi I Am Unable To Set The Java Home For Hadoop during the installation of hadoop

24 Jan 2023 by Member 11622664

hi i am unable to set the java home for hadoop during the installation of hadoop

How do I create and access in hadoop tables in c# on windows 7 32bit, using cygwin,hive and hbase?

15 May 2015 by Member 11694565

I am trying to set up Hadoop as a single-node cluster, and I need to create tables in Hive and HBase in order to handle the tables using C#. I have cygwin, hadoop-1.2.1 and hive-1.1.0 on Windows 7 32-bit. Running Hadoop gives "Warning: $HADOOP_HOME is deprecated." but it still works. But when...

Hadoop Installation error in creating .jar files

3 Feb 2016 by Member 11726267

Hi, I am trying to install Hadoop on Windows. I am looking for the correct path for downloading the google-gson-2.2.4-release.zip file. I downloaded the file from a couple of sites but am not able to see the jar files in the zip folder. I have only html, java, class files when extracted the...

Open windows SDK command prompt in windows 10

25 Apr 2016 by Member 11842305

I have installed the Windows SDK on Windows 10 from here: https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk But I am unable to open the Windows SDK command prompt to run my Maven commands to install Hadoop. I have searched online but didn't find anything useful. Please...

hadoop installation in windows 7 64 bit

3 Oct 2015 by Member 12029885

I can't run the Hadoop exe file; the error says JAVA_HOME is incorrectly set.

How to find min, max and mean of wordcount from text file in hadoop mapreduce

14 Oct 2015 by Member 12059854

public class MaxMinReducer extends Reducer {int max_sum=0; int mean=0;int count=0;Text max_occured_key=new Text();Text mean_key=new Text("Mean : ");Text count_key=new Text("Count : ");int min_sum=Integer.MAX_VALUE; Text min_occured_key=new Text(); public void reduce(Text key,...

How to enable WASB on hadoop 2.7.1

15 Sep 2017 by Member 13258163

I have set up a single-node Hadoop cluster (2.7.1) on Windows 7 and am now trying to establish its connection with Azure storage (wasb), with no success. I am getting the error: No FileSystem for scheme: wasb. I have been following several blogs but focused on: articles/hadoopAndWasb.md at...

Hive query to find which month is highest paid salary by department

5 Jan 2018 by Member 13609332

here is the solution for the above problem select d.department, case when (d.maxJan>=d.maxFeb) and (d.maxJan>=d.maxMarch) then 'Jan' when (d.maxFeb>=d.maxJan) and (d.maxFeb>=d.maxMarch) then 'Feb' when (d.maxMarch>=d.maxJan) ...
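The logic of the CASE expression above can be checked with a small Python sketch: take the per-department maximum for each month, then pick the month whose maximum is largest. This assumes that `maxJan`, `maxFeb` and `maxMarch` are per-department maxima, and that the third label is 'March' (that part of the query is cut off). The rows are the four complete sample rows visible in the question's input; the function name is hypothetical.

```python
from collections import defaultdict

MONTHS = ("Jan", "Feb", "March")  # the 'March' label is an assumption


def highest_paid_month(rows):
    """For each department, take the per-month maximum and return the month with the largest one."""
    maxima = defaultdict(lambda: [float("-inf")] * len(MONTHS))
    for department, *salaries in rows:
        current = maxima[department]
        maxima[department] = [max(c, s) for c, s in zip(current, salaries)]
    # max() returns the first maximal item, so ties go to the earlier month,
    # matching the >= comparisons in the CASE expression.
    return {
        dept: MONTHS[max(range(len(MONTHS)), key=lambda i: vals[i])]
        for dept, vals in maxima.items()
    }


rows = [("civil", 1, 5, 5), ("mech", 2, 7, 2), ("civil", 3, 8, 9), ("mech", 6, 4, 4)]
print(highest_paid_month(rows))  # {'civil': 'March', 'mech': 'Feb'}
```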

While running Recipe.java, getting error that Mapper, Job package is not there.

26 Sep 2014 by Member 8899038

While running Recipe.java, getting error that Mapper, Job package is not there.

Compiling hadoop java files

4 Apr 2018 by Member Hemal

You have to give all of your source files to javac. Example: javac -classpath /usr/local/hadoop/hadoop-core-1.0.4.jar -sourcepath src/ -d build/ MyMain.java

Error in this hive query :currently subquery expressions are only allowed as where clause predicates in hive

25 Jul 2021 by mgjsa

Hi, I have written a Hive query as below. It gives the error stated in the title. The query is: select clnt_nbr, case when clnt_nbr in (select clnt_NBR from crd_master where crd_typ = '198 or crd_typ = '199' ) then 1 else 0 end) as...

namenode and jobtracker dont start with start-all.sh

7 Mar 2015 by mibetty

I'm trying to install a single-node Hadoop. When I run start-all.sh, the name node and job tracker don't start. Do you see anything in my files that could be wrong and cause this result? Result of hadoop jps command: 14878 Jps 14823 TaskTracker 14605 SecondaryNameNode 14456...

Working with Big Data on Alibaba Cloud

13 Dec 2017 by Michael_Churchman

Alibaba Cloud offers a range of Big Data solutions. This article outlines them and explains which types of Big Data services on the Alibaba Cloud align with various workloads.

Where are the tables in hive are stored (location)

21 Nov 2014 by midhun3600

Hi, I am new to Hadoop. I have managed to install and use Hadoop HDFS and Hive. I am able to fetch data and insert data into Hive using Talend. My problem is that whenever we create a table from Talend (distribution: apache), it is created in Hive, but I am unable to see the same in hive...

How can multiple users access hive

9 Jan 2015 by midhun3600

Hi, I am very new to Hadoop, and somehow we managed to install it with the Apache distribution and a Derby database. My requirement is that multiple users need to access Hive at the same time, but right now only a single user can work at a time. I searched some blogs but haven't found the...

Load data to hive installed on ambari using talend

14 Jan 2015 by midhun3600

Hi, I am trying to create a Hadoop table and load data into it using Talend. I have successfully created the table but was unable to load data into it. When I execute the Talend job, I get the following error. ======================================================== FAILED: Error in semantic...

processing inside data node of hadoop

5 Dec 2015 by mohitjain012

I was learning Hadoop and came across a doubt: every slave node consists of a data node and a task tracker, and every data node consists of data blocks. Suppose we have a data node which has 10 data blocks of 64 MB each. How is the data of a data node processed inside a slave node?...

How to overwrite an existing output file/dir during execution of mapreduce jobs?

1 Aug 2018 by patelsandeep

During execution of MapReduce jobs how to overwrite an existing output file/dir ? What I have tried: I am working on a MapReduce project and need to overwrite an existing output. I'm unaware of the procedure?

How to configure hadoop to reuse JVM for mappers?

20 Aug 2018 by patelsandeep

How can we configure Hadoop to reuse JVM for mappers? What I have tried: I am not able to configure Hadoop to reuse JVM for mappers

Configuration to speed up the topology in distributed mode

27 Feb 2016 by Patrice T

The problem can be anywhere. You have to find where the bottleneck is. Your network can be in degraded mode because of bad wiring or a bad switch. Computers can be slowed down by a lack of memory. Your programs can be needlessly complicated or not optimized. It can be...

How do I correct this indentation error?

11 Jul 2021 by Patrice T

Quote: I dont understand why. Please explain. Simple: you have mixed spaces and tabs differently from the previous line. p="foo foo quux labs foo barquux".split() d={} s=[] count=1 for x in p: if x not in s: d.update({x:count}) # this line...
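For reference, here is a sketch of the same counting loop with every block indented by four spaces and no tabs. It is a simplified rewrite rather than the poster's exact code: the separate list `s` and the `count` variable are dropped because the dictionary alone can track which words have been seen.

```python
words = "foo foo quux labs foo barquux".split()
counts = {}

for word in words:
    if word not in counts:
        counts[word] = 1      # first occurrence
    else:
        counts[word] += 1     # repeat occurrence

print(counts)  # {'foo': 3, 'quux': 1, 'labs': 1, 'barquux': 1}
```

Configuring the editor to insert spaces when Tab is pressed prevents this class of error from recurring.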

HBase not connecting to ZooKeeper

20 Mar 2015 by Rabbits Foot

I am struggling to get my HBase shell running. It throws the exception in the subject line. I have checked that hbase-site.xml matches the Hadoop one perfectly. Please help. I have been struggling for 2 days and have a project due. I am attaching the two xml files of hadoop and...

Error in Splunk query?

25 Sep 2015 by ravi30713

Problem replicating config (bundle) to search peer 'myserver.com:8089',Reading reply to upload: rv=-2, Receive from=https://myserver.com:8089 timed out; exceeded 60sec, as per=distsearch.conf/[replicationSettings]/sendRcvTimeout

Configuration to speed up the topology in distributed mode

27 Feb 2016 by rehabrish

I have a topology running with parallelism (1,8,1) (spout, logic bolt, write bolt) and the number of ackers set to 12 (12 slots are available in my cluster). The max spout pending is 200 and timeout.secs is 200. I have to process 14 lakh inputs. My cluster consists of 1 nimbus & 3 supervisors (...

Open windows SDK command prompt in windows 10

25 Apr 2016 by Richard MacCutchan

See Visual Studio and Windows SDK Command Prompts[^].

Spark scala-count even numbers from file

16 Jun 2018 by Richard MacCutchan

The data contains an item that is not a number, so you need to strip that out of your list before trying to convert.
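The advice above (drop anything that is not a number before converting) can be sketched in plain Python rather than Spark. The helper names are hypothetical, and the stray token in the second call is an assumed example of what a non-numeric item might look like, such as an empty string left by a trailing comma.

```python
def to_int(token):
    """Convert a token to int, or return None if it is not a number."""
    try:
        return int(token)
    except ValueError:
        return None


def count_even(text):
    """Count even integers in comma-separated text, skipping non-numeric tokens."""
    numbers = [n for n in map(to_int, text.split(",")) if n is not None]
    return sum(1 for n in numbers if n % 2 == 0)


print(count_even("1,2,3,4,4,4,4"))      # 5
print(count_even("1,2,3,4,4,4,4,\n"))   # 5, the trailing empty token is skipped
```

The same filter-then-convert order applies in Spark: filter out the invalid tokens before mapping them to integers.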

How to overwrite an existing output file/dir during execution of mapreduce jobs?

1 Aug 2018 by Richard MacCutchan

MapReduce Tutorial[^]

How to enable trash/recycle bin in hadoop?

10 Aug 2018 by Richard MacCutchan

enable Trash/Recycle Bin in Hadoop - Google Search[^]

I am trying to install HDinsight emulator using web platform installer but I am getting this error in the log file:

12 Mar 2015 by RkRkRkRkk

(CAQuietExec: WINPKG: Unzip of C:\HadoopInstallFiles\HadoopPackages\hdp-2.1.3.0-winpkg.zip to C:\HadoopInstallFiles\HadoopPackages succeeded
CAQuietExec: WINPKG: UnzipRoot: C:\HadoopInstallFiles\HadoopPackages\hdp-2.1.3.0-winpkg
CAQuietExec: WINPKG:...

When to use R, Cassandra,... for data mining?

18 Oct 2015 by Saman With You

Hello, we are going to start research on data mining in our company. We've chosen Cassandra as our data store. I've heard that the R tool is used for data mining too, but I don't know how these relate to each other. Would Cassandra be enough to do data mining, or do we have to use R or any...

Technologies to learn to switch

10 May 2015 by Sergey Alexandrovich Kryukov

There is no definition of "good technology". Only you can decide what's good for you. If you only want to choose something, no matter what, just to be on top of things, I'm afraid you are at the wrong forum. This is a forum primarily oriented to professionals (even though some are students at...

How to use multiple classes in MapReduce programming

9 Apr 2015 by shivendrapandey

Actually, I want to write code that uses a hash table to store the data just before we process it. I have the Mapper output, but before we process this data I want to store it in a hash table (in the Reducer), but I am not able to write the,

Using NiFi to Write to HDFS on the Hortonworks Sandbox

18 Feb 2016 by Simon Elliston Ball

How to use NiFi to write to HDFS on the Hortonworks Sandbox

Spark driver disassociated and removed by the master

20 Aug 2015 by Sofia Panagiotidi

I have a cluster set up with two slaves and one master, and I submit a jar (Scala) to the Spark master (192.168.1.64): spark-submit --master spark://spark-master:7077 --class tests.elements target/scala-2.10/zzz-project_2.10-1.0.jar After running just fine for quite some time, it stops...

A Basic Introduction to Sqoop Import & Export for MySql Table to be Used as Hadoop Distributed File System (HDFS) in Cloudera

7 Jan 2017 by SrikantSahu

This tip gives basic commands to import a table from MySQL into the Hadoop file system and to export the files from HDFS back to MySQL.

Implementing Joins in Hadoop Map-Reduce

29 Jan 2015 by Suffyan Asad

How to implement Joins in Hadoop Map-Reduce applications during Reduce and Map phases

Implementing Joins in Hadoop Map-Reduce using MapFiles

16 Mar 2015 by Suffyan Asad

Implementing joins in Hadoop Map-Reduce applications during Map-phase using MapFiles

Hive query to find which month is highest paid salary by department

12 Apr 2017 by Sukanya Karri

My input:
Department Jan_sal Feb_sal Mar_sal
civil 1 5 5
mech 2 7 2
civil 3 8 9
mech 6 4 4
mech 5...

Null Pointer Exception in hadoop

3 Nov 2015 by sunny_sharma123

Hello, I am trying to set up a multi-node Hadoop cluster using two systems. Whenever I try to format HDFS, a NullPointerException occurs. I am not happy to see this again and again. If anyone has a solution, please reply...

Powerful, Easy-to-use Big Data Platform For Windows Developers

3 Dec 2014 by Syncfusion

With the Syncfusion Big Data Platform, you have complete access to the Hadoop environment. By adopting our platform, you are using an industry-tested solution currently employed by companies such as Microsoft, Facebook, Amazon, Adobe, Hulu, LinkedIn, and Yahoo.

Hi I Am Unable To Set The Java Home For Hadoop during the installation of hadoop

24 Jan 2023 by User 2753469

It's 2023 now and if you have linux and a program called 'alternatives' you can use the cmd $> alternatives --config java to find path to java versions on your machine and this program lets you choose which version you want to use if you...

MS SQL to HIVE code conversion

25 Mar 2022 by Viswanath Sitaraman

I'm trying to convert a piece of SQL code to HiveQL, and it's not working as expected. Please find below the code snippet in SQL that I'm attempting to convert: SQL Code: UPDATE C SET C.prod_l = P.prod_l, C.numprod = P.numprod, C.prod_cng...

Applying Lambda Architecture on Azure

23 Mar 2017 by Vladimir Dorokhov

Design and development of a simple analytics system using Lambda Architecture principles and the Microsoft Azure cloud

AWS Analyze Big Data with Terraform

5 Jun 2019 by YegorDovganich

Following 'Infrastructure as Code' rules, we build a real project sample from scratch that deploys an EMR cluster and runs a Hive script on it. It covers the 'Analyze Big Data with Hadoop' project from the AWS 'Learn to Build' section.

Map-Reduce to process classification data, problem areas, etc to identify the patterns and capture the reasons/responsible areas by using social websites.

12 Dec 2014 by ZurdoDev

The way this site works is we volunteer our time to help people that have gotten stuck on a specific code issue. In this case, you seem to be asking for someone to do everything for you and we don't do that.