How to partition Athena tables based on a particular column

I have an Athena table with many columns that loads its data from an S3 bucket location. Let's say the data behind the table is 1 GB.

I want to query the table for a particular id, so for N ids I have to scan N × 1 GB of data.

To avoid this and reduce cost, I'd like to partition the table on the id column.

CREATE EXTERNAL TABLE `newtable`(
  `abc` int, 
  `bcd` string, 
  `cde` int, 
  `def` int, 
  `efg` timestamp,  
  `egh` int)
PARTITIONED BY ( 
  `id` int)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket/folder'

After creating the partitioned table, I load all partitions. When I then try to load the data, it reports that no records were found.

What I have tried:

MSCK REPAIR TABLE seatdata_cas;

Posted 3-Apr-18 0:57am

Kaarthick Raman
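
A likely cause, offered as a sketch rather than a confirmed diagnosis: MSCK REPAIR TABLE only discovers partitions stored under Hive-style prefixes such as s3://bucket/folder/id=1/, and Athena reads the value of a partition column from that prefix, not from a field inside the files. If the files sit directly under s3://bucket/folder, no partitions get registered and queries return no rows. Note also that the repair has to target the partitioned table itself; the statement in the question names seatdata_cas while the DDL creates newtable. Assuming the data can be rearranged into one prefix per id (the id values 1 and 2 and the file names below are hypothetical), the partitions could be registered along these lines, running each statement as its own query:

```sql
-- Assumed layout (hypothetical ids and file names). The files are expected to
-- hold only the six declared columns, since id comes from the prefix:
--   s3://bucket/folder/id=1/part-0000.csv
--   s3://bucket/folder/id=2/part-0000.csv

-- With Hive-style prefixes in place, this discovers and registers
-- every id=<value> partition under the table location.
MSCK REPAIR TABLE newtable;

-- Alternative: register a single partition explicitly. This also works
-- for prefixes that do not follow the id=<value> naming.
ALTER TABLE newtable ADD IF NOT EXISTS
  PARTITION (id = 1) LOCATION 's3://bucket/folder/id=1/';

-- Check which partitions are currently registered.
SHOW PARTITIONS newtable;

-- Filtering on the partition column limits the scan to that prefix.
SELECT abc, bcd FROM newtable WHERE id = 1;
```

If SHOW PARTITIONS returns nothing after the repair, the objects are probably not under id=<value> prefixes, and either the layout or the explicit ALTER TABLE locations need adjusting.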

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
