I need advice on implementing a data lake. Good references, tutorials, or examples of how to implement the data lake concept, or even just a pointer in the right direction, would be enough.

Thanks in advance

What I have tried:

I'm looking to set this up for my organization and I have no idea where to start. Any help will be appreciated.
Comments
Mehdi Gholam 8-Jul-16 2:01am    
Google is your friend.
GoodyGoodyGoody 9-Jul-16 21:35pm    
I've actually come across two solutions. Has anyone tried or used either of them for on-premise storage?

1. Informatica Data Lake
2. Azure Data Lake
Sni.DelWoods 10-Aug-18 7:12am    
Depends on which data you have and what you want to do with that data.

I implemented a non-generic data lake as a separate SQL database with multiple tables, with every field stored as nvarchar.

The data I get has a fixed structure, so I import it from CSV files into a matching table (rawlist1.csv to table [rawlist]).
The CSV arrives every day with a growing number of lines, but the production table should receive only the new rows. So I import all lines into the data lake table and copy only the new rows to the production table.

This keeps the production table free of obsolete data. The data lake table keeps only the data from the last 10 days.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


