OriginalGriff wrote: As in "Cheaper to build a new building somewhere else" expensive, I suspect
I've got it - move the building into one of the big cloud providers' data centers.
Save the bandwidth, the data never needs to get exposed through the public internet, and it's still cloud-based. Win-win-win.
|
That's brilliant!
Solves all the problems with the cloud I can think of...
"I have no idea what I did, but I'm taking full credit for it." - ThisOldTony
AntiTwitter: @DalekDave is now a follower!
|
In hindsight, I should've just called it "redefining what 'on-prem' means".
|
Sounds funny, but they are definitely talking about something similar. I also feel this may all be about testing our solution/architecture abilities. The easiest answer I had was to try HPC on Azure.
|
Just because you're doing graphics processing, why the need for so much storage?
Sounds like somebody got suckered by the sales goons, or some idiot is trying to impress with big number-words.
For instance, city-wide traffic monitoring systems that record plates etc. don't need that much, and city facial recognition systems don't need that much.
1. The figures quoted are just nonsense; throw those away.
2. Get rid of the goons that came up with that crap: clueless / out of their depth / making irrational crap up.
Then:
3. Be more specific on the nature of the processing, the volume, and the retention to get useful recommendations.
-- Without that it's just other people's guesses.
pestilence [ pes-tl-uh ns ] noun
1. a deadly or virulent epidemic disease. especially bubonic plague.
2. something that is considered harmful, destructive, or evil.
Synonyms: pest, plague, people
|
Depends what the system does, though. CCTV, for example, would require the data to be kept, at least for X amount of time. No point in having CCTV if the video isn't there to review when you need it.
Imagine also a city-wide service that takes people's faces and puts a hat on them. Then at any point in time, a citizen can log in and see how good/bad they look in a hat... you've gotta store all those images to do that. Silly example, but you see my point.
|
Yes, 0.5 petabytes per week seems too high. For example, I once visited the Nuclear Medicine department of a cancer hospital, which has four PET-CT machines, each spewing out 300 MB of imaging data every 15 minutes or so (say 5 GB per hour with all four machines running). With the hospital working 12 hours every day, we have 60 GB accumulated from that one department alone in a single day. The CT and MR machines themselves spew out comparable amounts of data, so we have about 150 GB per day from all departments put together. Taking 6 days as a week, we have about 1 TB per week.
Much less than 0.5 petabytes per week.
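The back-of-the-envelope arithmetic above can be written out as a quick sketch (all figures taken from the post; the post rounds 4.8 GB/hour up to 5):

```python
# Rough check of the hospital imaging volumes quoted above.
machines = 4            # PET-CT machines
mb_per_scan = 300       # MB produced per machine every ~15 minutes
scans_per_hour = 4      # one scan per 15 minutes

gb_per_hour = machines * mb_per_scan * scans_per_hour / 1000   # 4.8 GB/hour
daily_nucmed_gb = gb_per_hour * 12        # 12-hour working day: ~58 GB
weekly_hospital_tb = 150 * 6 / 1000       # ~150 GB/day, 6-day week: 0.9 TB

print(gb_per_hour, daily_nucmed_gb, weekly_hospital_tb)
```

Either way the weekly total lands around 1 TB, roughly three orders of magnitude below 0.5 PB.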
|
Amarnath S wrote: I had visited the Nuclear Medicine department of a Cancer hospital
Using big scary words doesn't equate to big scary amounts of data. You seem to suggest there should be a correlation between the importance of software and the amount of data it produces. That would probably make YouTube one of the most important bits of software in the universe.
|
Bingo.
|
0.5 petabytes per week, and you want to process and store it in the cloud?
OK, on a 10 gigabit link (which you won't have) it'll take around 937 days to upload 0.5 petabytes.
Yes, you would need 937 days per week to upload the week's data.
-- The fastest link currently available is 18.2 Gb (South Korea); still looking at >400 days.
-- 5G promises 100 Gb; that's >90 days per week.
Your figures and information are just ridiculous, just pathetic.
Clearly whoever is coming up with those is totally clueless.
And you want to put it on the cloud, or the clown???
Don't care that you crossed it out.
Asking people for advice?
Provide real and proper information, not this bogus crap.
BINGO THAT!
|
I never thought that the word "petabyte" could have troubled someone so much.
Please ignore the message if it did not interest you, just like I'm doing for your message now.
And go home and have a chilled beer on me.
Bingo that?
|
Maybe I am doing the numbers wrong, but I get 4.6 days to transfer 0.5 PB at 10 Gb/s.
0.5 PB = 500,000 GB
10 Gb/s = 1.25 GB/s
500,000 / 1.25 = 400,000 (seconds)
400,000 / 86,400 = 4.63 days
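The same steps in runnable form (decimal units, 8 bits per byte):

```python
# Time to move 0.5 PB over a saturated 10 Gb/s link.
total_gb = 0.5 * 1_000_000      # 0.5 PB = 500,000 GB
gb_per_s = 10 / 8               # 10 Gb/s = 1.25 GB/s
seconds = total_gb / gb_per_s   # 400,000 s
days = seconds / 86_400
print(f"{days:.2f} days")       # 4.63 days
```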
|
Are your 10 Gb/s GigaBytes? Or GigaBits?
M.D.V.
If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about?
Help me to understand what I'm saying, and I'll explain it better to you
Rating helpful answers is nice, but saying thanks can be even nicer.
|
10 Gb = 10 Gbits (hence small b)
1.25 GB = 10 GBytes (hence big B)
|
I now realize the capitalization in your previous message... (it was late at night when I wrote).
But I think the comparisons in this message mix them up, as I did yesterday.
musefan wrote: 1.25 GB = 10 GBytes (hence big B)
I think you wanted to say
1.25 GB (big B) = 10 Gb (small b)
|
I was a Solution Architect a couple of years ago for a system that stored 2 PB per week of video and telemetry data from self-driving car development.
You are correct that that is a large amount of data, roughly 3.6 GB/s if you think of it as a constant stream. In addition to the challenge of simply storing it, it also has to be simultaneously backed up and analysed.
Data arrives at the data center from the many cars in the field not via cables but on SSD-based cartridges that have to be read in via reader stations. The cartridges are then returned to the field to collect more data.
Storage for production and "backup" is provided by hundreds of Dell-EMC Isilon NAS nodes. Analysis of the data is done in more than 300 servers each with 32 cores and 0.5 TB of RAM.
It sounds mind blowing but there really are companies collecting, storing and processing data on that scale.
Andy
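As a sanity check on the 2 PB/week figure, a short sketch: the sustained rate works out to roughly 3.3 GB/s with decimal petabytes, or about 3.7 GB/s if you read PB as pebibytes, which brackets the 3.6 GB/s quoted above:

```python
# Sustained ingest rate implied by 2 PB of new data per week.
seconds_per_week = 7 * 24 * 3600                    # 604,800 s
decimal_gb_s = 2e15 / seconds_per_week / 1e9        # PB = 10^15 bytes
binary_gb_s = 2 * 2**50 / seconds_per_week / 1e9    # PB read as 2^50 bytes
print(f"{decimal_gb_s:.2f} GB/s, {binary_gb_s:.2f} GB/s")
```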
|
musefan wrote: That would probably make Youtube one of the most important bits of software in the universe
And everybody knows that's FarceBok.
|
I concur. I developed a system for a hospital that recorded all activity from around 70 cameras 24/7, and it never needed more than about 800 GB. Certainly less than a terabyte, never mind petabytes!
- I would love to change the world, but they won’t give me the source code.
|
Is the application 3rd party? If so, ask them for recommendations on how to balance the workload across machines. Most applications with high processing requirements will support this kind of shared-workload scenario.
If it's in-house, then the developers should have a pretty big say in the best way to maximise performance.
Any serious amount of image processing is best done across multiple machines, but that requires the application to support it.
Storage is a separate issue, so design it as such; you only really need to make sure the connection between the processing machine(s) and the storage machine(s) is fast enough to keep up. Other than that, they're two separate requirements.
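A minimal sketch of that split, assuming a hypothetical process_image step and made-up file names (nothing here is from a real product):

```python
# Sketch: fan image processing out across worker processes while storage
# stays a separate concern. process_image and the file names below are
# placeholders, not a real API.
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def process_image(src: Path) -> str:
    # Stand-in for the real CPU/GPU image work; just names an output file.
    return str(src.with_suffix(".out"))

def run(batch: list[Path], workers: int = 8) -> list[str]:
    # Workers read from and write to storage over the network; they don't
    # host the data themselves, so processing and storage scale separately.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_image, batch))

if __name__ == "__main__":
    print(run([Path("frame_001.png"), Path("frame_002.png")], workers=2))
    # → ['frame_001.out', 'frame_002.out']
```

The same fan-out pattern scales from cores on one box to nodes in a cluster; the storage side just has to feed the workers fast enough.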
|
musefan wrote: Is the application 3rd party?
The client is a subsidiary of one of the top Oil & Gas companies. They want to work with us to build the application. They've hired people from AMD on their side; I guess this is just for the hardware department. They also own the AI/ML teams. We are just focusing on the application that collects the data.
Now, most probably, as I've updated in my OP, the data does seem to be fairly huge. But the intent of the contact person from this company looks to be testing our capacity. He's watching to see if we'd run away at the scale of the application. We did not run, because we don't know what it means to handle petabytes of data.
|
Considering the data requirements you need I suggest the following storage system:
1 - transport layer[^]
2 - storage[^] (note hack-proof encryption in progress)
Ravings en masse^
"The difference between genius and stupidity is that genius has its limits." - Albert Einstein
"If you are searching for perfection in others, then you seek disappointment. If you seek perfection in yourself, then you will find failure." - Balboos HaGadol, Mar 2010
|
I have the same system in place for work emails, and can confirm it is very effective
|
Load test it in the cloud, then buy a server with 2x the capacity of the cloud one to cover additional workload growth.
Did you ever see history portrayed as an old man with a wise brow and pulseless heart, weighing all things in the balance of reason?
Is not rather the genius of history like an eternal, imploring maiden, full of fire, with a burning heart and flaming soul, humanly warm and humanly beautiful?
--Zachris Topelius
Training a telescope on one’s own belly button will only reveal lint. You like that? You go right on staring at it. I prefer looking at galaxies.
-- Sarah Hoyt
|
The CERN experiments produce 1 PB/second of data, which is reduced to 1 PB/day for storage (CERN Data Centre passes the 200-petabyte milestone | CERN). This allows them to store the "interesting" results out of 1 billion collision events per second. Are you telling us that your DP and image processing needs are 10% of CERN's?
Freedom is the freedom to say that two plus two make four. If that is granted, all else follows.
-- 6079 Smith W.
|