I am trying to make a 3D plot of a galaxy catalog. I have a large number of x, y, z coordinates and a data value (w4) stored in separate HDF5 files.

Since the data set is huge, I have tried binning it.

However, the output takes forever to load. I have tried various binning techniques (binned_statistic_dd, histogramdd, etc.), but nothing has worked.


Any help will be appreciated since I have been trying this for weeks now.

What I have tried:

The code is as follows:

    gaspos = np.array(gas['Coordinates'])*ckpc/h  ##coordinates
    x = gaspos[:,0]
    y = gaspos[:,1]
    z = gaspos[:,2]
    
    w4 = gas['MagneticField']*(h/a**2)*np.sqrt(1e10*Msun/kpc)*1e5/kpc
    w4 *= w4/(8*np.pi)
    w4 = (np.dot(w4,np.ones((3,1))).T)[0]  ## 1 dimension data
    
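    ## `normed` was removed in NumPy 1.24; `density=False` is the equivalent
    hist, binedges = np.histogramdd(gaspos, density=False)
    ## note: this call overwrites the result above, and neither is used below
    hist, binedges = np.histogramdd(w4, density=False)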
    
    fig = plt.figure(figsize = (16, 9))
    ax1 = plt.axes(projection ="3d")

    ax1.scatter3D(x,y,z,c = w4)
    plt.show()
Posted (updated 2-Aug-21 1:31am)
Comments
Patrice T 2-Aug-21 7:23am    
What is 'binning'?
How huge is the data?
nikita agarwal Jun2021 2-Aug-21 7:30am    
There are 11 HDF5 files of 500 MB each.
Richard MacCutchan 2-Aug-21 9:05am    
I have no idea what an 'hdf5' is, but 11 x 500 MB = 5,500 MB or 5.5 GB, which is a lot of data and will take a significant amount of time to load. If you are using the 32-bit version of Python, this is likely to exceed the memory available to the process. You need to change your code to deal with smaller datasets.
nikita agarwal Jun2021 2-Aug-21 9:16am    
Thank you very much for your input. This may be the issue.
Could you suggest ways to reduce the data set size via python?
I mentioned this issue to my project supervisor and he suggested using statistical binning on 3D data which I tried.
However, I have not got any useful results yet.
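For reference, one way the binning could be made to reduce the plotted data: in the code in the question, the histogram results are never used, so scatter3D still receives every particle. The sketch below is only an illustration under assumptions. The helper name `bin_mean`, the grid size and the synthetic data are hypothetical and not part of the original code. It averages the value on a regular grid with `np.histogramdd` and returns one point per non-empty cell.

```python
import numpy as np

def bin_mean(pos, values, nbins=32):
    """Average `values` on a regular 3D grid covering `pos`.

    pos    : (N, 3) array of coordinates
    values : (N,) array, one value per point
    Returns the centres of the non-empty cells, shape (M, 3),
    and the mean value in each of those cells, shape (M,).
    """
    pos = np.asarray(pos, dtype=float)
    values = np.asarray(values, dtype=float)

    # Number of points per cell, and the bin edges along each axis.
    counts, edges = np.histogramdd(pos, bins=nbins)
    # Sum of `values` per cell, reusing the same edges.
    sums, _ = np.histogramdd(pos, bins=edges, weights=values)

    # Keep only the cells that contain at least one point.
    ix, iy, iz = np.nonzero(counts)
    centres = [0.5 * (e[:-1] + e[1:]) for e in edges]
    cell_pos = np.column_stack((centres[0][ix], centres[1][iy], centres[2][iz]))
    cell_mean = sums[ix, iy, iz] / counts[ix, iy, iz]
    return cell_pos, cell_mean

# Synthetic stand-in for the real catalogue (assumption, for illustration only).
rng = np.random.default_rng(0)
demo_pos = rng.uniform(0.0, 100.0, size=(200_000, 3))
demo_w4 = rng.lognormal(size=200_000)

cell_pos, cell_mean = bin_mean(demo_pos, demo_w4, nbins=16)
print(cell_pos.shape, cell_mean.shape)
```

With the real data, the returned arrays could be passed to `ax1.scatter3D(cell_pos[:, 0], cell_pos[:, 1], cell_pos[:, 2], c=cell_mean)` in place of the raw x, y, z and w4, so the number of plotted points is bounded by the number of grid cells rather than the number of particles.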
Richard MacCutchan 2-Aug-21 9:20am    
Sorry, I have no idea what your program is doing, so it is impossible to suggest anything. But trying to process that amount of data in memory on a small system is not a good idea. You need to start with a small amount of data and find ways to connect the results of one set with the others.
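The suggestion to work on smaller pieces and then connect the results fits histograms well: if every piece is binned on the same edges, the per-piece histograms can simply be added. The sketch below is a rough illustration, not a tested solution for this data. The function names, the box size and the random chunks are assumptions; in the real case each chunk would be the coordinates and w4 read from one of the 11 HDF5 files (or a slice of one), which is not shown here.

```python
import numpy as np

def accumulate_histograms(chunks, edges):
    """Sum counts and weighted sums over an iterable of (pos, values) chunks.

    Every chunk is binned on the same `edges`, so the per-chunk
    histograms can be added together without holding all data in memory.
    """
    shape = tuple(len(e) - 1 for e in edges)
    counts = np.zeros(shape)
    sums = np.zeros(shape)
    for pos, values in chunks:
        c, _ = np.histogramdd(pos, bins=edges)
        s, _ = np.histogramdd(pos, bins=edges, weights=values)
        counts += c
        sums += s
    return counts, sums

def fake_chunks(n_chunks, n_points, box, seed=0):
    """Hypothetical stand-in for reading one file (or one slice) at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(n_chunks):
        yield rng.uniform(0.0, box, size=(n_points, 3)), rng.lognormal(size=n_points)

box = 100.0                               # assumed box size, same units as the coordinates
edges = [np.linspace(0.0, box, 33)] * 3   # 32 cells per axis, fixed before reading any data
counts, sums = accumulate_histograms(fake_chunks(11, 50_000, box), edges)

# Mean value per cell; empty cells are left as NaN.
mean = np.full(counts.shape, np.nan)
np.divide(sums, counts, out=mean, where=counts > 0)
print(counts.sum(), mean.shape)
```

The key design choice is fixing the bin edges up front (for example from the known extent of the simulation box), because letting `np.histogramdd` pick edges per chunk would give grids that cannot be added. Only the two 32 x 32 x 32 grids and one chunk are in memory at any time.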

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


