Which algorithm do I need to solve this problem?

I have an array of floats:
10.21313
10.456
10.234324
10.45758
11.4564747
10.45647
10.32425
9.34536
9.4578689
100.345345
129.3453
1.456456
10.345
10.235363
10.23425

I need to extract from this array only the elements that are most common, such as 10.*, 11.* and 9.*, because they are the nearest in value to each other. At a conceptual level, is there an existing algorithm for this, or do I have to invent my own?

What I have tried:

I have not tried anything yet.
Updated 28-Sep-20 23:16pm

To an extent, it's going to depend on what exactly you mean by "near": 10.4 is "near" 10, but is 10.5 near 10 or 11? What about 10.4999? 10.50001?
Decide on what constitutes "nearness" and you can work from there.

Then the way I'd do it would depend on the language and/or framework I was working in: a C# solution could be one line of code, while an SQL one would be longer but use the same ideas. A JavaScript solution would be very different!

For example, in C# it's just this:
C#
double[] data = { 10.21313, 10.456, 10.234324, 10.45758, 11.4564747, 10.45647, 10.32425, 9.34536, 9.4578689, 100.345345, 129.3453, 1.456456, 10.345, 10.235363, 10.23425 };
// GetNearest is not a framework method; it maps each value to the key of its group.
var common = data.GroupBy(d => GetNearest(d)).Select(g => g.Key);
You'd have to write or replace the GetNearest method yourself, based on your decision as above.
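As an illustration only, here is a minimal sketch of what such a method might look like if you decide that "near" means "rounds to the same multiple of a bucket width". The class name `Nearness`, the `bucketWidth` parameter and the `CommonKeys` helper are assumptions made for this example, not part of the answer above. Midpoints are rounded away from zero here, so 10.5 counts as near 11 while 10.4999 counts as near 10.

```csharp
using System;
using System.Linq;

public static class Nearness
{
    // Hypothetical helper: snaps a value to the nearest multiple of bucketWidth.
    // Midpoints (for example 10.5 with a width of 1) round away from zero.
    public static double GetNearest(double value, double bucketWidth = 1.0)
    {
        return Math.Round(value / bucketWidth, MidpointRounding.AwayFromZero) * bucketWidth;
    }

    // Groups the values by their bucket and returns the bucket keys,
    // most populated bucket first.
    public static double[] CommonKeys(double[] data)
    {
        return data
            .GroupBy(d => GetNearest(d))
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .ToArray();
    }
}
```

Calling `Nearness.CommonKeys(data)` on the array above puts the key 10 first, because nine of the fifteen values round to it. A wider bucket (say `bucketWidth` of 5) would merge 9.*, 10.* and 11.* into one group, which is one way to express a looser idea of nearness.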
 
Comments
csrss 29-Sep-20 4:53am    
Yes, the nearness should be calculated from the numbers in the array. We can see that 10 is near 9 and near 11, but not near 100 or 1. As far as I know, there are some clustering algorithms out there, but I'm not sure whether they fit here.
Language is C#
Maciej Los 29-Sep-20 5:18am    
Please see my answer. There you'll find an explanation of what OriginalGriff has stated.
In addition to OriginalGriff's solution: please read carefully what OriginalGriff wrote.

For the GetNearest method you can use one of the following:
Math.Floor Method (System)
Math.Ceiling Method (System)
See the difference:
Value    Ceiling    Floor
 7.03          8        7
 7.64          8        7
 0.12          1        0
-0.12          0       -1
-7.1          -7       -8
-7.6          -7       -8
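To check the table yourself, a small sketch like the following compares the candidates side by side. The class and method names (`RoundingDemo`, `Compare`) are made up for this example. Two details are worth knowing: `Math.Round` without a mode argument rounds midpoints to the nearest even number, and an `(int)` cast truncates toward zero, so it agrees with `Math.Floor` only for non-negative values.

```csharp
using System;

public static class RoundingDemo
{
    // Returns the four candidate "nearest" values for one input.
    public static (double Floor, double Ceiling, double Round, int Truncated) Compare(double v)
    {
        return (
            Math.Floor(v),    // largest whole number <= v
            Math.Ceiling(v),  // smallest whole number >= v
            Math.Round(v),    // nearest whole number, midpoints go to the even neighbour
            (int)v            // drops the fractional part (truncates toward zero)
        );
    }
}
```

For example, `RoundingDemo.Compare(-7.6)` yields a floor of -8, a ceiling of -7, a rounded value of -8 and a truncated value of -7.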


You can also use the Math.Round method or an explicit conversion to integer (the cast truncates toward zero, so it matches Math.Floor only for non-negative values):

C#
double [] data = { 10.21313, 10.456, 10.234324, 10.45758, 11.4564747, 10.45647, 10.32425, 9.34536, 9.4578689, 100.345345, 129.3453, 1.456456, 10.345, 10.235363, 10.23425 };

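// Group by the integer part of each value (the cast truncates toward zero).
var NearestDown = data
	.GroupBy(x => (int)x)
	.Select(grp => new
	{
		Key = grp.Key,
		Count = grp.Count(),
		Values = string.Join(";", grp)
	})
	// Most populated groups first.
	.OrderByDescending(x => x.Count)
	.ToList();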


Result:
Key Count Values
10   9    10.21313;10.456;10.234324;10.45758;10.45647;10.32425;10.345;10.235363;10.23425 
9    2    9.34536;9.4578689 
11   1    11.4564747 
100  1    100.345345 
129  1    129.3453 
1    1    1.456456
 
Comments
csrss 29-Sep-20 5:20am    
Yes, but I already know how to find the most common numbers in an int array. The question is really about an algorithm to cluster the data. I just wanted to know whether any existing ones are applicable.
Maciej Los 29-Sep-20 5:36am    
Well, as for data clustering algorithms: there is no set of built-in algorithms. Every time you want to group, split, or aggregate data, you need to define what you want to achieve, what dependencies you want to find, and so on. I'd suggest reading about "big data" or "data mining". In short: when mining data, you usually have to combine a set of methods. This set of "jobs to do" is known under a common name: an algorithm. Good luck!
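For one-dimensional data like the array in the question, one simple way to let the nearness come from the numbers themselves is to sort the values and start a new cluster wherever the gap between neighbours is too large. The sketch below is only an illustration of that idea, not something proposed by the answers above: the names `GapClustering`, `Cluster`, `LargestCluster` and the `maxGap` threshold are assumptions, and you still have to choose `maxGap` yourself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GapClustering
{
    // Sorts the values and starts a new cluster whenever the distance
    // to the previous (smaller) value exceeds maxGap.
    public static List<List<double>> Cluster(IEnumerable<double> values, double maxGap)
    {
        var clusters = new List<List<double>>();
        foreach (var v in values.OrderBy(x => x))
        {
            bool startNew = clusters.Count == 0
                || v - clusters[clusters.Count - 1].Last() > maxGap;
            if (startNew)
                clusters.Add(new List<double>());
            clusters[clusters.Count - 1].Add(v);
        }
        return clusters;
    }

    // Returns the cluster with the most members, or an empty list for empty input.
    public static List<double> LargestCluster(IEnumerable<double> values, double maxGap)
    {
        return Cluster(values, maxGap)
            .OrderByDescending(c => c.Count)
            .FirstOrDefault() ?? new List<double>();
    }
}
```

With the array from the question and a `maxGap` of 2.0, `GapClustering.LargestCluster(data, 2.0)` keeps the twelve values between 9.34536 and 11.4564747 and leaves 1.456456, 100.345345 and 129.3453 in clusters of their own. A smaller threshold splits the data more finely, so the result depends entirely on that choice.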

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


