I'm working on a Java project in which I have to find the similarity between two documents,
using the MinHash algorithm to estimate the Jaccard coefficient. I'm new to hashing, and my project requires 100 hash functions. The project description
says that a text S (a set of words) is represented by k values min{ h_i(w) : w ∈ S } for i = 1..100.
How can min h_i(w) be computed for i = 1..100?
My hash function is shown below. It takes a pair of integers (a, b) as its parameters.
Thank you for your time.

What I have tried:

Java
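// Generates one hash function as a random pair (a, b), each in 1..10000.
// rnd is assumed to be a java.util.Random field of the enclosing class.
private IntegerPair hashFuncGen() {
    int a = rnd.nextInt(10000) + 1;
    int b = rnd.nextInt(10000) + 1;
    return new IntegerPair(a, b);
}

// Computes h(x) = (a * x + b) mod L for the pair (a, b) held in ip.
private long hash(IntegerPair ip, int x) {
    // Note: the cast below truncates 52.981 to 52, so results fall in a very small range.
    double L = 52.981;
    // Multiply in long arithmetic so that a * x cannot overflow int.
    return ((long) ip.getA() * x + ip.getB()) % (long) L;
}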
Updated 28-Dec-19 23:24pm

1 solution
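
One way to read the formula in the question: for each of the 100 hash functions h_i, hash every word of the document and keep only the smallest result. The 100 minima form the document's signature, and the fraction of positions where two signatures agree is an estimate of the Jaccard coefficient. The following is only a minimal sketch under assumptions that are not in the question: the class name MinHashSketch, the prime modulus 2^31 - 1, the use of String.hashCode() to turn a word into an integer, and the fixed random seed are all illustrative choices.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

final class MinHashSketch {
    // Large prime modulus (2^31 - 1). An assumption, not a value from the question.
    static final long P = 2_147_483_647L;

    final long[] a; // multipliers, one per hash function
    final long[] b; // offsets, one per hash function

    MinHashSketch(int k, long seed) {
        Random rnd = new Random(seed);
        a = new long[k];
        b = new long[k];
        for (int i = 0; i < k; i++) {
            a[i] = 1 + rnd.nextInt(Integer.MAX_VALUE - 1); // 1 .. P-1
            b[i] = rnd.nextInt(Integer.MAX_VALUE);         // 0 .. P-1
        }
    }

    // h_i(x) = (a_i * x + b_i) mod P, computed in long to avoid int overflow.
    long hash(int i, int x) {
        long xm = Math.floorMod((long) x, P); // map negative hashCodes into 0 .. P-1
        return (a[i] * xm + b[i]) % P;
    }

    // signature[i] = minimum of h_i(w.hashCode()) over all words w in the set.
    long[] signature(Set<String> words) {
        long[] sig = new long[a.length];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (String w : words) {
            int x = w.hashCode();
            for (int i = 0; i < sig.length; i++) {
                sig[i] = Math.min(sig[i], hash(i, x));
            }
        }
        return sig;
    }

    // Fraction of positions where the two signatures agree.
    static double estimateJaccard(long[] s1, long[] s2) {
        int same = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] == s2[i]) same++;
        }
        return (double) same / s1.length;
    }

    public static void main(String[] args) {
        MinHashSketch mh = new MinHashSketch(100, 42L);
        Set<String> d1 = new HashSet<>(Arrays.asList("the quick brown fox jumps".split(" ")));
        Set<String> d2 = new HashSet<>(Arrays.asList("the quick brown dog sleeps".split(" ")));
        double est = estimateJaccard(mh.signature(d1), mh.signature(d2));
        System.out.println("Estimated Jaccard: " + est); // the true Jaccard here is 3/7
    }
}
```

Both documents must be hashed with the same 100 (a, b) pairs, otherwise the signature positions are not comparable. A modulus much larger than the number of distinct words matters as well: with a modulus as small as 52, many different words collide and most minima end up equal regardless of the documents.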
