Friday, April 20, 2007

Random strings and removing duplicates

Sometimes I need a large database filled with fake data to play with. In fact it is not just playing, but testing the database layer in an environment comparable to the real world. Recently I ran into a problem: I needed a lot of unique random strings of random length. Here is the result of my work. I'm using a Set to remove duplicates. According to the docs, HashSet is the fastest implementation of the Set interface. It does not guarantee the order of iteration, but in my case that is not an issue. By the way, you can compare this Java code with another one, in Python. See the difference? :) Don't beat me up if my solution is not very original... it works for me. Can it be done better? Any suggestions are welcome.
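
As a quick aside before the full program, here is a minimal sketch of the two Set properties mentioned above: duplicates are dropped on insertion, and HashSet makes no promise about iteration order. The class name SetDedupDemo, the helper dedup and the sample strings are made up for this illustration. LinkedHashSet is shown only as the standard alternative when insertion order matters.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class SetDedupDemo
{
    // Hypothetical helper: copies the input into a Set, which drops duplicates.
    // keepOrder=true uses LinkedHashSet (iterates in insertion order),
    // keepOrder=false uses HashSet (iteration order unspecified).
    static Set<String> dedup(String input[], boolean keepOrder)
    {
        Set<String> set = keepOrder ? new LinkedHashSet<String>() : new HashSet<String>();
        for(String s : input)
        {
            set.add(s); // add() returns false and changes nothing for a duplicate
        }
        return set;
    }

    public static void main(String args[])
    {
        String input[] = {"abc", "fed", "abc", "cab", "fed"};
        System.out.println(dedup(input, false).size()); // prints 3
        System.out.println(dedup(input, true));         // prints [abc, fed, cab]
    }
}
```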

import java.util.Random;
import java.util.Set;
import java.util.HashSet;

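class RandomStringRandomLength
{
    private static final String alphabet[] = {"a","b","c","d","e","f"...};

    public static void main(String args[])
    {
        // Three arguments are required: minLength maxLength numberOfRandomStrings
        if(args.length<3)
        {
            System.out.println("Usage: java RandomStringRandomLength minLength maxLength numberOfRandomStrings");
            System.exit(-1);
        }

        int numberOfRandomStrings = 0;
        int minLength = 0;
        int maxLength = 0;

        try
        {
            minLength = Integer.parseInt(args[0]);
            maxLength = Integer.parseInt(args[1]);
            numberOfRandomStrings = Integer.parseInt(args[2]);
        }
        catch(NumberFormatException numEx)
        {
            System.out.println("Please enter numbers!");
            System.exit(-1);
        }

        // random.nextInt() below needs a positive bound
        if(minLength<0 || maxLength<minLength)
        {
            System.out.println("Need 0 <= minLength <= maxLength");
            System.exit(-1);
        }

        // The Set silently drops duplicates, so it may end up holding
        // fewer than numberOfRandomStrings entries
        Set<String> set = new HashSet<String>();

        final Random random = new Random();
        for(int i=0;i<numberOfRandomStrings;i++)
        {
            // Length is picked from minLength..maxLength, both inclusive
            int rndStringLength = minLength +
                random.nextInt(maxLength-minLength+1);
            StringBuilder rndString = new StringBuilder();
            for(int j=0;j<rndStringLength;j++)
            {
                int rndIndex = random.nextInt(alphabet.length);
                rndString.append(alphabet[rndIndex]);
            }
            set.add(rndString.toString());
        }

        for(String s : set)
        {
            System.out.println(s);
        }
    }
}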
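
One thing the program above does not guarantee is the final count: duplicates are thrown away, so the Set can hold fewer strings than were asked for. A possible variation, sketched below, keeps generating until the Set reaches the requested size. The names UniqueRandomStrings, generate and countPossible are made up for this sketch, the alphabet is passed in as a String of distinct characters (an assumption, not the array from the listing above), and the up-front check is there only to avoid looping forever when the settings cannot produce enough distinct strings.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class UniqueRandomStrings
{
    // Number of distinct strings of length minLength..maxLength over the
    // alphabet, capped at `cap` so the arithmetic cannot overflow.
    static long countPossible(int alphabetSize, int minLength, int maxLength, long cap)
    {
        long total = 0;
        for(int len=minLength; len<=maxLength && total<cap; len++)
        {
            long perLength = 1;
            for(int k=0; k<len && perLength<cap; k++)
            {
                perLength *= alphabetSize;
            }
            total += perLength;
        }
        return Math.min(total, cap);
    }

    // Hypothetical helper: returns exactly `count` distinct random strings.
    // Assumes the characters of `alphabet` are themselves distinct.
    static Set<String> generate(String alphabet, int minLength, int maxLength, int count, Random random)
    {
        if(alphabet.isEmpty() || minLength<0 || maxLength<minLength)
        {
            throw new IllegalArgumentException("Need a non-empty alphabet and 0 <= minLength <= maxLength");
        }
        if(countPossible(alphabet.length(), minLength, maxLength, count) < count)
        {
            throw new IllegalArgumentException("Not enough distinct strings for these settings");
        }

        Set<String> result = new HashSet<String>();
        // Loop on the size of the Set, not on the number of attempts
        while(result.size() < count)
        {
            int length = minLength + random.nextInt(maxLength-minLength+1);
            StringBuilder sb = new StringBuilder(length);
            for(int j=0;j<length;j++)
            {
                sb.append(alphabet.charAt(random.nextInt(alphabet.length())));
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String args[])
    {
        Set<String> strings = generate("abcdef", 3, 8, 1000, new Random());
        System.out.println(strings.size()); // prints 1000
    }
}
```

When the requested count gets close to the number of strings that exist at all, most attempts will be duplicates and the loop slows down a lot. For test data that is rarely a concern, but in that case enumerating all strings and shuffling them would be the better choice.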
