Thursday, December 07, 2006

Java Scanner: read a whole page at once

One of the new features of Java 5 is java.util.Scanner. It is great for parsing text files. Here is an example of how to parse the first column of a CSV file:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class TextScanner
{
    public static void main(String[] args)
    {
        try
        {
            File file = new File("somefile.csv");
            Scanner scanner = new Scanner(file);
            // Everything matching this pattern is treated as a separator:
            // optional whitespace, then ",<digit>,<digit>" and one
            // non-word character (for example the line break).
            scanner.useDelimiter("\\s*,\\d,\\d\\W");
            String temp = null;
            while (scanner.hasNext())
            {
                temp = scanner.next();
                System.out.println(temp);
            }
            scanner.close();
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
    }
}

I can use a regular expression as a filter!
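
To see what that delimiter does without needing a file on disk, here is a minimal sketch that runs the same pattern over an in-memory string. The class name DelimiterDemo, the helper firstColumn and the sample rows are made up for illustration. The sample assumes every row ends with two single-digit columns and a line break, which is what the pattern expects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo
{
    // Hypothetical helper: collects every token the scanner yields
    // when the delimiter from the example above is applied to csv.
    public static List<String> firstColumn(String csv)
    {
        List<String> tokens = new ArrayList<String>();
        Scanner scanner = new Scanner(csv);
        scanner.useDelimiter("\\s*,\\d,\\d\\W");
        while (scanner.hasNext())
        {
            tokens.add(scanner.next());
        }
        scanner.close();
        return tokens;
    }

    public static void main(String[] args)
    {
        // Made-up sample data: a name, then two single-digit columns.
        String csv = "apple,1,2\nbanana,3,4\ncherry,5,6\n";
        System.out.println(firstColumn(csv)); // prints [apple, banana, cherry]
    }
}
```

Note that a last row without a trailing line break would come back whole, because \W needs an actual character to match.
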
But there is also a non-trivial use of this class: reading a whole page at once into a single String object:

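import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;

public class WholePage
{
    public static void main(String[] args)
    {
        try
        {
            URLConnection connection =
                new URL("http://sun.com").openConnection();
            Scanner scanner = new Scanner(connection.getInputStream());
            // "\\Z" matches at the end of the input (before a final line
            // terminator, if any), so the first token is the whole page.
            String text = scanner.useDelimiter("\\Z").next();
            // Closing the scanner also closes the underlying stream.
            scanner.close();
            System.out.println(text);
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}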
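
The same trick works for any InputStream, not just a URL. Below is a rough sketch that wraps it in a reusable helper; the class name WholeStream, the method readAll and the sample markup are hypothetical. It makes two choices of its own: it passes an explicit charset name to the Scanner instead of relying on the platform default, and it returns an empty string for empty input, since next() would otherwise throw NoSuchElementException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.util.Scanner;

public class WholeStream
{
    // Hypothetical helper: reads everything from in as one string,
    // decoding the bytes with the given charset name.
    public static String readAll(InputStream in, String charsetName)
    {
        Scanner scanner = new Scanner(in, charsetName);
        try
        {
            scanner.useDelimiter("\\Z");
            // Empty input has no token at all, so check before next().
            return scanner.hasNext() ? scanner.next() : "";
        }
        finally
        {
            // Closing the scanner also closes the underlying stream.
            scanner.close();
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException
    {
        // Made-up page content standing in for a real download.
        byte[] page = "<html>\n<body>hello</body>\n</html>\n".getBytes("UTF-8");
        String text = readAll(new ByteArrayInputStream(page), "UTF-8");
        System.out.println(text);
    }
}
```

One thing to keep in mind: because \Z matches before a final line terminator, the returned string does not include that last line break. If it matters, a delimiter of "\\A" (beginning of input) should keep it, since it can never match again after the start.
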

1 comment:

Anonymous said...

Man. That was awesome. I have gone through so many sites on how to do this and you are a life saver.