Monday, December 25, 2006

Is $300 per hour real?

I have heard that some SAP consultants earn $300 per hour! To me that is extremely cool :) Such hourly rates are not limited to the ERP field. Some people say that programmers can also earn $200 per hour. I wonder what the upper limit is for IT people...

$500?

100,000 lines of code every two weeks

eBay says: "We roll 100,000+ lines of code every two weeks". Hmm... at first glance that is a huge amount. But let's see: two weeks is 10 working days, so for, say, 100 programmers that is... 100 lines of code per programmer per day.

What can I say about that? They work not very fast, but carefully :)
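
The estimate above can be checked with a few lines of Python (written for Python 3). This is only a back-of-the-envelope sketch: the team size of 100 programmers is the guess made in this post, not a figure published by eBay.

```python
# Back-of-the-envelope check of the estimate above.
lines_per_release = 100000   # "100,000+ lines" from the eBay quote
working_days = 10            # two working weeks
programmers = 100            # assumed team size (a guess, not an eBay figure)

lines_per_programmer_per_day = lines_per_release / working_days / programmers
print(lines_per_programmer_per_day)
```

With a different team size the result scales accordingly: twice as many programmers means half as many lines per person per day.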

Friday, December 22, 2006

eBay Internals

http://www.addsimplicity.com/downloads/eBaySDForum2006-11-29.pdf - very interesting.
It is about some of eBay's internals.
212,000,000 registered users, 2 petabytes of data, 26 billion SQL executions per day. But the most interesting part for me is the 3.3-million-line C++ ISAPI DLL: 150 MB. Oh man...

Python : Generate Random Strings

I like Python. More and more :) Here is an example of random string generation:

import random
alphabet = 'abcdefghijklmnopqrstuvwxyz'
min = 5
max = 15
total = 1000000
string = ''
FILE = open("filename.out", "w")
# xrange(total) gives exactly `total` strings (Python 2)
for count in xrange(total):
    # pick between min and max distinct letters in random order
    for x in random.sample(alphabet, random.randint(min, max)):
        string += x
    FILE.write(string + '\n')
    string = ''
FILE.close()

It is much shorter than the equivalent Java application. It seems that by doing less we can get more :). The key is the random.sample(A, N) function: it returns a random subset of A with N elements, so no letter repeats within one string. Since I need random strings whose length varies from 5 to 15, I pass random.randint(min, max) as N.
There may be a more elegant solution, but this one works for me :)
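
One possible variant, as a sketch only: if repeated letters are acceptable, random.choice can draw each character independently, and str.join builds the string in one step. The code is written for Python 3, and the names random_string and write_random_strings are made up for this illustration.

```python
import random
import string

def random_string(min_len=5, max_len=15, alphabet=string.ascii_lowercase):
    """Return a random string of min_len..max_len characters (inclusive)."""
    length = random.randint(min_len, max_len)
    # random.choice draws with replacement, so letters may repeat
    return "".join(random.choice(alphabet) for _ in range(length))

def write_random_strings(path, total):
    """Write `total` random strings to `path`, one per line."""
    with open(path, "w") as out:
        for _ in range(total):
            out.write(random_string() + "\n")
```

Calling write_random_strings("filename.out", 1000000) would produce a file like the one above; the with block closes the file even if an error occurs.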

Thursday, December 07, 2006

MySQL InnoDB storage engine

MySQL supports a lot of storage engines as well as a pluggable storage engine architecture. What about referential integrity between tables with different storage engines? Foreign key support is available for InnoDB and Falcon. I don't know anything about Falcon, but InnoDB... It's interesting who the owner is. Answer: Oracle Corporation :). By the way, if we compare storage engines by these criteria

  1. Storage Limits
  2. Tablespace Support
  3. Configurable Page Sizes
  4. Automatic Storage Extension
  5. ACID Transaction Support
  6. Distributed Transaction Support
  7. Locking Granularity
  8. Savepoint Support
  9. Crash Recovery
  10. Foreign Key Support
  11. B-Tree Indexes
  12. Hash Indexes
  13. Clustered Indexes
  14. Full Text Indexes
  15. Data Caches
  16. Index Caches
  17. Query Cache Support
  18. Online Parameter Support
  19. Geospatial Support
  20. Replication Support
  21. Backup/Point-in-Time Recovery
  22. Memory Footprint
  23. Bulk Insert Speed
we'll see that InnoDB has the most advanced feature set among its competitors. It is the most widely used. But the insert speed for InnoDB tables is very slow. Still, nothing forbids mixing storage engines in one database, using different engines for different tables, to gain an overall performance increase.

Java Scanner : read whole page at once

One of the new features of Java 5 is java.util.Scanner. It is great for parsing text files. Here is an example of how to parse the first column of a CSV file:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class TextScanner
{
    public static void main(String[] args)
    {
        try
        {
            File file = new File("somefile.csv");
            Scanner scanner = new Scanner(file);
            scanner.useDelimiter("\\s*,\\d,\\d\\W");
            String temp = null;
            while (scanner.hasNext())
            {
                temp = scanner.next();
                System.out.println(temp);
            }
            scanner.close();
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
    }
}

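I can use a regular expression as a filter!
But there is also a non-trivial use of this class: reading a whole page at once into a single String object. The trick is the \\Z delimiter, which matches only at the end of the input, so the first next() call returns the whole stream: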
import java.net.URL;
import java.net.URLConnection;
import java.io.IOException;
import java.util.Scanner;

public class WholePage
{
    public static void main(String[] args)
    {
        try
        {
            URLConnection connection =
                new URL("http://sun.com").openConnection();
            String text =
                new Scanner(
                    connection.getInputStream()
                )
                .useDelimiter("\\Z").next();
            System.out.println(text);
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

Tuesday, December 05, 2006

MySQL : InnoDB vs MyISAM vs Archive Benchmark

MySQL has a few storage engines. To list them type :

mysql> show engines;

First of all, I'm interested in INSERT speed, which is critical for some types of applications.
So, here is my benchmark.
The engines all have different INSERT speeds.
To test the engines themselves, and not the speed of my network connection or loopback interface, I decided to create a dummy stored procedure that inserts a lot of data into a table.
Here is the code:

delimiter //
create procedure test_insert()
begin
    declare counter mediumint;
    set counter = 0;
    while counter < 100000
    do
        INSERT INTO test VALUES (1, 'sample dummy text', now());
        set counter = counter + 1;
    end while;
end //
DELIMITER ;

The DDL of the sample table is:
CREATE TABLE test
(
    first int(11) default NULL,
    second varchar(20) default NULL,
    third date default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

We can change the storage engine of an existing table like this:

mysql> alter table test engine=archive;
Query OK, 100000 rows affected (0.98 sec)
Records: 100000 Duplicates: 0 Warnings: 0


Really fast! That is all we need to run our benchmark.
The results (for 100,000 rows):

  1. Archive : 5.20 sec
  2. MyISAM : 7.13 sec
  3. InnoDB : very slow !!!
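
To make the two measured timings easier to compare, they can be turned into rows per second. This is just arithmetic on the numbers listed above, sketched in Python 3; InnoDB is left out because no timing was recorded for it.

```python
# Convert the measured timings above into insert throughput.
rows = 100000
timings_sec = {"Archive": 5.20, "MyISAM": 7.13}   # seconds, from the results above

rows_per_sec = {}
for engine, seconds in timings_sec.items():
    rows_per_sec[engine] = rows / seconds

for engine, rate in rows_per_sec.items():
    print("%s: about %d rows per second" % (engine, round(rate)))
```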

Wednesday, November 29, 2006

MySQL at Google

Look at that! Google is among MySQL AB's customers! Some people say that the Google AdWords application is built on a MySQL backend. I'm afraid to imagine HOW it is built. But it is simply amazing! No Oracle, DB2 or other very cool stuff. Open Source rocks!

Friday, November 24, 2006

Online syntax highlighting

It is just a nice thing: http://tohtml.com

Python and MySQL

I know how to work with JDBC in Java and DBI in Perl.
But how about Python? :)
Let's see... I have a Windows box with Python 2.4 and MySQL 5.
OK. What else do I need?
Some kind of driver, if Python-style interaction with RDBMSs
is similar to Java or Perl, I guess :) It is here.
So, we have the "Python Database API Specification v2.0".
Now it is not a problem to write some dummy code
to play with the API:


import MySQLdb
import sys

try:
    myDB = MySQLdb.connect(
        host="127.0.0.1",
        port=3306, user="root",
        passwd="**********",
        db="testdb")
    cHandler = myDB.cursor()
    cHandler.execute("SELECT * FROM testTable")

    # print every column of every row, one value per line
    for a in range(cHandler.rowcount):
        result = cHandler.fetchone()
        for x in range(len(result)):
            print result[x]
        print "\n"

    myDB.close()

except MySQLdb.Error, e:
    print "Error %d: %s" % (e.args[0], e.args[1])
    sys.exit(-1)
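
Since the point of the DB-API is that every driver looks roughly the same, the same pattern can be tried without a MySQL server at all. Below is a minimal sketch that uses the sqlite3 module from the standard library (also a DB-API 2.0 driver) and Python 3 syntax, unlike the Python 2.4 code above. The table contents and the helper name fetch_all_rows are made up for illustration. It loops over fetchall() instead of relying on rowcount, because the specification allows rowcount to be -1 when the driver cannot determine it.

```python
import sqlite3

def fetch_all_rows(connection, table_name):
    """Return every row of table_name as a list of tuples."""
    cursor = connection.cursor()
    # Table names cannot be bound as query parameters,
    # so table_name must come from trusted code, never from user input.
    cursor.execute("SELECT * FROM " + table_name)
    rows = cursor.fetchall()
    cursor.close()
    return rows

# An in-memory database stands in for the MySQL server (assumption for the demo).
connection = sqlite3.connect(":memory:")
setup = connection.cursor()
setup.execute("CREATE TABLE testTable (id INTEGER, name TEXT)")
setup.executemany("INSERT INTO testTable VALUES (?, ?)",
                  [(1, "first"), (2, "second")])
connection.commit()

rows = fetch_all_rows(connection, "testTable")
for row in rows:
    for value in row:
        print(value)
    print()

connection.close()
```

Swapping sqlite3.connect(...) for MySQLdb.connect(...) should leave the cursor calls unchanged, although the parameter placeholder style differs between drivers (sqlite3 uses ?, MySQLdb uses %s).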

Wednesday, November 22, 2006

Google PageRank

Look at this site: http://www.getrank.net. It can show you the PageRank of your site. I was curious how that can be done... Now I know how, but I can't publish the source code here, since mining PageRanks is against Google's terms of service. Still, I had a lot of fun. Sometimes it's interesting to dig :)


It was interesting to compare some PageRanks (x out of 10):
http://www.microsoft.com : 10 !
http://www.linux.org : 9
http://www.freebsd.org : 9
http://www.redhat.com : 9
http://www.gentoo.org : 8
http://www.novell.com/linux : 8

Wednesday, November 15, 2006

Refreshing page in browser: HTTP headers

While working on a task I became interested in how the browser's cache works, especially in which HTTP headers the browser sends or omits when the user refreshes the page.

So, here is my experiment. I have 3 browsers: Firefox 2.0, IE 6.0, and Opera 9.0. The Opera version may be old, because I don't use it. I don't know a better browser than Firefox, sorry Opera fans.

How can we see the HTTP headers sent by the browser? There is more than one solution. We can use a sniffer such as Ethereal to watch the HTTP traffic, a browser add-on or extension (for example Live HTTP Headers for Firefox or HttpWatch for IE), or even a local proxy server. But it is also possible to track such data on the server side. Let's take Tomcat 5.0 and write a little code. It will be a filter that listens to all incoming requests to our web application and prints their HTTP headers. So, it is some kind of spider on the server side :)

The code listing of the filter:

package com.mycoolcompany.filters;

import java.io.IOException;
import java.util.Enumeration;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;

public class RequestDumpFilter implements Filter
{
    public RequestDumpFilter() {}

    public void destroy() {}

    public void doFilter(
            ServletRequest request,
            ServletResponse response,
            FilterChain chain) throws IOException, ServletException {

        HttpServletRequest req = (HttpServletRequest) request;

        // print the request line; the protocol version is hard-coded here
        System.out.println(req.getMethod()
                + " "
                + req.getRequestURI() + " HTTP/1.1");
        // print every header as "name : value"
        Enumeration enumer = req.getHeaderNames();
        String headerName = null;
        while (enumer.hasMoreElements())
        {
            headerName = (String) enumer.nextElement();
            System.out.println(headerName
                    + " : "
                    + req.getHeader(headerName));
        }
        // pass the request on to the rest of the chain
        chain.doFilter(request, response);
    }

    public void init(FilterConfig filterConfig)
            throws ServletException {}
}

The url-pattern '/*' of the filter mapping in web.xml is the key. It means that we'll listen to all requests for our web app.
Compile, deploy and enjoy :)
So, let's begin to dig.

Point your IE browser to http://localhost:8080/RequestDumper/
You'll get something like this at your server console :

GET /RequestDumper/ HTTP/1.1
accept : image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/x-shockwave-flash,
application/vnd.ms-excel, application/vnd.ms-powerpoint,
application/msword, */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0;
Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive

When you begin to refresh the page frequently, you may notice that something has changed:

GET /RequestDumper/ HTTP/1.1
accept : */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0;
Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive

The Accept header became a little bit shorter ;)

If you press Ctrl-F5 the browser will send the following :

GET /RequestDumper/ HTTP/1.1
accept : */*
accept-language : ru
accept-encoding : gzip, deflate
user-agent : Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
host : localhost:8080
connection : Keep-Alive
cache-control : no-cache

'cache-control : no-cache' - that's it !

Let's take Firefox. The first request looks like this:

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1)
Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,text/html;
q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive

Refreshing the page frequently changes the headers:

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1)
Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,text/html;
q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive
cache-control : max-age=0

'cache-control : max-age=0' - it is the difference.

Pressing Ctrl-F5 gives :

GET /RequestDumper/ HTTP/1.1
host : localhost:8080
user-agent : Mozilla/5.0 (Windows; U; Windows NT 5.2; ru; rv:1.8.1)
Gecko/20061010 Firefox/2.0
accept : text/xml,application/xml,application/xhtml+xml,
text/html;q=0.9,text/plain;
q=0.8,image/png,*/*;q=0.5
accept-language : ru-ru,ru;q=0.8,en-us;q=0.5,en;q=0.3
accept-encoding : gzip,deflate
accept-charset : windows-1251,utf-8;q=0.7,*;q=0.7
keep-alive : 300
connection : keep-alive
pragma : no-cache
cache-control : no-cache


We got some changes : 'pragma : no-cache' and 'cache-control : no-cache'.
And finally Opera.
First request:

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers

Refreshing the page frequently changes the headers:

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9,
application/xhtml+xml, image/png, image/jpeg,
image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
cache-control : no-cache
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers

'cache-control : no-cache' - new.

Pressing Ctrl-F5 :

GET /RequestDumper/ HTTP/1.1
user-agent : Opera/9.00 (Windows NT 5.2; U; en)
host : localhost:8080
accept : text/html, application/xml;q=0.9,
application/xhtml+xml, image/png, image/jpeg,
image/gif, image/x-xbitmap, */*;q=0.1
accept-language : en
accept-encoding : deflate, gzip, x-gzip, identity, *;q=0
cache-control : no-cache
connection : Keep-Alive, TE
te : deflate, gzip, chunked, identity, trailers

For now I need a little meditation on the HTTP RFC ;)
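
Spotting the differences between two dumps by eye gets tedious, so here is a small sketch of how the comparison could be automated. It is written in Python 3; the function name header_diff is made up, and the dictionaries hold only a subset of the Firefox headers from the dumps above.

```python
def header_diff(before, after):
    """Compare two header dicts.

    Returns (added_or_changed, removed): a dict of headers that are new or
    have a different value in `after`, and a list of names missing from it.
    """
    added_or_changed = {
        name: value
        for name, value in after.items()
        if before.get(name) != value
    }
    removed = [name for name in before if name not in after]
    return added_or_changed, removed

# Subset of the Firefox 2.0 headers captured above.
first_request = {
    "host": "localhost:8080",
    "keep-alive": "300",
    "connection": "keep-alive",
}
refresh = {**first_request, "cache-control": "max-age=0"}
hard_refresh = {**first_request, "pragma": "no-cache", "cache-control": "no-cache"}

print(header_diff(first_request, refresh))
print(header_diff(first_request, hard_refresh))
```

Feeding it the IE dumps instead would report the shortened accept header as a changed value rather than as a new header.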

Tuesday, November 14, 2006

CORBA vs RMI

Interesting thing... There are a lot of technologies for distributed computing: CORBA, RMI, EJB, etc. What is the best choice for a particular case? What are the criteria for choosing the appropriate technology? To figure out the criteria we need to answer some of the questions listed here.

Do we need to integrate applications written in different languages across the enterprise?
Will we switch to another technology in the future? Would we feel more comfortable if we could send Java objects as parameters to remote methods? Is it politically correct to use a proprietary protocol? How much does a CORBA/EJB/RMI developer cost, and how steep is the learning curve? How well are the technologies supported by different vendors?

And last but not least: performance. Some people say that invoking a remote CORBA method is 3 times faster than the same call over RMI. I think it is due to serialization... It would be fun to run some benchmarks in debug mode with the YourKit profiler, for example :)

Thursday, November 02, 2006

Using EJB 3.0 and Struts together

I have used Struts with an ORM framework, but I'm not sure how to properly
use it with EJBs. I have some stateless session beans which
encapsulate business logic, plus some EJB 3.0 entities ("EJB 3.0 entities"
is a more correct term than "Entity Beans", according to Mike Keith).
The book "Struts Kick Start" by James Turner and Kevin Bedell sheds some
light on this topic.
"Chapter 18. Using Struts with Enterprise JavaBeans" is a nice
introduction, and it doesn't matter that it is not EJB 3 related.

Wednesday, November 01, 2006

Install JBoss with EJB 3.0 support using the command line

I'm using this command to install JBoss AS with EJB 3.0 support. It is rather more interesting than going through the GUI install steps :)

java -jar jboss-4.0.4.GA-Patch1-installer.jar -installGroup ejb3-clustered installpath=C:\server\jboss-4.0.4.GA

Change installpath to your taste. -installGroup can also be ejb3.

Friday, October 13, 2006

Java : fun with String

class Str
{
    public static String String = "test";
    public static void main(String args[])
    {
        System.out.println(String);
        System.out.println(Str.String);
    }
}
No comments :)

Wednesday, October 11, 2006

Java : fun with main()

Yesterday I was playing with main()... We have one class, A (A.java):

package com.mycoolcompany.mainAB;
class A
{
    public static void main(String args[])
    {
        System.out.println("A");
    }
}

Second class B.java :

package com.mycoolcompany.mainAB;
class B
{
    public static void main(String args[])
    {
        (new A()).main(args);
    }
}

It seems that Java is comfortable with executing main() from another class: executing B says A! The next thing is more interesting...
C.java :
package com.mycoolcompany.mainAB;
class C extends A
{
    public static void main(String args[])
    {
        (new C()).main(args);
    }
}

If we run this class, we'll get a stack overflow, because C.main() hides A.main() and keeps calling itself. But it compiles without problems! And at last... Can we run a class without main()? Maybe yes :)
package com.mycoolcompany.mainAB;

public class D extends A{}

Executing "java com.mycoolcompany.mainAB.D" says A !

Friday, September 22, 2006

The blog is started

First of all I would like to say thanks to Google for this kind of service. About the blog... I'll be posting interesting things which I see at work, on the Internet, etc. About me: I'm a 23-year-old J2EE programmer.