Selenium Page Objects and Abstraction

Jan 23, 2012 by

Last Friday, I explained the concept of Selenium Page Objects to a colleague, and I felt that I didn’t explain it simply enough. I spent some time today coming up with an explanation that relates the page object pattern to the concept of abstraction in programming

Object Abstraction

Let’s say we have a class that processes news messages

class NewsListener {
    public void onNewsMessage(String rawMsg) {
        // Example of a raw message: "ID1234:Alert-News content"
        String newsId = rawMsg.substring(0, rawMsg.indexOf(":"));                        // "ID1234"
        String type = rawMsg.substring(rawMsg.indexOf(":") + 1, rawMsg.indexOf("-"));  // "Alert"

        if ("Alert".equals(type)) {
            notifyAlertListener(newsId, rawMsg);
        } else {
            renderNewsContent(newsId, rawMsg);
        }
    }
}

The NewsListener class seems to know the raw news message format really well. But it is unlikely to be the only class that needs the news ID and message type from the raw message; AlertListener and NewsContentPanel may need those values too. Those two lines of parsing look easy enough to copy into the other classes, but any decent developer knows that doing so will cause problems:

  • Tight coupling and code duplication: This one is obvious. The parsing code is copied to every place that needs the message ID and message type. If the message format changes, every one of those classes has to be modified
  • Wrong level of abstraction: The main business logic of the class is to dispatch a message to the right listener according to its type. It should not care how the type is derived from the message

We can improve the code by introducing an abstraction, NewsMessage, that encapsulates the internal message format in one place and exposes only the information that other classes need

public void onNewsMessage(NewsMessage msg) {
    if (msg.isAlert()) {
        notifyAlertListener(msg);
    } else {
        renderNewsContent(msg);
    }
}
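
The post does not show NewsMessage itself. As a rough sketch only, assuming the "ID:Type-content" format from the example above, it could look like this; the getter names and the validation are my own assumptions, not part of the original code.

```java
/** Hypothetical sketch: hides the raw "ID1234:Alert-News content" format. */
final class NewsMessage {
    private final String id;
    private final String type;
    private final String content;

    NewsMessage(String rawMsg) {
        int colon = rawMsg.indexOf(':');
        int dash = rawMsg.indexOf('-', colon + 1);
        if (colon < 0 || dash < 0) {
            throw new IllegalArgumentException("Unexpected message format: " + rawMsg);
        }
        // The only place in the code base that knows where each field lives
        this.id = rawMsg.substring(0, colon);
        this.type = rawMsg.substring(colon + 1, dash);
        this.content = rawMsg.substring(dash + 1);
    }

    String getId() { return id; }

    String getContent() { return content; }

    boolean isAlert() { return "Alert".equals(type); }
}
```

If the message format changes, only the constructor has to change; NewsListener, AlertListener and NewsContentPanel keep calling isAlert() and the getters.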


Selenium Page Objects

Let’s look at a test case that verifies the target page does not show any project information when no project is stored in the database

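import static org.junit.Assert.assertEquals;

import org.junit.BeforeClass;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class ListProjectPageTest {
    private static WebDriver driver;

    @BeforeClass
    public static void setUpBeforeClass() throws Exception {
        driver = new FirefoxDriver();
    }

    @Test
    public void listProjectsWhenNoProjectInDB() {
        // URLConfig and findElementIgnoreException are project helpers that are not shown here
        driver.get(URLConfig.get("list-project-page"));
        // The test reaches directly into the HTML structure to count the table rows
        WebElement table = findElementIgnoreException(By.cssSelector("#container > table"));
        int numberOfProjShowing = table.findElements(By.xpath("tbody/tr")).size();
        assertEquals(0, numberOfProjShowing);
    }
}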

Just like the NewsListener example, the test only wants to assert the number of projects shown on the target page, but it uses the Selenium driver to pull that information directly from the HTML DOM. The code for counting rows must be copied into every test case that needs the same information, and any change to the HTML structure risks breaking all of those test cases

Page Objects is a pattern for capturing web UI components as objects. We can create a ListProjectsPage class to encapsulate all the logic for getting project information in one place and let our test cases work with the web UI at a higher level of abstraction

@Test
public void listProjectsWhenNoProjectInDB() {
    ListProjectsPage page = ListProjectsPage.navigateToPage(driver);
    assertEquals(0, page.getNumberOfProjectsShowing());
}

Below is an example of how the page object could be implemented

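public class ListProjectsPage {
    private final WebDriver driver;
    private final WebElement tableElem;

    private ListProjectsPage(WebDriver driver) {
        // A final field must be assigned in the constructor
        this.driver = driver;
        // findElementIgnoreException is a project helper that is not shown here
        tableElem = findElementIgnoreException(By.cssSelector("#container > table"));
    }

    public static ListProjectsPage navigateToPage(WebDriver driver) {
        // Same URL lookup as in the original test case
        driver.get(URLConfig.get("list-project-page"));
        return new ListProjectsPage(driver);
    }

    // The only place that knows how project rows are laid out in the HTML
    public int getNumberOfProjectsShowing() {
        return tableElem.findElements(By.xpath("tbody/tr")).size();
    }
}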

The pattern makes test cases less brittle and easier to read, and brings them closer to the promise of “executable requirements”


Simple explanation for inner join and left join

Dec 25, 2011 by

Summary: The inner join is for filtering out unwanted records and left join is for pulling in extra information

Last week, I heard a colleague explain how inner join and left join are used. SQL join types are typically explained using Venn diagrams, but my colleague’s explanation comes from the point of view of how we actually use them.

Let’s say we are writing a function to get a list of employees from the database shown below

The business logic for this function is that we are only interested in employees who have been assigned to a project. The SQL is shown below

SELECT employees.name, projects.name as project
FROM employees INNER JOIN projects ON employees.project_id = projects.project_id

You can see in the result set that we use the inner join to filter out unwanted records: employee records that don’t have a corresponding project, and projects that haven’t been associated with any employee

Now we also want to know whether each employee in the result set has access rights to the laboratory room. We use a left join to pull in this extra information. Our SQL will look like:

SELECT employees.name, projects.name AS project, lab_accesses.role AS lab_role
FROM employees INNER JOIN projects ON employees.project_id = projects.project_id
LEFT JOIN lab_accesses ON employees.lab_code = lab_accesses.lab_code

Our main purpose is to get the list of employees with their corresponding projects. The lab_role value is just extra information. It is fine if an employee doesn’t have laboratory access rights; we still want to include that employee in the result set, with a NULL lab_role. That’s why we use a left join
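
The same "filter, then enrich" reading can be mimicked in plain Java over in-memory collections. This is only an illustration of the mental model, not how a database executes joins; the class, the field names and the sample data used with it are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class JoinSketch {
    private JoinSketch() {}

    /** Hypothetical employee row: project_id and lab_code may be null. */
    static final class Employee {
        final String name;
        final Integer projectId;
        final String labCode;

        Employee(String name, Integer projectId, String labCode) {
            this.name = name;
            this.projectId = projectId;
            this.labCode = labCode;
        }
    }

    static List<String> employeesWithProjects(List<Employee> employees,
                                              Map<Integer, String> projects,
                                              Map<String, String> labRoles) {
        List<String> rows = new ArrayList<String>();
        for (Employee e : employees) {
            String project = (e.projectId == null) ? null : projects.get(e.projectId);
            if (project == null) {
                continue; // INNER JOIN: no matching project, so the row is filtered out
            }
            // LEFT JOIN: a missing match does not drop the row, it only leaves the column empty
            String labRole = (e.labCode == null) ? null : labRoles.get(e.labCode);
            rows.add(e.name + " | " + project + " | " + (labRole == null ? "NULL" : labRole));
        }
        return rows;
    }
}
```

An employee without a project never reaches the output, while an employee without lab access still appears with NULL in the last column.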

It is somewhat oversimplified, but I find this explanation quite easy to remember


Java Deflater and OutOfMemoryError

Aug 2, 2011 by

Summary: Relying on the finalizer to clean up the native memory of Deflater objects may cause OutOfMemoryError

I guess this is old news for many developers, but it was new to me. I was asked to join the investigation of an OutOfMemoryError that occurred in a project outside my division. Actually, it was a brief involvement. My colleague who was handling the issue already had a possible root cause. He believed it had something to do with the compression code that had just been added in the newest release. The compression logic used java.util.zip.Deflater, and the end() method of the Deflater objects wasn’t explicitly called once compression was done. The cleanup relied on the Java finalizer thread to call the finalize() method of the Deflater objects

My colleague suspected that a Deflater object consumed little space in the Java heap but allocated a much bigger chunk of native memory. He believed it was possible that a large number of Deflater objects had been created, which caused the OOM on the native side. Most of the Deflater objects were unreachable, but the GC didn’t kick in because overall heap consumption was still low. As a result, none of those unreachable objects had been handed to the finalizer yet, so all of their native memory allocations were still retained

I was skeptical of his assumption at first. I thought that if the JVM was at risk of OOM, why wouldn’t the GC try to reclaim the Java heap so that the objects holding native memory could be cleaned up? I decided to write a simple program to test the scenario


public static void main(String[] args) throws IOException {
    int testSize = 1;
    if (args.length > 0) { testSize = Integer.parseInt(args[0]); }

    // readContent is a helper (not shown) that reads the whole stream into a String
    String content = readContent(Main.class.getResourceAsStream("data.txt"));
    byte[] input = content.getBytes("UTF-8");

    System.out.println("Test size = " + testSize);
    for (int i = 0; i < testSize; i++) {
        byte[] output = new byte[500];
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        deflater.deflate(output);
        // Deliberately commented out: cleanup is left to the finalizer
        //deflater.end();
    }

    System.out.println("Done");

    // Let it run for a while so I can monitor the java process.
    try { Thread.sleep(60 * 60 * 1000); } catch (InterruptedException ex) {}
}

The program repeatedly compresses a short string (around 300 bytes). Notice that the deflater.end() line has been commented out, which defers the cleanup until the finalizer thread gets to the objects

I got an OOM when I ran the program to create 6300 instances of Deflater

java -cp build\classes -verbose:gc -Xmx300m -Xms300m deflat.Main 6300

Exception in thread "main" java.lang.OutOfMemoryError
    at java.util.zip.Deflater.init(Native Method)
    at java.util.zip.Deflater.<init>(Deflater.java:123)
    at java.util.zip.Deflater.<init>(Deflater.java:140)
    at deflat.Main.main(Main.java:23)

I ran the program with -verbose:gc to check whether the GC would try to collect objects in the heap before the OOM occurred. No GC activity was logged to the console at all. My colleague’s assumption was right: the GC is not aware that native memory usage is about to exceed the limit

I ran the program again with a test size of 6200 so that it finished successfully, then checked the current heap usage

"C:\Program Files\Java\jdk1.6.0_13\bin\jstat.exe" -gc 2936

S0C      S1C      S0U   S1U   EC        EU       OC         OU
2304.0   2304.0   0.0   0.0   19008.0   4186.7   283584.0   0.0

Only the Eden space of the heap was occupied, and its current utilization was just 4186.7 KB (the EU column). Heap usage wasn’t large enough to trigger any GC activity; it wasn’t even large enough for a minor collection in the young generation. It was safe to say that the OOM seen with a test size of 6300 wasn’t the result of an exhausted Java heap

While the Java process with ID 2936 was still running, I could see in Windows Task Manager that the virtual memory of the process was almost 1.9 GB. I guess the OOM with a test size of 6300 comes from the memory limit of a 32-bit process, which is practically around 2 GB (the other 2 GB is reserved for Windows).

Clean it up as soon as possible

I found a very informative link describing this GC behavior. One of the Sun engineers clarified that:

We’re throwing an OOM not because the java heap is (anywhere close to) full, but because the non-java (i.e., C/C++) heap is full: you can get an OOM when either heap is full. The reason for the OOM is because unreachable, but not yet finalized, java objects are holding pointers to C/C++ heap memory, and that memory is deallocated only when the finalizers run. The JLS doesn’t specify when finalizers get run, so if the VM delays executing them long enough, the C/C++ heap memory doesn’t get freed soon enough to prevent an OOM

An easy fix for the problem is to clean up the Deflater objects as soon as possible by calling deflater.end() once the compression is done. The modified version of the program can run with a test size of 10000 or more. The interesting thing is that with a test size of 10000, heap utilization is around 6.4 MB, which is still not enough for a minor GC, so all 10000 Deflater objects are still in the heap, yet the virtual memory of the Java process is only around 331 MB. It seems that the end() method cuts the link between a Deflater object and the space it has allocated in native memory
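
A minimal sketch of that fix, assuming the compression lives in a small helper of its own (the class name DeflaterCleanup and the buffer size are my own choices, not the project’s code): calling end() in a finally block releases the native memory even when compression throws.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

final class DeflaterCleanup {
    private DeflaterCleanup() {}

    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish();

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[512];
            // Drain the deflater until all compressed data has been produced
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            // Release the native memory now instead of waiting for the finalizer
            deflater.end();
        }
    }
}
```

With this shape, the number of live native allocations is bounded by the number of threads compressing at the same time, regardless of when the GC runs.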

Note: HeapDumpOnOutOfMemoryError

I tried using the -XX:+HeapDumpOnOutOfMemoryError JVM option to get a heap dump right at the point of the OOM, but it didn’t work, so I had to lower the test size a bit to keep the process alive and monitor it using jstat instead. I found an explanation saying that:

Allocation of a native thread is not from the heap or the permanent generation – hence failure to allocate one does not result in a heap dump. (There would be little point in dumping the heap as that is not what was exhausted.)

An OOM caused by native memory usage does not trigger this JVM option


The Answer Lies Elsewhere

Jul 3, 2011 by

Summary: A lesson my former group leader taught me: don’t put all of your investigative effort in one direction, and keep looking for other possible root causes

I have been thinking about this for quite some time. An event last week finally prompted me to write about it. It is about an approach to investigating problems that my former group leader taught me early in my working life. The approach goes like this

Sometimes, we just don’t know exactly how to handle a problem. There may be many unknown factors, or the behavior may be unpredictable. In that case we may not want to simply choose one approach and put all our effort in that direction. It’s possible that the approach will lead us to the solution, but it’s also possible that we waste all that effort only to find out in the end that we chose the wrong path. With many unknown factors or unpredictable behavior, it’s very likely that we will end up in the second case. For this kind of problem, it may be better to gather information from several possible directions. We can experiment a bit here and there to see whether any really convincing evidence shows up. Who knows, we may pick up the scent of the root cause in an area that wasn’t in our original focus at all

There is no special technique in this approach. It’s just a reminder to keep our eyes and minds open to other possible directions. I didn’t pay much attention to it when my leader explained it to me. You may wonder who in their right mind would jump onto a path and just walk along it without knowing where it ends. Well, I have done that a couple of times. There are factors that can trick us into sticking to one path and ignoring everything else. Let me tell you some of the stories that made me realize how useful the approach can be

My first memory leak investigation

If I recall correctly, this was my first experience handling a reported software issue. Our support team hadn’t been formed yet, so all issues had to be handled by developers. This story is less about unpredictable behavior and more about my lack of experience in gathering the necessary information. A production team raised an issue about a possible memory leak in my product. I managed to get the heap configuration of the problematic server, the number of users and the characteristics of their activity, and I immediately set out to reproduce the memory leak in my development environment. I have to admit I don’t know what came over me at the time. Maybe it was because the idea of a memory leak was so cool and exciting, or because this was the first time I had a chance to apply what I had studied to a real-world problem. I was convinced that there was a memory leak in my code base. I wrote a small program to simulate user activity, examined the GC logs and looked through the code base for a possible leak. I couldn’t reproduce the problem, but I kept trying. The effort recorded against this issue kept growing. At some point, the production team seemed to lose interest in the case. I finally asked them for the status on their side. They replied that the root cause had been found. Apparently, the problem was in their own module, a plug-in configured into my product to perform customized entitlement

The product had been in production for quite a long time before the memory leak was reported. If I had stepped back and asked myself what change might have triggered the leak, I might have turned my attention in a different direction. When I failed to reproduce the leak within a couple of days, it should have occurred to me that the root cause might be something outside my control. But I was so caught up in the idea of finding a memory leak in my code base that I ignored other possibilities

Strange performance drop – story 1

There was a strange performance problem when my team was about to release a new version of our product. We ran performance tests for the new release on the fastest machine available to our team. The new release turned out to perform worse than the previous one. The strange thing was that the difference wasn’t stable. For example, one test round reported a final average response time of 25 ms, but other rounds reported 30 – 35 ms (the response time of the previous version was 18 ms). I measured the time spent in each part of the modules and found no part of the system that looked like a bottleneck. It seemed that the whole system had just got slower.

Our platform migration from a 32-bit system to a 64-bit one turned out to be quite misleading for this problem. I saw many resources claiming that a system may get slower when it is ported to 64 bit. I tried tuning OS-dependent parameters and also investigated whether the larger heap size in 64-bit Java could cause a performance problem. Nothing helped me identify the root cause

One day, while I was searching for a technique to better monitor CPU usage, I stumbled across a blog post about CPU frequency scaling. I read it and learned that some processors can lower their frequency to save power and generate less heat. It just so happened that my performance testing machine was an HP DL585 with a dynamic power management feature, which is a kind of frequency scaling mechanism. I queried the current mode of my system and found that it was set to ondemand by default

$ more /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand

Then I changed it to performance and reran my test

$ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
(repeat the above step for every CPU core)

And everything went back to normal. Now the new release performed as well as the previous one. It was the CPU frequency mode all along. Once again, the root cause lay somewhere totally outside our focus

Strange performance drop – story 2

This story happened last Friday and inspired me to write this post. We planned to support Tomcat as another target platform. Our QA team ran a rough performance test of our product on Tomcat. Tomcat is quite mature and widely used, so we didn’t expect any problems at all. But like every other performance test in our project, something unexpected happened. We found that Tomcat on Solaris x86 made our product run much slower than Sun Java System Web Server (SJSWS) on the same machine. The strange thing was that our product on Tomcat on Linux was slightly faster than our product on SJSWS on the same Linux machine

Our product contains both Java code and a native C module. My colleague took some measurements and pointed out that the problem had something to do with the native side. I tried tuning the Tomcat server and played with various Tomcat connectors, but nothing helped

In our team meeting last Wednesday, we agreed that we needed to do some code profiling on the native side. We didn’t yet know how to run a C profiler in the Tomcat environment, but we thought it was the way to go. After the meeting, I kept playing with the testing environment.
I scanned through the SJSWS startup script looking for customized tuning specific to the Solaris platform. I discovered that the script contained a section that loads the libumem library. The comment above the section said that this was for performance reasons. I copied the section into the Tomcat startup script and started the server for another performance test. Want to guess the result? That’s right: I got figures as good as the ones on SJSWS. That saved us from a C profiling task that might have taken us some time to figure out

# Preload libumem to improve performance on Solaris 10
LIBUMEM_32=/usr/lib/libumem.so
if [ -f "${LIBUMEM_32}" ] ; then
    if [ `uname -r | sed s/\\\.//` -ge 510 ] ; then
        LD_PRELOAD_32="${LIBUMEM_32} ${LD_PRELOAD_32}"; export LD_PRELOAD_32
    fi
fi

LIBUMEM_64=/usr/lib/64/libumem.so
if [ -f "${LIBUMEM_64}" ] ; then
    if [ `uname -r | sed s/\\\.//` -ge 510 ] ; then
        LD_PRELOAD_64="${LIBUMEM_64} ${LD_PRELOAD_64}"; export LD_PRELOAD_64
    fi
fi

The approach is a balancing act. If you have enough information, or your experience suggests a certain path, then it’s reasonable to choose it. Just remind yourself that if the path doesn’t seem to lead to a solution, the answer might lie elsewhere


RenameTo and Poor Man’s File Based Cache

Jun 17, 2011 by

I was assigned to investigate a performance problem in one of the servers in my project, and I found an interesting concurrency issue that caused a noticeable drop in throughput in performance tests. The main business logic of the server is to generate chart images for a Servlet front-end module. Let me use the oversimplified example below to show the pattern of the problem.

class ChartServer{

    public String generateChart(ChartParameters params) throws IOException{

        String cacheFile = computeCacheFileName(params);
        if( isInterDayRequest(params) && isCacheFileExist(cacheFile) ){
            return new File(cacheDir, cacheFile).getAbsolutePath();
        }

        BufferedImage img = generateImageChart();
        File output = new File(cacheDir, cacheFile);
        synchronized( cacheLock ){
            ImageIO.write(img, "PNG", output );
        }

        return output.getAbsolutePath();
    }
}

All request parameters are used to compute the cache key. Each cached image is saved to a file that uses this key as its name. For a subsequent request with exactly the same set of parameters, the server can just return the existing cache file.

To prevent worker threads from writing to the same file concurrently, every write to the cache directory blocks on the cacheLock object. This creates a big bottleneck, since all threads have to block on the same object. The question is: to get rid of the global write lock, how can a worker thread lock only the file it is going to write?

I had some ideas for approaching the problem, but I ended up not choosing them.

- Apply a hash function to the cache file name to choose a worker thread, so that writes to the same file are done sequentially in one thread.

Problem: How can I find a good hash function that distributes tasks evenly across all worker threads?

- Create a lock object for each cache file and store it in a HashMap. Each thread uses the file name as the key to get the corresponding lock from the HashMap (in a thread-safe manner) and then blocks on that lock before writing. Now two threads writing to different files block on different locks. All threads still contend on the same HashMap object, but hopefully retrieving an object from the map is very fast.

Problem: Will every lock be stored in the HashMap for the whole life of the process? If the system employs 16 worker threads, at most 16 files are being written at a time, yet each thread must find the lock for its target file in a HashMap that contains a lock object for every file in the cache.
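
For illustration, the second idea could be sketched roughly as follows. This is not code from the project, and it is the approach that was not adopted; the class name FileLocks and its methods are hypothetical, and a ConcurrentHashMap stands in for the thread-safe HashMap access described above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of one lock object per cache file name. */
final class FileLocks {
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<String, Object>();

    /** Returns the single lock object associated with the given cache file name. */
    Object lockFor(String cacheFileName) {
        Object fresh = new Object();
        // putIfAbsent guarantees that all threads agree on one lock per name
        Object existing = locks.putIfAbsent(cacheFileName, fresh);
        return existing != null ? existing : fresh;
    }

    /** Number of lock objects currently retained. */
    int size() {
        return locks.size();
    }
}
```

A writer would then use synchronized (fileLocks.lockFor(cacheFile)) { ... } around the write. Nothing in the sketch ever removes an entry, so size() keeps growing with the number of distinct cache files, which is exactly the concern raised above.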

It would be great if there were a way to fix this without too many changes to the existing execution flow.

FileChannel locking can’t be applied to this case

Since JDK 1.4, Java has provided file locking through the lock() and tryLock() methods of the FileChannel class. Unfortunately, those methods are meant for coordination between processes. The mechanism can’t be used to coordinate access between threads in the same JVM. The code below results in java.nio.channels.OverlappingFileLockException when the second FileChannel tries to acquire the lock already held by the first FileChannel.

public static void concurrentWrite() throws IOException {
        FileChannel f1 = new FileOutputStream(new File("data.txt")).getChannel();
        FileChannel f2 = new FileOutputStream(new File("data.txt")).getChannel();

        FileLock lock1 = f1.tryLock();
        System.out.println("lock1 = " + (lock1 == null ? null : " acquired"));

        FileLock lock2 = f2.tryLock();
        System.out.println("lock2 = " + (lock2 == null ? null : " acquired"));
    }

In contrast, if I start the ConcurrentWriter thread below in 2 different processes, the file locking mechanism will work.

class ConcurrentWriter extends Thread {
    private final File output;

    public ConcurrentWriter(File output) {
        this.output = output;
    }

    @Override
    public void run() {
        try {
            writeInLoop();

        } catch (InterruptedException ex) {
            //Let this thread stop;
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }//run

    public void writeInLoop()
    throws InterruptedException, FileNotFoundException, IOException{

        while (true) {
            FileChannel fs = new FileOutputStream(output).getChannel();
            FileLock lock = tryLock(fs);
            try{
                writeData(fs);
            }finally{
                lock.release();
                fs.close();
            }

            Thread.sleep(100);
        }
    }

    public FileLock tryLock(FileChannel fs)
    throws InterruptedException, IOException {

        FileLock lock = null;
        while ((lock = fs.tryLock()) == null) {
            System.out.println(getName() + ": the file is locked, sleep then try again later");
            Thread.sleep(200);
        }
        System.out.println(getName() + ": got the file lock");
        return lock;
    }

    public void writeData(FileChannel fs)
    throws IOException, InterruptedException {
        //write data
    }
}

The output of a writer thread is shown below.

W1: got the file lock
W1: got the file lock
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: got the file lock
W1: the file is locked, sleep then try again later
W1: the file is locked, sleep then try again later
W1: got the file lock
W1: got the file lock

Save by renaming

I came across this approach while I was searching for information about file locking. Each worker thread writes to a temporary file with a unique name, for example the real target file name with the worker thread’s ID appended. Each thread now writes exclusively to its own temporary file. The temporary file is renamed to the real target file after it has been written successfully. The only step under high contention is the rename operation. The approach works only if the rename operation is atomic. Although I found various developers claiming that this operation is atomic on most file systems, I haven’t yet found a solid reference to support the claim. I tested it in my performance environment on a 16-core machine and didn’t see any corrupt data or I/O exceptions. All responses contained the correct content size and the content of the target cache file was updated, so I am quite confident that this technique works in practice.

With this technique, I can get rid of the global write lock, which was the contended synchronization point, and improve throughput dramatically. The technique can be implemented as shown below.

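import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

public class FileUtil {
    private FileUtil() {}

    public static boolean saveByRename(BufferedImage img, String format, File destination)
            throws IOException {
        // Each thread writes to its own temporary file next to the destination
        String tmpFileName = destination.getName() + getUniqueSuffixPerThread();
        File tmpFile = new File(destination.getParent(), tmpFileName);

        FileOutputStream out = new FileOutputStream(tmpFile);
        try {
            // Write through the stream opened above rather than reopening the file
            ImageIO.write(img, format, out);
        } finally {
            out.close();
        }

        // The rename is the only step that threads contend on
        return tmpFile.renameTo(destination);
    }

    private static String getUniqueSuffixPerThread() {
        return Thread.currentThread().getId() + ".tmp";
    }
}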

Note: On Windows, a file can’t be renamed to replace an existing file. No error is raised; the renameTo() method just returns false to indicate the failed attempt.
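
On Java 7 or later, the same idea can also be expressed with java.nio.file.Files.move and StandardCopyOption.ATOMIC_MOVE, which either performs the move as an atomic file system operation or throws AtomicMoveNotSupportedException instead of silently returning false. The sketch below is only an illustration under that assumption, not the code used in the project; the class name AtomicSave and the byte[] payload are hypothetical. Whether an existing target is replaced remains implementation specific.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicSave {
    private AtomicSave() {}

    static void saveByRename(byte[] data, Path destination) throws IOException {
        // A per-thread temporary file in the same directory as the destination
        Path tmp = destination.resolveSibling(
                destination.getFileName() + "." + Thread.currentThread().getId() + ".tmp");
        Files.write(tmp, data);
        try {
            // Fails with an exception if the file system cannot move atomically
            Files.move(tmp, destination, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // do not leave a stray temporary file behind
            throw e;
        }
    }
}
```

Keeping the temporary file in the destination directory matters, because a move across file systems generally cannot be atomic.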

What I personally like most about this approach is that it doesn’t require much change. I don’t need to change the execution flow, and it doesn’t involve complicated logic that would require a lot of test cases.
