Sunday, 16 March 2014

Several approaches to processing events (part 2 - asynchronous processing with blocking queues and a thread pool)

In the previous post we solved the problem of counting the prime factors of numbers read from a large file.
We took a synchronous approach, which was quite easy to implement but very slow.
This time we'll go the asynchronous way, using blocking queues and a thread pool.

First of all, let's take a look at our Reader class:

The reader reads each line and puts it into the blocking queue. Once it has finished, it counts down the latch.
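
A minimal sketch of such a reader, under a few assumptions: the constructor signature is a guess, and plain java.nio file access is used here rather than whatever I/O helpers the project itself relies on.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical reconstruction: pushes every line of a file into a blocking queue.
class Reader implements Runnable {
    private final Path input;
    private final BlockingQueue<String> inputQueue;
    private final CountDownLatch inputLatch;

    Reader(Path input, BlockingQueue<String> inputQueue, CountDownLatch inputLatch) {
        this.input = input;
        this.inputQueue = inputQueue;
        this.inputLatch = inputLatch;
    }

    @Override
    public void run() {
        try (BufferedReader lines = Files.newBufferedReader(input)) {
            String line;
            while ((line = lines.readLine()) != null) {
                inputQueue.put(line); // blocks while a bounded queue is full
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        } finally {
            inputLatch.countDown(); // signal that no more lines will arrive
        }
    }
}
```

Counting down in a finally block means the consumers are released even if reading fails halfway through the file.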

The LineProcessor class is more complicated:

Let's take a closer look at the collaborators.

inputQueue is the queue that the Reader writes to.
successQueue and exceptionsQueue are the queues that are populated depending on the result of processing each line.
inputLatch is the latch counted down by the Reader.
outputLatch is counted down when there are no more lines to be processed.

The LineProcessor checks whether the Reader has already finished by inspecting inputLatch and inputQueue.
If the Reader has not finished, or lines are still waiting, it takes a line from the inputQueue and populates the appropriate queue with the result.
If the Reader has finished and the queue is empty, it counts down outputLatch and terminates.
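
The loop described above could be sketched as follows. This is an illustration only: the "number,count" output format, the 100 ms poll timeout and the inlined countPrimeFactors helper are assumptions made to keep the example self-contained.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical reconstruction of the processing loop.
class LineProcessor implements Runnable {
    private final BlockingQueue<String> inputQueue;
    private final BlockingQueue<String> successQueue;
    private final BlockingQueue<String> exceptionsQueue;
    private final CountDownLatch inputLatch;
    private final CountDownLatch outputLatch;

    LineProcessor(BlockingQueue<String> inputQueue, BlockingQueue<String> successQueue,
                  BlockingQueue<String> exceptionsQueue, CountDownLatch inputLatch,
                  CountDownLatch outputLatch) {
        this.inputQueue = inputQueue;
        this.successQueue = successQueue;
        this.exceptionsQueue = exceptionsQueue;
        this.inputLatch = inputLatch;
        this.outputLatch = outputLatch;
    }

    @Override
    public void run() {
        try {
            // Done only when the reader has finished AND nothing is left to take.
            while (!(inputLatch.getCount() == 0 && inputQueue.isEmpty())) {
                // Poll with a timeout so the loop re-checks the latch instead of blocking forever.
                String line = inputQueue.poll(100, TimeUnit.MILLISECONDS);
                if (line != null) {
                    process(line);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            outputLatch.countDown(); // tell the writers this processor is done
        }
    }

    private void process(String line) throws InterruptedException {
        try {
            long number = Long.parseLong(line.trim());
            successQueue.put(number + "," + countPrimeFactors(number));
        } catch (NumberFormatException e) {
            exceptionsQueue.put(line + "," + e);
        }
    }

    // Counts prime factors with multiplicity (assumed), e.g. 12 = 2 * 2 * 3 gives 3.
    static int countPrimeFactors(long n) {
        int count = 0;
        for (long d = 2; d * d <= n; d++) {
            while (n % d == 0) {
                count++;
                n /= d;
            }
        }
        return n > 1 ? count + 1 : count;
    }
}
```

Using poll with a timeout rather than take matters when several processors share the queue: a processor that loses the race for the last line wakes up, sees that the Reader is done, and exits instead of waiting forever.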

The Writer class is quite simple:

It takes the messages from the queue and writes them to a file.
It terminates when the queue is empty and the latch indicates that there are no more messages.
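
A sketch of a writer along these lines, assuming one message per output line and the same hypothetical 100 ms poll timeout; java.nio is used here in place of the project's own I/O helpers.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical reconstruction: drains a queue into a file until the producers are done.
class Writer implements Runnable {
    private final Path output;
    private final BlockingQueue<String> queue;
    private final CountDownLatch outputLatch;

    Writer(Path output, BlockingQueue<String> queue, CountDownLatch outputLatch) {
        this.output = output;
        this.queue = queue;
        this.outputLatch = outputLatch;
    }

    @Override
    public void run() {
        try (BufferedWriter out = Files.newBufferedWriter(output)) {
            // Stop only when all processors have finished AND the queue is drained.
            while (!(outputLatch.getCount() == 0 && queue.isEmpty())) {
                String message = queue.poll(100, TimeUnit.MILLISECONDS);
                if (message != null) {
                    out.write(message);
                    out.newLine();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```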

The only thing left is the Bootstrap class that binds it all together:

The latches and queues are initialized and then passed to the processing objects via their constructors.
The line processors are submitted to a thread pool.
The Reader and the two writers are started as threads. We then wait until both writers have finished and print the summary.
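
The wiring might look roughly like the sketch below. To stay self-contained it uses lambdas and in-memory lists as stand-ins for the Reader, LineProcessor and Writer classes, so the drain and process helpers, the result format and the 50 ms timeout are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical wiring; lambdas stand in for the Reader, LineProcessor and Writer classes.
class Bootstrap {
    final List<String> successes = Collections.synchronizedList(new ArrayList<>());
    final List<String> failures = Collections.synchronizedList(new ArrayList<>());

    void run(List<String> lines, int processorCount) throws InterruptedException {
        BlockingQueue<String> inputQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> successQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> exceptionsQueue = new LinkedBlockingQueue<>();
        CountDownLatch inputLatch = new CountDownLatch(1);               // one reader
        CountDownLatch outputLatch = new CountDownLatch(processorCount); // one count per processor
        long start = System.nanoTime();

        ExecutorService pool = Executors.newFixedThreadPool(processorCount);
        for (int i = 0; i < processorCount; i++) {
            pool.execute(() -> {
                try {
                    drain(inputQueue, inputLatch, line -> process(line, successQueue, exceptionsQueue));
                } finally {
                    outputLatch.countDown();
                }
            });
        }
        pool.shutdown(); // no new tasks; the submitted processors keep running

        Thread reader = new Thread(() -> {
            inputQueue.addAll(lines);
            inputLatch.countDown();
        });
        Thread successWriter = new Thread(() -> drain(successQueue, outputLatch, successes::add));
        Thread exceptionsWriter = new Thread(() -> drain(exceptionsQueue, outputLatch, failures::add));
        reader.start();
        successWriter.start();
        exceptionsWriter.start();

        successWriter.join();
        exceptionsWriter.join();
        System.out.println("Processed in " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }

    // Handles items until the upstream latch is released and the queue is empty.
    static void drain(BlockingQueue<String> queue, CountDownLatch upstreamDone, Consumer<String> handler) {
        try {
            while (!(upstreamDone.getCount() == 0 && queue.isEmpty())) {
                String item = queue.poll(50, TimeUnit.MILLISECONDS);
                if (item != null) {
                    handler.accept(item);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static void process(String line, BlockingQueue<String> ok, BlockingQueue<String> failed) {
        try {
            long number = Long.parseLong(line.trim());
            ok.add(number + "," + countPrimeFactors(number));
        } catch (NumberFormatException e) {
            failed.add(line + "," + e);
        }
    }

    // Counts prime factors with multiplicity (assumed).
    static int countPrimeFactors(long n) {
        int count = 0;
        for (long d = 2; d * d <= n; d++) {
            while (n % d == 0) {
                count++;
                n /= d;
            }
        }
        return n > 1 ? count + 1 : count;
    }
}
```

Note that outputLatch starts at the number of processors, so the writers stop only after every processor has counted down.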

We can check the behavior with the following test:

On a machine with a 2.53 GHz Core 2 Duo it takes ~43 seconds to process 10 000 numbers, compared with ~73 seconds for the synchronous version from part 1.
The whole project can be found on GitHub.

Wednesday, 26 February 2014

Scheduling tasks with Spring (updated)

Spring provides an easy way to schedule tasks.

Let's say that we would like to print the current time to the console periodically.

The class that prints the time may look like this:

We need to define a class that will encapsulate the printing task:

The @Scheduled annotation is the key here: the reportCurrentTime method is annotated with it, so it is invoked every 5 seconds.
You can also specify a cron expression instead, or use the fixedRateString parameter if you want to read the rate from a properties file.
Note that the task sets the thread name; the test will need it.
Adding production code only for tests is generally not good practice, but in this case the thread name can also be used for monitoring.

The Spring configuration looks as follows:

To run it we need to create an invoker class:

Unfortunately there is no trivial way to test it automatically. We can do it the following way.
Let's create a test class:

When we run the test, the Spring context starts and the scheduler invokes the task.
We check whether a thread with our name exists.
We keep checking for some time to avoid a race condition: the verification method may be invoked before the thread starts.
The whole project, along with its dependencies, can be found on GitHub.
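
The polling idea can be illustrated without Spring. The sketch below is not the post's test: it swaps in a plain ScheduledExecutorService for the Spring scheduler, and the names ThreadNamePoller, startNamedTask and waitForThread are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// JDK-only stand-in for the Spring scheduler, used to show the polling check.
class ThreadNamePoller {

    // True if any live thread currently carries the given name.
    static boolean threadExists(String name) {
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (name.equals(thread.getName())) {
                return true;
            }
        }
        return false;
    }

    // Polls until the thread shows up or the timeout expires.
    static boolean waitForThread(String name, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (threadExists(name)) {
                return true;
            }
            Thread.sleep(50);
        }
        return threadExists(name);
    }

    // Schedules a task that renames its own thread, as the scheduled task in the post does.
    static ScheduledExecutorService startNamedTask(String name, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> Thread.currentThread().setName(name), 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A test would call startNamedTask, assert that waitForThread returns true within a generous timeout, and shut the scheduler down afterwards.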

Monday, 13 January 2014

Several approaches to processing events (part 1 - synchronous processing)

Let's consider the following problem that we have to solve:

We have a file with numbers (each line has one number).
Our goal is to process every number and count its prime factors.
We need to write the result along with the processed number in a separate file.
If an exception occurs during processing, we need to write it to a file together with the number that was being processed when it occurred.
Apart from that we also need to write the summary of the time that we spent on processing.

The class that counts prime factors looks as follows:
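
A minimal sketch of such a class, using trial division. The class name PrimeFactorCounter and the choice to count factors with multiplicity (so 12 = 2 * 2 * 3 gives 3) are assumptions.

```java
// Hypothetical sketch: counts prime factors by trial division.
class PrimeFactorCounter {

    // Returns the number of prime factors of n, counted with multiplicity.
    // Numbers below 2 have no prime factors, so the result is 0 for them.
    static int primeFactors(long n) {
        int count = 0;
        for (long divisor = 2; divisor * divisor <= n; divisor++) {
            while (n % divisor == 0) {
                count++;
                n /= divisor;
            }
        }
        // Whatever remains above 1 is itself a prime factor.
        return n > 1 ? count + 1 : count;
    }
}
```

Trial division is deliberately naive here; its cost on large inputs is what makes the processing slow enough to compare the different approaches.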

The simplest, though not the optimal, way to solve this would be to process each line synchronously. It could look like this:

We read the lines from the numbers.txt file and then process each line in a for loop. At the end we write everything to three files.
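
The flow just described can be sketched as below. It is an illustration under assumptions: the class and method names, the "number,count" line format and the summary wording are hypothetical, and java.nio is used directly for file access.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the synchronous approach: read, process in a loop, write three files.
class SynchronousProcessor {

    static void process(Path input, Path successFile, Path exceptionsFile, Path summaryFile)
            throws IOException {
        long start = System.nanoTime();
        List<String> successes = new ArrayList<>();
        List<String> exceptions = new ArrayList<>();

        for (String line : Files.readAllLines(input)) {
            try {
                long number = Long.parseLong(line.trim());
                successes.add(number + "," + countPrimeFactors(number));
            } catch (NumberFormatException e) {
                // Keep the offending input next to the exception.
                exceptions.add(line + "," + e);
            }
        }

        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        Files.write(successFile, successes);
        Files.write(exceptionsFile, exceptions);
        Files.write(summaryFile, Collections.singletonList("Processed in " + elapsedMillis + " ms"));
    }

    // Counts prime factors with multiplicity (assumed).
    static int countPrimeFactors(long n) {
        int count = 0;
        for (long d = 2; d * d <= n; d++) {
            while (n % d == 0) {
                count++;
                n /= d;
            }
        }
        return n > 1 ? count + 1 : count;
    }
}
```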

The Reader and Writer classes are quite simple:

They use the Guava and Apache Commons dependencies, declared as follows:

We can check the results with the following test:

On a machine with a 2.53 GHz Core 2 Duo it takes ~73 seconds to process 10 000 numbers.

The whole project can be found on GitHub.

In the next posts we'll take a look at other approaches to solving this problem.