Despite the many options for connecting applications and services within and across organizational boundaries, message-oriented middleware (see MOM on Wikipedia) remains the primary choice for architects and developers who need to deliver messages reliably. There are a handful of messaging products on the market, including IBM WebSphere MQ, Tibco EMS, Progress SonicMQ, Apache ActiveMQ, Pivotal RabbitMQ, Red Hat HornetQ (now forgotten), etc.
On the surface all of these products appear to do the same thing: deliver a message from point A to point B in a secure and reliable manner. There are also non-reliable delivery options, as well as publish-subscribe models. However, there are significant differences between these products in reliability, security, performance, admin capabilities and cost. Before you start implementing your enterprise project you need to understand the technical and cost limitations of the software you are going to use. I have described some of the common pitfalls of such a decision process in my blog post: “How to NOT buy enterprise software”.
Back in November 2013 I set out to do a performance comparison between WebSphere MQ and Apache ActiveMQ. This was a fun project and I am still not completely done with tuning. In this first article I will describe the overall approach to my performance project, as well as the installation, setup and load generation instructions in case you ever want to run your own tests. Yes, I will share complete instructions in Open Source style, including full shell scripts for installation and load generation. After you read this, it should be “a breeze” to run your own benchmark (just kidding, performance testing is never easy as there are too many variables to control).
Performance benchmarking can be complex and time consuming. For one thing, there are many possible configurations of hardware, software, load drivers, test applications, etc. No matter the choice, I am sure there will be folks criticizing my selections. The diagram below shows the configuration of the test environment used for the benchmark.
- The host server used for the benchmark is an IBM x3950 M2 server, model 7233-AC1 (circa 2009), with 24 Intel Xeon X7440 2.66 GHz cores and 256 GB RAM. The host machine had four 300 GB SSD drives.
- The client load driver and the messaging server software ran in guest VMs running Red Hat Enterprise Linux 6.5, 64-bit, kernel 2.6.32-431.el6.x86_64, under the control of VMware ESXi Hypervisor 5.0.
- Each VM has 8 processor cores and 32 GB of RAM.
- I used the latest versions of the messaging software: IBM WebSphere MQ v220.127.116.11 and Apache ActiveMQ 5.9.0 running on Oracle JDK 18.104.22.168 (all of the above 64 bit).
- The load driver used for the benchmark is based on IBM Performance Harness for JMS (see Requestors and Responders in the diagram above and see perfharness.sh in the project download).
- For persistent messaging there are a total of 80 remote requestor threads (in 4 different perfharness processes, each connecting to its own message server) and similarly 80 responder threads (local to the server). Since I had 4 SSDs and 8 cores on the server, I got the best performance results when I ran 4 instances of WMQ and 4 instances of AMQ.
- Each server instance had 5 request queues and 5 reply queues, for a total of 40 queues across the benchmark. The persistent test was run with transactional JMS messages for both WMQ and AMQ.
- All three virtual machines were interconnected via a private isolated virtual network using the VMXNET3 virtual adapter.
- You may want to see the video of how all this works in this blog post.
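The thread and queue counts in the list above follow from simple arithmetic, which can be sketched in a few lines of Python. This is an illustration only and not part of the project scripts. The class and field names are made up for this example, and the even split of requestor threads across the perfharness processes is an assumption (the setup above gives only the totals).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    """Benchmark layout as described in the list above (names are hypothetical)."""
    server_instances: int = 4            # 4 instances of WMQ or AMQ
    request_queues_per_instance: int = 5
    reply_queues_per_instance: int = 5
    requestor_threads_total: int = 80    # spread over 4 perfharness processes
    responder_threads_total: int = 80    # local to the server

    @property
    def total_queues(self) -> int:
        per_instance = self.request_queues_per_instance + self.reply_queues_per_instance
        return self.server_instances * per_instance

    @property
    def requestors_per_process(self) -> int:
        # Assumption: threads are split evenly, one perfharness process per instance.
        return self.requestor_threads_total // self.server_instances


topology = Topology()
print("total queues:", topology.total_queues)
print("requestor threads per perfharness process:", topology.requestors_per_process)
```

Changing `server_instances` in this sketch is a quick way to see how many queues and threads a different layout would need.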
A few words on performance test methodology
Before I get into the specifics of running this test, I would like to make a few general observations about performance testing. Before doing any kind of tuning, you need to run baseline performance tests and understand the physical limitations of your machine as well as your out-of-the-box performance.
- Isolate your workload from everything else. Isolate your network, isolate your disks and processors. If you do not, you may get an order-of-magnitude difference in results and not be able to explain it. In my test I was the only one using the server, with local disks and a private in-memory network between my VMs.
- Test the network speed between client and server. This is important because the network may be your limiting factor (which it was for me at some point). There are many network performance testing tools available; I used iperf, which is free and easy to use. Even if you have 10 Gbit Ethernet in your servers, it does not mean you are going to get all that speed. As a matter of fact, so far I have only gotten 3.1 Gbit/sec on my VMXNET3 adapter (about 388 MB/sec). This is only good enough for persistent messages, where I use only about 40% of the network capacity. Message rates and network requirements for non-persistent messages are 2-4 times higher, so I need to double or triple my network speed before I can continue testing non-persistent scenarios. For now I have maxed out my network capacity with WebSphere MQ, but still have about 25% of CPU idle.
- For persistent messaging tests you must know the raw performance of your disks. I use 4 SSDs for each VM and tested disk read/write speed by copying many small and large files from one disk to another and measuring the times. This gives me about 744 MB/sec from all four disks combined. This is more speed than I get from my network.
- Monitor all key characteristics of your system while you are testing performance. At a minimum you must log and monitor (a) CPU, (b) disk, (c) network and (d) memory of your client driver and servers. Ideally you also need to monitor other things, such as log files, message content (to make sure you are not testing the error rate), etc. If you are truly trying to test maximum performance, you should max out one or more of these resources (i.e. 100% of CPU or disk or network). If you aren't at 100%, it generally means that there is a limiting factor that can be improved or tuned until at least one of those metrics reaches 100%, which is the point where you have gotten the most out of your hardware.
- Measure a baseline for each iteration of the test, ideally with an increasing number of input files, messages or concurrent users. For example, go from 1 to 10, 20, 50, 100, 200, 500, 1000, 2000, 5000 input messages or client threads.
- After the baseline tests are done, start tuning the environment, one tuning variable at a time, and retest. If you change two variables and get better or worse performance, you won't know which of the two had the impact. Hence only change one thing at a time and retest.
- Clean the system state between test runs. This could mean clearing any remaining messages from the queues or resetting the database. After some period of time (if not between tests) you need to restart your JVMs and queue managers and even the OS and hypervisor. Long stress tests can cause side effects in all layers of the software stack and you need to reset your environment once in a while.
- If time permits, test with different request / response message sizes – for example, 1K, 10K, 100K, 1MB, 10MB, etc. Your application requirements should be driving the kinds of tests and message properties to be tested.
- Stress test – find where the system breaks – max number of users? max size of the message? etc…
- Long run test: run the workload over a 24-hour (or longer) period to see the stability of the product. A 1 or 2 minute run may not be representative of real workloads, as the effects of garbage collection, memory leaks, etc. will not show up in a 2 minute test.
- Scalability test – add more instances – JVMs on a machine (or threads, execution groups, etc.) or even instances across multiple machines.
- I highly recommend that you read this book, which will give you one of the best methodologies I have seen for performance testing: “Performance Analysis for Java Web Sites”. You may also want to read the IBM WebSphere MQ performance reports as well as other performance reports. Google is your friend.
- Last but not least: automate your test runs! This is key for repeatable and predictable results. You need to automate everything: starting your servers, running tests, cleaning the system between tests, etc. This is exactly what I have done in my scripts, as you will see later in this post.
- Oh, one more thing. Performance testing can be a slow process, mainly because it is so iterative and manual. However, I believe it is possible to solve the performance tuning problem automatically. I have not gotten around to implementing this just yet, but here is my idea on how this can be done quickly and efficiently using neural networks or mathematical programming and optimization.
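Going back to the network and disk measurements in the list above, the check for which resource runs out first is simple arithmetic. Here is a minimal Python sketch using the numbers quoted in this post. The function names are made up for this example, decimal units are assumed (1 Gbit = 1000 Mbit), and the non-persistent figures are only a rough projection from the "2-4 times higher" estimate, not measured results.

```python
from typing import Dict


def gbit_to_mbyte_per_sec(gbit_per_sec: float) -> float:
    """Convert Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_sec * 1000 / 8


def limiting_resource(capacity_mb_per_sec: Dict[str, float]) -> str:
    """Return the name of the resource with the lowest raw throughput."""
    return min(capacity_mb_per_sec, key=capacity_mb_per_sec.get)


network_mb = gbit_to_mbyte_per_sec(3.1)   # iperf result on the VMXNET3 adapter
disk_mb = 744.0                           # all four SSDs combined

# Persistent tests use about 40% of the network capacity.
persistent_load_mb = 0.40 * network_mb

# Rough projection: non-persistent traffic is 2-4 times higher.
nonpersistent_low_mb = 2 * persistent_load_mb
nonpersistent_high_mb = 4 * persistent_load_mb

print(f"network capacity: {network_mb:.1f} MB/s, disk capacity: {disk_mb:.1f} MB/s")
print("lower raw capacity:", limiting_resource({"network": network_mb, "disk": disk_mb}))
print(f"projected non-persistent load: {nonpersistent_low_mb:.0f}-{nonpersistent_high_mb:.0f} MB/s")
print("network sufficient for upper estimate:", nonpersistent_high_mb <= network_mb)
```

Under these assumptions the upper end of the projected non-persistent load exceeds what the network can carry, which is consistent with the need for a faster network before running those scenarios.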
In the full spirit of openness I am publishing all of the installation and configuration steps and tuning options, as well as an automated shell script to generate the load. You can find it in my google doc:
I am lazy (in a good way), so after having worked on this project for a bit, I quickly got tired of manually changing client and server configuration, starting and stopping servers, synchronizing TCP and load driver tuning options across all of my VMs, etc. So I wrote an automated script that does it all in one step. To execute a fully automatic test, all you need to do is log in to the client VM and run this command: ./run_all.sh. This command will copy the latest configuration files from your client load driver to both servers, start the queue managers, start the responder threads, start the iostat command to log CPU, memory and disk usage into log files on all three VMs, start the requestor threads, iterate over multiple message sizes and tuning settings, and finally consolidate the performance results from the multiple output files into a single number per test. Phew, that was a lot of automation :-).
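To illustrate the "iterate over multiple message sizes and tuning settings" step, here is a minimal Python sketch of how such a test plan could be generated while changing only one tuning variable at a time. It is not taken from run_all.sh. The function names, the tuning setting names (`tcp_buffer_kb`, `threads_per_process`) and all values are hypothetical placeholders.

```python
from typing import Dict, Iterator, List, Tuple

Config = Dict[str, int]


def one_at_a_time(baseline: Config, candidates: Dict[str, List[int]]) -> Iterator[Config]:
    """Yield the baseline first, then variants that differ from it in exactly one setting."""
    yield dict(baseline)
    for name, values in candidates.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            variant = dict(baseline)
            variant[name] = value
            yield variant


def build_plan(baseline: Config,
               candidates: Dict[str, List[int]],
               message_sizes: List[int]) -> List[Tuple[int, Config]]:
    """Combine every configuration with every message size."""
    return [(size, config)
            for config in one_at_a_time(baseline, candidates)
            for size in message_sizes]


# Hypothetical settings and values, for illustration only.
baseline = {"tcp_buffer_kb": 64, "threads_per_process": 20}
candidates = {"tcp_buffer_kb": [64, 128, 256], "threads_per_process": [20, 40]}
message_sizes = [1024, 10 * 1024, 100 * 1024]  # bytes

for size, config in build_plan(baseline, candidates, message_sizes):
    print(size, config)
```

Because every variant differs from the baseline in a single setting, any change in the measured rate can be attributed to that one setting.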
For those not interested in reading install doc, you can simply have a look at the tuning configuration and load scripts in this Dropbox folder:
I am doing final tuning of both WebSphere MQ and ActiveMQ and will publish results in a few weeks on this blog. ActiveMQ results seem to be a bit too slow and I am using several performance guides from multiple sources trying to tune it before I make my numbers public. I am giving it the best I can. In the meantime, please let me know if you have done similar performance tests and what your experience was.
- WMQ vs AMQ persistent performance results (part 3)
- Video of the performance testing in action
- 60 second install of WebSphere MQ
- WebSphere Message Broker vs. Oracle Service Bus performance benchmark
- IBM (still) delivers more performance at lower cost – response to the Oracle’s latest (misleading) performance claims