Failover demos of WebSphere MQ and Apache ActiveMQ


What are the primary reasons to use Message Oriented Middleware (MOM) in an application today? Why would you install, configure and manage this extra “layer” in your already complex architecture? Why not use HTTP, sockets, IIOP, POP3, Atom, and so on? MOM provides several useful properties, including:

  1. Asynchronous messaging
  2. Reliable transactional message delivery (best effort is also supported)
  3. Publish/subscribe (in addition to point-to-point)
  4. Decoupling message producers and consumers logically and physically, with the ability to route and transform messages as they pass through
  5. Support for many programming languages, platforms and network protocols

While some of the aforementioned properties can be gained from HTTP and other protocols, there is no alternative to MOM when you need #2 (short of using distributed transactions with databases) or when you need a combination of all of the above. If you do not need #2, then MOM arguably still provides benefits over other protocols through its flexibility and tool support for achieving your messaging goals. My conclusion, therefore, is that reliable message delivery is perhaps the most critical quality of a MOM product. This means no duplicated messages and no lost messages; a delay may be acceptable, but eventually every message must be delivered to its destination(s).
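To make the "no duplication" requirement concrete, here is a minimal Java sketch of what an application has to do for itself when the middleware only offers at-least-once delivery and a message is redelivered after a failover. Everything in it is hypothetical: the `Message` and `IdempotentConsumer` classes, the message IDs and the bodies are illustrative names, not part of the WebSphere MQ or ActiveMQ APIs, and the sketch assumes that the producer assigns a unique ID to every message.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from WebSphere MQ or ActiveMQ.
public class IdempotentConsumerSketch {

    // A message carrying a producer-assigned unique ID (an assumption of this sketch).
    static final class Message {
        final String id;
        final String body;

        Message(String id, String body) {
            this.id = id;
            this.body = body;
        }
    }

    // Processes each message ID at most once, even if the same message arrives twice.
    static final class IdempotentConsumer {
        private final Set<String> seenIds = new HashSet<>();
        private final List<String> processed = new ArrayList<>();

        // Returns true if the message was processed, false if it was a duplicate.
        boolean onMessage(Message m) {
            if (!seenIds.add(m.id)) {
                return false; // ID already seen: drop the redelivered copy
            }
            processed.add(m.body);
            return true;
        }

        // Returns a copy of the bodies processed so far, in arrival order.
        List<String> processed() {
            return new ArrayList<>(processed);
        }
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        consumer.onMessage(new Message("order-1", "debit $10"));
        // Simulated failover: the same message is delivered a second time.
        consumer.onMessage(new Message("order-1", "debit $10"));
        consumer.onMessage(new Message("order-2", "debit $5"));
        System.out.println(consumer.processed());
    }
}
```

The sketch keeps the set of seen IDs in memory, so it would not survive a restart of the consumer. A real application would have to store those IDs durably and update them in the same transaction as the business work, which is exactly the kind of burden that reliable transactional delivery in the MOM is meant to remove.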

Edison Group recently tested the reliability of WebSphere MQ and Apache ActiveMQ under different kinds of power and network failures and recorded their findings in these two videos:

WebSphere MQ failover video:

Apache ActiveMQ failover video:

Here is the summary of the failover tests shown in both videos above:

[Image: failover_summary]

Some of the findings are documented in the ActiveMQ JIRA and are known in the community.



Categories: Technology


13 replies

  1. The behavior of ActiveMQ Master/Slave depends on the configuration of the NFS connection. For example, if I mount the shared storage with the `soft` option and break the connection to the storage, the master crashes. An administrator must restore the master by hand, but there are no duplicate messages.

    The NFS connection to the store needs to be configured carefully.


  2. Please describe the exact ActiveMQ configuration used in this test, including the configuration of the persistence providers and the broker network.


  3. Some feedback on this AMQ fail-over test. The video doesn’t mention it, but I’m assuming they are using KahaDB as their persistent store. If so, to avoid the master/master scenario you’d want to add a configuration property:

    http://activemq.apache.org/pluggable-storage-lockers.html

    The reason for the message problems is that they had 2 brokers writing to the same data store and one was overwriting the other. Obviously not a good thing, but just as there are default settings in WMQ that you need to alter for your purpose, the same goes for the default AMQ configuration.

    It would be worth re-doing your own test after tuning the locking as I think you’ll see more desirable results. After this sort of scenario, you may be able to access the console on both brokers, but only one should be readable/writable.

    No clue why Scenario 4 was included. How does not reading the documentation equal a lack of a feature? Marking it as a fail is pretty unfair. You’ll find clustering in AMQ is often referred to as a network of brokers (http://activemq.apache.org/networks-of-brokers.html). There are numerous topologies that can be configured and it’s very flexible. You can mix the same HA solutions with AMQ as you can with WMQ.

    I’m a fan of both providers and have used both extensively in large enterprise environments, but this comparison is pretty sloppy. If anyone is reading this to decide on AMQ or WMQ, I’d recommend looking elsewhere or doing a bit of experimenting yourself.


    • Eric, thanks for your comments. Since I did not do these tests myself, I can’t say what changes were made to the default AMQ configuration as part of this test, but I will contact Edison Group and perhaps they can answer your questions. I do know that they did not simply use the default configuration and that they spent a lot of time tinkering with settings and reading documentation. In any case, thanks again for your comment and let’s see what the Edison Group folks have to say about this.


    • Eric,

      We appreciate your feedback on the video and test results. The main purpose of these initial video comparisons was to show a simple failover scenario, trying to compare apples to apples (ActiveMQ’s Master/Slave and WMQ’s backup queue managers), and how they react to a failover in a controlled environment (i.e. a network disruption caused by disabling the NIC in our VMware environment). You are correct that we are using KahaDB with the base configuration for this specific test, and we are using an NFSv4 file share on a separate server. With that said, we have also previously performed this test many times while tuning the configuration options that are defined in the link that you sent. For sanity purposes, we spent the past couple of days re-validating these results.

      Based on our testing, as well as similar issues posted in user forums, we do feel this is a limitation of ActiveMQ specific to this failover scenario (network disruption). To be fair, if the original failover test had been a power outage (or just stopping and starting the ActiveMQ process), the original Master would have come back up appropriately as a “Slave”. This scenario also validated that the locking and configuration are set up correctly and work in certain cases.

      Lastly, in regard to your comment on Scenario 4, this was not actually meant to be part of the video or results (only Scenarios 1-3 should have been shown). It was part of a larger comparison effort. Based on some of our initial testing, research, and experience, we do see that there are differences and advantages in using WMQ clustering vs ActiveMQ’s Network of Brokers, but I will follow up in a separate reply to address this in more detail.

      Bill

      Bill Karounos
      Edison Group – Messaging Analyst


      • Bill

        Thanks for the explanations and clarification.

        I do not like AMQ’s default handling of the master/slave scenario that was tested. Even when configured, it can be very confusing to understand what is going on, as the broker actually remains running with many of the supporting components still up. In this scenario, it’s possible that some clients won’t even know to reconnect if they maintain a persistent connection and the transport connectors did not go down. The test in the video would not have run into this, but it’s worth noting.

        So I’d agree that WMQ’s implementation of master/slave is much more trustworthy as a fail-over solution.

        I’m glad to hear that the 4th scenario should not have been included. The two clustering solutions are a bit difficult to compare like for like …………… (deleted by the moderator)…………


      • Eric,
        Thanks for your comments. I would love to learn the specifics behind the point of view in your last paragraph. I moderated it because it was an opinion without evidence, contrary to what we have seen in the lab and in user forums. If, however, you have a chance to share technical evidence to support your opinion, I will be happy to post your comments with those technical details.


Trackbacks

  1. White paper: IBM WebSphere MQ 7.5 versus Apache ActiveMQ 5.9 « WhyWebSphere.com Blog
  2. IBM MQ vs. Apache ActiveMQ performance comparison update | WhyWebSphere Blog
