Wednesday, January 15, 2014

Zero Downtime for Coherence Infrastructure (24x7 Availability) as part of a Planned Deployment Strategy


Coherence is a reliable in-memory data grid product offering OOTB failover & continuous availability with extreme scalability. But at times we face challenges during Coherence deployment and tend to lean towards a clean restart of the entire Coherence cluster. This defeats the purpose of 24x7 availability of the data grid layer and eventually the availability of dependent applications as well.
This topic has come up in discussions with several people, so I am sharing my thoughts on a Coherence deployment strategy that requires no downtime and ensures continuous availability.

In my opinion, there are three high-level scenarios with respect to Coherence deployment:

Scenario 1 - Deployment of Application, which is using Coherence Data Grid Layer
  • Problem Statement: Typically, this is the case when there are multiple web or native applications backed by a Coherence data grid layer. Often, the infrastructure team tends to restart the Coherence cluster during deployment, causing downtime to the cache layer & eventually the entire application. This can extend the application's downtime to hours, as a clean restart of Coherence usually takes time.
  • Solution Approach: 
    • As a best practice, Coherence cluster shutdown & restart should be avoided wherever possible. Coherence does not need a clean restart unless there are changes in libraries (which is the second scenario below).
    • If there is a requirement to clean up existing cache entries and replace them with new ones, then it is more a matter of maintaining application versions of cache items than of restarting the cache system. Typically, each cache item can carry version information (a getter method like getVersion()) and, post deployment, previous-version entries can be discarded by the application (see the sketch after this list).
    • You can also refer to Cache Invalidation Strategies, which come as an OOTB feature in Coherence.
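
As a minimal sketch of this version-based clean-up (the cache name, getVersion() accessor & version value are illustrative, and the classic Coherence 3.x API is assumed), stale entries can be removed inside the grid with a filter-driven entry processor:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.filter.AlwaysFilter;
import com.tangosol.util.filter.NotEqualsFilter;
import com.tangosol.util.processor.ConditionalRemove;

public class StaleVersionCleanup {
    public static void main(String[] args) {
        NamedCache cache = CacheFactory.getCache("product-cache"); // illustrative cache name
        Integer currentVersion = 42; // version shipped with the new application release
        // Remove every entry whose getVersion() does not match the current version.
        // The entry processor executes inside the grid, so no data is pulled to the client.
        cache.invokeAll(
                new NotEqualsFilter("getVersion", currentVersion),
                new ConditionalRemove(AlwaysFilter.INSTANCE));
        CacheFactory.shutdown();
    }
}

The application keeps serving from the grid throughout; only the stale-version entries disappear.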
Scenario 2 - Deployment of Application with updated Coherence Application Libraries, which is using Coherence Data Grid Layer
  • Problem Statement: This scenario applies where the Coherence application cache is used with read-through or write-through patterns. In this case, application-specific JAR files or libraries need to be updated on the Coherence nodes & hence the infrastructure team tends to shut down the entire Coherence cluster for a clean restart.
  • Solution Approach: 
    • As a best practice, Coherence cluster shutdown & restart should be avoided wherever possible.
    • A cyclic (rolling) restart can help in this case, along with version-based maintenance & cache invalidation strategies for cache items (as explained in Scenario 1).
    • Note that invalidation or cache-item clean-up plays a critical role here: even as Coherence nodes are restarted one by one, their data is automatically recovered from backups held by other nodes. In essence, the failover feature works against a clean deployment, so be careful with the clean-up approach.
Scenario 3 - Coherence Configuration Change as part of Deployment
  • Problem Statement: This scenario applies where there are changes in Coherence configuration (cluster configuration or otherwise). Note that a node whose configuration differs (even slightly) from the cluster's will be rejected by the Coherence cluster. Examples include a change in security configuration (using an override file), a TTL change or a Coherence edition change.
  • Solution Approach: 
    • The easiest approach is to shut down the entire Coherence cluster (JMX monitoring can help confirm that all Coherence nodes are down) and, post configuration change, restart all nodes. But that defeats our purpose of ZERO DOWNTIME.
    • If zero downtime is needed, then we need to:
      • Set up an entirely new Coherence cluster, e.g. by assigning a new multicast IP address or changing the multicast port (see the example after this list)
      • Make the configuration changes & do a fresh deployment on the new cluster
      • Do a cyclic restart of dependent application servers so they use the new Coherence cluster
      • Discard the old Coherence cluster once all applications have migrated to the new one
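
A minimal sketch of isolating the new cluster (values are illustrative and multicast clustering is assumed; the same properties can also be set in the operational override file):

-Dtangosol.coherence.cluster=AppCluster-v2
-Dtangosol.coherence.clusteraddress=224.3.6.2
-Dtangosol.coherence.clusterport=36002

Nodes started with these settings form a separate cluster and never join the old one, so both clusters can run side by side during the migration.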
There can be multiple other deployment scenarios, but they are variations of the scenarios described above (at least in my mind).

Hope this helps those who are seeking zero-downtime deployment without paying extra for other products like Oracle GoldenGate to achieve the same.

Disclaimer:

All data and information provided on this site is for informational purposes only. This site makes no representations as to accuracy, completeness, correctness, suitability, or validity of any information on this site and will not be liable for any errors, omissions, or delays in this information or any losses, injuries, or damages arising from its display or use. All information is provided on an as-is basis. This is a personal weblog. The opinions expressed here represent my own and not those of my employer or any other organization.

Tuesday, May 7, 2013

Comparing Open Source Web Services Development Frameworks (Apache CXF, Apache AXIS2, Spring WS)


This blog does not try to compare all available web services development frameworks but focuses only on three popular open-source options: Apache CXF, Apache AXIS2 & Spring WS.

Let's look at the key positives & concerns of each of these frameworks in a nutshell:

Framework: Apache AXIS2

Key Positives:
❶ Most commonly used, mature & stable web services development framework
❷ Supports multiple languages (C++, Java)
❸ Supports both contract-first & contract-last approaches
❹ For orchestration & web services transactions (long-running transactions), supports a wide variety of related WS-* specifications: WS-AtomicTransaction, WS-BusinessActivity, WS-Coordination, WS-Eventing, WS-Transfer
❺ Compatible with Spring Framework

Key Concerns:
❶ Comparatively more code required/generated w.r.t. Spring WS/CXF
❷ Being phased out gradually (mostly in favour of Apache CXF)
❸ Not fully compliant with JAX-WS & JAX-RS

Framework: Apache CXF

Key Positives:
❶ Most widely used web services framework now; an improvement over AXIS2, which it is gradually replacing
❷ Intuitive & easy to use; less coding required compared with AXIS2 (see the sketch after this table)
❸ Clean separation of front-ends, like JAX-WS, from the core code
❹ Fully compliant with JAX-WS, JAX-RS & others
❺ Best performance across the available frameworks, with minimum computation overhead
❻ Supports a wide variety of front-end models
❼ Supports both JAX-WS & JAX-RS (for RESTful services)
❽ Supports JBI & SDO (not supported in AXIS2)
❾ Compatible with Spring Framework

Key Concerns:
❶ Does not support orchestration & WS transactions yet
❷ Does not support WSDL 2.0 yet

Framework: Spring WS

Key Positives:
❶ Best at supporting the contract-first web services development approach
❷ Enforces standards & best practices through framework constraints (no way around them & hence a limitation as well)
❸ Supports Spring annotations as well as JAX-WS
❹ Least code from the developer's perspective
❺ Best aligned with the Spring technology stack (architecturally similar to Spring MVC), including Spring Security

Key Concerns:
❶ Least number of WS-* specifications supported (not fully compliant with JAX-WS)
❷ Spring positions its own conventions as the standard & hence other standards-compliant Java frameworks offer better standards support
❸ Supports only the contract-first web services development model
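
To illustrate the "less coding" point: with CXF's JAX-WS front-end on the classpath, a code-first service can be written & published in a handful of lines. A minimal sketch (the class name, method & address are illustrative, not from any particular project):

import javax.jws.WebService;
import javax.xml.ws.Endpoint;

@WebService
public class HelloService {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Publisher: with the CXF JAX-WS front-end (cxf-rt-frontend-jaxws) on the
// classpath, CXF is picked up as the JAX-WS provider for Endpoint.publish.
class Server {
    public static void main(String[] args) {
        Endpoint.publish("http://localhost:8080/hello", new HelloService());
        System.out.println("Service published; press Ctrl+C to stop.");
    }
}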


I have carried out a more detailed analysis to grill these frameworks further & came up with the following scorecard:



 

Conclusion


  • Apache AXIS2 is the most used framework to date, but Apache CXF scores over the other web services frameworks considering ease of development, current industry trend, performance, the overall scorecard & other features (unless web services orchestration support is explicitly needed, which is not the case here).
  • Though the core Spring Framework is an established technology, Spring Web Services is still evolving in comparison with Apache CXF, and CXF has wider support for standards & is a proven framework from a performance perspective.
  • Hence, Apache CXF is the recommended framework and clearly the most preferred in my opinion.




Friday, June 1, 2012

Code Quality Process/Guidelines for Development Team


A Code Quality Process needs to be established at the start of the development cycle so that all stakeholders (developers, PM, architects, etc.) are aligned to the same objective of delivering quality code.

The following Code Quality Process is based on my learnings and can be customized further to suit any project's needs:


Also, to track Code Quality throughout the development cycle, an Excel-based tool like the one below can be used pro-actively by the team:








Tuesday, December 6, 2011

Evolving the Architecture Formalization Process in the Enterprise 2.0 Way


Software architecture practice across organizations is rapidly adapting itself, organically or inorganically, in response to today's dynamic organizational environment with its massive social media presence. Many organizations have recognized social media platforms by embracing formal Enterprise 2.0 platforms, whereas others have at least started recognizing their importance. Software architecture practice has also evolved into a new era where the ways architecture is discussed, presented & reviewed are changing day by day.

The entire architecture development process has been revolutionized with Enterprise 2.0 in place and, in my opinion, its impact on architecture can be outlined at an abstract level as below:

Social Networking
  Social networking software specifically targeted at the architect community can help share relevant experience informally, in an unstructured way, without any boundaries. It can span within an organization, across organizations & across the globe. Social networking sites like Facebook & LinkedIn are great examples of people discussing architectural & design decisions informally.

Social Collaboration & Feedback
Social collaboration is the most active & useful resource in the field of architecting within an enterprise. Internal wikis, blogs, collaborative office tools & virtual worlds are the key features in this space. One of the best examples I have observed is a best-practices guide for architecting eCommerce applications, developed by experts across the enterprise collaborating through a platform built using Jive.
Social feedback inside the enterprise on architectural decisions & best practices is becoming day-to-day practice in enterprises adopting Enterprise 2.0. It elevates the quality of artifacts & also helps diminish the boundaries between developers, architects & senior management professionals giving feedback on the same platform.

Social Media
Sharing architectural artifacts has never been easier, thanks to social media tools for tagging, bookmarking & content posting such as Flickr & Digg. These aid further communication by promoting links, bookmarks, etc., such as enterprise architecture goals, objectives & vision, which are important for the architect community in the enterprise.

In summary, software architecture needs to keep adapting itself to an ever-changing environment, and adequate enterprise-level support from management is needed for it to evolve with social media in place. Recognizing its presence will not only increase the productivity of architecture development but will also develop a culture of immediate feedback & openness.


Thursday, August 11, 2011

Oracle Coherence Best Practices for Session Management (Replication) in Clustered Application Servers Environment

Oracle Coherence is an in-memory data grid product, which is also commonly used for session replication across a cluster of application server nodes. It supports a wide variety of application servers like WebLogic, WebSphere, Tomcat, JBoss, etc. Coherence*Web is the session management module (built on top of Coherence) used for managing session information in clustered environments.

I would recommend the following best practices w.r.t. Coherence*Web & Coherence usage, particularly for session management (they can also be applied in other Coherence scenarios):

Coherence Deployment Topology

Coherence supports three deployment modes:
¨     In-process - Application servers that run Coherence*Web are storage-enabled, so that the HTTP session storage is co-located with the application servers. No separate cache servers are used for HTTP session storage.
¨     Out-of-process - The application servers that run Coherence*Web are storage-disabled members of the Coherence cluster. Separate cache servers are used for HTTP session storage.
¨     Out-of-Process with Coherence*Extend - The application servers that run Coherence*Web are not part of a Coherence cluster; the application servers use Coherence*Extend to attach to a Coherence cluster which contains cache servers used for HTTP session storage.
Recommendation:
If there is a need for Coherence to extend its boundaries beyond core Coherence TCMP (the internal protocol used by Coherence), use Coherence*Extend, which supports Java, .Net & C++ clients.
In most scenarios, out-of-process is the recommended topology because dedicated cache server nodes run independently, promoting a loosely-coupled physical architecture.

For session replication, sharing the application server's heap with Coherence via in-process deployment creates an undesirable coupling: if application server memory usage increases, it impacts Coherence performance as well & vice versa.

Please make sure of the following for the out-of-process configuration:
¨     Application server nodes run in storage-disabled mode. You need to pass both of these command-line parameters (or set them via the Coherence override file) to the application server JVM:

-Dtangosol.coherence.session.localstorage=false
-Dtangosol.coherence.distributed.localstorage=false


Please note that setting the session storage property explicitly is needed, as by default it is "true" in "session-cache-config.xml":
...
<local-storage system-property="tangosol.coherence.session.localstorage">true</local-storage>
...
¨     Coherence dedicated nodes need to be storage-enabled (otherwise there is no node to store session attributes) and should use either "session-cache-config.xml" or a custom cache configuration file with the session cache configured in it:


java -Xms512m -Xmx512m -cp /usr/local/coherence_3_6/lib/coherence.jar:/usr/local/coherence_3_6/lib/coherence-web-spi.war:/usr/local/coherence_3_6/lib/commons-logging-api.jar:/usr/local/coherence_3_6/lib/log4j-1.2.8.jar
 -Dtangosol.coherence.cacheconfig=../../../webapps/example/WEB-INF/classes/session-cache-config.xml -Dtangosol.coherence.log.level=6 -Dtangosol.coherence.ttl=2 -Dtangosol.coherence.log=log4j -Dtangosol.coherence.edition=EE
 -Dtangosol.coherence.session.localstorage=true com.tangosol.net.DefaultCacheServer


Coherence Cache Topology

Coherence supports five different types of Cache based upon four cache topologies:
¨     Local Cache Topology: Local Cache
¨     Partitioned Cache Topology: Distributed (or Partitioned Cache)
¨     Replicated Cache Topology: Replicated Cache, Optimistic Cache
¨     Hybrid Topology (Local + Partitioned): Near Cache

You can use the following simple guidelines to choose the appropriate type of cache:

·         Local Cache: you need the fastest reads & writes and can accept no fault tolerance (caution: data is lost if the node fails).
·         Replicated Cache: you need the fastest reads with the best fault tolerance; writes are comparatively good but incur latency while updated data is copied to every node; typically used to store metadata or configuration data. Note: scale-out (horizontal scalability) cannot be linear.
·         Partitioned (Distributed) Cache: you need the fastest writes with the best fault tolerance; reads are comparatively fast but depend on whether the data is served from the local or a remote node.
·         Near Cache (backed by a Partitioned Cache): same needs as the Partitioned Cache, plus a local front cache; data affinity boosts read performance for read-heavy applications with moderate writes (a sketch of a near scheme follows this list).
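
As a sketch of that last guideline, a near cache is defined in the cache configuration file by layering a local front scheme over a distributed back scheme (the scheme names & high-units value below are illustrative):

<near-scheme>
  <scheme-name>example-near</scheme-name>
  <front-scheme>
    <local-scheme>
      <!-- size-limited local front cache; value is illustrative -->
      <high-units>1000</high-units>
    </local-scheme>
  </front-scheme>
  <back-scheme>
    <distributed-scheme>
      <scheme-ref>example-distributed</scheme-ref>
    </distributed-scheme>
  </back-scheme>
  <invalidation-strategy>present</invalidation-strategy>
</near-scheme>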


Executing Production Checklist

Coherence recommends executing a production checklist to make sure the environment & infrastructure have the recommended settings/configurations, particularly in the following areas:
¨     Network:
·         Multicast Test: if you are using multicast clustering, this test is a must to make sure the multicast configuration is correct & working properly.
·         Datagram Test: before deploying your application, you must run it to make sure there is no packet loss in your network. Note that on a 1GbE network you should use 100MB packets for testing & the minimum (not average) success rate should be close to 100% (~98-99%). Invocations of both tests are sketched after this checklist.
·         TTL: a very important setting for multicast networks; 2-5 is usually the recommended value in a production environment.
¨     Hardware, OS & JVM Settings
¨     Coherence Editions & Modes:
·         Needless to say, the Coherence mode should be PROD in a production environment. It needs to be specified on the command line, as the override configuration file cannot be used for edition & mode.
-Dtangosol.coherence.mode=PROD
·         By default, Coherence runs as GE (Grid Edition) & it is very important to specify the appropriate edition (as per your license & needs).
-Dtangosol.coherence.edition=EE
Note that all the nodes in a cluster should use the same edition.
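
Both network tests ship with Coherence; a sketch of invoking them via the bundled scripts (exact parameters vary by Coherence version, so treat these as indicative):

# Multicast test: run on each machine concurrently; the TTL value is illustrative
$COHERENCE_HOME/bin/multicast-test.sh -ttl 4

# Datagram test: run pairwise between machines to measure throughput & packet loss
$COHERENCE_HOME/bin/datagram-test.sh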


Executing Performance Tuning Guidelines

Coherence suggests tuning for: OS, Network, JVM & Coherence Network.

Please refer to Coherence Performance Tuning guidelines (reference section) for more details.

Enable JMX for Monitoring Coherence

Coherence provides OOTB support for JMX-based monitoring of its cluster, nodes, caches & more.
At least one node needs to act as the manager & the rest of the nodes in the cluster can publish their statistics via JMX.

For the management node:

-Dtangosol.coherence.management=all -Dtangosol.coherence.management.remote=true
-Dtangosol.coherence.management.jvm.all=false  -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port= -Dcom.sun.management.jmxremote

For other nodes, you can simply remove the "-Dtangosol.coherence.management" command-line parameter.
Also, note that in the above case JMX authentication is not enabled (it needs to be secured) & the JMX port needs to be specified as well.
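
Once the management node is up, any standard JMX client can read the Coherence MBeans. A minimal sketch (the host & port are illustrative and must match the management node's JMX settings):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ClusterSizeCheck {
    public static void main(String[] args) throws Exception {
        // Standard JMX-over-RMI URL; host & port are illustrative
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://mgmt-host:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Coherence registers a cluster-wide MBean under this name
            ObjectName cluster = new ObjectName("Coherence:type=Cluster");
            System.out.println("Cluster size: "
                    + conn.getAttribute(cluster, "ClusterSize"));
        } finally {
            connector.close();
        }
    }
}

A check like this is handy in deployment scripts, e.g. to confirm that all nodes are down before a full restart or back up after a rolling one.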

Using Log4J for Coherence Logs

Though Coherence has its own logging mechanism, Log4J is more beneficial in terms of log rotation & control over log levels.
Note that you can use both the Coherence log level parameter (-Dtangosol.coherence.log.level) & the Log4J configuration to control the logging level.

Follow these steps to enable Log4J for Coherence:
¨     Coherence does not ship with the Log4J libraries & hence you need to add the following JARs to the classpath:
a.       Copy "commons-logging-api.jar" & "log4j-1.2.8.jar" to the /lib folder
¨     Create/Modify your Log4J XML file & put that in classpath of your Coherence JVM.
¨     Set the command-line parameter (or use the override file) to set the log parameter value to "log4j".


Note that Coherence assumes the Log4J XML has a logger named "Coherence"; otherwise you need to specify the logger name via the separate parameter "tangosol.coherence.log.logger".

Example:

Cache Server Startup Script


JAVA_OPTS="-Xms$MEMORY -Xmx$MEMORY -Dtangosol.coherence.log.level=6 -Dtangosol.coherence.log=log4j -Dtangosol.coherence.log.logger=MyCoherence"

$JAVAEXEC -server -showversion $JAVA_OPTS -cp "$COHERENCE_HOME/lib/coherence.jar:$COHERENCE_HOME/lib/commons-logging-api.jar:$COHERENCE_HOME/lib/log4j-1.2.8.jar" com.tangosol.net.DefaultCacheServer $1


Log4J XML

...
<logger name="MyCoherence">
  <level value="3"/>
  <appender-ref ref="CoherenceAppender"/>
</logger>
...

Review Coherence*Web Context Parameter

There are several Coherence*Web context parameters which may need to be adjusted when you install Coherence*Web in your web application, particularly the following:
¨     coherence-enable-sessioncontext
¨     coherence-session-id-length
¨     coherence-session-urlencode-enabled
¨     coherence-session-thread-locking
¨     coherence-sticky-sessions
¨     coherence-reaperdaemon-assume-locality
¨     coherence-enable-suspect-attributes

Note: these parameters are configured in web.xml & get instrumented when the Coherence*Web install utility is invoked.
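
For reference, each of these is an ordinary servlet context parameter. A sketch of how one looks in web.xml (the value shown is illustrative, not a recommendation):

<context-param>
  <!-- length (in characters) of generated session IDs; value is illustrative -->
  <param-name>coherence-session-id-length</param-name>
  <param-value>32</param-value>
</context-param>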


Using Coherence as L2 Cache Provider

Coherence can also be used as the L2 cache provider for the ORM framework in use. Having Coherence as the L2 cache enables enterprise-level caching for your ORM L2 caches as well.

To configure this, you need to specify Coherence as the L2 cache provider (shown here for the Hibernate L2 cache):

¨     Specify Coherence as L2 Cache provider in Hibernate Configuration file:


<prop key="hibernate.cache.provider_class">
com.tangosol.coherence.hibernate.CoherenceCacheProvider
</prop>


¨     The configuration for the Hibernate L2 cache is loaded based on the following parameter; a default L2 cache configuration file is already in place.



-Dtangosol.coherence.hibernate.cacheconfig=/hibernate-cache-config.xml
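
With the provider in place, individual entities still need to be marked cacheable. A minimal sketch using Hibernate annotations (the entity & the READ_WRITE strategy are illustrative; also make sure hibernate.cache.use_second_level_cache is enabled, and check which concurrency strategies your provider version supports):

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// With CoherenceCacheProvider configured, this entity's L2 cache region
// lives in the Coherence data grid instead of local JVM memory.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Product {
    @Id
    private Long id;
    private String name;

    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}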


References







Thursday, June 2, 2011

Is cutting-edge technology the right solution for all business problems?


Technology is fast-paced & keeping your application stack up to date with the latest technology available is one of the common problems architects face today. At one end, vendors keep pushing their clients to upgrade to the latest version or patch to make their own lives easier. At the other end, the existing technology stack becomes outdated & lucrative new options leave technology people baffled about how to go about it!
There is no black-and-white answer to this problem, but these key factors help in making the right decision:
  1. Context:
Unless you understand the context, it is not possible to make any sensible suggestion at all about an application or enterprise architecture. For example, a travel website is in constant competition with other websites and keeps experimenting with new features, which in turn needs cutting-edge technology or the latest in the market (like mashups). But that might not be a need for a stock-trading application.
So, it is never a "one-solution-fits-all" approach that works in technology!

  2. Problem Area:
Regular architecture reviews of your application might reveal certain problem areas (e.g. the caching system needs to move to enterprise caching), or there is a specific business problem to solve which needs new technology (e.g. using Amazon S3 for storage).
So, the 2nd point is to pinpoint a strong problem area which demands a change in technology (upgrade or replacement).

  3. Business Needs:
Third & most important is the business need, as business stakeholders will be paying the extra cost & hence there have to be very strong business benefits to be realized (e.g. integration with third parties using an ESB instead of a hub-spoke model). Unless the business is convinced of a strategic benefit, it is not possible to sell any technology to them.
So, the 3rd point is to discover the business needs which back up the technology.

  4. Technology Needs:
Last but not least, there are technology needs as well (e.g. your Oracle 10g database will be out of support soon & you need to upgrade to Oracle 11g to retain customer support), which can also be a key driver. Also, certain business problems (like lowering TCO) can only be solved by implementing a strategic, major technology solution (like a top-down/bottom-up SOA stack).
So, the 4th point is to discover the technical needs in your software or application stack.

In summary, needs create technology, not the other way around, & hence all the above points matter whenever you make a decision on applying new cutting-edge technology. Also, keep in mind that proven technology works best & experimentation might be risky for business continuity. But make sure that innovation (which especially needs new technology) does not get killed, as it keeps you one step ahead in this competitive world!

Friday, November 26, 2010

In-Memory Data Grid (IMDG) for Linear Horizontal Scalability & Extreme Transaction Processing (XTP)

IMDG products offer the capability to handle transactions in-memory (hence faster) & facilitate creating a data grid (hence linear scalability) for managing extreme transaction processing (XTP) needs.
There are both commercial & open-source products available in the market, but before deciding upon a product, or even on IMDG as a technology, I would recommend considering the following design/architectural points first:
1. Your Needs: First & foremost, like any other solution, this is the most important factor in determining whether there is a need for such a product at all. The license cost for a commercial product can be substantial (~$2,500 per processor for an enterprise edition) and hence a cost-benefit assessment is needed first. I don't see the need for it if there is no requirement for XTP (e.g. not needed for < 200 TPS).
2. Parallel Computing:
A distributed grid can offer processing ability similar to a mainframe by utilizing the cumulative capacity of the nodes in the grid. Processing can be seamlessly distributed across available nodes, facilitating "parallel query" execution for faster response.
3. Caching Needs:
All caching needs can be fulfilled using IMDG products; they support all types of caches, e.g. distributed cache, replicated cache, partitioned cache, and local cache with a distributed cache as backup. But if you have only caching needs, then you are better off with dedicated caching products (see the end of the article).
4. Event-Based Processing Needs:
IMDG products support Complex Event Processing (CEP) based business architectures & the ability to consume many events in a scalable way.
5. High Availability (Failover Support):
Failure of any node does not impact the cluster, and as soon as a failed node comes back it starts contributing again seamlessly (without any configuration change or manual effort). Also, real-time configuration changes (e.g. changing cache high-units) or product upgrades are possible without any downtime.
6. Scalability:
If there is a need to add more nodes to your grid, it is seamless and has no impact on the existing grid. Most IMDG products are "linearly scalable", taking full advantage of added capacity (see the sketch after this list).
7. In-Memory Database (IMDB) Support:
IMDG products also allow an entire database to be maintained in memory for faster response, throughput & performance. All transactions can happen in memory and be persisted asynchronously to the database during off-peak hours.
8. Monitoring & Management:
Some products offer great real-time monitoring & management capabilities (also with JMX support), which is very handy for troubleshooting or for finding bottlenecks to improve.
9. In Line with Cloud Computing:
With cloud computing as the future, this is all the more important, as an IMDG can offer "data as a service" or "data virtualization".
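
As a small illustration of the grid programming model (using the open-source Hazelcast API purely as an example; the map name is illustrative), every JVM that runs this code discovers the others, joins the same grid & shares the map transparently:

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class GridNode {
    public static void main(String[] args) {
        // Starting an instance makes this JVM a member of the data grid;
        // more JVMs running the same code join the cluster automatically.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance(new Config());
        Map<String, String> orders = hz.getMap("orders"); // distributed map
        orders.put("order-1", "NEW");
        System.out.println("Members in grid: "
                + hz.getCluster().getMembers().size());
    }
}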

Commercial Products:
Oracle Coherence (earlier known as Tangosol Coherence), Gigaspaces XAP, IBM WebSphere eXtreme Scale (WXS), Tibco ActiveSpaces (recently launched), ScaleOut StateServer

Open-source Products: 
Terracotta, JBoss Infinispan, Hazelcast

Other distributed caching solutions are also available, but in my opinion they do not offer the full set of IMDG capabilities; if you have only caching needs, they are worth considering (though out of the scope of this discussion):
NCache (only for Distributed Caching for .Net)
Apache JCS, Terracotta EhCache, OpenSymphony OSCache
