Traditional Cache, Modern Cache
August 31, 2006
Cameron Purdy from Tangosol.com commented:
Your examples are definitely “traditional” caching. We do see that a lot (e.g. hotel chains, as you said). There is a more contemporary approach to caching, which is to actually keep the data up-to-date in memory (real-time “single system image”), as opposed to having LRU or some other way to get rid of stale data. For a good example, see: http://wiki.tangosol.com/display/COH32UG/Cluster+your+objects+and+your+data Peace.
It is a very relevant comment. Cameron suggests that instead of using a database as a persistent, synchronized cache, we could use an in-memory replicated cache like Coherence, his company's product.
This concept is not new – travelling objects and external caches have been around for some time. Though I haven't tried Coherence myself, it is quite possible that it has solved many of the stumbling blocks of the past.
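The “traditional” caching Cameron contrasts with the replicated approach – keep recently used data, evict the rest via LRU – can be sketched in a few lines of Java. This is an illustrative sketch; the class name and capacity are mine, not from any product:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A minimal "traditional" LRU cache: stale entries are simply evicted
// when capacity is exceeded, rather than being kept in sync cluster-wide.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true gives LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least-recently-used entry
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the eldest
        cache.put("c", "3"); // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

The replicated approach inverts this: rather than evicting and re-reading, every node's copy is kept current, so there is no notion of a “stale” entry to get rid of.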
Traditionally, we have considered – and sometimes used – in-memory databases for some of our caching requirements. (Remember TimesTen? It is now owned by Oracle.) These databases were either too expensive or too flaky (non-standard queries, no guarantees about the data, etc.). Then mainstream databases started providing ways to pin data in memory – giving significantly faster performance – and that became our preferred choice.
App servers also started providing session sharing – though its performance cost forced us to stop putting data in the session and put it in databases instead.
There were also programming frameworks like Jini (and more recently JavaSpaces, a part of Jini) which enabled networked objects, available to all nodes in a Jini cluster. Jini never got much adoption.
Neither have cluster replication products.
I believe the reason is that most of us design for a single server, and IF usage becomes high enough to warrant big clusters and performant caches, we then re-engineer parts of the application or hardware to enable clustering. (For example, when clustering is introduced, organizations prefer to add a NAS to the datacenter rather than change the software to use databases instead of file systems.)
With products like Coherence, there could be a case where, by paying a 5K licensing fee, we save 1-2 CPUs on the database and thereby justify the cost. But I will be wary of this technology for the following reasons.
Today, the architecture of choice on the app layer seems to be a large number of small blade servers clustered behind a hardware load balancer. We don't know how such technologies will perform doing multi-way replication across that many nodes. For vertically scaled applications (running on, let's say, 2-3 machines with 4-8 processors each), such products have a very high chance of working. The second issue I have is whether the updates are ACID or BASE (Basically Available, Soft state, Eventually consistent – a concept that never really picked up). Do they leave some margin for stale data? If not – if used in “transactional” mode – do they really give better performance?
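The “margin for stale data” question above amounts to this trade-off: a read tolerates values up to some age rather than going to the authoritative store every time. A minimal sketch of that idea, assuming a simple timestamp-per-entry design (the class and parameter names are illustrative, not from Coherence or any other product):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a cache that accepts bounded staleness: each entry carries
// the time it was stored, and reads tolerate values up to maxAgeMillis
// old instead of hitting the database on every access.
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long storedAt;
        Entry(V value, long storedAt) { this.value = value; this.storedAt = storedAt; }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long maxAgeMillis;

    public TtlCache(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis; }

    public void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis()));
    }

    // Returns null when the entry is missing or too stale; the caller
    // then reloads from the authoritative store (the database).
    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() - e.storedAt > maxAgeMillis) {
            map.remove(key);
            return null;
        }
        return e.value;
    }
}
```

A fully transactional (ACID) cache removes this margin entirely, but then every update pays synchronous coordination cost across the cluster – which is exactly where the performance question arises.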
The onus is on the cache vendors to convince us.
I am sure they will – and if the products are actually as good as claimed, they should become popular. They seem to have a good client list already.
However, I would be worried about the longevity of such products. When this demand becomes mainstream, commercial app server vendors will have to provide the same functionality to survive – whether they OEM Cameron's product or build something of their own.
Let the brutal law of survival of those “most responsive to change” decide.