Scalability of Method Calls as Database Transactions

Source: Distributed Objects mailing list
Date: 02-Jan-98

------------------------------

o-< Problem: Should every method call on a distributed object be a database transaction? Is reading and storing the state of the object for every method call necessary for efficiency and scalability?


---------------

o-< Roger Sessions used the Elvis scenario to explain this view (a recap from the previous tip):

Let's say I have an Elvis object, and I want to ask him to shake, rattle, and roll, in that order. [...] each method is dependent on the state changes of the others.

[...]

Now assume we are using MTS [Microsoft Transaction Server] and its automatic transaction capability. In this case, MTS will automatically begin a transaction at the start of each method invocation, and commit it at the end of the method invocation. Elvis then sees the scenario like this:

Somebody instantiated me

Somebody started a transaction
Somebody asked me to shake
I better read in my state from the database
shake
I better write out my state to the database
Somebody committed the transaction

Now I hang around doing nothing

Somebody started a transaction
Somebody asked me to rattle
I better read in my state from the database
rattle
I better write out my state to the database
Somebody committed the transaction

Now I hang around doing nothing

...

The question is, while Elvis is hanging around doing nothing, can he do something for some other client?

If he starts methods by reading his state and ends them by writing his state, then he is free to work for somebody else during the hang around times, which are likely to be very considerable.
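Sessions' read-state/work/write-state cycle can be sketched in Java. This is only an illustration, not MTS itself: the `DB` map below stands in for the database, and each call to `perform` plays the role of one automatic per-method transaction.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not MTS): per-method activation, where each call
// reads the object's state from a store, does its work, and writes the
// state back, leaving the in-memory instance free between calls.
public class StatelessElvis {

    // Stand-in for the database: object id -> persisted state.
    static final Map<String, String> DB = new HashMap<>();

    // One "transaction" per method: read my state in, work, write it out.
    static void perform(String id, String move) {
        String state = DB.getOrDefault(id, "");        // read in my state
        state = state.isEmpty() ? move : state + "," + move;
        DB.put(id, state);                             // write out my state
    }

    public static void main(String[] args) {
        perform("elvis", "shake");
        perform("elvis", "rattle");  // sees the state change left by shake
        perform("elvis", "roll");
        System.out.println(DB.get("elvis"));  // shake,rattle,roll
    }
}
```

Between calls, nothing about elvis lives in memory, so the same instance could just as well run `perform` for some other client during the hang-around times.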


---------------

o-< Neville Burnell clarified:

I think the point is that when an object retains its state, it retains its unique identity. This means, for example, that if 100 clients have interactions with, say, 100 different stateful stock items, a server would keep all 100 "resident". This approach does not scale as well as the same 100 clients interacting with the same 100 states on demand via, say, 10 stateless objects (perhaps "object shells" is a better term).

So, in the stateless object scenario, depending on physical resources, queue lengths, etc., another 1000 clients accessing another 1000 different states could be serviced by the same 10 stateless "shells".

To reuse the "elvis" scenario:

create stateless performer object #1 to n

someone requests elvis
load empty performer #1 with elvis
someone starts a transaction
someone asks elvis to shake
someone commits a transaction
unload elvis state so performer#1 is empty

performer #1 waits around doing nothing, during which time he can service a request for elvis, or, say, buddy
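One way to picture Neville's shells in Java; the class and store names here are invented for illustration. A fixed pool of empty Performer objects takes on whatever identity a request names, then unloads that state so the shell can serve the next request.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch of the "object shell" idea: a small pool of empty performers
// services requests for any number of distinct identities.
public class ShellPool {

    static final Map<String, String> STORE = new HashMap<>(); // id -> state

    static class Performer {
        String id;      // identity currently loaded, or null when empty
        String state;

        void load(String who) { id = who; state = STORE.getOrDefault(who, ""); }
        void unload()         { STORE.put(id, state); id = null; state = null; }
        void shake()          { state += "[shake]"; }
    }

    final Deque<Performer> pool = new ArrayDeque<>();

    ShellPool(int n) {                     // create stateless performers #1..n
        for (int i = 0; i < n; i++) pool.add(new Performer());
    }

    // Service one request with whichever shell is free.
    void request(String who) {
        Performer shell = pool.remove();   // an idle shell ...
        shell.load(who);                   // ... is loaded with an identity,
        shell.shake();                     // does the work,
        shell.unload();                    // and is empty again
        pool.add(shell);
    }

    public static void main(String[] args) {
        ShellPool shells = new ShellPool(2);  // 2 shells ...
        for (String who : new String[]{"elvis", "buddy", "elvis"})
            shells.request(who);              // ... serving 3 requests
        System.out.println(STORE.get("elvis") + " " + STORE.get("buddy"));
    }
}
```

The two shells end up having served elvis twice and buddy once; nothing ties a shell to either identity between requests.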


---------------

o-< Ron Resnick replied:

[...]

b. The server 'shell' [...] takes on the identities of different objects, as they are needed by clients at runtime. Neat questions that arise: are all the actual instances it can implement restricted to the same type? Or maybe, once it's donning different identities, can these come from heterogeneous objects which are instances of different classes? I guess that if the code of the object remains bundled in the middle-tier object, it's limited to filling itself up only with objects of exactly the same class. But if this is, say, Java-based, and we have some neat class-loader stuff going on... maybe the reusable-object, aka shell, aka stateless-object, can take on any old identity... I think we're starting to chase our tails here :-)

c. Architecturally, this seems to share a certain similarity to 'pool of threads' designs. I.e. - you have N clients which each demands a server resource. But you don't want to provide N 'runtime entities' on the server, since N is very large. So you have M runtimes, where M<<N, and you share the M amongst the N. In pool-of-threads, the M runtime entities are reusable server threads which can be dispatched on inbound client calls to do generic client work, and then return to the pool. In the stateless-shells Neville describes, the M runtime entities are these fillable object shells which can be stuffed from a much larger number of possible server objects.
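The pool-of-threads shape Ron describes is easy to show with Java's standard executor: M reusable worker threads are shared among N queued client requests, with M much smaller than N. The numbers and the trivial "client work" below are arbitrary stand-ins.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// M reusable server threads dispatched over N inbound client calls, M << N.
public class ThreadPoolSketch {

    static int serve(int m, int n) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(m); // M runtimes
        AtomicInteger served = new AtomicInteger();
        for (int i = 0; i < n; i++)
            pool.execute(served::incrementAndGet);  // generic client work
        pool.shutdown();                            // accept no new requests
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return served.get();                        // requests actually done
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(serve(4, 100));  // 4 threads service all 100 calls
    }
}
```

Requests beyond the M busy threads simply queue inside the executor, which is the "pooling/queuing" alternative to dumping overflow.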

d. (derives directly from c) - based upon the analogy to pool-of-threads, this still leads me to further questions about the posited scalability of the shell approach. In pool-of-threads, I can understand the scalability claim. Threads are always (as far as I know) a bounded resource in any OS architecture [...]. Hence, you can't take a given box and run one thread per every client that tries to access it over the Internet - you need to do some mapping from your limited supply of threads to the boundless number of client requests you want to service. Either you dump client requests when they overflow, or you do some kind of pooling/queuing.
But in the 'shell game', this isn't clear to me. It would seem to hinge on whether the shell-objects are active objects (i.e., have their own dedicated thread) or not. If they *are* active objects, then indeed the shell model is architecturally identical to pool-of-threads, and I buy the scaling argument completely. If they're passive objects, though, then the only overhead of maintaining them as 'live' (stateful) instances is the memory they occupy. Have a hundred objects, each of size 1K? That's 100K. Have 100,000 objects? That's 100M. Yawn. Big deal. RAM is cheap. Virtual memory is standard equipment these days even on crappy OSes like Win95. Memory scaling isn't something worth fudging up the application architecture over. [...]
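Ron's back-of-the-envelope arithmetic, spelled out (the object counts and sizes are his examples, not measurements):

```java
// Passive objects hold no thread; their only cost is the memory they occupy.
public class PassiveCost {

    static long costBytes(long objects, long bytesEach) {
        return objects * bytesEach;
    }

    public static void main(String[] args) {
        System.out.println(costBytes(100, 1024));      // 100 x 1K  -> ~100 KB
        System.out.println(costBytes(100_000, 1024));  // 100K x 1K -> ~100 MB
    }
}
```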


---------------

o-< RStew31177 mentioned some performance problems that arise when each method call is a database transaction:

One problem [...] is that of simple-minded cache management provided by object to relational mapping tools and OODBs. Many of these systems will completely flush (remove) objects from their cache upon transaction commit/rollback. When dealing with method bound activation policies, this becomes a major performance problem.

[...]

My experience has been that a large portion of the state of the application involved in any given transaction is read-only. That is, during a typical transaction, several objects are activated and used for navigating to the 'target' object that will be changed. In many cases, 80% of the state involved in the transaction has been read-only. If, after loading all this state and making the change, the cache is thrown away upon commit, the system spends most of its time reloading the same state into the object server cache for each method invocation. This is simply stupid! 8)
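The policy RStew31177 is arguing for can be sketched as a cache that evicts only the dirtied objects on commit and keeps the read-only majority resident. The class and method names below are illustrative, not any particular mapping tool's API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of selective eviction: on commit, drop only the objects the
// transaction changed, instead of flushing the whole cache.
public class ObjectCache {

    private final Map<String, Object> cache = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();

    Object read(String id) {                   // activation for navigation
        return cache.computeIfAbsent(id, k -> loadFromDb(k));
    }

    void write(String id, Object value) {      // the 'target' object changes
        cache.put(id, value);
        dirty.add(id);
    }

    // A naive cache.clear() here would force every object to be reloaded
    // on the next transaction. Evict only what the transaction dirtied.
    void commit() {
        for (String id : dirty) cache.remove(id);
        dirty.clear();
    }

    int resident() { return cache.size(); }

    private Object loadFromDb(String id) { return "state-of-" + id; }
}
```

After a transaction that reads four objects to navigate to one target and writes it, `commit()` leaves the four read-only objects resident; only the target is reloaded next time.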


------------------------------

o-< More Info:

Steve Vinoski and Douglas C. Schmidt, The Thread-Pool Concurrency Model

Greg Lavender and Douglas C. Schmidt, Active Object - An Object Behavioral Pattern for Concurrent Programming

Robert C. Martin, Active Objects - A summary of a comp.object thread


------------------------------