db4o: Client-Server and Concurrency

So far we’ve always used a single object container. This is the simplest way to use db4o: just open an embedded database and use it. In this post I’ll give a short introduction to the client-server features of db4o. Handling concurrency is more challenging in a client-server scenario, so I’ll also say a few words about that.

(All posts of this series: the basics, activation, object-identity, transactions, persistent classes, single container concurrency, Queries in Java, C# 2.0, client-server concurrency, transparent persistence, adhoc query tools)

Embedded Client-Server

Let’s assume we build a small web application. Naturally, multiple requests are handled concurrently, and of course you want each request to run independently of the others. Having a transaction for each request would be nice, because then you can roll back the transaction when a request fails. So far we’ve always used a single object container, and we know that a transaction is bound to an object container. To support our scenario we therefore need more than one object container. No problem, it’s quite easy:

const int RunEmbeddedServer = 0;
var server = Db4oClientServer.OpenServer(FilePath, RunEmbeddedServer);
var db1 = server.OpenClient();
DoStuffWith(db1);
var db2 = server.OpenClient();
DoStuffWith(db2);

Creating a new server is really easy. You pass the database file, an optional configuration and a port number to the factory and you’re done. When you pass 0 as the port number, an embedded server is created. You can open as many clients as you want on this embedded server with the .OpenClient() method.

For our small web application this is a wonderful solution. We still have an embedded database without additional deployment, administration and so forth. The server just runs within the web application.
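A per-request unit of work could look roughly like the sketch below. It is not taken from the sample files: `RequestHandling` and `HandleRequest` are hypothetical names, and the sketch assumes that the server is opened once and kept alive for the lifetime of the web application.

```csharp
using System;
using Db4objects.Db4o;

public static class RequestHandling
{
    // Hypothetical helper: runs one request in its own client container,
    // which means in its own transaction and with its own reference cache.
    public static void HandleRequest(IObjectServer server, Action<IObjectContainer> work)
    {
        using (var db = server.OpenClient())
        {
            try
            {
                work(db);
                db.Commit();   // make the changes visible to other clients
            }
            catch
            {
                db.Rollback(); // discard everything this request has stored
                throw;
            }
        }
    }
}
```

A request handler would then call something like `RequestHandling.HandleRequest(server, db => DoStuffWith(db));`.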

db4o-cs

Client-Server Over a Network

Above we created an embedded client-server setup. But what if the clients are different processes on different devices? The step is really small: you take the calls above, additionally specify a port, and grant access to users. Now your server is ready to accept clients over the network:

The server:

const int RunAsRealServer = 8080;
var server = Db4oClientServer.OpenServer(FilePath, RunAsRealServer);
server.GrantAccess("db4o","passwordOfUser");

The client:

// The port has to match the port the server was opened with (8080 above)
using(var db = Db4oClientServer.OpenClient("localhost", 8080, "db4o", "passwordOfUser"))
{
    DoStuffWith(db);
}

Opening a server and connecting clients over the network isn’t much more work. The .OpenClient() method on the server still works, so the server process can open internal clients, for example to do maintenance work. Of course, opening a connection over the network has far more overhead than opening an embedded client.

db4o-cs-network

Server With / Without Classes

Running the server and the clients in different processes brings a whole new level of complexity. Remember that the classes actually are the schema of your database? I also said that some queries cannot be optimized, in which case the actual objects are instantiated to run the query. That’s why there’s a big difference between a server which has all persisted classes available and a server which only runs a pure db4o instance. A server without the persisted classes cannot execute all kinds of queries. I think this shows that db4o focuses on embedded scenarios.

db4o without or with persisted classes

Isolation Between Clients

First let’s check something very simple. In the post about object-identity I said that an object-container keeps references to the loaded objects. These references are used to keep track of identity and for caching purposes. So how does this work in client-server mode when multiple object-containers are around? To check this we load the same object from different object-containers. Then we check if the objects are the same by identity:

var client1 = server.OpenClient();
var client2 = server.OpenClient();
var objFromClient1 = (from SimpleObject o in client1
                         where o.Name == "first-Obj"
                         select o).Single();
var objFromClient2 = (from SimpleObject o in client2
                         where o.Name == "first-Obj"
                         select o).Single();
AssertTrue(ReferenceEquals(objFromClient1, objFromClient2),"Have a shared reference-cache");

The assert will fail, because each client object-container has its own reference tracking. Basically a client object-container acts the same as a simple object-container. When you have multiple object-containers in your application, it’s extremely important that an object is always loaded and stored with the same object-container instance. Otherwise you run into the magic-clone issue I’ve described in the object-identity post.
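To make the pitfall concrete, here is a rough sketch of the mistake. `MagicCloneSketch` and `StoreWithWrongContainer` are hypothetical names that are not part of the sample files, and SimpleObject is assumed to be the class from the attached SimpleObject.cs.

```csharp
using System.Linq;
using Db4objects.Db4o;
using Db4objects.Db4o.Linq;

public static class MagicCloneSketch
{
    // Hypothetical demo method: loads with one client, stores with another.
    public static void StoreWithWrongContainer(IObjectServer server)
    {
        using (var client1 = server.OpenClient())
        using (var client2 = server.OpenClient())
        {
            var loaded = (from SimpleObject o in client1
                          where o.Name == "first-Obj"
                          select o).Single();
            loaded.Name = "changed";

            // Wrong: client2 has never seen this instance, so it should treat
            // it as a brand-new object instead of updating the stored one.
            client2.Store(loaded);
            client2.Commit();

            // Right: store through the container which loaded the object.
            // client1.Store(loaded);
            // client1.Commit();
        }
    }
}
```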

Now to the next big question. When is a change from one object-container visible to the other object-container? As already mentioned in the transaction-post, db4o has read-committed isolation properties. Here’s a little example to illustrate this. We open two client object-containers. The first client stores some object in the database. When the second client queries the database it doesn’t see the stored objects, until the first client commits its changes:

var client1 = server.OpenClient();
var client2 = server.OpenClient();

client1.Store(new SimpleObject("first-Obj",2,2));
client1.Store(new SimpleObject("first-Obj",3,3));

var countBeforeCommit = (from SimpleObject o in client2
                         select o).Count();
AssertTrue(0 == countBeforeCommit,"Doesn't see stuff from first client");

client1.Commit();

var countAfterCommit = (from SimpleObject o in client2
                        select o).Count();
AssertTrue(2 == countAfterCommit,"After commit it does see the stuff from the first client");

db4o-isolation

Beware Of Cached Instances

Ok, the object-containers are well isolated and changes are visible after commits. Ready for more? Let’s take a look at the example code below. Again we open two client object-containers. Each gets the object by its name. Then the first client changes the name, stores the object and commits the changes. After that we get the object by its new name on the second client object-container. Then we check the name. Will it run fine?

var client1 = server.OpenClient();
var client2 = server.OpenClient();
var objFromClient1 = LoadObjectByName(client1, "old-name");
var objFromClient2 = LoadObjectByName(client2, "old-name");
AssertTrue(objFromClient1.Name==objFromClient2.Name,"Names are equal");

objFromClient1.Name = "new-name";
client1.Store(objFromClient1);
client1.Commit();

objFromClient2 = LoadObjectByName(client2, "new-name");
AssertTrue("new-name" == objFromClient2.Name, "Names are equal");

Well, the result is probably unexpected: the second assertion fails. What is going on? We ran the query with ‘new-name’ and retrieved the object, but the object actually has the ‘old-name’? That’s totally inconsistent! Here the reference cache plays a dirty trick. As said, db4o holds references to already loaded objects and returns the already instantiated object to avoid unnecessary deserialization. Because the second client object-container already has an instance of the object, it simply returns the existing instance. Unfortunately the object in memory doesn’t have the same state as the object in the database. Stale cached state like this is a very common issue across database systems. You can explicitly refresh an object, which resets the object in memory to the state in the database:

const int activationDepth = 4;
client2.Ext().Refresh(objFromClient2, activationDepth);

Or you open a new client object-container for each task. In the embedded client-server mode this is quite a good solution, since it’s cheap to open a new client. Over a network it’s a rather costly operation.

However, in the end you always have a race condition, because another client could change something between the refresh and the store operation.

cached objects

Locking

As seen above, db4o doesn’t have a very ‘strict’ concurrency model, so you might need more control. For this purpose db4o provides low-level locks in the form of named semaphores. The API is simple:

const int timeOutInMilliSec = 1000;
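// SetSemaphore returns false when the lock could not be acquired within the timeout
if (!objContainer.Ext().SetSemaphore("name_of_semaphore", timeOutInMilliSec))
{
    throw new InvalidOperationException("Could not acquire the semaphore");
}
try
{
    DoWork();
}
finally
{
    objContainer.Ext().ReleaseSemaphore("name_of_semaphore");
}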

Often the name of the semaphore is derived from the object-id to ensure that the name is unique. A small example. Again we open two client object-containers. The first one grabs a lock. Then we check on another thread if we can get the lock there.

const int timeOut = 100;
var client1 = server.OpenClient();
var client2 = server.OpenClient();
if (!client1.Ext().SetSemaphore("lock-demo", timeOut))
{
    throw new InvalidOperationException("Expect that I can get lock");
}
var couldGetLockOnClient1 = false;
var couldGetLockOnClient2 = false;
var thread = new Thread(()=>
   {
       couldGetLockOnClient1 = client1.Ext().SetSemaphore("lock-demo", timeOut);
       couldGetLockOnClient2 = client2.Ext().SetSemaphore("lock-demo", timeOut);
   });
thread.Start();
thread.Join();
AssertTrue(couldGetLockOnClient1, "client1 can get lock. Also in other threads");
AssertTrue(!couldGetLockOnClient2, "client2 cannot get lock, since it's held by client1");

Here’s one important thing to notice. The thread we run can acquire the lock through the first client, but not through the second. So db4o locks are held per object-container. They have nothing to do with thread synchronization like the lock statement.

Normally you build some kind of wrapper around this low-level mechanism, for example to get a convenient way to lock objects like this:

dataBase.WithLock(yourAccount, myAccount)
    .Execute(() =>
                 {
                     var transferAmount = 100;
                     yourAccount.Money -= transferAmount;
                     myAccount.Money += transferAmount;
                 });

This higher-level locking method is also responsible for ordering the locks, to avoid deadlocks. I’ve added a simple implementation to this blog entry. No guarantees that it is correct, though.
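To illustrate just the ordering idea, here is a minimal sketch. It is not the attached LockExtensions.cs: `LockSketch` and `WithLocks` are hypothetical names and the signature differs from the fluent example above. It assumes all objects are already stored and known to the given container, because the semaphore names are derived from the db4o object ids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Db4objects.Db4o;

public static class LockSketch
{
    // Hypothetical helper: locks the given objects, runs the work, releases the locks.
    public static void WithLocks(this IObjectContainer db, int timeoutMs,
                                 Action work, params object[] objects)
    {
        // One semaphore name per object, derived from the object id. Sorting
        // the ids makes every caller acquire the locks in the same order.
        var names = objects
            .Select(o => db.Ext().GetID(o))
            .Distinct()
            .OrderBy(id => id)
            .Select(id => "lock-" + id)
            .ToList();

        var acquired = new List<string>();
        try
        {
            foreach (var name in names)
            {
                if (!db.Ext().SetSemaphore(name, timeoutMs))
                {
                    throw new TimeoutException("Could not acquire " + name);
                }
                acquired.Add(name);
            }
            work();
        }
        finally
        {
            // Release only the semaphores this call actually acquired
            foreach (var name in acquired)
            {
                db.Ext().ReleaseSemaphore(name);
            }
        }
    }
}
```

With this sketch the money transfer would be written as `db.WithLocks(1000, () => { /* transfer */ }, yourAccount, myAccount);`. Because both accounts are locked in id order, two transfers in opposite directions cannot wait on each other forever.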

db4o-locking

More Concurrency Stuff

Well, I think I’ve covered the very basic concurrency controls in db4o. There are some more building blocks. For example, you can send messages around, so you could notify clients about changes via the database server. Furthermore, db4o can keep version numbers of objects around, which may be useful for some kind of optimistic locking.

concurrency mad

Next Time

Well, I think I’ve now covered the very basics of db4o. For more information read the documentation, or ask in the forums. I will certainly write about db4o again sometime in the future =). But next time I’ll take a brief look at another database technology.

Files: DB4O-ClientServer.cs, SimpleObject.cs

Lock-Utility: LockExtensions.cs, TestLockExtensions.cs (the unit tests require NUnit and Moq)

12 thoughts on “db4o: Client-Server and Concurrency”

  1. Edward

    hmm..

    AssertTrue(!couldGetLockOnClient2, “client2 cannot get lock, since it’s hold be client2”);

    should that not be:

    AssertTrue(!couldGetLockOnClient2, “client2 cannot get lock, since it’s hold be client1”);

  2. Guacharaca

    hi, very nice post. An alternative, When we are dizzy with to db4o.com web pages.

    Thanks a lot

  3. damian

    A question, on a web application with the Embedded Client-Server configuration, i should open a new client for every request?

  4. gamlerhart Post author

    In general yes. The IObjectServer.OpenClient() is a very cheap, lightweight operation and has its own transaction and reference-system. When opening a client per request you isolate the different requests from each other.

  5. Jon E.

    Thank you for the great article. According to the “Isolation Between Clients” section above, the following code should work but it’s not…any help is much appreciated:
    IServerConfiguration serverConfig = Db4oClientServer.NewServerConfiguration();
    var db4oServer = Db4oClientServer.OpenServer(serverConfig, dbfile, 0);

    Person d = new Person();
    d.Name = “Hi”;
    d.Age = 10;

    var db1 = db4oServer.OpenClient();
    db1.Store(d);
    db1.Commit(); //commit called so all clients should be synched.

    d.Name = “New Name!”;

    var db2 = db4oServer.OpenClient();

    Assert.IsTrue(db2.Ext().IsStored(d)); //Fails here!

  6. gamlerhart Post author

    The article is a little unclear there.

    In your example code the assert indeed fails, and that’s the expected behavior.

    Embedded containers each have their own reference cache and transaction. When you load an object in one container, the other container doesn’t recognize it, because db4o uses the object identity to identify objects. This means each object container has a table which contains all objects which were loaded by that container. This also applies to the .IsStored(d) check. Since the person ‘d’ wasn’t loaded by the ‘db2’ instance, it isn’t recognized by that object container. An object container only recognizes objects which it has loaded itself.

    See also:
    Identity-Concept: http://developer.db4o.com/Documentation/Reference/db4o-8.0/net35/reference/Content/basics/identity_concept.htm

    And reference-cache:
    http://developer.db4o.com/Documentation/Reference/db4o-8.0/net35/reference/Content/basics/indentity_concept/reference_cache.htm

    Hmm, I should update the documentation links in the article.