September 25, 2019

Intro to MVStore, an embedded key value store

MVStore is the storage backend of the popular embedded H2 relational database. If you don’t need a relational database but want lower-level storage, MVStore might be an option. It has a nice range of features. The reference documentation isn’t very detailed, but the intro documentation gives a decent overview. Anyway, this post is another small intro.

Figure 1. H2’s companion, the MVStore

Getting Started

First, include the MVStore in your project. The easiest way is to pull it in as a Maven dependency.

pom.xml
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2-mvstore</artifactId>
    <version>1.4.199</version>
</dependency>

After that, open the MVStore and open a map. The map is a Java ConcurrentMap and supports the usual operations, such as put and putIfAbsent.

try (MVStore store = MVStore.open("database.mv")) {
    MVMap<Integer, String> topMovies = store.openMap("imdbTopMovies");
    topMovies.put(1, "The Shawshank Redemption");
    topMovies.put(2, "The Godfather");
    topMovies.put(3, "The Godfather: Part II");
}
// Later on
try (MVStore store = MVStore.open("database.mv")) {
    MVMap<Integer, String> topMovies = store.openMap("imdbTopMovies");
    // returns The Shawshank Redemption
    var no1 = topMovies.get(1);
    System.out.println(no1);
    // returns null
    var no4 = topMovies.get(4);
    System.out.println(no4);
}
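
Because the map is a ConcurrentMap, code written against that interface does not need to know about MVStore at all. Here is a minimal sketch of that idea. It uses a ConcurrentHashMap as a stand-in so it runs without a store file, and the class and method names are made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TopMoviesSketch {
    // Hypothetical helper: accepts any ConcurrentMap, so an MVMap could be passed in as well.
    static String claimRank(ConcurrentMap<Integer, String> topMovies, int rank, String title) {
        // putIfAbsent returns the existing value, or null when the entry was newly added
        String existing = topMovies.putIfAbsent(rank, title);
        return existing == null ? title : existing;
    }

    public static void main(String[] args) {
        // Stand-in for store.openMap("imdbTopMovies")
        ConcurrentMap<Integer, String> topMovies = new ConcurrentHashMap<>();
        System.out.println(claimRank(topMovies, 1, "The Shawshank Redemption"));
        // Rank 1 is already taken, so the existing title wins
        System.out.println(claimRank(topMovies, 1, "The Godfather"));
    }
}
```

Both calls print The Shawshank Redemption, because the second putIfAbsent leaves the existing entry alone.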

You can create as many maps as you need:

MVMap<Integer, String> topMovies = store.openMap("imdbTopMovies");
MVMap<String, Double> ratings = store.openMap("imdbRating");

topMovies.put(1, "The Shawshank Redemption");
ratings.put("The Shawshank Redemption", 9.2);

var no1 = topMovies.get(1);
var rating = ratings.get(no1);
System.out.println("Top movie is " + no1 + " with a rating of " + rating);

Serialisation

We’ve seen that you open a map and just put in Java objects. That immediately makes me nervous, because it smells like Java serialization or some other reflection magic. I try to avoid that, and I certainly don’t want my persistence to involve magic I might later regret.

Figure 2. MVStore serializes basic Java types

So, what happens when we put in our object?

public class Movie {
    public final String title;
    public final double rating;

    public Movie(String title, double rating) {
        this.title = title;
        this.rating = rating;
    }

    @Override
    public String toString() {
        return "Movie(" +
                "title='" + title + '\'' +
                ", rating=" + rating +
                ')';
    }
}

try (MVStore store = MVStore.open("database.mv")) {
    MVMap<String, Movie> movies = store.openMap("imdbMovie");
    movies.put("tt0111161", new Movie("The Shawshank Redemption", 9.2));
}

Well, we get an exception:

Exception in thread "main" java.lang.IllegalStateException: java.lang.IllegalArgumentException: Could not serialize Movie(title=The Shawshank Redemption, rating=9.2) [1.4.199/0] [1.4.199/3]
	at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:883)
	at org.h2.mvstore.MVStore.store(MVStore.java:1194)
	at org.h2.mvstore.MVStore.commit(MVStore.java:1166)
	at org.h2.mvstore.MVStore.closeStore(MVStore.java:982)
	at org.h2.mvstore.MVStore.close(MVStore.java:946)
	at info.gamlor.testing.Main.main(Main.java:43)

Alright, that doesn’t work. However, it works when the class is serializable:

// MVStore works with serializable objects
public class Movie implements Serializable {
    // ...
}
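
To see what the Serializable marker changes, here is a small sketch with plain JDK serialization and no MVStore involved. It assumes the fallback behaves like a regular ObjectOutputStream, which is what the exception above suggests. PlainMovie and SerializableMovie are stand-in classes for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheck {
    // Stand-in for the first Movie version: no Serializable marker
    static class PlainMovie {
        final String title = "The Shawshank Redemption";
    }

    // Stand-in for the fixed version
    static class SerializableMovie implements Serializable {
        private static final long serialVersionUID = 1L;
        final String title = "The Shawshank Redemption";
    }

    // Returns true when standard Java serialization accepts the object
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new PlainMovie()));        // rejected
        System.out.println(canSerialize(new SerializableMovie())); // accepted
    }
}
```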

So, by default the MVStore stores Java primitives, Strings, UUIDs, BigDecimal, BigInteger, Dates, and arrays. Any other object needs to be serializable. If that doesn’t suit you, you can provide your own data type handling. That is what I often end up doing to get fine-grained control over my types.

Figure 3. Teaching the MVStore another serialization format
MovieType.java:
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.h2.mvstore.WriteBuffer;
import org.h2.mvstore.type.DataType;

// Plug in your serialization mechanism by implementing your own DataType
public class MovieType implements DataType {
    @Override
    public int compare(Object a, Object b) {
        throw new UnsupportedOperationException("Only required for keys");
    }

    @Override
    public int getMemory(Object obj) {
        var movie = (Movie) obj;
        // Memory used by the object. This can be an estimate, though it is usually exact
        var titleLen = 4; // Int size
        // Estimate the title's storage size. Assuming one byte per character (ASCII) is right most of the time
        var title = movie.title.length();
        var rating = 8; // Double size
        return titleLen + title + rating;
    }

    @Override
    public void write(WriteBuffer buff, Object obj) {
        var movie = (Movie) obj;
        var title = movie.title.getBytes(StandardCharsets.UTF_8);
        // The length prefix is the number of UTF-8 bytes, not the number of characters
        buff.putInt(title.length);
        buff.put(title);
        buff.putDouble(movie.rating);
    }

    @Override
    public void write(WriteBuffer buff, Object[] obj, int len, boolean key) {
        // Boilerplate for most cases
        for (int i = 0; i < len; i++) {
            write(buff, obj[i]);
        }
    }

    @Override
    public Object read(ByteBuffer buff) {
        // Mirror write(): read the byte count, then decode exactly that many UTF-8 bytes.
        // This also round-trips non-ASCII titles correctly.
        var titleLen = buff.getInt();
        var titleBytes = new byte[titleLen];
        buff.get(titleBytes);
        var title = new String(titleBytes, StandardCharsets.UTF_8);
        var rating = buff.getDouble();
        return new Movie(title, rating);
    }

    @Override
    public void read(ByteBuffer buff, Object[] obj, int len, boolean key) {
        // Boilerplate for most cases
        for (int i = 0; i < len; i++) {
            obj[i] = read(buff);
        }
    }
}
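
The byte layout that write and read have to agree on is small enough to try out with a plain java.nio.ByteBuffer. The sketch below is independent of MVStore: it uses a ByteBuffer where MovieType gets a WriteBuffer, and the class name, method names, and sample values are made up. It writes a 4-byte length, the UTF-8 bytes of the title, and an 8-byte double, and shows why the length prefix has to be treated as a byte count rather than a character count:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class MovieLayoutSketch {
    // Encodes title and rating as: int byte-length, UTF-8 title bytes, double rating
    static ByteBuffer encode(String title, double rating) {
        byte[] titleBytes = title.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buff = ByteBuffer.allocate(Integer.BYTES + titleBytes.length + Double.BYTES);
        buff.putInt(titleBytes.length);
        buff.put(titleBytes);
        buff.putDouble(rating);
        buff.flip(); // switch the buffer from writing to reading
        return buff;
    }

    // Reads the title back; the length prefix is a byte count, not a character count
    static String decodeTitle(ByteBuffer buff) {
        byte[] titleBytes = new byte[buff.getInt()];
        buff.get(titleBytes);
        return new String(titleBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Arbitrary sample values; \u00e9 is an accented e, which takes two bytes in UTF-8
        ByteBuffer buff = encode("Am\u00e9lie", 8.0);
        System.out.println(buff.remaining()); // 4 + 7 + 8 = 19 bytes for a 6-character title
        System.out.println(decodeTitle(buff));
        System.out.println(buff.getDouble());
    }
}
```

The 6-character title occupies 7 bytes here, which is also why the getMemory estimate based on title.length() is only an approximation for non-ASCII text.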

Then you can configure the map to use your serialization mechanism:

try (MVStore store = MVStore.open("database.mv")) {
    MVMap.Builder<String,Movie> mapConfig = new MVMap.Builder<>();
    mapConfig.valueType(new MovieType());
    MVMap<String, Movie> movies = store.openMap("imdbMovie", mapConfig);

    movies.put("tt0111161", new Movie("The Shawshank Redemption", 9.2));
}

In summary, MVStore has built-in serialization for basic Java types. By default, it falls back to Java serialization if it doesn’t know the type. At any point in time, you can plug in your own serialization.

Commit

By default, MVStore commits changes in the background from time to time. You can also commit changes explicitly at any time:

// Commit all changes right now. Otherwise, changes are committed after a while by default
store.commit();

You can disable auto commits by configuring the store. In that case, you control when changes are committed to disk:

var builder = new MVStore.Builder()
        .autoCommitDisabled()
        .fileName("database.mv");
try (MVStore store = builder.open()) {
    // Changes are no longer stored automatically, only on store.commit() and store.close()
}

That’s it for an intro. I plan to dig into more specifics soon.

Tags: MVStore Java