Home

Synopsis

EMF fragments is an Eclipse Modeling Framework (EMF) persistence layer for distributed data stores like NoSQL databases (e.g. MongoDB, Hadoop/HBase) or distributed file systems.

What Problem Does It Solve?

The [EMF](http://www.eclipse.org/emf) (Eclipse Modeling Framework) is designed to programmatically create, edit, and analyze software models. It provides generated type-safe APIs to manipulate models based on a schema (i.e. metamodel). This is similar to XML and XML schemas with JAX-like APIs. EMF works fine as long as you use it for software models and your models fit into main memory. If you use EMF for different data, e.g. sensor data, geo data, or social data, you soon run out of main memory and things become a little more complicated.

Why would I use EMF for this kind of data anyway? EMF provides very good generated APIs, generated GUI tools to visualize data, and a series of strong model transformation languages, all of which can be applied to structured data. Data in EMF is described through metamodels similar to XML schemas or entity-relationship diagrams.

To use larger models in EMF, we need a way to persist models that does not require us to load complete models into memory. Existing solutions include ORM mappings (e.g. Eclipse's CDO). These solutions have three drawbacks:

ORM mappings store data slowly because data is indexed and stored at a very fine granularity.
ORM mappings are slow when structures are traversed because data is loaded piece by piece even though it is used in larger aggregates.
SQL databases are not easily distributed.

EMF fragments is designed to store large object-oriented data models (typed, labeled, bidirectional graphs) efficiently and scalably. It builds on key-value stores and emphasizes fast storage of new data and fast navigation of persisted models. The requirements for this framework come from storing and analyzing large amounts of sensor data in real time.

How Does EMF Fragments Work?

EMF fragments differs from frameworks based on object-relational mappings (ORM) like Connected Data Objects (CDO). While ORM mappings map single objects, attributes, and references to database entries, EMF fragments maps larger chunks of a model (fragments) to URIs. This makes it possible to store models on a wide range of distributed data stores, including distributed file systems and key-value stores (think NoSQL databases like MongoDB or HBase). This also prepares EMF models for cloud computing paradigms such as Map/Reduce.

example fragmentation

The EMF fragments framework fragments models automatically and transparently in the background. Clients designate the types of references at which models are fragmented. This lets you control fragmentation without having to trigger it programmatically. Fragments are managed automatically: when you create, delete, move, or edit model elements, new fragments are created and elements are distributed among those fragments on the fly. Fragments (i.e. resources) are identified by URIs. The framework lets you map URIs to (distributed) data stores (e.g. NoSQL databases or distributed file systems).
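The idea of reference-driven fragmentation can be illustrated with a small, self-contained sketch. It does not use EMF or the EMF fragments API: the class `FragmentationSketch`, its `Node` type, and the numbering scheme for fragment URIs are all hypothetical, and the sketch assigns elements to fragments in a single pass rather than on the fly as the framework does. It only shows the rule that children behind a regular containment reference stay in their container's fragment, while children behind a fragmenting reference start a new one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of reference-driven fragmentation; NOT the EMF fragments API. */
class FragmentationSketch {

    /** Hypothetical model element with one regular and one fragmenting containment reference. */
    static final class Node {
        final String name;
        final List<Node> regularContents = new ArrayList<>();
        final List<Node> fragmentedContents = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Distributes all elements below root over fragments; map keys are fragment URIs. */
    static Map<String, List<String>> fragment(Node root, String baseUri) {
        Map<String, List<String>> fragments = new LinkedHashMap<>();
        assign(root, baseUri, newFragment(fragments, baseUri), fragments);
        return fragments;
    }

    /** Opens an empty fragment; the "/<counter>" URI scheme is an assumption of this sketch. */
    private static String newFragment(Map<String, List<String>> fragments, String baseUri) {
        String uri = baseUri + "/" + fragments.size();
        fragments.put(uri, new ArrayList<>());
        return uri;
    }

    private static void assign(Node node, String baseUri, String currentUri,
                               Map<String, List<String>> fragments) {
        fragments.get(currentUri).add(node.name);
        // Regular containment: the child is stored in the same fragment as its container.
        for (Node child : node.regularContents) {
            assign(child, baseUri, currentUri, fragments);
        }
        // Fragmenting containment: each child becomes the root of a new fragment.
        for (Node child : node.fragmentedContents) {
            assign(child, baseUri, newFragment(fragments, baseUri), fragments);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        root.regularContents.add(new Node("inline child"));
        root.fragmentedContents.add(new Node("fragment child"));
        fragment(root, "memory://localhost/sketch")
            .forEach((uri, names) -> System.out.println(uri + " -> " + names));
    }
}
```

Because the fragment boundary is a property of the reference rather than of the calling code, client code that builds the model looks the same whether or not a reference fragments.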

How Is EMF Fragments Used?

Using EMF fragments is simple if you are used to EMF. You create EMF metamodels as usual, e.g. with Ecore. You generate APIs and tools as usual using normal genmodels, but with three specific settings:

You have to configure your genmodels to use reflective feature delegation.
You have to use a specific base class: FObjectImpl
You have to enable Containment Proxies

You use the generated APIs and tools as usual. EMF fragments provides a specific ResourceSet implementation called FragmentedModel. Resources are managed automatically in the background, and you do not have to create, load, or unload them manually (as you would have to without EMF fragments).

To actually have your models fragmented, you need to annotate those reference features in your metamodel that you want to cross fragment borders. This is only possible for containment references. When an object is added to a container via such a fragmenting reference feature, a new resource is created automatically and the newly contained object is put into that new fragment.

EMF fragments provides an abstract interface to map resource (fragment) URIs to physical storage. An implementation for Apache HBase and an in-memory test implementation are provided.
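To make the role of such a storage abstraction concrete, here is a minimal sketch of what a URI-to-storage mapping could look like. The interface `FragmentStore`, its methods, and the class `InMemoryFragmentStore` are hypothetical names chosen for illustration; they are not the interface that EMF fragments actually defines.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical storage abstraction; NOT the interface shipped with EMF fragments. */
interface FragmentStore {
    /** Stores the serialized form of one fragment under its URI. */
    void write(String fragmentUri, byte[] serializedFragment);

    /** Returns the serialized fragment, or empty if nothing is stored under the URI. */
    Optional<byte[]> read(String fragmentUri);

    /** Removes the fragment stored under the URI, if any. */
    void delete(String fragmentUri);
}

/** In-memory implementation for tests; a sorted map keeps entries ordered by key. */
final class InMemoryFragmentStore implements FragmentStore {
    private final Map<String, byte[]> entries = new TreeMap<>();

    @Override
    public void write(String fragmentUri, byte[] serializedFragment) {
        // Copy the array so later changes by the caller do not alter stored data.
        entries.put(fragmentUri, serializedFragment.clone());
    }

    @Override
    public Optional<byte[]> read(String fragmentUri) {
        byte[] stored = entries.get(fragmentUri);
        return stored == null ? Optional.empty() : Optional.of(stored.clone());
    }

    @Override
    public void delete(String fragmentUri) {
        entries.remove(fragmentUri);
    }

    public static void main(String[] args) {
        FragmentStore store = new InMemoryFragmentStore();
        String uri = "memory://localhost/sketch/0";
        store.write(uri, "<root/>".getBytes(StandardCharsets.UTF_8));
        store.read(uri).ifPresent(bytes ->
            System.out.println(uri + " -> " + new String(bytes, StandardCharsets.UTF_8)));
    }
}
```

Keeping the contract this narrow (write, read, and delete whole fragments by key) is what would let the same model code run against a key-value database, a distributed file system, or a plain map in tests.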

Hello World Example

This example is part of the de.hub.emffrag.tests Eclipse project. You can find it within the sources of emf-fragments. For this Hello World example we use a very simple metamodel:

example meta-model

The following code demonstrates how to initialize emf-fragments, create a model, and traverse it:

// necessary if you use EMF outside of a running eclipse environment
EPackage.Registry.INSTANCE.put(TestModelPackage.eINSTANCE.getNsURI(), TestModelPackage.eINSTANCE);
EPackage.Registry.INSTANCE.put(EcorePackage.eINSTANCE.getNsURI(), EcorePackage.eINSTANCE);
Resource.Factory.Registry.INSTANCE.getExtensionToFactoryMap().put("ecore", new XMIResourceFactoryImpl());

// initialize your model
Resource resource = new FResourceSet().createResource(URI.createURI("memory://localhost/test"));

// create an object and add it to the model root
TestObject testContainer = TestModelFactory.eINSTANCE.createTestObject();
testContainer.setName("Container");
resource.getContents().add(testContainer);

// create the rest of your model as usual
TestObject testContents = TestModelFactory.eINSTANCE.createTestObject();
TestObject testFragmentedContents = TestModelFactory.eINSTANCE.createTestObject();

testContents.setName("Hello Old World!");
testFragmentedContents.setName("Hello New World!");

testContainer.getRegularContents().add(testContents);
testContainer.getFragmentedContents().add(testFragmentedContents);

// call save to force save of cached and unsaved parts of your model
// before exiting the JVM
resource.save(null);

System.out.println("Key value store contents: ");
System.out.println(((FragmentedModel)resource).getDataStore());

// to read a model initialize the environment as before
// initialize your model
resource = new FResourceSet().createResource(URI.createURI("memory://localhost/test"));

// navigate the model as usual
System.out.println("Iterate results: ");
TreeIterator<EObject> allContents = resource.getAllContents();
while (allContents.hasNext()) {
	System.out.println(allContents.next());
}

The result should be something like this:

Key value store contents: 
memory://localhost/test
key: 102 95 0 0 0 0 0 0 0 0 , URI: memory://localhost/test/Zl8AAAAAAAAAAA
value: <?xml version="1.0" encoding="UTF-8"?>
<tm:TestObject xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:tm="http://hu-berlin.de/sam/emfhbase/testmodel" name="Container">
  <regularContents name="Hello Old World!"/>
  <fragmentedContents href="Zl8AAAAAAAAAAQ#/"/>
</tm:TestObject>

key: 102 95 0 0 0 0 0 0 0 1 , URI: memory://localhost/test/Zl8AAAAAAAAAAQ
value: <?xml version="1.0" encoding="UTF-8"?>
<tm:TestObject xmi:version="2.0" xmlns:xmi="http://www.omg.org/XMI" xmlns:tm="http://hu-berlin.de/sam/emfhbase/testmodel" name="Hello New World!"/>


Iterate results: 
Container
Hello Old World!
Hello New World!

As you can see, the object added to the fragmentedContents reference was stored in its own fragment. The object added to the regularContents reference was stored in the same fragment as its container. The fragmentedContents reference was annotated with _de.hub.emfhbase: Fragmentation->true_; the regularContents reference was not.
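The keys and URI segments in the output above are related in a simple way: each URI segment is the Base64 form of the 10-byte key printed next to it. The following sketch reproduces that relationship using only the Java standard library. The key layout (the ASCII bytes `f` and `_` followed by a big-endian 64-bit counter) and the choice of the URL-safe Base64 alphabet without padding are assumptions inferred from the output shown; the class `FragmentKeySketch` and its methods are hypothetical and not part of the framework.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

/** Reproduces the key/URI pairs from the example output; layout inferred, not taken from the framework. */
class FragmentKeySketch {

    /** Builds a 10-byte key: 'f', '_', then the index as a big-endian 64-bit integer (assumed layout). */
    static byte[] key(long index) {
        return ByteBuffer.allocate(10)
            .put((byte) 'f')
            .put((byte) '_')
            .putLong(index) // ByteBuffer writes big-endian by default
            .array();
    }

    /** Encodes a key as a URI segment using URL-safe Base64 without padding (assumed alphabet). */
    static String uriSegment(byte[] key) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(key);
    }

    /** Formats a key as space-separated unsigned byte values. */
    static String spaced(byte[] key) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < key.length; i++) {
            if (i > 0) {
                text.append(' ');
            }
            text.append(key[i] & 0xFF);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        for (long index = 0; index < 2; index++) {
            byte[] key = key(index);
            System.out.println("key: " + spaced(key)
                + " , URI: memory://localhost/test/" + uriSegment(key));
        }
        // prints:
        // key: 102 95 0 0 0 0 0 0 0 0 , URI: memory://localhost/test/Zl8AAAAAAAAAAA
        // key: 102 95 0 0 0 0 0 0 0 1 , URI: memory://localhost/test/Zl8AAAAAAAAAAQ
    }
}
```

The two example keys contain no bytes that would distinguish the standard Base64 alphabet from the URL-safe one, so the sketch picks the URL-safe variant simply because the result is embedded in a URI.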
