On M-theory and H-models, or can EMF fit every nail

by Vladimir Piskarev

Readers who are interested in modern physics may know about M-theory, which seems to have been gathering a lot of hype in recent years. It is the latest toy of theoretical physicists, and many think it could become the ultimate “theory of everything”. It is a subject of ongoing debate, however, whether there can actually be a single unified theory capable of explaining everything or whether, instead, there is a set of theories, each describing different aspects of the universe (with some overlap). And if one adopts model-dependent realism, it doesn’t really matter :-)

In the Eclipse world, EMF has been steadily gaining traction. Rightfully so: EMF is such a nice, flexible tool that many problems start looking like a nail. But can an entire IDE (like JDT) be essentially EMF-based? Xtext might lead many to believe so, but let’s take a closer look and see how JDT is structured from the modeling point of view. After all, it reflects thirty years of experience!

It may be surprising that there are at least three different models in JDT, each describing different aspects of the IDE (with some overlap) and having very distinctive characteristics of its own:

  • Java model

    I like to think of the Java model as the IDE ground (and glue). This model is the common foundation for many core services as well as all views and actions in JDT. One can also think of it as a Java-oriented projection of the Eclipse workspace.

    The Java model is a handle-based model (H-model for short). Its elements are handles: value objects that act like a key to a model element. This gives it a number of special properties that are very useful for a large-scale in-memory model of the whole IDE:

    • Stable (elements to which a reference can be kept, e.g. to show in a view)
    • Lightweight (the model is not exhaustive; no resolved information is kept)
    • Lazily populated, LRU cache (doesn’t hold on to resources)
    • Eventually consistent (need not be consistent all the time, allows “dirty reads”)
    • Thread safe without imposing much locking (fosters high concurrency)

    That’s quite a unique combination of properties, if you think about it. It’s not a mere coincidence that the underlying workspace resource model is also an H-model (with slightly different properties).

  • Abstract Syntax Tree (AST)

    Each AST node represents a Java source code construct, such as a name, type, expression, statement, or declaration. The parser parses a string containing Java source code and returns an AST for it.

    It is important to note that the AST is really a tree (no cross-references). Since ASTs tend to consume a lot of memory, JDT holds only one AST, for the currently active editor, reconciling it as necessary. Clients should not keep references to elements of the AST.

  • Bindings

    When an AST is created by the parser, there is an option to also create bindings. A binding represents a named entity in the Java language, such as a package, type, field, method, constructor, or local variable.

    The world of bindings (an environment) provides an interconnected picture of the structure of the program as seen from the compiler’s point of view. This is the only JDT model with graph structure, i.e. cross-references. One can think of it as the semantic model of the language.
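To make the notion of a handle a bit more concrete, here is a minimal sketch in plain Java of how I picture the idea behind an H-model. It is not JDT or Handly code: the names HandleSketch, TypeHandle, TypeInfo and ModelCache are made up for illustration, and the "parsing" is a stand-in. A handle is an immutable value object that only names an element, while the heavier element data (the "body") lives in a bounded LRU cache and is recomputed on demand after eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch only; none of these types exist in JDT or Handly.
final class HandleSketch {

    /** A handle: an immutable value object that acts as a key to an element. */
    static final class TypeHandle {
        private final String project;
        private final String qualifiedName;

        TypeHandle(String project, String qualifiedName) {
            this.project = Objects.requireNonNull(project);
            this.qualifiedName = Objects.requireNonNull(qualifiedName);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TypeHandle)) return false;
            TypeHandle other = (TypeHandle) o;
            return project.equals(other.project)
                && qualifiedName.equals(other.qualifiedName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(project, qualifiedName);
        }
    }

    /** The "body": data computed for an element, kept only while cached. */
    static final class TypeInfo {
        final int memberCount;

        TypeInfo(int memberCount) {
            this.memberCount = memberCount;
        }
    }

    /** Lazily populated, bounded LRU cache of bodies, keyed by handle. */
    static final class ModelCache {
        private final Map<TypeHandle, TypeInfo> bodies;

        ModelCache(final int capacity) {
            // accessOrder = true makes iteration order least-recently-used first
            bodies = new LinkedHashMap<TypeHandle, TypeInfo>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<TypeHandle, TypeInfo> eldest) {
                    return size() > capacity;
                }
            };
        }

        /** Returns the body for the handle, computing it if it is not cached. */
        synchronized TypeInfo getInfo(TypeHandle handle) {
            TypeInfo info = bodies.get(handle);
            if (info == null) {
                // Stand-in for real work such as parsing a source file.
                info = new TypeInfo(handle.qualifiedName.length());
                bodies.put(handle, info);
            }
            return info;
        }

        /** Tells whether a body is currently cached for the handle. */
        synchronized boolean isOpen(TypeHandle handle) {
            return bodies.containsKey(handle);
        }
    }

    public static void main(String[] args) {
        ModelCache cache = new ModelCache(2);
        TypeHandle a = new TypeHandle("demo", "p.A");

        // Handles are equal by value, so a view can keep one around safely.
        System.out.println(a.equals(new TypeHandle("demo", "p.A")));

        cache.getInfo(a);
        cache.getInfo(new TypeHandle("demo", "p.B"));
        cache.getInfo(new TypeHandle("demo", "p.C")); // evicts the body of p.A

        System.out.println(cache.isOpen(a));               // body is gone...
        System.out.println(cache.getInfo(a).memberCount);  // ...but the handle still works
    }
}
```

The point of the sketch is the split between cheap, stable keys and disposable bodies: a client may hold a TypeHandle for as long as it likes without pinning any element data in memory. The coarse synchronized methods are a simplification on my part and make no attempt to reproduce the low-locking behavior of a real H-model.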

It seems only one of those models (Bindings) can be naturally represented with EMF. Another (AST) might fit EMF, with some caveats. The remaining one (Java model) is not a good fit for EMF at all. (Beg to differ? You are welcome to leave a comment. I may be missing something.) It looks like “M-theory” for Eclipse-based IDEs (where ‘M’, in this case, stands for Modeling) would indeed include a set of models with different goals, properties, and requirements for the underlying implementation technology.

In particular, the common needs of H-models like the Java model are to be addressed in Handly, an Eclipse technology project that is about to turn one year old. Handly (at its core) supplies basic building blocks that help developers create their own H-models.

A nice bonus is that many useful IDE components (micro-frameworks) could be implemented atop the common API for H-models provided by Handly — there must be a considerable potential for unification. As an example, the upcoming Handly 0.3 release, scheduled to coincide with Eclipse Mars, will provide an outline framework (including quick outline support) for H-models. The outline functionality is already in place, actually, in the project’s git repository and latest I-builds.

The vision behind Handly is not that of an all-encompassing IDE framework with its many inherent assumptions. Rather, it envisages a set of loosely-coupled micro-frameworks around H-models — that common IDE ground. If you are interested in giving Handly a try, there’s a multi-part tutorial on GitHub. Beware though: if you are new to H-models, it takes some getting used to. But if you are keen to know what principles make JDT (or other Eclipse-based IDEs) tick as a seamless whole (the *I*DE), I think it might be worth it.

Of course, Handly doesn’t prescribe a specific implementation technology for the other IDE models. They could be EMF-based (as in Xtext; in fact, Handly already includes a layer of integration with Xtext) or something else; it doesn’t matter much when one adopts model-dependent realism!