At work, we needed a mechanism to compare EMF models. We are developing a system that uses ATL model-to-model transformations, and we wanted to validate those transformations with JUnit unit tests. For that, we needed a way to compare the expected models with the models actually produced by the transformations.

EMF Compare

I started by evaluating the EMF Compare framework. This framework approaches model comparison by building an EMF model of the differences found during the comparison process. It lets you evaluate differences pretty exhaustively. It compares source and target models by trying to match corresponding elements. For related elements, matching is applied recursively.

The difference model is built from matched and unmatched elements. Matched elements let you examine the similarity of the match (a number between 0 and 1). It seems that two elements match simply by having the same metaclass; the similarity is then set according to how similar their attributes are. If the two models are identical, the similarity of the root elements’ match is 1. If you change the value of two attributes, for example, the similarity was still over 0.9. If you just change the order of two nodes, the similarity was again under 1, even when the references in the metamodel were unordered. This behavior wasn’t very good for our purposes, since the models under comparison could have different orders in their references.

If you think about the semantics of the comparison, two models can be identical even when the elements of their unordered references don’t appear in the same order, can’t they? I suspect that EMF Compare would let you configure different comparison strategies easily, but we found a simpler way to achieve what we wanted.

Modifying the EcoreUtil.EqualityHelper class to ignore order in references

A friend told me to have a look at the equals() method provided by the org.eclipse.emf.ecore.util.EcoreUtil class. It receives two EObject elements and compares them recursively. The two models have to have exactly the same structure, including the order of the elements in each reference, in order to be equal (no matter whether the references are ordered or unordered). Again we were stuck with the same problem. A look at the source of the method revealed that it delegates the functionality completely to a helper inner class: EqualityHelper. The code of this class is well factored and easy to understand. The comparison of two lists of elements is neatly contained in an equals(List, List) method. So we tried to hack this method in order to ignore order when comparing.
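To make the shape of the problem concrete, here is a minimal plain-Java sketch of that pattern. It does not use EMF at all: Node, StructuralEquality and StructuralEqualityDemo are hypothetical names invented for illustration, and the sketch only assumes that the real helper compares lists index by index, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature model element (not an EMF class): a name plus child elements.
final class Node {
    final String name;
    final List<Node> children;

    Node(String name, Node... children) {
        this.name = name;
        this.children = new ArrayList<>(Arrays.asList(children));
    }
}

// Hypothetical helper with the shape described above: comparing two elements
// recurses into their children through a separate list comparison method.
class StructuralEquality {

    boolean equals(Node first, Node second) {
        return first.name.equals(second.name) && equals(first.children, second.children);
    }

    // Order-sensitive: children are compared index by index.
    boolean equals(List<Node> first, List<Node> second) {
        if (first.size() != second.size()) {
            return false;
        }
        for (int i = 0; i < first.size(); i++) {
            if (!equals(first.get(i), second.get(i))) {
                return false;
            }
        }
        return true;
    }
}

class StructuralEqualityDemo {
    public static void main(String[] args) {
        Node expected = new Node("root", new Node("a"), new Node("b"));
        Node reordered = new Node("root", new Node("b"), new Node("a"));

        // Same elements, different order: the index-by-index comparison reports a difference.
        System.out.println(new StructuralEquality().equals(expected, reordered)); // prints false
    }
}
```

Because the list comparison lives in its own method, it is the only place that has to change to make the whole recursive comparison ignore order.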

After we had spent some hours trying to make a very complex modification of the method, another friend proposed the simplest way to compare two lists: sort them both and compare the sorted lists, expecting exactly the same order. The solution was as obvious as it was wonderfully easy to implement: we already had the exhaustive comparison, so we only needed to focus our efforts on sorting the lists.
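The sort-then-compare idea can be sketched with nothing but the Java standard library. UnorderedLists and equalIgnoringOrder are hypothetical names, and plain strings stand in for model elements; this is only an illustration of the idea, not the code we used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of EMF: compares two lists while ignoring element order.
final class UnorderedLists {

    private UnorderedLists() {
    }

    // Sorts copies of both lists with the same comparator and then compares them
    // position by position. The input lists are left untouched.
    static <T> boolean equalIgnoringOrder(List<T> first, List<T> second, Comparator<? super T> order) {
        if (first.size() != second.size()) {
            return false;
        }
        List<T> sortedFirst = new ArrayList<>(first);
        List<T> sortedSecond = new ArrayList<>(second);
        sortedFirst.sort(order);
        sortedSecond.sort(order);
        return sortedFirst.equals(sortedSecond);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("Class A", "Class B", "Package P");
        List<String> actual = Arrays.asList("Package P", "Class A", "Class B");

        System.out.println(expected.equals(actual)); // prints false
        System.out.println(equalIgnoringOrder(expected, actual, Comparator.<String>naturalOrder())); // prints true
    }
}
```

Sorting copies rather than the lists themselves matters when the lists belong to a model: the comparison should not reorder the models it is inspecting.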

The last thing we needed was an EObject comparison criterion with which to implement the proper java.util.Comparator. This wasn’t that easy, and we finally ended up parsing the result of the toString() method to obtain the string with the list of attributes (the attribute names are hard-coded in the EMF-generated code of each concrete toString() method).
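As a rough illustration of that parsing step, the sketch below uses a hypothetical FakeModelElement class whose toString() imitates the shape of the EMF-generated one: class name, '@', hexadecimal hash code, then the attribute list in parentheses. ComparisonKeys and comparisonKey are made-up names as well, and the sketch strips the prefix in one step, which is a slight variation on what our comparator does.

```java
// Hypothetical stand-in for an EMF-generated class; its toString() imitates the
// generated shape "<class name>@<hex hash> (attr: value, ...)".
final class FakeModelElement {
    private final String name;
    private final int size;

    FakeModelElement(String name, int size) {
        this.name = name;
        this.size = size;
    }

    @Override
    public String toString() {
        return getClass().getName() + '@' + Integer.toHexString(hashCode())
                + " (name: " + name + ", size: " + size + ')';
    }
}

// Hypothetical helper: derives a sort key that depends only on the attribute values.
final class ComparisonKeys {

    private ComparisonKeys() {
    }

    static String comparisonKey(Object object) {
        // The class name and hash code identify the instance, not its content, so drop them.
        String prefix = object.getClass().getName() + '@' + Integer.toHexString(object.hashCode());
        String text = object.toString();
        return text.startsWith(prefix) ? text.substring(prefix.length()).trim() : text;
    }

    public static void main(String[] args) {
        Object first = new FakeModelElement("Customer", 3);
        Object second = new FakeModelElement("Customer", 3);

        System.out.println(comparisonKey(first)); // prints (name: Customer, size: 3)
        System.out.println(comparisonKey(first).equals(comparisonKey(second))); // prints true
    }
}
```

Two distinct instances with the same attribute values produce the same key, which is what makes the key usable for sorting both lists into a comparable order.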

The source code of the modified EqualityHelper is shown below.

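import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.util.EcoreUtil;

/**
 * EqualityHelper that ignores the order of the elements when comparing lists of EObjects.
 */
public class EMFComparator extends EcoreUtil.EqualityHelper {

    private static final long serialVersionUID = 1L;

    /** Orders EObjects by their toString() output, minus the instance-specific parts. */
    private static class EObjectComparator implements Comparator<EObject> {
        public int compare(EObject object1, EObject object2) {
            String targetString1 = extractComparisonString(object1);
            String targetString2 = extractComparisonString(object2);

            return targetString1.compareTo(targetString2);
        }

        // Strip the class name and the hash code so that only the attribute list remains.
        // replace() takes its arguments literally; replaceAll() would read them as regular expressions.
        private String extractComparisonString(EObject object) {
            return object.toString()
                    .replace(object.getClass().getName(), "")
                    .replace(Integer.toHexString(object.hashCode()), "");
        }
    }

    @Override
    public boolean equals(List<EObject> list1, List<EObject> list2) {
        Comparator<EObject> comparator = new EObjectComparator();

        // Sort copies so that the models under comparison are not modified.
        List<EObject> sortedList1 = new ArrayList<EObject>(list1);
        List<EObject> sortedList2 = new ArrayList<EObject>(list2);

        Collections.sort(sortedList1, comparator);
        Collections.sort(sortedList2, comparator);

        // The inherited implementation compares the sorted lists index by index.
        return super.equals(sortedList1, sortedList2);
    }
}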

Conclusion

I was a bit surprised not to find this problem already solved when googling for it. I’m sure more people have had this need and have solved this problem before. When you are transforming models, you need a formal way to compare the actual output models with the expected ones. I wouldn’t develop a complex transformation system with a lot of transformation rules without such a mechanism. Side effects of modifying transformation rules can break the system and easily go unnoticed. If transformations are an essential mechanism in a data loading system, as they are in our case, this danger is not acceptable.

Update (2008-11-19)

There was an error in the code originally posted. The extractComparisonString() method received a String object when it should have received an EObject (it didn’t make sense). Many thanks to Jim Showalter for pointing it out. I should have copied and pasted what I coded at work instead of rewriting it at home.

You can also download an Eclipse sample project that compares two UML models. It includes the source of the comparator.