Distributed Software Engineering: LOCKING GRANULARITY

By June 20, 2015


The previous sections of this paper have outlined the motivation for software engineering processes, software configuration management, and various approaches to coordination. This section examines in detail the ability of configuration management systems to lock at various granularities.

Traditional software configuration management (SCM) systems lock at the file level and assume text-based content for their differential analysis [Cederqvist, 1993; Chu-Carroll, 2002; Magnusson et al., 1993]. Much potential remains to be realized, however, if locking is adopted at a finer granularity, such as a class, method, or block [Chu-Carroll, 2002; Magnusson et al., 1993].

First, locking at a finer granularity reduces contention for shared documents. A single file containing many classes or methods need no longer be a point of contention among multiple users: because the file is broken into multiple sub-files, each of which is managed separately by the SCM system, many users can have parts of the formerly monolithic file checked out (or in use) simultaneously.
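The difference between file-level and sub-file locking can be illustrated with a minimal sketch of a pessimistic lock table. The class `LockTable`, its methods, and the file and element names below are hypothetical and are not drawn from any of the cited systems; the sketch assumes a lock is identified by a file name plus an optional element name, where an omitted element means the whole file.

```python
from typing import Dict, Optional, Tuple

# A lock key is (file, element); element is None for a whole-file lock.
LockKey = Tuple[str, Optional[str]]


class LockTable:
    """Hypothetical pessimistic lock table, for illustration only."""

    def __init__(self) -> None:
        self._owners: Dict[LockKey, str] = {}

    def try_lock(self, user: str, file: str, element: Optional[str] = None) -> bool:
        """Grant the lock unless another user holds a conflicting one."""
        for (held_file, held_element), owner in self._owners.items():
            if held_file != file or owner == user:
                continue
            # A whole-file lock conflicts with everything in that file;
            # two element locks conflict only when they name the same element.
            if held_element is None or element is None or held_element == element:
                return False
        self._owners[(file, element)] = user
        return True

    def unlock(self, user: str, file: str, element: Optional[str] = None) -> None:
        """Release the lock if this user holds it; otherwise do nothing."""
        if self._owners.get((file, element)) == user:
            del self._owners[(file, element)]


if __name__ == "__main__":
    coarse = LockTable()
    print(coarse.try_lock("alice", "Shapes.java"))  # True
    print(coarse.try_lock("bob", "Shapes.java"))    # False: whole file is held

    fine = LockTable()
    print(fine.try_lock("alice", "Shapes.java", "Circle.area"))  # True
    print(fine.try_lock("bob", "Shapes.java", "Square.area"))    # True: different element
    print(fine.try_lock("bob", "Shapes.java", "Circle.area"))    # False: same element
```

With file-level locks the second user is blocked outright; with element-level locks both users proceed unless they request the same element.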

Second, adopting a fine (sub-file) granularity for locking may improve concurrency regardless of whether a pessimistic or an optimistic coordination policy is used. The probability of two users requesting the same document falls as documents become smaller (i.e., as the locking granularity becomes finer, the documents being managed shrink in size and grow in number). In a pessimistic system, documents in high demand are partitioned into subsections that may be locked independently, with fewer conflicting requests. In an optimistic system, the probability of users editing the same document is likewise reduced.
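The effect of granularity on contention can be estimated with a short calculation. The sketch below assumes, purely for illustration, that each user independently requests one document chosen uniformly at random; real access patterns are rarely uniform, so the figures indicate a trend rather than a prediction. The function name `collision_probability` and the example counts (10 files, 20 elements per file, 5 users) are hypothetical.

```python
def collision_probability(num_documents: int, num_users: int) -> float:
    """Probability that at least two users request the same document,
    assuming each user picks one document uniformly and independently."""
    if num_users > num_documents:
        return 1.0  # pigeonhole: some document must be requested twice
    p_all_distinct = 1.0
    for i in range(num_users):
        # The (i+1)-th user must avoid the i documents already chosen.
        p_all_distinct *= (num_documents - i) / num_documents
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    users = 5
    files = 10
    elements = files * 20  # assumed: each file splits into 20 lockable elements
    print(f"file-level:    {collision_probability(files, users):.3f}")     # 0.698
    print(f"element-level: {collision_probability(elements, users):.3f}")  # 0.049
```

Under these assumptions, splitting each file into twenty independently managed elements lowers the chance of any conflict among five users from roughly 70% to roughly 5%.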

Third, by adopting a fine granularity for locking, the system may aggregate documents to form virtual files and other versioned objects from collections of other objects [Chu-Carroll, 2002]. Most SCM systems currently employ the concept of a project (or directory), which is itself an aggregate [Microsoft, 2005a]. Unfortunately, most existing SCM systems do not make it easy to aggregate objects from different projects. Fine-grain SCM systems, by contrast, allow greater aggregation because each element becomes more reusable; if elements are decoupled from one another, reuse should increase [Microsoft, 2005a]. The following figure demonstrates the aggregation of many elements into larger, versioned objects within the repository.
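One way to picture such aggregation is a repository that versions each element separately and builds a virtual file as an ordered list of references to specific element versions. The sketch below is an illustration under that assumption, not the design of any cited system; `ElementStore`, `Aggregate`, and the element identifiers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class ElementStore:
    """Hypothetical repository that versions each fine-grain element separately."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def commit(self, element_id: str, text: str) -> int:
        """Store a new version of the element and return its version number."""
        history = self._versions.setdefault(element_id, [])
        history.append(text)
        return len(history)  # versions are numbered from 1

    def get(self, element_id: str, version: int) -> str:
        return self._versions[element_id][version - 1]


@dataclass(frozen=True)
class Aggregate:
    """A virtual file: an ordered collection of (element id, version) references."""

    name: str
    members: Tuple[Tuple[str, int], ...]

    def render(self, store: ElementStore) -> str:
        return "\n".join(store.get(e, v) for e, v in self.members)


if __name__ == "__main__":
    store = ElementStore()
    v_area = store.commit("geometry/Circle.area", "def area(r): return 3.14159 * r * r")
    v_log = store.commit("logging/log", "def log(msg): print(msg)")

    # The aggregate draws elements from two different projects.
    report = Aggregate("report.py", (("geometry/Circle.area", v_area), ("logging/log", v_log)))

    # A later edit to one element does not disturb the aggregate's pinned version.
    store.commit("geometry/Circle.area", "def area(r): return math.pi * r ** 2")
    print(report.render(store))
```

Because the aggregate pins element versions rather than copying text, the same element can appear in several aggregates, and each aggregate can itself be treated as a versioned object.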

Chu-Carroll et al. [2002] propose a model that uses fine-grain elements as first-order entities to achieve a high level of aggregation. The idea behind their system is that the semantic rules of the programming language(s) in use can guide the automatic management, searching, and merging of the various versions of documents being edited collaboratively. Particularly novel in their system, titled “Stellation,” is the idea of automatically aggregating specification with implementation, thereby linking code to its requirements or specification.

Magnusson et al. [1993] propose an interesting approach to managing the complexity that can emerge when dealing with fine-grain entities in an SCM system. Since the number of elements in a fine-grain SCM system can increase by an order of magnitude or more, Magnusson et al. implement their system using a hierarchical representation of the code base. Blocks of code contain other blocks of code in a component pattern, and a tree of code blocks is constructed to represent the source in the repository. Depth in the code tree corresponds to semantic programming language elements such as classes, methods, and blocks. To keep the complexity and redundancy of the system minimal, the authors employ a sharing scheme in which references to new or changed entities are “grafted into” the current source tree. The following figure demonstrates the shared source sub-tree from version n to version n+1.

Notice that in this model only the data of the changed source-tree node is modified in the structure; the tree node and its ancestors are marked as part of the new version (notice the grey nodes above), but no code is replicated unless it has been modified (notice the black node data above). This avoids redundancy in the source code repository. The model supports additions, edits, single evolution lines (progressions), and alternative revisions (version branches) [Harrison, 1990]. Additionally, the version tree helps facilitate merges between multiple disparate edits of a single node.
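The sharing behavior described above resembles path copying, a standard technique for persistent trees: an edit creates new nodes only for the changed node and its ancestors, and every untouched subtree is shared between versions. The sketch below illustrates path copying under the assumption that sibling names are unique; it is not the representation used by Magnusson et al., and `Node`, `replace`, and the example source tree are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """An immutable block of code that may contain nested blocks."""

    name: str
    text: str = ""
    children: Tuple["Node", ...] = ()


def replace(root: Node, path: Tuple[str, ...], new_text: str) -> Node:
    """Return the root of a new version in which the node at `path`
    (child names below the root) carries `new_text`.

    Only the edited node and its ancestors are re-created; all other
    subtrees are shared with the previous version."""
    if not path:
        return Node(root.name, new_text, root.children)
    head, rest = path[0], path[1:]
    if all(child.name != head for child in root.children):
        raise KeyError(head)
    new_children = tuple(
        replace(child, rest, new_text) if child.name == head else child
        for child in root.children
    )
    return Node(root.name, root.text, new_children)


if __name__ == "__main__":
    version_n = Node("Shapes", children=(
        Node("Circle", children=(
            Node("area", "return 3.14159 * r * r"),
            Node("perimeter", "return 2 * 3.14159 * r"),
        )),
        Node("Square", children=(Node("area", "return s * s"),)),
    ))
    version_n1 = replace(version_n, ("Circle", "area"), "return math.pi * r ** 2")

    print(version_n1.children[1] is version_n.children[1])  # True: Square is shared
    print(version_n1.children[0] is version_n.children[0])  # False: Circle is an ancestor of the edit
    print(version_n.children[0].children[0].text)           # return 3.14159 * r * r
```

Both roots remain valid, so version n stays retrievable after version n+1 is created, and the storage cost of an edit is proportional to the depth of the edited node rather than to the size of the code base.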