Home
Differencing and Merging of Architectural Views∗
Contents
1. Subtypes for Specifications Predicate Subtyping in PVS IEEE Transactions on Software Engineering 24 9 sampkluw tex 16 12 2007 13 23 p 44 Differencing and Merging of Architectural Views 45 Sartipi K and K Kontogiannis 2003 On Modeling Software Architecture Recoverv as Graph Matching In Proceedings of the 19th IEEE International Conference on Software Maintenance pp 224 234 Schmerl B J Aldrich D Garlan R Kazman and H Van 2006 Discovering Ar chitectures from Running Svstems IEEE Transactions on Software Engineering 32 7 454 466 Schmerl B and D Garlan 2004 AcmeStudio Supporting Stvle Centered Archi tecture Development In Proceedings of the 26th International Conference on Software Engineering pp 704 705 Shasha D and K Zhang 1997 Approximate Tree Pattern Matching In A Apos tolico and E Galil Z eds Pattern Matching Algorithms Oxford Universitv Press Shaw M and D Garlan 1996 Software Architectures Perspectives on an Emerging Discipline Prentice Hall Spitznagel B and D Garlan 1998 Architecture Based Performance Analysis In Proceedings of the Conference on Software Engineering and Knowledge Engineering Sun Microsystems 2006 J2EE Tutorials Dukes Bank http java sun com j2ee tutorial 1_3 fcs doc Ebank2 html Telea A A Maccari and C Riva 2002 An Open Visualization Toolkit for Reverse Architecting In Proceedings of t
2. Capturing Software Architecture Design Expertise with Ar mani Technical Report CMU CS 98 163R Carnegie Mellon University School of Computer Science Muccini H M S Dias and D J Richardson 2005 Towards Software Architecture Based Regression Testing In Proceedings of the Workshop on Architecting Dependable Systems pp 1 7 Murphy G C D Notkin and K J Sullivan 2001 Software Reflexion Models Bridging the Gap between Design and Implementation IEEE Transactions on Software Engineering 27 4 364 380 Object Technology International Inc 2003 Eclipse Platform Technical Overview http www eclipse org whitepapers eclipse overview pdf Ohst D M Welle and U Kelter 2003 Differences between Versions of UML Diagrams In Proceedings of the 9th European Software Engineering Confer ence 11th ACM SIGSOFT International Sumposium on Foundations of Software Engineering pp 227 236 Raghavan S R Rohana D Leon A Podgurski and V Augustine 2004 Dex A Semantic Graph Differencing Tool for Studving Changes in Large Code Bases In Proceedings of the 20th IEEE International Conference on Software Maintenance pp 188 197 Roshandel R A van der Hoek M Mikic Rakic and N Medvidovic 2004 Mae A System Model and Environment for Managing Architectural Evolution ACM Transactions on Software Engineering and Methodology 13 2 240 276 Rushby J S Owre and N Shankar 1998
3. 1 from z and F p max f x p x X Then MDIR s worst case running time is O 2 x F p N2 The average case is considerably faster than the worst case in our implementation due to the following strategies We prune the search tree by using both the tree structure and any semantic information such as optional type information We also limit the running time by returning a possibly suboptimal solution In practice the observed runtime is O KN where K is a large con stant In comparison THP has a worst case running time of O d N where d is the maximum degree of a tree and d lt lt N Torsello et al 2005 Regarding memory requirements both THP and MDIR could be implemented in O N space at the expense of additional complexity Our current THP implementation requires O dN and MDIR requires O bN where b is a large constant factor sampkluw tex 16 12 2007 13 23 p 16 Differencing and Merging of Architectural Views 17 4 Empirical Evaluation An empirical evaluation of the accuracy of the MDIR algorithm is necessary because bounds B and R remove the guarantee of optimality We generated the test data as follows 1 generate a random tree with random labels taken from a pool of 10 possible names so as they are non unique 2 copy the tree 3 delete a random number of nodes in the copy including both internal and leaf nodes 4 rename a number of nodes in the copy 5 and finally compare the two trees u
4. Abi Antoun et al 2005 or in a change impact analvsis Krikhaar et al 1999 A runtime analvsis could use the difference information to perform architectural repair Dashofv et al 2002 Finallv during evolution the difference information can focus regression testing efforts Muccini et al 2005 Several techniques and tools have been proposed for differencing and merging architectural or design views Most techniques do not detect differences based on structural information Many assume that elements have unique identifiers Alanen and Porres 2003 Ohst et al 2003 Mehra et al 2005 Others match two elements if both their labels and their types match Chen et al 2003 which is often infeasi ble when dealing with views at different levels of abstraction Many techniques detect only a small number of differences For instance ArchDiff only detects insertions and deletions van der Westhuizen and van der Hoek 2002 Chen et al 2003 possibly leading to the loss of information when elements are renamed or moved across the hierarchy Tracking changes using element level versioning helps infer high level operations such as merges splits or clones in addition to the low level operations such as inserts and deletes Jimenez 2005 Roshandel et al 2004 But such an approach requires building new tools or changing existing tools and cannot handle legacy architectural models In this paper we propose an approach that overcome
5. The architect used the synchronization tool to compare the two views As he was the least sure about how he represented the circuitModel component in Acme he decided to focus on that component first sampkluw tex 16 12 2007 13 23 p 30 Differencing and Merging of Architectural Views 31 channelRouteViewer J placeRouteViewer floorPlanViewer circuitModel a Top level Acme model circuit channel aS i i ji route f D titi partitioner Ne 3 foorPlanner Li l place m b Acme representation of the circuitModel component Figure 19 As designed Aphyds architecture represented in Acme The tool detected a few renames e g ArchJava uses model in stead of circuitModel and inside that representation ArchJava uses globalRouter instead of route See Figure 21 The architect was particularly intrigued that the Acme representation for circuitModel had more connectors than the ArchJava implementation In Figure 21 the tool only matched the starConnector which connects components circuit partitioner floorPlanner place route and channel in Figure 19 The architect investigated this further and confirmed that the Acme connectors corresponding to the thick data flow arrows in the informal diagram in Figure 18 are not in the implementation Since Aphyds was written for academic study and not for industrial application it is missing some of the data fl
6. once the user assigns types to components ports and connectors For instance the tool infers the role type sourceT based on the source component type Filter source port type inputT and connector type Pipe In this case the synchronization produces the edit script in Figure 17 Since the user mapped the types the edit script elements already have types Each view element that already has a type is displayed using the same type and style dependent visualization that it would have in AcmeStudio If the user does not specify architectural types and styles the elements that the edit script will create will be untyped Of course the user can set the types on the newly inserted view elements at a later point in AcmeStudio Although assigning types during synchronization seems to duplicate functionality it may affect the edit script and the view merging as explained below sampkluw tex 16 12 2007 13 23 p 28 Differencing and Merging of Architectural Views 29 For instance when a component instance is assigned the Filter component type it inherits any ports declared on that type e g ports input and output of types inputT and outputT So the tool need not create additional ports of these types on the component instance Based on the user s selection in Figure 16 the tool matches the Arch Java port portOut since it only provides methods with the Acme type outputT The tool suggests renaming the port portOut of type outputT to
7. the optimal successor set mapping for B b is D d E e Finally in Figure 8 MDIR computes the cost of matching B to a At the end of this phase MDIR has determined the best suc cessor set mapping and stored it for the next phase when MDIR will retrieve the best matches MDIR could avoid keeping the optimal successor set mapping for each node pair in the first phase to reduce the space complexity to O N But it is simpler conceptually to store this information and this is how we currently implemented MDIR In the second phase MDIR uses a recursive procedure to com pute the match list i e to determine what node corresponds to what sampkluw tex 16 12 2007 13 23 p 14 Differencing and Merging of Architectural Views 15 Step Work List Match List 1 A a 2 B b F f G g A a 3 F f G g D d E e A a B b 4 G g D d E e A a B b F f 5 D d E e A a B b F f G g Figure 9 Computing the match list other node MDIR uses the following recursive formulation The list of matches for subtree pair rooted at x y consists of x y in addition to the list of matches of each pair in the successor set mapping of x y In Figure 9 MDIR starts with A a The successor set mapping of A a is B 6 F f G g So MDIR first adds A a to the match list and then adds the pairs B b F f and G g to the work list Then MDIR pops B b from the work list adds it to the match lis
8. Abi Antoun et al 2007 We chose HillClimber because it uses a framework A product line architecture often uses a framework as its platform and one often needs to compare variants in a product line Chen et al 2003 The implementation technology ArchJava also made it easy to extract the as built C amp C view As Designed Architecture The applications that use the CIspace framework follow a simple high level design An application window uses a canvas to display nodes and edges not shown of a graph in order to demonstrate the algorithms provided by the engine See Figure 27 As Built Architecture We first ran the tool to synchronize the as designed C amp C view in Figure 27 with a C amp C view retrieved from the HillClimber implementation In this case the top level structure of the as designed view was not sufficiently detailed i e the various nodes have roughly the same number of ports In such cases structural com sampkluw tex 16 12 2007 13 23 p 37 38 Differencing and Merging of Architectural Views egend Components Connectors Ports CallReturnConn O us j provide Figure 27 Base design for a CIspace framework application parison alone can produce inaccurate results In this case the MDIR algorithm incorrectly matched the top level element graph in one view to window in the other view So the user manually forced the matches between the top level nodes in the two views and re ran the com
9. But in our case this solution could produce a sub optimal edit script because it is possible that a few ancestors were deleted while the forciblv matched node was not deleted This requires distinguishing between individual delete operations and mass delete operations We therefore allow the deletion of ancestors of the forciblv matched node on the condition that this deletion operation is not part of a subtree deletion operation Whenever an ancestor is deleted at least one of its descendents which is itself an ancestor of the forciblv matched node must be part of the successor set The base case of the recursive BACKTRACK procedure enforces this constraint When computing the best cost for the i j entry of the cost matrix if is an ancestor of a forcibly matched node BACKTRACK does not record in BestSolution any mapping that deletes the branch leading to the forcibly matched node Instead BACKTRACK records a mapping that deletes a few in termediate nodes on the path from to the forcibly matched node This feature is not shown in the pseudo code to keep the latter manageable 3 5 RUNTIME AND MEMORY COMPLEXITY An upper bound on the running time of the MDIR algorithm is as follows let X be the set of nodes of both trees x be an element of X p be the maximum allowable size of a connected subgraph of the tree that can be deleted or inserted in the middle of the tree f r p be the number of nodes that lie within a distance of p
10. Entity Beans Legend Components Connectors Ports Roles EntityBeanT iz BeanChannel E ResponsePortT E ResponseRoleT Dbwrite E RequestPortT DbRequestRoleT E DbPortT RequestRoleT SessionBeanT e DbRead DbRespondRoleT Figure 24 Duke s Bank documented architecture in Acme the components were added inside the Acme representation of an EJB container shown as a thick border Session and Entity Beans are grouped sampkluw tex 16 12 2007 13 23 p 35 36 Differencing and Merging of Architectural Views AccountControllerBean 17f192 7 CustomerControllerBean 1feeb85 AccountBean e55475 CustomerBean_13a5041 E a TxBean 433461 TxControllerBean B1e8B1f l TT l Figure 25 Duke s Bank recovered architecture in Acme instance The goal was to make the recovered C amp C view in Figure 25 comparable to a typical C amp C view where each component instance represents any number of runtime components Matching Types In this case the as built view and the as designed view use the same architectural style and types so the architect skipped the optional step of matching types Matching Instances The differencing tool correctly detected the moves corresponding to replacing the container component in one view with its representation in the other view See Figure 26 Because a tool generated the names in the recovered view e g AccountBean e55475 there was a large n
11. Instances a5 DukesBankApp_Documented Compare 5 DukesBank pp Components 9 Components container clear AccountBean_e55d75 Ports Gin Oz DbClientReadPort E Components oy DbClientWritePort BWC account ean B respons G Ports uei AccountControllerBean_17f1f92 Oy p EA 69 Ports fl 04 pO Scroll E oy p Oy pl rm request Account_Controller_Bean Report CustomerBean 13a5041 Customer_Bean 4 Ports Customer_Controller_Bean hr Og DbChentReadPort G DB oy DbClientWritePort ic Tx_Bean EQ Oy response Tx Controller Bean _ Reset CustomerControllerBean 1fee685 Connectors E Ports il 43 BeanChanneld F Oy request il 93 BeanChanneli DataSource 0 1 BeanChannel2 rt Bhd na E Properties amp messages 4 search Properties Properties Name Value Name vValue AcmeFamilvTvpe ejbFam EntityBeanT AcmeFamilvTvpe ejbFam EntitvBeanT lt Back Next gt Finish Cancel Figure 26 Comparison of the Duke s Bank documented and recovered architectures The subject system HillClimber is part of a collection of Java ap plications to graphically demonstrate artificial intelligence algorithms built on the CIspace framework In particular HillClimber demon strates stochastic local search algorithms for constraint satisfaction problems HillClimber was re engineered from about 15 000 lines of Java
12. Match Acme and ArchJava Instances Please check the Messages tab 0 error s 21 warning s Found Acme Instances ArchJava Instances 235 HillClimber Compare 585 Hill E G Components id E Components 8 canvas ae il G canvas engine engine Fl B graph Sa e FE Ports Fl QB window fe repengine H Connectors Components conni sc i greedvSearch fp conn2 to greRRSearch connd Scroll mchSearch conn B privateengine tf conn8 Report randSearch rdWkSearch i simAnnealSearch Match ic simpleSearch Ee simRanSearch eae il B traceDialog Connectors graph window f G Ports B repwindow Components batchDialog E optionsDialog privatewindow B problemDialog C Connectors lt Back Next gt Cancel Figure 28 Manual overrides improve matching the instances The user forced a match between the engine nodes in the two trees by selecting them both and clicking on the Match button before running the differencing algorithm Legend Components Connectors Ports use CallReturnCann Q ir provide Figure 29 The as built HillClimber architecture sampkluw tex 16 12 2007 13 23 p 39 40 Differencing and Merging of Architectural Views named and manv nodes match exactiv This enables them to recognize the unchanged nodes first and use them as landmarks to efficiently identify the other changes However these algorithms are unable
13. Matching types between Acme left and ArchJava right Merging Instances The architect next turned his attention to the additional top level component shown as privateAphyds in Figure 21 Based on the synchronization options he selected he determined that the tool created privateAphyds to represent a private window port in ArchJava and the corresponding glue After looking at the con trol flow the architect assigned that subsystem the Publish Subscribe Acme style He also renamed component privateAphyds to window renamed the added connector to windowBus and assigned windowBus the EventBusT connector type from the style The architect also decided sampkluw tex 16 12 2007 13 23 p 32 Differencing and Merging of Architectural Views 33 ox Synchronize Acme and ArchJava Step 3 of 5 Match Acme and ArchJava Instances Please check the Messages tab 0 error s 29 warninafs Found Acme Instances ArchJava Instances 5 Aphvds Step3a Compare E 5 Aphvds Ga Components ve Components channelRouteviewer Clear channelRouteViewer circuitModel NoorplanDialog Ga Ports z H model ep channel stow 9 Ports o circuit 5 repmodel oy floorPlanner Order GI Components op partitioner A l channelRouter o place Scroll circuitData oj route 21 Floorplanner repmodel Report il G globalRouter Components PN il QB partitioner channel ee fl G placer circuit Fi Connectors il G floo
14. and rules it satisfies And one type is a subtype of another if the predicate of the first type implies the predicate of the second type Such a type system is highly desirable at design time because it allows designers to combine type specifications in flexible ways Acme embodies this approach but is hardly unique for instance PVS Rushby et al 1998 takes a similar approach As an example of using a predicate based type system consider an architecture that is a hybrid of the Pipe and Filter and Shared Data architectural styles In this example a Filter component type has at least one input and one output port while a Client component in the Shared Data style has at least one port to communicate with the repository A component in this architecture might inherit both the Filter and the Client specifications yielding a component that has at least three ports two for communicating with other filters and one for communicating with the Repository However implementation level type systems such as the ones pro vided by C2SADL Medvidovic et al 1996 or ArchJava cannot ex press the example above A specification that a component has a port implies a requirement that the environment will match that port up with some other component Therefore conventional type systems re quire a component type to list all of the ports it might possibly have or at least all those ports that are expected to be connected at runtime There is no way
15. at one level in the hierarchy is just a transparent view of a more detailed decomposition specified by the representation of that component Multiple representations for a given component or connec tor could correspond to alternative ways of decomposing an element On the other hand ArchJava views hierarchy in terms of integration of existing components along with glue code into a higher level compo nent Due to the glue a higher level component is semantically more than the sum of its parts These differing views of hierarchy create additional challenges for architectural synchronization For example if multiple representations are present at the design level there must be a way to specify which of these representations was actually implemented Matching Instances Obtaining the tree structured data from Acme simply converts the Acme architectural graph into the cross linked tree structure discussed earlier Acme does not have first class constructs for required and provided methods In keeping with Acme s model for extensible properties the tool adds properties on a port to represent its provided and required methods as well as other salient properties e g the port s visibility To obtain the tree structured data from an ArchJava implement ation the tool traverses the compilation units ignores classes that are not component types and fields that are not of component type Differ ent modeling choices are possible in this case Fi
16. have two separate implementations for MDIR and THP with some shared procedures It is necessary to limit the running time on trees with more than a few hundred nodes and when the average degree of a non leaf node is greater than four We enforce a bound R on the number of recursive calls of the backtracking search corresponding to a given subtree pair Although bound R removes the guarantee of optimality by limiting the number of recursive calls MDIR still obtains good results empirically Since MDIR uses the branch and bound technique a good match allows for tight bounds and therefore early cutting of branches The search ter minates normally for matrix entries that correspond to good matches and is interrupted only when the match is not good This allows MDIR to return an optimal match even if the backtracking is interrupted during the computation of cost matrix entries corresponding to matches that are not part of the optimal solution 3 3 ILLUSTRATIVE EXAMPLE In this section we illustrate the MDIR algorithm on a small example MDIR exhaustively computes from bottom to top the cost of mapping each node in T to every other node in T The computed costs are stored in a cost matrix Following the dynamic programming paradigm MDIR uses the comparison on the high depth nodes to compare the low depth nodes The example also illustrates the usefulness of the successor set approach since bipartite matching cannot match subtree nodes bec
17. level predicate based type system is fundamentally in compatible with a type system for a programming language As a result the matching algorithm may not rely on exactly matching typing in formation as in UMLDiff Xing and Stroulia 2005 In our approach the user specifies arbitrary matches between the type hierarchies from Acme and ArchJava flattened and shown side by side Consider synchronizing the Acme model of a simple system follow ing the Pipe and Filter style with its ArchJava implementation In Figure 16 the user matches the types as follows The user selects the Capitalize CharBuffer Lower Merge Split Upper compo nent types in ArchJava and matches them with Filter Acme type All the component instances of these ArchJava types will be assigned the Filter Acme type during synchronization Using a limited form of wildcards the user assigns the Acme type Pipe to the ArchJava connector type ANY So any Acme connector created for an implicit ArchJava connector instance will have that type Since ArchJava ports are not typed the user can individually assign to an ArchJava port a set of Acme port types To reduce the manual sampkluw tex 16 12 2007 13 23 p 27 28 Differencing and Merging of Architectural Views ax Synchronize Acme and ArchJava Step 4 of 5 Optional View and Modify the Edit Script Please check the Messages tab 0 error s 30 warning s Found Acme Instances B FB Ca Components A Select All MALY
18. to match nodes based on structure alone or based on structure and highly non unique semantic information such as entity types For instance a heuristic solution with a worst case O N supporting arbitrary move copy and glue operations was tested on instances where more than 80 of the nodes matched exactly Chawathe and Garcia Molina 1997 As a result these algorithms are less suitable for comparing architectural views as they will perform poorly when all the nodes are renamed or when most of the renamed nodes are concentrated in one area of the tree such as when entire subtrees are renamed This may be atypical when comparing two versions of a given program or a design model at a given level of abstraction In our architectural views most names are transient or automatically generated Both THP and MDIR would still work even in the total absence of semantic information i e using tree structure only For instance in the Aphyds and Duke s Bank examples our inputs had more than half of their nodes renamed Finally none of these algorithms offer the ability to manually force or prevent matches It may be possible to easily add the ability to prevent matches to some of them e g JDiff but adding the ability to force matches could be substantially more complicated Tree Alignment vs Tree Edit Tree differences can be repre sented using tree alignment instead of tree edit distance Each align ment of trees actually corresponds to a r
19. to express that a Filter component has at least two ports instead one must say that the Filter has at most or exactly two ports Therefore in the implementation one cannot sampkluw tex 16 12 2007 13 23 p 26 Differencing and Merging of Architectural Views 27 ax Synchronize Acme and ArchJava Step 2 of 5 Optional Match Acme and ArchJava Types Optionally view and match Acme types and ArchJava types Acme Types if ArchJava Types 408 PipesandFikersFam Match E E Archjava a Component Types ko a Component Types BinaryFilter Unmatch il G ANY 4 B DataSink B Capitalize DataSource wG CharBuffer Reset Filter H Q Lower Connector Types S x B Main a Pipe Show F Merge G Port Types a G Split o inputT Order Streamsink outputT amp StreamSource Role Types Check H Upper 3 Connector Types ef ANY DB Port Types o ANY PROVIDE_AND_REQUIRE c PROVIDE_ONLY lt REQUIRE ONLY 09 Role Types Acme Properties ArchJava Properties Name Name Value AcmeFamily Type PipesAndFiltersFam outputT lt teck Net cence Figure 16 Matching types between the as designed Acme model of a simple system following the Pipe and Filter style with its as built ArchJava implementation combine the Filter type with a Repository component type which defines a third port that is prohibited by the filter specification So a design
20. 6 37 Chen P H M Critchlow A Garg C van der Westhuizen and A van der Hoek 2003 Differencing and Merging within an Evolving Product Line Architecture In Proceedings of the 5th International Workshop on Software Product Family Engineering pp 269 281 Clements P F Bachman L Bass D Garlan J Ivers R Little R Nord and J Stafford 2003 Documenting Software Architecture View and Beyond Cambridge Massachusetts Addison Wesley Conte D P Foggia C Sansone and M Vento 2004 Thirty Years of Graph Matching in Pattern Recognition International Journal of Pattern Recognition and Artificial Intelligence 18 3 265 298 Dashofy E M A van der Hoek and R N Taylor 2002 Towards Architecture Based Self Healing Systems In Proceedings of the First Workshop on Self Healing Systems pp 21 26 Dickinson P J H Bunke A Dadej and M Kraetzl 2004 Matching Graphs with Unique Node Labels Pattern Analysis and Applications 7 3 243 254 Easterbrook S and B Nuseibeh 1996 Using ViewPoints for Inconsistency Management Software Engineering Journal 11 1 31 43 Egyed A 2006 Instant Consistency Checking for the UML In Proceeding of the 28th International Conference on Software Engineering pp 381 390 Eixelsberger W M Ogris H Gall and B Bellav 1998 Software Architecture Re covery of a Program Family In Proceedings of the 20th International Conferen
21. Differencing and Merging of Architectural Views Marwan Abi Antoun Jonathan Aldrich Nagi Nahas Bradley Schmerl and David Garlan Carnegie Mellon University Pittsburgh PA 15213 USA October 1 2007 Abstract Differencing and merging architectural views is an important activity in software engineering However existing approaches are still based on restrictive assumptions such as requiring view elements to have unique identifiers or exactly matching types which is often not the case in many application domains We propose an approach based on structural information We generalize a pub lished polynomial time tree to tree correction algorithm that detects inserts re names and deletes into a novel algorithm that additionally detects restricted moves Our algorithm also supports forcing and preventing matches between view elements We incorporate the algorithm into tools to compare and merge Component and Connector C amp C architectural views We provide an empirical evaluation of the algorithm We illustrate the tools using extended examples and use them to detect and reconcile interesting differences between real architectural views Keywords tree to tree correction view synchronization graph matching 1 Introduction The software architecture of a system defines its high level organization as a collection of runtime components connectors their properties and constraints on their interaction Such an architecture is commonl
22. ause of the need to preserve the hierarchical constraints MDIR starts by computing the cost of matching D to d Figure 4 Similarly MDIR computes the costs of matching D e D f D g E d E e E g In Figure 5 MDIR computes the cost of matching B to d Then MDIR computes the cost of matching B to b Figure 6 This requires knowing the cost of the optimal successor set mapping for B and b At this point MDIR has computed the costs of matching every descendent of B to any node in the second tree because of the post ordering of the trees sampkluw tex 16 12 2007 13 23 p 12 13 Differencing and Merging of Architectural Views Figure 4 Cost D d cost of editing label of D to d i e the measure of similarity between the labels in this case 0 kK D f 2 D g 3 D b D a E f 1 E g 2 E b E a B d 14B e B f B 9 B b B a Figure 5 Cost B d Cost deleting B s children CosT editing B s label Assuming the cost of a deletion is 5 times a unit cost CosT B d CosT deleting D CosT deleting E CosT editing B s label 5 5 2 o Q Ti 2 6 D d 0 D e 1 D f 2 D g 3 D b D a E d 1 E e 0 E f 1 E g 2 E b E a B d 12 B se AB h B gie AB b 0 B
23. capitalize mal MA M G Ports Check y portin MA portout repcapitalize Components s JE b Show A Ca Ports m mapi Order m A Ga Ports ML s 5 Posts v portin Eie portouti capitalize fey portout2 Representation repcapitalize 2 Mou v gt Auto Correct Properties amp Messages Element Type Element Name Description Port portOut Rename port portOut of type outputT to output to match port in component type Filter Port portIn Rename port portin of type inputT to input to match port in component type Filter Port portin Rename port portin of type inputT to input to match port in component type Filter Port portOut1 Rename port portOut1 of type outputT ta output to match port in component type Filter Port portin Rename port portin of type inputT to input to match port in component type Datasink lt Back Next gt Cancel Figure 17 Validating the edit script can involve renaming some ports to match the names declared in the Acme type work the user use another form of wildcards He can assign an Acme type e g outputT to any ArchJava port that only provides methods Similarly he can assign the inputT Acme type to any ArchJava port that only requires methods In addition AcmeStudio defines connec tion patterns for most architectural styles Based on these patterns the tool can infer the Acme role types
24. ce on Software Engineering pp 508 511 Erdogmus H 1998 Representing Architectural Evolution In Proceedings of the Conference of the Centre for Advanced Studies on Collaborative Research pp 159 177 Garlan D R T Monroe and D Wile 2000 Acme Architectural Description of Component Based Systems In G T Leavens and M Sitaraman eds Foundations of Component Based Systems Cambridge University Press pp 47 68 Hlaoui A and S Wang 2002 A New Algorithm for Graph Matching with Application to Content Based Image Retrieval In Proceedings of the Joint IAPR International Workshop on Structural Syntactic and Statistical Pattern Recognition pp 291 300 Jiang T L Wang and K Zhang 1994 Alignment of Trees An Alternative to Tree Edit In Proceedings of the 5th Annual Symposium on Combinatorial Pattern Matching pp 75 86 Jimenez A M 2005 Change Propagation in the MDA A Model Merging Approach Master s thesis University of Queesland sampkluw tex 16 12 2007 13 23 p 43 44 Differencing and Merging of Architectural Views Krikhaar R A Postma A Sellink M Stroucken and C Verhoef 1999 A Two Phase Process for Software Architecture Improvement In Proceedings of the IEEE International Conference on Software Maintenance pp 371 380 Mandelin D D Kimelman and D Yellin 2006 A Bayesian Approach to Diagram Matching with Application to Architectural Mod
25. ction In this section we describe a novel algorithm for unordered labeled trees MDIR Moves Deletes Inserts Renames which generalizes a re cent optimal tree to tree correction algorithm Torsello et al 2005 which we will refer to as THP We also implemented THP for experi mental comparison with MDIR Section 4 3 1 PROBLEM DEFINITION We first give an unambiguous definition of the problem adapted from Sasha and Zhang Shasha and Zhang 1997 We denote the i node of a labeled tree T in the postorder node ordering of T by Ti IT sampkluw tex 16 12 2007 13 23 p 7 8 Differencing and Merging of Architectural Views denotes the number of nodes of T We define a triple M Ti 72 to be a mapping from T to T2 where M is any set of pairs of integers i j satisfving the following 1 1 lt i lt Fil lt j lt 721 2 For any pair of 1 ji and i2 j2 in M with in iz if and only if ji ja one to one Til af is an ancestor of Ti ig if and only if T j1 is an ancestor of To j2 ancestor order preserved We will use M instead of M Ti T2 if there is no confusion To delete a node N in tree T we remove node N and make its children become the children of the parent of N To insert a node N in tree T as a child of node M we make N one of the children of M and we make a subset of the children of M become children of N See Figure 1 Renaming a node only updates its label and preserves any pr
26. e view could correspond to a documented architecture The second view could correspond to a C amp C view recovered using any architectural recovery technique e g Schmerl et al 2006 The second view could also be sampkluw tex 16 12 2007 13 23 p 24 Differencing and Merging of Architectural Views 25 another C amp C view either retrieved from a configuration management system or corresponding to a variant in a product line Synchronizing an as designed C amp C view with an as built C amp C view must often address expressiveness gaps between architectural infor mation at different levels of abstraction Although we use Acme and ArchJava to illustrate some of these differences that must be bridged synchronizing any pair of as designed and as built C amp C views may encounter similar challenges Structural Differences There will always be name differences of the same structural information between Acme and ArchJava For in stance an ArchJava port can be named in a reserved keyword in Acme Even if code generation automatically produces a skeleton implement ation from the architectural model connector names and role names are lost since ArchJava does not even name those elements Finally in Acme port names are critical for typechecking But in ArchJava port names are unimportant and obey the standard programming language notions of binding and scope Hierarchy Acme treats hierarchy as design time composition where a component
27. eights L i j cost of string to string correction to change LABEL i to LABEL j MDIR Ti T2 Postorder T and To nodes for i 1 to Ti size do for j 1 to T size do BestSuccessor i j SEARCH i j CostMatrizjijiji Cost BestSuccessor i j L ij GETBESTMATCHING T size T gt size SEARCH 7 gt i j indices in trees Ti and To respectively Let L be the list of pairs p q where pis a descendent of i and q is a descendent of j Sort L according to MatchMerit p q Set BestSolution empty list Set CurrentSolution empty list Set BestCost infinity BACKTRACK O index L 0 CurrentCost return BestSolution Figure 2 Pseudo code of the algorithm sampkluw tex 16 12 2007 13 23 p 10 Differencing and Merging of Architectural Views 11 BACKTRACK indez L P Search for a good mapping between subtrees b index position reached in list L gt L list of pairs of nodes m n sorted by merit gt CurrentCost sum of the cost of the elements in CurrentSolution if no element of L can be added to CurrentSolution gt Base case then if CurrentCost cost of deleted subtrees lt BestCost then BestSolution CurrentSolution BestCost CurrentCost return foreach element l m n in L starting at index do if CurrentSolution already contains m n or any of their ascendants or descendents then continue if adding to current mapping violates bound B then continue Add cost of match to Curr
28. els In Proceedings of the 28th International Conference on Software Engineering pp 222 231 Medvidovic N and V Jakobac 2006 Using Software Evolution to Focus Architec tural Recovery Automated Software Engineering 13 2 225 256 Medvidovic N P Oreizy J E Robbins and R N Taylor 1996 Using Object Oriented Typing to Support Architectural Design in the C2 Style In Proceedings of the 4th ACM SIGSOFT Symposium on Foundations of Software Engineering pp 24 32 Medvidovic N and R N Taylor 2000 A Classification and Comparison Frame work for Software Architecture Description Languages IEEE Transactions on Software Engineering 26 1 70 93 Mehra A J Grundy and J Hosking 2005 A Generic Approach to Supporting Di agram Differencing and Merging for Collaborative Design In Proceedings of the 20th IEEE ACM International Conference on Automated Software Engineering pp 204 213 Melnik S H Garcia Molina and E Rahm 2002 Similarity Flooding A Ver satile Graph Matching Algorithm and Its Application to Schema Matching In Proceedings of the 18th International Conference on Data Engineering pp 117 128 Mens T and P Van Gorp 2005 A Taxonomy of Model Transformation In Proc Int l Workshop on Graph and Model Transformation Messmer B 1996 Efficient Graph Matching Algorithms for Preprocessed Model Graphs Ph D thesis University of Bern Monroe R 2001
29. entCost to obtain NewCost Get a lower bound F of remaining cost using MatchMerit if E NewCost gt BestCost then continue Add to CurrentSolution BACKTRACK index 1 L NewCost Remove from CurrentSolution GETBESTMATCHING i j Deduce the optimal mapping gt i j pair of nodes belonging to best possible mapping between the two trees foreach element e m n in BestSuccessor i j do Add e to BestGlobalMatch GETBESTMATCHING m n Figure 3 Pseudo code of the algorithm continued manipulation For example during the backtracking search checking whether a node is still available is a single bitwise AND operation instead of a loop over an array MDIR can be considered a generalization of THP because THP only handles the case where B 0 i e only the children of a node can be sampkluw tex 16 12 2007 13 23 p 11 12 Differencing and Merging of Architectural Views in a successor set of that node This produces a fullv polvnomial time algorithm that is tvpicallv much faster than our generalized algorithm Handling non zero values of B allows MDIR to detect hierarchical moves MDIR is guaranteed to find the optimal matching within the constraints of the bound B provided it is allowed to run long enough In principle one could use the same implementation for both THP and MDIR and adjust MDIR s parameters to simulate THP e g by modifying the SEARCH procedure in Figure 2 accordingly However we currently
30. estricted tree edit in which all the insertions precede all the deletions Algorithms based on tree align ment can detect unbounded deletes and can generalize to more than two trees something not easily done with tree edit distance algorithms Jiang et al 1994 But the memory requirements of tree alignment algorithms for the tree sizes and branching factors that are typical of our inputs would be several orders of magnitude higher than those of MDIR O 224N where d is the maximum degree of the tree Graph Matching Approaches Exhaustive graph matching algo rithms based on variants of the A algorithm Messmer 1996 do not scale beyond a few dozen nodes Hlaoui and Wang 2002 In the context of architectural views Sartipi proposed an approach for architectural recovery using a variant of the A graph matching algorithm but with an optimization that may cause it to miss the optimal solution in some cases Sartipi and Kontogiannis 2003 More scalable heuristic based approaches such as spectral methods perform poorly when the graphs are not nearly isomorphic Further more these algorithms occasionally miss the optimal solution Conte et al 2004 Others such as the Similarity Flooding Algorithm SFA sampkluw tex 16 12 2007 13 23 p 40 Differencing and Merging of Architectural Views 41 have an accuracy of around 50 Melnik et al 2002 The accuracy of MDIR is above 90 on a roughly similar range of graph sizes Fu
31. h edits that follow the probability distribution of typical graph edits Since we did not have such a distribution we generated random trees sampkluw tex 16 12 2007 13 23 p 18 Differencing and Merging of Architectural Views 19 45096 400 350 300 250 200 150 Sub Optimal Edit Script 100 50 0 0 10 20 30 40 50 60 70 Sub Optimal Edit Script vs Renames Renames mTHP N 1280 E THP N 640 a THP N 320 AMDIR N 1280 R 100K MDIR N 1280 R 5K AMDIR N 640 R 100K MDIR N 640 R 5K a MDIR N 320 R 100K MDIR N 320 R 5K Figure 10 A comparison of THP and MDIR R 100K and R 5K showing the sub optimality of the edit script vs the percentage of renames 60 50 40 30 20 Sub optimal edit script 10 0 0 Sub optimal Edit Script vs Deletes 10 20 Deletes 30 40 50 THP N 1280 E THP N 640 m THP N 320 AMDIR N 1280 R 100K MDIR N 1280 R 5K AMDIR N 640 R 100K MDIR N 640 R 5K a MDIR N 320 R 100K MDIR N 320 R 5K Figure 11 A comparison of THP and MDIR R 100K and R 5K showing the sub optimality of the edit script vs the percentage of deletes sampkluw tex 16 12 2007 13 23 p 19 20 Differencing and Merging of Architectural Views Sub Optimal Edit Script vs Internal Deletes 300 250 200 150 100 S
32. hallenges in differencing and merging architectural views the underlying assump tions and the limitations of our approach Section 3 describes our novel tree to tree correction algorithm Section 4 presents an empir ical evaluation of the algorithm In Section 5 we use the algorithm to synchronize architectural C amp C views Section 6 illustrates the approach using extended examples on real architectural views Finally we discuss related work in Section 7 and conclude 2 Architectural View Differencing Software architects rely on multiple architectural views where a view represents a set of system elements and the relationships between them Views can be of different viewtypes where each viewtype defines the element types and the relationship types used to describe a software system from a particular perspective Clements et al 2003 Since a view is generally described as a graph view differencing and merging is a problem in graph matching Graph matching measures the similarity between two graphs us ing the notion of graph edit distance i e it produces a set of edit operations that model inconsistencies by transforming one graph into another Conte et al 2004 Typical graph edit operations include the deletion insertion and substitution of nodes and edges Each edit operation is assigned a cost The costs are application dependent and model the likelihood of the corresponding inconsistencies Typically the more likely a cer
33. he 10th International Workshop on Program Comprehension pp 3 10 Torsello A D Hidovic Rowe and M Pelillo 2005 Polynomial Time Metrics for Attributed Trees IEEE Transactions on Pattern Analysis and Machine Intelligence 27 7 1087 1099 van der Westhuizen C and A van der Hoek 2002 Understanding and Propagating Architectural Changes In Proceedings of the Working IFIP Conference on Software Architecture pp 95 109 Wagner R A and M J Fischer 1974 The String to String Correction Problem Journal of the ACM 21 1 168 173 Wang Y D J DeWitt and J Y Cai 2003 X Diff An Effective Change De tection Algorithm for XML Documents Proceedings of the 19th International Conference on Data Engineering pp 519 530 Xing Z and E Stroulia 2005 UMLDiff an Algorithm for Object Oriented Design Differencing In Proceedings of the 20th IEEE ACM International Conference on Automated Software Engineering pp 54 65 Zhang K and T Jiang 1994 Some MAX SNP Hard Results Concerning Unordered Labeled Trees Information Processing Letters 49 5 249 254 sampkluw tex 16 12 2007 13 23 p 45 sampkluw tex 16 12 2007 13 23 p 46
34. hierarchical archi tectural views While not all architectural views are hierarchical many use hierarchy to attain both high level understanding and detail In a C amp C view the tree like hierarchy corresponds to the system decom position but cross links between the system elements form a general graph Other architectural views such as module views have similar characteristics Clements et al 2003 Many approaches are hierar chical Apiwattanapong et al 2004 Raghavan et al 2004 Xing and Stroulia 2005 So our choice is hardly new However we relax the constraints of existing approaches as follows No Unique Identifiers For maximum generality we do not re quire elements to have unique identifiers as in other approaches e g Chen et al 2003 Mehra et al 2005 As mentioned earlier this assumption alone enables the use of exact and scalable algorithms that can handle thousands of nodes Dickinson et al 2004 Unfortunately architectural view elements often do not have unique identifiers No Ordering In the general case an architectural view has no inherent ordering amongst its elements This suggests that an un ordered tree to tree correction algorithm might perform better than one for ordered trees Many efficient algorithms are available for ordered labeled trees e g Shasha and Zhang 1997 In comparison tree to tree correction for unordered trees is MAX SNP hard Zhang and Jiang 1994 Some algorithms for
35. his optimization could vield a significant performance gain and allow the implementation to handle larger values of B as well as trees of larger degree than is currentiv possible Another optimization would be to reduce the number of nodes in the trees under comparison e g bv adding some nodes as attributes on their parents In summarv MDIR has a dramaticallv improved accuracv over THP and an acceptable non interactive performance for manv usage scenar ios Unlike exact graph matching algorithms it can scale to thousands of nodes and can handle realistic architectural views as the extended examples in Section 6 will demonstrate 5 Architectural View Svnchronization In this section we use tree to tree correction algorithms to svnchronize hierarchical graphs corresponding to C amp C views 5 1 GENERAL APPROACH We represent the structural information in a C amp C view as a cross linked tree structure that mirrors the hierarchical decomposition of a system The tree also includes some redundant information to improve the accuracy of the structural comparison For instance the subtree of a node corresponding to a port includes additional nodes for all the port s involvements i e all the components and their ports reachable from that port Each node is decorated with properties such as type information The type information if provided populates a matrix of incompatible nodes that may not be matched That matrix also includes op
36. ints into account to improve the overall match The user can specify any set of constraints as long as they preserve the ancestry relation between the forcibly matched nodes In particular if a is an ancestor of b a is forcibly matched to c and b is forcibly matched to d then c must be an ancestor of d Optional Type Information Architectural views may be un typed or have different or incompatible type systems This is often the case when comparing views at different levels of abstraction such as an as designed conceptual level view with an as built implementation level view Therefore an algorithm should not rely on matching type information and should be able to recover a correct mapping from structure alone if necessary or from structure and type information if type information is available An algorithm could however take advan sampkluw tex 16 12 2007 13 23 p 5 6 Differencing and Merging of Architectural Views TI T2 rename a gt a delete a insert a Figure 1 Tree edit operations sampkluw tex 16 12 2007 13 23 p 6 Differencing and Merging of Architectural Views 7 tage of tvpe information when available to prune the search space bv not attempting to match elements of incompatible tvpes If the view elements are represented as tvped nodes at the verv least an algorithm should not match nodes of incompatible tvpes e g it should not match a connector x to a component y If architectural sty
37. iss Figure 6 CosT B b CosT successor set mapping of B b CosT editing the label of B to b CosT D d and CosT E e have been previously computed thus CosT B b CosT D d CosT E e 0 sampkluw tex 16 12 2007 13 23 p 13 14 Differencing and Merging of Architectural Views Figure 7 Computing the cost of matching B to b requires the successor set mapping of the pair B b The successor set mapping of B b is the set D d E e Fu KAN T D d 0 D e 1 3 J D a E d 1 E e 0 E ac 1 ae la 5 je a T B d 14 B wae B B jed B b 0 B 7 Figure 8 CosT B a Cost successor set mapping of B a CosT editing the label of B to a Cosr deleting b f and g The optimal successor set mapping corresponding to the pair B b is computed as follows Figure 7 First take all the node pairs where the first item is a descendent of B and the second item is a descendent of b i e the set K D d 4 D e E d E e The optimal mapping will clearly be a subset of this set To obtain that optimal mapping we examine all mappings except the ones that have been pruned because the bounds on their cost showed they could not be optimal The other constraint is if x y is a pair in a mapping neither x nor y nor any of their ascendents or descendents can appear in any other pair in the same mapping Thus
38. it is sufficient to find an optimal mapping 3 2 EXPLANATION OF THE ALGORITHM The algorithm s pseudo code is shown in Figures 2 and 3 Let C i j be the cost of the optimal mapping from the subtree rooted at 7 to the subtree rooted at j A set of nodes S i is a successor set of node i if it is a subset of the set of descendents of i none of the elements of S i sampkluw tex 16 12 2007 13 23 p 8 Differencing and Merging of Architectural Views 9 is an ancestor of another and each node of the subtree rooted at is either a descendent or an ancestor of an element of S i Given two sets S i where i belongs to T and S j and j belongs to Tb it is possible to define the optimal mapping of S i to S j as a one to one function from a subset of S i into S j with least cost The cost of mapping element k of S i to element l of S j is equal to cost of the optimal mapping of the subtree rooted at k to the subtree rooted at l The cost of leaving an element k of S i without image is equal to the cost of deleting the whole subtree rooted at k The cost of having an unmatched element l in S j is equal to the cost of inserting the entire subtree rooted at l This suggests that if we know all the costs C di da where d is a descendent of i and d is a descendent of j it is possible to compute C j by considering all possible pairs of sets S S j and for each such pair getting the minimum weight bipartite matching defi
39. ith B typically a small integer The reasoning behind this constraint is that nodes are not usually moved too far from their original positions in a hierarchy It is also relatively rare for several non leaf siblings to be deleted at the same time The bound B has the additional benefit that only relatively small neighborhoods of each node have to be considered for the computation of the optimal cost of a single subtree pair This also enables performing many operations efficiently using bit sampkluw tex 16 12 2007 13 23 p 9 10 Differencing and Merging of Architectural Views BestSolution list of node pairs that represents the best discovered matching between successor sets of two nodes where a successor set of node i is denoted by S i CurrentSolution dynamic list of node pairs that represents a matching being built between successor sets of two nodes CostMatriz CostMatriz i j is the cost of the optimal mapping from S i to S j BestCost cost of the BestSolution matching BestGlobalMatch array of node pairs corresponding to least cost mapping from T1 to T2 BestSuccessor 2D array of lists of node pairs m n BestSuccessor i j means m n is a match between one element of S i and one element of S j in an optimal mapping from S i to S j MatchMerit i j measure of the similarity i e quality of matching not cost between nodes and j deduced from Cost Matrizjillji as 1 CostMatriz i j sum of subtree w
40. le information is available additional architectural types may be available and could be used for similar purposes For instance an algorithm can avoid matching a component of type Filter from a Pipe and Filter architectural style to a component of type Repository from a Shared Data architectural style Shaw and Garlan 1996 Disconnected Stateless Operation For maximum generality we assume a disconnected and stateless operation A few approaches require monitoring or recording the structural changes while the user is modifying a given view Jimenez 2005 Roshandel et al 2004 Comparable Views The two views under comparison have to be somewhat structurally similar When comparing two completely different views an algorithm could trivially delete all elements of one view and then insert them in the other view In addition the two views must be of the same viewtype and must be comparable without any view transformation Checking the consistency of different but related views such as a UML class diagram and a UML sequence diagram is a problem in view integration Egyed 2006 and is outside the scope of this paper No Merging Splitting Our approach does not currently detect the merging or splitting of view elements Merging and splitting are common practice but are difficult to formalize since they affect connec tions in a context dependent way Erdogmus 1998 We leave merges and splits to future work 3 Tree to Tree Corre
41. match the output port on the Filter type The user can accept the corrective actions suggested by the tool using the Auto Correct button in Figure 17 In that case the tool automatically renames port0ut port to output and updates all the cross references in the edit script The user can also change the assigned or inferred types before pushing the changes to the Acme model 6 Extended Examples In this section we evaluate the tools for C amp C view synchronization in several extended examples on real architectural views 6 1 EXTENDED EXAMPLE APHYDS In this example we synchronize an as designed C amp C view with an as built C amp C view retrieved from an implementation This example mainly highlights the ability of the underlying MDIR algorithm to detect inserts deletes and renames The subject system is Aphyds a pedagogical circuit layout applica tion To evaluate ArchJava s expressiveness to specify the architecture in code Aphyds was re engineered into an ArchJava implementation starting with 8 000 source lines of Java code not counting the li braries used Aldrich et al 2002 We chose Aphyds since it had a doc umented as designed architecture and the ArchJava implementation enables extracting the as built C amp C view using a tool The developer of the original Java program informally drew the as designed Aphyds architecture in Figure 18 In the following discussion the architect performing the synchronization is a
42. n Languages ADLs To evaluate our approach we represented the C amp C views in the Acme general purpose ADL Garlan et al 2000 We also developed a tool to extract as built C amp C views from Arch Java Aldrich et al 2002 Similar tools can extract as built views from other an implementation constraining ADL with code generation ca pabilities or an implementation independent ADL with an implement ation framework such as C2 Medvidovic and Taylor 2000 We intended our synchronization tools to be lightweight enough that they can fit into a single dialog in an integrated development environ ment such as Eclipse Object Technology International Inc 2003 and not require a specialized environment for architectural recovery Telea et al 2002 Both AcmeStudio a domain neutral architecture modeling environment for Acme Schmerl and Garlan 2004 and Arch Java s development environment are Eclipse plugins which reduced the tool integration effort One synchronization tool can make an as designed architectural view expressed in Acme incrementally consistent with an as built architectural view extracted from an ArchJava implementation We still need to change the ArchJava infrastructure to support making incremental changes to an existing ArchJava implementation based on changes to the as designed view Another synchronization tool based on the same approach more generally takes any two C amp C views represented in Acme On
43. n the edges In particular the comparison assumes that the two views have the same number of nodes and that the nodes are identically named Furthermore in Reflexion Models views are non hierarchical which requires generating different views for each level of the hierarchy Checking the conformance of two hierarchical views using our approach can be considered a generalization of the computation of the Reflexion Model Of course there are many other less automated approaches For instance the FOCUS approach checks the conformance of an implement ation with respect to an architectural style but manually relates the as designed and the as built views Medvidovic and Jakobac 2006 Consistency Management There is significant work in the area of viewpoints view merging and inconsistency management e g East erbrook and Nuseibeh 1996 Egyed 2006 A viewpoint captures data from disparate sources into independent but interrelated units In view merging there is also a notion of knowledge order or degree i e a match can be disputed When synchronizing between an as built and an as designed architecture one may want to model incompleteness and inconsistency as a first class notion In our approach we model sampkluw tex 16 12 2007 13 23 p 41 42 Differencing and Merging of Architectural Views both views using the same viewtype arbitrarily bridging the inevitable expressiveness gaps in the process We also assume that one of the
44. ned by the entries of the cost matrix C corresponding to the elements of S i and S j Finally let L j be the cost of changing the label of node in the source tree to the label of node j in the destination tree The minimum cost obtained added to L i 7 will be equal to C i j L i j uses string to string correction to evaluate the intrinsic degree of similarity between the labels of two nodes using a standard algorithm to find the longest common subsequence Wagner and Fischer 1974 We choose the best pair S i S j using a branch and bound back tracking algorithm Let DESC i denote the set of descendents of i We try to choose a subset Q of DESC i x DESC j with minimal cost This is done by trying to add to Q one element of DESC i x DESC J such that the new element in Q is consistent with the previous elements i e no same node can be matched to two different nodes nor can a node appear in an element of Q if either a descendent or an ancestor already appears in some element of Q The algorithm backtracks when it determines that there are no more valid pairs to add or the cost of the current branch will be too large to match the best solution discovered to date As the problem is NP complete the approach outlined above can quickly become intractable without additional constraints We chose to enforce an upper bound B on the sum of distances between elements of S i and the closest child of i respectively S j and j w
45. nformation about the element even if this produces structurally equiv alent views These architectural properties such as throughput latency etc are crucial for many architectural analyses e g Spitznagel and Garlan 1998 In the following discussion a matched node is a node with either an exactly matching label or a renamed label Hierarchical Moves Architects often use hierarchy to manage complexity In general two architects may differ in their use of hier archy a component expressed at the top level in one view could be nested within another component in some other view This suggests that an algorithm should detect sequences of internal node deletions in the middle of the tree which result in nodes moving up a number of levels in the hierarchy An algorithm should also detect sequences of internal node insertions in the middle of the tree which result in nodes moving down the hierarchy by becoming children of the inserted nodes as shown in Figure 1 Manual Overrides Structural similarities may lead a fully auto mated algorithm to incorrectly match top level elements between two trees and produce an unusable output Because of the dependencies in the mapping one cannot easily adjust incorrect matches after the fact Instead we added a feature not typically found in tree to tree correction algorithms The feature allows the user to force or prevent matches between selected view elements The algorithm then takes these constra
46. on time Start at Component the user can select any component to be the tree root and can reduce the tree sizes by selecting subtrees Restrict Tree Depth the user can exclude from comparison any nodes beyond a certain tree depth Elide Elements the user can exclude selected nodes and their entire subtrees from comparison Elision is temporary and does not generate any edit actions The tool gives the user manual control using the following features sampkluw tex 16 12 2007 13 23 p 23 24 Differencing and Merging of Architectural Views Forced matches the user can manuallv force a match between two elements that mav not match structurallv Manual overrides the user can override anv edit action sug gested bv the structural comparison Step 4 produces from the edit script a common supertree that previews the merged view after the edit actions are applied In this step the user can assign tvpes to elements to be created change the tvpes of existing elements or override tvpes that were automaticallv inferred based on the tvpe matching in Step 2 The tool also checks the edit script for errors such as illegal element names The user can also rename anv architectural element that the edit script will create Finallv the user can cancel anv unwanted edit actions 5 2 SPECIALIZED TOOLS This approach supports building architectural view differencing and merging tools for a wide range of Architecture Descriptio
47. ons of deletions inser tions internal deletes internal inserts and renames Table II lists some of the different trees we generated Although the performance numbers were sensitive to the percentages of the different kinds of edits the trends were very similar Figure 14 shows a plot of the performance of the MDIR algorithm with R 5K when varying the tree sizes For comparison the times for THP are shown in the corresponding THP series Our empirical results clearly confirm our earlier theoretical complexity analysis i e the growth is quadratic Although Figure 14 shows that MDIR performs much slower than THP it does not give any indication on the accuracy of MDIR compared to THP which we previously demonstrated We have avoided prematurely optimizing our current implement ation to allow for easier debugging but we think that we can improve the running time in several ways Heuristics such as simulated an Table II Random trees generated for the performance evaluation of MDIR Case Deletes Inserts Internal Deletes Internal Inserts Renames MDIR_1 20 20 10 10 40 MDIR_2 10 10 10 10 60 MDIR 3 10 10 30 30 20 MDIR_4 5 5 15 5 15 sampkluw tex 16 12 2007 13 23 p 21 22 Differencing and Merging of Architectural Views nealing or genetic algorithms could significantlv improve the SEARCH procedure bv obtaining a better initial solution and thus a better start ing cost T
48. operations for various reasons or may be interested only in a change impact analysis Krikhaar et al 1999 Because Steps 1 and 5 are straightforward we will only discuss Steps 2 4 In Step 2 manually matching the type structures between the two views produces semantic information that speeds up the comparison This optional information can also reduce the amount of data entry for assigning types to the elements that the edit script will create In Step 3 matching instances proceeds as follows a build tree structured data from the two C amp C views to be compared b use tree to tree correction to identify matches and structural differences and c obtain an edit script to merge the two views The tool shows the structural differences by overlaying icons on the affected elements in each tree See Figure 15 If an element is renamed the tool automatically selects and highlights the matching element in the other tree For inserted or deleted elements the tool automatically selects the insertion point by navigating up the tree until it reaches a matched ancestor The tool shows in bold a node if it detects differences in its subtree The tool shows in italics ports that are inherited from the component type Figure 15 Figure 15 a indicates a match Figure 15 b indicates a rename Figure 15 c indicates an insertion and Figure 15 d indicates a deletion Various features can restrict the size of the trees and help reduce the comparis
49. operties associated with it In comparison THP does not allow any insertions or deletions in the middle of the tree It works under the assumption that if two nodes match so do their parents i e only subtrees can be inserted or deleted Suppose we obtain a mapping M between trees Ti and To From this mapping we can deduce an edit script i e a sequence of edit operations to turn T into T First we flag all unmatched nodes in the first tree as deleted and all unmatched nodes in the second tree as inserted We order the operations so that all deletion operations precede all insertion operations delete the nodes in order of decreasing depth deepest node first and insert them in increasing depth order To define the cost of an edit script for each node in the source tree we choose a cost of deletion not necessarily the same for all nodes For each node in the destination tree we choose a cost of insertion again not necessarily the same for all nodes Finally for each pair of nodes n m where n is some node in Ti and m in To we choose a cost of changing the label of n into the label of m For example string to string correction changes banana into ananas with a cost of two Wagner and Fischer 1974 The cost of the edit script is then equal to the sum of the costs of insertion deletion and renaming operations it contains Therefore any given mapping has a unique cost So to find an optimal edit sequence
50. ows that would be present in a real application i e the data flow is simulated rather than real So the architect accepted the edit actions to delete these extra connectors from the Acme model sampkluw tex 16 12 2007 13 23 p 31 32 Differencing and Merging of Architectural Views Acme Types aP Orchlava Types EAE ArchlavaFam Match JE i Archlava 9 Component Types ap 2 a Component Types ArchJavaComponent LUnmatch ANY a Ports Fl B Aphyds Connector Types ie Fl B AphydsModel 69 Port Types ChannelDisplayer ArchJavaPortT ChannelRouteDialag 49 Role Types show ChannelRouter B ArchlawaRoleT ChannelRouteViewer a MWEFAm Order Circuit Component Types CircuitDialog ControllerNodeT Fl B CircuitDisplayer Ca Parts H Circuitviewer ModelNodeT FloorplanDialag Ports H Floorplanner A B TierNodeT GlobalRouter a Parts MetDialog H ViewModer NodeDialog Forts PartDialog 3 WindowhlodeT Partitioner Ports PartTransViewerDialog 49 Connector Types aa CallReturn onnT ta Roles H gt EventBusT a Roles 9 Port Types o commandT o modelT provideT o pubsubT o UseT 49 Role Types m providerT B userT B PlacementGraphDialog Placer PlaceRouteDialag PlaceRouteDisplayer B PlaceRouteviewer Connector Types 29 Port Types o ANY o PROVIDE_ONLY REQUIRE_ONLY Role Types Figure 20
51. parison This time the MDIR algorithm took into account these manual overrides when matching the instances Having correctly matched the top level elements the comparison highlighted additional differences between the two views For instance Figure 28 shows several missing sub architectures But the user decided to merge only the changes for the top level elements and obtained the as built architecture in Figure 29 Discussion In a product line architecture each instantiation of a framework often introduces additional runtime dependencies Indeed HillClimber added several connections to the documented architec ture and these connections seem mostly justified For instance the connection between engine and canvas is needed since one of the sub components of engine required access to functionality from the canvas 7 Related Work Landmark based Algorithms We group several algorithms that have been proposed for differencing hierarchical information under the category of landmark based algorithms they have been proposed in the context of program differencing e g JDiff Apiwattanapong et al 2004 Dex Raghavan et al 2004 and design differencing e g UMLDiff Xing and Stroulia 2005 These algorithms are based on the assumption that the entities they are trying to match are uniquely sampkluw tex 16 12 2007 13 23 p 38 Differencing and Merging of Architectural Views 39 ex Synchronize Acme and ArchJava Step 3 of 5
52. rPlanner Ee conn_circuitData_main__partitioner_cit partitioner Reset placeRouteVviewer place privateAphvds 1 route viewer G Connectors Connectors 5 conn_floorPlanner_place fl 99 conn channelRouteviewer channel model chanr FR conn_partitioner_floorPlanner g r conn__floorplanDialog_floorplan__model_floorplan H gam place route conn placeRouteviewer place viewer pl E conn_route_channel il 97 conn placeRouteviewer router viewer r starConnector il 97 conn viewer circuit placeRouteviewer c circuitViewer conn__viewer_command__placeRouteviewe G floorPlanviewer fl g conn viewer partition model partition placeRouteViewer conn__viewer_window__floorplanDialog_wir Connectors conn circuitModel channelRouteviewer l gt EE Properties amp messages 7 Search sek nets e l ew Figure 21 Comparison of Acme C amp C view left and ArchJava C amp C view right starConnector matches a connector in ArchJava with an automatically generated name highlighted nodes privateAphyds exists in ArchJava but not in Acme to use the same component names as the ArchJava implementation to avoid future confusion so he accepted the renames in the edit script Discussion Figure 22 shows the resulting C amp C view after it has been manually laid out in AcmeStudio Unlike the original architect s view Figure 22 shows bi directional communication taking place be t
53. rst ArchJava does not name connectors or connector roles The tool generates synthetic names from the components and ports that a connector connects to Second the ArchJava top level component can have ports whereas the top level sampkluw tex 16 12 2007 13 23 p 25 26 Differencing and Merging of Architectural Views component in Acme i e the Acme svstem cannot One option is to create a top level component in Acme to correspond to ArchJava s top level component Another is to create a synthetic component to hold these ports Third ArchJava ports can be private whereas all Acme ports are public One option is to represent ArchJava private ports as Acme ports on an internal component instance another is to simply ignore private ports Matching Types Assigning architectural styles and types to an Acme view enforces the architectural intent using constraints Monroe 2001 For instance a constraint on a component type may specify that all instances of that type must have exactly two ports Similarly setting architectural styles on the overall system and on each sub system representation if applicable enforces any constraints associated with the style In Acme the Pipe and Filter style prohibits cycles a constraint that a general purpose implementation language such as ArchJava does not directly enforce In many design languages types are arbitrary logical predicates An element is an instance of any type whose properties
54. rst case was on a source tree of 640 nodes separated from its target by an optimal edit script of 440 operations containing both deletions and renames In that sampkluw tex 16 12 2007 13 23 p 17 18 Differencing and Merging of Architectural Views case the returned edit script was 2 5 times longer than the optimal edit script MDIR produced good results with most trees even when the optimal edit script involved 2 3 of the number of nodes With up to 85 of the nodes renamed and no deletions MDIR produced edit scripts within less than 1 of the optimal script on trees of 640 nodes So MDIR can recover the mapping from tree structure alone The improved match quality comes at a heavy runtime cost MDIR was about 60 times slower than THP on average with bound R 100K As predicted setting bound R to 5K produced slightly sub optimal edit scripts but noticeably reduced the running time On a tree of 1 280 nodes with an optimal edit script of 396 edits THP produced an edit script of size 1 775 in 7 seconds MDIR with R 100K produced an edit script of size 459 in 6 minutes whereas MDIR with R 5k produced an edit script of size 479 in 4 minutes Empirical data with those two different values of R is shown in Figures 10 11 12 and 13 Varying the bound R did not have much effect on MDIR s precision Note that all the tree pairs used in those figures have internally deleted nodes even if this is not the varying parameter Figure 10 show
55. rthermore SFA relies heavily on labels which are different when the graphs originate from different domains even if they express the same relationships while matching of an XML schema against another XML schema delivers usable results matching of a relational schema against an XML schema fails Melnik et al 2002 Mandelin et al proposed probabilistic matching based on label region type or position information Mandelin et al 2006 but the approach requires training the evidencers Mandelin et al also mention that a simple greedy search algorithm does not work in many cases Model Transformation Graph transformation approaches sur veyed by Mens and van Gorp Mens and Van Gorp 2005 tackle the same problem but use a different set of assumptions First in many graph grammars productions do not delete vertices and edges which effectively prohibits insertions and deletions one of our require ments Second graph transformation approaches do not attempt to find the optimal transformation that would preserve properties of view elements Finally these approaches do not yet offer easy to use tools such as the ones illustrated in Section 6 Conformance Checking The technique we followed in the ex tended examples is similar in spirit to the Reflexion Models technique Murphy et al 2001 Using Reflexion Models to check the confor mance of the as designed and the as built view is limited to checking the correspondences betwee
56. s some of these limitations Our contributions are Differencing and merging architectural views based on structural information Tree to tree correction algorithms identify matches and classify the changes between the two views Optional type in formation prevents matches between incompatible view elements thus speeding execution and improving match quality A novel polynomial time tree to tree correction algorithm The algorithm adapts a recently published optimal tree to tree correc tion algorithm for unordered labeled trees that detects renames inserts and deletes Torsello et al 2005 and generalizes it to additionally detect restricted moves Our algorithm also supports forcing and preventing matches between elements in the views under comparison An empirical evaluation of the novel algorithm and a comparison with the previously published algorithm A set of tools for the semi automated synchronization of C amp C views using these algorithms One tool can synchronize an as designed C amp C view with an as built C amp C view retrieved from sampkluw tex 16 12 2007 13 23 p 2 Differencing and Merging of Architectural Views 3 an implementation Another tool can more generallv svnchronize two C amp C views regardless of how they were obtained An evaluation of the tools to find and reconcile interesting differ ences in real architectural views The paper is organized as follows Section 2 describes the c
57. s the sub optimality of the edit script when varying the percentage of renames for both THP and MDIR The worst edit script for MDIR was suboptimal by around 100 whereas THP was off by over 400 This figure may mislead the reader into thinking that the accuracy of THP increases with the percentage of renames Of course it does not THP detects renames but not internal deletes so when the percentage of renames in the optimal edit script increases compared to the other operations THP s precision seems to improve Figure 11 shows the sub optimality of the edit script when varying the percentage of deletes for both THP and MDIR THP generated one edit script that was suboptimal by over 50 whereas MDIR generated fully optimal edit scripts Figure 12 shows the sub optimality of the edit script when varying the percentage of inserts for both THP and MDIR THP generated edit scripts that were suboptimal by around 110 on average whereas MDIR generated edit scripts that were off by 20 on average Figure 13 shows the sub optimality of the edit script when vary ing the node degree for both THP and MDIR THP generated edit scripts that were suboptimal by over 110 on average whereas MDIR generated edit scripts that were off by 16 on average The previous data showed mainly the improvement in the precision of the matching We also wanted some indication on the algorithm s performance and scalability Ideally one would generate graphs wit
58. se Study in Re engineering to Enforce Architectural Control Flow and Data Sharing Journal of Systems and Software 80 2 240 264 Abi Antoun M J Aldrich D Garlan B Schmerl N Nahas and T Tseng 2005 Improving System Dependability by Enforcing Architectural Intent In Proceedings of the Workshop on Architecting Dependable Systems pp 1 7 Alanen M and I Porres 2003 Difference and Union of Models In Proceedings of 6th International Conference on the Unified Modeling Language Modeling Languages and Applications pp 2 17 sampkluw tex 16 12 2007 13 23 p 42 Differencing and Merging of Architectural Views 43 Aldrich J C Chambers and D Notkin 2002 ArchJava Connecting Software Architecture to Implementation In Proceedings of the 24th International Conference on Software Engineering pp 187 197 Ammann M M and R D Cameron 1994 Inter Module Renaming and Reorga nizing Examples of Program Manipulation in the Large In Proceedings of the International Conference on Software Maintenance pp 354 361 Apiwattanapong T A Orso and M J Harrold 2004 A Differencing Algorithm for Object Oriented Programs In Proceedings of the 19th IEEE International Conference on Automated Software Engineering pp 2 13 Chawathe S S and H Garcia Molina 1997 Meaningful Change Detection in Structured Data In Proceedings of the ACM SIGMOD International Conference on Management of Data pp 2
59. sing THP and MDIR The deletion operations in the middle of the tree correspond to the restricted moves that MDIR detects We ran MDIR once with bound R 100K and another time with bound R 5K We left bound B unchanged from its default value in all runs Table I Evaluation of MDIR R 100K Case f Ideal Ops THP MDIR Nodes Ops Secs Ops Secs Renames 640 569 770 2 569 64 1280 857 1509 7 963 442 Deletes 640 492 701 2 492 50 1280 1113 1397 5 1114 169 Internal Deletes 640 441 1076 3 1093 215 1280 652 2407 9 735 471 Node Degree 640 288 712 2 288 65 1280 576 1194 10 576 248 The length of an optimal edit script must necessarily be equal to the sum of the number of deletion and the number of renaming operations Table I shows for different tree node sizes the length of the optimal edit script the length of the actual edit script and the running time in seconds for both THP and MDIR All numbers were measured on an Intel Pentium 4 CPU 3GHz with 1 5GB of RAM On average THP produced edit scripts that are sub optimal by about 120 whereas MDIR produced edit scripts that are sub optimal by about 7 In the worst case THP produced a suboptimal edit script by about 400 whereas MDIR s worst case performance resulted in an edit script sub optimal by around 150 The accuracy deteriorated significantly for both MDIR and THP when using nodes of large degree or when the trees were very different MDIR s wo
60. t and adds to the work list the successor set B b namely D d and E e Next MDIR pops F f from the work list adds it to the match list and proceeds similarly 3 4 FORCING AND PREVENTING MATCHES Manual overrides are not a standard operation in most tree to tree cor rection algorithms MDIR has the ability to force and prevent matches between a node in tree T and another node in tree T Preventing a match between two nodes and j is easy just assign a very large cost to the corresponding entry in the cost matrix C i j But forcing a match between two nodes is more difficult At first glance it would seem enough to first prevent the match of either of these two nodes with any node other than the required one and second make the cost of deletion and insertion of these nodes very high This would be the case if the algorithm did not have the additional constraint concerning the distance to the subtree root Because of this constraint it is often necessary to delete entire subtrees at a time when no match can be found for any node close enough to the subtree root So we have to avoid deleting one of the nodes involved in the forced match during one of those subtree deletions A possible solution would sampkluw tex 16 12 2007 13 23 p 15 16 Differencing and Merging of Architectural Views be to prevent the deletion of all the ancestors of the forciblv matched node This is indeed the best solution if we used THP
61. tain inconsistency is the lower is its cost Then the edit distance of two graphs gi and g2 is found by searching for the sequence of edit operations with the minimum cost that transform gi into ga A similar problem formulation can be used for trees However tree edit distance differs from graph edit distance in that operations are carried out only on nodes and never directly on edges Graph matching is NP complete in the general case Conte et al 2004 Unique node labels enable processing graphs efficiently Dick inson et al 2004 which explains why many approaches make this assumption e g Alanen and Porres 2003 Ohst et al 2003 Mehra et al 2005 Optimal graph matching algorithms i e those that can find a global minimum of the matching cost if it exists can handle at most a few dozen nodes Messmer 1996 Conte et al 2004 Non sampkluw tex 16 12 2007 13 23 p 3 4 Differencing and Merging of Architectural Views optimal heuristic based algorithms are more scalable but often place other restrictive assumptions For instance the Similarity Flooding Algorithm SFA works for directed labeled graphs only It degrades when labeling is uniform or undirected or when nodes are less distin guishable It does not perform well on undirected graphs having no edge labels Melnik et al 2002 Several efficient algorithms have been proposed for trees a strict hierarchical structure so our approach focuses on
62. third party with no prior experience with the original Java program or with re engineering the Java program into the ArchJava implementation As Designed Architecture The architect created an Acme model based on the informal architecture in Figure 19 a He represented the circuitModel as a single component and added all the computational components to a representation of circuitModel in Figure 19 b In the original diagram Figure 18 the thin arrows represent control flow sampkluw tex 16 12 2007 13 23 p 29 30 Differencing and Merging of Architectural Views l A 5 7 Chea ee Ase wer zee u hen ua Figure 18 Informal as designed Aphyds architecture as drawn by the original developer and the thick arrows represent data flow but the architect did not make that distinction and showed all communication as Acme connectors Matching Types The architect chose an Acme Model View Controller style MVCFam Since the architect was interested in the control flow he assigned the provideT Acme port type defined in MVCFam to any ArchJava port that only provides methods Similarly he assigned the useT Acme port type to any ArchJava port that only requires methods and the provreqT Acme port type to any ArchJava port that both provides and requires methods He also assigned the generic TierNodeT Acme type to all components and the CallReturnT Acme type to all the implicit ArchJava connectors See Figure 20 Matching Instances
63. tional user specified constraints to force or prevent matches A graph representing a C amp C view can generally have cycles in it Representing an architectural graph as a tree causes each shared node in the graph to appear in several subtrees We consider one of these nodes the defining occurrence and add a cross link from each repeated node back to its defining occurrence These redundant nodes while they significantly increase the tree sizes greatly improve the tree to tree correction accuracy However they may be inconsistently matched with respect to their defining occurrences either in what they refer to or in the associated edit operations We work around these inconsistent matches using two passes Dur ing the first pass we synchronize the strictly hierarchical information corresponding to the system decomposition i e components ports sampkluw tex 16 12 2007 13 23 p 22 Differencing and Merging of Architectural Views 23 and representations During the second pass we svnchronize the edges in the architectural graph The post processing step is simple at that point since it has the mapping between the nodes in the two graphs Svnchronization is a five step process 1 setup the svnchronization 2 optionally view and match types 3 view and match instances 4 optionallv view and modifv the edit script 5 confirm and optionallv applv the edit script The final step is optional because the architect may decline the edit
64. two views is authoritative Implicitly when the user decides to commit some edit actions but not others they are allowing some acceptable differences to remain In future work it may be interesting to model this more precisely using ideas from inconsistency management 8 Conclusions In this paper we presented a novel algorithm for differencing and merging tree structured data that improves an existing algorithm to detect moves and support forcing and preventing matches We used the tree to tree correction algorithm to compare and merge hierarchi cal architectural Component and Connector C amp C views We then presented tools that incorporate the algorithm and showed how our relaxed assumptions match more closely the problem domain of differ encing and merging architectural views Finally we illustrated the tools in extended examples and showed how the approach can find interesting differences in real architectural views Acknowledgements This work was supported in part by NASA cooperative agreements NCC 2 1298 and NNA05CS30A NSF grants CCR 0204047 and CCF 0546550 a 2004 IBM Eclipse Innovation Grant the Army Research Office grant number DAAD19 02 1 0389 entitled Perpetually Avail able and Secure Information Systems and was performed as a joint research project in Strategic Partnership between Carnegie Mellon Uni versity and Jet Propulsion Laboratory References Abi Antoun M J Aldrich and W Coelho 2007 A Ca
65. ub Optimal Edit Script 50 0 0 5 10 15 20 25 30 35 Internal Deletes E THP N 1280 AMDIR N 1280 R 100K MDIR N 1280 R 5K E THP N 640 AMDIR N 640 R 100K MDIR N 640 R 5K m THP N 320 a MDIR N 320 R 100K MDIR N 320 R 5K Figure 12 A comparison of THP and MDIR R 100K and R 5K showing the sub optimality of the edit script vs the percentage of internal deletes Sub Optimal Edit Script vs Node Degree 350 300 250 200 150 Sub Optimal Edit Script 100 50 0 0 2 4 6 8 10 12 14 16 Node Degree mTHP N 1280 AMDIR N 1280 R 100K MDIR N 1280 R 5K mTHP N 640 AMDIR N 640 R 100K MDIR N 640 R 5K m THP N 320 a MDIR N 320 R 100K MDIR N 320 R 5K Figure 13 A comparison of THP and MDIR R 100K and R 5K showing the sub optimality of the edit script vs the node degree sampkluw tex 16 12 2007 13 23 p 20 Differencing and Merging of Architectural Views 21 MDIR Performance R 5K vs Tree Size 1000000 900000 800000 700000 600000 500000 400000 300000 200000 100000 0 T 0 500 1000 1500 2000 Tree Size number of nodes Running Time ms MDIR_1 MMDIR 2 AMDIR_3 o THP_1 xTHP 2 eTHP 3 MDIR_4 THP_4 Figure 14 The performance of MDIR R 5K vs the tree size instead as before but with various combinati
66. umber of renames in this case The synchronization tool matched all the elements between the two views despite the large number of renames Discussion The tool also identified on Account Controller Bean a port that was attached to a DbWriter connector Figure 24 does not show a connection between the Account Controller Bean and the DB components In fact the EJB specification recommends that all database access goes through entitv beans In this case the tool found an architectural violation in Sun s own examplel Performance Evaluation On an Intel Pentium 4 CPU 3GHz with 1 5GB of RAM MDIR took around 30 seconds to compare the two Acme trees one with around 330 nodes and one with around 390 nodes In this case the edit script consisted of over 250 renames and over 50 inserts As expected in this case THP did not correctiv identifv the moved view elements 6 3 EXTENDED EXAMPLE HILLCLIMBER In this example we evaluate the tool to synchronize two C amp C views again but this time we allow the user to force matches All the ex amples actually use the feature to prevent matches to avoid matching elements of incompatible types sampkluw tex 16 12 2007 13 23 p 36 Differencing and Merging of Architectural Views 37 ax Synchronize two Acme models Step 3 of 5 Match instances Please check the Messages tab 0 error s 17 warning s Found Please check the Messages tab 0 error s 17 warning s found Instances amp
67. underlying MDIR algorithm to detect moves in addition to renames The subject system is Duke s Bank a simple Enterprise JavaBeans EJB banking application The architect wanted to compare the doc umented architecture with the as built architecture recovered using an architectural recovery technique other than ArchJava Duke s Bank is also representative of industrial code that uses middleware and furthermore has a documented as designed architecture As Designed Architecture The architect converted an informal diagram See Figure 23 into an Acme model See Figure 24 As Built Architecture The as built architecture was recovered by a dynamic architecture extraction tool DiscoTECT Schmerl et al 2006 DiscoTECT currently generates one component instance for each session and entity bean instance created at runtime So the ar chitect post processed it and unified such multiple instances into one sampkluw tex 16 12 2007 13 23 p 34 Differencing and Merging of Architectural Views EJB Container nm Session Beans BES ControllerEJB ControllerEJB Ek Tx ControllerEJB Entity Beans AccountEJB CustomerEJB Account_Controller_Bea Customer_Controller_Bean ga Tx Controller Bean if a Customer_Bean Database Tables account Figure 23 Informal as designed architecture for the Duke s Bank application Sun Microsystems 2006 Session Beans
68. unordered trees achieve polynomial time complexity either through heuristic methods e g Chawathe and Garcia Molina 1997 Wang et al 2003 Raghavan et al 2004 or under additional assumptions e g Torsello et al 2005 Renames A synchronization approach must of course handle ele ments that are inserted and deleted as supported by ArchDiff Chen et al 2003 But effective synchronization must also go beyond inser tions and deletions and support renames Name differences between two C amp C views can arise for a variety of reasons For instance the architect may update a name in one view and forget to update another view Names are often modified during software development and maintenance A name may turn out to be inappropriate or misleading due to either careless initial choice or name conflicts from separately developed modules Ammann and Cameron sampkluw tex 16 12 2007 13 23 p 4 Differencing and Merging of Architectural Views 5 1994 Furthermore developers tend to avoid using names that mav be in use bv an implementation framework or librarv a minor detail for the architect Finallv architectural view elements mav not have persistent names or their names may be generated automatically by tools This suggests that an algorithm should be able to match renamed el ements Identifying an element as being deleted and then inserted when in fact it was renamed would result in losing crucial style and property i
69. ween components placeRouteViewer and model The architect inves tigated that unexpected communication and traced it to a callback Aphyds is a multi threaded application with long running operations moved onto worker threads So the architect made note of the fact that developers should not carelessly add callbacks from a worker thread onto the user interface thread Finally the architect decided to use the up to date C amp C view with types and styles for evolving the system Performance Evaluation On an Intel Pentium 4 CPU 3GHz with 1 5GB of RAM comparing an Acme tree of around 650 nodes with an ArchJava tree of around 1 150 nodes Figure 21 with MDIR took under 2 minutes In comparison THP took around 30 seconds but produced less accurate results In particular THP did not treat component privateAphyds as an insertion and mismatched all the sampkluw tex 16 12 2007 13 23 p 33 34 Differencing and Merging of Architectural Views windowBus use provrec b provide Q Roles provider l both user Figure 22 As built Aphyds architecture with Acme styles and types top level components For Aphyds the edit script consisted of over 300 renames over 600 inserts and over 100 deletes 6 2 EXTENDED EXAMPLE DUKE S BANK In this example we synchronize two C amp C views where one the as built view is recovered by instrumenting the running system This example mainly highlights the ability of the
70. y re ferred to as a Component and Connector C amp C view As architecture based techniques become more widely adopted software architects face the problem of reconciling different versions of architectural models including differencing and sometimes merging architectural views i e using the difference information from two versions to produce a new version that includes changes from both earlier versions For instance during analysis a software architect may want to reconcile two C amp C views representing two variants in a product line architecture Chen et al 2003 Once the system is implemented an ar chitect may want to compare a conceptual as designed C amp C view with an as built C amp C view retrieved from the implementation using various architectural recovery techniques Eixelsberger et al 1998 Murphy This article is an expanded version of the following paper Abi Antoun M Aldrich J Nahas N Schmerl B and Garlan D 2006 Differencing and Merging of Architectural Views In Proceedings of the 21st IEEE International Conference on Automated Software Engineering pp 47 58 2007 Kluwer Academic Publishers Printed in the Netherlands sampkluw tex 16 12 2007 13 23 p 1 2 Differencing and Merging of Architectural Views et al 2001 Medvidovic and Jakobac 2006 Schmerl et al 2006 The architect might be interested in implementation level violations of the architectural stvles or other intent
Download Pdf Manuals
Related Search
Related Contents
FT - Nettoyant sol Autolaveuse Ecologique avril April 2008 RA5 -RA55 tüv rl cover.qxp Modo de empleo mod. CAPRI XLC SAP - SED Copyright © All rights reserved.
Failed to retrieve file