Home
        Capstone Writeup - University of Massachusetts Amherst
         Contents
1.   generate    edges you can simply walk across these  boxes performing  with high probability  a small number of comparisons for each of  the at most 4 pairs of boxes  This also prevents the possibility of duplicate edges  This    method is employed in both SparseSpanner java and Weighted Matching  java     23    C ARCHITECTURE DIAGRAM         BaseCode java    static methods    getLauncherDisplay      Launcher 1    update         24    D SCREENSHOTS    D 1 Algorithm Launcher    This is a countMin matching   The countMin matching takes a stream of  updates for the form  index  increment   and maintains an estimate of what the  true value of each index would be    seiting all of the increments to 1 we    Unit Size     Vv        25    D 2 Count Min Launcher        S  Count Min       mod 7    index    update    Auto Generate Data   y    Automate       87  mod 28    query index  s    actual result  0    estimated result       query    1          26    D 3 Sparse Spanner Launcher          a     4  Sparse Spanner          Step Show Excess Edges       Finish Show Cross Edges     Show Tree Edges                 27       D 4 Maximum Matching Launcher    q  x       Maximum Matching                 Step Trail of the Dead Edges    Finish Discarded Edges    Selected Edges                28    
2.  of the algorithm only  Screenshots of the three    included visualization launchers can be found in appendices D 2  D 3  and D 4     3 3 Implemented Algorithms    The first algorithm that is included with SAVY is the Count Min sketch of Graham  Cormode and S  Muthukrishnan in their paper    An improved data stream summary  the  Count Min sketch and its applications     To understand how the algorithm works it is  important to first understand the problem that it is trying to solve  Imagine that we are  being handed flash cards which have some number of a certain type of shape  In order  to receive the next flash card it is necessary that we discard the flash card that we are  holding  Once we have finished looking at all of the flash cards we will be asked how  many of each shape we have been shown    Since computers don   t have hands or eyes we formulate the problem in the following  way  We represent each flash card as a pair of numbers  x n  where x represents which  type of shape is on the card and n represents how many copies of that shape were on  the card  We then present these flash cards to the computer as a stream  which is to say  that the computer can   t look at the old flashcards   Now let us imagine that the stream  of flash cards is so long that it is infeasible to have the computer remember all of them   Instead of maintaining a count for each shape  we can instead maintain a    sketch    of the  data     Our solution is to maintain a matrix of counters  Th
3.  re   placing old edges instead of checking if it is larger by a significant factor  This greatly  simplifies the analysis of the algorithm  but requires that the effects of the randomized  rounding be evaluated  The application of randomized rounding is borrowed from    Im   proved approximation guarantees for weighted matching in the semi streaming model     by Leah Epstein et al  For independent result regarding the approximation ratio with    rounding see appendix A     3 4 Uland Visualization Considerations    From a distance  algorithm visualization may appear to be rather cut and dry  but  this is not so  there are all sorts of things which must be considered  For example  in  visualizing the Count Min sketch it may seem obvious that to visualize the algorithm  the matrix should be displayed as a matrix and the cells chould be updated in real time   While this covers the basics it does not make the visualization clear  Without highlight   ing the cells that are changing and slowing the whole visualizatiion down it is difficult  for the user to understand what the algorithm is doing  Even with the cells highlighted  the use of the hash functions is difficult to understand unless they are presented visually  as well  Similarly  query answering is entirely unclear unless the cells that are being  compared are highlighted     The other two algorithms invited a whole lot of other visualization issues peculiar to    10    graphs  The first issue was figuring out how to gener
4.  this total weight by w T M    since this weight must be at least as much  as the actual sum   Once the algorithm terminates the only edges which can be charged  for more than one edge of OPT are those that are in the matching M  Since we don   t  know which edges in M caused rejections of edges in OPT we can approximate the sum    of the weights of these edges by the weight of the matching  w M   This allows us to       write w OPT   lt  w T M     2w M               By combining our two lemmas we can write        w OPT   lt     w M    2w M    w M   2     Because we have rounded our input based on a randomized variable we cannot reason    about the actual values of w M   so we reason about the expected value instead     27    ElOPT      lt     Es w M    3     To deal with the rounding we need one more lemma      1 7 ln 1   9 rel      OPT  6    gt  OPT  4        Let us consider the expected value of the weight of a single edge  If we can prove  Sayer Es  ws e   then the lemma follows  Let us consider an edge with unrounded  weight w e     1 y      where 0  lt  a  lt  1  Now we integrate  1   y  t   forO lt d  lt a    and a  lt  6  lt  1  For     lt a we round down to  1   y     and for     gt  a we round down    16    to  1  y               a   1 1  E      1  prais   14 4   tt    d     1    5  slus e     f atas f  a  WOT pa     We can combine equations 3  and 4 to get a statement about OPT in terms of w M   2 2 1 In 1  w OPT   lt   l 7 md y7  27 Es w M     py  inte 9  Flw   a 
5. If we separate the trail of an edge into sets C  such that set C  was bumped by  set Ci 1  then we can say that each set is at least a factor Le larger than the set that    it bumps  If we let k be the index of the last set that was bumped then we can write       GDC   lt  w  M   If we sum over the sets C  then we can say  DS   lt n  Ci   lt   Li lt pw C   w M   Noting that U  lt  w C     w T M   we can write 1w T M    lt     w T M     w M   By simple algebra we have 42w T M    lt  w T M    w M    gt        1 w T M    lt  w M    w T M    lt  4au M      2     y              Lemma A 0 2  w OPT   lt  w T M     2w M     Proof  If we think of OPT as a set of edges  0   02    0  then we can say that each  edge o  in OPT was either added to M  or prevented by at least one corresponding    edge e   If we consider charging the costs of the edges in OPT to the edges in M we    15    can write a relationship between OPT and M  It may be the case however  that some  edges in OPT were      rejected    because of edges that were in M at the time  but were  later rejected themselves  This means that we will also have to charge costs to T S    If an edge e  is responsible for the rejection of edge o  in OPT then we can charge  it for w o    Since we don   t know w o   we can approximate it by w e   which we  know is at least as much as w 0    We can extend this reasoning to the case where two  edges are responsible for the rejection of o   since w e     w e    gt  w a     We can  approximate
6. SAVY  STREAM ALGORITHM VISUALIZATION FOR YOU    Presented By     Nicolas Scarrci    Completion Date     May 11  2011    Approved By     Assistant Professor Andrew McGregor  Computer Science    Associate Professor Ramesh Sitaraman  Computer Science    ABSTRACT    Title  SAVY  Stream Algorithm Visualization for You  Author  Nicolas Scarrci  Computer Science   CE Type  Independent Capstone Project   Approved By  Andrew McGregor  Computer Science  Approved By  Ramesh Sitaraman  Computer Science    This software  SAVY  is intended to provide its users an environment in which to  become acquainted to stream algorithms  experiment with stream algorithms  and  eventually to implement  analyze  and improve their own stream algorithms  For  introducing users to stream algorithms SAVY provides an implementation of the  following algorithms  the    Count Min sketch    as presented by Graham Cormode and  S  Muthukrishnan  a sparse spanner algorithms as presented by Michael Elkin  and a  maximum matching approximation algorithm as presented by Andrew McGregor   SAVY allows users to enter their own data and observe how each algorithm reacts   Users can also choose to allow the software to generate random data so that users can  view the execution of the algorithms over time  SAVY also exposes an API to allow  users to add their own algorithms to the software  To encourage the analysis of  approximation algorithms such as the Count Min sketch SAVY provides the absolute  answers for compariso
7. and therefore does not have a network architec   ture  Herein we discuss the architecture of the software itself  For reference a diagram  has been included in appendix C  While SAVY is provided as both an applet and an  application  for accessibility purposes   we will refer to them both in this section as  SAVY  since the bulk of their code is the same    When SAVY is launched the user is presented with the algorithm launcher  The  purpose of this window is to allow the user to launch multiple different visualizations  without needing to open multiple windows or launch multiple copies of the application   A screenshot of this window is available in appendix D 1  The algorithm launcher    allows the user to select from a list of included visualizations  see a description of the    visualization  set the parameters of the visualization and then launch it in a separate  window  When a new visualization is started its launcher is initialized    The launcher serves as an intermediary between the implemented algorithm  the  user  and the data that the algorithm is processing  By separating the algorithm im   plementation and the launcher implementation SAVY encourages users to create more  general launchers which can be used to visualize multiple algorithms  e g  a graph  launcher which provides functionality unique to graphs   This separation also prevents  users from unintentionally changing the functionality of the algorithm when they are  intending to change the visualization
8. ate a graph that would be visually  pleasing to a human  This included things like not allowing duplicate edges  Whenever  a duplicate edge was added to the graph it appeared instead as though the original edge  was being updated  I also noticed that once an algorithm ran for a while it would become  cluttered  obscuring the algorithm   s execution  This is because the graphs contained a  large number of edges  and stream algorithms usually maintain only a small subset of  those edges    It also quickly became apparent that color could be used to bring a    third dimension     to any graph  Unlike how color was used in the Count Min sketch to highlight changes  that are interesting for the user to notice  or suggest how a process works  graph algo   rithms can use color to encode additional data about the graph  In the sparse spanner  problem different colors are used to mark the different subgraphs  as well as the    cross  edges    which connect them  For the maximum matching problem the darkness of a  color was used to present data about how    old    an edge was  In this specific case the  edges that were bumped from the matching were darkened over time to imply that they  were less relevant to current execution the algorithm    Though it does not occur in any of the three visualizations in SAVY  additional data  could be encoded either in text or in shape  As more visualizations are implemented  more unique and useful ways of encoding information visually will be discove
9. e for your visualization    goes  as other components like buttons are repainted automatically      B 2 5 Data Initialization    genData     This function is an abstract function which should be overridden to generate ran   dom data  Note that this function is only intended for generating the data in MySlow   DataQueue  Any other data  for example a list of random nodes  should be generated  when data structures are initialized during init        genBase     This is a safety function which simply generates a blank copy of whatever compo   nent you decide to use for visualization  This is intended for use in initialization and    error handling  but can be overridden by a blank function if desired     21    B 3 Creating a Launcher    B 3 1 Understanding the Constructor    Launchers currently work in two phases  The first is where the launcher allows for  input from the user before the visualization begins  This code exists in the constructor   I recommend setting default values and setting up error handlers which catch invalid    input and substitute default values     B 3 2 Understanding init          The second phase is the initialization phase  Once the user has started a visualization  the init method should be called  This method is in charge of processing the user   s  input  creating an instance of your algorithm class  and setting up the screen to reflect  whichever commands the user should now have access to  Don   t forget to remove the    buttons you added in the co
10. e height and width of this matrix    are dependent on how accurate we want our approximation to be and how often we  are willing to have the algorithm make a mistake  When we receive an update  x n   we update some cell in each row of the matrix by n  where the cells we update are  dependent on x   To answer a question about how many of shape x we have seen we  look at the cells that we update for shape x  Because these cells may have also been  updated by other shapes we cannot simply return the value of one of the cells  Since we  know that we only make positive updates  there cannot be  5 squares on a page   we can  assume that the cell with the smallest value is the cell which was updated by the fewest    other shapes  so we return the value in this cell as our estimate     PSEUDOCODE    for all update x n  do  for all row do  Table H  row  x   row    Table H row  x   row   n  end for    end for    For the implementation of this algorithm see CountMin java available at  9   For proofs  regarding the algorithm as well as its original description see  1    Before we move on from this algorithm it is important to explain how we chose  which cells to update for each shape  For each row of the table we have a hash function  which is randomly produced  Our hash functions belong to the family H x     ax b  mod p  mod w  where a      l      p 1   b      0     p 1   p is a prime larger than the range  of x  and w is the width of the matrix  This family of hash functions helps to m
11. easing weights  perhaps 1 e  1 2e  etc    Clearly the best matching  that we can make is to pick every other edge in the line  If our current algorithm were to  process them in order it would accept the first  then reject it to accept the second  then  reject that to accept the third  and so on  In the end this algorithm would have only the  last edge  which is far from optimal  To combat this we require that for an incoming  edge to replace old edges it must be at larger than the sum of the conflicting edges by  at least a multiplicative factor of  1   y   where y is selected by the user  This improves    the approximation ratio greatly     DATA STRUCTURES  A multiplicative increase factor y     A list of selected edges     PSEUDOCODE  for all updates  u v  do  for all selected edges that conflict with  u v  do  if weight u v   gt  weight conflicted edges  1 7  then    selected edges   selected edges   conflicted edges    selected edges   selected edges    u v   end if  end for  end for  For the implementation of this algorithm see WeightedMatching java available at  9    For proofs regarding the algorithm or its original description see  7     This algorithm is implemented with optional randomized rounding  In the case that  randomized rounding is selected the edges are rounded into weight classes  1          1  ayers  and so on  where    is selected at random from  0 1   This allows the al   gorithm to check whether an edge belongs to a sufficiently large weight class when
12. ge  god    6     The analysis regarding the effects of rounding the input is heavily borrowed from  3                  17    B USER MANUAL    B 1 Adding Your Own Algorithms    Now that you   ve become acustomed to the visualizations that ship with SAVY per   haps you   d like to start adding your own visualizations  The followinig section details    the API that SAVY exposes to you     B 1 1 Creating Your Own SlowData    Disclaimer   The first step to understand MySlowDataQueue is to understand that it should not  be altered in any way  MySlowDataQueue should provide enough of an abstraction that  any changes that need to be made can be made by creating your own SlowData that  extends MySlowData  If you find it absolutely necessary to edit the way that MyS   lowDataQueue works please extend the class so as to avoid breaking all of the built in    algorithms  as well as any user defined algorithms that you may have acquired     MySlowDataQueue contains three lists of MySlowData  posted  others  and perma   nentData  One important feature of MySlowDataQueue is that it allows for the simu   lation of real time acquisition of data  It does this by moving MySlowData from the  others list  which is not accessible from outside of the class to the posted list which pro   vides access to the first element of the list  via MySlowDataQueue poll     This move   ment is done by the update   method which is called periodically by each algorithm   Whenever an item is moved to the posted lis
13. http   dx doi org 10 1007 11538462_15     8    ey    Thomas L  Naps  James R  Eagan  and Laura L  Norton  2000   JHAVE   an environment to actively engage students in Web based al   gorithm visualizations  SIGCSE Bull  32  1  March 2000   109 113   http   doi acm org 10 1145 331795 331829    13     9  Nicolas Scarrci  2011  SAVY Home Page     http   www cs umass edu    nscarrci SAV Y     10  Clifford A  Shaffer  Matthew Cooper  and Stephen H  Edwards  2007  Algorithm  visualization  a report on the state of the field  SIGCSE Bull  39  1  March 2007    150 154  http   doi acm org 10 1145 1227504 1227366    14    APPENDIX    A APPROXIMATION RATIO OF MAXIMUM MATCHING WITH ROUNDING    For this algorithm we have rounded the input by a method called randomized ge   ometric grouping  All edge weights are separated into classes from  1   y      to     1  y P      where p     Z and 6      0  1   Edge weights are always rounded down     Theorem A 1  With rounding the maximum matching algorithm presented achieves an    approximation ratio of IR  which is maximized at   187 for y   4 36     When one edge bumps one or two other edges from the matching we know that the  weight of this edge is greater than the combined weight of the edges that were discarded    by a factor of      7   2        gt 1  1     Let w M  be the weight of our matching  and let T M  be the edges which were removed    from M  Next we prove two lemmas about w M      Lemma A 0 1  w T M    lt      w M     y   1  Proof  
14. ine allow for custom data in order to engage the users  It is also important  that it be easy for individuals to produce new algorithm visualizations so that users can    contribute to the base of available algorithms     3 THE SAVY SOFTWARE    SAVY is an algorithm visualization suite which is currently being developed by  Nicolas Scarrci at the University of Massachusetts Amherst  SAVY consists of base  java classes which provide the framework for algorithm visualizations  To aid users in  generating new visualizations SAVY provides abstract super classes as well as a user  manual available at  9  which explains how to properly extend these base classes  SAVY    currently ships with three example visualizations     3 1 Contributions to the Field    As stated by Shaffer et al     we need to encourage     visualizations on under   represented topics     One such under represented topic is stream algorithms  Stream  algorithms are characterized by having limited  often sequential  data access and very  small  polylogarithmic  space bounds  Because of this  stream algorithms are partic   ularly well suited to large datasets  With commercial giants like Google and Apple  gathering datasets of unprecedented size one would expect stream algorithms to quickly  gain standing in the standard computer science curriculum  With that in mind  SAVY  ships with three new visualizations which are based on relatively new stream algorithms   published between 2005 and 2011   In addition SAVY a
15. inimize  cell collisions between different values of x    The next algorithm that we include is the 2t 1 spanner presented by Michael Elkin  in his paper    Streaming and fully dynamic centralized algorithms for constructing and  maintaining sparse spanners     The problem solved by a sparse spanner is a common  problem in computer science  especially with ever increasing datasets  Imagine that we    have a graph  but this graph is too large to hold at once  so we want to instead remember    a subset of the edges of the graph such that we do not inflate any path lengths by more  than a factor of 2t 1  Since the graph is too large to hold at once  we must process the  edges as a stream of data    Elkin suggests that in order to generate a spanner we can maintain a small subgraph  centered on each node  As our edges come in we will grow these subgraphs  and even   tually connect them  When an edge  u v  comes in we check the subgraph coming out  of u and check to see if it already contains a path to v  If it does then we can ignore the  edge  u v   but if it doesn   t then we add  u v  to our subgraph if it is not already large  If  our subgraph on u is already large then we stop adding edges and instead check to see  if the subgraph on u is connected to the subgraph on v  If it is then we can ignore the  edge  if it is not then we add this edge to a separate subgraph  Once we have read all of    the data we return the union of these subgraphs     DATA STRUCTURES  A stretch fac
16. k  In addition to this I would like to allow for  the user to select two algorithms and run them in a side by side comparison mode that  with highlight whenever the two algorithms make different decisions  When compar   ing two algorithms it would also be useful to provide information about the amount of  space each algorithm is using as well as how fast each algorithm is running  To assist  users in generating data  especially in the case of graph algorithms it would be useful  to present a tool that would allow them to draw a graph directly on the screen  instead  of forcing them to enter edges in the form  u v   I feel that it would also be useful to  expose a new superclass which users could extend when writing algorithms that run in  the standard model  This would allow users to qualitatively compare the performance  of their approximation algorithms to the actual best solutions    I feel that this software has a lot of potential for use in the field if these suggested  features are added  By increasing the ability of users to add algorithms it is my hope  that I can create a small online community where user generated visualizations can be  submitted to a repository and then downloaded by other users  Ideally I could then  release bundles of visualizations for use by teachers  Eventually I could provide a  message board service to allow users to provide feedback on specific visualizations as    well as post request an algorithm be added     12    REFERENCES     1  Graha
17. lizations outperformed students who did not view visual   izations  A more interesting result of the study was that students who    responded    to  the visualizations  which is to say answered questions about what the visualization was  doing while watching it  outperformed all other groups    Shaffer et al  also showed that the majority of the visualizations that exist are on  low level topics  Of the data they collected in 2006 over 38  of the visualizations  elucidated sorting algorithms  and over 17  elucidated search structures  Furthermore   some of the visualizations did not actually visualize the execution of the algorithm and  instead would simply show the result of a series of operations  One example of this  is a visualization of a heap structure which shows the updated heap after an insertion    or a deletion  but does not visualize how the heap is restructured  In their concluding    6       remarks they state     we     simply need more implementors sic  to produce more  quality visualizations  And we need to encourage them to provide visualizations on    under represented topics        2 LITERATURE REVIEW    While the problem we are attempting to solve is the lack of quality visualizations  with useful features  our goal is to provide a single environment in which visualizations  can be created  explored  analyzed  mutated  and eventually shared  Similar things have  been attempted before at various times in the past with varying goals and various levels  of s
18. lso includes one entirely new    stream algorithm variant     SAVY is also unique in that it concerns itself with algorithm development analysis  and testing instead of processor use or memory leak diagnosis  Unlike JHAVE  SAVY  visualizes algorithms while it executes them  allowing the user to run their visualiza   tions on large datasets  This is particularly useful when developing stream algorithms   By visualizing the algorithm as it executes  instead of running the visualization after the  algorithm completes  SAVY also lends itself to online algorithms which can be queried  for their solutions at any point during their execution  not just at the end  By allowing  on the fly data entry SAVY increases interaction with the user who can now experiment  at will instead of having to halt the visualization  send an amended dataset to a server   and then waiting for a result    SAVY is also amenable to testing any time algorithms  Because the architecture  is such that visualizations allow their algorithms to process when they receive a new  update it is possible to delay these updates to simulate receiving them in real time   By simulating the types of update traffic that certain algorithms would experience in  their intended deployment it is possible to make empirical comparisons independent of  theoretical results  Though development is ongoing  a base implementation of delayed    data is currently available in SAVY     3 2 Architecture    SAVY does not utilize the internet 
19. m Cormode and S  Muthukrishnan  2005  An improved data stream summary   the Count Min sketch and its applications  J  Algorithms 55  1  April 2005   58 75   http   dx doi org 10 1016 j jalgor 2003 12 001     2    ey    Michael Elkin  2011  Streaming and fully dynamic centralized algorithms for con   structing and maintaining sparse spanners  ACM Trans  Algorithms 7  2  Article 20     March 2011   17 pages  http   doi acm org 10 1145 1921659 1921666     3    ey    Leah Epstein  Asaf Levin  Julian Mestre and Danny Segav  2009  Improved appoci   mation guarantees for weighted matching in the semi  streaming model  In Sympo     sium on Theoretical Aspects of Computer Science  347 358      4    ey    Scott Grissom  Myles F  McNally  and Tom Naps  2003  Algorithm visualization  in CS education  comparing levels of student engagement  In Proceedings of the  2003 ACM symposium on Software visualization  SoftVis    03   ACM  New York   NY  USA  87 94  http   doi acm org 10 1145 774833 774846     5    ey    The Jinsight Group at IBM  IBM Research   Programming Languages and Software    Engineering   Research Areas  http   www research ibm com jinsight     6    Sd    Benjamin Kurtz  Softviz  A Runtime Software Visualization Environment  Mas     ter   s Thesis at WPI     7           Andrew McGregor  2005  Finding Graph Matchings in Data Streams  Ap   proximation  Randomization and Combinatorial Optimization  Algorithms  and Techniques  Lecture Notes in Computer Science   3624  611 612   
20. n     1 INTRODUCTION    Since the execution of an algorithm is effectively the same as series of operations  performed on data structures over time  they are difficult to represent statically  e g   with lecture slides or handouts   One often overlooked and consistently underused  technique is algorithm visualization  Algorithm visualization is the act of visually dis   playing the data structures that an algorithm manipulate s and then updating them as  the algorithm executes  As a field  algorithm visualization has not seen much attention  from the computer science community  In their 2006 paper    Algorithm Visualization   A Report on the State of the Field     Shaffer et al   10  state that there simply are not  many algorithm visualizations available on the internet  and that many of those which  are available are of low quality    In addition to addressing the state of the field  they also address problems commonly  found in algorithm visualizations  For example  they state that many visualizations have  little to no user interaction  In some cases this means that the user cannot enter the data  that the algorithm uses  In the most extreme cases this means that the visualization is  actually an animation with no dynamic components  This directly contradicts research  which shows that the effectiveness of a visualization as a tool to assist understanding is  directly related to how much they user interacts with it  As Grissom et al   4  showed   students who viewed visua
21. nstructor     B 3 3 Writing an actionPerformed e  method    The actionPerformed e  method is inherited from ActionListener and is required  to make your buttons work  One point that may seem confusing is that there is one  actionPerformed method for both the buttons added in our launcher   s constructor  and  the buttons added in init       Though it is not necessary for a visualization I highly  recommend copying the code for    Step       Finish    and    End     as these actions provide    the user control over the execution of the visualization     B 4 Tips and Tricks    B 4 1 Graphs    The first tip I would suggest for graphs is to use geometric graphs  These are the  easiest to visualize  and generally the easiest to understand   The second tip I would suggest is to remember that color allow essentially a    third     dimension  with    darkness    representing a fourth  The third tip I would suggest is    to generate edges in an efficient manner  For example  if you have already randomly    22    generated your nodes is it any less random to simply select all edges of certain lengths   If you choose this path you can speed up your algorithm by    boxing    your nodes  If you  were to draw a grid across your plane each square would represent a box  By selecting  a certain length for the sides of the boxes  you can ensure that the only edges that can  exist within certain bounds must be either between nodes in the same box  or nodes in  adjacent boxes  Thus in order to  
22. orithm  B 2 1 Writing a Constructor    The constructor is where the initialization of all of the data structures you intend to  use happens  Note that this is NOT where the generation of random data  or the loading  of data from files should happen  This IS however the place where data structures that    are randomized should be generated     B 2 2 Understanding churn      The churn method simply calls a series of overrideable functions in the logical or     der  Churn calls     e getCanvas    e canvas data updateData      e canvas onUpdateSuccess      19    e canvas onUpdateFailure    e canvas otherProcessing      get Canvas     getCanvas   is an abstract function which simply returns the paintable component  that you have used for your visualization  Because you must override getCanvas   you  can place your canvas anywhere you would like   updateData     updateData   simply tells MySlowDataQueue to post any new items and then returns  the oldest new item   onUpdateSuccess     This is the code that gets executed every time a new data item is posted  This  is where the majority of your algorithm should go  Don   t forget to include a call to  repaint   at the end  or your algorithm won   t appear to do very much   onUpdateFailure     This is the code that gets executed if the environment checks for new data  but no  data is found  This can be used for performing processing that the algorithm does during     down time      ex enriching data structures  or sorting   otherProces
23. red  and  hopefully applied to old visualizations which were not sufficiently clear  As a field the    issue of algorithm visualization is far from solved     4 FUTURE WORK    After    finishing    SAVY I find that there are still a lot of features that I would like  to add to the software  For this project I focused more on the teaching aspect of the  software than I would like  I feel that novel research is still to be done on the algorithm  design and analysis possibilities of visualization software  In the future I would like  to provide more support for design and analysis  For example  the current software    requires that a user add to the source code and then recompile it in order to add their    11    own algorithm visualization  This is cumbersome  and while the software exposes an  API  see appendix B   it is still unreasonable to assume that most users will be willing  to edit the source code  By requiring source code edits I am also preventing users  from easily sharing or modifying their algorithms  To change this I would like to add  a tool which allows the user to write pseudocode describing how their algorithm works  and then  by adding special marks to the pseudocode describe how the would like the  execution of the algorithm to be visualized    I would also like to add the ability to load pre made sets of data into each visualiza   tion  This would let users run repeatable experiments as well as allow users to create  a set of datasets for use as a benchmar
24. rted to the system as the program executes  These events are then mapped to    animations  In this way the system avoids constantly updating choosing instead to only  update when something changes  This method is clearly more conducive to programs  where sparse visual updates are expected    In 2000 Naps et al  released a paper discussing their project JHAVE  an    envi   ronment designed to actively engage students in web based algorithm visualizations      8   JHAVE itself is not actually a visualization engine  Instead it is a wrapper which  provides extra functionality to stand alone visualization code  JHAVE interfaces with  executing algorithms through the use of a script file  As an algorithm executes it writes  visualization commands to this script file  which is then used by JHAVE to actually  produce a manipulable visualization  This design requires that the algorithm finish exe   cuting before JHAVE can be used  By taking this approach  JHAVE requires that users  wait for algorithms with high order runtimes  or very large datasets to finish executing  before the user can see the visualization    Despite this one feature which inhibits user interaction with the executing algorithm   JHAVE does a very good job of promoting user interaction  This includes the support of     start and stop questions     as suggested by Grissom et al   as well as       context sensitive  documentation    which explains what the algorithm is doing at each point in the visu   alization  o
25. sing     This function is executed regardless as to whether there is new data or not  Gener   ally speaking this method is more for things like timing execution and debugging  but    it can be overridden if required by the algorithm     B 2 3 Courtesy Methods    These methods are not strictly required should be implemented as a courtesy to the  user of your visualization   getInfo     This method returns a block text which is presented to the user when they consider    selecting your visualization  This text should explain which problem your visualization    20    solves as well as anything unique about the specific algorithm that it implements  This  text should also explain what the starting parameters are as well as how to control the  visualization    get Launcher Display      This method returns a container which contains all of the buttons and fields neces   sary to retrieve input from the user  If for example you are visualizing an approximation  algorithm which has a specifiable confindence you can use this method to display a field  in which the user can enter this parameter  The ordering and display of your fields are  used as is and are not changed by the software  This allows you to access them directly    when calling the init      method of your launcher class     B 2 4 Understanding the Paint Method    paint     This is arguably the most important function that you need to override  This is where  all of the visualization goes on  Specifically this is where cod
26. stensibly controlled through script files   One of JHAVE   s unique features  is the presence of a    rewind    feature which allows users to    undo    previous actions  undertaken by the visualization    JHAVE is also unique in that it implements a client server architecture  In JHAVE  users connect to a server which exposes a list of available algorithms  Users then re   quest a certain algorithm and  presumably  send the dataset on which they would like  it to be run to the server  The server executes the algorithm and then sends the script  file generated to the client which the client then parses and presents on the screen   This is a particularly interesting decision  While it does take advantage of the server   s   most likely  superior computing power it also introduces the problems inherent to the  client server architecture  e g  intermittent availability and introducing a single point of    failure  without necessarily gaining anything     The literature overwhelmingly suggests that one unified engine which supports the  addition of individual algorithms at a later date is needed as it would allow the stan   dardization of algorithm visualization  While JHAVE makes a strong showing toward  this end it accidentally prohibits algorithms from certain areas of research  Moreover   the presence of a single engine which is linked to a large repository of algorithm visu   alizations would provide the community with a single point of focus  It is imperative  that this eng
27. t it is also moved to the permanentData  list  Data items that are added to the permanentData list are never removed  This allows  you as the analyst to view and process the data that the algorithm has seen so far  This  is useful when comparing an estimate to the actual value  as is done in the Count Min  sketch     MySlowDataQueue also provides the utility function randomize    This function    does NOT generate random data  That is done on a per algoritm basis  since MyS     18    lowDataQueue does not know very much about the structure of the MySlowData that  your algorithm will be using  The randomize function simply randomizes the order of  the data in the others list  Arguably this should only be done once directly after data  has been generated or loaded  though it may be called at arbitrary times during the    algorithm   s execution     B 1 2 Understanding MySlowData    The MySlowData class requires that each implementing subclass override the de   fault toString   method with some meaningful toString   method  The MySlowData  class also provides the default expired   method which is used to test if an item should  be added to the posted queue or not  MySlowData also provides a utility method for  setting an expiration date in the future called fixDate long   This simply allows you to   delay    a data item by some number of seconds  This method should be overridden in    an implementing subclass if finer granularity than seconds is desired   B 2 Implementing an Alg
28. tor t   A list of levels  L x  which represent the level of x  or distance from the root   A list of base values  B x  which represent the identity of the tree that x belongs to   A list of trees  T x  which represent the subgraphs which we grow out of nodes x   A list of sets  X x  which represent edges which connect the tree that x is in to other    trees     PSEUDOCODE    for all edges  u v  do   wlog assume u is closer to its root than v   if u belongs to a small enough tree then  Liv    L u  1  B v    B u   T v    T v     u v    end if   if T u  is not connected to T v  then    X v     u v     end if  end for  For the implementation of this algorithm see SparseSpanners java available at  9   For  a proof of the stretch guarantee or the original description of the algorithm see  2     The final algorithm that is included in SAVY is Andrew McGregor   s Maximum  Matching algorithm from his paper    Finding graph matchings in data streams     This  algorithm solves the problem of finding the set of edges with maximum weight such  that no two edges share a vertex  While the maximum matching problem has an exact  answer computable in polynomial time in the standard model  unlimited data access    only an approximation is known in the streaming model  The idea behind this algorithm  is rather simple  accept a new edge if it is better than the edges that it conflicts with    This approach is not quite sufficient  Imagine that our graph was simply a line edges  with slightly incr
29. uccess    Algorithm visualization has a closely related sister field called software visualiza   tion  Software visualization is concerned with the workings or performance of individ   ual programs  Software visualization suites are generally more concerned with tracking  memory allocation and efficiency of a running program than they are with individ   ual algorithms  A very good example of this is IBM   s Jinsight project which focuses  on    Object oriented visualization for performance tuning and program understanding         Pattern visualization to study repetitive behavior and explore data structures     and     Memory leak diagnosis     5     A good survey of algorithm visualization systems can be found in  6   To summarize  the relevant parts of that work  the major systems in algorithm visualization are BALSA  and TANGO  BALSA is a program visualizer that was developed at Brown University   BALSA was originally designed for use with PASCAL and would follow the execution  of a program by highlighting which line of code was currently being evaluated  For  ease of use  BALSA would reformat the source code to make it more readable  Since  the original BALSA project many smaller derivative systems have been created  all of  which appear proprietary to Brown University  TANGO was also developed at Brown   although somewhat later  The main difference between BALSA and TANGO is that  TANGO visualizes program execution by having the program establish events which    are repo
    
Download Pdf Manuals
 
 
    
Related Search
    
Related Contents
Instruction Manual Model 2166 Mode d`emploi Modèle 2166  Manual de instruções  Manual de instruções  La notion de réalité dans la peinture des années cinquante  Recueil de questions à choix multiples  Toro IBOC Plus Series Parts Manual  KULINARISK Backofen    CHIP o7/2004 - Mailer    Copyright © All rights reserved. 
   Failed to retrieve file