Finished Eval
l4proj.tex
@@ -5,7 +5,48 @@
\usepackage[final]{pdfpages}
\usepackage{algpseudocode}
\usepackage{wrapfig}
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{listings}
\usepackage{color}

\renewcommand{\lstlistingname}{Code}% Listing -> Code

%define JavaScript language
\definecolor{lightgray}{rgb}{.9,.9,.9}
\definecolor{darkgray}{rgb}{.4,.4,.4}
\definecolor{purple}{rgb}{0.65, 0.12, 0.82}

\lstdefinelanguage{JavaScript}{
	keywords={typeof, new, true, false, catch, function, return, null, switch, var, if, in, while, do, else, case, break, let, for},
	keywordstyle=\color{blue}\bfseries,
	ndkeywords={class, export, boolean, throw, implements, import, this},
	ndkeywordstyle=\color{darkgray}\bfseries,
	identifierstyle=\color{black},
	sensitive=false,
	comment=[l]{//},
	morecomment=[s]{/*}{*/},
	commentstyle=\color{purple}\ttfamily,
	stringstyle=\color{red}\ttfamily,
	morestring=[b]',
	morestring=[b]"
}

\lstset{
	language=JavaScript,
	backgroundcolor=\color{lightgray},
	extendedchars=true,
	basicstyle=\footnotesize\ttfamily,
	showstringspaces=false,
	showspaces=false,
	numbers=left,
	numberstyle=\footnotesize,
	numbersep=9pt,
	tabsize=2,
	breaklines=true,
	showtabs=false,
	captionpos=b
}

\begin{document}
\title{Faster force-directed layout algorithms for the D3 visualisation toolkit}
@@ -98,7 +139,7 @@ In this process, step \ref{step:hybridFindPar} has the highest time complexity o

Finally, the Chalmers' spring model is applied to the full data set for a constant number of iterations. This operation has a time complexity of $O(N)$.

Previous evaluations show that this method is faster than the Chalmers' 1996 algorithm alone, and can create a layout with lower stress, thanks to the more accurate positioning in the interpolation process.

\section{Hybrid MDS with Pivot-Based Searching algorithm}

@@ -189,9 +230,70 @@ something

\section{Outline}

\section{Algorithms}
This section discusses implementation decisions for each algorithm, some of which are already implemented in the D3 force module and the d3-neighbour-sampling plugin. Adjustments made to the third-party implementations are also discussed.

\subsection{Link force}
\label{sec:imp_linkForce}
The customisations described here reduce memory usage; see section \ref{ssec:eval_ram} for details.
The d3-force module implements an algorithm to produce a force-directed layout. The main idea is to change the velocity vector of each pair of nodes connected via a link at every time step, simulating the application of a force. For example, if two nodes are further apart than the desired distance, a force is applied to both nodes to pull them together. The implementation also supports incomplete graphs, so the links have to be specified explicitly. By default, the force is also scaled on each node depending on how many springs it is attached to, in order to balance the force applied to heavily and lightly connected nodes and improve stability. Without such scaling, the graph would expand in every direction.

Looking at the use case of multidimensional scaling, many features are unused and could be removed to reduce computation time and memory usage. Firstly, to accommodate an incomplete graph, the force scaling has to be calculated for each node and each link. The calculated values are then cached in a similar manner to the distances ($bias$ and $strengths$ in code \ref{lst:impl_LinkD3}). In a fully-connected graph, these values are the same for every link and node. To save memory and startup time, the arrays are replaced by a single number each.

\begin{lstlisting}[language=JavaScript,caption={Force calculation function of Force Link as implemented in D3.},label={lst:impl_LinkD3}]
function force(alpha) {
  for (var k = 0, n = links.length; k < iterations; ++k) {
    for (var i = 0, link, source, target, x, y, l, b; i < n; ++i) {
      link = links[i], source = link.source, target = link.target;
      x = target.x + target.vx - source.x - source.vx || jiggle();
      y = target.y + target.vy - source.y - source.vy || jiggle();
      l = Math.sqrt(x * x + y * y);
      l = (l - distances[i]) / l * alpha * strengths[i];
      x *= l, y *= l;
      target.vx -= x * (b = bias[i]);
      target.vy -= y * b;
      source.vx += x * (b = 1 - b);
      source.vy += y * b;
    }
  }
}
\end{lstlisting}

Secondly, D3's Force Link requires the user to specify an array of links to describe the graph. Each link is a string-indexed dictionary, which is not the most memory-friendly data type. The cached distance values are stored in a separate array whose indices parallel those of the links array. Since the nodes are also stored in an array, the links array is entirely replaced with a nested loop over the nodes array, reducing the memory footprint even further and eliminating the time required to construct the array. The index for the cached distance is then adjusted accordingly.

\begin{lstlisting}[language=JavaScript,caption={Part of the customised force calculation function.},label={lst:impl_LinkCustom}]
function force(alpha) {
  let n = nodes.length;
  ...
  for (var k = 0, source, target, i, j, x, y, l; k < iterations; ++k) {
    for (i = 1; i < n; i++) for (j = 0; j < i; j++) { // For each link
      source = nodes[i];
      target = nodes[j];
      // jiggle so l won't be zero and cause a divide-by-zero error below
      x = target.x + target.vx - source.x - source.vx || jiggle();
      y = target.y + target.vy - source.y - source.vy || jiggle();
      l = Math.sqrt(x * x + y * y);
      // dataSizeFactor = 0.5/(nodes.length-1), pre-calculated only once
      l = (l - distances[i*(i-1)/2+j]) / l * dataSizeFactor * alpha;
      x *= l, y *= l;
      target.vx -= x;
      target.vy -= y;
      source.vx += x;
      source.vy += y;
    }
  }
  ...
}
\end{lstlisting}
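
The listing above relies on a pre-computed $distances$ array addressed by the triangular index $i(i-1)/2+j$. As a minimal sketch of how that cache could be built, assuming a hypothetical helper \texttt{distance(a, b)} that returns the high-dimensional distance between two data points (the plugin's actual construction may differ):

\begin{lstlisting}[language=JavaScript,caption={Sketch of building the flat distance cache used by the customised Link Force. The helper distance(a, b) is hypothetical.}]
// Pre-compute the cached distances for a fully-connected graph,
// using the same triangular indexing as the customised force above.
function initializeDistances(nodes) {
  var n = nodes.length;
  // A typed array keeps the cache compact; a plain array would also work.
  var distances = new Float64Array(n * (n - 1) / 2);
  for (var i = 1; i < n; i++) {
    for (var j = 0; j < i; j++) {
      distances[i * (i - 1) / 2 + j] = distance(nodes[i], nodes[j]);
    }
  }
  return distances;
}
\end{lstlisting}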

After optimisation, the execution time decreases marginally while memory consumption decreases by a seventh, raising the data size limit from 3,200 data points\cite{LastYear} to over 10,000 in the process. Details of the evaluation procedure and the data size limitation are discussed in section \ref{ssec:eval_ram}.

\begin{figure}[h] % Link force optimisation
	\centering
	\includegraphics[height=5cm]{graphs/linkOptimize.png}
	\caption{A comparison of memory usage and execution time between versions of Link Force at 3,000 data points from the Poker Hands data set for 300 iterations.}
	\label{fig:imp_linkComparison}
\end{figure}

Finally, a feature is added to track the average force applied to the system in each iteration. A threshold value can be set so that once the average force falls below it, a callback is invoked, allowing the user to stop the simulation. This feature is heavily used in the evaluation process (section \ref{ssec:eval_termCriteria}).
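
As an illustration of how the stopping feature described above might be used from the caller's side, the sketch below wires a threshold and a stop callback into a d3-force simulation. The accessor names \texttt{velocityThreshold} and \texttt{onStable}, as well as \texttt{customLinkForce}, are illustrative placeholders rather than the actual API of the modified force; \texttt{d3.forceSimulation} and \texttt{simulation.stop()} are standard d3-force calls.

\begin{lstlisting}[language=JavaScript,caption={Sketch of stopping a simulation once the average force falls below a threshold. Accessor names are illustrative.}]
// Stop the simulation once the tracked average force per node drops
// below a chosen cut-off, instead of running a fixed iteration count.
var simulation = d3.forceSimulation(nodes)
    .alphaDecay(0) // keep alpha constant; termination is handled manually
    .force("link", customLinkForce()
        .velocityThreshold(0.002)  // cut-off constant, chosen per data set
        .onStable(function() {
          simulation.stop();       // the layout is considered stable
        }));
\end{lstlisting}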
%============================
\subsection{Chalmers' 1996}
Force scaling
@@ -232,14 +334,13 @@ The Poker Hands is another classification dataset containing possible hands of 5

\begin{figure}
	\centering
	\includegraphics[height=6cm]{layout/Link10000Stable_crop.png}
	\caption{Visualisation of 10,000 data points from the Poker Hands data set, using Link Force.}
	\label{fig:eval_idealSample}
\end{figure}

The Antarctic data set contains 2,202 measurements by remote sensing probes taken over 2 weeks at a frozen lake in the Antarctic. Features include water temperature, UV radiation levels, ice thickness, etc. The data was formatted into CSV by Greg Ross and is used to represent a data set with a complex structure. Due to the relatively small size of this data set, it is only used to compare the ability to show fine details.

\section{Experimental Setup}
Hardware and the web browser can greatly impact JavaScript performance. In addition to the code and the data set, these variables have to be controlled as well.
The computers used are all the same model of Dell All-in-One desktop with an Intel\textregistered{} Core\texttrademark{} i5-3470S and 8GB of DDR3 memory, running CentOS 7 with Linux 3.10-x86-64.
As for the web browser, the official 64-bit build of Google Chrome 61.0.3163.79 is used both to run the experiments and to analyse CPU and memory usage with its performance profiling tool.
@@ -248,14 +349,10 @@ Other unrelated parameters have to also be controlled as much as possible. The s

\subsection{Termination criteria}
\label{ssec:eval_termCriteria}
Both Link Force and the Chalmers' 1996 algorithm create a layout that stabilises over time. In D3, calculations are performed for a predefined number of iterations. This has the drawback of requiring an appropriate value to be selected: choosing a number that is too high means execution time is wasted calculating minute details with no visible change to the layout, while a number that is too low can result in a bad layout.
Determining this constant can be problematic, considering that each algorithm may stabilise after a different number of iterations, especially when the interpolation result can vary greatly from run to run.

An alternative method is to stop when a condition is met. One such condition proposed is the difference in velocity of the system between iterations\cite{Algo2002}. In other words, once the amount of force applied in an iteration is lower than a scalar threshold, the calculation may stop. Taking note of the stress and the average force applied over multiple iterations, as illustrated in figure \ref{fig:eval_stressVeloOverTime}, it is clear that Link Force converges to complete stillness while the Chalmers algorithm's average force reaches and fluctuates around a constant. Because the $Samples$ set keeps changing randomly, the system will not reach a state where all spring forces cancel each other out completely. This is also reflected in the animation, where every node keeps wiggling about but the overall layout remains constant. It can also be seen that the stress of each layout converges to a minimal value as the average force converges to a constant, indicating that the best layout from each algorithm can be obtained once the system stabilises.

\begin{figure}
	\centering
@@ -263,14 +360,19 @@ By selecting this termination condition, the goal of the last phase of the Hybri
	\caption{A log-scaled graph showing the stress and the average force applied per iteration over time.} %10,000 data points
	\label{fig:eval_stressVeloOverTime}
\end{figure}

Since stress takes too long to calculate at every iteration, the termination criterion selected is the average force applied. This criterion is used for all 3 algorithms for consistency. The cut-off constant is then manually selected for each algorithm and each subset used. Link Force's threshold is a value low enough that there are no visible changes and stress has reached near its minimum. The Chalmers' threshold is the lowest possible value that is reached most of the time.

By selecting this termination condition, the goal of the last phase of the Hybrid Layout algorithm is flipped. Rather than performing the Chalmers' algorithm over the whole data set to correct interpolation errors, the interpolation phase's role is to help the final phase reach stability quicker. Thus, parameters of the interpolation phase cannot be evaluated on their own. Taking more time to produce a better interpolation result may or may not affect the number of iterations in the final phase, creating the need to balance the time spent on interpolation against the time it saves.

%============================

\subsection{Selecting Parameters}
\label{ssec:eval_selectParams}
Some of the algorithms have variables that are predefined constants. Care has to be taken in choosing these values, as bad choices could cause an algorithm to produce bad results or take unnecessarily long to compute. To compare each algorithm fairly, an optimal set of parameters has to be chosen for each.

Chalmers' algorithm has two adjustable parameters: $Neighbours_{size}$ and $Samples_{size}$.
According to previous evaluations\cite{LastYear}\cite{Algo2002}, a favorable layout could be achieved with values as low as $10$ for both variables. Preliminary testing seems to confirm these findings, and these values are selected for the experiments. On the other hand, Link Force has no adjustable parameters whatsoever, so no attention is required.

\begin{figure}
	\centering
@@ -280,87 +382,247 @@ According to previous evaluations\cite{LastYear}\cite{Algo2002}, favorable layou
	\label{fig:eval_pivotHits}
\end{figure}

The Hybrid layout has multiple parameters during the interpolation phase. For the parent-finding stage, there is a choice of whether to use the brute-force or the pivot-based searching method. In the pivot-based case, the number of pivots ($k$) also has to be chosen. Experiments have been run to find the accuracy of pivot-based searching, starting from 1 pivot, to determine reasonable numbers to use in subsequent experiments. As shown in figure \ref{fig:eval_interpVariations}, the randomly selected $S$ set (the $\sqrt{N}$ samples used in the first stage) can greatly affect the interpolation result, especially with smaller data sets containing many small clusters. Therefore, each test has to be run multiple times to generalise the result. From figure \ref{fig:eval_pivotHits}, it can be seen that the more pivots are used, the higher the accuracy and consistency. Diminishing returns can be observed at around 6 to 10 pivots, depending on the number of data points. Hence, higher numbers of pivots are not considered in the experiments.
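
To make the parent-finding choice concrete, the sketch below illustrates one way pivot-based searching can be organised: each pivot hashes the members of $S$ into buckets by their distance to that pivot, and a query only examines samples whose distance to each pivot is similar to its own. The helper \texttt{distance(a, b)} and the exact bucket layout are assumptions for illustration, not necessarily the implementation evaluated here.

\begin{lstlisting}[language=JavaScript,caption={Simplified sketch of pivot-based parent searching over the sample set $S$. Details are illustrative.}]
// Build, for each of k pivots, a table of samples keyed by their
// (quantised) distance to that pivot.
function buildPivotIndex(samples, k, bucketWidth) {
  var pivots = samples.slice(0, k); // e.g. take the first k samples as pivots
  var buckets = pivots.map(function(p) {
    var table = new Map();
    samples.forEach(function(s) {
      var key = Math.floor(distance(p, s) / bucketWidth);
      if (!table.has(key)) table.set(key, []);
      table.get(key).push(s);
    });
    return table;
  });
  return { pivots: pivots, buckets: buckets, bucketWidth: bucketWidth };
}

// Approximate nearest member of S: only inspect samples whose distance
// to each pivot is close to the query's distance to that pivot.
function findParent(query, index) {
  var best = null, bestDist = Infinity;
  index.pivots.forEach(function(p, i) {
    var key = Math.floor(distance(p, query) / index.bucketWidth);
    [key - 1, key, key + 1].forEach(function(k) {
      (index.buckets[i].get(k) || []).forEach(function(s) {
        var d = distance(query, s);
        if (d < bestDist) { bestDist = d; best = s; }
      });
    });
  });
  return best;
}
\end{lstlisting}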

Finally, the last step of interpolation is to refine the placement a constant number of times. Preliminary testing shows that this step cleans up a lot of interpolation artifacts. For example, a clear radial pattern and straight lines can be seen in figure \ref{sfig:eval_refineCompareA}, especially in the lower right corner. While these artifacts are no longer visible in figure \ref{sfig:eval_refineCompareB}, it is still impossible to obtain a desirable layout, even after more refinement steps were added. Hence, the Chalmers' algorithm has to be run over the entire data set after the interpolation phase. For the rest of the experiments, only two values, 0 and 20, were selected, representing runs without and with interpolation artifact cleaning.

\begin{figure}[h]
	\centering
	\begin{subfigure}{\textwidth}
		\includegraphics[height=5.5cm]{layout/interpVar1A.png}
		\includegraphics[height=5.5cm]{layout/interpVar1B.png}
		\caption{An example of an interpolation result with a more balanced $S$}
	\end{subfigure}

	\begin{subfigure}{\textwidth}
		\includegraphics[height=6cm]{layout/interpVar2A.png}
		\includegraphics[height=6cm]{layout/interpVar2B.png}
		\caption{An example of an interpolation result with a less balanced $S$, containing samples only from the classes "Unrecognized" and "One pair" (colored blue and orange respectively), resulting in low accuracy when interpolating points of other classes}
	\end{subfigure}
	\caption{Difference in interpolation results for a subset of 1,000 data points. The left images show only the data points in set $S$; the right images show the result immediately after interpolation.}
	\label{fig:eval_interpVariations}
\end{figure}

\begin{figure}[h]
	\centering
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=5cm]{layout/refineCompareA.png}
		\caption{No interpolation refinement} \label{sfig:eval_refineCompareA}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=5cm]{layout/refineCompareB.png}
		\caption{20 steps of interpolation refinement} \label{sfig:eval_refineCompareB}
	\end{subfigure}
	\caption{A comparison between interpolation results without and with refinement.}
	\label{fig:eval_refineCompare}
\end{figure}
%============================

\subsection{Performance metrics}
As discussed in section \ref{sec:bg_metrics}, there are three main criteria to evaluate each algorithm: execution time, memory consumption, and the produced layout. Although stress is a good metric to judge the quality of a layout, it does not necessarily mean that layouts with the same stress are equally good for data exploration. Thus, the look of the layout itself has to be compared as well. Since both the Chalmers' and the Hybrid algorithm have the goal of mimicking Link Force's result while cutting cost as much as possible, the layout of Link Force is used as a baseline for comparison (figure \ref{fig:eval_idealSample}). The closer a layout is to the baseline, the better.

It should also be noted that, for ease of comparison, the visualisations may be uniformly scaled and rotated. This manipulation should not affect the evaluation as the only concern of a spring model is the relative distance between data points.
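
For reference, the stress values reported in the remainder of this chapter are assumed to follow the usual normalised form, where $d_{ij}$ is the high-dimensional distance between points $i$ and $j$ and $\hat{d}_{ij}$ is the corresponding distance in the layout; the exact variant is the one defined in section \ref{sec:bg_metrics}:
\[
	Stress = \frac{\sum_{i<j}\left(d_{ij} - \hat{d}_{ij}\right)^{2}}{\sum_{i<j} d_{ij}^{2}}
\]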
%============================

\section{Results}

\subsection{Memory usage}
\label{ssec:eval_ram}
Google Chrome comes with performance profiling tools that allow users to measure JavaScript heap usage. While it is straightforward to measure the usage of Link Force, the Chalmers' 1996 algorithm causes problems with the garbage collector. Because the $Samples$ sets and, to a certain degree, the $Neighbours$ sets are reconstructed at every iteration, a lot of new memory is allocated and the old allocations are left unreachable, waiting to be reclaimed. As a result, the JS heap usage keeps increasing until the GC runs, even though the actual usage is theoretically constant across iterations (figure \ref{fig:eval_neighbourRam}). Even though GC is designed to be run only automatically by the JavaScript engine, Google Chrome allows it to be invoked manually in the profiling tool. For this experiment, GC is invoked manually and periodically during part of the run. The usage immediately after garbage collection is then recorded and used for comparison. The peak before GC is automatically invoked is also noted.

\begin{figure}
	\centering
	\includegraphics[height=3cm]{graphs/neighbourRam.png}
	\caption{Fluctuating JavaScript heap usage due to frequent memory allocation by the Chalmers' algorithm. GC is invoked manually every second in the first half of the run and automatically in the second half.}
	\label{fig:eval_neighbourRam}
\end{figure}
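
As a side note, besides the DevTools profiler used for the reported numbers, Chrome also exposes the non-standard \texttt{performance.memory} object, which would allow the heap to be sampled programmatically; a small sketch of that alternative approach:

\begin{lstlisting}[language=JavaScript,caption={Sketch of sampling JS heap usage with Chrome's non-standard performance.memory API (not the method used for the reported measurements).}]
// Record the used JS heap size once per second while the layout runs.
var samples = [];
var timer = setInterval(function() {
  if (performance.memory) { // Chrome-only, non-standard
    samples.push({
      t: performance.now(),
      usedMiB: performance.memory.usedJSHeapSize / 1048576
    });
  }
}, 1000);
// Call clearInterval(timer) once the layout terminates, then inspect samples.
\end{lstlisting}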

The Hybrid layout has multiple phases, each with a different theoretical memory complexity. As far as the interpolation phase is concerned, the bucket storage for pivot-based searching requires the most memory, at $k \cdot S_{size}=O(\sqrt{N})$. Compared with the $N(Neighbours_{size}+Samples_{size}) = O(N)$ of the Chalmers' algorithm used in the final phase, the overall memory requirement should therefore equal that of the 1996 algorithm.

\begin{figure}
	\centering
	\includegraphics[height=5cm]{graphs/ram.png}
	\includegraphics[height=5cm]{graphs/ramMultiSize.png}
	\caption{Comparison of memory usage of each algorithm}
	\label{fig:eval_ram}
\end{figure}

The comparison has been made between the 3 algorithms, with the Hybrid layout running 10 pivots to represent the worst-case scenario for interpolation. Rendering is also turned off to minimise the impact of DOM element manipulation\cite{LastYear}. The results are displayed in figure \ref{fig:eval_ram}. The modified Link Force, which uses less memory than D3's implementation (section \ref{sec:imp_linkForce}), scales badly compared to all the others, even with automatic garbage collection. The difference in base memory usage between the 1996 algorithm and the final stage of the Hybrid layout is also within the margin of error, confirming that they both have the same memory requirement. If the final phase of the Hybrid layout is skipped, the memory requirement grows at a slightly lower rate.

Due to JavaScript limitations, Link Force crashes the browser tab at 50,000 data points before any spring force is calculated, failing the test entirely. Similar behaviour can also be observed with D3's implementation. In contrast, the Chalmers' algorithm (and the Hybrid) can process as many as 470,000 data points. Interestingly, while the Chalmers' algorithm can also handle 600,000 data points with rendering, the 8GB of memory is completely used up, causing heavy thrashing and slowing down the entire machine. Considering that paging does not occur when Link Force crashes the browser tab, memory requirement may not be the only limiting factor in play.

All in all, since a desirable result cannot be obtained from the Hybrid algorithm if the final stage is skipped, there is no benefit in terms of memory usage from using the Hybrid layout compared to the Chalmers' algorithm. Both have a much smaller memory footprint than Link Force and can work with far more data points under the same hardware constraints.
%============================

\subsection{Different Parameters for the Hybrid Layout}
In section \ref{ssec:eval_termCriteria}, it has been concluded that the values of the parameters cannot be evaluated on their own. Based on the findings discussed in section \ref{ssec:eval_selectParams}, 10 different combinations of interpolation parameters were chosen: brute-force and 1, 3, 6, and 10 pivots, each with and without refinement at the end. Due to possible variations from the sample set $S$, each experiment is also performed 5 times. The data sets used are Poker Hands with 10,000 data points, the largest size at which stress can be calculated before crashing the web page, and with 100,000 data points to highlight the widening difference in interpolation time.

\begin{figure}
	\centering
	\includegraphics[height=5cm]{graphs/hybridParams_2ndTime_10k.png}
	~
	\includegraphics[height=5cm]{graphs/hybridParams_stress_10k.png}
	\includegraphics[height=5cm]{graphs/hybridParams_totalTime_10k.png}
	\caption{Comparison of different interpolation parameters of the Hybrid layout at 10,000 data points.}
	\label{fig:eval_hybridParams10k}
\end{figure}

Figures \ref{fig:eval_hybridParams10k} and \ref{fig:eval_hybridParams100k} show that most of the execution time is spent in the final phase, making the number of iterations very important. While refining the interpolation result takes a nearly insignificant amount of time, it both reduces the stress of the final layout and helps the last phase reach stability much faster across the board. Figure \ref{fig:eval_pivotToggleRefine} also suggests that the produced layout is much more accurate. Without refining, a lot of "One pair" (orange) and "Two pairs" (green) data points circle around "Unrecognized" (blue) when they should not. Thus, there is no compelling reason to disable this refinement step.

Surprisingly, despite the lower time complexity, selecting a higher number of pivots on the smaller data set can result in a higher execution time than brute-force, negating any benefit of using it. At 10,000 data points, 3 pivots take approximately as much time as brute-force, marking the highest sensible number of pivots. Looking at the lower numbers, the time saved by using 1 pivot is not reflected in the total time used by the layout. At 100,000 points, however, a significant speed-up can be observed in the interpolation phase and is reflected in the total execution time. This suggests that pivot-based searching could shine with even larger data sets and slower distance functions.

\begin{figure}
	\centering
	\includegraphics[height=5cm]{graphs/hybridParams_2ndTime_100k.png}
	~
	\includegraphics[height=5cm]{graphs/hybridParams_totalTime_100k.png}
	\caption{Comparison of different interpolation parameters of the Hybrid layout at 100,000 data points.}
	\label{fig:eval_hybridParams100k}
\end{figure}

\begin{figure}
	\centering
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=5cm]{layout/Pivot6_0_100k.png}
		\caption{No interpolation refinement}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=5cm]{layout/Pivot6_20_100k.png}
		\caption{20 steps of Interpolation refinement}
	\end{subfigure}
	\caption{Comparison of the produced layout using 6 pivots, 100,000 data points}
	\label{fig:eval_pivotToggleRefine}
\end{figure}

Between brute-force and 1 pivot, there is no visual difference aside from run-to-run variation. The stress measurements seem to support this subjective opinion. On the other hand, brute-force seems to result in a more consistent total execution time. Considering that refinement is stronger with a bigger data set, as there are more points to compare against, it makes sense that the effect of low searching accuracy is more easily corrected in a larger data set.

In summary, to obtain a quality layout, the refinement step of the interpolation phase cannot be skipped. Pivot-based searching only provides a significant benefit with very large data sets and/or slow distance functions. Otherwise, the brute-force method can yield a better layout in consistently less time.
%============================

\subsection{Comparison between algorithms}
Figure \ref{fig:eval_multiAlgo} shows the execution time and the stress of the produced layout for each algorithm on various data sets. The results reveal that the Hybrid algorithm is superior to the other algorithms across the board. The difference compared to Chalmers' algorithm is so large that the time difference between parameter settings seems insignificant. It should be noted that with smaller data sets, the processing time of each iteration can be faster than 17 milliseconds, the time between frames on a typical monitor running at 60 frames per second. In D3-force, the processing is put on idle until the next screen refresh. As a result, the total execution time is limited by the number of iterations.

\begin{figure}[h] % GRAPH
	\centering
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=4cm]{graphs/multiAlgoTime.png}
		\caption{Execution time for up to 100,000 data points of the Poker Hands data set}
		\label{sfig:eval_multiAlgoTime}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}{0.45\textwidth}
		\includegraphics[height=4cm]{graphs/multiAlgoStress.png}
		\caption{Relative stress compared to Link Force on different data sets}
		\label{sfig:eval_multiAlgoStress}
	\end{subfigure}
	\caption{Comparison between the different algorithms}
	\label{fig:eval_multiAlgo}
\end{figure}

\begin{figure} % Poker 10k
	\centering
	\begin{subfigure}[t]{0.3\textwidth}
		\includegraphics[width=\textwidth]{layout/Poker10kLink.png}
		\caption{Link Force}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}[t]{0.3\textwidth}
		\includegraphics[width=\textwidth]{layout/Poker10kNeighbour.png}
		\caption{Chalmers' 1996}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}[t]{0.3\textwidth}
		\includegraphics[width=\textwidth]{layout/Poker10kHybrid.png}
		\caption{Hybrid Layout}
	\end{subfigure}
	\caption{Visualisations of 10,000 data points from the Poker Hands data set.}
	\label{fig:eval_Poker10k}
\end{figure}

As for the stress, a relative value is used for comparison. Figure \ref{sfig:eval_multiAlgoStress} shows that the Hybrid algorithm results in a layout with lower stress overall. The trend also implies that the more data points are available, the better the Chalmers' and Hybrid algorithms perform.
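
The relative stress plotted in figure \ref{sfig:eval_multiAlgoStress} is taken here to be the ratio of each layout's stress to that of the Link Force layout on the same subset, so a value of $1$ means equal stress; this normalisation is an assumption about how the plotted values were derived:
\[
	Stress_{relative} = \frac{Stress_{algorithm}}{Stress_{Link\;Force}}
\]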

Comparing the produced layouts at 10,000 data points (figure \ref{fig:eval_Poker10k}), Hybrid better reproduces the space between large clusters seen in Link Force's layout. For example, "Unrecognized" (blue) and "One pair" (orange) have a clearer gap; "Two pairs" (green) and "Three of a kind" (red) overlap less; "Three of a kind" and "Straight" (brown) mix together in Chalmers' layout but are more separated in the Hybrid layout. However, for the other classes with fewer data points (colored brown, purple, pink, ...), the Hybrid layout fails to form clusters, causing them to spread out even more. The same phenomenon can be observed at 100,000 data points (figure \ref{fig:eval_Poker100k}).

\begin{figure} % Poker 100k
	\centering
	\begin{subfigure}[t]{0.6\textwidth}
		\includegraphics[width=\textwidth]{layout/Poker100kNeighbour.png}
		\caption{Chalmers' 1996}
	\end{subfigure}

	\begin{subfigure}[t]{0.6\textwidth}
		\includegraphics[width=\textwidth]{layout/Poker100kHybrid.png}
		\caption{Hybrid Layout}
	\end{subfigure}
	\caption{Visualisations of 100,000 data points from the Poker Hands data set.}
	\label{fig:eval_Poker100k}
\end{figure}

Moving to the Antarctic data set with its more complicated pattern, all three algorithms produce very similar results. The biggest clustering difference is located around the top centre of the image. In Link Force, day 17 (brown) and day 18 (lime) are lined up clearly, compared to the others, which fail to replicate this fine detail. The Hybrid layout also fails to distinguish days 17, 18, 19 (pink) and 20 (grey) from each other in that corner. Aside from that, Hybrid forms a layout slightly more similar to that of Link Force. Considering that the times used by Link Force, the 1996 algorithm, and Hybrid are approximately 14, 8, and 3.2 seconds respectively, it is hard to argue against using the Hybrid layout.

\begin{figure} % Antartica
	\centering
	\begin{subfigure}[t]{0.6\textwidth}
		\includegraphics[width=\textwidth]{layout/AntarticaLinkDay.png}
		\caption{Link Force}
	\end{subfigure}
	\begin{subfigure}[t]{0.6\textwidth}
		\includegraphics[width=\textwidth]{layout/AntarticaNeighbourDay.png}
		\caption{Chalmers' 1996}
	\end{subfigure}
\end{figure}
\begin{figure}
	\centering
	\ContinuedFloat
	\begin{subfigure}[t]{0.6\textwidth}
		\includegraphics[width=\textwidth]{layout/AntarticaHybridDay.png}
		\caption{Hybrid Layout}
	\end{subfigure}
	\caption{Visualisations of the Antarctic data set, color-keyed by day.}
	\label{fig:eval_Antartica}
\end{figure}

The area where the 1996 and Hybrid algorithms fall short is the consistency of layout quality with smaller data sets. Sometimes, both algorithms stop at a local minimum of stress instead of the global one, resulting in an inaccurate layout. Figures \ref{fig:eval_IrisBad} and \ref{fig:eval_Poker100Bad} show examples of such occurrences. If the 1996 algorithm were allowed to continue the calculation, the layout would eventually reach the true stable position, depending on when the right combination of $Samples$ sets is randomised to knock the system out of its locally stable position.

\begin{figure} % Iris BAD
	\centering
	\begin{subfigure}[t]{0.45\textwidth}
		\includegraphics[height=4cm]{layout/IrisNeighbour.png}
		\caption{Typical result from the 1996 algorithm}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}[t]{0.45\textwidth}
		\includegraphics[height=4cm]{layout/IrisNeighbourStuck.png}
		\caption{Result from the 1996 algorithm with an Iris Versicolour point (orange) stuck behind the Iris Virginica (green) cluster.}
	\end{subfigure}
	\caption{Variations of the result from the Chalmers' 1996 algorithm on the Iris data set with the same parameters.}
	\label{fig:eval_IrisBad}
\end{figure}

\begin{figure}[h] % Poker 100 BAD
	\centering
	\begin{subfigure}[t]{0.45\textwidth}
		\includegraphics[height=5cm]{layout/Poker100Hybrid.png}
		\caption{A result with lower stress.}
	\end{subfigure}
	~ %add desired spacing between images, if blank, line break
	\begin{subfigure}[t]{0.45\textwidth}
		\includegraphics[height=5cm]{layout/Poker100HybridBad.png}
		\caption{A result with higher stress.}
	\end{subfigure}
	\caption{Variations of the result from the Hybrid layout on 100 data points from the Poker Hands data set with the same parameters.}
	\label{fig:eval_Poker100Bad}
\end{figure}
%============================

\section{Summary}
Each algorithm demonstrates its own strengths and weaknesses in the different tests. Link Force works well and performs consistently for smaller data sets. Most information visualisations on a web page will not hit any limitation of the algorithm. In addition, it allows real-time object interaction and produces smooth animations, which might be more important to most users. However, for a fully-connected spring model with more than 1,000 data points, the startup time spent on distance caching starts to become noticeable and each iteration takes longer than the 17ms time limit, dropping the animation below 60fps and causing visible lag and slowdown. Its memory-hungry nature also limits its ability to run on the lower-end computers that a significant proportion of Internet users possess.

When interactivity is not a concern, performing the Hybrid layout's interpolation strategy before running the 1996 algorithm results in a better layout in a shorter amount of time. However, it is not the solution for every case. The method works neither well nor consistently with smaller data sets, making Link Force the better option there. As for the interpolation, the simple brute-force method is the better choice in general. Pivot-based searching does not significantly decrease the computation time, even with a relatively large data set, and its result is less predictable.

Overall, these algorithms are all valuable tools. It is up to the developer to use the right tool for the application.
%============================