\documentclass{l4proj}
\usepackage{url}
\usepackage{hyperref}
\usepackage{fancyvrb}
\usepackage[final]{pdfpages}
\usepackage{algpseudocode}
\usepackage{wrapfig}
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{listings}
\usepackage{color}
\usepackage{multicol}
\usepackage[super]{nth}
\renewcommand{\lstlistingname}{Code}% Listing -> Algorithm
%define Javascript language
\definecolor{lightgray}{rgb}{.9,.9,.9}
\definecolor{darkgray}{rgb}{.4,.4,.4}
\definecolor{purple}{rgb}{0.65, 0.12, 0.82}
\lstdefinelanguage{JavaScript}{
keywords={typeof, new, true, false, catch, function, return, null, switch, var, if, in, while, do, else, case, break, let, for},
keywordstyle=\color{blue}\bfseries,
ndkeywords={class, export, boolean, throw, implements, import, this},
ndkeywordstyle=\color{darkgray}\bfseries,
identifierstyle=\color{black},
sensitive=false,
comment=[l]{//},
morecomment=[s]{/*}{*/},
commentstyle=\color{purple}\ttfamily,
stringstyle=\color{red}\ttfamily,
morestring=[b]',
morestring=[b]"
}
\lstset{
language=JavaScript,
backgroundcolor=\color{lightgray},
extendedchars=true,
basicstyle=\footnotesize\ttfamily,
showstringspaces=false,
showspaces=false,
numbers=left,
numberstyle=\footnotesize,
numbersep=9pt,
tabsize=2,
breaklines=true,
showtabs=false,
captionpos=b
}
\begin{document}
\title{Faster force-directed layout algorithms for the D3 visualisation toolkit}
\author{Pitchaya Boonsarngsuk}
\date{March 21, 2018}
\maketitle
\begin{abstract}
In the past few years, data visualisation on the web has become increasingly popular. D3, a JavaScript library, has a module that focuses on simulating physical forces on particles for creating force-directed layouts. However, the currently available algorithms do not scale very well for multidimensional scaling. To solve this problem, the Hybrid Layout algorithm and its pivot-based near-neighbour search enhancement were implemented and integrated into the D3 module. The existing D3 and Bartasius implementations of Chalmers' 1996 algorithm were also optimised for this use case and compared against the Hybrid algorithm. Furthermore, experiments were performed to evaluate the impact of each user-defined parameter. The results show that for larger data sets, the Hybrid Layout consistently produces fairly good layouts in a shorter amount of time. It is also capable of working on larger data sets than D3's algorithm.
\end{abstract}
%\educationalconsent
%
%NOTE: if you include the educationalconsent (above) and your project is graded an A then
% it may be entered in the CS Hall of Fame
%
\tableofcontents
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% Introduction %
% %
%%%%%%%%%%%%%%%%
\chapter{Introduction}
\label{ch:intro}
\pagenumbering{arabic} % ONLY DO THIS AT THE FIRST CHAPTER
\section{Motivation}
In the age of Web 2.0, new data are being generated at an overwhelming speed. Raw data made up of numbers, letters, and boolean values are hard for humans to comprehend, and it is difficult to infer any relation from them. To make it easier and faster for us, humans, to understand a data set, various techniques have been created to map raw data to a visual representation.
Many data sets have many features, while humans perceive 2D illustrations best, leading to the challenge of dimensionality scaling. There are many approaches to this problem, each with its own pros and cons. One of the approaches is multidimensional scaling (MDS), which highlights the similarity and clustering of data to the audience. The idea is to map each data point to a particle in 2D space and place the particles in a way that the distance between each pair of particles in 2D space represents their distance in high-dimensional space.
\begin{figure}[h]
\centering
\includegraphics[height=5cm]{d3-samples/d3-horizons-chart.png}
~
\includegraphics[height=5cm]{d3-samples/d3-radial-box.png}
\includegraphics[height=5cm]{d3-samples/d3-wordcloud.png}
~
\includegraphics[height=5cm]{d3-samples/d3-sunburst-partition.png}
\caption{Several different visualisations based on the D3 framework: Horizon Chart (top left), Radial Boxplot (top right), Word Cloud (lower left) and Sunburst Partition (lower right).}
\label{fig:intro_d3Samples}
\end{figure}
With the recent trend of moving away from traditional native applications to easily-accessible cross-platform web applications, many data visualisation toolkits for JavaScript, such as Google Charts, Chart.js and D3.js, have been developed. With these libraries, it is easier for website designers and content creators to set up interactive, attention-grabbing infographics, allowing more people to understand their work with less cognitive load.
One of the most popular free open-source data visualisation libraries is Data Driven Documents\cite{D3}\cite{D3Web}. The premise is to bind arbitrary raw data to web page content and then apply data-driven transformations, breathing life into it, all while only using standard web technologies and avoiding any restrictions from proprietary software. This makes the library highly accessible, allowing applications to reach a wider audience.
The D3-Force module, part of the D3 library, provides a framework for simulating physical forces on particles. Along with that, a spring-model algorithm was also implemented to allow for the creation of a force-directed layout. While the implementation is fast for several thousand particles, it does not scale well with larger data sets, both in terms of memory and time complexity. By solving these issues, the use cases covered by the module would expand to support more complicated data sets. The motivation of the project is to address these scalability issues with better algorithms from the School of Computing Science.
\section{Project Description}
The University of Glasgow's School of Computing Science has developed some of the fastest force-directed layout drawing algorithms in the world. Among these are Chalmers' 1996 Neighbour and Sampling technique\cite{Algo1996}, the 2002 Hybrid Layout algorithm\cite{Algo2002} and its 2003 enhanced variant\cite{Algo2003}. These algorithms provide huge improvements, both in terms of speed and memory complexity. However, they are only implemented in an older version of Java, which limits their practical use. In 2017, Bartasius implemented the 1996 algorithm along with several others and a visual interface in order to compare each algorithm against the others\cite{LastYear}.
In short, the goal of the project is to
\begin{itemize}
\item implement Hybrid Layout algorithms from School of Computing Science in JavaScript
\item integrate the implementation into the D3 library and Bartasius' tool set
\item optimise existing implementations of basic spring model and Chalmers' algorithm
\item evaluate and compare each algorithm
\end{itemize}
\section{Outline}
The remainder of the report will discuss the following:
\begin{itemize}
\item \textbf{Background} This chapter discusses approaches to visualising high-dimensional data and introduces the theory behind each implemented algorithm.
\item \textbf{Design} This chapter discusses the choices of technologies.
\item \textbf{Implementation} This chapter briefly shows decisions and justifications made during the implementation process, along with several code snippets.
\item \textbf{Evaluation} This chapter details the process used to compare the performance of each algorithm, from the experiment design to the final results.
\item \textbf{Conclusion} This chapter gives a brief summary of the project, reflects on the process in general, and discusses possible future improvements.
\end{itemize}
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% Background %
% %
%%%%%%%%%%%%%%%%
\chapter{Background}
\label{ch:bg}
With the emergence of more complex data, each record having many features, the need to map high-dimensional data down to 2D space is increasing. Figure \ref{fig:bg_many_multidimension} shows two approaches to the problem. One of the earliest methods is to align graphs on the basis of one axis they all share. While it is still being used, its use cases are limited because all graphs have to share an axis. On the other hand, a scatterplot matrix draws a scatterplot of every pair of dimensions, allowing users to see relations between many different dimensions. However, the screen space usage rises quadratically, making it unsuitable for data with a very high number of dimensions.
\begin{figure}[h]
\centering
\begin{subfigure}{0.45\textwidth}
\centering
\includegraphics[height=6cm]{d3-samples/d3-single-axis-composition.png}
\caption{Single-axis composition}
\end{subfigure}
\begin{subfigure}{0.45\textwidth}
\centering
\includegraphics[height=6cm]{d3-samples/d3-scatterplot-matrix.png}
\caption{Scatterplot Matrix}
\end{subfigure}
\caption{Different approaches to visualise high-dimensional data}
\label{fig:bg_many_multidimension}
\end{figure}
Unlike the two previously mentioned techniques, multidimensional scaling (MDS) takes another approach, aiming to reduce data dimensionality by preserving the level of similarity rather than the values.
Classical MDS (also known as Principal Coordinates Analysis or PCA)\cite{cMDS} achieves this goal by creating new dimensions for scatter-plotting, each made up of a linear combination of the original dimensions, while minimising a loss function called strain. For simple cases, it can be thought of as finding a camera angle to project the high-dimensional scatterplot onto a 2D image.
Strain assumes Euclidean distances, making it incompatible with other dissimilarity ratings. Metric MDS improves upon classical MDS by generalising the solution to support a variety of loss functions\cite{mcMDS}. However, the disadvantage of $O(N^3)$ time complexity still remains, and a linear combination may not be enough for some data sets.
This project focuses on several non-linear MDS algorithms using force-directed layout. The idea is to attach each pair of data points with a spring whose equilibrium length is proportional to the high-dimensional distance between the two points, although the spring model we know today does not necessarily use Hooke's law to calculate the spring force\cite{Eades}. Several improvements have been introduced to the idea over the past decades. For example, the concept of 'temperature' proposed by Fruchterman and Reingold\cite{SpringTemp} solves the problem where the system is unable to reach an equilibrium state and improves execution time. The project focuses on an iterative spring-model-based algorithm introduced by Chalmers\cite{Algo1996} and the Hybrid approach, which will be detailed in subsequent sections of this chapter.
There are a number of other non-linear MDS algorithms. t-distributed Stochastic Neighbour Embedding (t-SNE)\cite{tSNE}, for example, is very popular in the field of machine learning. It is based on SNE\cite{SNE}, where probability distributions are constructed over each pair of data points in such a way that more similar objects have a higher probability of being picked. The distributions derived from the high-dimensional and low-dimensional distances are compared using the Kullback-Leibler divergence, a metric that measures the similarity between two probability distributions. The 2D position of each data point is then iteratively adjusted to maximise the similarity. The biggest downside is that it has both time and memory complexity of $O(N^2)$ per iteration. In 2017, Bartasius\cite{LastYear} implemented t-SNE in D3 and found that not only is it the slowest algorithm in his test, the produced layout is also many times worse in terms of Stress, a metric which will be introduced in section \ref{sec:bg_metrics}. However, comparing the Stress of a t-SNE layout is unfair, as t-SNE is designed to optimise the Kullback-Leibler divergence and not Stress.
Other algorithms use different approaches. Kernel PCA tricks classical MDS (PCA) into being non-linear by using kernels\cite{kPCA}. Simply put, kernel functions are used to create new dimensions from the existing ones. These kernels can be non-linear; hence, PCA can use the new dimensions to create a non-linear combination of the original dimensions. The limitation is that the kernels are user-defined; thus, it is up to the user to define good kernels to create a good layout.
Local MDS\cite{LMDS} performs a different trick, applying MDS in local regions and stitching them together using convex optimisation. While it focuses on Trustworthiness and Continuity, the errors concerning each data point's neighbourhood, its overall layouts fail to form any visible clusters.
Sammon's mapping\cite{Sammon}, on the other hand, finds a good position for each data point by using gradient descent to minimise Sammon's error, a function similar to Stress (section \ref{sec:bg_metrics}). However, gradient descent can only find a local minimum and the solution is not guaranteed to converge.
The rest of this chapter describes each of the algorithms and the performance metrics used in this project in detail.
\section{Link Force}
\label{sec:linkbg}
The D3 library, which will be described in section \ref{sec:des_d3}, has several different force models implemented for creating a force-directed graph. One of them is Link Force. In this brute-force method, a force is applied between the two nodes at the ends of each link. The force pushes the nodes together or apart with varying strength, proportional to the error between the desired and current distance on the graph. Essentially, it is the spring model with a custom spring-force calculation formula. An example of a graph produced by the D3 Link Force is shown in figure \ref{fig:bg_linkForce}. In MDS, where the high-dimensional distance between every pair of nodes can be calculated, a link is created to represent each pair, resulting in a complete graph.
\begin{figure}[h]
\centering
\includegraphics[height=9.2cm]{d3-samples/d3-force.png}
\caption{An example of a graph produced by D3 Link Force.}
\label{fig:bg_linkForce}
\end{figure}
The Link Force algorithm is inefficient. In each time step (iteration), a calculation has to be done for each pair of nodes connected with a link. This means that for MDS with $N$ nodes, the algorithm has to perform $N(N-1)$ force calculations per iteration, essentially $O(N^2)$. It is also believed that the number of iterations required to create a good layout is proportional to the size of the data set, hence the total time complexity of $O(N^3)$.
The model also caches the desired distance of each link in memory to improve speed across multiple iterations. While this greatly reduces the number of calls to the distance-calculating function, the memory complexity also increases to $O(N^2)$. Because the JavaScript memory heap is limited, it runs out of memory when trying to process a complete graph of more than around three thousand data points, depending on the features of the data.
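As a rough, purely illustrative calculation (the per-link byte figure is an assumption about typical JavaScript object overhead, not a measurement), a complete graph over $N = 3000$ points contains $$\frac{N(N-1)}{2} = \frac{3000 \times 2999}{2} \approx 4.5\times10^{6}$$ links, each holding references to its two nodes plus a cached distance. At even a few tens of bytes per link this already amounts to well over a hundred megabytes, and the link count grows to roughly $5\times10^{7}$ for 10,000 data points.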
\section{Chalmers' 1996 algorithm}
In 1996, Matthew Chalmers proposed a technique to reduce the time complexity down to $O(N^2)$, which is a massive improvement over Link Force's $O(N^3)$, potentially at the cost of accuracy. This is done by reducing the number of spring force calculations per iteration, using random samples\cite{Algo1996}.
To begin, each object $i$ is assigned two distinct sets. The $Neighbours$ set, referred to as $V$ in the original paper, stores a sorted list of the other objects that are closest to $i$, i.e. have a low high-dimensional distance. These objects are expected to be placed near $i$ in 2D space. At the start, this set is empty. The second set is $Samples$ (referred to as $S$). This set contains a number of other random objects that are not members of the $Neighbours$ set, and is regenerated at the start of every iteration.
In each iteration, each object $i$ only performs spring force calculations against the objects in its $Neighbours$ and $Samples$ sets. Afterwards, each random object is compared against the objects in the $Neighbours$ set. If a random object is closer to $i$, it is swapped into the $Neighbours$ set. As a result, the $Neighbours$ set becomes a better representative of the most similar objects to $i$.
The total number of spring calculations per iteration reduces from $N(N-1)$ to $N(Neighbours_{size} + Samples_{size})$, where $Neighbours_{size}$ and $Samples_{size}$ denote the maximum number of objects in the $Neighbours$ and $Samples$ sets, respectively. Because these two numbers are predefined constants, the time complexity per iteration is $O(N)$.
Previous evaluations indicated that the quality of the produced layout improves as $Neighbours_{size}$ and $Samples_{size}$ grow larger. For larger data sets, setting the values too small could cause the algorithm to miss some details. However, favourable results can be obtained from numbers as low as 5 and 10 for $Neighbours_{size}$ and $Samples_{size}$\cite{Algo2002}.
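To make the bookkeeping concrete, the following is a minimal JavaScript sketch of one iteration's work for a single object. It is illustrative only: \texttt{distance} and \texttt{applySpringForce} are placeholder functions, not the plug-in's actual API, and the data set is assumed to be much larger than the sample size.
\begin{lstlisting}[language=JavaScript,caption={An illustrative sketch of one Chalmers iteration for a single object (not the plug-in's actual code).}]
function chalmersIteration(node, nodes, neighbours, neighbourSize, sampleSize,
                           distance, applySpringForce) {
  // Regenerate the Samples set: random objects that are neither the node
  // itself nor already in its Neighbours set
  let samples = [];
  while (samples.length < sampleSize) {
    let candidate = nodes[Math.floor(Math.random() * nodes.length)];
    if (candidate !== node && neighbours.indexOf(candidate) < 0
        && samples.indexOf(candidate) < 0) {
      samples.push(candidate);
    }
  }
  // Spring forces are only applied against the two small sets
  neighbours.concat(samples).forEach(function (other) {
    applySpringForce(node, other, distance(node, other));
  });
  // Swap closer samples into the sorted Neighbours set
  samples.forEach(function (candidate) {
    if (neighbours.length < neighbourSize) {
      neighbours.push(candidate);
    } else if (distance(node, candidate) < distance(node, neighbours[neighbours.length - 1])) {
      neighbours[neighbours.length - 1] = candidate; // replace the current worst neighbour
    }
    neighbours.sort(function (a, b) { return distance(node, a) - distance(node, b); });
  });
}
\end{lstlisting}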
\section{Hybrid Layout for Multidimensional Scaling}
\label{sec:bg_hybrid}
In 2002, Alistair Morrison, Greg Ross, and Matthew Chalmers introduced a multi-phase method based on Chalmers' 1996 algorithm to reduce the run time down to $O(N\sqrt{N})$. This is achieved by calculating the spring forces over a subset of the data and interpolating the rest\cite{Algo2002}.
In this hybrid layout method, the $\sqrt{N}$ sample objects ($S$) are first placed in 2D space using the 1996 algorithm. The complexity of this step is $O(\sqrt{N}\sqrt{N})$, or $O(N)$. After that, each of the other objects $i$ is interpolated as described below.
\begin{enumerate}
\item \label{step:hybridFindPar} Find the `parent' object $x\in{S}$ with the least high-dimensional distance to $i$. This is essentially the nearest neighbour searching problem.
\item Define a circle around $x$ with radius $r$, proportional to the high-dimensional distance between $x$ and $i$.
\item Find the quadrant of the circle which is the most satisfactory to place $i$.
\item Perform a binary search on the quadrant to determine the best angle for $i$ and place it there.
\item Select random samples $s$ from $S$. $s\subset{S}$.
\item \label{step:hybridFindVec} Calculate the sum of force vector between $i$ and each member of $s$.
\item \label{step:hybridApplyVec} Add the vector to $i$'s current position.
\item Repeat step \ref{step:hybridFindVec} and \ref{step:hybridApplyVec} for a constant number of times to refine the placement.
\end{enumerate}
In this process, step \ref{step:hybridFindPar} has the highest time complexity of $O(S_{size}) = O(\sqrt{N})$. Because there are $N-\sqrt{N}$ objects to interpolate, the overall complexity of this step is $O(N\sqrt{N})$.
Finally, Chalmers' spring model is applied to the full data set for a constant number of iterations. This operation has a time complexity of $O(N)$.
Previous evaluations show that this method is faster than Chalmers' 1996 algorithm and can create a layout with lower stress, thanks to the more accurate positioning in the interpolation process.
\section{Hybrid MDS with Pivot-Based Searching algorithm}
\label{sec:bg_hybridPivot}
\begin{wrapfigure}{rh}{0.3\textwidth}
\centering
\includegraphics[width=0.3\textwidth]{images/pivotBucketsIllust.png}
\caption{Diagram of a pivot (dark shaded point) with five buckets, illustrated as discs between dotted circles. Each of the other points in $S$ is classified into a bucket by its distance to the pivot.}
\label{fig:bg_pivotBuckets}
\end{wrapfigure}
The bottleneck of the Hybrid Layout algorithm is the nearest-neighbour search during the interpolation. The previous brute-force method results in a time complexity of $O(N\sqrt{N})$. This improvement introduces pivot-based searching to approximate a near neighbour and reduces the time complexity to $O(N^\frac{5}{4})$\cite{Algo2003}.
The main improvement is gained by pre-processing the set $S$ (the $\sqrt{N}$ samples) so that each of the $N-\sqrt{N}$ other points can find its parent faster. To begin, $k$ points are selected from $S$ as pivots. Each pivot $p\in{k}$ has a number of buckets. Every other point in $S-\{p\}$ is assigned a bucket number based on its distance to $p$, as illustrated in figure \ref{fig:bg_pivotBuckets}.
To find the parent of an object, a distance calculation is first performed against each pivot to determine which bucket of each pivot the object falls into. The contents of these buckets are then searched for the nearest neighbour.
\begin{algorithmic}
\item Pre-processing:
\ForAll{pivot in $k$}
\ForAll{points in $(S-k)$}
\State Perform distance calculation
\EndFor
\EndFor
\end{algorithmic}
\begin{algorithmic}
\item Find parent for object $i$:
\ForAll{pivot $p$ in $k$}
\State Perform distance calculation.
\State Determine the bucket for $i$ in $p$.
\ForAll{point in the bucket}
\State Perform distance calculation
\EndFor
\EndFor
\end{algorithmic}
The complexity of the preprocessing stage is $O(\sqrt{N}k)$. For each query, the average number of points in a bucket is $\frac{S_{size}}{\text{number of buckets}} = \frac{\sqrt{N}}{N^{\frac{1}{4}}} = N^{\frac{1}{4}}$. Since a query is performed for each of the $N-\sqrt{N}$ points not in $S$, the overall complexity is $O(\sqrt{N}k + (N-\sqrt{N})N^{\frac{1}{4}}) = O(N^{\frac{5}{4}})$.
With this method, the parent found is not guaranteed to be the closest point. However, prior evaluations have concluded that the accuracy is high enough to produce good results.
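A minimal JavaScript sketch of the two stages is given below. The bucket boundaries (uniform rings up to an assumed maximum distance \texttt{maxDist}) and all function names are illustrative assumptions rather than the plug-in's actual code.
\begin{lstlisting}[language=JavaScript,caption={An illustrative sketch of pivot-based near-neighbour searching.}]
// Preprocessing: assign every other member of S to one bucket of each pivot
function buildBuckets(S, pivots, numBuckets, maxDist, distance) {
  return pivots.map(function (p) {
    let buckets = [];
    for (let b = 0; b < numBuckets; b++) buckets.push([]);
    S.forEach(function (point) {
      if (point === p) return;
      let b = Math.min(numBuckets - 1,
                       Math.floor(distance(p, point) / maxDist * numBuckets));
      buckets[b].push(point);
    });
    return { pivot: p, buckets: buckets, width: maxDist / numBuckets };
  });
}

// Query: only search the bucket of each pivot that the object falls into
function findApproximateParent(node, pivotIndex, distance) {
  let best = null, bestDist = Infinity;
  pivotIndex.forEach(function (entry) {
    let d = distance(node, entry.pivot);
    if (d < bestDist) { best = entry.pivot; bestDist = d; }
    let b = Math.min(entry.buckets.length - 1, Math.floor(d / entry.width));
    entry.buckets[b].forEach(function (point) {
      let dp = distance(node, point);
      if (dp < bestDist) { best = point; bestDist = dp; }
    });
  });
  return best;
}
\end{lstlisting}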
\section{Performance Metrics}
\label{sec:bg_metrics}
To compare different algorithms, they have to be tested against the same set of performance metrics. During development, a number of metrics were used to objectively judge the produced layouts and computation requirements. The evaluation process in chapter \ref{ch:eval} will focus on the following metrics.
\begin{itemize}
\item \textbf{Execution time} is a broadly used metric for evaluating any algorithm that requires significant computational power. Some applications aim to be interactive, and the algorithm has to finish its calculations within the time constraints for the program to stay responsive. This project, however, focuses on large data sets with minimal user interaction. Hence, the execution time in this project simply measures the time an algorithm takes to produce its ``final'' result. The criteria used to consider a layout ``finished'' will be discussed in detail in section \ref{ssec:eval_termCriteria}.
\item \textbf{Stress} is one of the most popular metrics for non-linear MDS algorithms and is modelled on the mechanical stress of a spring system. It is based on the sum of squared errors of inter-object distances\cite{Algo1996}. The function is defined as follows. $$Stress = \frac{\sum_{i<j} (d_{ij}-g_{ij})^2}{\sum_{i<j} g^2_{ij}}$$ $d_{ij}$ denotes the desired high-dimensional distance between objects $i$ and $j$, while $g_{ij}$ denotes the low-dimensional distance.
While Stress is a good metric for evaluating a layout, its calculation is an expensive operation ($O(N^2)$). At the same time, it is not part of the operation of any algorithm. Thus, by adding this optional measurement step between iterations, every algorithm would take a lot longer to complete, invalidating the measured execution time of the run.
\item \textbf{Memory usage:} With growing interest in the field of machine learning, the number of data points in a data set is getting bigger. It is common to encounter data sets with millions of instances, each with possibly hundreds of attributes. Therefore, memory usage shows how an algorithm scales to larger data sets and how many data points a computer system can handle.
\end{itemize}
\section{Summary}
In this chapter, several techniques for visualising multidimensional data have been explored. As the focus of the project is on three spring-model-based algorithms, the principles of each of these methods have been discussed. Finally, in order to measure the performance of each algorithm, different metrics were introduced; these will be used in the evaluation process.
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% Design %
% %
%%%%%%%%%%%%%%%%
\chapter{Design}
\label{ch:design}
This chapter discusses the decisions made when selecting technologies and libraries during the development process. It also briefly describes each technology, the available alternatives, and Bartasius' application, which this project is built on.
\section{Technologies}
With the goal of reaching as wide an audience as possible, the project advisor set a requirement that the application must run in a modern web browser. This section briefly introduces the web technologies used to develop the project.
%============================
\subsection{HTML, CSS, and SVG}
HTML and CSS are the two core technologies used to build web pages. Modern web applications cannot avoid these standards, and this project is no exception. HTML (Hypertext Markup Language) describes the structure and content of a web page. CSS (Cascading Style Sheets) defines the visual layout of the page. The latest major versions of the standards are HTML5 and CSS3, both of which are currently supported by all major web browsers.
Aside from the user interface, this project relies heavily on Scalable Vector Graphics (SVG), an open XML-based vector image format supported by the HTML standard, to render the produced layout.
%Cite HTML, CSS,SVG
HTML5 allows SVG to be embedded directly in an \texttt{<svg>..</svg>} tag, rather than as a separate XHTML document as in previous HTML versions. In this project, an SVG element is used as a base canvas to display the produced graphics. Each data point is then drawn as a circle, as shown in figure \ref{fig:des_svgobject}.
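For instance, the layout can be rendered by binding the node array to \texttt{circle} elements inside that canvas; the snippet below is a minimal sketch assuming a configured D3 \texttt{simulation} and a \texttt{nodes} array, not the application's exact rendering code.
\begin{lstlisting}[language=JavaScript,caption={A minimal sketch of rendering nodes as SVG circles with D3.}]
let circles = d3.select("svg")        // the <svg> canvas in the page
  .selectAll("circle")
  .data(nodes)                        // bind one circle per data point
  .enter().append("circle")
  .attr("r", 3);

// On every simulation tick, copy the simulated positions onto the circles
simulation.on("tick", function () {
  circles.attr("cx", function (d) { return d.x; })
         .attr("cy", function (d) { return d.y; });
});
\end{lstlisting}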
\begin{figure}[h]
\centering
\includegraphics[height=5cm]{images/svgobject.png}
\caption{An example of SVG document representing data points.}
\label{fig:des_svgobject}
\end{figure}
%============================
\subsection{JavaScript}
\label{ssec:des_js}
JavaScript is the most common high-level scripting language for web pages. It is dynamic, untyped and multi-paradigm, supporting event-driven, functional, prototype-based, and object-oriented programming styles.
It mostly runs in the client's browser interpreter, although platforms such as Node.js allow it to run on servers as well. Many APIs are designed for manipulating the HTML document, allowing programmers to create dynamic web pages by changing content in response to a variety of events.
%Cite JS
Alternative languages such as CoffeeScript and TypeScript are emerging, each adding more features, syntactic sugar, or syntax changes to improve code readability. However, in order to run these languages in browsers, they have to be compiled back to JavaScript. The availability of learning resources is also leagues behind JavaScript's. For these reasons, standard JavaScript was chosen for this project.
Being an interpreted high-level language, it is relatively slow. Due to limitations of the standard APIs, it is also single-threaded. asm.js is an effort to optimise JavaScript by using only a restricted subset of features. It is intended to be a compilation target for other statically-typed languages such as C/C++ rather than a language to code in\cite{asmjs}. Existing JavaScript engines may gain performance from asm.js' restrictions, such as a preallocated heap reducing load on the garbage collector. Firefox and Edge also recognise asm.js and compile the code to assembly ahead-of-time (AOT) to eliminate the need to run the code through an interpreter entirely, resulting in a significant speed increase\cite{asmjsSpeed}. However, the D3 library still uses standard JavaScript, and a large chunk of the library would have to be ported in order to compare the different algorithms fairly. Since a lot of effort would be required to improve performance significantly on two browsers and only marginally on others, asm.js was not selected for this project.
WebAssembly (wasm) is another recent contender. Unlike JavaScript, it is a binary format designed to run alongside JavaScript on the same sandboxed stack machine\cite{WebAssembly}. Similar to asm.js, it is intended to be a compile target. With support for additional CPU instructions not available in JavaScript, it also performs predictably better than asm.js. Having only exited the preview phase in March 2017, its support was not widespread and learning resources were hard to find. It also carries the risk of not being widely adopted by browsers. As a result, WebAssembly was not considered a viable option.
%============================
\section{Data Driven Document}
\label{sec:des_d3}
Data Driven Documents (D3 or D3.js)\cite{D3}\cite{D3Web} is one of the most popular JavaScript libraries for interactive data visualisation in web browsers. The focus is to bind data to DOM (Document Object Model) elements in HTML or SVG documents and apply data-driven transformations to make the visualisation visually appealing. Its modular, free and open-source nature also makes it flexible; many visualisation algorithms can be easily integrated into it. In this project, aside from the Force Link algorithm, the complicated process of translating velocities and locations onto an SVG document is handled by the D3 library, allowing the project to focus more on the algorithms.
There are several other data visualisation libraries, such as Google Charts and Chart.js. However, most of them do not support force-directed layouts and are not as flexible. In addition to being a requirement set by the project advisor, D3 therefore seemed like the best choice for this project.
%============================
\section{Bartasius' D3 Neighbour Sampling plug-in}
In 2017, Bartasius implemented Chalmers' 1996 algorithm and several other algorithms for his level 4 project at the School of Computing Science. All source files are released on GitHub under the MIT license. To reduce the amount of duplicated work, the project advisor recommended using the repository as groundwork to implement the other algorithms upon.
%============================
\subsection{Input Data}
The data is one of the most important elements of the project. Without it, nothing can be visualised. Since the data may consist of many different features (attributes), each with a unique name, it makes sense to store each data point (node) as a JavaScript object, a collection of \texttt{key:value} pairs. To conform with the D3 API, all nodes are stored in a list (array).
Two example data structures are shown in figure \ref{fig:des_jsobject}.
\begin{figure}[h]
\centering
\includegraphics[height=7cm]{images/jsobj.png}
\caption{Examples of the data structure used to store the input data. On the left is nodes from the Poker Hand data set. Right shows nodes from the Antarctica data set. The two data sets will be explained in section \ref{sec:EvalDataSet}.}
\label{fig:des_jsobject}
\end{figure}
%============================
\subsection{Graphical User Interface}
Due to the sheer number of experiments to run, manually changing functions and file names between each run would be a tedious task. Bartasius developed a GUI for the plug-in to ease the testing process. For this project, modifications have been made to accommodate the newly implemented algorithms.
Figure \ref{fig:des_gui} shows the modified GUI used in this project. At the top is the canvas used to draw the produced layout. The controls below are divided into three columns. The left column controls the data set input, rendering, and iteration limit. The middle column is a set of radio buttons and sliders for selecting the algorithm and parameters to use. The right column contains a list of distance functions to choose from.
\begin{figure}[h]
\centering
\includegraphics[height=10cm]{images/GUI.png}
\caption{The graphical interface.}
\label{fig:des_gui}
\end{figure}
%============================
\section{Summary}
In this chapter, several technologies and their alternatives were discussed. In the end, the project set out to reuse Bartasius' repository, using D3.js with standard JavaScript, HTML, CSS and SVG for their abundant learning resources.
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% Implement %
% %
%%%%%%%%%%%%%%%%
\chapter{Implementation}
\label{ch:imp}
\section{Outline}
The D3 library is modular, and D3-force is the module most relevant to this project. The module provides a simplified Simulation object to simulate various physical force calculations. Each Simulation contains a list of nodes and Force objects. Interfaces are defined to allow each Force to access the node list. To keep track of positions, each node is assigned values representing its current location and velocity vector. These values can then be used by the application to draw a graph. In each constant-unit time step (iteration), the Simulation triggers a function in each Force object, allowing it to calculate and add values to each particle's velocity vector, which is then added to the particle's position by the Simulation. For MDS, each data point is represented as a particle in the simulation.
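In this API, a Force is essentially a function of the simulation's \texttt{alpha} value that nudges each node's velocity, with an optional \texttt{initialize} hook through which the Simulation passes its node list. The centring force below is a simplified illustrative example of this shape, not one of the forces used in this project.
\begin{lstlisting}[language=JavaScript,caption={A minimal custom Force object for a D3 Simulation (illustrative example).}]
function forceTowardsOrigin(strength) {
  let nodes;
  function force(alpha) {
    // Called once per time step: add to each node's velocity vector
    nodes.forEach(function (node) {
      node.vx -= node.x * strength * alpha;
      node.vy -= node.y * strength * alpha;
    });
  }
  force.initialize = function (_) { nodes = _; }; // the Simulation passes its node list here
  return force;
}

// The Simulation then integrates velocities into positions at each iteration
let simulation = d3.forceSimulation(dataPoints)   // dataPoints: the array of node objects
  .force("pull", forceTowardsOrigin(0.05));
\end{lstlisting}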
Because D3 is a library to be built into other web applications, the implemented algorithms cannot be used on their own. Fortunately, as part of Bartasius' level 4 project in 2017, a web application for testing and evaluation has already been created, with a graphical user interface designed to allow the user to easily select an algorithm, data set, and parameter values. Various distance functions, including one specifically created for the Poker Hands data set\cite{UCL_Data} which will be used for evaluation (section \ref{sec:EvalDataSet}), are also in place and fully functional.
The CSV-formatted data file can be loaded locally. It is then parsed by the Papa Parse JavaScript library\cite{PapaParse} and loaded into the Simulation.
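The loading step is along the following lines; the configuration shown is a generic Papa Parse invocation (header row as keys, numeric strings converted to numbers) and may differ from the application's exact options.
\begin{lstlisting}[language=JavaScript,caption={Sketch of parsing a local CSV file with Papa Parse.}]
Papa.parse(csvFile, {            // csvFile: a File object from an <input type="file"> element
  header: true,                  // use the first row as the keys of each node object
  dynamicTyping: true,           // convert numeric strings into numbers
  skipEmptyLines: true,
  complete: function (results) {
    simulation.nodes(results.data);   // hand the parsed nodes to the D3 Simulation
  }
});
\end{lstlisting}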
Depending on the distance function, per-dimension means, variances, and other attributes may also be calculated. These values are used in several general distance functions to scale the values of each feature. The D3 simulation layout is shown on an SVG canvas with zoom functionality to allow graph investigation. The distance function scaling was tweaked to only affect rendering and not the force calculation.
Several values used for evaluation, such as execution time, total force applied per iteration, and stress, may also be computed. However, these values are printed to the JavaScript console instead.
Due to the growing number of algorithms and variables, the main JavaScript code has been refactored. Functions for controlling each algorithm have been extracted into their own files, and some unused code has been removed or commented out.
\section{Algorithms}
This section discusses implementation decisions for each algorithm, some of which are already implemented in the D3-force module and the d3-neighbour-sampling plug-in. Adjustments made to third-party implementations are also discussed.
\subsection{Link force}
\label{sec:imp_linkForce}
The D3-force module has an algorithm implemented to produce a force-directed layout. The main idea is to change the velocity vectors of each pair of nodes connected via a link at every time step, simulating the application of a force. For example, if two nodes are further apart than the desired distance, a force is applied to both nodes to pull them together. The implementation also supports incomplete graphs, so the links have to be specified. The force is also, by default, scaled on each node depending on how many springs it is attached to, in order to balance the force applied to heavily and lightly connected nodes, improving overall stability. Without such scaling, the graph would expand in every direction.
In the early stages of the project, when assessing the library, it was observed that many of the available features are unused for multidimensional scaling. In order to reduce computation time and memory usage, I created a modified version of Force Link as part of the plug-in. The following are the improved aspects.
Firstly, to accommodate an incomplete graph, the force scaling has to be calculated for each node and each link. The calculated values are then cached in a similar manner to the distances ($bias$ and $strengths$ in code \ref{lst:impl_LinkD3}). In a fully-connected graph, these values are the same for every link and node. To save on memory and startup time, the arrays were replaced by a single number value.
\begin{lstlisting}[language=JavaScript,caption={Force calculation function of Force Link as implemented in D3.},label={lst:impl_LinkD3}]
function force(alpha) {
for (var k = 0, n = links.length; k < iterations; ++k) {
for (var i = 0, link, source, target, x, y, l, b; i < n; ++i) {
link = links[i], source = link.source, target = link.target;
x = target.x + target.vx - source.x - source.vx || jiggle();
y = target.y + target.vy - source.y - source.vy || jiggle();
l = Math.sqrt(x * x + y * y);
l = (l - distances[i]) / l * alpha * strengths[i];
x *= l, y *= l;
target.vx -= x * (b = bias[i]);
target.vy -= y * b;
source.vx += x * (b = 1 - b);
source.vy += y * b;
}
}
}
\end{lstlisting}
Secondly, D3's Force Link requires the user to specify an array of links to describe the graph. Each link is a string-indexed dictionary, which is not the most memory-friendly data type. The cached distance values are stored in a separate array whose indices parallel those of the links array. Since the nodes are also stored in an array, I replaced the entire links array with a nested loop over the nodes array, reducing the memory footprint even further and eliminating the time required to construct the array. The index for the cached distance is adjusted accordingly.
\begin{lstlisting}[language=JavaScript,caption={Part of the customized force calculation function.},label={lst:impl_LinkCustom}]
function force(alpha) {
let n = nodes.length;
...
for (var k = 0, source, target, i, j, x, y, l; k < iterations; ++k) {
for (i = 1; i < n; i++) for (j = 0; j < i; j++) { // For each link
// jiggle so x, y and l won't be zero and causes divide by zero error later on
source = nodes[i]; target = nodes[j];
x = target.x + target.vx - source.x - source.vx || jiggle();
y = target.y + target.vy - source.y - source.vy || jiggle();
l = Math.sqrt(x * x + y * y);
//dataSizeFactor = 0.5/(nodes.length-1), pre-calculated only once
l = (l - distances[i*(i-1)/2+j]) / l * dataSizeFactor * alpha;
x *= l, y *= l;
target.vx -= x; target.vy -= y;
source.vx += x; source.vy += y;
}
}
...
}
\end{lstlisting}
After optimisation, the execution time decreases marginally while memory consumption decreases by a seventh, raising the data size limit from 3,200 data points\cite{LastYear} to over 10,000 in the process. Details on the evaluation procedure and the data size limitation will be discussed in section \ref{ssec:eval_ram}.
\begin{figure}[h]
\centering
\includegraphics[height=5cm]{graphs/linkOptimize.png}
\caption{A comparison of memory usage and execution time between versions of Force Link at 3,000 data points from the Poker Hands data set for 300 iterations.}
\label{fig:imp_linkComparison}
\end{figure}
Next, the \texttt{jiggle()} function was assessed. As shown in lines 5-7 of code \ref{lst:impl_LinkD3}, in cases where two nodes are projected to be in the exact same location, \texttt{x}, \texttt{y} and, in turn, \texttt{l} could be 0. This would cause a divide-by-zero error in line 8. Rather than throwing an error, JavaScript returns the result as \texttt{Infinity} or \texttt{-Infinity}. Any subsequent arithmetic operation, except for modulus, with other numbers also results in \texttt{$\pm{}$Infinity}, effectively deleting the coordinate and velocity values from the entire system. To prevent such errors, when \texttt{x} or \texttt{y} is calculated to be zero, D3 replaces the value with a very small random number generated by \texttt{jiggle()}. While extremely unlikely, there is still a chance that \texttt{jiggle()} will return a random value of 0. This case can rarely be observed when all nodes are initially placed at the exact same position. To counter this, I modified \texttt{jiggle()} to redraw a random number until a non-zero value is found.
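The modification amounts to something along the following lines; the offset magnitude mirrors D3's original helper, and the loop is the change described above.
\begin{lstlisting}[language=JavaScript,caption={Sketch of the modified \texttt{jiggle()} helper.}]
function jiggle() {
  let value = 0;
  // Re-draw until a non-zero offset is produced, so the divide-by-zero guard can never fail
  while (value === 0) {
    value = (Math.random() - 0.5) * 1e-6;
  }
  return value;
}
\end{lstlisting}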
Finally, a feature was added to track the average force applied to the system in each iteration. A threshold value is set so that once the average force falls below the threshold, a user-defined function is called. In this case, a handler was added to Bartasius' application to stop the simulation. This feature will be heavily used in the evaluation process (section \ref{ssec:eval_termCriteria}).
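In outline, the tracker works as sketched below: the force function accumulates the magnitude of every velocity change it applies during an iteration, and a user-supplied callback fires once the per-node average drops below the threshold. The names are illustrative, not the plug-in's API.
\begin{lstlisting}[language=JavaScript,caption={Sketch of the average-applied-force threshold check (illustrative names).}]
function checkStability(nodes, appliedForceMagnitudes, forceThreshold, onStable) {
  // appliedForceMagnitudes: |(x, y)| of each velocity change applied this iteration
  let total = appliedForceMagnitudes.reduce(function (sum, f) { return sum + f; }, 0);
  if (total / nodes.length < forceThreshold && typeof onStable === "function") {
    onStable();   // e.g. a handler in the application that stops the simulation
  }
}
\end{lstlisting}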
%============================
\subsection{Chalmers' 1996}
\label{sec:imp_neigbour}
Bartasius' d3-neighbour-sampling plug-in focuses mainly on Chalmers' 1996 algorithm. The idea is to use the exact same force calculation function as D3 Force Link for a fair comparison. The algorithm is also implemented as a Force object to be used by a Simulation. As part of this project, I refactored the code base to ease the development process and addressed a shortcoming.
Aside from reformatting the code, I found that Bartasius' implementation does not have spring force scaling, making the graph explode in every direction. Originally, the example implementation used a decaying $alpha$, a variable controlled by the Simulation for artificially scaling down the force applied to the system over time, to make the system contract back. A constant \texttt{dataSizeFactor}, similar to that in the custom Link Force, has been added to remove the need for a decaying alpha.
Next, after seeing the memory footprint of the optimised Link Force, the idea occurred to also cache all the distances between every pair of nodes. After implementing it, an experiment was run to compare the performance.
However, even with a data set of moderate size and a higher number of iterations than typically required, the time spent on caching is greater than the time saved, resulting in a longer total execution time (figure \ref{fig:imp_chalmersCache}). The JavaScript heap usage also rises to 128 MB with a manually-invoked garbage collector (discussed in section \ref{ssec:eval_ram}), whereas originally it never used more than 50 MB. With all these drawbacks, this patch was withdrawn.
\begin{figure}[h]
\centering
\includegraphics[height=5cm]{graphs/neighbourCache.png}
\caption{A comparison of execution time spent on force and distance calculations, with and without caching, on 5,000 data points of the Poker Hands data set for 300 iterations.}
\label{fig:imp_chalmersCache}
\end{figure}
Lastly, an average-applied-force tracker, similar to that added to Force Link, was also added for the evaluation process. It should be noted that, unlike Force Link, the system will not stabilise to a freezing point. Because the $Samples$ set keeps changing randomly, there is no single state where all spring forces cancel each other out completely. This can also be seen when the animation is drawn: every node keeps wiggling about but the overall layout remains constant.
%============================
\subsection{Hybrid Layout}
Because Hybrid Layout is a multi-phase use of Chalmers' algorithm, it does not fit well with the limited interfaces designed for Force objects. The approach taken was to implement the Hybrid algorithm as a new JavaScript object that takes control of the Simulation instead. To make it fit with the other D3 APIs, their designs first had to be studied.
The D3 API makes extensive use of the Method Chaining design pattern. The main idea is that by having each method return a reference to the object itself rather than nothing, method calls on the same object can be chained together in a single statement\cite{CleanCode}. In addition, code readability is also improved to a certain degree. With this in mind, the same convention is followed for the Hybrid object.
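The getter-setter convention D3 uses for this can be sketched as follows; \texttt{numPivots} is used here purely as an example parameter and the body is a simplification, not the full implementation.
\begin{lstlisting}[language=JavaScript,caption={Sketch of the method-chaining getter/setter convention.}]
function hybridSimulation(simulation) {
  let numPivots = 0;
  let hybrid = {};

  // With an argument: store the value and return the object so calls can be chained.
  // Without an argument: act as a getter and return the current value.
  hybrid.numPivots = function (_) {
    return arguments.length ? (numPivots = +_, hybrid) : numPivots;
  };

  return hybrid;
}
\end{lstlisting}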
\begin{lstlisting}[language=JavaScript,caption={Simplified example of using the API.},label={lst:impl_HybridUsage}]
let simulation = d3.forceSimulation() // Configured D3 Simulation object
.nodes(allNodes);
let firstPhaseForce = d3.forceNeighbourSampling() // Configured Chalmers force object
.neighbourSize(NEIGHBOUR_SIZE)
.sampleSize(SAMPLE_SIZE)
.distance(distanceCalculator);
let thirdPhaseForce = d3.forceNeighbourSampling() // Configured Chalmers force object
.setParameter(Value); // Similar to above
let hybridSimulation = d3.hybridSimulation(simulation)
.forceSample(firstPhaseForce)
.forceFull(thirdPhaseForce)
.numPivots(PIVOTS ? NUM_PIVOTS:0) // brute-force is used when < 1 pivot is specified
.on("startInterp", function () {setNodesToDisplay(allNodes);})
...;
let firstPhaseSamples = hybridSimulation.subSet();
setNodesToDisplay(firstPhaseSamples);
hybridSimulation.restart(); // Start the hybrid simulation
\end{lstlisting}
As shown in code \ref{lst:impl_HybridUsage}, the algorithm-specific parameters for each Chalmers force object are set in advance by the user. Since the Hybrid object interacts with the Simulation and force-calculation objects via general interfaces, other force calculators could potentially be used without having to modify the Hybrid object. In fact, D3's original implementation of Force Link also works with the Hybrid object. To terminate the force calculations in the first and last phases, the Hybrid object has an internal iteration counter that stops the calculations after a predefined number of time steps. In addition, the applied-force threshold events are also supported as an alternative termination criterion.
For interpolation, two separate functions were created, one for each parent-finding method. After the parent is found, both functions call the same third function to handle the rest of the process (steps 2 to 8 in section \ref{sec:bg_hybrid}).
\begin{multicols}{2}
\begin{algorithmic} % BRUTEFORCE
\item Interpolation with brute-force parent finding:
\item Select random samples $s$ from $S$. $s\subset{S}$.
\ForAll{node $n$ to be interpolated}
\State Create distance cache array, equal to size of $s$
\ForAll{node $i$ in $S$}
\State Perform distance calculation.
\If{$i\in{s}$} cache distance \EndIf
\EndFor
\State \Call{Place n}{$n$, closest $i$, $s$, distance cache}
\EndFor
\end{algorithmic}
\columnbreak
\begin{algorithmic} % FOR INTERPOLATION
\item Interpolation with pivot-based parent finding:
\item Preprocess pivots buckets
\item Select random samples $s$ from $S$. $s\subset{S}$.
\ForAll{node $n$ to be interpolated}
\State Create distance cache array, equal to size of $s$
\ForAll{pivot $p$ in $k$}
\State Perform distance calculation.
\If{$p\in{s}$} cache distance \EndIf
\ForAll{node $i$ in bucket}
\State Perform distance calculation.
\If{$i\in{s}$} cache distance \EndIf
\EndFor
\EndFor
\State Fill in the rest of the cache
\State \Call{Place n}{$n$, closest $i$ or $p$, $s$, distance cache}
\EndFor
\end{algorithmic}
\end{multicols}
Since the original paper did not specify the ``satisfactory'' metric for placing a node $n$ on the circle around its parent (steps 3 to 4), Matthew Chalmers, the project advisor who also took part in developing the algorithm, was contacted for clarification. Unfortunately, the knowledge was lost. Instead, the sum of distance errors between $n$ and every member of $s$ was proposed as an alternative. Preliminary testing shows that it works well, and it is used in this implementation.
With that decision, the high-dimensional distances between $n$ and each member of $s$ are used multiple times for the binary search and placement refinement (steps 7 and 8). To reduce the number of distance function calls, a distance cache has been created. For brute-force parent finding, the cache can be filled while the parent is being selected, as $s\subset{S}$. On the other hand, pivot-based searching might not cover every member of $s$. Thus, the missing cache entries are filled after the parent search.
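A simplified sketch of the quadrant selection and binary angle search under this error metric is shown below. The function names, the quarter-circle probing and the fixed number of refinement steps are illustrative; the actual implementation differs in detail.
\begin{lstlisting}[language=JavaScript,caption={Simplified sketch of quadrant selection and the binary angle search.}]
// Sum of errors between 2D distances to the sample set and the cached high-D distances
function sumError(x, y, samples, cachedDistances) {
  let error = 0;
  samples.forEach(function (s, idx) {
    error += Math.abs(Math.hypot(s.x - x, s.y - y) - cachedDistances[idx]);
  });
  return error;
}

function placeOnCircle(parent, radius, samples, cachedDistances, refineSteps) {
  let errorAt = function (angle) {
    return sumError(parent.x + radius * Math.cos(angle),
                    parent.y + radius * Math.sin(angle),
                    samples, cachedDistances);
  };
  // Pick the most satisfactory quadrant by probing its centre angle
  let best = 0, bestError = Infinity;
  for (let q = 0; q < 4; q++) {
    let centre = (q + 0.5) * Math.PI / 2;
    let e = errorAt(centre);
    if (e < bestError) { bestError = e; best = centre; }
  }
  // Binary search within the chosen quadrant for the best angle
  let halfWidth = Math.PI / 4;
  for (let step = 0; step < refineSteps; step++) {
    halfWidth /= 2;
    if (errorAt(best - halfWidth) < errorAt(best + halfWidth)) best -= halfWidth;
    else best += halfWidth;
  }
  return { x: parent.x + radius * Math.cos(best),
           y: parent.y + radius * Math.sin(best) };
}
\end{lstlisting}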
%============================
\section{Metrics}
Many different metrics were introduced in section \ref{sec:bg_metrics}, some of which require extra code to be written. While memory usage measurement requires an external profiler, execution time can be calculated by the application itself. For JavaScript, the recommended way is to take a high-resolution timestamp before and after code execution. The method provides accuracy of up to 5 microseconds. However, it is important to note that at this level of precision, the measured value will vary from run to run due to many factors, both from software, such as the OS's process scheduler, and hardware, such as Intel\textsuperscript{\textregistered} Turbo Boost or cache prefetching.
\begin{lstlisting}[language=JavaScript,caption={Execution time measurement.},label={lst:impl_Time}]
p1 = performance.now();
// Execute algorithm
p2 = performance.now();
console.log("Execution time", p2-p1);
\end{lstlisting}
Stress calculation is done as defined by the formula in section \ref{sec:bg_metrics}. The calculation is independent of the algorithm; in fact, it does not depend on D3 at all. Only an array of node objects and a distance function are required. Due to its very long calculation time, this function is only called on demand when the value has to be recorded. The exact implementation is shown in code \ref{lst:impl_Stress}.
\begin{lstlisting}[language=JavaScript,caption={Stress calculation function.},label={lst:impl_Stress}]
export function getStress(nodes, distance) {
let sumDiffSq = 0; let sumLowDDistSq = 0;
for (let j = nodes.length-1; j >= 1; j--) {
for (let i = 0; i < j; i++) {
let source = nodes[i], target = nodes[j];
let lowDDist = Math.hypot(target.x - source.x, target.y - source.y);
let highDDist = distance(source, target);
sumDiffSq += Math.pow(highDDist - lowDDist, 2);
sumLowDDistSq += lowDDist * lowDDist;
}
}
return Math.sqrt(sumDiffSq / sumLowDDistSq);
}
\end{lstlisting}
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% EVAL %
% %
%%%%%%%%%%%%%%%%
\chapter{Evaluation}
\label{ch:eval}
This chapter presents comparisons between each of the implemented algorithms. First, the data sets used are described. The experimental setup is then introduced, along with the decisions behind each test design. Lastly, the results are shown and briefly interpreted.
\section{Data Sets}
\label{sec:EvalDataSet}
The data sets utilised during development are the Iris, Poker Hands\cite{UCL_Data}, and Antarctic data sets\cite{Antartica_Data}.
Iris is one of the most popular data sets for getting started in machine learning. It contains 150 measurements from flowers of the Iris Setosa, Iris Versicolour and Iris Virginica species, each with four parameters: petal and sepal length and width in centimetres. It was chosen as a starting point for development because it is a classification data set where the parameters can be used by the distance function and the label can be used to colour each instance. Each species is also clustered quite clearly, making it easier to see whether an algorithm is working as intended.
Poker Hands is another classification data set containing hands of 5 playing cards drawn from a standard deck, each described by rank (Ace, 2, 3, ...) and suit (Hearts, Spades, etc.). Each hand is labelled as a poker hand (Nothing, Flush, Full house, etc.). This data set was selected for the experiments because it contains over a million records. In each test, only a subset of the data is used due to the size limitation.
\begin{figure}
\centering
\includegraphics[height=6cm]{layout/Link10000Stable_crop.png}
\caption{Visualisation of 10,000 data points from the Poker Hands data set, using Link Force.}
\label{fig:eval_idealSample}
\end{figure}
The Antarctic data set contains 2,202 measurements taken by remote sensing probes over a period of 2 weeks at a frozen lake in the Antarctic. The 16 features include water temperature, UV radiation levels, ice thickness, etc. The data was formatted into CSV by Greg Ross and is used to represent a data set with a complex structure and high dimensionality. Due to the relatively small size of this data set, it is only used to compare the ability to show fine details.
\section{Experimental Setup}
Hardware and the web browser can greatly impact JavaScript performance. In addition to the code and data set, these variables have to be controlled as well.
The computers used are all the same model of Dell All-in-One desktop computers with Intel\textsuperscript{\textregistered} Core\texttrademark{} i5-3470S and 8GB of DDR3 memory, running CentOS 7 with Linux 3.10-x86-64.
As for the web browser, the official 64-bit build of Google Chrome 61.0.3163.79 is used both to run the experiments and to analyse hardware usage with its performance profiling tool.
Other unrelated parameters were also controlled as much as possible. The simulation's velocity decay is set at the default of $0.4$, mimicking air friction, and the starting position of all nodes is locked at $(0,0)$. Although starting every node at the exact same position may seem to cause a very high initial spring force, the force scaling and the way D3 takes each node's velocity into account in the spring force calculation prevent the system from spreading out too far. In practice, the graphs continue to expand for several more iterations before the overall layout reaches the correct size. Alpha, a decaying value used for artificially slowing down and freezing the system over time, is also kept at 1 to keep the spring forces in full effect.
The web page is also refreshed after every run to make sure that everything, including uncontrollable aspects such as the JavaScript heap, ahead-of-time compilation and the behaviour of the browser's garbage collector, has been properly reset.
\subsection{Termination criteria}
\label{ssec:eval_termCriteria}
Both Link Force and Chalmers' 1996 algorithm create a layout that stabilises over time. In D3, calculations are performed for a predefined number of iterations. This has the drawback of having to select an appropriate value. Choosing a number that is too high means that execution time is wasted calculating minute details with no visible change to the layout, while the opposite can result in a bad layout.
Determining the constant can be problematic, considering that each algorithm may stabilise after a different number of iterations, especially since the interpolation result from the Hybrid algorithm can vary greatly from run to run (section \ref{ssec:eval_selectParams}).
An alternative method is to stop when a condition is met. One such condition proposed is the difference in velocity ($\Delta{v}$) of the system between iterations\cite{Algo2002}. In other words, once the amount of force applied in an iteration is lower than a scalar threshold, the calculation may stop. Taking note of stress and average force applied over multiple iterations, as illustrated in figure \ref{fig:eval_stressVeloOverTime}, it is clear that Link Force converges to complete stillness while the Chalmers algorithm reaches and fluctuates around a constant, as stated in section \ref{sec:imp_neigbour}. It can also be seen that the stress of each layout converges to a value as the average force converges to a constant, indicating that the best layout each algorithm can create is obtained once the system stabilises.
\begin{figure}
\centering
\includegraphics[height=5cm]{graphs/stressVeloOverTime.png}
\caption{A log-scaled graph showing stress and force applied per iteration decreasing over time and converging to a constant when running different algorithms over 10,000 data points from the Poker Hands data set. Stress is calculated every \nth{10} iteration.}
\label{fig:eval_stressVeloOverTime}
\end{figure}
Since stress takes too long to calculate at every iteration, the termination criterion selected is the average force applied per node. This criterion is used for all 3 algorithms for consistency. The cut-off constants are then manually selected for each algorithm and each subset used. Link Force's threshold is a value low enough that there are no visible changes and stress has reached near its minimum. The Chalmers threshold is the lowest possible value that will be reached most of the time. It is interesting to note that with bigger subsets of the Poker Hands data set, the threshold rises and converges to 0.66 from 3,000 data points onward.
By selecting this termination condition, the goal of the last phase of the Hybrid Layout algorithm is flipped. Rather than performing Chalmers' algorithm over the whole data set to correct interpolation errors, the interpolation phase's role is to help the final phase reach stability quicker. Thus, the parameters of the interpolation phase cannot be evaluated on their own. Taking more time to produce a better interpolation result may or may not affect the number of iterations in the final phase, creating the need to balance the time spent on and saved by interpolation.
%============================
\subsection{Selecting Parameters}
\label{ssec:eval_selectParams}
Some of the algorithms have variables that are predefined constant numbers. Care has to be taken in choosing these values, as bad choices could cause an algorithm to produce bad results or take unnecessarily long to compute. To compare the algorithms fairly, a good set of parameters has to be chosen for each.
The Chalmers' algorithm has two adjustable parameters: $Neighbours_{size}$ and $Samples_{size}$.
According to previous evaluations\cite{LastYear}\cite{Algo2002}, a favorable layout can be achieved with values as low as $10$ for both variables. Preliminary testing seems to confirm these findings and those values are selected for the experiments. On the other hand, Link Force has no adjustable parameters whatsoever, so no attention is required.
The Hybrid layout has multiple parameters for the interpolation phase. For the parent-finding stage, there is a choice between the brute-force and the pivot-based searching method. In the pivot-based case, the number of pivots ($k$) also has to be chosen. Experiments have been run to find the accuracy of pivot-based searching, starting from 1 pivot, to determine reasonable numbers to use in subsequent experiments. As shown in figure \ref{fig:eval_interpVariations}, the randomly selected set $S$ (the $\sqrt{N}$ samples used in the first stage) can greatly affect the interpolation result, especially with smaller data sets containing many small clusters. Therefore, each test has to be run multiple times to generalise the result. From figure \ref{fig:eval_pivotHits}, it can be seen that the more pivots used, the higher the accuracy and consistency. Diminishing returns can be observed at around 6 to 10 pivots, depending on the number of data points. Hence, higher numbers of pivots are not considered in the experiments.
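
For clarity, the accuracy and ratio measurements plotted in figure \ref{fig:eval_pivotHits} can be expressed along the following lines (a hedged sketch; \texttt{bruteForceParent}, \texttt{pivotParent}, \texttt{highDimDistance} and \texttt{remainingNodes} stand in for the plug-in's actual functions and data):
\begin{lstlisting}
// Compare the parent returned by the pivot-based search against the
// brute-force answer for every node outside the sample set S, and
// record the high-dimensional distance ratio whenever they differ.
var hits = 0;
var ratios = [];
remainingNodes.forEach(function (node) {
  var exact = bruteForceParent(node, sampleSet);      // scans all of S
  var approx = pivotParent(node, sampleSet, pivots);  // k-pivot search
  if (approx === exact) {
    hits += 1;
  } else {
    ratios.push(highDimDistance(node, approx) / highDimDistance(node, exact));
  }
});
var hitRate = hits / remainingNodes.length; // left panel of the figure
\end{lstlisting}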
Finally, the last step of interpolation is to refine the placement a constant number of times. Preliminary testing shows that this step cleans up a lot of interpolation artifacts. For example, a clear radial pattern and straight lines can be seen in figure \ref{sfig:eval_refineCompareA}, especially in the lower right corner. While these artifacts are no longer visible in figure \ref{sfig:eval_refineCompareB}, it is still impossible to obtain a desirable layout, even after more refinement steps were added. Thus, running the Chalmers' algorithm over the entire data set after the interpolation phase is unavoidable. For the rest of the experiments, only two values, 0 and 20, were selected, representing interpolation without and with artifact cleaning.
\begin{figure}[!h]
\centering
\includegraphics[height=5cm]{graphs/hitrate_graph1.png}
\includegraphics[height=5cm]{graphs/hitrate_graph2.png}
\caption{Graphs showing the accuracy of pivot-based searching for $k = $ 1, 3, 6 and 10. The left box plot shows the hit percentage across 5 different runs (higher and more consistent is better). The right shows the high-dimensional distance ratio between the parents chosen by brute-force and pivot-based searching when they differ (closer to 1 is better). For instance, if the parent found by brute-force searching is 1 unit away from the querying node, a ratio of 1.3 means that the parent found by pivot-based searching is 1.3 units away.}
\label{fig:eval_pivotHits}
\end{figure}
\begin{figure}[!h]
\centering
\begin{subfigure}{\textwidth}
\includegraphics[height=5.5cm]{layout/interpVar1A.png}
\includegraphics[height=5.5cm]{layout/interpVar1B.png}
\caption{An example of interpolation result with a more-balanced $S$}
\end{subfigure}
\begin{subfigure}{\textwidth}
\includegraphics[height=6cm]{layout/interpVar2A.png}
\includegraphics[height=6cm]{layout/interpVar2B.png}
\caption{An example of interpolation result with a less-balanced $S$}
\end{subfigure}
\caption{Difference in interpolation results for a subset with 1,000 data points. The left images show only the data points in set $S$ and the right images show the interpolation result.}
\label{fig:eval_interpVariations}
\end{figure}
\begin{figure}[!h]
\centering
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=5cm]{layout/refineCompareA.png}
\caption{No interpolation refinement} \label{sfig:eval_refineCompareA}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=5cm]{layout/refineCompareB.png}
\caption{20 steps of Interpolation refinement} \label{sfig:eval_refineCompareB}
\end{subfigure}
\caption{A comparison between the interpolation results}
\label{fig:eval_refineCompare}
\end{figure}
%============================
\subsection{Performance metrics}
As discussed in section \ref{sec:bg_metrics}, there are three main criteria for evaluating each algorithm: execution time, memory consumption, and the produced layout. Although stress is a good metric for judging the quality of a layout, it does not necessarily mean that layouts with the same stress are equally good for data exploration. Thus, the look of the produced layout itself also has to be compared. Since Bartasius found that Link Force produces the layout with the least stress in all cases\cite{LastYear}, its layout is used as a baseline for comparison (recall figure \ref{fig:eval_idealSample}).
It should also be noted that, for ease of comparison, the visualisations shown in this chapter may be uniformly scaled and rotated. This manipulation should not affect the evaluation, as the only concern of a spring model is the relative distance between data points.
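
For completeness, the stress values reported in the remainder of this chapter are computed as discussed in section \ref{sec:bg_metrics}; a simplified sketch of the pairwise computation is given below (\texttt{highDimDistance} stands in for the data set's distance function, and the exact normalisation used in the evaluation application may differ):
\begin{lstlisting}
// Simplified stress: sum of squared differences between high- and
// low-dimensional distances over all pairs, normalised by the
// high-dimensional distances. O(N^2), hence only computed sparingly.
function layoutStress(nodes) {
  var numerator = 0, denominator = 0;
  for (var i = 0; i < nodes.length; i++) {
    for (var j = i + 1; j < nodes.length; j++) {
      var high = highDimDistance(nodes[i], nodes[j]);
      var dx = nodes[i].x - nodes[j].x;
      var dy = nodes[i].y - nodes[j].y;
      var low = Math.sqrt(dx * dx + dy * dy);
      numerator += (high - low) * (high - low);
      denominator += high * high;
    }
  }
  return numerator / denominator;
}
\end{lstlisting}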
%============================
\section{Results}
\subsection{Memory usage}
\label{ssec:eval_ram}
Google Chrome comes with performance profiling tools that allow users to measure JavaScript heap usage. While it is straightforward to measure the usage of Link Force, the garbage collector gets in the way of obtaining an accurate value for the 1996 algorithm. Because the $Samples$ sets and, to a certain degree, the $Neighbours$ sets are reconstructed at every iteration, a lot of new memory is allocated and the old allocations are left unreachable, waiting to be reclaimed. As a result, the JS heap usage keeps increasing until the GC runs, even though the actual usage is theoretically constant across iterations (figure \ref{fig:eval_neighbourRam}). Even though the GC is designed to be run automatically by the JavaScript engine, Google Chrome allows it to be invoked manually from the profiling tool. For this experiment, the GC is manually invoked periodically during part of the run. The usage immediately after garbage collection is then recorded and used for comparison. The peak usage before the GC is automatically invoked is also noted.
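
Alongside the manual profiling, the heap can also be sampled programmatically in Chrome through the non-standard \texttt{performance.memory} API; a rough, Chrome-only sketch (not part of the evaluation application) is shown below.
\begin{lstlisting}
// Chrome-only: sample the JS heap size once per second while the
// simulation runs. The reading still fluctuates with GC, so only
// values taken right after a collection are comparable.
var samples = [];
var heapTimer = setInterval(function () {
  if (performance.memory) { // non-standard API, available in Chrome
    samples.push(performance.memory.usedJSHeapSize / (1024 * 1024)); // MiB
  }
}, 1000);
// call clearInterval(heapTimer) once the simulation stops
\end{lstlisting}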
\begin{figure}
\centering
\includegraphics[height=3cm]{graphs/neighbourRam.png}
\caption{Fluctuating JavaScript heap usage due to the frequent memory allocation of the Chalmers' algorithm. GC is manually invoked every second in the first half and automatically in the latter.}
\label{fig:eval_neighbourRam}
\end{figure}
The Hybrid layout has multiple phases, each with a different theoretical memory complexity. As far as the interpolation phase is concerned, the bucket storage for pivot-based finding requires the most memory at $k(S_{Size})=O(\sqrt{N})$. Compared to the $N(Neighbours_{size}+Samples_{size}) = O(N)$ of the Chalmers' algorithm used in the final phase, the overall memory requirement should be equal to that of the 1996 algorithm.
\begin{figure}
\centering
\includegraphics[height=5cm]{graphs/ram.png}
\includegraphics[height=5cm]{graphs/ramMultiSize.png}
\caption{Comparison of memory usage of each algorithm}
\label{fig:eval_ram}
\end{figure}
The comparison was made between the three algorithms, with the Hybrid layout running 10 pivots to represent the worst-case scenario for interpolation. Rendering is also turned off to minimize the impact of DOM element manipulation\cite{LastYear}. The results are displayed in figure \ref{fig:eval_ram}. The modified Link Force, which uses less memory than D3's implementation (section \ref{sec:imp_linkForce}), scales badly compared to all others, even with automatic garbage collection. The difference in base memory usage between the 1996 algorithm and the final stage of the Hybrid layout is within the margin of error, confirming that they both have the same memory requirement. If the final phase of the Hybrid layout is skipped, the memory requirement grows at a slightly lower rate.
While the original researchers, Chalmers, Morrison and Ross, had explored this memory aspect before, Bartasius experimented with the maximum data size the application could handle before an Out of Memory exception occurred\cite{LastYear}. A similar test is re-performed to find whether anything has changed.
Due to JavaScript limitations, Link Force crashes the browser tab at 50,000 data points before any spring force is calculated, failing the test entirely. Similar behavior can also be observed with D3's implementation. In contrast, the Chalmers' and Hybrid algorithms can process as many as 470,000 data points. Interestingly, while the Chalmers' algorithm can also handle 600,000 data points with rendering, the 8GB of memory is completely used up, causing heavy thrashing and slowing down the entire machine. Considering that paging does not occur when Link Force crashes the browser tab, the memory requirement may not be the only limiting factor in play.
All in all, since a desirable result cannot be obtained from the Hybrid algorithm if the final stage is skipped, there is no benefit in terms of memory usage from using the Hybrid layout compared to the Chalmers' algorithm. Both of them have a much smaller memory footprint than Link Force and can work with far more data points under the same hardware constraints.
%============================
\subsection{Different Parameters for the Hybrid Layout}
In section \ref{ssec:eval_termCriteria}, it was concluded that the values of the parameters cannot be evaluated on their own. Based on the findings discussed in section \ref{ssec:eval_selectParams}, 10 different combinations of interpolation parameters were chosen: brute force and 1, 3, 6, and 10 pivots, each with and without refinement at the end. Due to possible variations from the sample set $S$, each experiment is also performed 5 times. The data sets used are Poker Hands with 10,000 data points, the largest subset for which stress can be calculated without crashing the web page, and 100,000 data points to highlight the widening difference in interpolation time.
It should also be noted that while the original researchers conducted a similar experiment\cite{Algo2003}, it only explored the difference in execution time between random, brute-force, and pivot-based parent finders. Different numbers of pivots were not taken into consideration and the produced results were assumed to be equal across multiple runs.
\begin{figure}
\centering
\includegraphics[height=5cm]{graphs/hybridParams_2ndTime_10k.png}
~
\includegraphics[height=5cm]{graphs/hybridParams_stress_10k.png}
\includegraphics[height=5cm]{graphs/hybridParams_totalTime_10k.png}
\caption{Comparison of different interpolation parameters of the hybrid layout at 10,000 data points.}
\label{fig:eval_hybridParams10k}
\end{figure}
Figures \ref{fig:eval_hybridParams10k} and \ref{fig:eval_hybridParams100k} show that most of the execution time is spent in the final phase, making the number of iterations very important. While refining the interpolation result takes an insignificant amount of time, it both reduces the stress of the final layout and helps the last phase reach stability much faster across the board. Figure \ref{fig:eval_pivotToggleRefine} also suggests that the produced layout is much more accurate. Without refining, a lot of ``One pair'' (orange) and ``Two pair'' (green) data points circle around ``Nothing'' (blue) when they should not. Thus, there is no compelling reason to disable the refinement step.
Surprisingly, despite the lower time complexity, selecting a higher number of pivots on the smaller data set can result in a higher execution time than brute-force, negating any benefit of using it. At 10,000 data points, 3 pivots take approximately as much time as brute-force, marking the highest sensible number of pivots to use. Looking at the lower numbers, the time saved by using 1 pivot is not reflected in the total time used by the layout. At 100,000 points, however, a significant speed-up can be observed and is reflected in the total execution time. This suggests that pivot-based searching could shine with even larger data sets and slower distance functions.
\begin{figure}
\centering
\includegraphics[height=5cm]{graphs/hybridParams_2ndTime_100k.png}
~
\includegraphics[height=5cm]{graphs/hybridParams_totalIts_100k.png}
\includegraphics[height=5cm]{graphs/hybridParams_2ndTime_100k_blank.png}
~
\includegraphics[height=5cm]{graphs/hybridParams_totalTime_100k.png}
\caption{Comparison of different interpolation parameters of the hybrid layout at 100,000 data points. The two Box and Whisker plots are aligned for ease of comparison.}
\label{fig:eval_hybridParams100k}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=5cm]{layout/Pivot6_0_100k.png}
\caption{No interpolation refinement}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=5cm]{layout/Pivot6_20_100k.png}
\caption{20 steps of Interpolation refinement}
\end{subfigure}
\caption{Comparison of the produced layout using 6 pivots, 100,000 data points}
\label{fig:eval_pivotToggleRefine}
\end{figure}
Between brute-force and 1 pivot, there is no visual difference aside from the run-to-run variation. The stress measurement seems to support this subjective opinion. On the other hand, brute-force seems to result in a more consistent total execution time. Considering that refinement is stronger with bigger data sets, as there are more points to compare against, it makes sense that the effect of low accuracy is easily corrected in larger data sets.
In summary, to obtain a quality layout, the refining step of the interpolation phase cannot be skipped. Pivot-based searching only provides a significant benefit with very large data sets and/or slow distance functions. Otherwise, the brute-force method consistently yields a better layout in less time.
%============================
\subsection{The 3-way comparison}
Figure \ref{fig:eval_multiAlgo} shows the execution time and the stress of the produced layouts for each algorithm on various data sets. The results reveal that the execution time of the Hybrid algorithm (figure \ref{sfig:eval_multiAlgoTime}) is superior across the board. The difference compared to the Chalmers' algorithm is so large that the time differences caused by the interpolation parameters seem insignificant. It should be noted that with smaller data sets, the processing time of each iteration can be shorter than 17 milliseconds, the time between frames on a typical monitor running at 60 frames per second. In d3-force, processing is put on idle until the next screen refresh, so the total execution time is dictated by the number of iterations rather than the computation itself.
\begin{figure}[h] % GRAPH
\centering
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=4cm]{graphs/multiAlgoTime.png}
\caption{Execution time for up to 100,000 data points of the Poker Hands data set}
\label{sfig:eval_multiAlgoTime}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}{0.45\textwidth}
\includegraphics[height=4cm]{graphs/multiAlgoStress.png}
\caption{Relative stress of each finished layout, compared to Link Force, on different data sets}
\label{sfig:eval_multiAlgoStress}
\end{subfigure}
\caption{Comparison between different algorithms}
\label{fig:eval_multiAlgo}
\end{figure}
\begin{figure} % Poker 10k
\centering
\begin{subfigure}[t]{0.3\textwidth}
\includegraphics[width=\textwidth]{layout/Poker10kLink.png}
\caption{Link Force}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}[t]{0.3\textwidth}
\includegraphics[width=\textwidth]{layout/Poker10kNeighbour.png}
\caption{Chalmers' 1996}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}[t]{0.3\textwidth}
\includegraphics[width=\textwidth]{layout/Poker10kHybrid.png}
\caption{Hybrid Layout}
\end{subfigure}
\caption{Visualisations of 10,000 data points from the Poker Hands data set.}
\label{fig:eval_Poker10k}
\end{figure}
As for the stress, a relative value is used for comparison. Figure \ref{sfig:eval_multiAlgoStress} shows that the Hybrid algorithm results in a layout with lower stress overall. The trend also implies that the more data points available, the better the Chalmers' and Hybrid algorithms perform. In any case, Link Force always has the lowest stress.
Comparing the produced layouts at 10,000 data points (figure \ref{fig:eval_Poker10k}), Hybrid better reproduces the space between large clusters seen in Link Force's layout. For example, ``Nothing'' (blue) and ``One pair'' (orange) have a clearer gap; ``Two pairs'' (green) and ``Three of a kind'' (red) overlap less; ``Three of a kind'' and ``Straight'' (brown) mix together in the Chalmers' layout but are more separated in the Hybrid layout. However, for other classes with fewer data points (colored brown, purple, pink, ...), the Hybrid layout fails to form a cluster, causing them to spread out even more. The same phenomenon can be observed at 100,000 data points (figure \ref{fig:eval_Poker100k}).
\begin{figure}[h] % Poker 100k
\centering
\begin{subfigure}[t]{0.6\textwidth}
\includegraphics[width=\textwidth]{layout/Poker100kNeighbour.png}
\caption{Chalmers' 1996}
\end{subfigure}
\begin{subfigure}[t]{0.6\textwidth}
\includegraphics[width=\textwidth]{layout/Poker100kHybrid.png}
\caption{Hybrid Layout}
\end{subfigure}
\caption{Visualisations of 100,000 data points from the Poker Hands data set.}
\label{fig:eval_Poker100k}
\end{figure}
The area where the 1996 and Hybrid algorithms fall short is the consistency of the layout quality on smaller data sets. Sometimes, both algorithms stop at a local minimum of stress instead of the global one, resulting in an inaccurate layout. Figures \ref{fig:eval_IrisBad} and \ref{fig:eval_Poker100Bad} show examples of such occurrences. If the 1996 algorithm were allowed to continue the calculation, the layout would eventually reach the true stable position, depending on when the right combination of $Samples$ sets is randomised to knock the system off its local stable position.
\begin{figure}[h] % Iris BAD
\centering
\begin{subfigure}[t]{0.45\textwidth}
\centering
\includegraphics[height=5cm]{layout/IrisNormalProper.png}
\caption{A typical result from the 1996 Algorithm}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}[t]{0.45\textwidth}
\centering
\includegraphics[height=5cm]{layout/IrisBadProper.png}
\caption{A result from the 1996 Algorithm when the layout stabilises at a local minimum.}
\end{subfigure}
\caption{Results from the Chalmers' 1996 algorithm on the Iris data set with the exact same parameters.}
\label{fig:eval_IrisBad}
\end{figure}
\begin{figure}[h] % Poker 100 BAD
\centering
\begin{subfigure}[t]{0.45\textwidth}
\centering
\includegraphics[height=5.2cm]{layout/Poker100Hybrid.png}
\caption{A result with less stress.}
\end{subfigure}
~ %add desired spacing between images, if blank, line break
\begin{subfigure}[t]{0.45\textwidth}
\centering
\includegraphics[height=5.2cm]{layout/Poker100HybridBad.png}
\caption{A result with higher stress.}
\end{subfigure}
\caption{Variations in the results from the Hybrid layout on 100 data points from the Poker Hands data set with the same parameters.}
\label{fig:eval_Poker100Bad}
\end{figure}
\begin{figure} % Antartica
\centering
\begin{subfigure}[t]{0.6\textwidth}
\includegraphics[width=\textwidth]{layout/AntarticaLinkDay.png}
\caption{Link Force}
\end{subfigure}
\end{figure}
\begin{figure}
\centering
\ContinuedFloat
\begin{subfigure}[t]{0.6\textwidth}
\includegraphics[width=\textwidth]{layout/AntarticaNeighbourDay.png}
\caption{Chalmers' 1996}
\end{subfigure}
\end{figure}
\begin{figure}
\centering
\ContinuedFloat
\begin{subfigure}[t]{0.6\textwidth}
\includegraphics[width=\textwidth]{layout/AntarticaHybridDay.png}
\caption{Hybrid Layout}
\end{subfigure}
\caption{Visualisations of the Antartica data set, color-keyed to Day.}
\label{fig:eval_Antartica}
\end{figure}
Moving to the Antartica data set with a more complicated pattern, all three algorithms produce very similar results (figure \ref{fig:eval_Antartica}). The biggest clustering difference is located around the top center of the image. In Link Force, day 17 (brown) and day 18 (lime) are lined up clearly, while the others fail to replicate this fine detail. The Hybrid layout also fails to distinguish days 17, 18, 19 (pink) and 20 (grey) from each other in that corner. Aside from that, Hybrid forms a layout slightly more similar to that of Link Force. Considering that the times used by Link Force, the 1996 algorithm and Hybrid are approximately 14, 8, and 3.2 seconds respectively, it is hard to argue against using the Hybrid layout.
%============================
\section{Summary}
Each algorithm demonstrates its own strengths and weaknesses in different tests. For smaller data sets with a few thousand data points, Link Force works well and performs consistently. Most information visualisations on a web page will not hit the limitations of the algorithm. In addition, it allows real-time object interaction and produces smooth animations, which might be more important to most users. However, for a fully-connected spring model with over 1,000 data points, the start-up time spent on distance caching starts to become noticeable and each iteration can take longer than the 17ms time limit, dropping the animation below 60fps and causing visible stuttering and slowdown. Its memory-hungry nature also limits its ability to run on the lower-end computers that a significant portion of Internet users possess.
When bigger data sets are loaded and interactivity is not a concern, performing the Hybrid layout's interpolation strategy before running the 1996 algorithm results in a better layout in a shorter amount of time. It should be noted that this method does not work consistently with smaller data sets, making Link Force the better option there. As for the interpolation, the simple brute-force method is the better choice in general. Pivot-based searching does not significantly decrease the computation time unless a very large data set is concerned, and its result is less predictable.
Looking back at the older Java implementation from 2002 running on an Intel\textsuperscript{\textregistered} Pentium III\cite{Algo2002}, a 3-dimensional data set with 30,000 data points used to require over 10 minutes with the Chalmers' algorithm and approximately 3 minutes with the Hybrid algorithm\cite{Algo2003}. Compared to now, where 30,000 data points of Poker Hands, even with parameters stored as a text-keyed dictionary object rather than a memory-offset-based class, can be visualised in 1.5 minutes with the Chalmers' and 14 seconds with the Hybrid algorithm, it is clear that the performance of general consumer devices has improved greatly.
Overall, these algorithms are all valuable tools. It is up to the developer to use the right tool for the application.
%============================
%==============================================================================
%%%%%%%%%%%%%%%%
% %
% Conclusion %
% %
%%%%%%%%%%%%%%%%
\chapter{Conclusion}
\label{ch:conc}
\section{Summary on the project achievements}
The following is a summarized list of the work carried out from beginning to end, over the course of two semesters.
\begin{itemize}
\item \textbf{Studied algorithms:} Each algorithm and the relevant research were studied and understood.
\item \textbf{Researched and assessed libraries:} Open-source JavaScript libraries were looked into. The D3-force module was inspected and assessed for potential faults.
\item \textbf{Modified D3 Link Force:} The D3 Link Force implementation was forked and optimized for use with a complete graph, such as multidimensional scaling using a spring model. An applied-force tracker was also added, allowing the user to stop the simulation once the system has stabilised.
\item \textbf{Modified d3-neighbour-sampling plug-in:} Chalmers' implementation was tweaked to scale the applied force against a constant rather than relying on a decreasing value. The applied-force tracker was added and the evaluation application's interface was updated to include the newer algorithms.
\item \textbf{Implemented interpolation algorithms for the Hybrid layout:} Interpolation functions were implemented with support for both pivot-based and brute-force parent finding.
\item \textbf{Implemented the Hybrid simulation controller:} A JavaScript object was created as part of the plug-in to control a D3 Simulation object through the three phases of the Hybrid layout algorithm.
\item \textbf{Evaluated interpolation parameters:} Since the interpolation process has many parameters, several values were tested and their impact was evaluated, both independently and as a whole system. A good combination of parameters was found through the experiments.
\item \textbf{Compared the three algorithms:} Each algorithm's strengths and weaknesses were identified and compared. Link Force was found to work well only on small data sets and does not scale, while the Hybrid layout only performs well on larger ones.
\end{itemize}
\section{Learning Experience}
The project had many challenges that helped me learn about both software engineering and research practices. Working with older research papers, I encountered a lot of ambiguity in otherwise thorough-looking descriptions. In terms of software engineering, neither D3-force nor d3-neighbour-sampling has documentation of the interfaces between components. A lot of time was spent figuring out how the objects interact with each other and what the flow of the system is. At the same time, the free and open-source license of D3 allowed me to easily access the source code to learn from and customise components such as Link Force. This project also helped me expand my knowledge of client-side web application technologies and their fast development.
As a result of evaluating this project, I believe that I have a better understanding of designing and conducting experiments on software performance. Furthermore, I also gained valuable knowledge of JavaScript behaviour and the limitations of the performance profiling tools in different browsers.
\section{Future Work}
There are several areas of the project that were not thoroughly explored or could be improved. This section outlines several directions that could enhance the application.
\begin{itemize}
\item \textbf{Incorporating Chalmers' 1996 and Hybrid interpolation algorithms into the D3 framework:} Currently, all the implementations are published on a publicly-accessible, self-hosted Git server as a D3 plug-in. While the Hybrid algorithm as a whole seems to make more sense as a user application, the improved Chalmers' algorithm and the interpolation functions could be integrated into the core functionality of the D3 library.
\item \textbf{Data exploration test:} The project focuses on the overall layouts produced by each algorithm and a single stress metric. One of the goals of MDS is to explore data, which has not been assessed. A good tool and layout should help users identify patterns and the meanings behind small clusters with less effort. The project could be extended to include data investigation tools.
\item \textbf{Data sets:} The evaluation focuses mainly on one data set. It is possible that the algorithms behave differently on data sets with different dimensionality, data types and distance functions. Hence, the findings in chapter \ref{ch:eval} may not apply to all of them.
\item \textbf{Optimal parameter generalisation:} So far, good combinations of parameters have only been determined for a specific data set. These values may not be universally optimal and can vary from data set to data set. Even the threshold value for stopping the Chalmers' algorithm varies between different sizes of subsets of the same Poker Hands data set. Future research could be conducted to find the relation between these parameters and other properties of the data set.
\item \textbf{GPU rendering:} The use of GPUs for general-purpose computing (GPGPU) is gaining popularity because a GPU can perform simple calculations in parallel much faster than a CPU. The Khronos Group introduced WebCL\cite{WebCL}, an OpenCL-like standard for web browsers; however, it never gained popularity and was not adopted by any browser.
Other efforts, such as gpu.js\cite{gpujs}, turn to the OpenGL Shading Language (GLSL) on WebGL instead. While the latest WebGL 2.0 does not support compute shaders due to the limited feature set of OpenGL ES 3.0\cite{WebGL2}, all of the mathematical operations used by the algorithms in this project are supported. Following this approach, the Chalmers' and the interpolation algorithms could be ported to GLSL in the future.
\item \textbf{asm.js and WebAssembly:} As discussed in section \ref{ssec:des_js}, coding in lower-level languages such as C and C++ and compiling them to asm.js or WebAssembly could speed up the execution. Over the course of the project, support for WebAssembly has been growing, with more learning resources available online. It is now supported by major web browsers such as Firefox, Chrome, Safari, and Edge. The project could be ported to these languages to potentially reduce the execution time even further.
\item \textbf{More-efficient hashing algorithms for parent finding:} Over the past decade, the fields of machine learning and data mining have gained a lot of interest. Many improvements have been made in solving related problems, including high-dimensional near neighbour searching. Newer algorithms, such as data-dependent Locality-Sensitive Hashing\cite{LSH}, could provide better execution times or more accurate results. Future research could incorporate these newer algorithms into the interpolation process of the Hybrid layout and evaluate any difference they make.
\item \textbf{Multi-threading with HTML5 Web Workers:} By nature, JavaScript is designed to be single-threaded. HTML5 Web Workers allow new threads to be created and run concurrently. These workers have isolated memory spaces and are not attached to the HTML document; the only way for them to communicate is message passing. Objects passed are serialised by the sender and de-serialised on the other end, creating additional overhead (see the sketch after this list). Due to the size of the objects the program has to work with, it was estimated that the overhead would outweigh the benefit, so support was not implemented. In the future, support could be added to verify this hypothesis.
\end{itemize}
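
As a rough illustration of the message-passing overhead mentioned in the last item, a worker-based tick loop would have to copy the whole node array in both directions on every exchange (the worker file name, message shape and \texttt{applyPositions} helper are hypothetical):
\begin{lstlisting}
// main thread (hypothetical): each exchange serialises the full node
// array on send and de-serialises it on receive, which is the
// overhead discussed above.
var worker = new Worker("layout-worker.js");
worker.postMessage({ type: "tick", nodes: simulation.nodes() });
worker.onmessage = function (event) {
  // event.data.nodes is a fresh copy, not the original node objects
  applyPositions(event.data.nodes);
};
\end{lstlisting}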
\section{Acknowledgements}
I would like to thank Matthew Chalmers for his guidance and feedback throughout the entire development process.
%%%%%%%%%%%%%%%%
% %
% APPENDICES %
% %
%%%%%%%%%%%%%%%%
\begin{appendices}
\chapter{Running the evaluation application}
The web application can run locally by loading a single HTML file. It is located at
\begin{verbatim}
examples/example-papaparsing.html
\end{verbatim}
The data sets used can also be found at
\begin{verbatim}
examples/data
\end{verbatim}
Please note that a modern browser is required to run the application. Firefox 57 and Chrome 61 were tested, but some older versions might also work.
Most of the settings are available in the web interface. However, the cut-off value that stops the simulation when the system stabilises is not. To change it, navigate to
\begin{verbatim}
examples/js/algos/[algorithmName].js
\end{verbatim}
and edit the parameter of the \texttt{stableVelocity()} method.
Aside from the Poker Hands data set, all tests use the ``General'' distance function, which is similar to the Euclidean distance but scales distances per feature and supports strings and dates. It also ignores the \texttt{index} and \texttt{type} fields in the CSV file so that the label of the Iris data set is not taken into account.
The Euclidean distance function ignores the \texttt{class}, \texttt{app}, \texttt{user}, \texttt{weekday} and \texttt{type} fields; these belong to other data sets not used in this project. It will crash when other fields contain non-number values, and the error can be seen in the JavaScript console.
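
As a rough illustration of the behaviour described above, a ``General''-style distance could look like the following sketch (this is not the plug-in's actual code; \texttt{featureRange} is assumed to hold a precomputed, non-zero range per feature):
\begin{lstlisting}
// Illustrative "General" distance: per-feature scaled differences,
// timestamps for dates, an equality test for strings, and ignored
// fields skipped entirely.
var IGNORED = { index: true, type: true };

function generalDistance(a, b, featureRange) {
  var sum = 0;
  Object.keys(a).forEach(function (key) {
    if (IGNORED[key]) return;
    var va = a[key], vb = b[key], diff;
    if (typeof va === "number" || va instanceof Date) {
      diff = (va - vb) / featureRange[key]; // scale per feature
    } else {
      diff = (va === vb) ? 0 : 1;           // strings
    }
    sum += diff * diff;
  });
  return Math.sqrt(sum);
}
\end{lstlisting}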
\chapter{Setting up development environment}
The API reference and instructions for building the plug-in are available in the README.md file. Please note that the build scripts are written for Ubuntu and may have to be adapted for other distributions or operating systems. A built JavaScript file of the plug-in is already included with the submission, hence re-building is unnecessary.
For the plug-in, the dependencies have to be fetched and installed first. Assuming that a recent version of Node.js and the node package manager is already installed, run
\begin{verbatim}
npm install
\end{verbatim}
To compile and pack the plug-in, run
\begin{verbatim}
npm run build
npm run minify
npm run zip
\end{verbatim}
The output files will be located in the \texttt{build} directory.
The evaluation web page is self-contained and can be edited with any text editor, without Node.js. It loads the plug-in from the \texttt{build} directory. When a new build of the plug-in is compiled, simply refreshing the web page will load it.
The code is currently hosted on a personal, publicly-accessible Git service at \url{https://git.win32exe.tech/brian/d3-spring-model}. Since this is a personal bare-metal server, it will be maintained on a best-effort basis without guarantees.
\end{appendices}
%%%%%%%%%%%%%%%%%%%%
% BIBLIOGRAPHY %
%%%%%%%%%%%%%%%%%%%%
\bibliographystyle{plainurl}
\bibliography{l4proj}
\end{document}