author     Niklas Halle <niklas@niklashalle.net>  2021-02-21 16:24:39 +0100
committer  Niklas Halle <niklas@niklashalle.net>  2021-02-21 16:24:39 +0100
commit     32530d168a1e96209dba43d2087747786f4f5841 (patch)
tree       288a3d26cbcfe6277082299776aba1a93d7c3734
parent     f10119b9443f821a8202cd5235f939d4a1967e04 (diff)
download   bachelor_thesis-32530d168a1e96209dba43d2087747786f4f5841.tar.gz
           bachelor_thesis-32530d168a1e96209dba43d2087747786f4f5841.zip
implemented some feedback
-rw-r--r--  latex/proposal/Proposal.pdf  bin  211300 -> 211289 bytes
-rw-r--r--  latex/proposal/Proposal.tex  12
2 files changed, 6 insertions, 6 deletions
diff --git a/latex/proposal/Proposal.pdf b/latex/proposal/Proposal.pdf
index e1f84ec..df4047b 100644
--- a/latex/proposal/Proposal.pdf
+++ b/latex/proposal/Proposal.pdf
Binary files differ
diff --git a/latex/proposal/Proposal.tex b/latex/proposal/Proposal.tex
index 6dd80c6..788208f 100644
--- a/latex/proposal/Proposal.tex
+++ b/latex/proposal/Proposal.tex
@@ -27,23 +27,23 @@
\maketitle
\section{Purpose}
- We want to compare the performance of “traditional” algorithms to find specific graph properties with the performance of a machine learning-based approach centred around graph embedding (on the grounds of random walks (\ntv)).\\Evaluation shall include speed, accuracy, stability and scalability \improvement{more/less factors to compare2?} in order to find whether certain problems might benefit from such an machine learning based approach. \improvement{Does that cover it?}
+ I want to compare the performance of “traditional” algorithms for finding specific graph properties with that of a machine-learning-based approach centred on graph embedding (based on random walks (\ntv)).\\The evaluation shall cover speed, accuracy, stability and scalability \improvement{more/fewer factors to compare?} in order to find out whether certain problems might benefit from such a machine-learning-based approach. \improvement{Does that cover it?}
\inlineToDo{Maybe mention \nk\ somehow?}
\section{Justification}
While some research into embedding-based approaches has already been conducted, there is little direct comparison within the same framework, especially one as widely used as \nk.\\
- Finding whether the interface provided and data by \nk is useful for certain problems and maybe even competitive with “traditional” algorithms can help further research in embedding based approaches and their application.
+ Finding out whether the interface and data provided by \nk\ are useful for certain problems, and perhaps even competitive with “traditional” algorithms, can help further research into embedding-based approaches and their application.
\inlineToDo{TODO: extend?}
\section{Literature review}
There are a few articles and papers on graph embedding and its application. A good example is \cite{GOYAL201878}, a survey of different graph embedding techniques that evaluates their applications and performance. It also includes the random-walk-based \ntv, which will form the basis of this thesis.\\
They find that “Choosing the right balance enables \ntv\ to preserve community structure as well as structural equivalence between nodes” but also that “Embeddings learned by \ntv\ have low reconstruction precision,” “\ntv\ outperforms other methods on the task of node classification,” and “\ntv\ preserves homophily as well as structural equivalence between nodes [which] suggest this can be useful in node classification”. Comparing it with “traditional” algorithms for clustering in \nk\ (which, in essence, is node classification) therefore suggests itself.\\
Another interesting study on embedding and clustering is \cite{rozemberczki2019gemsec}, which is quite a bit different though, as it does the embedding and the clustering step at once, not in sequence based on the embedding.\\
- On the other hand, there is few data on the effectiveness of \ntv\ for approximating different centrality scores. Therefore we will also evaluate two of theses, to see if approximation of specific numeric properties of graphs might also be feasible using this approach in \nk.
+ On the other hand, there is little data on the effectiveness of \ntv\ for approximating different centrality scores. Therefore I will also evaluate two of these, to see whether approximating specific numeric properties of graphs might also be feasible with this approach in \nk.
\section{Method}
- The broad idea is to find a good set of synthetic as well as real world graphs, for which cluster can either be retrieved well or are already known. For the same graphs we want to have a small set of properties besides clustering which can be calculated exactly and have not been (a major) part of previous studies, in order to see how an \ntv-embedding + machine learning approach might perform on those.\\
- I will then familiarize myself with the machine learning framework \tf, which I intend to use for the design, training and evaluation of the machine learning based part. Planned is one network for each of the properties, but if the time is enough we could also look into trying to create one which might try to do multiple at once (which obviously could heavily boost performance \unsure{Good idea?}).\\
+ The broad idea is to find a good set of synthetic as well as real-world graphs for which clusters either can be retrieved well or are already known. For the same graphs, I want a small set of properties besides clustering that can be calculated exactly and have not been (a major) part of previous studies, in order to see how an \ntv-embedding + machine learning approach performs on those.\\
+ I will then familiarize myself with the machine learning framework \tf, which I intend to use for the design, training and evaluation of the machine-learning-based part. One network is planned for each of the properties, but if time permits I could also look into creating a single network that handles several properties at once (which could considerably boost performance \unsure{Good idea?}).\\
Afterwards, the results of the machine learning part will be compared with those of the “traditional” algorithms already provided by \nk.
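
[Editor's note: a minimal sketch of the comparison pipeline described above, not part of the proposal text. It assumes \ntv\ (node2vec) is reachable as networkit.embedding.Node2Vec and uses betweenness centrality as a stand-in for the properties still to be chosen; graph size, embedding parameters and the network architecture are placeholders.]

import numpy as np
import networkit as nk
import tensorflow as tf

# Synthetic test graph from a known generator model (placeholder parameters).
G = nk.generators.ErdosRenyiGenerator(1000, 0.01).generate()

# Exact scores from a "traditional" NetworKit algorithm as training targets
# (betweenness centrality here, purely as an example property).
exact = np.array(nk.centrality.Betweenness(G, normalized=True).run().scores())

# Random-walk-based node2vec embedding; assumes networkit.embedding exposes
# Node2Vec -- otherwise an external node2vec implementation could be substituted.
n2v = nk.embedding.Node2Vec(G, P=1.0, Q=1.0, L=80, N=10, D=128)
n2v.run()
X = np.array(n2v.getFeatures())  # one D-dimensional vector per node

# One small regression network per property, built with tf.keras.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, exact, epochs=20, batch_size=64, verbose=0)

# Compare the learned approximation against the exact scores.
pred = model.predict(X, verbose=0).ravel()
print("mean absolute error:", float(np.abs(pred - exact).mean()))

[In the actual evaluation, training and test data would of course come from different graphs, and speed, accuracy, stability and scalability would be measured rather than a single error value.]
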
\inlineToDo{Describe your proposed research methodology: Qualitative or quantitative etc\\
Describe your time frame\\
@@ -86,7 +86,7 @@
\end{enumerate}
\section{Dissemination}
- \improvement{Not sure about this section, do we need it?}
+ \improvement{Not sure about this section, do I need it?}
\inlineToDo{Describe how the findings will be used\\
Evaluate this use\\
Describe how the research findings will be disseminated}