Graph analysis techniques and applications to bitcoin forensics

Flow based networks have been studied over the years, which have considered different interpretations of a flow, for instance where a flow may be based on transfer (e.g., package delivery), or serial replication (e.g., one-to-one gossip), or parallel duplication (e.g., epidemic spread). However, the...

Full description

Saved in:
Bibliographic Details
Main Author: Phetsouvanh, Silivanxay
Other Authors: Anwitaman Datta
Format: Theses and Dissertations
Language:English
Published: 2019
Subjects:
Online Access:https://hdl.handle.net/10356/106803
http://hdl.handle.net/10220/49668
https://doi.org/10.32657/10220/49668
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-106803
record_format dspace
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Phetsouvanh, Silivanxay
Graph analysis techniques and applications to bitcoin forensics
description Flow based networks have been studied over the years, which have considered different interpretations of a flow, for instance where a flow may be based on transfer (e.g., package delivery), or serial replication (e.g., one-to-one gossip), or parallel duplication (e.g., epidemic spread). However, the study of flows which may split while conserving the overall volume at each split is relatively sparse. In this thesis we look at certain aspects of such split-and-transfer based flows with volume conservation. Specifically, we (i) design general purpose flow based graph analysis techniques and demonstrate their applicability by studying a wide range of real networks, including the network induced by Bitcoin transactions; and (ii) design analysis techniques which are specific to the Bitcoin network, and showcase the efficacy of these techniques in conjunction with our novel flow-based graph analysis algorithms with case studies. Pertaining the general purpose graph analysis, we first consider the notion of entropic centrality which measures how central a node is in terms of how uncertain the destination of a flow starting at this node is: the more uncertain the destination (i.e., larger the spread of the flow), the more well connected and thus central the node is deemed. Prior works on entropic centrality implicitly assumed that the flow is indivisible, and at every node, the flow is transferred from one edge to another. In this thesis, we propose a split-and-transfer flow model for entropic centrality, where at every node, the flow can actually be arbitrarily split across choices of neighbours, while conserving the overall volume. This interpretation of volume conserved non-atomic flows better reflects certain applications, including Bitcoin transactions. We consider two variations. In one, we constrain that a down-stream flow cannot revisit a node that has already been visited upstream (no cycles), and show how to relate this model to an equivalent transfer based entropic centrality one for computational tractability. We then relax the `no cycle' constraint, and explore a Markov process of random walk which simplifies the computation. These entropic centrality models can easily accommodate directed and weighted graphs. Based on the latter variant of (Markov) entropic centrality model, we next design a 2-stage clustering algorithm, which in the first stage is informed by the entropic centrality to detect local communities based on random walks, and then performs a bottom-up iterative aggregation. We furthermore explore graph clustering by using an entropic distance based approach as an alternative to distribution of flow based approach as described above. To that end, we revisit a Renyi entropy based measure introduced originally for image clustering, and study its application to graph clustering. To effectuate Renyi entropy based graph clustering, we propose a simulated annealing algorithm, as well as a hierarchical approach. While the simulated annealing approach is practical for a range of graphs, our early results indicate that it (or at least our realization) does not scale for large graphs. This was an immediate motivation for the exploration of a hierarchical agglomerative approach which is naturally amenable to (not explored in this thesis) distribution and parallelization. Finally, we demonstrate the use of the above novel information theory based algorithms to derive macroscopic information from the Bitcoin network. We augment these macroscopic general purpose graph algorithms with Bitcoin specific graph analysis algorithms that can be used to carry out microscopic analysis of a zoomed in Bitcoin subnetwork. Specifically, we study path confluence based address aggregation and Bitcoin flow traceback/forward mechanisms, which are centred around Bitcoin addresses of potential interest. In addition to that, we designed and developed a visualization aided graph mining tool, BiVA, which enables Bitcoin network data exploration, visualization of subgraphs around nodes of interest, and integrates both standard and new algorithms, including the aforementioned general algorithms for flow based centrality and clustering, and the Bitcoin microscopic graph analysis algorithms.
author2 Anwitaman Datta
author_facet Anwitaman Datta
Phetsouvanh, Silivanxay
format Theses and Dissertations
author Phetsouvanh, Silivanxay
author_sort Phetsouvanh, Silivanxay
title Graph analysis techniques and applications to bitcoin forensics
title_short Graph analysis techniques and applications to bitcoin forensics
title_full Graph analysis techniques and applications to bitcoin forensics
title_fullStr Graph analysis techniques and applications to bitcoin forensics
title_full_unstemmed Graph analysis techniques and applications to bitcoin forensics
title_sort graph analysis techniques and applications to bitcoin forensics
publishDate 2019
url https://hdl.handle.net/10356/106803
http://hdl.handle.net/10220/49668
https://doi.org/10.32657/10220/49668
_version_ 1681042669187366912
spelling sg-ntu-dr.10356-1068032019-12-10T14:46:32Z Graph analysis techniques and applications to bitcoin forensics Phetsouvanh, Silivanxay Anwitaman Datta School of Computer Science and Engineering Parallel and Distributed Computing Centre Engineering::Computer science and engineering Flow based networks have been studied over the years, which have considered different interpretations of a flow, for instance where a flow may be based on transfer (e.g., package delivery), or serial replication (e.g., one-to-one gossip), or parallel duplication (e.g., epidemic spread). However, the study of flows which may split while conserving the overall volume at each split is relatively sparse. In this thesis we look at certain aspects of such split-and-transfer based flows with volume conservation. Specifically, we (i) design general purpose flow based graph analysis techniques and demonstrate their applicability by studying a wide range of real networks, including the network induced by Bitcoin transactions; and (ii) design analysis techniques which are specific to the Bitcoin network, and showcase the efficacy of these techniques in conjunction with our novel flow-based graph analysis algorithms with case studies. Pertaining the general purpose graph analysis, we first consider the notion of entropic centrality which measures how central a node is in terms of how uncertain the destination of a flow starting at this node is: the more uncertain the destination (i.e., larger the spread of the flow), the more well connected and thus central the node is deemed. Prior works on entropic centrality implicitly assumed that the flow is indivisible, and at every node, the flow is transferred from one edge to another. In this thesis, we propose a split-and-transfer flow model for entropic centrality, where at every node, the flow can actually be arbitrarily split across choices of neighbours, while conserving the overall volume. This interpretation of volume conserved non-atomic flows better reflects certain applications, including Bitcoin transactions. We consider two variations. In one, we constrain that a down-stream flow cannot revisit a node that has already been visited upstream (no cycles), and show how to relate this model to an equivalent transfer based entropic centrality one for computational tractability. We then relax the `no cycle' constraint, and explore a Markov process of random walk which simplifies the computation. These entropic centrality models can easily accommodate directed and weighted graphs. Based on the latter variant of (Markov) entropic centrality model, we next design a 2-stage clustering algorithm, which in the first stage is informed by the entropic centrality to detect local communities based on random walks, and then performs a bottom-up iterative aggregation. We furthermore explore graph clustering by using an entropic distance based approach as an alternative to distribution of flow based approach as described above. To that end, we revisit a Renyi entropy based measure introduced originally for image clustering, and study its application to graph clustering. To effectuate Renyi entropy based graph clustering, we propose a simulated annealing algorithm, as well as a hierarchical approach. While the simulated annealing approach is practical for a range of graphs, our early results indicate that it (or at least our realization) does not scale for large graphs. This was an immediate motivation for the exploration of a hierarchical agglomerative approach which is naturally amenable to (not explored in this thesis) distribution and parallelization. Finally, we demonstrate the use of the above novel information theory based algorithms to derive macroscopic information from the Bitcoin network. We augment these macroscopic general purpose graph algorithms with Bitcoin specific graph analysis algorithms that can be used to carry out microscopic analysis of a zoomed in Bitcoin subnetwork. Specifically, we study path confluence based address aggregation and Bitcoin flow traceback/forward mechanisms, which are centred around Bitcoin addresses of potential interest. In addition to that, we designed and developed a visualization aided graph mining tool, BiVA, which enables Bitcoin network data exploration, visualization of subgraphs around nodes of interest, and integrates both standard and new algorithms, including the aforementioned general algorithms for flow based centrality and clustering, and the Bitcoin microscopic graph analysis algorithms. Doctor of Philosophy 2019-08-16T05:08:03Z 2019-12-06T22:18:41Z 2019-08-16T05:08:03Z 2019-12-06T22:18:41Z 2019 Thesis Phetsouvanh, S. (2019). Graph analysis techniques and applications to bitcoin forensics. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/106803 http://hdl.handle.net/10220/49668 https://doi.org/10.32657/10220/49668 en 209 p. application/pdf