Skip to content

A C++ implementation of a Bitcoin address clustering procedure.

Notifications You must be signed in to change notification settings

Marco-Venturini/BitcoinAddressClustering-maesa

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BitcoinAddressClustering

This repository contains a C++ implementation of the Bitcoin address clustering procedure described in [1].

Our implementation relies on the igraph library and consists of two distinct executables: the graph builder and the analyzer.

Graph builder

Given a list of transactions, the procedure produces a partition (i.e., a clustering) of all Bitcoin addresses included in such transactions. The clustering is constructed in accordance with the multi-input heuristic.

More precisely, the program reads a list of Bitcoin transactions and constructs a graph, called auxiliary graph, defined as follows.

  1. Nodes correspond to all addresses appearing in the transactions. Both input and output addresses are considered.
  2. For each transaction T in the list, there is a path among the nodes representing the input addresses of T.

The input file contains a textual representation of Bitcoin transactions. In this repository it is possible to find a more detailed description of the required input file format.

The output graph is represented as a binary file containing a sequence of 32-bit signed integers (in big-endian format). The file has the following format:

  1. the first 32 bits represent the number of nodes N;
  2. the next 32 bits represent the number of edges M;
  3. the remaining M pairs of 32-bit integers represent the edges of the graph. More precisely, the first (resp. second) integer corresponds to the identifier of the source (resp. target) node of the edge.

Graph analyzer

This program reads the graph produced by the builder and analyzes it by computing its connected components.

The output of this program is a CSV file containing one line for each node of the auxiliary graph. Each line contains the node identifier and the identifier of the component the node belongs to, separated by a comma.

References

[1] Di Francesco Maesa, Damiano, Andrea Marino, and Laura Ricci. "Data-driven analysis of bitcoin properties: exploiting the users graph." International Journal of Data Science and Analytics 6 (2018): 63-80.

About

A C++ implementation of a Bitcoin address clustering procedure.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 96.4%
  • Makefile 3.6%