This repository contains code to cluster legal network data. It is, inter alia, used to produce the results reported in the following publications:
- Daniel Martin Katz, Corinna Coupette, Janis Beckedorf, and Dirk Hartung, Complex Societies and the Growth of the Law, Sci. Rep. 10 (2020), https://doi.org/10.1038/s41598-020-73623-x
- Corinna Coupette, Janis Beckedorf, Dirk Hartung, Michael Bommarito, and Daniel Martin Katz, Measuring Law Over Time, to appear (2021)
Related Repositories:
- Complex Societies and the Growth of the Law (Publication Release)
- Measuring Law Over Time
- Legal Data Preprocessing (Latest Publication Release)
Related Data:
- Preprocessed Input Data for Sci. Rep. 10 (2020)
- Preprocessed Input Data for Measuring Law Over Time, to appear (2021)
- It is assumed that you have Python 3.7 installed. (Other versions are not tested.)
- Set up a virtual environment and activate it. (This is not required but recommended.)
- Install the required packages: pip install -r requirements.txt
One option is to generate the required data yourself using https://github.com/QuantLaw/legal-data-preprocessing (also available at https://doi.org/10.5281/zenodo.4070772).
Another option is to use the generated data from the related datasets (see above). This repository also contains the clustering results. To execute the clustering yourself, you only need the directories listed below. Remove all other directories; otherwise, clustering steps might be skipped.
Required files for Germany, relative to this repository:
../legal-networks-data/de/2_xml
../legal-networks-data/de/4_crossreference_graph
../legal-networks-data/de/5_snapshot_mapping_edgelist
Required files for the USA, relative to this repository:
../legal-networks-data/us/2_xml
../legal-networks-data/us/4_crossreference_graph
../legal-networks-data/us/5_snapshot_mapping_edgelist
The combined data of statutes and regulations is located in the de_reg and us_reg folders next to the de and us folders.
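Before starting a run, it can help to verify that each dataset folder contains exactly the required directories. The following is a minimal sketch, not part of this repository: the helper name check_dataset_dir is hypothetical, and it assumes the directory names listed above. Whether the de_reg and us_reg folders use the same three subdirectory names is also an assumption, so the example only checks de and us.

```python
from pathlib import Path

# Subdirectories this README lists as required for clustering.
REQUIRED_SUBDIRS = (
    "2_xml",
    "4_crossreference_graph",
    "5_snapshot_mapping_edgelist",
)


def check_dataset_dir(data_root, dataset):
    """Return (missing, extra) subdirectory names for one dataset folder.

    `data_root` is the data folder (e.g. ../legal-networks-data) and
    `dataset` is a folder inside it (e.g. de or us). Hypothetical helper,
    not part of the repository.
    """
    dataset_dir = Path(data_root) / dataset
    present = set()
    if dataset_dir.is_dir():
        present = {p.name for p in dataset_dir.iterdir() if p.is_dir()}
    missing = sorted(set(REQUIRED_SUBDIRS) - present)
    extra = sorted(present - set(REQUIRED_SUBDIRS))
    return missing, extra


if __name__ == "__main__":
    for dataset in ("de", "us"):
        missing, extra = check_dataset_dir("../legal-networks-data", dataset)
        print(f"{dataset}: missing={missing} extra={extra}")
```

Any directory reported as extra is a candidate for removal before clustering; the script itself only reports and never deletes anything.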
Run ./run_example_configs.sh to preprocess the graphs in multiple configurations, cluster them, and map the clusterings over all available years.
The following steps will be executed:
- Preprocessing: Simplify the graphs so that they can serve as input for clustering algorithms.
- Cluster: Perform the clustering with Infomap or Louvain.
- Cluster Texts: Collect the text for each cluster. (This step can only be performed if the text data is available in ../legal-networks-data/{us,de,us_reg,de_reg}/2_xml.)
- Cluster Evolution Mappings: Map the clusters over time.
- Cluster Evolution Graph: Create a graph with clusters as nodes and edges indicating the dynamics of nodes between snapshots.
- Cluster Inspection: Inspect the content of individual clusters.
- Cluster Evolution Inspection: Inspect the content of cluster families.
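To illustrate the idea behind the cluster evolution steps, the sketch below counts how many nodes move from each cluster in one snapshot to each cluster in the next. It is a simplified illustration and not the code used in this repository: the function name cluster_flow_edges, the dictionary-based inputs, and the toy node ids are all assumptions made for the example, and the repository's actual file formats may differ.

```python
from collections import Counter


def cluster_flow_edges(clustering_a, clustering_b, node_mapping):
    """Count nodes moving from each cluster in A to each cluster in B.

    clustering_a, clustering_b: dict of node id -> cluster id for two
        consecutive snapshots (illustrative structure).
    node_mapping: dict of node id in snapshot A -> node id in snapshot B;
        nodes without a counterpart are absent from the dict.
    Returns a dict of (cluster in A, cluster in B) -> number of nodes.
    """
    flows = Counter()
    for node_a, cluster_a in clustering_a.items():
        node_b = node_mapping.get(node_a)
        if node_b is None or node_b not in clustering_b:
            continue  # node has no counterpart or is unclustered in B
        flows[(cluster_a, clustering_b[node_b])] += 1
    return dict(flows)


# Toy data: three nodes in the first snapshot, four in the second.
clusters_2019 = {"s1": 0, "s2": 0, "s3": 1}
clusters_2020 = {"s1": 0, "s2x": 1, "s3": 1, "s4": 1}
mapping = {"s1": "s1", "s2": "s2x", "s3": "s3"}

edges = cluster_flow_edges(clusters_2019, clusters_2020, mapping)
for (source, target), weight in sorted(edges.items()):
    print(f"2019 cluster {source} -> 2020 cluster {target}: {weight} node(s)")
```

Each returned key can be read as a weighted edge between two cluster nodes in consecutive snapshots. Newly added nodes (such as s4 above) have no predecessor and therefore contribute to no edge.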