Predicting synaptic links with the synful package

Convolutional neural networks were trained to predict the locations of postsynaptic sites and their corresponding presynaptic sites in the EM dataset using the method described in Buhmann et al. (2021, Nature Methods). The implementation of this pipeline is available on GitHub at funkelab/synful.

This neuroglancer link shows the two layers output by the two CNNs: one predicts whether a given pixel is a postsynaptic site, and the other predicts vectors pointing from postsynaptic sites to presynaptic sites.

From the postsynaptic site probabilities, 83,917,332 discrete postsynaptic sites were extracted. The corresponding presynaptic sites were identified via the vector predictions at each postsynaptic site. The full set of ~84 million synaptic links can be found in the Google Cloud Storage bucket gs://zetta_lee_fly_vnc_001_alignment_temp/v4/fill_nearest_mip1/img/img_seethrough/synful_extraction/229bd2f77b2adf7c0e2c5b90ed605098/8.6_8.6_45/
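The vector step can be pictured as simple coordinate arithmetic. The following is only a minimal sketch, not synful's actual extraction code: it assumes the postsynaptic location and the predicted vector are expressed in the same units, and the function name `presynaptic_site` and the example coordinates are made up for illustration.

```python
from typing import Tuple

Point = Tuple[float, float, float]


def presynaptic_site(post_site: Point, vector: Point) -> Point:
    """Follow the predicted vector from a postsynaptic site to its partner.

    Assumption: both arguments use the same coordinate units.
    """
    return tuple(p + v for p, v in zip(post_site, vector))


# Made-up coordinates, purely illustrative.
post = (1000.0, 2000.0, 450.0)
vec = (-40.0, 25.0, 45.0)
pre = presynaptic_site(post, vec)
```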

Filtering the predictions

Several filters were applied to prune the ~84 million synaptic links down to the ~45 million that constitute the final synapse table.

Exclude synapses outside the segmentation

Any synaptic link where either the presynaptic or the postsynaptic site isn't associated with any reconstructed object (specifically, has no supervoxel at its position) was excluded. This filtered out 836,640 synapses (~1%), bringing the number of remaining synapses from 83,917,332 to 83,080,692.
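As a minimal sketch of this filter, assuming each link carries the supervoxel IDs looked up at its two sites and that an ID of 0 means "no supervoxel here" (both the field names `pre_sv`/`post_sv` and the 0 convention are assumptions, not taken from the pipeline):

```python
NO_SUPERVOXEL = 0  # assumption: ID 0 marks a position with no supervoxel


def has_supervoxels(link: dict) -> bool:
    """True only if both the pre- and postsynaptic site hit a supervoxel."""
    return link["pre_sv"] != NO_SUPERVOXEL and link["post_sv"] != NO_SUPERVOXEL


# Toy links with hypothetical IDs.
links = [
    {"pre_sv": 11, "post_sv": 22},
    {"pre_sv": 0, "post_sv": 22},   # presynaptic site outside the segmentation
    {"pre_sv": 11, "post_sv": 0},   # postsynaptic site outside the segmentation
]
kept = [link for link in links if has_supervoxels(link)]
```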

Exclude synapses with a score of 12 or less

The synful package provides a score for each predicted synaptic link. We found that thresholding the predictions by keeping only the ones with sum_score > 12 (roughly meaning that the postsynaptic site had more than 12 voxels predicted to be a postsynaptic location) produced the maximal f-score when evaluating performance on ground truth synapse annotations. We applied this threshold, filtering out 26,566,091 synapses (~32%), bringing the number of remaining synapses from 83,080,692 to 56,514,601.
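The threshold selection can be sketched as a sweep over candidate thresholds, scoring each by its F-score against ground truth. This is an illustrative sketch only: it assumes predictions have already been matched to ground-truth annotations (so each one is labeled as a hit or a miss), and the toy data and candidate thresholds below are invented, not the evaluation data used here.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(predictions, n_ground_truth, thresholds):
    """Return (threshold, f_score) maximizing F1 when keeping sum_score > threshold.

    predictions: list of (sum_score, matches_ground_truth) pairs.
    """
    best = None
    for t in thresholds:
        kept = [hit for score, hit in predictions if score > t]  # strict '>'
        tp = sum(kept)
        fp = len(kept) - tp
        fn = n_ground_truth - tp
        f = f_score(tp, fp, fn)
        if best is None or f > best[1]:
            best = (t, f)
    return best


# Invented toy data: (sum_score, whether the link matches a ground-truth synapse).
toy = [(3, False), (7, False), (11, False), (9, True),
       (13, True), (20, True), (35, True)]
best = best_threshold(toy, n_ground_truth=4, thresholds=[0, 6, 12, 20])
```

Note the strict inequality: a link with sum_score exactly equal to the threshold is discarded.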

Exclude autapses

Any synaptic link that connects a given supervoxel to itself was excluded. This removed 798,050 synapses (~1%), bringing the number of remaining synapses from 56,514,601 to 55,716,551.
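In code, this amounts to comparing the two supervoxel IDs of each link. A minimal sketch with hypothetical field names (`pre_sv`, `post_sv`) and toy IDs:

```python
def is_autapse(link: dict) -> bool:
    """A link whose pre- and postsynaptic sites fall in the same supervoxel."""
    return link["pre_sv"] == link["post_sv"]


# Toy links with hypothetical IDs.
links = [
    {"pre_sv": 101, "post_sv": 202},
    {"pre_sv": 303, "post_sv": 303},  # connects a supervoxel to itself
]
non_autapses = [link for link in links if not is_autapse(link)]
```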

Exclude clear duplicates

Sometimes a single pair of supervoxels is connected by two or more different synaptic links. This occurs due to limitations in the synful approach – in essentially all of these cases, the pair of supervoxels should be connected only once. We therefore kept only the single link with the largest score for any given pair of supervoxels and removed the rest. This removed 5,724,380 synapses (~7% of the initial 83,917,332), bringing the number of remaining synapses from 55,716,551 to 49,992,171.
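One way to express "keep the highest-scoring link per supervoxel pair" is a single pass with a dictionary keyed on the pair. This is a sketch under assumed field names (`pre_sv`, `post_sv`, `score`), not the code that was actually run:

```python
def drop_clear_duplicates(links):
    """Keep only the highest-scoring link for each (pre_sv, post_sv) pair."""
    best = {}
    for link in links:
        pair = (link["pre_sv"], link["post_sv"])
        if pair not in best or link["score"] > best[pair]["score"]:
            best[pair] = link
    return list(best.values())


# Toy links with hypothetical IDs and scores.
links = [
    {"pre_sv": 1, "post_sv": 2, "score": 20},
    {"pre_sv": 1, "post_sv": 2, "score": 45},  # same pair, higher score: kept
    {"pre_sv": 2, "post_sv": 1, "score": 30},  # opposite direction: a different pair
]
deduplicated = drop_clear_duplicates(links)
```

The pair is ordered (presynaptic, postsynaptic), so links running in opposite directions between the same two supervoxels are not treated as duplicates of each other.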

Exclude likely duplicates

As in the section above, sometimes a single dendritic twig is connected to the same presynaptic neuron by multiple links, but without the exact same pair of supervoxels being connected. We removed duplicate links that connect the same two segIDs if the links' presynaptic locations are within 150 nm of one another. We examined some of the cases identified by this approach, and all of the instances we examined were indeed duplicates that deserved to be removed. Here's a neuroglancer link to 5 examples. Thanks to Sven Dorkenwald for providing code for this step, which was also applied to the FAFB/FlyWire synapse table. This step removed 4,936,246 synapses (5.9% of the initial 83,917,332), bringing the number of remaining synapses from 49,992,171 to 45,055,925.
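The distance-based rule can be sketched as a greedy pass over the links. This is not the code provided by Sven Dorkenwald; it is an illustrative sketch that assumes presynaptic locations are stored in nanometers, treats "within 150 nm" as inclusive, and assumes the highest-scoring link of each cluster is the one retained (the text above does not say which link is kept). Field names are hypothetical.

```python
import math
from collections import defaultdict

MAX_DIST_NM = 150.0


def drop_likely_duplicates(links, max_dist_nm=MAX_DIST_NM):
    """Greedily keep links, best score first, dropping any link whose
    presynaptic location lies within max_dist_nm of an already-kept link
    between the same (pre_seg, post_seg) pair."""
    kept = []
    kept_by_pair = defaultdict(list)
    for link in sorted(links, key=lambda l: l["score"], reverse=True):
        pair = (link["pre_seg"], link["post_seg"])
        is_duplicate = any(
            math.dist(link["pre_pt_nm"], other["pre_pt_nm"]) <= max_dist_nm
            for other in kept_by_pair[pair]
        )
        if not is_duplicate:
            kept.append(link)
            kept_by_pair[pair].append(link)
    return kept


# Toy links with hypothetical segIDs, coordinates (nm), and scores.
links = [
    {"pre_seg": 1, "post_seg": 2, "pre_pt_nm": (0, 0, 0), "score": 50},
    {"pre_seg": 1, "post_seg": 2, "pre_pt_nm": (100, 0, 0), "score": 20},   # 100 nm away: dropped
    {"pre_seg": 1, "post_seg": 2, "pre_pt_nm": (1000, 0, 0), "score": 30},  # far away: kept
    {"pre_seg": 1, "post_seg": 3, "pre_pt_nm": (10, 0, 0), "score": 15},    # other partner: kept
]
remaining = drop_likely_duplicates(links)
```

The pairwise comparison here is quadratic within each segID pair; for tens of millions of links a spatial index would likely be needed.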

Final Nov2022 synapse dataset

The final set of 45,055,925 synapses is available:

The score column is the sum_score provided by synful, converted from float to int.

Identify which region each (filtered) synapse is in

We next check which region(s)/neuropil(s) each synapse is in.

Algorithm: For each neuropil-synapse pair, we first check whether the synapse is at least in the bounding box of the neuropil mesh by simple arithmetic comparisons of x, y, z coordinates. If the synapse is in the bounding box, we further check whether it's actually contained in the mesh using a ray-casting-based algorithm from the Trimesh package.
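The two-stage test can be sketched as follows. To keep the example dependency-free, the exact containment test is passed in as a callable and a unit sphere stands in for a neuropil mesh; with Trimesh, the bounding box would presumably come from `mesh.bounds` and the exact test from `mesh.contains(points)`. The function names below are illustrative and are not those used in locate_neuropil.py.

```python
def in_bounding_box(point, lower, upper):
    """Cheap pre-check: plain coordinate comparisons on x, y, z."""
    return all(lo <= c <= hi for c, lo, hi in zip(point, lower, upper))


def points_in_region(points, lower, upper, contains_exact):
    """Run the expensive exact test only for points inside the bounding box."""
    return [
        in_bounding_box(p, lower, upper) and contains_exact(p)
        for p in points
    ]


# Stand-in region: a unit sphere at the origin (not a real neuropil mesh).
exact_calls = []


def in_unit_sphere(p):
    exact_calls.append(p)  # record how often the expensive test is run
    return sum(c * c for c in p) <= 1.0


points = [
    (0.0, 0.0, 0.0),  # inside the box and the sphere
    (0.9, 0.9, 0.9),  # inside the box, outside the sphere
    (5.0, 0.0, 0.0),  # outside the box: exact test is skipped
]
flags = points_in_region(points, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0), in_unit_sphere)
```

Because most synapses lie outside the bounding box of any given neuropil, the pre-check avoids the bulk of the ray-casting work.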

Implementation: Practically, we load the CSV synapse table dump and add boolean columns "is_in_<region_name>" to indicate whether each synapse is in the corresponding region. This allows for overlapping/hierarchical organization of regions. We also split the synapse table into chunks and distribute the workload among many worker processes using a payload pool. Each payload checks whether a set of synapses is in a single given neuropil. The results are merged and written into a Parquet file.
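The chunking and merging logic can be sketched without any parallelism. This is only an illustration of the payload layout described above, not the code in locate_neuropil.py: the helper names, the region names, and the fake worker are all hypothetical, and the built-in `map` stands in for a process pool (for example `multiprocessing.Pool.imap_unordered`, which would work here because the merge step does not depend on result order).

```python
def make_payloads(n_synapses, chunk_size, regions):
    """One payload per (region, chunk): (region, start, stop), stop exclusive."""
    return [
        (region, start, min(start + chunk_size, n_synapses))
        for region in regions
        for start in range(0, n_synapses, chunk_size)
    ]


def merge_results(n_synapses, regions, results):
    """Assemble per-chunk flags into one boolean column per region."""
    columns = {f"is_in_{region}": [False] * n_synapses for region in regions}
    for region, start, flags in results:
        columns[f"is_in_{region}"][start:start + len(flags)] = flags
    return columns


def fake_worker(payload):
    """Stand-in for the real containment check (hypothetical behavior)."""
    region, start, stop = payload
    flags = [region == "region_a" and i % 2 == 0 for i in range(start, stop)]
    return region, start, flags


regions = ["region_a", "region_b"]  # hypothetical region names
payloads = make_payloads(n_synapses=5, chunk_size=2, regions=regions)
columns = merge_results(5, regions, map(fake_worker, payloads))
```

Keeping one boolean column per region, rather than a single "region" column, is what lets a synapse belong to several overlapping or nested regions at once.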

Usage: Use neuropil_identification/locate_neuropil.py:

usage: locate_neuropil [-h] [-c CHUNK_SIZE] [-p PROCS]
                       input_file output_file mesh_dir

Identify which neuropil/tract each synapse is in

positional arguments:
  input_file            Input CSV file listing all synapses
  output_file           Output Parquet file identifying the regions
  mesh_dir              Path to mesh files

options:
  -h, --help            show this help message and exit
  -c CHUNK_SIZE, --chunk_size CHUNK_SIZE
                        Synapses are localized in small chunks. Set the chunk
                        size here.
  -p PROCS, --procs PROCS
                        Number of worker processes

For example:

# Positional arguments, in order: input CSV, output Parquet file, mesh directory.
# -c 10000 -p 12: run in chunks of 10,000 synapses with 12 worker processes.
python locate_neuropil.py \
    ~/Data/fanc/synapse_table_20221120/raw_dump/20221117_fanc_syn.csv \
    ~/Data/fanc/synapse_table_20221120/localization_res/synapse_location.parquet \
    FANC_auto_recon/data/volume_meshes/JRC2018_VNC_UNISEX_to_FANC/meshes_by_side \
    -c 10000 -p 12

Alternatively, one can also import locate_neuropil from neuropil_identification/locate_neuropil.py in Python. See docstring for details.