You are welcome to reuse the data in any way as long as you provide proper reference to our work.
A coronal atlas of the molecular clusters in a vectorized format. It can be easily modified with a vector graphics editor.
This .zip folder contains 41 .pdf files: one per coronal plate and a merged file combining all plates. The Allen Brain Atlas is displayed on the left side and the molecular clusters are shown as spots on the right side.
This .zip folder contains 41 .pdf files: one per coronal plate and a merged file combining all plates. The Allen Brain Atlas is displayed on the left side and the molecular clusters are shown as continuous regions on the right side.
This .zip folder contains 41 .pdf files: one per coronal plate and a merged file combining all plates. The molecular clusters are shown as spots on the left side and as continuous regions on the right side.
All files are in vector format, so they can easily be edited. Neuroanatomical definitions and borders can be selected and manually corrected. To do so, we recommend first deleting the clipping masks that cover the plots. In Adobe Illustrator, proceed as follows: first, Select > All, then Object > Clipping Mask > Release.
Spots and regions are color-coded by cluster assignment using the categorical color scheme. To facilitate selection, each cluster color was made unique by a small adjustment of its RGB value. You can select all shapes belonging to a cluster (including those in the legend) with the selection tool: first select one spot, then go to Select > Same > Fill Color or use the “Select similar object” shortcut.
The data in an easy-to-use format (matrix of raw counts, meta table, etc.) after minimal processing.
A meta table in .tsv format with spots as rows. Rows are indexed by a unique spot identifier. Columns represent respectively:
The matrix of UMI raw counts in .tsv format. Each row corresponds to one spot and is indexed by a unique spot identifier. Each column corresponds to the gene indicated in the header.
A matrix of normalized expression in .tsv format. Data is log-normalized using the computeSumFactors function from the scran package and is batch-corrected across animals. Each row corresponds to one spot and is indexed by a unique spot identifier. Each column corresponds to the gene indicated in the header.
A .zip file containing microscopy images of coronal sections with Hematoxylin-and-Eosin (HE) staining. Sections are cropped around the spatial transcriptomics array.
A .tsv matrix with t-SNE and UMAP coordinates based on molecular expression, both in 2D and 3D space. The columns respectively correspond to:
A matrix containing the scores of every spot on each Independent Component (IC) in .tsv format. Each row corresponds to one spot and is indexed by a unique spot identifier. The 80 columns correspond to the first 80 components obtained with Independent Component Analysis (ICA).
A matrix containing the loadings of every gene on each Independent Component (IC) in .tsv format. Each row corresponds to one gene, and the gene name is used as the row name. The 80 columns correspond to the first 80 components obtained with Independent Component Analysis (ICA).
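For readers working outside R, the following is a minimal Python sketch (standard library only) of one way to use the gene-by-component matrix: it lists the genes with the largest absolute values on a chosen component. The function name `top_genes` and the file name in the usage note are hypothetical. The sketch assumes the first line is a header of component names, the first field of every other line is the gene name, and all remaining fields are numeric.

```python
import csv


def top_genes(path, component, n=10):
    """Return the n (gene, value) pairs with the largest absolute value
    on one component. `component` is the 0-based position of the
    component among the numeric columns (0 is the first component)."""
    values = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header line (assumed to hold component names)
        for fields in reader:
            if fields:  # ignore blank lines
                # fields[0] is the gene name, numeric columns start at 1
                values.append((fields[0], float(fields[1 + component])))
    # Rank by absolute value so strong negative contributions are kept
    return sorted(values, key=lambda pair: abs(pair[1]), reverse=True)[:n]
```

Calling `top_genes("ic_gene_matrix.tsv", 0, n=20)` on a file with that hypothetical name would return the 20 genes contributing most strongly, positively or negatively, to the first component.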
If you are using R, the .tsv files can be parsed using the following command:
read.table(path, sep = '\t', stringsAsFactors = FALSE, header = TRUE, row.names = 1)
Parsing the expression matrix may take a long time due to its size. A faster reader reduces the initial loading time to a few seconds: use the fread function from the data.table package, or the vroom package, which loads the data lazily.
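Outside R, a large matrix can also be processed one row at a time so that it never has to fit in memory. The following Python sketch (standard library only) streams a spot-by-gene .tsv file and sums the counts of each spot. The function name `total_counts_per_spot` is hypothetical. The sketch assumes one header line of gene names, the spot identifier in the first field of every data row, and purely numeric values elsewhere.

```python
import csv


def total_counts_per_spot(path):
    """Stream a spot-by-gene .tsv matrix and return {spot_id: total count}.

    Only one row is held in memory at a time, so the size of the matrix
    does not matter."""
    totals = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header line of gene names
        for fields in reader:
            if fields:  # ignore blank lines
                # fields[0] is the spot identifier, the rest are counts
                totals[fields[0]] = sum(float(value) for value in fields[1:])
    return totals
```

Because the header line is skipped as a whole, the sketch works whether or not the header contains a label above the spot identifier column.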
The raw data sequences in FASTQ format.
The raw data was deposited in the Gene Expression Omnibus (GEO) repository under accession number GSE147747. You will find the following files for each section:
Intermediate data objects. They can be used to reproduce specific steps or figures using the code available on GitHub.
A directory containing all the data required to run the registration script.
A directory containing all the data required to reproduce the figures.
As most of the analysis was conducted in R, many of these objects are in .RData format.
The data required to load the molecular atlas in the ST Viewer software.
This file should be input in the ST Viewer as "data file". It contains the expression matrix with spots as rows and genes as columns. This is exactly the same file as the expression matrix of raw counts from the “Download processed data” section.
This file should be input in the ST Viewer as "spot coordinates file". It contains a table in .tsv format with spots as rows. The three columns correspond to mediolateral (ML), dorsoventral (DV) and anteroposterior (AP) stereotaxic coordinates in mm from the bregma reference point.
This file should be input in the ST Viewer as "mesh". It contains a mesh representing the brain outline provided by the Allen Institute in .obj format.
This file should be input to the ST Viewer when loading spots colors from a file. It contains a table in .tsv format with spots as rows. The two columns correspond to the cluster ID and the cluster name.
This .zip folder contains individual meta files for each section. They should be loaded into the ST Viewer as "spot coordinates file". Expression matrices and HE images can be downloaded from the GEO repository.
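The spot coordinates file and the spot colors file share the same spot identifiers, so they can be combined outside the ST Viewer. The Python sketch below (standard library only) joins the two tables and computes the mean ML, DV and AP coordinates of each cluster. The function names and the `has_header` parameter are hypothetical. Whether the files start with a header line is an assumption to check against the actual files. The column order (ML, DV, AP, then cluster ID, cluster name) follows the descriptions above.

```python
import csv
from collections import defaultdict


def read_rows(path, has_header=True):
    """Read a .tsv table into {row_id: [remaining fields]}.

    has_header is an assumption about the file layout; set it to False
    if the first line already contains data."""
    rows = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        if has_header:
            next(reader)
        for fields in reader:
            if fields:  # ignore blank lines
                rows[fields[0]] = fields[1:]
    return rows


def cluster_centroids(coordinates, clusters):
    """Return {cluster name: (mean ML, mean DV, mean AP)}.

    coordinates maps spot ID to [ML, DV, AP]; clusters maps spot ID to
    [cluster ID, cluster name]. Spots without a cluster are skipped."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    counts = defaultdict(int)
    for spot, fields in coordinates.items():
        if spot not in clusters:
            continue
        name = clusters[spot][1]
        for axis in range(3):
            sums[name][axis] += float(fields[axis])
        counts[name] += 1
    return {name: tuple(total / counts[name] for total in sums[name])
            for name in sums}
```

The two dictionaries returned by `read_rows` can be passed directly to `cluster_centroids`. The result is in the same stereotaxic units (mm from bregma) as the coordinates file.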
The coordinates of the molecular regions for visualization purposes.
A table in .tsv format with clusters as rows. Columns represent respectively:
There is one .obj file per molecular cluster in this .zip folder. Each file contains the triangular mesh corresponding to the cluster with coordinates being expressed in the Common Coordinate Framework 2017 from the Allen Institute. Files are indexed with an ID which can be mapped to a cluster name using the cluster name table (see top of this section). The meshes for the Allen Brain Atlas can be downloaded here.
The .obj files can be easily visualized using the open-source software MeshLab. Please note that some shading might be inverted.
Not all clusters have an associated mesh: applying the Support Vector Machine (SVM) classifier eliminated 4 small clusters.
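The .obj format is plain text, so the meshes can also be read without dedicated software. The Python sketch below (standard library only) extracts the vertices and faces of a mesh, and includes a helper that reverses the vertex order of every face. If the inverted shading comes from face orientation, which is an assumption and not something stated above, reversing that order is a common remedy. The function names `parse_obj` and `flip_faces` are hypothetical, and only `v` and `f` records are handled.

```python
def parse_obj(lines):
    """Parse the 'v' (vertex) and 'f' (face) records of a Wavefront .obj file.

    Returns (vertices, faces): vertices are (x, y, z) tuples and faces are
    tuples of 0-based vertex indices. All other record types are ignored."""
    vertices, faces = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(float(value) for value in parts[1:4]))
        elif parts[0] == "f":
            face = []
            for token in parts[1:]:
                # A face token may look like "3", "3/1" or "3/1/2";
                # the vertex index always comes first
                index = int(token.split("/")[0])
                # .obj indices are 1-based; negative ones count from the end
                face.append(index - 1 if index > 0 else len(vertices) + index)
            faces.append(tuple(face))
    return vertices, faces


def flip_faces(faces):
    """Reverse the vertex order of every face, which flips its orientation."""
    return [tuple(reversed(face)) for face in faces]
```

An open file handle can be passed directly as `lines`, for example `with open(path) as handle: vertices, faces = parse_obj(handle)`, where `path` points to one of the cluster meshes.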