Sample Collection:
The mice used for this dataset were CTRL (Prmt5fl/fl;Crect-) embryos from a recent study (Recka et al. 2025). Three embryos were collected at E15.5 from two different litters. Yolk-sac tissue was lysed using the Extract-N-Amp Tissue PCR Kit (Sigma-Aldrich) and rapidly genotyped using Phusion High-Fidelity DNA Polymerase (New England Biolabs) with shortened PCR cycling parameters. During genotyping, back skin from the E15.5 embryos was dissected and transferred to ice-cold DEPC-PBS (Millipore Sigma). Samples were incubated in 0.25% Trypsin (ThermoFisher) for 30-45 min at 37°C with frequent pipette mixing. Once a single-cell suspension was obtained, the reaction was quenched with 10% FBS (ThermoFisher), the suspension was passed through a 40 µm cell strainer (Flowmi Cell Strainers, SP Bel-Art), and the triplicates were pooled. 500,000 cells were incubated with lysis buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40) for 5 min on ice, and nuclei were then extracted and quantified using trypan blue.
Sequencing:
scRNA-seq and scATAC-seq libraries were prepared by the University of Iowa’s Iowa Institute of Human Genetics Genomics Division Laboratory. Briefly, 10,000 nuclei were targeted for bulk transposition before individual nuclei were encapsulated in oil droplets along with the 10X GEM code beads in the Chip J cartridge (P/N: 1000234) using the 10X Genomics Single Cell iX Chromium Controller. Generation of gel beads in emulsion (GEMs), barcoding, pre-amplification PCR, and ATAC library construction were all performed as recommended by the manufacturer (10X Genomics, Chromium Next GEM Single Cell Multiome + ATAC Reagent Kit, Rev F, P/N: 1000283). The gene expression libraries were pooled and sequenced on a NovaSeq 6000 to give at least 20,000 reads per nucleus. The ATAC-seq libraries were pooled and sequenced on a NovaSeq 6000 to give at least 25,000 reads per nucleus.
Data Pre-processing:
Sequencing results were demultiplexed and converted to FASTQ format using the Illumina bcl2fastq software. scRNA-seq and scATAC-seq reads were processed and aligned to the mm10 reference genome using the Cell Ranger “Count” function. Further analysis and visualization were performed using Seurat (v4.3) (Hao et al., 2021) and Signac (v1.9) (Stuart et al., 2021). Briefly, peaks were unified between conditions and filtered based on length (20-10,000 bp). EnsDb.Mmusculus.v79 was used for gene annotations. Cells were manually filtered to retain those with nCount_ATAC of 100-40,000, nCount_RNA of 100-30,000, nucleosome_signal <4, TSS_enrichment of 1-30, and pct_reads_in_peaks >15. A blacklist ratio was calculated as the fraction of atac_peak_region_fragments over atac_fragments, and cells with a ratio under 0.1 were removed. The scRNA-seq data separately underwent an additional “SCTransform” analysis. The scATAC-seq data underwent “FindTopFeatures”, “RunTFIDF”, and “RunSVD”. Datasets were then integrated using “FindIntegrationAnchors” (1:30, k.anchors=5) and “IntegrateData” (1:30). “SCTransform”, “RunTFIDF”, “FindTopFeatures”, and “RunSVD” were rerun after integration. “FindMultiModalNeighbors” using PCA (1:50) and LSI (2:50) was performed prior to “FindClusters” (res = 0.9) to generate the final UMAP.
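The per-cell filtering thresholds above can be illustrated with a minimal, language-neutral sketch. It is written in Python for illustration only and is not the R/Seurat code used for the analysis. The `CellQC` record and `passes_qc` function are hypothetical names, and treating the range bounds as exclusive is an assumption, since the text does not state whether the limits are inclusive.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics, named after the metadata columns cited in the text."""
    n_count_atac: float
    n_count_rna: float
    nucleosome_signal: float
    tss_enrichment: float
    pct_reads_in_peaks: float
    atac_peak_region_fragments: float
    atac_fragments: float


def peak_fragment_ratio(cell: CellQC) -> float:
    """Fraction of atac_peak_region_fragments over atac_fragments."""
    if cell.atac_fragments <= 0:
        return 0.0
    return cell.atac_peak_region_fragments / cell.atac_fragments


def passes_qc(cell: CellQC) -> bool:
    """Apply the stated thresholds (exclusive range bounds are an assumption)."""
    return (
        100 < cell.n_count_atac < 40_000
        and 100 < cell.n_count_rna < 30_000
        and cell.nucleosome_signal < 4
        and 1 < cell.tss_enrichment < 30
        and cell.pct_reads_in_peaks > 15
        and peak_fragment_ratio(cell) >= 0.1  # cells under 0.1 are removed
    )


# Made-up example values, not data from the study.
kept = CellQC(5000, 3000, 1.2, 8.0, 40.0, 2500, 5000)
dropped = CellQC(5000, 3000, 4.5, 8.0, 40.0, 2500, 5000)  # nucleosome_signal too high
print(passes_qc(kept), passes_qc(dropped))
```

Expressing all criteria as a single predicate makes it easy to see that a cell must satisfy every threshold at once to be retained.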
Website Data Generation:
For visualization of gene expression patterns on the companion website, violin plots were created using Seurat’s "VlnPlot" function, displaying expression across major cell types (represented by groups of related clusters) for genes annotated in the Mmusculus.UCSC.mm10 genome. For genomic track visualizations via the UCSC Genome Browser, we generated pseudobulk profiles representing average RNA expression and ATAC-seq chromatin accessibility within distinct clusters. First, the Seurat clusters were grouped into broader biological categories (e.g., merging clusters 2 and 5 into "basal", clusters 3 and 10 into "spinous", defining "epidermis", "dermis", etc.). Pseudobulk tracks for both RNA-seq and ATAC-seq assays were then created using a combination of publicly available tools and custom R functions. This process involved aggregating reads/fragments for cells within each broader biological category and subsequently calculating signal coverage across 25 bp genomic tiles. The coverage within each tile was then normalized by scaling the raw tile counts relative to the total signal counts within that group. Finally, the normalized coverage data for each cell group and assay was exported into BigWig format. These BigWig files are deposited in the Gene Expression Omnibus (GEO) under accession number GSE290228.
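The pseudobulk procedure described above (group clusters, aggregate fragments per group, tile at 25 bp, normalize by the group total) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the custom R functions used for the website. The function names are hypothetical, a fragment is assumed to contribute one count to every tile it overlaps, the group total is assumed to be the sum of tile counts, and the BigWig export step is omitted. Only the cluster-to-group examples given in the text (2 and 5 as "basal", 3 and 10 as "spinous") are included in the mapping.

```python
from collections import defaultdict

TILE_WIDTH = 25  # bp, as stated in the text

# Cluster-to-category mapping; only the examples named in the text are listed.
CLUSTER_TO_GROUP = {2: "basal", 5: "basal", 3: "spinous", 10: "spinous"}


def group_fragments(cell_fragments, cell_cluster, cluster_to_group=CLUSTER_TO_GROUP):
    """Pool fragments from all cells whose cluster maps to the same group.

    cell_fragments: {cell_id: [(chrom, start, end), ...]} with half-open intervals.
    cell_cluster:   {cell_id: cluster_id}.
    """
    pooled = defaultdict(list)
    for cell_id, fragments in cell_fragments.items():
        group = cluster_to_group.get(cell_cluster.get(cell_id))
        if group is not None:  # cells in unmapped clusters are skipped
            pooled[group].extend(fragments)
    return dict(pooled)


def tile_coverage(fragments, tile_width=TILE_WIDTH):
    """Count, for each tile, how many fragments overlap it."""
    counts = defaultdict(int)
    for chrom, start, end in fragments:
        if end <= start:
            continue  # ignore empty or inverted intervals
        first_tile = (start // tile_width) * tile_width
        for tile_start in range(first_tile, end, tile_width):
            counts[(chrom, tile_start)] += 1
    return dict(counts)


def normalize_tiles(counts, scale=1.0):
    """Scale raw tile counts relative to the total signal in the group."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tile: value * scale / total for tile, value in counts.items()}


# Made-up toy input: two cells from clusters 2 and 5, both mapped to "basal".
cell_fragments = {"cellA": [("chr1", 10, 60)], "cellB": [("chr1", 30, 45)]}
cell_cluster = {"cellA": 2, "cellB": 5}
pooled = group_fragments(cell_fragments, cell_cluster)
basal_track = normalize_tiles(tile_coverage(pooled["basal"]))
print(basal_track)
```

Normalizing by the group total makes tracks comparable between categories that contain different numbers of cells or fragments; the `scale` argument is a hypothetical convenience for multiplying the normalized values by a constant factor.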
Code Availability:
All code used for this website and associated manuscript is available on GitHub (https://github.com/nmrecka/Recka_G3_2025).