分析测试百科网

搜索

喜欢作者

微信支付微信支付
×

Heterogeneity of Single-Cell Gene Expression Across Phenotypically(二)

2020.6.22
头像

王辉

致力于为分析测试行业奉献终身


 

Figure 2: Verification of C1 Single-Cell mRNA-Seq data quality. a)ERCC  RNA Spike-In Control Mix 1 was applied to a C1 IFC at a total  transcript input of 1.4*106 copies/reaction and was then subjected to  RT-PCR by both STA and mRNASeq chemistries. There was a high correlation  (R2 = 0.92) between the STA data and the mRNASeq data as assessed by  qPCR on the BioMark HD System. b) The mRNASeq cDNA was converted  to sequencing libraries using Nextera® XT tagmentation and these  libraries were then sequenced on MiSeq. The resulting data was mapped  using tophat/cufflinks. The log2(FPKM) values showed a high correlation  to qPCR data derived from the same cDNA (R2 = 0.92) c) Correlation  of RPKM values with transcript input concentration. The RPKM value for  each transcript is the average of 96 harvest samples. d) Variation  and positive rate of ERCC transcripts. The coefficient of variation and  detection rate for each ERCC transcript across is plotted vs input  amount (n=96). The positive rate of each ERCC transcript represents  number of samples with >10 RPKM. At the sequencing depth used here,  we observed dropouts at spike loads below ~50 copies per C1 reaction  line. e) The number of transcripts detected (cutoff of RPKM  >1) were plotted as a function of read depth for single-cell  libraries generated both on-chip and in a 96-well plate, with single  K562 cells delivered to the plate by FACS (n = 3 for both chip and  plate). A slightly greater transcript diversity was observed in the  chip-derived libraries. f) Pairwise correlation plots for the  same single-cell libraries sequenced at different depths (3 single  cells, sequenced at both 3M and 20M reads). While there are more  dropouts of low expression genes at the lower sequencing depth, the  correlation is excellent between the same single-cell libraries read at  different depths (R2 = 0.99) g) Comparison of aggregated data  collected from single cells on the C1 System to matching population tube  control data, showing high correlation between the two data sets (R2 =  0.94).


Figure 3: Basic evaluation of single-cell mRNA Seq data quality from the C1 System. a)The  table describes the various cell types evaluated on the C1 System and  demonstrates that high mapping rates were obtained from cell lines and  primary cells, derived from both human and mouse. Note that while the  average and maximum mapping rates are good for all cell types, there are  outliers with very low mapping rates that are present in many of the  groups. b) A plot from the SINGuLAR Analysis Toolset v2.0 demonstrating  outlier selection based on GEx distributions of that sample (in  comparison to the rest of the sample group). Here, the default settings  were used to remove samples with average GEx values (across all genes)  that were less than 15% of the average GEx values for the entire sample  group. The outliers selected by this automated method agree closely with  those selected manually based on mapping rates and total read numbers  (20 of 20 samples that were manually identified were also identified by  the SINGuLAR toolset). c) The table shows the total number of samples  that were retained for further analysis after outlier selection.

Figure 4: PCA of selected human samples and genes a). Principal  components analysis was applied to data derived from the 440  single-cell libraries selected after outlier removal. Cells within a  given cell type are largely grouped together, with blood cells (K562,  HL60, CRL-2339) clustering closely together with positive PC2 scores,  and keratinocyte, BJ fibroblast, CRL-2338 (breast cancer), and HeLa  cells grouped together with negative PC2 scores. iPS cells, neural  progenitor cells, and other neural cell types are spread widely along  the PC1 axis with PC2 scores near zero, suggesting that PC1 describes  much of the variation associated with neural development and  differentiation. b). The top 20 genes contributing to PC1, PC2,  and PC3 are listed, with genes contributing to both PC2 and PC3  highlighted in red text. c). Violin plots (combined box plot and  probability density plot) are shown for each of the cell groups for the  50 genes that contributed most to the PCA.

Figure 5: Hierarchical clustering of selected human samples and genes a) Hierarchical clustering was performed using the top 50 genes contributing to the PCA. b) The  plot shows the Euclidian distance between various cell groups in the  hierarchical clustering, highlighting which cell types are most closely  related with respect to these top 50 genes.

Figure 6: a). PCA  was again conducted on the single-cell data from iPS and neural  progenitor cells; however, in this analysis only 400 specified genes of  interest were included in the analysis. b). The top 50 genes  contributing to the PCA were then used to conduct hierarchical  clustering, revealing gene clusters that contribute to the differences  between these two cell types, and also revealing a unique gene signature  in the single neural progenitor cell that was separated from all other  cells by the PCA.

 

Conclusion


 

•The  C1 mRNA-Seq protocol generates full-length cDNA of high quality and  enables simple, reproducible, cost-effective sequencing library  preparation from individual cells derived from a wide variety of tissues  and cultures.

•The  data produced by this system is quantitative and reproducible, as  demonstrated using ERCC spike-in controls and by reassembling population  data using aggregated single-cell data.

•Principal  component analysis and hierarchical clustering of the resulting mRNA  Seq data performed Fluidigm’s SINGuLAR Analysis Toolset v2.0 accurately  classifies >400 single cells into multiple distinct clusters. The  analyzed data reveals a broad spectrum of transcriptional heterogeneity  within nominally homogeneous cell populations.


互联网
文章推荐