Dimension reductions can be plotted with the function plot_scdata():
plot_scdata(scRNA_int, pal_setup = pal)
There are three optional parameters for plot_scdata(): color_by, split_by, and pal_setup. By default, color_by colors the cells by "seurat_clusters"; it can be changed to any factor in the metadata, such as "sample" or "group":
plot_scdata(scRNA_int, color_by = "group", pal_setup = pal)
If the split_by parameter is set to a factor in the metadata, the plot is split into panels by that factor:
plot_scdata(scRNA_int, split_by = "sample", pal_setup = pal)
As with the plot_qc() function, the pal_setup parameter accepts an RColorBrewer palette name, a palette setup data frame, or a manually specified color vector:
plot_scdata(scRNA_int, pal_setup = "Dark2")
plot_scdata(scRNA_int, color_by = "sample", pal_setup = c("red","orange","yellow","green","blue","purple"))
The count and proportion statistics of the clustering can be plotted with the function plot_stat(). The plot_type parameter is required and must be one of three values: "group_count", "prop_fill", or "prop_multi". Their plots are shown below:
plot_stat(scRNA_int, plot_type = "group_count")
plot_stat(scRNA_int, "group_count", group_by = "seurat_clusters", pal_setup = pal)
plot_stat(scRNA_int, plot_type = "prop_fill",
          pal_setup = c("grey90","grey80","grey70","grey60","grey50","grey40","grey30","grey20"))
plot_stat(scRNA_int, plot_type = "prop_multi", pal_setup = "Set3")
The group_by parameter uses "sample" as the default grouping variable; it can be set to any other factor in the metadata (e.g. "group").
plot_stat(scRNA_int, plot_type = "prop_fill", group_by = "group")
plot_stat(scRNA_int, plot_type = "prop_multi", group_by = "group", pal_setup = c("sienna","bisque3"))
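The quantities behind these three plot types can be sketched in base R. This is only an illustration on a made-up metadata table, not the internals of plot_stat(): the data frame meta and its values are hypothetical, and only the column names "sample" and "seurat_clusters" come from the text above.

```r
# Hypothetical metadata: one row per cell (values are made up for illustration).
meta <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2", "s2", "s2"),
  seurat_clusters = factor(c("0", "0", "1", "0", "1", "1", "1", "2")),
  stringsAsFactors = FALSE
)

# Number of cells in each group (the kind of count a "group_count" plot shows).
group_count <- table(meta$sample)

# Number of cells in each group/cluster combination.
counts <- table(meta$sample, meta$seurat_clusters)

# Cluster proportions within each group: every row sums to 1
# (the kind of proportion a "prop_fill" or "prop_multi" plot shows).
props <- prop.table(counts, margin = 1)

print(group_count)
print(round(props, 2))
```

Swapping meta$sample for another metadata factor corresponds to changing group_by.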
Plotting a heatmap requires the cluster markers to be found by Seurat first:
markers <- FindAllMarkers(scRNA_int, logfc.threshold = 0.1, min.pct = 0, only.pos = TRUE)
Then, the top genes in each cluster are plotted by plot_heatmap(). The number of genes plotted per cluster, n, defaults to 8. In the heatmap, each row represents a gene and each column a cell. The cells are sorted by sort_var, which is set to c("seurat_clusters") by default, meaning the cells are sorted by cluster identity. Multiple variables can be specified in sort_var, and the cells will be sorted by those variables in the order given. The bars above the heatmap are annotation bars; they can show categorical or continuous variables from the metadata, which are named in the anno_var parameter as a character vector. The anno_colors parameter is a list that specifies the colors for the corresponding annotation variables, so it should be the same length as anno_var. Choose a palette that suits each variable type (categorical or continuous). As before, RColorBrewer palettes and manually specified palettes are supported, and a three-color vector can be used as a gradient for a continuous variable.
plot_heatmap(dataset = scRNA_int,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt","S.Score","G2M.Score"),
             anno_colors = list("Set2",                                             # RColorBrewer palette
                                c("red","orange","yellow","purple","blue","green"), # color vector
                                "Reds",
                                c("blue","white","red"),                            # three-color gradient
                                "Greens"))
Furthermore, hm_limit and hm_colors specify the limits and the color gradient, respectively, of the main heatmap tiles.
plot_heatmap(dataset = scRNA_int,
             n = 6,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt"),
             anno_colors = list("Set2",
                                c("red","orange","yellow","purple","blue","green"),
                                "Reds"),
             hm_limit = c(-1,0,1),
             hm_colors = c("purple","black","yellow"))
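The three ideas behind the heatmap (top genes per cluster, multi-variable cell ordering, and a value-to-color mapping) can be sketched in base R on toy data. This is an illustration under stated assumptions, not the implementation of plot_heatmap(): the toy tables and the helpers top_n_per_cluster() and value_to_color() are hypothetical, ranking by avg_log2FC is assumed, and clipping values outside the limits to the end colors is also assumed.

```r
# Toy inputs (all values are made up for illustration).
markers_toy <- data.frame(
  cluster = factor(c("0", "0", "0", "1", "1", "1")),
  gene = c("A", "B", "C", "D", "E", "F"),
  avg_log2FC = c(1.2, 2.5, 0.7, 0.9, 3.1, 1.8),
  stringsAsFactors = FALSE
)
meta_toy <- data.frame(
  seurat_clusters = factor(c("1", "0", "1", "0")),
  sample = c("s2", "s2", "s1", "s1"),
  row.names = paste0("cell", 1:4),
  stringsAsFactors = FALSE
)

# 1. Keep the top n genes of each cluster.
#    Assumption: genes are ranked by avg_log2FC within a cluster.
top_n_per_cluster <- function(markers, n = 8) {
  ord <- markers[order(markers$cluster, -markers$avg_log2FC), ]
  rank_in_cluster <- ave(seq_len(nrow(ord)), ord$cluster, FUN = seq_along)
  ord[rank_in_cluster <= n, ]
}
top2 <- top_n_per_cluster(markers_toy, n = 2)

# 2. Order cells by several variables: the first variable takes priority,
#    later variables break ties (as with sort_var = c("seurat_clusters","sample")).
cell_order <- rownames(meta_toy)[order(meta_toy$seurat_clusters, meta_toy$sample)]

# 3. Map values onto a three-color gradient anchored at three limits.
#    Assumption: values outside the limits take the nearest end color.
value_to_color <- function(x, limits = c(-1, 0, 1),
                           colors = c("purple", "black", "yellow")) {
  # Rescale piecewise-linearly to [0, 1]; rule = 2 clamps out-of-range values.
  pos <- approx(x = limits, y = seq(0, 1, length.out = length(limits)),
                xout = x, rule = 2)$y
  rgb(colorRamp(colors)(pos), maxColorValue = 255)
}
tile_colors <- value_to_color(c(-3, 0, 5))

print(top2$gene)
print(cell_order)
print(tile_colors)
```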
The GO analysis results can be plotted by plot_cluster_go() and plot_all_cluster_go(). The former plots one specific cluster, while the latter iterates over all clusters. The topn parameter in plot_cluster_go() sets the number of top genes used for GO analysis; its default value is 100. The org parameter specifies the organism; "human" and "mouse" are the accepted values. plot_all_cluster_go() is a wrapper for plot_cluster_go(), which in turn is a wrapper for clusterProfiler::enrichGO(). Hence, arguments given through ... are passed on to the inner functions.
plot_cluster_go(markers, cluster_name = "1", org = "human", ont = "CC")
plot_all_cluster_go(markers, org = "human", ont = "CC")
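How arguments travel through ... in such a chain of wrappers can be sketched with three stand-in functions. All three functions below are hypothetical stubs written for this illustration; they do not call clusterProfiler or reproduce the package's code, and only the argument names topn and ont are taken from the text above.

```r
# Innermost stub standing in for an enrichment function (hypothetical).
enrich_stub <- function(genes, ont = "BP", pvalueCutoff = 0.05) {
  list(n_genes = length(genes), ont = ont, pvalueCutoff = pvalueCutoff)
}

# Middle wrapper: handles one cluster, forwards unknown arguments via `...`.
cluster_go_stub <- function(markers, cluster_name, topn = 100, ...) {
  genes <- head(markers$gene[markers$cluster == cluster_name], topn)
  enrich_stub(genes, ...)
}

# Outer wrapper: loops over clusters, forwards everything else via `...`.
all_cluster_go_stub <- function(markers, ...) {
  clusters <- unique(as.character(markers$cluster))
  res <- lapply(clusters, function(cl) {
    cluster_go_stub(markers, cluster_name = cl, ...)
  })
  names(res) <- clusters
  res
}

# Toy marker table (made-up values).
markers_toy <- data.frame(
  cluster = c("0", "0", "0", "1", "1"),
  gene = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# `topn` is consumed by the middle wrapper; `ont` reaches the innermost stub.
res <- all_cluster_go_stub(markers_toy, topn = 2, ont = "CC")
str(res[["0"]])
```

In this sketch, an argument that no function in the chain accepts raises an "unused argument" error at the innermost call.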
Measures are continuous variables in the metadata, as well as gene expression values. The plot_measure() and plot_measure_dim() functions summarize these variables as box/violin plots and dimension reduction plots, respectively. Parameters such as group_by, split_by, and pal_setup work as described above.
plot_measure(dataset = scRNA_int,
             measures = c("KRT14","percent.mt"),
             group_by = "seurat_clusters",
             pal_setup = pal)
plot_measure_dim(dataset = scRNA_int,
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"))
plot_measure_dim(dataset = scRNA_int,
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"),
                 split_by = "sample")
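As a rough guide to what a box plot of a measure conveys, the per-group quartiles can be computed directly in base R. The table below is made up for illustration, and the code is independent of plot_measure(); only the column names "seurat_clusters" and "percent.mt" come from the text above.

```r
# Hypothetical per-cell metadata (values are made up for illustration).
meta_toy <- data.frame(
  seurat_clusters = factor(c("0", "0", "0", "1", "1", "1")),
  percent.mt = c(2.0, 4.0, 6.0, 1.0, 3.0, 11.0)
)

# Median of the measure within each group (the center line of a box plot).
medians <- tapply(meta_toy$percent.mt, meta_toy$seurat_clusters, median)

# Quartiles within each group (the box edges and the center line).
quartiles <- tapply(meta_toy$percent.mt, meta_toy$seurat_clusters,
                    quantile, probs = c(0.25, 0.5, 0.75))

print(medians)
print(quartiles)
```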
To perform GSEA, first find the differentially expressed genes (DEGs) and related measures with find_diff_genes(). The ranked list is then passed to test_GSEA(). (Note: Seurat may take a long time to find DEGs; parallel processing with the future package is recommended.) Finally, the output can be plotted with plot_GSEA(), which provides additional parameters for the adjusted p-value cutoff and the color gradient.
de <- find_diff_genes(dataset = scRNA_int,
                      clusters = as.character(0:7),
                      comparison = c("group", "CTCL", "Normal"),
                      logfc.threshold = 0,   # threshold of 0 is used for GSEA
                      min.cells.group = 1)   # to include clusters with only 1 cell
gsea_res <- test_GSEA(de,
                      pathway = pathways.hallmark)
plot_GSEA(gsea_res, p_cutoff = 0.1, colors = c("#0570b0", "grey", "#d7301f"))
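The "ranked list" that GSEA works on is, in general, a named numeric vector with one score per gene, sorted from most up-regulated to most down-regulated. A minimal base R sketch follows. The toy table, its column names, and the choice of log fold change as the ranking score are assumptions made for illustration; the text above does not state which statistic test_GSEA() ranks by.

```r
# Hypothetical DEG table (made-up values; real output may use other column names).
de_toy <- data.frame(
  feature = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  avg_log2FC = c(0.4, -1.3, 2.1, -0.2),
  stringsAsFactors = FALSE
)

# Build a named score vector and sort it in decreasing order.
# Assumption: genes are ranked by log fold change.
ranks <- setNames(de_toy$avg_log2FC, de_toy$feature)
ranks <- sort(ranks, decreasing = TRUE)

print(ranks)
```

Because the ranking should cover genes with small effects too, a fold-change threshold of 0 keeps them in the table instead of filtering them out.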