Export gated cell populations from FlowJo workspace to a dataframe #381

Open
rohitfarmer opened this issue Oct 31, 2022 · 5 comments

Comments

@rohitfarmer

Hi there, I am working to export gated cell populations from a FlowJo workspace to an R data frame. My code fetches cells for the first few gates, and then I get a blank matrix; there are no errors. Any suggestions would be helpful. Thanks!

Below is the code for which I can fetch cells.

library(CytoML)
library(flowWorkspace)

# Load FlowJo workspace (xml) file
wsFile <- file.path("covidflu", "WSP-without-id", "20210727_COVID_FLU(act T cells).wsp")
ws <- CytoML::open_flowjo_xml(wsFile)

# Parse 
gs <- flowjo_to_gatingset(ws, name = 2, path = file.path("covidflu", "all-fcs-files"), execute = TRUE)

getdat <- gs_pop_get_data(gs, y = "/PBMC/Single Cells/Live/CD45+")
ff <- flowWorkspace::cytoframe_to_flowFrame(getdat[[1,]])
nrow(flowCore::exprs(ff))

[1] 883444

And below is the same code with the next gate beyond which I am getting nothing.

getdat <- gs_pop_get_data(gs, y = "/PBMC/Single Cells/Live/CD45+/Lymphocytes")
ff <- flowWorkspace::cytoframe_to_flowFrame(getdat[[1,]])
nrow(flowCore::exprs(ff))

[1] 0

Here is my sessionInfo()

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] flowWorkspace_4.6.0 CytoML_2.6.0       

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.0     lattice_0.20-45      colorspace_2.0-3     vctrs_0.4.2          generics_0.1.3       stats4_4.1.3        
 [7] yaml_2.3.5           ncdfFlow_2.40.0      base64enc_0.1-3      utf8_1.2.2           flowCore_2.6.0       RBGL_1.70.0         
[13] XML_3.99-0.11        rlang_1.0.6          hexbin_1.28.2        pillar_1.8.1         glue_1.6.2           DBI_1.1.3           
[19] aws.s3_0.3.21        Rgraphviz_2.38.0     BiocGenerics_0.40.0  RColorBrewer_1.1-3   readxl_1.4.1         plyr_1.8.7          
[25] matrixStats_0.62.0   jpeg_0.1-9           lifecycle_1.0.3      MatrixGenerics_1.6.0 zlibbioc_1.40.0      RProtoBufLib_2.6.0  
[31] cellranger_1.1.0     munsell_0.5.0        gtable_0.3.1         cytolib_2.6.2        latticeExtra_0.6-30  Biobase_2.54.0      
[37] IRanges_2.28.0       curl_4.3.3           fansi_1.0.3          Rcpp_1.0.9           scales_1.2.1         DelayedArray_0.20.0 
[43] S4Vectors_0.32.4     jsonlite_1.8.2       RcppParallel_5.1.5   graph_1.72.0         deldir_1.0-6         interp_1.1-3        
[49] gridExtra_2.3        ggplot2_3.3.6        png_0.1-7            digest_0.6.29        dplyr_1.0.10         grid_4.1.3          
[55] cli_3.4.1            tools_4.1.3          magrittr_2.0.3       tibble_3.1.8         aws.signature_0.6.0  pkgconfig_2.0.3     
[61] Matrix_1.5-1         data.table_1.14.2    xml2_1.3.3           assertthat_0.2.1     httr_1.4.4           rstudioapi_0.14     
[67] R6_2.5.1             ggcyto_1.22.0        compiler_4.1.3 
@miosisoniii

miosisoniii commented Jan 13, 2023

@rohitfarmer are you looking for population statistics? You can try gs_pop_get_count_fast, which returns a matrix in wide format that can then be piped into a data frame:

# Load FlowJo workspace (xml) file
wsFile <- file.path("covidflu", "WSP-without-id", "20210727_COVID_FLU(act T cells).wsp")
ws <- CytoML::open_flowjo_xml(wsFile)

# Parse
gs <- flowjo_to_gatingset(ws,
                          name = 2,
                          path = file.path("covidflu", "all-fcs-files"),
                          execute = TRUE)

# Using gs_pop_get_count_fast
getdat <- gs_pop_get_count_fast(gs,
                                statistic = "freq",
                                format = "wide",
                                xml = TRUE)

# Coerce to a data frame, since the wide format is a matrix
getdat_df <- getdat |>
  as.data.frame()
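
When a gate unexpectedly returns zero events, it can also help to tabulate the count at every step of the gating path and see where the events disappear. The following is only a sketch in base R: gate_ancestors() and counts_along_path() are hypothetical helpers, not flowWorkspace functions, and the toy counts are made up for illustration. It also assumes gate names do not themselves contain "/".

```r
# List every ancestor of a gating path: "/A/B/C" -> "/A", "/A/B", "/A/B/C"
gate_ancestors <- function(path) {
  parts <- strsplit(sub("^/", "", path), "/", fixed = TRUE)[[1]]
  vapply(seq_along(parts),
         function(i) paste0("/", paste(parts[seq_len(i)], collapse = "/")),
         character(1))
}

# Tabulate the event count at each step of the path.
# `count_fun` is any function that maps one gate path to its event count.
counts_along_path <- function(path, count_fun) {
  nodes <- gate_ancestors(path)
  data.frame(pop = nodes,
             count = vapply(nodes, function(n) as.numeric(count_fun(n)),
                            numeric(1), USE.NAMES = FALSE),
             stringsAsFactors = FALSE)
}

# Toy lookup standing in for a real GatingSet (all numbers are made up)
toy_counts <- c("/PBMC" = 100,
                "/PBMC/Single Cells" = 80,
                "/PBMC/Single Cells/Live" = 60,
                "/PBMC/Single Cells/Live/CD45+" = 50,
                "/PBMC/Single Cells/Live/CD45+/Lymphocytes" = 0)
toy <- counts_along_path("/PBMC/Single Cells/Live/CD45+/Lymphocytes",
                         function(node) toy_counts[[node]])
print(toy)

# With the GatingSet from this thread (not run here; assumes
# flowWorkspace::gh_pop_get_count() accepts a GatingHierarchy and a gate path):
# counts_along_path("/PBMC/Single Cells/Live/CD45+/Lymphocytes",
#                   function(node) flowWorkspace::gh_pop_get_count(gs[[1]], node))
```

Comparing the resulting table against the counts FlowJo shows for the same sample should indicate whether the gate is genuinely empty or whether the parsed gate differs from the one in the workspace.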

@rohitfarmer
Author

@miosisoniii no, I am interested in exporting individual cells with their marker values and time stamps. I can export them now; however, the values are transformed during the export in a way that I cannot reverse. As a result, they do not match the values for the same population exported from FlowJo.

@mikejiang
Member

First of all, to fetch the expression data matrix you do not need to convert to a flowFrame; exprs(getdat[[1]]) should do.

Secondly, a zero count for /PBMC/Single Cells/Live/CD45+/Lymphocytes simply means there are no cells in that gate.

Finally, the expression data matrix is stored on the transformed scale once the FlowJo workspace has been parsed into a GatingSet. To get the raw scale, set the inverse.transform flag, for example:

> dataDir <- system.file("extdata", package = "flowWorkspaceData")
> gs_dir <- list.files(dataDir, pattern = "gs_manual", full.names = TRUE)
> gs <- load_gs(gs_dir)
> head(flowCore::exprs(gs_pop_get_data(gs, "CD4")[[1]])[,5:7])
     <B710-A> <R660-A>  <R780-A>
[1,] 3106.004 3302.719 2073.3540
[2,] 3128.845 1834.073 1607.8027
[3,] 2902.931 2458.440  482.8756
[4,] 2928.725 1382.240 1510.5111
[5,] 2832.599 1277.941  714.8516
[6,] 2727.793 1704.678  678.8153
> head(flowCore::exprs(gs_pop_get_data(gs, "CD4", inverse.transform = TRUE)[[1]])[,5:7])
      <B710-A>   <R660-A>  <R780-A>
[1,] 21521.121 35399.8320 2342.2910
[2,] 22785.982  1105.2427  958.1031
[3,] 12979.613  4470.3438 -667.2806
[4,] 13837.313   431.4819  779.5670
[5,] 10906.154   341.7943 -326.7503
[6,]  8425.668   843.9904 -376.3401
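
To turn those per-sample matrices into the single data frame the original question asks for, something along these lines might work. This is a sketch, not the package's own export routine: events_to_df() is a hypothetical helper written in base R, and the toy matrices below only stand in for what exprs() returns. It is written so that an empty gate (a zero-row matrix) does not raise an error.

```r
# Stack per-sample event matrices into one data frame, tagging each row
# with the sample it came from. `mats` is a named list of numeric matrices
# with identical columns (one matrix per sample).
events_to_df <- function(mats) {
  dfs <- lapply(names(mats), function(sn) {
    df <- as.data.frame(mats[[sn]])
    # rep() keeps this valid for a zero-row matrix (an empty gate)
    df$sample <- rep(sn, nrow(df))
    df
  })
  do.call(rbind, dfs)
}

# Toy input: one sample with two events, one sample with an empty gate
channels <- c("B710-A", "R660-A")
toy <- list(
  s1 = matrix(c(1, 2, 3, 4), ncol = 2, dimnames = list(NULL, channels)),
  s2 = matrix(numeric(0), ncol = 2, dimnames = list(NULL, channels))
)
out <- events_to_df(toy)
print(out)

# With flowWorkspace (not run here; `gs` is the GatingSet loaded above,
# and inverse.transform = TRUE requests the raw scale):
# cs   <- gs_pop_get_data(gs, "CD4", inverse.transform = TRUE)
# sns  <- sampleNames(cs)
# mats <- setNames(lapply(sns, function(sn) flowCore::exprs(cs[[sn]])), sns)
# cd4  <- events_to_df(mats)
```

Keeping the sample name as a column makes it easier to match rows against a per-sample export from FlowJo.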

@rohitfarmer
Author

Which Bioconductor version is the code from? It's not working for me. I had a similar problem before when I pointed out that the cell count from the gate was zero. I had to lower the Bioconductor version to make it work.

@mikejiang
Member

Latest release.
Please provide your sessionInfo() and a reproducible sample so that we can help you.
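
One possible way to gather the version details for such a report is sketched below. The package list is just a guess at what is relevant to this thread, and the BiocManager call only runs if that package is installed.

```r
# Packages most relevant to this thread (an assumption; adjust as needed)
pkgs <- c("flowWorkspace", "CytoML", "flowCore", "cytolib")

# Keep only the packages that are actually installed
installed <- pkgs[vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Named character vector of installed versions
versions <- vapply(installed,
                   function(p) as.character(utils::packageVersion(p)),
                   character(1))
print(versions)

# Bioconductor release, if BiocManager is available
if (requireNamespace("BiocManager", quietly = TRUE)) {
  print(BiocManager::version())
}

# Full session details
print(sessionInfo())
```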
