FcsFiles
FCS files are the standard data representation for flow cytometry. In CellEngine, an FcsFile resource represents the file itself, along with metadata such as the panel and annotations.
Properties
Properties are the snake_case equivalent of those documented on the CellEngine API unless otherwise noted.
Methods
cellengine.resources.fcs_file.FcsFile
A class representing a CellEngine FCS file.
annotations: List[Annotations]
property
writable
Examples:
Getting an annotation value:
next((a["value"] for a in file.annotations if a["name"] == "my_anno"), None)
Appending an annotation:
file.annotations.append({"name": "my_anno", "value": "my_value"})
Modifying an annotation:
for anno in file.annotations:
    if anno["name"] == "my_anno":
        anno["value"] = "new_value"
Removing an annotation:
file.annotations = [a for a in file.annotations if a["name"] != "my_anno"]
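The append and modify cases above can be combined into a single "set" operation. The following is a minimal sketch that works on a plain list of dicts standing in for file.annotations; the helper names set_annotation and get_annotation are hypothetical and not part of the toolkit.

```python
def get_annotation(annotations, name, default=None):
    """Return the value of the first annotation called `name`, or `default`."""
    return next((a["value"] for a in annotations if a["name"] == name), default)


def set_annotation(annotations, name, value):
    """Update the annotation called `name` in place, or append it if missing."""
    for anno in annotations:
        if anno["name"] == name:
            anno["value"] = value
            return
    annotations.append({"name": name, "value": value})


# A plain list used here in place of file.annotations.
annotations = [{"name": "donor", "value": "A"}]
set_annotation(annotations, "donor", "B")      # modifies the existing entry
set_annotation(annotations, "timepoint", "0h")  # appends a new entry
```

With a real file, the same helpers would be called with file.annotations, followed by file.update() to save the change.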
channels: List[str]
property
Return all channels in the file.
compensation: Union[FileCompensations, None]
property
Used with per-file compensation to indicate the compensation to apply to this file.
crc32c: str
property
The CRC32c sum for the file, hex-encoded, per RFC 4960 Appendix B.
data: Dict[str, Any]
property
deleted: Union[datetime, None]
property
writable
If set, the file has been soft-deleted and may be permanently deleted later.
event_count: int
property
The number of events (rows) in the file.
events
property
A DataFrame containing this file's events (typically cells).
This is the last result from FcsFile.get_events(inplace=True). If that
method has not been called, a DataFrame will be fetched with all events
(ungated, no compensation, no subsampling). To fetch events with
subsampling, compensation and/or gating to a specific population, use
FcsFile.get_events().
experiment_id: str
property
filename: str
property
writable
The filename.
gates_locked: bool
property
writable
If true, no gates that apply to this file can be changed.
has_file_internal_comp: bool
property
Whether or not this file has a valid file-internal compensation matrix.
header: Dict[str, str]
property
Note: this property may be fetched lazily.
id: str
property
Alias for _id.
is_control: bool
property
Whether or not this is a control file. Control files are hidden from most of the Web UI, and can be used to exclude compensation or calibration beads from analysis, for example.
md5: str
property
The MD5 sum for the file, hex-encoded.
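One use for this property is checking a local copy of a file against the stored checksum. A minimal sketch using only the standard library follows; the helper names are hypothetical, and the comparison ignores case because the letter case of the stored hex string is an assumption.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex-encoded MD5 of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5(path, expected_hex):
    """Compare a local file's MD5 with an expected hex string (case-insensitive)."""
    return md5_hex(path).lower() == expected_hex.lower()
```

For example, matches_md5("local_copy.fcs", file.md5) would report whether a local file matches the checksum held by CellEngine.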
name: str
property
writable
Alias for filename.
panel: List[Channel]
property
panel_name: str
property
The name of the file's panel. Files with the same panel_name are grouped into a panel.
sample_name: Optional[str]
property
The sample name extracted from the file header.
size: int
property
The number of bytes in the file.
spill_string: str
property
The file-internal compensation, if present (see has_file_internal_comp).
Note: this property may be fetched lazily due to its size.
channel_for_reagent(reagent)
Returns the channel name ($PnN) for the given reagent ($PnS). Returns
None if the channel isn't found.
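The lookup described above can be pictured as a search over the panel. This is a sketch only, not the toolkit's implementation: it assumes each panel entry is a dict with "channel" and "reagent" keys, and the function name find_channel is hypothetical.

```python
def find_channel(panel, reagent):
    """Return the $PnN channel name whose $PnS reagent matches, else None."""
    return next(
        (entry["channel"] for entry in panel if entry.get("reagent") == reagent),
        None,
    )


# Assumed panel shape: one dict per channel.
panel = [
    {"channel": "FSC-A", "reagent": None},
    {"channel": "Ax488-A", "reagent": "CD3"},
    {"channel": "PE-A", "reagent": "CD4"},
]
cd3_channel = find_channel(panel, "CD3")
missing_channel = find_channel(panel, "CD8")
```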
create(experiment_id, fcs_files=None, filename=None, add_file_number=False, add_event_number=False, pre_subsample_n=None, pre_subsample_p=None, seed=None)
classmethod
Creates an FCS file by copying, concatenating and/or subsampling existing file(s) from this or other experiments, or by importing from an S3-compatible service.
When concatenating and subsampling at the same time, subsampling is applied to each file before concatenating.
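The practical consequence of this ordering is that a subsample size applies per input file, not to the combined output. The sketch below illustrates this locally with plain lists standing in for event tables; it is not the server's algorithm, and the function name is hypothetical.

```python
import random


def subsample_then_concat(files, n, seed=None):
    """Subsample each list of events to at most n rows, then concatenate."""
    rng = random.Random(seed)
    combined = []
    for events in files:
        k = min(n, len(events))
        # Sample row positions, then sort them to keep the original event order.
        positions = sorted(rng.sample(range(len(events)), k))
        combined.extend(events[i] for i in positions)
    return combined


file_a = list(range(10))        # 10 events
file_b = list(range(100, 105))  # 5 events
combined = subsample_then_concat([file_a, file_b], n=3, seed=42)
```

Here combined holds six events (three from each input), whereas subsampling after concatenation would have produced three in total.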
If add_file_number is true, a file number column (channel) will be added
to the output file indicating which file each event (cell) came from.
The values in this column have a uniform random spread (±0.25 of the
integer value) to aid visualization. While this column can be useful for
analysis, it will cause the experiment to have FCS files with different
panels unless all FCS files that have not been concatenated are deleted.
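Because the spread stays within ±0.25 of the integer, rounding recovers the original file number. A minimal sketch of that idea follows; the jitter is simulated locally, the helper names are hypothetical, and the file numbers used are illustrative (the numbering origin is not stated here).

```python
import random


def jittered_file_number(file_number, rng):
    """Return file_number plus a uniform offset within +/-0.25."""
    return file_number + rng.uniform(-0.25, 0.25)


def recover_file_number(value):
    """Round a jittered value back to its integer file number."""
    return int(round(value))


rng = random.Random(0)
true_numbers = [1, 1, 2, 3, 3, 3]
column = [jittered_file_number(n, rng) for n in true_numbers]
recovered = [recover_file_number(v) for v in column]
```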
During concatenation, any FCS header parameters that do not match between files will be removed, with some exceptions:
- $BTIM (clock time at beginning of acquisition) and $DATE will be set to the earliest value among the input files.
- $ETIM (clock time at end of acquisition) will be set to the latest value among the input files.
- $PnR (range for parameter n) will be set to the highest value among the input files.
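The header rules above can be sketched as a merge over plain dicts. This is an illustration of the stated rules, not CellEngine's implementation. It assumes zero-padded "HH:MM:SS" times, "dd-mmm-yyyy" dates, and integer $PnR values, treats $BTIM and $DATE independently, and drops a key that is missing from any input.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def _date_key(value):
    """Sort key for a date such as "07-JAN-2021" (assumed format)."""
    day, month, year = value.split("-")
    return (int(year), MONTHS[month.upper()], int(day))


def merge_headers(headers):
    """Merge a list of header dicts following the concatenation rules."""
    merged = {}
    keys = set().union(*(h.keys() for h in headers))
    for key in keys:
        values = [h[key] for h in headers if key in h]
        if key == "$BTIM":
            merged[key] = min(values)                 # earliest start time
        elif key == "$DATE":
            merged[key] = min(values, key=_date_key)  # earliest date
        elif key == "$ETIM":
            merged[key] = max(values)                 # latest end time
        elif re.fullmatch(r"\$P\d+R", key):
            merged[key] = str(max(int(v) for v in values))  # highest range
        elif len(values) == len(headers) and len(set(values)) == 1:
            merged[key] = values[0]                   # identical everywhere
    return merged


first = {"$BTIM": "10:00:00", "$ETIM": "10:05:00", "$DATE": "08-JAN-2021",
         "$P1R": "1024", "$CYT": "Cytometer A", "$OP": "alice"}
second = {"$BTIM": "09:30:00", "$ETIM": "09:45:00", "$DATE": "07-JAN-2021",
          "$P1R": "262144", "$CYT": "Cytometer A", "$OP": "bob"}
merged = merge_headers([first, second])
```

In this example $CYT survives because it matches in both inputs, while $OP is removed because it differs.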
All channels present in the first FCS file in the fcs_files parameter must also be present in the other FCS files.
When importing from an S3-compatible service, be aware of the following:
- Only a single file can be imported at a time.
- The host property must include the bucket and region as applicable. For example, for AWS, this would look like "mybucket.s3.us-east-2.amazonaws.com".
- The path property must specify the full path to the object, e.g. "/Study001/Specimen01.fcs".
- Importing private S3 objects requires an accessKey and a secretKey for a user with appropriate permissions. For AWS, GetObject is required.
- Importing objects may incur fees from the S3 service provider.
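The host and path rules above can be sketched as a small helper. It only assembles the two values described in the list: the function name is hypothetical, the AWS host pattern is taken from the example above, and how the result and any credentials are passed to create() should be checked against the CellEngine documentation.

```python
def s3_import_source(bucket, region, object_key):
    """Build the host and path values for an AWS-hosted object."""
    # Host must include the bucket and region, per the rules above.
    host = f"{bucket}.s3.{region}.amazonaws.com"
    # Path must be the full path to the object, with a leading slash.
    path = "/" + object_key.lstrip("/")
    return {"host": host, "path": path}


source = s3_import_source("mybucket", "us-east-2", "Study001/Specimen01.fcs")
```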
Parameters:
Name | Type | Description | Default |
---|---|---|---|
experiment_id | str | ID of the experiment to which the file belongs | required |
fcs_files | Optional[Union[str, List[str], Dict[str, str]]] | ID of file or list of IDs of files or objects to process. If more than one file is provided, they will be concatenated in order. To import files from other experiments, pass a list of dicts with | None |
filename | Optional[str] | Rename the uploaded file. | None |
add_file_number | Optional[bool] | If concatenating files, adds a file number channel to the resulting file. | False |
add_event_number | Optional[bool] | Add an event number column to the exported file. This number corresponds to the index of the event in the original file; when concatenating files, the same event number will appear more than once. | False |
pre_subsample_n | Optional[int] | Randomly subsample the file to contain this many events. | None |
pre_subsample_p | Optional[float] | Randomly subsample the file to contain this percent of events (0 to 1). | None |
seed | Optional[int] | Seed for random number generator used for subsampling. Use for deterministic (reproducible) subsampling. If omitted, a pseudo-random value is used. | None |
Returns:
Type | Description |
---|---|
FcsFile | FcsFile |
create_from_dataframe(experiment_id, filename, df, reagents=None, headers={})
classmethod
Creates an FCS file from a DataFrame and uploads it to CellEngine.
Channel names ($PnN values such as "FITC-A" or "530/30-A") are read
from the DataFrame column names.
Reagent names ($PnS values such as "CD3") are optional and are read
from the second level of the DataFrame's column index if present, or can
be provided in a list in the same order as the channels via the reagents
argument.
Additional header keys can be provided via headers. In particular, it
can be useful to set $PnD values, which CellEngine uses to set the
initial display scaling:
- For linear scales, set "$PnD": "Linear,<min>,<max>" (e.g. "Linear,-200,50000" for a linear scale ranging from -200 to 50,000).
- For logarithmic scales, set "$PnD": "Logarithmic,<decades>,<min>" (e.g. "Logarithmic,4,0.1" for a logarithmic scale ranging from 0.1 to 1000).
- For arcsinh scales, leave "$PnD" unset. Aside from several heuristics, CellEngine will usually default to arcsinh with a max equal to the "$PnR" value.
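The $PnD formats in the list above are plain comma-separated strings, so they can be assembled with small helpers. This is a sketch with hypothetical function names; it only formats the strings described above and does not call CellEngine.

```python
def pnd_key(n):
    """Header key for the display setting of parameter n, e.g. "$P3D"."""
    return f"$P{n}D"


def linear_pnd(minimum, maximum):
    """Value for a linear scale from minimum to maximum."""
    return f"Linear,{minimum},{maximum}"


def log_pnd(decades, minimum):
    """Value for a logarithmic scale spanning `decades` starting at `minimum`."""
    return f"Logarithmic,{decades},{minimum}"


def log_scale_max(decades, minimum):
    """Upper bound implied by a logarithmic setting: minimum * 10**decades."""
    return minimum * 10 ** decades


headers = {
    pnd_key(2): linear_pnd(-200, 50000),
    pnd_key(3): log_pnd(4, 0.1),
}
```

The resulting dict has the shape expected by the headers argument, with one key per parameter whose display scale should be preset.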
FCS files created with this method always use float32 encoding. For efficiency, consider using float32 arrays upstream when generating the FCS file values.
Examples:
With reagents specified in a multi-level index:
import numpy as np
import pandas
from cellengine.resources.fcs_file import FcsFile

df = pandas.DataFrame(
    np.random.randn(6, 3),
    columns=[["Ax488-A", "PE-A", "Cluster ID"],
             ["CD3", "CD4", None]],
    dtype="float32"
)
FcsFile.create_from_dataframe(
    experiment_id,
    "myfile.fcs",
    df,
    headers={
        "$P3D": "Linear,0,20"
    }
)
With reagents specified in the reagents argument:
df = pandas.DataFrame(
    np.random.randn(6, 3),
    columns=["Ax488-A", "PE-A", "Cluster ID"],
    dtype="float32"
)
FcsFile.create_from_dataframe(
    experiment_id,
    "myfile.fcs",
    df,
    reagents=["CD3", "CD4", None],
    headers={
        "$P3D": "Linear,0,20"
    }
)
Factorize categorical data into numbers for encoding as FCS:
df = pandas.DataFrame(
    [[1.0, "T cell", 1],
     [2.0, "T cell", 2],
     [3.0, "B cell", 3],
     [4.0, "T cell", 4]],
    columns=["Ax488-A", "Cell Type", "Cluster ID"]
)
df["Cell Type"], cell_type_index = pandas.factorize(df["Cell Type"])
created = FcsFile.create_from_dataframe(
    blank_experiment._id,
    "myfile.fcs",
    df,
    headers={
        "$P2D": "Linear,0,10",
        "$P3D": "Linear,0,10"
    }
)
Returns: The created FCS file.
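When factorized data is read back later, the numeric codes can be mapped to their labels again using the index returned by pandas.factorize. The following is a minimal sketch; the float32 round trip is simulated locally (no file is created), on the assumption that codes come back as floats because files created this way use float32 encoding.

```python
import pandas

labels = pandas.Series(["T cell", "T cell", "B cell", "T cell"])
codes, cell_type_index = pandas.factorize(labels)

# Simulate the values as they would appear in a float32 FCS column.
codes_from_fcs = codes.astype("float32")

# Convert back to integer positions before looking up the labels.
decoded = cell_type_index.take(codes_from_fcs.astype("int64"))
```

Keeping cell_type_index alongside the file (for example in an annotation) is one way to make the encoding recoverable later.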
delete()
get(experiment_id, _id=None, name=None)
classmethod
get_events(inplace=False, destination=None, **kwargs)
get_events(
    inplace: Optional[bool] = ...,
    destination: None = ...,
    **kwargs: Any
) -> DataFrame
get_events(
    inplace: Optional[bool] = ...,
    destination: str = ...,
    **kwargs: Any
) -> None
Fetch a DataFrame containing this file's data.
The returned DataFrame uses the channel names ($PnN values) for column
names because, unlike reagent names ($PnS), they are required and must
be unique. To find the $PnN value for a given reagent name ($PnS),
use fcs_file.channel_for_reagent(reagent).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
inplace | Optional[bool] | If | False |
destination | Optional[str] | If provided, the file will be saved to the given path. | None |
**kwargs | Any | | {} |
Returns:
Name | Type | Description |
---|---|---|
DataFrame | Union[DataFrame, None] | This file's data, with query parameters applied. |
| Union[DataFrame, None] | If inplace=True, it updates the self.events property. |
| Union[DataFrame, None] | If destination is a string, saves file to the destination and returns None. |
get_file_internal_compensation()
Get the file-internal Compensation.
plot(plot_type, x_channel, y_channel, z_channel=None, population_id=None, compensation=0, **kwargs)
Build a plot for an FcsFile.
See Plot.get() for more information.
update()
Save changes to this FcsFile to CellEngine.
upload(experiment_id, filepath)
classmethod
Uploads a file. The maximum file size is approximately 2.3 GB. Contact us if you need to work with larger files.
Automatically parses panels and annotations and updates ScaleSets to include all channels in the file.
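Since there is a size limit, a local check before uploading can avoid a failed transfer. The sketch below is an assumption-laden illustration: the limit is only described as approximately 2.3 GB, so the threshold used here (2.3 × 10^9 bytes) and the helper name are hypothetical, and the server remains the authority on what is accepted.

```python
import os

# Assumed threshold: "approximately 2.3 GB", interpreted as 2.3e9 bytes.
APPROX_MAX_UPLOAD_BYTES = int(2.3e9)


def check_upload_size(filepath, limit=APPROX_MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds limit."""
    size = os.path.getsize(filepath)
    if size > limit:
        raise ValueError(
            f"{filepath} is {size} bytes, above the approximate limit of {limit}"
        )
    return size
```

A typical use would be calling check_upload_size(path) immediately before FcsFile.upload(experiment_id, path).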
Parameters:
Name | Type | Description | Default |
---|---|---|---|
experiment_id | str | ID of the experiment to which the file belongs | required |
filepath | str | Path to the file to upload. | required |