FcsFiles

CellEngine API: Fcs Files

FCS files are the standard data representation for flow cytometry. In CellEngine, an FcsFile resource represents the file itself, along with metadata such as the panel and annotations.

Properties

Properties are the snake_case equivalent of those documented on the CellEngine API unless otherwise noted.

Methods

cellengine.resources.fcs_file.FcsFile

A class representing a CellEngine FCS file.

annotations: List[Annotations] property writable

Examples:

Getting an annotation value:

next((a["value"] for a in file.annotations if a["name"] == "my_anno"), None)

Appending an annotation:

file.annotations.append({"name": "my_anno", "value": "my_value"})

Modifying an annotation:

for anno in file.annotations:
    if anno["name"] == "my_anno":
        anno["value"] = "new_value"

Removing an annotation:

file.annotations = [a for a in file.annotations if a["name"] != "my_anno"]
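The append and modify patterns above can be folded into one "upsert" step. The following is a minimal sketch, not part of the toolkit: get_annotation and set_annotation are hypothetical helpers, and the plain list of dicts stands in for file.annotations.

```python
from typing import Any, Dict, List, Optional


def get_annotation(annotations: List[Dict[str, Any]], name: str) -> Optional[Any]:
    """Return the value of the first annotation called `name`, or None."""
    return next((a["value"] for a in annotations if a["name"] == name), None)


def set_annotation(annotations: List[Dict[str, Any]], name: str, value: Any) -> None:
    """Update every annotation called `name`, or append one if none exists."""
    found = False
    for anno in annotations:
        if anno["name"] == name:
            anno["value"] = value
            found = True
    if not found:
        annotations.append({"name": name, "value": value})


# Stand-in for file.annotations; no CellEngine connection is needed to try this.
annotations = [{"name": "donor", "value": "D1"}]
set_annotation(annotations, "my_anno", "my_value")   # no match, so it is appended
set_annotation(annotations, "my_anno", "new_value")  # match found, so it is modified
```

With a real FcsFile, the same helpers would presumably be applied to file.annotations, followed by file.update() to save the change.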

channels: List[str] property

Return all channels in the file.

compensation: Union[FileCompensations, None] property

Used with per-file compensation to indicate the compensation to apply to this file.

crc32c: str property

The CRC32c sum for the file, hex-encoded, per RFC 4960 Appendix B.
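To check a downloaded file against this property, a CRC32c (Castagnoli) checksum can be computed locally. The sketch below is a plain-Python, table-driven implementation. It assumes the property holds the 32-bit checksum written as eight lowercase hex digits; confirm that against a known file before relying on it. The names crc32c and crc32c_hex are illustrative.

```python
def _make_table():
    """Build the 256-entry lookup table for the reflected CRC32c polynomial."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes, crc: int = 0) -> int:
    """Return the CRC32c of `data`; pass a previous result as `crc` to continue."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32c_hex(data: bytes) -> str:
    """Format the checksum as eight lowercase hex digits (assumed encoding)."""
    return f"{crc32c(data):08x}"


# Standard check value for CRC32c: the ASCII string "123456789".
checksum = crc32c_hex(b"123456789")
```

For large files, the data can be read in chunks and fed through crc32c(chunk, previous_crc) so the whole file never sits in memory.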

data: Dict[str, Any] property

deleted: Union[datetime, None] property writable

If set, the file has been soft-deleted and may be permanently deleted later.

event_count: int property

The number of events (rows) in the file.

events property

A DataFrame containing this file's events (typically cells).

This is the last result from FcsFile.get_events(inplace=True). If that method has not been called, a DataFrame will be fetched with all events (ungated, no compensation, no subsampling). To fetch events with subsampling, compensation and/or gating to a specific population, use FcsFile.get_events().

experiment_id: str property

filename: str property writable

The filename.

gates_locked: bool property writable

If true, no gates that apply to this file can be changed.

has_file_internal_comp: bool property

Whether or not this file has a valid file-internal compensation matrix.

header: Dict[str, str] property

Note: this property may be fetched lazily.

id: str property

Alias for _id.

is_control: bool property

Whether or not this is a control file. Control files are hidden from most of the Web UI, and can be used to exclude compensation or calibration beads from analysis, for example.

md5: str property

The MD5 sum for the file, hex-encoded.

name: str property writable

Alias for filename.

panel: List[Channel] property

panel_name: str property

The name of the file's panel. Files with the same panel_name are grouped into a panel.

sample_name: Optional[str] property

The sample name extracted from the file header.

size: int property

The number of bytes in the file.

spill_string: str property

The file-internal compensation, if present (see has_file_internal_comp). Note: this property may be fetched lazily due to its size.
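If the spill string needs to be inspected by hand, it can be split into channel names and a matrix. This sketch assumes the value follows the FCS 3.1 $SPILLOVER layout ("n,name1,...,nameN" followed by n×n values in row-major order) and that channel names contain no commas. parse_spill_string is a hypothetical helper, not a toolkit function.

```python
from typing import List, Tuple


def parse_spill_string(spill: str) -> Tuple[List[str], List[List[float]]]:
    """Split an "n,names...,values..." string into channel names and an n x n matrix."""
    fields = spill.split(",")
    n = int(fields[0])
    channels = fields[1 : n + 1]
    values = [float(v) for v in fields[n + 1 :]]
    if len(channels) != n or len(values) != n * n:
        raise ValueError("spill string does not contain n names and n*n values")
    # Rows are taken in order, n values per row (row-major is an assumption).
    matrix = [values[i * n : (i + 1) * n] for i in range(n)]
    return channels, matrix


# Made-up two-channel example, not taken from a real file.
channels, matrix = parse_spill_string("2,FITC-A,PE-A,1,0.1,0.05,1")
```

For most uses, get_file_internal_compensation() is likely the more convenient route, since it returns a Compensation rather than raw text.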

channel_for_reagent(reagent)

Returns the channel name ($PnN) for the given reagent ($PnS). Returns None if the channel isn't found.
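The lookup can be pictured as a scan over the $PnS keywords of an FCS header. The following is a sketch of that idea on a plain dict, not the toolkit's implementation. lookup_channel is a hypothetical name, and the header keys assume standard FCS keyword spelling ($PAR, $PnN, $PnS).

```python
from typing import Dict, Optional


def lookup_channel(header: Dict[str, str], reagent: str) -> Optional[str]:
    """Return the $PnN value whose $PnS equals `reagent`, or None if absent."""
    n_params = int(header["$PAR"])  # $PAR holds the number of parameters
    for n in range(1, n_params + 1):
        if header.get(f"$P{n}S") == reagent:
            return header.get(f"$P{n}N")
    return None


# Made-up header for illustration only.
header = {
    "$PAR": "2",
    "$P1N": "FITC-A", "$P1S": "CD3",
    "$P2N": "PE-A", "$P2S": "CD4",
}
cd4_channel = lookup_channel(header, "CD4")
missing = lookup_channel(header, "CD8")
```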

create(experiment_id, fcs_files=None, filename=None, add_file_number=False, add_event_number=False, pre_subsample_n=None, pre_subsample_p=None, seed=None) classmethod

Creates an FCS file by copying, concatenating and/or subsampling existing file(s) from this or other experiments, or by importing from an S3-compatible service.

When concatenating and subsampling at the same time, subsampling is applied to each file before concatenating.

If add_file_number is true, a file number column (channel) will be added to the output file indicating which file each event (cell) came from. The values in this column have a uniform random spread (±0.25 of the integer value) to aid visualization. While this column can be useful for analysis, it will cause the experiment to have FCS files with different panels unless all FCS files that have not been concatenated are deleted.
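Because the spread is at most ±0.25, rounding recovers the original integer file number. The sketch below simulates the jitter locally. The starting file number of 1 and the use of Python's random module are assumptions for illustration only.

```python
import random

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Simulated file numbers for six events drawn from three concatenated files.
true_file_numbers = [1, 1, 2, 3, 3, 3]

# Apply a uniform spread of at most 0.25 either side of the integer value.
jittered = [n + rng.uniform(-0.25, 0.25) for n in true_file_numbers]

# Rounding to the nearest integer undoes the spread.
recovered = [round(value) for value in jittered]
```

The same rounding can be applied to the file number column of a DataFrame returned by get_events() before grouping events by source file.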

During concatenation, any FCS header parameters that do not match between files will be removed, with some exceptions:

  • $BTIM (clock time at beginning of acquisition) and $DATE will be set to the earliest value among the input files.
  • $ETIM (clock time at end of acquisition) will be set to the latest value among the input files.
  • $PnR (range for parameter n) will be set to the highest value among the input files.
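The merge rules above can be sketched on plain header dicts. This is an illustration of the stated rules, not CellEngine's implementation. merge_headers is a hypothetical helper that assumes $BTIM and $ETIM are zero-padded "hh:mm:ss" strings, $DATE uses the "dd-mmm-yyyy" form with English month abbreviations, and $PnR values are integers.

```python
import re
from datetime import datetime
from typing import Dict, List


def merge_headers(headers: List[Dict[str, str]]) -> Dict[str, str]:
    """Keep keys whose values agree in every file, with the listed exceptions."""
    merged: Dict[str, str] = {}
    for key in sorted(set().union(*headers)):
        values = [h[key] for h in headers if key in h]
        if key == "$BTIM":
            merged[key] = min(values)  # earliest start time
        elif key == "$ETIM":
            merged[key] = max(values)  # latest end time
        elif key == "$DATE":
            merged[key] = min(values, key=lambda d: datetime.strptime(d, "%d-%b-%Y"))
        elif re.fullmatch(r"\$P\d+R", key):
            merged[key] = str(max(int(v) for v in values))  # highest range
        elif len(values) == len(headers) and len(set(values)) == 1:
            merged[key] = values[0]  # identical everywhere, so it is kept
        # Anything else differs between files and is dropped.
    return merged


# Made-up headers for two input files.
first = {"$BTIM": "10:00:00", "$ETIM": "10:05:00", "$DATE": "02-Jan-2024",
         "$P1R": "1024", "$CYT": "Aurora", "$OP": "alice"}
second = {"$BTIM": "09:30:00", "$ETIM": "09:40:00", "$DATE": "01-Jan-2024",
          "$P1R": "262144", "$CYT": "Aurora", "$OP": "bob"}
merged = merge_headers([first, second])
```

Here $OP differs between the two inputs and is removed, while $CYT matches and is kept.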

All channels present in the first FCS file in the fcs_files parameter must also be present in the other FCS files.

When importing from an S3-compatible service, be aware of the following:

  • Only a single file can be imported at a time.
  • The host property must include the bucket and region as applicable. For example, for AWS, this would look like "mybucket.s3.us-east-2.amazonaws.com".
  • The path property must specify the full path to the object, e.g. "/Study001/Specimen01.fcs".
  • Importing private S3 objects requires an accessKey and a secretKey for a user with appropriate permissions. For AWS, GetObject is required.
  • Importing objects may incur fees from the S3 service provider.
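The S3 source can be assembled as a plain dict before it is passed as fcs_files. The sketch below uses the key names given in the parameter description ("host", "path", "access_key", "secret_key"). s3_source is a hypothetical helper, and the leading-slash check is an assumption based on the example path, not a documented rule.

```python
from typing import Dict, Optional


def s3_source(
    host: str,
    path: str,
    access_key: Optional[str] = None,
    secret_key: Optional[str] = None,
) -> Dict[str, str]:
    """Build the dict describing a single object in an S3-compatible service."""
    if not path.startswith("/"):
        # Assumption: the full object path starts with "/", as in the example.
        raise ValueError("path must be the full object path, starting with '/'")
    if (access_key is None) != (secret_key is None):
        raise ValueError("access_key and secret_key must be provided together")
    source = {"host": host, "path": path}
    if access_key is not None and secret_key is not None:
        source["access_key"] = access_key
        source["secret_key"] = secret_key
    return source


# Public object: only host and path are needed.
source = s3_source("mybucket.s3.us-east-2.amazonaws.com", "/Study001/Specimen01.fcs")

# The result would then be passed along the lines of:
#   FcsFile.create(experiment_id, fcs_files=source)
```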

Parameters:

  • experiment_id (str, required): ID of the experiment to which the file belongs.
  • fcs_files (Optional[Union[str, List[str], Dict[str, str]]], default None): ID of file or list of IDs of files or objects to process. If more than one file is provided, they will be concatenated in order. To import files from other experiments, pass a list of dicts with _id and experimentId properties. To import a file from an S3-compatible service, provide a Dict with keys "host" and "path"; if the S3 object is private, additionally provide "access_key" and "secret_key".
  • filename (Optional[str], default None): Rename the uploaded file.
  • add_file_number (Optional[bool], default False): If concatenating files, adds a file number channel to the resulting file.
  • add_event_number (Optional[bool], default False): Add an event number column to the exported file. This number corresponds to the index of the event in the original file; when concatenating files, the same event number will appear more than once.
  • pre_subsample_n (Optional[int], default None): Randomly subsample the file to contain this many events.
  • pre_subsample_p (Optional[float], default None): Randomly subsample the file to contain this fraction of events (0 to 1).
  • seed (Optional[int], default None): Seed for random number generator used for subsampling. Use for deterministic (reproducible) subsampling. If omitted, a pseudo-random value is used.

Returns: The created FcsFile.
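The interaction between subsampling, the seed, and concatenation can be illustrated locally. This is only a stand-in: CellEngine's random number generator and rounding are not described here, so the sketch uses Python's random module and shows just the semantics (same seed gives the same events; each file is subsampled before concatenation). subsample is a hypothetical helper.

```python
import random
from typing import List, Optional


def subsample(
    events: List[int],
    n: Optional[int] = None,
    p: Optional[float] = None,
    seed: Optional[int] = None,
) -> List[int]:
    """Keep `n` events, or a fraction `p` of events, preserving original order."""
    rng = random.Random(seed)
    if n is not None:
        k = min(n, len(events))
    elif p is not None:
        k = round(p * len(events))
    else:
        return list(events)
    keep = sorted(rng.sample(range(len(events)), k))
    return [events[i] for i in keep]


# Two made-up "files" of 100 and 50 events, identified by event index.
files = [list(range(100)), list(range(100, 150))]

# Subsampling is applied to each file first, then the results are concatenated.
combined = [event for f in files for event in subsample(f, p=0.1, seed=42)]
repeat = [event for f in files for event in subsample(f, p=0.1, seed=42)]
```

Because each file is subsampled separately, a fraction of 0.1 keeps 10 events from the first file and 5 from the second, instead of 15 drawn from the pooled events.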

create_from_dataframe(experiment_id, filename, df, reagents=None, headers={}) classmethod

Creates an FCS file from a DataFrame and uploads it to CellEngine.

Channel names ($PnN values such as "FITC-A" or "530/30-A") are read from the DataFrame column names.

Reagent names ($PnS values such as "CD3") are optional and are read from the 2nd-level index of the DataFrame if present, or can be provided in a list in the same order as the channels via the reagents argument.

Additional header keys can be provided via headers. In particular, it can be useful to set $PnD values, which CellEngine uses to set the initial display scaling:

  • For linear scales, set "$PnD": "Linear,<min>,<max>" (e.g. "Linear,-200,50000" for a linear scale ranging from -200 to 50,000).
  • For logarithmic scales, set "$PnD": "Logarithmic,<decades>,<min>" (e.g. "Logarithmic,4,0.1" for a logarithmic scale ranging from 0.1 to 1000).
  • For arcsinh scales, leave "$PnD" unset. Aside from several heuristics, CellEngine will usually default to arcsinh with a max equal to the "$PnR" value.
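The $PnD strings above are easy to build programmatically. The helpers in this sketch are hypothetical and only format the strings as described. log_range shows how the upper bound of a logarithmic scale follows from the number of decades and the minimum.

```python
from typing import Dict, Tuple, Union

Number = Union[int, float]


def linear_pnd(n: int, minimum: Number, maximum: Number) -> Dict[str, str]:
    """Header entry for a linear display scale on parameter n."""
    return {f"$P{n}D": f"Linear,{minimum},{maximum}"}


def log_pnd(n: int, decades: Number, minimum: Number) -> Dict[str, str]:
    """Header entry for a logarithmic display scale on parameter n."""
    return {f"$P{n}D": f"Logarithmic,{decades},{minimum}"}


def log_range(decades: Number, minimum: Number) -> Tuple[float, float]:
    """Bounds implied by a logarithmic scale: minimum to minimum * 10**decades."""
    return float(minimum), float(minimum) * 10 ** decades


# Combine entries for several parameters into one headers dict.
headers = {}
headers.update(linear_pnd(3, 0, 20))
headers.update(log_pnd(2, 4, 0.1))
low, high = log_range(4, 0.1)
```

The resulting dict can be passed as the headers argument.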

FCS files created with this method always use float32 encoding. For efficiency, consider using float32 arrays upstream when generating the FCS file values.
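One consequence of float32 encoding is that not every integer above 2**24 (16,777,216) can be stored exactly. This matters for columns such as large IDs or counters. The sketch below round-trips values through a 4-byte float using only the standard library. to_float32 is an illustrative helper.

```python
import struct


def to_float32(value: float) -> float:
    """Round-trip a Python float through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


exact = to_float32(16777216.0)    # 2**24 is representable exactly
rounded = to_float32(16777217.0)  # 2**24 + 1 is not, so it is rounded
small = to_float32(0.1)           # 0.1 also changes slightly at float32 precision
```

If a column needs exact values beyond this range, consider splitting it or re-indexing it to smaller integers before creating the file.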

Examples:

With reagents specified in a multi-level index:

import numpy as np
import pandas

df = pandas.DataFrame(
    np.random.randn(6,3),
    columns=[["Ax488-A", "PE-A", "Cluster ID"],
             ["CD3", "CD4", None]],
    dtype="float32"
)
FcsFile.create_from_dataframe(
    experiment_id,
    "myfile.fcs",
    df,
    headers={
        "$P3D": "Linear,0,20"
    }
)

With reagents specified in the reagents argument:

df = pandas.DataFrame(
    np.random.randn(6,3),
    columns=["Ax488-A", "PE-A", "Cluster ID"],
    dtype="float32"
)
FcsFile.create_from_dataframe(
    experiment_id,
    "myfile.fcs",
    df,
    reagents=["CD3", "CD4", None],
    headers={
        "$P3D": "Linear,0,20"
    }
)

Factorize categorical data into numbers for encoding as FCS:

df = pandas.DataFrame(
    [[1.0, "T cell", 1],
     [2.0, "T cell", 2],
     [3.0, "B cell", 3],
     [4.0, "T cell", 4]],
    columns=["Ax488-A", "Cell Type", "Cluster ID"]
)
df["Cell Type"], cell_type_index = pandas.factorize(df["Cell Type"])
created = FcsFile.create_from_dataframe(
    blank_experiment._id,
    "myfile.fcs",
    df,
    headers={
        "$P2D": "Linear,0,10",
        "$P3D": "Linear,0,10"
    }
)
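After factorizing, the returned index (cell_type_index above) is what maps the stored numbers back to labels, so it should be kept alongside the file. The sketch below mimics the default first-appearance ordering of pandas.factorize in plain Python. It returns lists where pandas returns an array and an Index, and it does not handle missing values, which pandas codes as -1. factorize here is a stand-in, not the pandas function.

```python
from typing import Hashable, List, Sequence, Tuple


def factorize(labels: Sequence[Hashable]) -> Tuple[List[int], List[Hashable]]:
    """Encode labels as integer codes, numbering them in order of first appearance."""
    uniques: List[Hashable] = []
    positions = {}
    codes: List[int] = []
    for label in labels:
        if label not in positions:
            positions[label] = len(uniques)
            uniques.append(label)
        codes.append(positions[label])
    return codes, uniques


codes, cell_type_index = factorize(["T cell", "T cell", "B cell", "T cell"])

# Values read back from an FCS file are floats, so round before indexing.
stored = [float(c) for c in codes]
decoded = [cell_type_index[int(round(v))] for v in stored]
```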

Returns: The created FCS file.

delete()

get(experiment_id, _id=None, name=None) classmethod

get_events(inplace=False, destination=None, **kwargs)

get_events(
    inplace: Optional[bool] = ...,
    destination: None = ...,
    **kwargs: Any
) -> DataFrame
get_events(
    inplace: Optional[bool] = ...,
    destination: str = ...,
    **kwargs: Any
) -> None

Fetch a DataFrame containing this file's data.

The returned DataFrame uses the channel names ($PnN values) for column names because, unlike reagent names ($PnS), they are required and must be unique. To find the $PnN value for a given reagent name ($PnS), use fcs_file.channel_for_reagent(reagent).

Parameters:

  • inplace (Optional[bool], default False): If True, updates the events property of this FcsFile.
  • destination (Optional[str], default None): If provided, the file will be saved to the given path.
  • **kwargs (Any, default {}):
      • compensatedQ (bool): If True, applies the compensation specified in compensationId to the exported events. The numerical values will be unchanged, but the file header will contain the compensation as the spill string.
      • compensationId ([int, str]): Required if populationId is specified. Compensation to use for gating.
      • headers (bool): For TSV format only. If True, a header row containing the channel names will be included.
      • original (bool): If True, the returned file will be byte-for-byte identical to the originally uploaded file. If false or unspecified (and compensatedQ is false, populationId is unspecified and all subsampling parameters are unspecified), the returned file will contain essentially the same data as the originally uploaded file, but may not be byte-for-byte identical. For example, the byte ordering of the DATA segment will always be little-endian and any extraneous information appended to the end of the original file will be stripped. This parameter takes precedence over compensatedQ, populationId and the subsampling parameters.

        The Python toolkit uses the FlowIO library, which cannot parse as many FCS files as CellEngine can. Setting this parameter to True can cause parsing errors.

      • populationId (str): If provided, only events from this population will be included in the output file.
      • postSubsampleN (int): Randomly subsample the file to contain this many events after gating.
      • postSubsampleP (float): Randomly subsample the file to contain this fraction of events (0 to 1) after gating.
      • preSubsampleN (int): Randomly subsample the file to contain this many events before gating.
      • preSubsampleP (float): Randomly subsample the file to contain this fraction of events (0 to 1) before gating.
      • seed (int): Seed for random number generator used for subsampling. Use for deterministic (reproducible) subsampling. If omitted, a pseudo-random value is used.
      • addEventNumber (bool): Add an event number column to the exported file. When a populationId is specified (when gating), this number corresponds to the index of the event in the original file.

Returns:

  • Union[DataFrame, None]: This file's data, with query parameters applied. If inplace=True, the self.events property is also updated. If destination is a string, the file is saved to the destination and None is returned.
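Since the query options are passed as free-form keyword arguments, a small client-side check can catch mistakes before a request is made. The sketch below encodes two constraints stated above: compensationId is required when populationId is given, and the subsampling fractions lie between 0 and 1. build_event_query is a hypothetical helper, the IDs are made up, and the server presumably performs its own validation as well.

```python
from typing import Any, Dict


def build_event_query(**kwargs: Any) -> Dict[str, Any]:
    """Validate a few documented constraints and return the kwargs unchanged."""
    if "populationId" in kwargs and "compensationId" not in kwargs:
        raise ValueError("compensationId is required when populationId is specified")
    for key in ("preSubsampleP", "postSubsampleP"):
        if key in kwargs and not 0 <= kwargs[key] <= 1:
            raise ValueError(f"{key} must be between 0 and 1")
    return kwargs


# Made-up IDs for illustration only.
query = build_event_query(
    populationId="population-id-placeholder",
    compensationId="compensation-id-placeholder",
    postSubsampleN=5000,
    seed=7,
)

# The validated options would then be passed along the lines of:
#   events = fcs_file.get_events(**query)
```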

get_file_internal_compensation()

Get the file-internal Compensation.

plot(plot_type, x_channel, y_channel, z_channel=None, population_id=None, compensation=0, **kwargs)

Build a plot for an FcsFile.

See Plot.get() for more information.

update()

Save changes to this FcsFile to CellEngine.

upload(experiment_id, filepath) classmethod

Uploads a file. The maximum file size is approximately 2.3 GB. Contact us if you need to work with larger files.

Automatically parses panels and annotations and updates ScaleSets to include all channels in the file.

Parameters:

  • experiment_id (str, required): ID of the experiment to which the file belongs.
  • filepath (str, required): Path to the local file whose contents are uploaded.
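Given the size limit mentioned above, it can be worth checking a file's size before starting a long upload. The sketch below is a local pre-check only. The 2.3e9-byte threshold is an assumption (the documented limit is approximate and its exact unit is not stated), and check_uploadable is a hypothetical helper.

```python
import os
import tempfile

MAX_UPLOAD_BYTES = int(2.3e9)  # assumption: "approximately 2.3 GB" read as decimal GB


def check_uploadable(filepath: str, limit: int = MAX_UPLOAD_BYTES) -> int:
    """Return the file size in bytes, or raise if it exceeds the assumed limit."""
    size = os.path.getsize(filepath)
    if size > limit:
        raise ValueError(f"{filepath} is {size} bytes, above the {limit}-byte limit")
    return size


# Demonstrate on a tiny temporary file instead of a real FCS file.
with tempfile.NamedTemporaryFile(suffix=".fcs", delete=False) as tmp:
    tmp.write(b"FCS3.1")
    path = tmp.name

size = check_uploadable(path)
os.remove(path)

# A real upload would then follow along the lines of:
#   FcsFile.upload(experiment_id, path)
```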