Human MCF7 cells – compound-profiling experiment

Accession number BBBC021 · Version 1

Example Images

Actin disrupter

Aurora kinase inhibitor

Eg5 inhibitor

Tubulin destabilizer

Tubulin stabilizer

Description of the biological application

Phenotypic profiling attempts to summarize multiparametric, feature-based analysis of cellular phenotypes of each sample so that similarities between profiles reflect similarities between samples. Profiling is well established for biological readouts such as transcript expression and proteomics. Image-based profiling, however, is still an emerging technology.

This image set provides a basis for testing image-based profiling methods with respect to their ability to predict the mechanisms of action of a compendium of drugs. The image set was collected using a typical set of morphological labels and uses a physiologically relevant p53-wildtype breast-cancer model system (MCF-7) and a mechanistically distinct set of targeted and cancer-relevant cytotoxic compounds that induce a broad range of gross and subtle phenotypes.

Images

The images are of MCF-7 breast cancer cells treated for 24 h with a collection of 113 small molecules at eight concentrations. The cells were fixed, labeled for DNA, F-actin, and β-tubulin, and imaged by fluorescence microscopy as described [Caie et al. Molecular Cancer Therapeutics, 2010].

There are 39,600 image files (13,200 fields of view imaged in three channels) in TIFF format. We provide the images in 55 ZIP archives, one for each microtiter plate. The archives range from roughly 725 MB to 900 MB each.

 BBBC021_v1_images_Week1_22123.zip (839436312 bytes)

 BBBC021_v1_images_Week1_22141.zip (851400910 bytes)

 BBBC021_v1_images_Week1_22161.zip (841371484 bytes)

 BBBC021_v1_images_Week1_22361.zip (854598915 bytes)

 BBBC021_v1_images_Week1_22381.zip (861576297 bytes)

 BBBC021_v1_images_Week1_22401.zip (874848053 bytes)

 BBBC021_v1_images_Week2_24121.zip (813322359 bytes)

 BBBC021_v1_images_Week2_24141.zip (812952878 bytes)

 BBBC021_v1_images_Week2_24161.zip (817951167 bytes)

 BBBC021_v1_images_Week2_24361.zip (819824327 bytes)

 BBBC021_v1_images_Week2_24381.zip (744172615 bytes)

 BBBC021_v1_images_Week2_24401.zip (800661306 bytes)

 BBBC021_v1_images_Week3_25421.zip (871456323 bytes)

 BBBC021_v1_images_Week3_25441.zip (857910029 bytes)

 BBBC021_v1_images_Week3_25461.zip (866095101 bytes)

 BBBC021_v1_images_Week3_25681.zip (872819691 bytes)

 BBBC021_v1_images_Week3_25701.zip (872579408 bytes)

 BBBC021_v1_images_Week3_25721.zip (879704833 bytes)

 BBBC021_v1_images_Week4_27481.zip (821231042 bytes)

 BBBC021_v1_images_Week4_27521.zip (900816031 bytes)

 BBBC021_v1_images_Week4_27542.zip (842012560 bytes)

 BBBC021_v1_images_Week4_27801.zip (887877004 bytes)

 BBBC021_v1_images_Week4_27821.zip (869265677 bytes)

 BBBC021_v1_images_Week4_27861.zip (859009634 bytes)

 BBBC021_v1_images_Week5_28901.zip (855927945 bytes)

 BBBC021_v1_images_Week5_28921.zip (871718868 bytes)

 BBBC021_v1_images_Week5_28961.zip (842785692 bytes)

 BBBC021_v1_images_Week5_29301.zip (852254591 bytes)

 BBBC021_v1_images_Week5_29321.zip (863875730 bytes)

 BBBC021_v1_images_Week5_29341.zip (873145275 bytes)

 BBBC021_v1_images_Week6_31641.zip (820699818 bytes)

 BBBC021_v1_images_Week6_31661.zip (858473246 bytes)

 BBBC021_v1_images_Week6_31681.zip (800433406 bytes)

 BBBC021_v1_images_Week6_32061.zip (805248261 bytes)

 BBBC021_v1_images_Week6_32121.zip (821454013 bytes)

 BBBC021_v1_images_Week6_32161.zip (736915539 bytes)

 BBBC021_v1_images_Week7_34341.zip (848887897 bytes)

 BBBC021_v1_images_Week7_34381.zip (807292825 bytes)

 BBBC021_v1_images_Week7_34641.zip (878009142 bytes)

 BBBC021_v1_images_Week7_34661.zip (871253843 bytes)

 BBBC021_v1_images_Week7_34681.zip (869464697 bytes)

 BBBC021_v1_images_Week8_38203.zip (784989526 bytes)

 BBBC021_v1_images_Week8_38221.zip (821316018 bytes)

 BBBC021_v1_images_Week8_38241.zip (765063810 bytes)

 BBBC021_v1_images_Week8_38341.zip (781845212 bytes)

 BBBC021_v1_images_Week8_38342.zip (770975988 bytes)

 BBBC021_v1_images_Week9_39206.zip (781358283 bytes)

 BBBC021_v1_images_Week9_39221.zip (790100629 bytes)

 BBBC021_v1_images_Week9_39222.zip (771022096 bytes)

 BBBC021_v1_images_Week9_39282.zip (730865395 bytes)

 BBBC021_v1_images_Week9_39283.zip (779096045 bytes)

 BBBC021_v1_images_Week9_39301.zip (725130381 bytes)

 BBBC021_v1_images_Week10_40111.zip (886666704 bytes)

 BBBC021_v1_images_Week10_40115.zip (855657088 bytes)

 BBBC021_v1_images_Week10_40119.zip (829928012 bytes)

Note about the selection of compounds

A subset of the compound-concentrations has been identified as clearly having one of 12 different primary mechanisms of action. Mechanistic classes were selected so as to represent a wide cross-section of cellular morphological phenotypes. The differences between phenotypes were in some cases very subtle: we identified 6 of the 12 mechanisms visually (Actin disruptors, Aurora kinase inhibitors, Eg5 inhibitors, Microtubule destabilizers, Microtubule stabilizers, and Epithelial); the remainder were defined based on the literature.

All compounds were tested at eight doses. The top concentration was different for many of the compounds and was carefully selected from the literature. Not all concentrations are available in this dataset. Missing concentrations are due to one of three factors:

  • The dose was determined to be inactive. Activity was defined by setting a threshold on the Mahalanobis distance from the set of DMSO profiles: profiles of doses that were outside this threshold were considered active. The feature space corresponded to measurements extracted by a proprietary software tool used at AstraZeneca.
  • The dose was determined to be overly toxic, i.e., the images had no cells or very few cells.
  • The images did not pass QC, that is, the wells were either out of focus or contained image artifacts.
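
The activity criterion in the first bullet can be sketched as follows. This is a minimal illustration, not the procedure used for the dataset: the actual feature space came from a proprietary tool, so the two-dimensional toy profiles, the function names, and the threshold of 3.0 are all assumptions made for the example.

```python
import math

def mean_vector(profiles):
    """Per-feature mean of a list of 2-D profiles."""
    n = len(profiles)
    return (sum(p[0] for p in profiles) / n, sum(p[1] for p in profiles) / n)

def covariance_2d(profiles, mu):
    """Sample covariance (sxx, sxy, syy) of 2-D profiles around the mean mu."""
    n = len(profiles)
    sxx = sum((p[0] - mu[0]) ** 2 for p in profiles) / (n - 1)
    syy = sum((p[1] - mu[1]) ** 2 for p in profiles) / (n - 1)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in profiles) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance of x from mu, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy  # must be non-zero for the inverse to exist
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

def is_active(profile, dmso_profiles, threshold=3.0):
    """A dose counts as active if its profile lies outside the threshold.

    The threshold value is hypothetical; the source does not state it.
    """
    mu = mean_vector(dmso_profiles)
    cov = covariance_2d(dmso_profiles, mu)
    return mahalanobis_2d(profile, mu, cov) > threshold

# Toy DMSO (control) profiles, invented for illustration only.
DMSO = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]

print(is_active((3.0, 3.0), DMSO))  # True: far from the control cloud
print(is_active((0.6, 0.4), DMSO))  # False: inside the control cloud
```

With real profiles the feature space would have many more dimensions, and a general matrix inverse (or pseudo-inverse) would replace the closed-form 2x2 case used here.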

Metadata

The file BBBC021_v1_image.csv contains the metadata, with the following fields:

  • TableNumber
  • ImageNumber
  • Image_FileName_DAPI
  • Image_PathName_DAPI
  • Image_FileName_Tubulin
  • Image_PathName_Tubulin
  • Image_FileName_Actin
  • Image_PathName_Actin
  • Image_Metadata_Plate_DAPI
  • Image_Metadata_Well_DAPI
  • Replicate
  • Image_Metadata_Compound
  • Image_Metadata_Concentration

Example rows from the file:

TableNumber ImageNumber Image_FileName_DAPI Image_PathName_DAPI Image_FileName_Tubulin Image_PathName_Tubulin Image_FileName_Actin Image_PathName_Actin Image_Metadata_Plate_DAPI Image_Metadata_Well_DAPI Replicate Image_Metadata_Compound Image_Metadata_Concentration
4 233 G10_s1_w1BEDC2073-A983-4B98-95E9-84466707A25D.tif Week4/Week4_27481 G10_s1_w2DCEC82F3-05F7-4F2F-B779-C5DF9698141E.tif Week4/Week4_27481 G10_s1_w43CD51CBC-2370-471F-BA01-EE250B14B3C8.tif Week4/Week4_27481 Week4_27481 G10 1 5-fluorouracil 0.003000
4 234 G10_s2_w11C3B9BCC-E48F-4C2F-9D31-8F46D8B5B972.tif Week4/Week4_27481 G10_s2_w2570437EF-C8DC-4074-8D63-7FA3A7271FEE.tif Week4/Week4_27481 G10_s2_w400B21F33-BDAB-4363-92C2-F4FB7545F08C.tif Week4/Week4_27481 Week4_27481 G10 1 5-fluorouracil 0.003000
4 235 G10_s3_w1F4FCE330-C71C-4CA3-9815-EAF9B9876EB5.tif Week4/Week4_27481 G10_s3_w2194A9AC7-369B-4D84-99C0-DA809B0042B8.tif Week4/Week4_27481 G10_s3_w4E0452054-9FC1-41AB-8C5B-D0ACD058991F.tif Week4/Week4_27481 Week4_27481 G10 1 5-fluorouracil 0.003000

BBBC021_v1_image.csv (3.8 MB)
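
As a rough sketch of how the metadata fields fit together, the snippet below groups fields of view by plate and well and assembles one path per channel. It assumes the real file is comma-separated with the header listed above, and that joining Image_PathName_* and Image_FileName_* with "/" gives a usable relative path; neither is confirmed here. The two sample rows are copied from the example rows, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

HEADER = (
    "TableNumber,ImageNumber,Image_FileName_DAPI,Image_PathName_DAPI,"
    "Image_FileName_Tubulin,Image_PathName_Tubulin,"
    "Image_FileName_Actin,Image_PathName_Actin,"
    "Image_Metadata_Plate_DAPI,Image_Metadata_Well_DAPI,Replicate,"
    "Image_Metadata_Compound,Image_Metadata_Concentration"
)
ROWS = [
    "4,233,G10_s1_w1BEDC2073-A983-4B98-95E9-84466707A25D.tif,Week4/Week4_27481,"
    "G10_s1_w2DCEC82F3-05F7-4F2F-B779-C5DF9698141E.tif,Week4/Week4_27481,"
    "G10_s1_w43CD51CBC-2370-471F-BA01-EE250B14B3C8.tif,Week4/Week4_27481,"
    "Week4_27481,G10,1,5-fluorouracil,0.003000",
    "4,234,G10_s2_w11C3B9BCC-E48F-4C2F-9D31-8F46D8B5B972.tif,Week4/Week4_27481,"
    "G10_s2_w2570437EF-C8DC-4074-8D63-7FA3A7271FEE.tif,Week4/Week4_27481,"
    "G10_s2_w400B21F33-BDAB-4363-92C2-F4FB7545F08C.tif,Week4/Week4_27481,"
    "Week4_27481,G10,1,5-fluorouracil,0.003000",
]
SAMPLE = "\n".join([HEADER] + ROWS)

def fields_by_well(csv_text):
    """Group fields of view by (plate, well), keeping the three channel paths."""
    wells = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Image_Metadata_Plate_DAPI"], row["Image_Metadata_Well_DAPI"])
        channels = {
            name: row[f"Image_PathName_{name}"] + "/" + row[f"Image_FileName_{name}"]
            for name in ("DAPI", "Tubulin", "Actin")
        }
        wells[key].append({
            "image_number": int(row["ImageNumber"]),
            "compound": row["Image_Metadata_Compound"],
            "concentration": float(row["Image_Metadata_Concentration"]),
            "channels": channels,
        })
    return wells

wells = fields_by_well(SAMPLE)
for (plate, well), fields in wells.items():
    print(plate, well, len(fields), "fields of view")
```

To read the downloaded file instead of the in-memory sample, the same loop can be run over `csv.DictReader(open(path, newline=""))`.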

The file BBBC021_v1_compound.csv gives the structures (in SMILES format) of most of the compounds. The fields are:

  • compound
  • smiles

Example rows from the file:

compound smiles
DMSO  
leupeptin CC(C)C[C@H](NC(=O)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(=N)N)C=O
taxol CC(=O)O[C@H]1C(=O)[C@]2(C)[C@@H](O)C[C@H]3OC[C@@]3(OC(=O)C)[C@H]2[C@H](OC(=O)c4ccccc4)[C@]5(O)C[C@H](OC(=O)[C@H](O)[C@@H](NC(=O)c6ccccc6)c7ccccc7)C(=C1C5(C)C)C
AZ235 CC(C)n1c(C)ncc1c2ccnc(Nc3ccc(cc3)S(=O)(=O)C)n2
AZ-O  

 BBBC021_v1_compound.csv (8174 bytes)

Ground truth (biological labels)

A subset of the compound-concentrations has been identified as clearly having one of 12 different primary mechanisms of action. Mechanistic classes were selected so as to represent a wide cross-section of cellular morphological phenotypes. The differences between phenotypes were in some cases very subtle: we were only able to identify 6 of the 12 mechanisms visually; the remainder were defined based on the literature.

The file BBBC021_v1_moa.csv contains the mechanisms of action of 103 compound-concentrations (38 compounds at 1–7 concentrations each). The fields are:

  • compound
  • concentration
  • moa

Example rows from the file:

compound concentration moa
PP-2 3.000000 Epithelial
emetine 0.300000 Protein synthesis
AZ258 1.000000 Aurora kinase inhibitors
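
One way to attach these labels to image metadata is a lookup keyed on compound and concentration. The sketch below assumes the file is comma-separated with the three fields listed above, and it parses concentrations as floats so that "0.3" and "0.300000" match. The function names are hypothetical, and the three rows are the example rows shown above.

```python
import csv
import io

MOA_SAMPLE = "\n".join([
    "compound,concentration,moa",
    "PP-2,3.000000,Epithelial",
    "emetine,0.300000,Protein synthesis",
    "AZ258,1.000000,Aurora kinase inhibitors",
])

def load_moa(csv_text):
    """Map (compound, concentration) to the mechanism-of-action label."""
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[(row["compound"], float(row["concentration"]))] = row["moa"]
    return table

def moa_for(table, compound, concentration):
    """Return the MOA label, or None when the treatment is not annotated."""
    return table.get((compound, float(concentration)))

moa = load_moa(MOA_SAMPLE)
print(moa_for(moa, "emetine", "0.3"))             # Protein synthesis
print(moa_for(moa, "5-fluorouracil", "0.003"))    # None (not in this three-row sample)
```

Exact float matching works when both files write the same decimal value; if the formatting differs between files, rounding both sides to a fixed number of decimals would be a safer key.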

NOTE: When evaluating the accuracy of MOA classification, it is critical to ensure that the cross-validation is set up correctly. MOA classification is the task of classifying the MOA of an unseen compound. Therefore, the evaluation should be a leave-one-compound-out cross-validation: in each iteration, hold out one compound (all replicates and all concentrations), train on the remaining compounds, and test on the held-out compound.
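
The leave-one-compound-out scheme described in the note can be sketched as a split generator. The sample records, compound names, and function name below are hypothetical; the key point is that every record of the held-out compound, at every concentration and replicate, goes to the test side.

```python
def leave_one_compound_out(samples):
    """Yield (held_out_compound, train, test) for each compound in turn.

    Each sample is a dict with at least a "compound" key. All records of the
    held-out compound (every concentration and replicate) end up in test.
    """
    compounds = sorted({s["compound"] for s in samples})
    for held_out in compounds:
        train = [s for s in samples if s["compound"] != held_out]
        test = [s for s in samples if s["compound"] == held_out]
        yield held_out, train, test

# Hypothetical records: two concentrations and two replicates per compound.
SAMPLES = [
    {"compound": c, "concentration": conc, "replicate": rep}
    for c in ("cmpd_A", "cmpd_B", "cmpd_C")
    for conc in (0.1, 1.0)
    for rep in (1, 2)
]

for held_out, train, test in leave_one_compound_out(SAMPLES):
    print(held_out, "train:", len(train), "test:", len(test))
```

Splitting by individual record or by compound-concentration instead would let other doses of the same compound leak into training, which is the setup the note warns against.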

 BBBC021_v1_moa.csv (4393 bytes)

CellProfiler pipelines

CellProfiler analysis pipeline

CellProfiler illumination pipeline

Published results using this image set

The prediction of mechanism of action was performed with restrictions on the possible match. Not-Same-Compound (NSC) does not allow a match to the same compound. Not-Same-Compound-or-Batch (NSCB) does not allow a match to the same compound or to any compound in the same batch. Evaluations were performed at the level of individual wells (Per-Well) as well as that of individual treatments, where replicate wells were averaged to create a profile (Per-Treatment).

Per-Treatment    Per-Well
NSC     NSCB     NSC     NSCB     Citation
96%     95%      91%     89%      Ando et al., BioRxiv, 2017.
94%     77%      86%     71%      Ljosa et al., J. Biomol. Screening, 2013.
91%     N/A      N/A     N/A      Pawlowski et al., BioRxiv, 2016.
90%     85%      N/A     N/A      Singh et al., J. Microsc., 2014.
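
The NSC and NSCB match restrictions described above can be illustrated with a small nearest-neighbour sketch. It assumes that a "match" means the closest reference profile by Euclidean distance; the cited publications may use other distances or classifiers. The record fields ("compound", "batch", "profile", "moa"), the toy values, and the function name are all hypothetical.

```python
import math

def predict_moa(query, references, mode="NSC"):
    """Predict the MOA of query from its nearest allowed reference profile.

    mode="NSC":  references from the same compound are excluded.
    mode="NSCB": references from the same compound or the same batch are excluded.
    """
    def allowed(ref):
        if ref["compound"] == query["compound"]:
            return False
        if mode == "NSCB" and ref["batch"] == query["batch"]:
            return False
        return True

    candidates = [r for r in references if allowed(r)]
    if not candidates:
        return None  # no admissible match under this restriction
    nearest = min(candidates, key=lambda r: math.dist(query["profile"], r["profile"]))
    return nearest["moa"]

# Toy profiles, invented for illustration only.
QUERY = {"compound": "cmpd_A", "batch": 1, "profile": (0.0, 0.0), "moa": "X"}
REFERENCES = [
    {"compound": "cmpd_A", "batch": 2, "profile": (0.1, 0.0), "moa": "X"},  # same compound
    {"compound": "cmpd_B", "batch": 1, "profile": (0.2, 0.0), "moa": "Y"},  # same batch
    {"compound": "cmpd_C", "batch": 2, "profile": (1.0, 0.0), "moa": "X"},
]

print(predict_moa(QUERY, REFERENCES, mode="NSC"))   # Y: nearest match from another compound
print(predict_moa(QUERY, REFERENCES, mode="NSCB"))  # X: same-batch match is also excluded
```

In this toy case the same-batch neighbour is the closest admissible match under NSC, so the two restrictions give different predictions for the same query.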

For more information

These images were originally gathered for Caie et al. (Molecular Cancer Therapeutics, 2010).

Recommended citation

"We used image set BBBC021v1 [Caie et al., Molecular Cancer Therapeutics, 2010], available from the Broad Bioimage Benchmark Collection [Ljosa et al., Nature Methods, 2012]."

Copyright

The images and ground truth are copyright AstraZeneca Pharmaceuticals.