Accession number BBBC021 · Version 1
Example Images
Actin disrupter |
Aurora kinase inhibitor |
Eg5 inhibitor |
Tubulin destabilizer |
Tubulin stabilizer |
Description of the biological application
Phenotypic profiling attempts to summarize multiparametric, feature-based analysis of cellular phenotypes of each sample so that similarities between profiles reflect similarities between samples. Profiling is well established for biological readouts such as transcript expression and proteomics. Image-based profiling, however, is still an emerging technology.
This image set provides a basis for testing image-based profiling methods with respect to their ability to predict the mechanisms of action of a compendium of drugs. The image set was collected using a typical set of morphological labels and uses a physiologically relevant p53-wildtype breast-cancer model system (MCF-7) and a mechanistically distinct set of targeted and cancer-relevant cytotoxic compounds that induce a broad range of gross and subtle phenotypes.
Images
The images are of MCF-7 breast cancer cells treated for 24 h with a collection of 113 small molecules at eight concentrations. The cells were fixed, labeled for DNA, F-actin, and β-tubulin, and imaged by fluorescence microscopy as described [Caie et al. Molecular Cancer Therapeutics, 2010].
There are 39,600 image files (13,200 fields of view imaged in three channels) in TIFF format. We provide the images in 55 ZIP archives, one for each microtiter plate. The archives range from about 725 MB to about 900 MB each; exact sizes are listed below.
BBBC021_v1_images_Week1_22123.zip (839436312 bytes)
BBBC021_v1_images_Week1_22141.zip (851400910 bytes)
BBBC021_v1_images_Week1_22161.zip (841371484 bytes)
BBBC021_v1_images_Week1_22361.zip (854598915 bytes)
BBBC021_v1_images_Week1_22381.zip (861576297 bytes)
BBBC021_v1_images_Week1_22401.zip (874848053 bytes)
BBBC021_v1_images_Week2_24121.zip (813322359 bytes)
BBBC021_v1_images_Week2_24141.zip (812952878 bytes)
BBBC021_v1_images_Week2_24161.zip (817951167 bytes)
BBBC021_v1_images_Week2_24361.zip (819824327 bytes)
BBBC021_v1_images_Week2_24381.zip (744172615 bytes)
BBBC021_v1_images_Week2_24401.zip (800661306 bytes)
BBBC021_v1_images_Week3_25421.zip (871456323 bytes)
BBBC021_v1_images_Week3_25441.zip (857910029 bytes)
BBBC021_v1_images_Week3_25461.zip (866095101 bytes)
BBBC021_v1_images_Week3_25681.zip (872819691 bytes)
BBBC021_v1_images_Week3_25701.zip (872579408 bytes)
BBBC021_v1_images_Week3_25721.zip (879704833 bytes)
BBBC021_v1_images_Week4_27481.zip (821231042 bytes)
BBBC021_v1_images_Week4_27521.zip (900816031 bytes)
BBBC021_v1_images_Week4_27542.zip (842012560 bytes)
BBBC021_v1_images_Week4_27801.zip (887877004 bytes)
BBBC021_v1_images_Week4_27821.zip (869265677 bytes)
BBBC021_v1_images_Week4_27861.zip (859009634 bytes)
BBBC021_v1_images_Week5_28901.zip (855927945 bytes)
BBBC021_v1_images_Week5_28921.zip (871718868 bytes)
BBBC021_v1_images_Week5_28961.zip (842785692 bytes)
BBBC021_v1_images_Week5_29301.zip (852254591 bytes)
BBBC021_v1_images_Week5_29321.zip (863875730 bytes)
BBBC021_v1_images_Week5_29341.zip (873145275 bytes)
BBBC021_v1_images_Week6_31641.zip (820699818 bytes)
BBBC021_v1_images_Week6_31661.zip (858473246 bytes)
BBBC021_v1_images_Week6_31681.zip (800433406 bytes)
BBBC021_v1_images_Week6_32061.zip (805248261 bytes)
BBBC021_v1_images_Week6_32121.zip (821454013 bytes)
BBBC021_v1_images_Week6_32161.zip (736915539 bytes)
BBBC021_v1_images_Week7_34341.zip (848887897 bytes)
BBBC021_v1_images_Week7_34381.zip (807292825 bytes)
BBBC021_v1_images_Week7_34641.zip (878009142 bytes)
BBBC021_v1_images_Week7_34661.zip (871253843 bytes)
BBBC021_v1_images_Week7_34681.zip (869464697 bytes)
BBBC021_v1_images_Week8_38203.zip (784989526 bytes)
BBBC021_v1_images_Week8_38221.zip (821316018 bytes)
BBBC021_v1_images_Week8_38241.zip (765063810 bytes)
BBBC021_v1_images_Week8_38341.zip (781845212 bytes)
BBBC021_v1_images_Week8_38342.zip (770975988 bytes)
BBBC021_v1_images_Week9_39206.zip (781358283 bytes)
BBBC021_v1_images_Week9_39221.zip (790100629 bytes)
BBBC021_v1_images_Week9_39222.zip (771022096 bytes)
BBBC021_v1_images_Week9_39282.zip (730865395 bytes)
BBBC021_v1_images_Week9_39283.zip (779096045 bytes)
BBBC021_v1_images_Week9_39301.zip (725130381 bytes)
BBBC021_v1_images_Week10_40111.zip (886666704 bytes)
BBBC021_v1_images_Week10_40115.zip (855657088 bytes)
BBBC021_v1_images_Week10_40119.zip (829928012 bytes)
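The archive names above appear to follow the pattern BBBC021_v1_images_<week>_<plate>.zip. A minimal sketch, assuming that pattern holds for every archive, of how a name could be split into its week and plate parts and how a downloaded file could be checked against its listed byte count. The function names `parse_archive_name` and `size_matches` are hypothetical helpers, not part of any BBBC tooling.

```python
import os
import re

# Pattern inferred from the archive names in the list above (an assumption).
ARCHIVE_RE = re.compile(r"^BBBC021_v1_images_(Week\d+)_(\d+)\.zip$")


def parse_archive_name(name):
    """Return (week, plate) for an archive file name, or None if it does not match."""
    match = ARCHIVE_RE.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


def size_matches(path, listed_bytes):
    """True when the file at `path` has exactly the byte count listed on this page."""
    return os.path.getsize(path) == listed_bytes
```

A size mismatch usually indicates an incomplete download; a matching size does not by itself prove that the archive is intact.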
Note about the selection of compounds
A subset of the compound-concentrations have been identified as clearly having one of 12 different primary mechanisms of action. Mechanistic classes were selected so as to represent a wide cross-section of cellular morphological phenotypes. The differences between phenotypes in some cases were very subtle: we identified 6 of the 12 mechanisms visually (Actin disruptors, Aurora kinase inhibitors, Eg5 inhibitors, Microtubule destabilizers, Microtubule stabilizers, and Epithelial); the remainder were defined based on the literature.
All compounds were tested at eight doses. The top concentration was different for many of the compounds and was carefully selected from the literature. Not all concentrations are available in this dataset. Missing concentrations are due to one of three factors:
- The dose was determined to be inactive. Activity was defined by setting a threshold on the Mahalanobis distance from the set of DMSO profiles: profiles of doses that were outside this threshold were considered active. The feature space corresponded to measurements extracted by a proprietary software tool used at AstraZeneca.
- The dose was determined to be overly toxic, i.e., the images had no cells or very few cells.
- The images did not pass QC, that is, the wells were either out of focus or contained image artifacts.
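The activity criterion in the first item can be illustrated with a short sketch. It assumes that each profile is a numeric feature vector and that the DMSO profiles are summarized by their mean and covariance. The original features came from a proprietary tool and the threshold value is not given here, so `threshold` is a caller-supplied assumption, and the function names are hypothetical.

```python
import numpy as np


def mahalanobis_to_controls(profiles, dmso_profiles):
    """Mahalanobis distance of each row of `profiles` from the DMSO profile set."""
    dmso = np.asarray(dmso_profiles, dtype=float)
    x = np.asarray(profiles, dtype=float)
    mean = dmso.mean(axis=0)
    cov = np.atleast_2d(np.cov(dmso, rowvar=False))
    # The pseudo-inverse tolerates a singular covariance (e.g. redundant features).
    cov_inv = np.linalg.pinv(cov)
    diff = x - mean
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))


def is_active(profiles, dmso_profiles, threshold):
    """Flag profiles whose distance from the DMSO set exceeds `threshold`."""
    return mahalanobis_to_controls(profiles, dmso_profiles) > threshold
```

Profiles for which `is_active` returns False would correspond to the doses described above as inactive.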
Metadata
The file BBBC021_v1_image.csv contains the metadata, with the following fields:
- TableNumber
- ImageNumber
- Image_FileName_DAPI
- Image_PathName_DAPI
- Image_FileName_Tubulin
- Image_PathName_Tubulin
- Image_FileName_Actin
- Image_PathName_Actin
- Image_Metadata_Plate_DAPI
- Image_Metadata_Well_DAPI
- Replicate
- Image_Metadata_Compound
- Image_Metadata_Concentration
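As a rough illustration of how these fields fit together, the sketch below reads metadata rows with Python's `csv` module and builds one path per channel. It assumes that each `Image_PathName_*` value is a directory relative to wherever the archives were extracted and that each `Image_FileName_*` value is a file inside it. The sample row is made up for the example and is not taken from BBBC021_v1_image.csv.

```python
import csv
import io
import posixpath

CHANNELS = ("DAPI", "Tubulin", "Actin")


def read_image_metadata(text_stream):
    """Parse the metadata CSV into a list of dicts keyed by field name."""
    return list(csv.DictReader(text_stream))


def channel_paths(row, root=""):
    """Map each channel to root/<PathName>/<FileName> for one metadata row."""
    return {
        ch: posixpath.join(root, row[f"Image_PathName_{ch}"], row[f"Image_FileName_{ch}"])
        for ch in CHANNELS
    }


# Hypothetical one-row sample using the field names listed above.
HEADER = (
    "TableNumber,ImageNumber,Image_FileName_DAPI,Image_PathName_DAPI,"
    "Image_FileName_Tubulin,Image_PathName_Tubulin,Image_FileName_Actin,"
    "Image_PathName_Actin,Image_Metadata_Plate_DAPI,Image_Metadata_Well_DAPI,"
    "Replicate,Image_Metadata_Compound,Image_Metadata_Concentration"
)
SAMPLE = HEADER + "\n1,1,dna.tif,plate_dir,tub.tif,plate_dir,act.tif,plate_dir,PLATE,A01,1,DMSO,0\n"

rows = read_image_metadata(io.StringIO(SAMPLE))
paths = channel_paths(rows[0], root="images")
```

For the real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle, for example `open("BBBC021_v1_image.csv", newline="")`.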
Example rows from the file:
The file BBBC021_v1_compound.csv gives the structures (in SMILES format) of most of the compounds. The fields are:
- compound
- smiles
Example rows from the file:
compound | smiles |
---|---|
DMSO | |
leupeptin | CC(C)C[C@H](NC(=O)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(=N)N)C=O |
taxol | CC(=O)O[C@H]1C(=O)[C@]2(C)[C@@H](O)C[C@H]3OC[C@@]3(OC(=O)C)[C@H]2[C@H](OC(=O)c4ccccc4)[C@]5(O)C[C@H](OC(=O)[C@H](O)[C@@H](NC(=O)c6ccccc6)c7ccccc7)C(=C1C5(C)C)C |
AZ235 | CC(C)n1c(C)ncc1c2ccnc(Nc3ccc(cc3)S(=O)(=O)C)n2 |
AZ-O |
BBBC021_v1_compound.csv (8174 bytes)
Ground truth
A subset of the compound-concentrations have been identified as clearly having one of 12 different primary mechanisms of action. Mechanistic classes were selected so as to represent a wide cross-section of cellular morphological phenotypes. The differences between phenotypes were in some cases very subtle: we were only able to identify 6 of the 12 mechanisms visually; the remainder were defined based on the literature.
The file BBBC021_v1_moa.csv contains the mechanisms of action of 103 compound-concentrations (38 compounds at 1–7 concentrations each). The fields are:
- compound
- concentration
- moa
Example rows from the file:
compound | concentration | moa |
---|---|---|
PP-2 | 3.000000 | Epithelial |
emetine | 0.300000 | Protein synthesis |
AZ258 | 1.000000 | Aurora kinase inhibitors |
NOTE: When evaluating the accuracy of MOA classification, it is critical to set up the cross-validation correctly. MOA classification is the task of classifying the MOA of an unseen compound. Therefore, the evaluation should be a leave-one-compound-out cross-validation: in each iteration, hold out one compound (all replicates, at all concentrations), train on the remaining compounds, and test on the held-out compound.
BBBC021_v1_moa.csv (4393 bytes)
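The leave-one-compound-out protocol described in the note can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: samples are held in parallel lists, and `fit_predict` is a hypothetical user-supplied function that trains on the training samples and returns one predicted MOA per test sample.

```python
def leave_one_compound_out(compounds):
    """Yield (held_out, train_idx, test_idx) once for each distinct compound.

    `compounds` holds the compound name of every sample, so all replicates
    and all concentrations of the held-out compound land in the test split.
    """
    for held_out in sorted(set(compounds)):
        test_idx = [i for i, c in enumerate(compounds) if c == held_out]
        train_idx = [i for i, c in enumerate(compounds) if c != held_out]
        yield held_out, train_idx, test_idx


def evaluate(features, labels, compounds, fit_predict):
    """Accuracy over all held-out samples.

    fit_predict(train_X, train_y, test_X) must return one label per test sample.
    """
    correct = 0
    total = 0
    for _, train_idx, test_idx in leave_one_compound_out(compounds):
        train_X = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        test_X = [features[i] for i in test_idx]
        predicted = fit_predict(train_X, train_y, test_X)
        correct += sum(p == labels[i] for p, i in zip(predicted, test_idx))
        total += len(test_idx)
    return correct / total
```

Splitting by compound rather than by sample is the point of the design: a random split over wells or treatments would let the classifier see other concentrations of the test compound during training. scikit-learn's `LeaveOneGroupOut`, with the compound name passed as the group, yields equivalent splits.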
CellProfiler pipelines
CellProfiler analysis pipeline
CellProfiler illumination pipeline
Published results using this image set
The prediction of mechanism of action was performed with restrictions on the possible match. Not-Same-Compound (NSC) does not allow a match to the same compound. Not-Same-Compound-or-Batch (NSCB) does not allow a match to the same compound or to any compound in the same batch. Evaluations were performed at the level of individual wells (Per-Well) as well as that of individual treatments, where replicate wells were averaged to create a profile (Per-Treatment).
Per-Treatment NSC | Per-Treatment NSCB | Per-Well NSC | Per-Well NSCB | Citation |
---|---|---|---|---|
96% | 95% | 91% | 89% | Ando et al., bioRxiv, 2017. |
94% | 77% | 86% | 71% | Ljosa et al., J. Biomol. Screening, 2013. |
91% | N/A | N/A | N/A | Pawlowski et al., bioRxiv, 2016. |
90% | 85% | N/A | N/A | Singh et al., J. Microsc., 2014. |
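The NSC and NSCB restrictions can be illustrated with a nearest-neighbor sketch. It assumes that each profile is a non-zero feature vector, that a "match" means the closest allowed profile, and that closeness is measured by cosine distance. The distance measure and the function name `predict_moa` are assumptions made for illustration; the cited papers may differ in their details.

```python
import numpy as np


def predict_moa(profiles, compounds, batches, moas, mode="NSC"):
    """Predict each profile's MOA from its nearest allowed neighbor.

    mode="NSC":  never match a profile of the same compound.
    mode="NSCB": additionally never match a profile from the same batch.
    Returns None for a profile that has no allowed neighbor.
    """
    if mode not in ("NSC", "NSCB"):
        raise ValueError("mode must be 'NSC' or 'NSCB'")
    X = np.asarray(profiles, dtype=float)
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    dist = 1.0 - unit @ unit.T  # pairwise cosine distance
    compounds = np.asarray(compounds)
    batches = np.asarray(batches)
    predictions = []
    for i in range(len(X)):
        allowed = compounds != compounds[i]
        if mode == "NSCB":
            allowed = allowed & (batches != batches[i])
        candidates = np.flatnonzero(allowed)
        if candidates.size == 0:
            predictions.append(None)
            continue
        best = candidates[np.argmin(dist[i, candidates])]
        predictions.append(moas[best])
    return predictions
```

Accuracy is then the fraction of profiles whose predicted MOA equals the annotated one. Per-Treatment evaluation would pass one averaged profile per treatment; Per-Well evaluation would pass one profile per well.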
For more information
These images were originally gathered for Caie et al. (Molecular Cancer Therapeutics, 2010).
Recommended citation
"We used image set BBBC021v1 [Caie et al., Molecular Cancer Therapeutics, 2010], available from the Broad Bioimage Benchmark Collection [Ljosa et al., Nature Methods, 2012]."
Copyright
The images and ground truth are copyright AstraZeneca Pharmaceuticals.