Datasets
SEA Libraries
SEA Libraries consist of the following files:
<library_id>_compounds.Rdata: Compound information Rdata file containing a data.frame with the following columns:
compound: ZINC ID (e.g. ZINC57058)
smiles: character string representing the molecule (e.g. NCCc1c[nH]c2ccc(O)cc12)
data: Information about the molecule collected from ChEMBL and ZINC, and depicted with CACTVS
<library_id>_sets.Rdata: Target set information Rdata file containing a data.frame with the following columns:
target: Identifier for the target (e.g. DRD2_HUMAN)
name: Target name
compound: compound identifier
affinity: affinity threshold
description: target description
<library_id>.fit: Parameters for the null background model
<library_id>.sea: SEA library file
All datasets are prepared using the BioChemPantry R package.
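As a rough illustration of how the compounds table and the sets table relate, the sketch below groups rows of a sets table by target to recover each target's compound set, then looks up the SMILES string for each member. It uses plain Python dictionaries rather than the actual Rdata files, and every row shown (the identifiers `ZINC_A`, `ZINC_B`, `ZINC_C`, `TARGET_X`, `TARGET_Y`, the SMILES strings, and the affinity values) is a made-up placeholder; only the column names come from the lists above.

```python
from collections import defaultdict

# Hypothetical rows mirroring the columns of <library_id>_compounds.Rdata.
compounds = [
    {"compound": "ZINC_A", "smiles": "CCO"},
    {"compound": "ZINC_B", "smiles": "CCN"},
    {"compound": "ZINC_C", "smiles": "CCC"},
]

# Hypothetical rows mirroring the columns of <library_id>_sets.Rdata.
# The affinity values are placeholders; the source does not state their units.
sets = [
    {"target": "TARGET_X", "compound": "ZINC_A", "affinity": 5},
    {"target": "TARGET_X", "compound": "ZINC_B", "affinity": 5},
    {"target": "TARGET_Y", "compound": "ZINC_B", "affinity": 5},
    {"target": "TARGET_Y", "compound": "ZINC_C", "affinity": 5},
]


def compound_sets_by_target(set_rows):
    """Group compound identifiers by the target they are annotated to."""
    grouped = defaultdict(set)
    for row in set_rows:
        grouped[row["target"]].add(row["compound"])
    return dict(grouped)


# Lookup from compound identifier to SMILES, built from the compounds table.
smiles_by_compound = {row["compound"]: row["smiles"] for row in compounds}

target_sets = compound_sets_by_target(sets)
for target, members in sorted(target_sets.items()):
    print(target, [(c, smiles_by_compound[c]) for c in sorted(members)])
```

A compound can belong to several target sets, which is why the sets table is kept separate from the compounds table and joined on the `compound` column.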
Library ID | Library Source | Library Description | Download Link |
chembl27_rdkit_ecfp4 | ChEMBL 27 | UniProt entry to ZINC ID sets with hashed Extended Connectivity Fingerprints of diameter 4 (ECFP4) | Download |
SEA Scores
SEA Scores consist of all associations between targets in the reference library and targets in the query library.
The results table contains the following columns:
target1: Uniprot Entry of reference target (e.g. DRD2_HUMAN)
entrez_id1: Gene id of reference target (e.g. 1813)
gene_name1: Gene name of reference target (e.g. DRD2)
description1: Description of reference target (e.g. D(2) dopamine receptor)
target1_class_1: ChEMBL class_1 for reference target (e.g. Membrane receptor)
target1_class_2: ChEMBL class_2 for reference target (e.g. 7TM1)
target1_class_3: ChEMBL class_3 for reference target (e.g. SmallMol)
target1_class_4: ChEMBL class_4 for reference target (e.g. Monoamine receptor)
target1_class_5: ChEMBL class_5 for reference target (e.g. Dopamine receptor)
target1_class_6: ChEMBL class_6 for reference target (e.g. Dopamine receptor)
target2: Uniprot Entry of query target (e.g. ESR1_HUMAN)
entrez_id2: Gene id of query target (e.g. 2099)
gene_name2: Gene name of query target (e.g. ESR1)
description2: Description of query target (e.g. Estrogen receptor)
target2_class_1: ChEMBL class_1 for query target (e.g. Transcription Factor)
target2_class_2: ChEMBL class_2 for query target (e.g. Nuclear Receptor)
target2_class_3: ChEMBL class_3 for query target (e.g. NR3)
target2_class_4: ChEMBL class_4 for query target (e.g. NR3A)
target2_class_5: ChEMBL class_5 for query target (e.g. NR3A1)
target2_class_6: ChEMBL class_6 for query target (e.g. NR3A1)
MaxTC: Maximum Tanimoto similarity between compounds from the reference target and compounds from the query target, in [0,1], with 1 meaning identical up to the resolution of the fingerprint
Zscore: Raw Z-score under the reference SEA background null model
Pvalue: P-value of Z-score
Qvalue: FDR-corrected P-value for significance across all comparisons
Evalue: Bonferroni correction for significance across all comparisons
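To make the MaxTC column concrete, here is a minimal sketch of a maximum Tanimoto similarity between two compound sets. It assumes each fingerprint is represented as a set of on-bit indices; the function names `tanimoto` and `max_tc` and the toy bit sets are illustrative only and are not taken from the SEA software.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def max_tc(ref_fingerprints, query_fingerprints):
    """Largest pairwise Tanimoto similarity between two fingerprint collections."""
    return max(
        tanimoto(a, b)
        for a in ref_fingerprints
        for b in query_fingerprints
    )


# Toy fingerprints (made up for illustration, not real ECFP4 bits).
ref = [frozenset({1, 5, 9}), frozenset({2, 5, 7, 9})]
query = [frozenset({1, 5, 9, 12}), frozenset({3, 4})]

print(max_tc(ref, query))
```

For these toy sets the best pair shares 3 bits out of 4 distinct bits, giving 0.75. A value of 1 only says that two compounds set the same bits, which is why the column description qualifies identity as being up to the resolution of the fingerprint.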
Reference Library | Query Library | Fingerprint Type | Download Link |
chembl27_rdkit_ecfp4 | chembl27_rdkit_ecfp4 | Extended Connectivity Fingerprint (Path-based, Bits, Hashed, radius 2) | Download |
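The Pvalue, Qvalue, and Evalue columns of the results table are related through multiple-testing corrections. The sketch below shows one common way such values are derived from a list of P-values. It rests on two assumptions that the text above does not confirm: that the FDR correction is the Benjamini-Hochberg procedure, and that the Bonferroni-style E-value is the P-value multiplied by the number of comparisons without capping at 1. The function names and the example P-values are hypothetical.

```python
def bonferroni_evalues(pvalues):
    """Multiply each P-value by the number of comparisons (assumed, uncapped)."""
    n = len(pvalues)
    return [p * n for p in pvalues]


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P-values (assumed FDR method)."""
    n = len(pvalues)
    # Indices of the P-values in ascending order of P-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


# Made-up P-values for four target pairs.
pvalues = [0.01, 0.04, 0.03, 0.5]
print(bonferroni_evalues(pvalues))
print(bh_qvalues(pvalues))
```

Under these assumptions an E-value can exceed 1 and reads as an expected number of chance hits at least this extreme, while a Q-value always stays in [0,1].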