This function sends a request to PubChem to retrieve Compound IDs (CIDs) for a given identifier. It returns a tibble (data frame) with the provided identifier and the corresponding CIDs.
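To make the request concrete, the following is a minimal sketch of the kind of PUG REST URL that a CID lookup corresponds to. The helper `build_cid_url()` is hypothetical and not part of the package; how `get_cids()` builds its request internally is not shown here and may differ.

```r
# Hypothetical helper (not part of the package): builds a PUG REST URL that
# asks PubChem for the CIDs matching one identifier.
build_cid_url <- function(identifier, namespace = "name", domain = "compound") {
  base <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
  # Percent-encode the identifier so spaces and reserved characters are URL-safe.
  encoded <- utils::URLencode(as.character(identifier), reserved = TRUE)
  paste(base, domain, namespace, encoded, "cids", "JSON", sep = "/")
}

build_cid_url("aspirin")
```

The sketch only assembles the URL string and sends nothing, so it can be run offline.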

Usage

get_cids(
  identifier,
  namespace = "name",
  domain = "compound",
  searchtype = NULL,
  options = NULL
)

Arguments

identifier

A vector of positive integers (e.g. cid, sid, aid) or identifier strings (source, inchikey, formula). Some namespaces accept only a single identifier string per request (name, smiles, xref; inchi and sdf by POST only).

namespace

Specifies the namespace for the query. For the 'compound' domain, possible values include 'cid', 'name', 'smiles', 'inchi', 'sdf', 'inchikey', 'formula', 'substructure', 'superstructure', 'similarity', 'identity', 'xref', 'listkey', 'fastidentity', 'fastsimilarity_2d', 'fastsimilarity_3d', 'fastsubstructure', 'fastsuperstructure', and 'fastformula'. For other domains, the possible namespaces are domain-specific.

domain

Specifies the domain of the query. Possible values are 'substance', 'compound', 'assay', 'gene', 'protein', 'pathway', 'taxonomy', 'cell', 'sources', 'sourcetable', 'conformers', 'annotations', 'classification', and 'standardize'.

searchtype

Specifies the type of search to be performed. For structure searches, possible values are combinations of 'substructure', 'superstructure', 'similarity', 'identity' with 'smiles', 'inchi', 'sdf', 'cid'. For fast searches, possible values are combinations of 'fastidentity', 'fastsimilarity_2d', 'fastsimilarity_3d', 'fastsubstructure', 'fastsuperstructure' with 'smiles', 'smarts', 'inchi', 'sdf', 'cid', or 'fastformula'.

options

Additional arguments passed to get_json.

Value

A tibble (data frame) in which each row pairs a provided identifier with a matching CID. The tibble has the columns 'Identifier' and 'CID', as shown in the example output; both are character columns.
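In the example output on this page the CID column has type character, so numeric use needs a conversion. The snippet below is a small sketch of that step. It uses a hand-built data frame as a stand-in for a real `get_cids()` result so that it runs without a network connection; the stand-in is an assumption, not output from the package.

```r
# Stand-in for a get_cids() result, built by hand so the snippet runs offline.
# Column names and types are copied from the example output on this page.
res <- data.frame(
  Identifier = "aspirin",
  CID = "2244",
  stringsAsFactors = FALSE
)

# Convert the character CIDs to integers for numeric use.
res$CID <- as.integer(res$CID)
str(res)
```

The same conversion should apply to a tibble returned by `get_cids()`, provided every CID value is a plain integer string.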

Examples

get_cids(
  identifier = "aspirin",
  namespace = "name"
)
#> # A tibble: 1 × 2
#>   Identifier CID  
#>   <chr>      <chr>
#> 1 aspirin    2244