ASBMB recommends clearer guidance on metabolomics data sharing
The ASBMB sent a letter to the National Cancer Institute on Dec. 30 regarding how to support privacy, reproducibility and harmonization of metabolomics data in alignment with the new NIH data-management and -sharing policy.
When the NCI published a request for information titled “Soliciting Input on the Use and Reuse of Cancer Metabolomics Data” in October, the ASBMB was eager to ask its members about their experiences and to share their concerns about the NIH’s data-management and -sharing policy.
Briefly, in its letter, the ASBMB told the NCI that (1) -omics research produces large, complex data sets that threaten to burden many scientists under the NIH policy; (2) the diversity of metabolomics research necessitates both standardization and flexibility to be maximally effective; and (3) high variability in sample preparation, data collection, software, metabolite nomenclature and more makes the reuse and integration of metabolomics data extremely difficult.
Compliance must not burden investigators
The NIH data-management and -sharing policy, effective Jan. 25, aims to enable validation, promote data reuse and provide public access to NIH-funded research.
Rick Page of Miami University, chair of the ASBMB Public Affairs Advisory Committee, said that this effort is “noble and laudable but has the potential to require onerous data annotation efforts in order to yield useful publicly shared data.”
His concern arises from the NIH policy’s broad definition of scientific data, which must be of “sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.”
For scientists who perform metabolomics and -omics research, this could be a tall order.
The society recommended that the NIH issue more guidance on what level of data and information is required for compliance. The organization noted that it is imperative that the clarifications be “sufficiently flexible” to accommodate the diverse methods used in metabolomics research and their technical limitations.
Standardize, but stay flexible too
For data-management and -sharing to be maximally effective, there must be some standardization in terms of data formats, nomenclature and metadata information. For metabolomics data sets, standardization will be a challenge.
Metabolomics research most commonly is conducted using either mass spectrometry or nuclear magnetic resonance. These techniques are highly sensitive to experimental parameters, so variations in media, sample preparation, instrumentation and more can affect the results of an experiment significantly. Without sufficient metadata to tell other scientists how the data were collected and analyzed, the data sets are of little use.
To ensure data have proper metadata without undue burden, the society recommended that repositories require “a reasonable degree of metadata that is standardized in format and interoperable with .”
Metabolomics involves collecting a snapshot of millions of molecules varying in chemical structures, chimeric states and chemical modifications. Distinguishing one unique molecule from the others is no small task, let alone naming it.
Many different formats and styles exist to name molecules, including InChIKey, SMILES, PubChem, ChemSpider, ChEBI and several others, but a single molecule still can have multiple names, which complicates the deposition and retrieval of metabolomics data.
The ASBMB said that standardization of nomenclature across scientific fields would be beneficial, but interoperability of naming formats should be prioritized.
Additionally, metabolomics is a rapidly evolving field, and standards run the risk of becoming outdated quickly. The society called for the NCI and the NIH to structure data repositories to “accommodate new technologies and incorporate new functionalities with ease.”
Data diversity and complexity hinder metabolomics reuse
Andrew Lane, a professor at the University of Kentucky, agreed with the goal of the NIH’s data-sharing policy, stating that data in metabolomics should be “easily retrievable and understandable to nonexperts.” But he had some concerns about how to translate metabolomics data into biological significance successfully.
Pathway analysis and enrichment software can be “unnecessarily reductive in its assumptions,” Lane said. This type of software is designed to analyze complex data sets and output the metabolic pathways that may be upregulated or downregulated. However, the results may be misleading. To help ensure metabolomics data are shared and reused responsibly, the society recommended that these software packages clearly communicate to users that their outputs require additional validation.
The NCI requested feedback on researchers’ experiences integrating metabolomics data into multiomics studies, to which Lane said: “It is critical that the data are carefully managed and highly interoperable between multiple -omics data streams to ensure the output isn’t misleading or overly reductive.”
He clarified that to do this for each tissue, cancer type and specialized metabolism within an organism, the NIH must be prepared for “horrendous complexity.”
To increase reuse of metabolomics data by nonexperts, the society recommended that repositories be required to provide thorough instructions on how to properly retrieve, process and analyze metabolomics data sets to ensure they are utilized correctly.
Lane explained that the complexity of metabolomics makes standardization and centralized deposition a challenge.
“Developing a system that is effective for everyone is actually very difficult,” Lane said. He applauded the efforts of researchers at the University of California, San Diego, in developing a workable databank system for metabolomics but noted that some issues related to depositing tracer data remain.
Let’s stay in touch
The society credited the NCI for soliciting input from the scientific community on cancer metabolomics data-management and -sharing but encouraged continued engagement.
“Decisions on these policies must consider both the utility of deposited data and the financial and time costs associated with meeting the final requirements” and should not be rushed, the society wrote.
The society asked the NCI to convene a summit for direct and candid discussions with investigators, journals and industry about setting standards and implementing those standards in research workflows.
This will ensure policymakers strike a balance between delivering on goals for the new NIH data-management and -sharing policy and implementing it in a way that is amenable to current scientific infrastructure.
The ASBMB and its members also hope to gain clarity on how federal science agencies and research institutes plan to support the infrastructure necessary for effective data-management and -sharing, such as funding for repositories. This type of support is critical for public access to scientific data but has yet to be defined clearly by policymakers.