Working Group Recommendations for Critical Elements in NGS Reporting
Report Structure and Format
Expected Format of an NGS Report
It is the consensus of the working group and end users that an NGS report must be legible. It should contain a summary of findings that is clear, free of jargon, and easy for a layperson to read. Including too much information is often counterproductive, as it makes the report harder for a clinician or patient to comprehend. All laboratories must make sure that the final report is presented in a comprehensive and clear manner without omitting essential information. The ideal NGS assay report based on DNA sequencing or RNA sequencing must contain the following without fail:
• Name of the patient (individual).
• Date of birth or age.
• Gender (if consented).
• Nationality.
• Ethnicity.
• Type of specimen.
• Date of collection of the specimen.
• Date and time of receipt of specimen in the laboratory.
• Laboratory identification number.
• Name and affiliation of requesting clinician.
• Contact details of patient or kin where applicable.
• Name of the test (single gene/gene panel/whole-exome sequencing/WGS [whole genome sequencing]); total number of genes targeted, total number of reads obtained on target, the name of the genes that were not covered or genes with coverage below 100X, and total number of fusion genes targeted.
• Date of issue of the report.
• Indication for testing (including cancer type).
• Methodology of the assay.
• Quality assurance parameters.
• Any additional information on quality control (QC) failure.
• Limitations of the assay.
• Description of genomic and/or transcriptomic variants, including indels and fusions, in accordance with Human Genome Variation Society (HGVS) nomenclature.
• Classification of genomic variants according to American College of Medical Genetics and Genomics/Association for Molecular Pathology (ACMG/AMP) guidelines and/or the European Society for Medical Oncology (ESMO) Scale for Clinical Actionability of Molecular Targets (ESCAT) system.
• Clear inclusion of variant allele fraction (VAF) and copy number alterations (where applicable).
• Targeted therapy available globally and in India and approved by regulatory bodies.
• Specific mention of zygosity in case of germline assays and allelic fraction of the variant in somatic assays.
• Name and identification number of clinical trials open for recruitment in India, and country of origin of clinical trials outside India that are available for the variants reported in the assay.
• Inclusion of variants of uncertain significance.
• Indication of clonal hematopoiesis of indeterminate potential (CHIP) mutations in solid tumor patients, with a suggestion for confirmation by paired blood sequencing.
• Signatures and designation of reporting authorities including technical supervisor.
• Laboratory accreditation identifiers whenever applicable.
• Disclaimers if applicable.
• Brief description of assay validation including reference to peer-reviewed publication.
• Recommendations for further testing that could help clarify any existing confounders in the presented report, for example, RNA NGS for better detection of any fusion genes picked up on a DNA NGS. Also, recommendations for cases in which it is preferred to cross-check a positive report, for example, NTRK fusion gene testing ([Fig. 1]).
Fig. 1 Template for a next generation sequencing report.
Signatory Authorities
• It is mandatory that NGS reports be signed by the technical supervisor, scientist, and/or clinician, and the head of the laboratory, with the date and designation, to affirm responsibility for the technical information generated and issued in the report.
• The qualifications of the personnel mentioned must be in accordance with the guidelines set forth by the Ministry of Health and Family Welfare (MoHFW) as per the gazette notification of May 18, 2018 (http://clinicalestablishments.gov.in/WriteReadData/4161.pdf) followed by approval of national accreditation bodies.
• It is the responsibility of the head of the laboratory in the case of a standalone diagnostic center and the head of the institution in the case of a hospital to ensure the guidelines are followed.
Nomenclature of Variants
• Variants must be depicted as per standard international guidelines on variant nomenclature put forward by HGVS, the Human Variome Project, and the Human Genome Organization.
• It is mandatory to depict variants at the DNA level. Depiction at the transcriptomic or proteomic level is optional. A publicly accepted reference sequence based on the GRCh38/hg38 build must be the standard. The type of reference sequence must be depicted by a prefix:
  “c.”: coding DNA reference sequence.
  “g.”: linear genomic reference sequence.
  “n.”: noncoding DNA reference sequence.
  “p.”: protein reference sequence.
  “r.”: RNA reference sequence (transcript).
• At the DNA level, nucleotides are to be depicted in uppercase letters; at the RNA level, they must be in lowercase letters; and at the protein level, the three-letter code is preferable as per IUPAC-IUBMB symbols.
• When there is more than one type of variation, the following order must be adhered to: substitution, deletion, inversion, duplication, and insertion.
• All gene names are to be italicized and must be depicted using the most recent symbol approved by the HUGO Gene Nomenclature Committee (HGNC) so as to attain uniform reporting.
• Special characters such as “+,” “-,” and “*” must also follow the HGVS system.
• Abbreviations denoting type/nature of variation must be strictly adhered to: “>” for substitution, del for deletion, dup for duplication, ins for insertion, inv for inversion, fs for frameshift, and ext for extension.
• Fusions are to be denoted using the symbol “::” in between the fusion partner gene symbols.
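As a rough illustration of the prefix, letter-case, and “::” conventions listed above, the following Python sketch checks two simple description shapes. The regular expressions and function names are assumptions made for this example; they cover only plain single-nucleotide substitutions at integer positions and gene-symbol fusions, and are not a full HGVS validator.

```python
import re

# Simplified, hypothetical patterns: the real HGVS grammar is far richer
# (intronic offsets, UTR positions, ranges, reference sequence accessions).
SUBSTITUTION = re.compile(r"^(c|g|n)\.(\d+)([ACGT])>([ACGT])$")
FUSION = re.compile(r"^[A-Z0-9-]+::[A-Z0-9-]+$")


def is_simple_dna_substitution(description: str) -> bool:
    """Return True for strings shaped like 'c.1799T>A'.

    Requires a DNA-level prefix, upper-case bases, and '>' for substitution.
    """
    return SUBSTITUTION.match(description) is not None


def is_fusion(description: str) -> bool:
    """Return True when two gene symbols are joined by '::'."""
    return FUSION.match(description) is not None


if __name__ == "__main__":
    for text in ["c.1799T>A", "c.1799t>a", "EML4::ALK", "EML4-ALK"]:
        print(text, is_simple_dna_substitution(text), is_fusion(text))
```

A laboratory would normally rely on a maintained HGVS parsing library rather than hand-written patterns; the sketch only makes the listed conventions concrete.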
Tier-Based Classification of Variants
There are definitive international recommendations on the classification of variants in the somatic setting in cancer and in the germline setting in all diseases. All reports based on NGS must follow the AMP-ASCO-CAP (Association for Molecular Pathology–American Society of Clinical Oncology–College of American Pathologists) classification for somatic variants and the ACMG guidelines for variant classification for Mendelian disorders. ESMO introduced the ESCAT ranking in parallel, but it is yet to be adopted on a global scale. We recommend that the AMP-ASCO-CAP and ACMG guidelines be mandatory in NGS reports.
Using Predictive Algorithms
Artificial intelligence (AI) and deep neural networks are utilized in cancer research daily. With the advancement of technology, laboratories are adopting AI to decipher and make sense of the volume of data generated with larger genomic and transcriptomic panels. Yet, as with most applications and output of AI, the predictive algorithms ([Table 1]) offered for clinical services are welcome but must be accepted only under the “research use only (RUO)” label.
Interpreting Signaling Pathways from Variant Information
Genomic and transcriptomic variations in a tumor are directed to eventually alter protein or metabolic signaling pathways for sustenance of tumor growth or invasion. Dedicated canonical pathways such as RTK/Ras/Raf/Mek, PI3K/Akt, Wnt/β catenin, p53, Myc, Notch, and Hippo are mostly the driving pathways in cancer; however, there is a multitude of noncanonical pathways that cannot be overlooked. The routine NGS gene panels are limited in the number of genes queried as compared with the actual variations in a cancer genome. Hence deriving signaling pathway information from a limited set of variants is to be understood as limited information and not heavily relied upon. The larger the panel, the stronger the predictive capacity of driver signaling pathways. However, large panels bring with them the potential of detecting variants of doubtful clinical significance. The limitations of each should be clearly mentioned in the final report.
Therapeutic Options
The inclusion of targeted therapy and immunotherapy under the umbrella of precision oncology represents the endpoints of assaying tumor samples using molecular techniques. Every NGS report must thus have the approved and actionable therapeutic molecule mentioned next to the variant found in the tumor sample.
Off-Label Therapeutic Suggestions
Off-label use of a drug means using a drug “out of instruction.” As per the World Health Organization, half of the drugs globally are used off-label for various indications. The scenario is not different in cancer standard of care or targeted therapy. The decision and choice of using targeted therapy in an off-label mode rests upon the consensus decision of a molecular tumor board or the medical oncologist.
Minimum Quality Control Requisites
Total quality management in a laboratory is undisputedly the most important yardstick of the authenticity of the test reports it generates. International guidelines, such as those of the U.S. federal Clinical Laboratory Improvement Amendments (CLIA) program, mandate that laboratories follow strict policies on QC metrics.
Preanalytical Phase of NGS in Tissue-Based Assays
The success of the NGS-based molecular testing depends in large part on having an adequate amount of tumor (thereby sufficient DNA), having enough tumor percent, and minimizing potential tissue issues.
The quality of the FFPE (formalin-fixed paraffin-embedded) block is a very important yardstick for a quality report. The following points are mandatory during the selection of the tissue:
FFPE samples should be reviewed by trained and board-certified molecular onco-pathologists for specimen suitability, including specimen type, tumor purity, and quantity.
Tumor purity: tumor purity is often a crucial but overlooked variable in NGS sample assessment. It is an indicator of the number or fraction of tumor cells out of all the cells present in a sample submitted by the pathologist. A score of 20 to 30% is generally accepted by laboratories but there is no standardization. Tumor purity could indirectly influence calculation of TMB and inference of germline mutations from a given sample.
Hence it is important that laboratories maintain a harmonized cut-off score for good-quality results. The committee advises a tumor content cut-off of 20% for smaller panels (15–50 genes) and 30% for bigger panels (>500 genes) for macro/microdissection, given that 30% is the international standard cut-off.
Nucleic acid yield: the starting material of an NGS assay is DNA or RNA. Therefore, the yield and purity of the nucleic acid are of paramount importance in generating appropriate results. Fluorescence-based DNA measurements typically give far lower values than spectrophotometry, but they are more accurate and precise, particularly at lower concentration ranges. In lung cancer, >30 ng can be considered a cut-off. 100 ng could be considered optimal for laboratories in general.
Library failures are less likely when samples have a minimum DNA quality score, represented as DNA integrity number (DIN), of 3 with a DNA concentration of at least 5 ng/µL and a minimum library concentration of 40 nmol for targeted panels. In solid tumors, RNA library concentration, and not the RNA integrity number (RIN), is the only parameter directly associated with coverage. Therefore, a threshold value of DIN >3, with a minimum concentration of 5 ng/µL, should be accepted. The RNA distribution value is a better quality metric than the RIN.
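The tumor content, DIN, and concentration thresholds described above can be expressed as a small pre-sequencing check. This is only a sketch: the function names and the wording of the flags are hypothetical, and panel sizes between the two ranges named in the text are deliberately left without a cut-off because the text does not give one.

```python
from typing import List, Optional


def min_tumor_content(panel_genes: int) -> Optional[float]:
    """Committee cut-offs: 20% for 15-50 gene panels, 30% for >500 gene panels."""
    if 15 <= panel_genes <= 50:
        return 20.0
    if panel_genes > 500:
        return 30.0
    return None  # no cut-off is stated in the text for other panel sizes


def preanalytical_flags(
    panel_genes: int, tumor_pct: float, din: float, dna_ng_per_ul: float
) -> List[str]:
    """Return a list of QC concerns; an empty list means no flag was raised."""
    flags = []
    cutoff = min_tumor_content(panel_genes)
    if cutoff is None:
        flags.append("no tumor-content cut-off defined for this panel size")
    elif tumor_pct < cutoff:
        flags.append(f"tumor content {tumor_pct}% below {cutoff}% cut-off")
    if din <= 3:  # the text asks for DIN > 3
        flags.append("DIN not above 3")
    if dna_ng_per_ul < 5:  # the text asks for at least 5 ng/µL
        flags.append("DNA concentration below 5 ng/µL")
    return flags


if __name__ == "__main__":
    print(preanalytical_flags(panel_genes=50, tumor_pct=25.0, din=4.1, dna_ng_per_ul=8.0))
    print(preanalytical_flags(panel_genes=600, tumor_pct=25.0, din=2.5, dna_ng_per_ul=3.0))
```

Returning a list of flags rather than a single pass/fail value keeps every reason for concern visible to the reviewing pathologist.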
There is a wide range of preanalytical variables that affect DNA quality, e.g., the presence or absence of fixation; the type of fixative; length of fixation in FFPE tissue.
Preliminary or final pathology reports should accompany all specimens.
Tissue should be fixed in 10% neutral-buffered formalin. Other fixatives are discouraged unless otherwise specified.
Fragment size is a crucial component in the clinical analysis of DNA by NGS because it influences the type and quantity of DNA that can be extracted and analyzed. The ideal fragment size depends on the particular application and platform being used. For Illumina sequencing, a fragment size of 300 to 500 base pairs is advised. Since the fragment size can influence the precision of variant detection and cause sequencing errors, choosing the right fragment size is essential for generating high-quality sequencing results.
Preanalytical Phase of NGS in Liquid Biopsy
Liquid biopsy refers to testing molecular representatives such as circulating tumor cells, circulating tumor DNA (ctDNA), exosomes, tumor extracellular vesicles, tumor-educated platelets, circulating cell-free RNA, etc., primarily from blood and to a lesser extent from other body fluids. ctDNA accounts for 0.1 to 10% of the 10 to 100 ng/mL of cell-free DNA (cfDNA). Time is an essential factor in liquid biopsy because these molecules have short half-lives. Other factors are freeze-thawing, temperature, time lost between blood draw and analysis, DNA degradation, and leakage from cells. Hence it is important to ensure that the starting material has passed internal QC checks. For liquid samples, flow cytometry or other methods should be used to evaluate the sample's percentage of neoplastic cells. Laboratories should archive either a representative slide or image of the tissue tested.
Reporting of Tissue-Based NGS Assays
Reporting Somatic Variants in a Tumor Sample
In the clinic, somatic variants are identified through whole-exome sequencing or targeted somatic panels. Single nucleotide variants (SNVs) present in cancer cells are reported using databases and bioinformatic methods. hg19 and hg38 are the two reference genome builds used for alignment. The GRCh38 ALT contigs are recognizable by their _alt suffix. GRCh38/hg38 is strongly recommended over hg19. In addition to adding many alternate contigs, GRCh38 corrects thousands of small sequencing artifacts that cause false single nucleotide polymorphisms (SNPs) and indels. It also includes synthetic centromeric sequence and updates nonnuclear genome sequence.
All variants that predict sensitivity, resistance, or toxicity to a specific therapy; that alter the function of a gene that can be targeted by approved or investigational drugs or is included in clinical trials; or that can influence disease prognosis, assist in diagnosing cancer, or be used for early cancer detection may be included in the report separately. All clinically relevant information for that tumor type should be mentioned, including pertinent negatives, that is, relevant variants that were not detected for that tumor type. When reporting a variant, reference sequence databases, population databases, cancer-specific databases, and constitutional variant databases should be considered along with in silico (computational) tool predictions and relevant publications on functional aspects of the variant. Reports should be static and carry the date of issue, as medical knowledge is known to change rapidly.
Levels of evidence: somatic variants should be categorized based on the level of evidence into four tiers. Clinical and experimental evidence labeled from A to D is used to classify these tiers as shown in [Table 2]. Tier I variants have strong clinical significance and have approved therapy for that tumor type or are supported by well-powered studies. Tier II variants have therapies approved for other tumor types or are supported by preclinical trials or case reports. Tier III variants are of unknown significance, and Tier IV variants are benign or likely benign and are usually not included in the report.
Table 2 Tier-based reporting categories based on clinical and/or experimental evidence
| Tier | Evidence and criteria | Examples |
|---|---|---|
| Tier I: variants of strong clinical significance | Level A evidence: variants with approved therapy included in professional guidelines. Level B evidence: variants with well-powered studies and having consensus from experts in the field. | 1. BRAF V600E predicts response to the approved drug vemurafenib in melanoma. 2. KRAS mutations predict resistance to anti-epidermal growth factor receptor monoclonal antibodies in colorectal cancer. |
| Tier II: variants of potential clinical significance | Level C evidence: variants with approved therapies for different tumor types or investigational therapies. Level D evidence: preclinical trials or a few case reports without consensus including the variant. | Alpelisib is approved for the PIK3CA exon 9 p.E545K mutation in hormone-positive breast cancer patients only; if found in other cancer types, it would be a Tier II variant with level C evidence. |
| Tier III: variants of unknown clinical significance | Not observed at a significant allele frequency in the general or specific subpopulation databases, or pan-cancer or tumor-specific variant databases. No convincing published evidence of cancer association. These variants should not have been observed at significant allele frequencies in the general population, such as in the 1000 Genomes Project database, Exome Variant Server, or Exome Aggregation Consortium database. | |
| Tier IV: benign or likely benign variants | Observed at significant allele frequency in the general or specific subpopulation databases. No existing published evidence of cancer association. Most reports usually do not include these variants. | |
While reporting a somatic variant, it is mandatory to include the complete details of the variant as per standard international nomenclature guidelines, allelic fraction, level of evidence, and classification.
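The mapping between evidence levels and the first two tiers described above is mechanical, so it can be sketched as a lookup. The function name below is hypothetical. Tiers III and IV are left out on purpose, because they depend on population frequency and published evidence rather than on a level A to D label.

```python
def tier_from_evidence(level: str) -> str:
    """Map an evidence level (A-D) to its reporting tier.

    Levels A and B correspond to Tier I; levels C and D correspond to Tier II.
    """
    normalized = level.strip().upper()
    if normalized in ("A", "B"):
        return "Tier I"
    if normalized in ("C", "D"):
        return "Tier II"
    raise ValueError(f"unknown evidence level: {level!r}")


if __name__ == "__main__":
    for lvl in ["A", "b", "C", "D"]:
        print(lvl, "->", tier_from_evidence(lvl))
```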
Variant Allele Fraction
VAF is the number of sequence reads matching a specific DNA variant divided by the overall coverage at that locus, expressed as a percentage. The presence of normal cells in the sample and tumor heterogeneity influence VAF. A somatic assay is generally validated to detect VAFs as low as 5 to 10%. A VAF above 50%, when all contributing quality factors including tumor purity, coverage, etc. align, could raise suspicion of a germline variant in the patient. Caution must be exercised as paired tumor-normal testing is not a common practice in the field at present. A region with loss of heterozygosity (LOH) could also falsely elevate VAF. It is mandatory to discuss complex profiles with variable VAFs in a molecular tumor board before initiating treatment.
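The VAF definition above is a simple ratio, sketched below in Python. The function names and the default 50% threshold parameter are illustrative assumptions; the flag only mirrors the "raise suspicion" wording of the text and is not a germline caller, since tumor purity, LOH, and coverage all need to be weighed by a reviewer.

```python
def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """VAF in percent: variant-supporting reads divided by coverage at the locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return 100.0 * alt_reads / total_reads


def raises_germline_suspicion(vaf_percent: float, threshold: float = 50.0) -> bool:
    """True when the VAF exceeds the threshold (hypothetical default of 50%)."""
    return vaf_percent > threshold


if __name__ == "__main__":
    vaf = variant_allele_fraction(alt_reads=45, total_reads=300)
    print(vaf)  # 15.0
    print(raises_germline_suspicion(vaf))  # False
```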
Actionable Variations and Therapeutic Choices
An NGS report would certainly contain genomic or transcriptomic variations that may or may not be “actionable” with available targeted therapeutic agents. Nevertheless, all variations are significant from a tumor biology perspective, if not from a clinical standpoint.
It is now common practice for laboratories to flag U.S. Food and Drug Administration-approved and off-label therapeutic agents against “actionable” variants. However, the working group disagrees with this practice, as it generates confusion among oncologists and patients in the clinic.
It is therefore advisable to reserve the therapeutic information solely for clinicians and issue it as a separate document with the report.
Clinical Trials
Certain variants reported may be under active prospective investigation in a registered randomized controlled trial (RCT). A list of the most appropriate RCTs is commonly reported alongside an NGS report. However, all the listed RCTs are performed in countries other than India. This information is thus of limited use to our patients and is a source of exasperation.
Within India
The working group recommends clinicians and principal investigators create a common database or Web site containing information on active and recruiting RCTs within the country that is accessible to all. This information is practically more valuable than RCTs in foreign countries.
Outside India
RCTs originating from other countries have secondary value but could be useful for investigating rare variants and for patients who can access treatment from a different country.
Allied Variables in an NGS Report
• Tumor mutational burden (TMB) is defined by the National Cancer Institute (NCI) as “the total number of mutations (changes) found in the DNA of cancer cells.” Alternatively, it “is a numeric index that expresses the number of mutations per megabase (muts/Mb) harbored by tumor cells in a neoplasm.” A high TMB is a predictive biomarker of response to immunotherapy and is a good addition to an NGS report with a larger panel size.
• Microsatellite instability (MSI) is defined by the NCI as “a change that occurs in certain cells (such as cancer cells) in which the number of repeated DNA bases in a microsatellite (a short, repeated sequence of DNA) is different from what it was when the microsatellite was inherited.” Tumors harboring high MSI (MSI-H)/deficient mismatch repair (dMMR) are likely to benefit from immunotherapy. MSI is therefore a valuable biomarker to be added to an NGS report with a larger panel size.
• Homologous recombination deficiency (HRD): owing to the complexity of the concept of HRD, it is crucial to understand the homologous recombination repair (HRR) pathway and the genes involved in it. A recent definition of HRD is “a phenotype that is characterized by the inability of a cell to effectively repair DNA double-strand breaks using the HRR pathway.” It is a crucial biomarker for initiation of PARP inhibitors or platinum-based chemotherapy, apart from being a prognostic marker for certain cancer types. Clinically, HRD is now restricted to loss of function of BRCA proteins or a BRCA-like (BRCA-ness) genotype. However, genomic LOH, telomeric allelic imbalance, and large-scale state transitions, a combination used to assess genomic instability/genomic scars, are being utilized by a few laboratories based on the SOLO1 trial, as companion diagnostics. Any laboratory reporting markers suggestive of HRD is required to perform extensive validation of scores after appropriate choice of markers. The validation data must be made available to the clinician upon request. Any report with a positive HRD status must be presented to a molecular tumor board or specialist for in-depth analysis of genotype–phenotype correlation.
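Of the allied variables above, TMB is the one that reduces to plain arithmetic: the muts/Mb index quoted in the first item is a count divided by the size of the sequenced region. The sketch below assumes the laboratory already has an eligible mutation count and the panel's sequenced footprint in base pairs; both parameter names are hypothetical, and which mutations are eligible to be counted is a validation decision outside this example.

```python
def tumor_mutational_burden(mutation_count: int, panel_size_bp: int) -> float:
    """Mutations per megabase (muts/Mb) over the sequenced region."""
    if panel_size_bp <= 0:
        raise ValueError("panel_size_bp must be positive")
    if mutation_count < 0:
        raise ValueError("mutation_count cannot be negative")
    return mutation_count * 1_000_000 / panel_size_bp


if __name__ == "__main__":
    # Toy numbers: 12 counted mutations over a 1.2 Mb panel footprint.
    print(tumor_mutational_burden(12, 1_200_000))
```

No cut-off for "high" TMB is applied here because the text does not state one.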
Variant of Uncertain Significance/Variant of Unknown Significance/Unclassified Variant
Variant of unknown significance (VUS) is defined by NCI as “A change in a gene's DNA sequence that has an unknown effect on a person's health.” It is important to understand that the effect is clinically uncertain but biologically certain in most cases. A VUS need not be considered for precision oncology purposes, but it must be reported in all cases and noted in the case of germline assays. A VUS must be revisited by the concerned laboratory every 6 months to check for changes in classification status. Any change found must be communicated to the clinician and the patient.
Reporting of Whole Blood-Based NGS Assay
Reporting Germline Variants
Usually, a lower depth of coverage is acceptable for germline testing because most of the variants are either homozygous or heterozygous. A minimum coverage of 30× is usually sufficient for germline testing. These reads should be balanced between the forward and reverse directions. NGS analysis on tissue cannot distinguish between somatic and germline variants unless paired germline samples are used. When reporting a somatic NGS panel, germline mutations should be suspected when the VAF is 0.5 to 1.0, keeping in mind the cellularity of the tumor tissue in the sample. Such patients should be advised to undergo clinical confirmation with germline samples following genetic counseling and proper consent. This applies especially to genes that are established causes of hereditary cancer syndromes and have established guidelines for clinical surveillance, such as BRCA1, BRCA2, or Lynch syndrome gene variants.
Clinical reports are the end products of germline laboratory testing and therefore, effective reports are concise, yet easy to understand. Reports should be written in clear language and should contain all the essential information about the test performed, including tabulated results, their interpretation, references, methodology, and appropriate disclaimers. These reporting elements are also covered by CLIA regulations and CAP laboratory standards for NGS clinical tests. To this end, several guidance documents and templates have been developed for reporting in accordance with the ACMG laboratory standards for NGS tests.
The methods and types of variants detected by the assay or genetic test should be provided in the report. Assay limitations for variant detection should also be noted. The methods section should include details of nucleic acid capture (e.g., polymerase chain reaction [PCR], targeted capture, or whole genome amplification) as well as techniques used to analyze the germline DNA (e.g., bi-directional Sanger sequencing, NGS, etc.) as this could provide necessary details to the health care provider for the need to carry out additional follow-up genetic tests. For example, WGS offers a thorough study of the complete genome, is objective and future-proof, but is more expensive and analytically challenging. Targeted sequencing, on the other hand, is less expensive, achieves more sensitivity, and completes the analysis faster, but it is biased and offers only a limited amount of future-proofing. This applies to tissue-based assays as well. The laboratory conducting the test may choose to add a disclaimer that addresses general pitfalls in testing such as sample quality.
Given the rise in the number of variants detected by genetic tests, presenting the variant and its associated information in a tabular format may be best for conveying crucial information. These components must include the following but do not have to follow the given order of presentation: gene name; variant nomenclature at the genomic, cDNA, and protein level; exon; zygosity; disease (if known in the Online Mendelian Inheritance in Man [OMIM] or ORPHA database); mode of inheritance; and variant classification. Parental origin could be included if the details are available. Additionally, if specific variants are being analyzed in genotyping or sequencing tests, the laboratory should note the variants interrogated with their full description, historical nomenclature, and family history context if available.
The interpretation should contain the evidence supporting the variant classification according to the ACMG-AMP classification system, which would stratify variants into one of the five categories: pathogenic, likely pathogenic, variant of uncertain significance, likely benign, and benign. It is imperative to state whether the identified variants are likely to explain the patient's phenotypes fully or partially. The interpretation section should provide details of all variants described in the results section but may contain additional information such as whether the variant has previously been reported in the literature, present in disease or control databases, and minor allele frequency in healthy population databases. The additional information described in the interpretation section could include a summary of the results of in silico analyses and evolutionary conservation analyses. A discussion of decreased penetrance and variable expressivity of the disorder, if relevant and available, should be included in the final report. The report should also include any recommendations for clinicians for supplemental clinical testing and variant testing of other family members for segregation analysis, mode of inheritance, and variant re-classification. The references, if any, that contributed to the classification should be cited where discussed and listed at the end of the report.
Technical Aspects
For somatic panels, the optimal depth of sequencing (DOS) depends on the limit of detection (LOD), or sensitivity, that the assay aims for. Analytical validation using standard reference materials from SeraCare/Horizon Discovery/National Institute of Standards and Technology/Coriell Institute, as well as proficiency testing material from CAP or the European Molecular Genetics Quality Network, are some options available to standardize an assay for its LOD or sensitivity. Once the LOD is established by analytical validation, it needs to be reproduced with clinical validation using patient samples. Serial dilution of DNA or RNA, followed by library preparation of these serial dilutions, is one of the standard approaches used to derive the clinical sensitivity of the NGS assay for somatic variant calling.
• Overall coverage: the overall coverage of a panel is measured as the mean coverage across the genomic region sequenced as part of the targeted panel. This is often measured at different depths, such as 1X, 10X, 50X, and 100X, for somatic variant detection. As per the AMP/CAP, a minimum of 200X mean coverage depth is recommended to achieve a LOD of 5% for somatic variant calling.
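One way to see why depth and LOD are linked is a simple binomial model: if each read at a locus independently carries the variant with probability equal to the VAF, the chance of seeing enough supporting reads grows with depth. The sketch below makes several assumptions that are not from the text: reads are independent, sequencing error is ignored, and the caller needs at least `min_alt_reads` supporting reads (a hypothetical parameter, set to 5 in the usage example).

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be a fraction between 0 and 1")
    # Sum the probabilities of seeing fewer than min_alt_reads variant reads.
    p_too_few = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    for depth in (50, 100, 200, 500):
        p = detection_probability(depth, vaf=0.05, min_alt_reads=5)
        print(f"{depth}X: {p:.3f}")
```

Under these simplifying assumptions, the probability of detecting a 5% VAF variant is clearly higher at 200X than at 50X, which is consistent with the direction of the recommendation above; real assays must establish this empirically through validation.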
Bioinformatic Pipelines
NGS requires extensive bioinformatic support for generation of a report. Validation and standardization of the dry laboratory segment are as important as those of the wet laboratory segment. The GATK pipeline developed by the Broad Institute is one of the standard methods used in the clinical setting. Assessing the quality of the sequencing data before analysis is crucial to guarantee accurate results. Sequencing data quality can be assessed using QC measures like Phred scores, read length, and base quality scores. FastQC and QualiMap are two QC tools that are often utilized. After a basic assessment of the NGS data to retain good-quality reads that have a Phred score above 30, the raw reads are subjected to sequence alignment or mapping to the reference genome (healthy individual's genome).
Phred quality score (Q score): a quality indicator used with sequencing-by-synthesis NGS chemistry. The metric reflects the accuracy of the base called by the platform.
Calculation: Q = −10 × log₁₀(P), where P is the probability of error in base calling.
A Q score of 30 is ideal: the probability of a wrong base call is 1 in 1,000, and the accuracy of a base call is 99.9%.
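The Q score formula above can be checked numerically with a short Python sketch. The function names are hypothetical. The last helper additionally assumes the common Phred+33 ASCII encoding of FASTQ quality strings, which the text does not discuss.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Invert Q = -10 * log10(P) to recover the base-call error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Q = -10 * log10(P) for an error probability 0 < P <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


def fastq_char_to_phred(char: str, offset: int = 33) -> int:
    """Decode one FASTQ quality character, assuming Phred+33 encoding."""
    return ord(char) - offset


if __name__ == "__main__":
    p = phred_to_error_probability(30)
    print(f"Q30 error probability: {p:.4f}, accuracy: {100 * (1 - p):.1f}%")
    print(fastq_char_to_phred("?"))  # '?' is ASCII 63, so Q = 30
```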
Read alignment to the reference genome using tools like BWA, Bowtie2, or STAR, which generates the SAM and BAM files, underpins the calling of true SNVs and indels; its alignment scoring metrics make it an important step that needs to be thoroughly validated and verified. Following read alignment, the next step is to search the aligned reads for genetic variants like SNPs or insertions/deletions (indels). By comparing the aligned reads to the reference genome, variants can be found using variant-calling software like GATK; other tools, like VarScan and Strelka2, are also used to verify GATK's results. The output of this step is a VCF file containing the discovered variants.
The next step after variant calling is to annotate the variants to ascertain their clinical relevance. This step entails determining whether the discovered variants are known to be pathogenic, benign, or of unknown significance by comparing them to databases like ClinVar, COSMIC, or dbSNP. This stage can be completed using annotation software like ANNOVAR, SnpEff, or VEP. Once the variants have been annotated, the results are interpreted and a clinical report is produced. Assessing the importance of the discovered variants may entail analyzing the patient's clinical background and other pertinent data. A clinical report can be produced using reporting software like Ingenuity, Varsome, or Opal Clinical. Report generation and interpretation are discussed extensively elsewhere in this document.
It is crucial to remember that this is a condensed overview of the data analysis portion of a clinical NGS pipeline and that the precise tools and methodologies employed can change depending on the laboratory and sequencing platform. Many pipelines will also incorporate extra phases like QC filtering, rare variant filtering, and variant pathogenicity evaluation.
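To make the hand-off between variant calling and annotation described above more concrete, the sketch below reads the fixed columns of VCF data lines and looks each variant up in an in-memory table. Everything here is a toy assumption: the coordinates are placeholders, the `known` dictionary stands in for a real resource such as ClinVar, and the function names are hypothetical. It does not call any of the tools named in the text.

```python
from typing import Dict, Iterable, List, Tuple

# (chromosome, 1-based position, reference allele, alternate allele)
VariantKey = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> List[VariantKey]:
    """Extract CHROM, POS, REF, and each ALT allele from VCF data lines."""
    variants: List[VariantKey] = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header row
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALTs with commas
            variants.append((chrom, int(pos), ref, allele))
    return variants


def annotate(
    variants: Iterable[VariantKey], known: Dict[VariantKey, str]
) -> Dict[VariantKey, str]:
    """Attach a classification from the lookup table, if one exists."""
    return {v: known.get(v, "not found in lookup table") for v in variants}


if __name__ == "__main__":
    toy_vcf = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tT,G\t50\tPASS\t.",
    ]
    toy_known = {("chr1", 100, "A", "T"): "pathogenic (toy entry)"}
    for variant, label in annotate(parse_vcf_lines(toy_vcf), toy_known).items():
        print(variant, label)
```

In practice, a maintained VCF library and a versioned annotation source would replace both functions; the sketch only shows the shape of the data passed between the two steps.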
Validation
Validation is an important aspect of a clinical NGS test and must be completed before the test is implemented in routine clinical practice. Assay validation should follow the recommendations of ISO 15189 (for medical laboratories), which aligns with the requirements of CAP and, in India, the National Accreditation Board for Testing and Calibration Laboratories (NABL).
Optimization and Familiarization Process
The sample type and the number of samples in each category are important parameters to decide before any validation is initiated. The scope of the test determines the choice of samples and the genomic alterations to be verified in the validation samples. A minimum of 20 clinical samples (unique clinical data points) is required for clinical validation. In any general validation study, the first step is to establish the robustness of the assay (familiarization) by verifying the reagents and consumables and their performance. The minimum amount of nucleic acid required to obtain the desired result (true positivity) is also established at this stage. Establishing the clinical accuracy of the test results is the next important step. This is achieved by inter-laboratory comparison of results from clinical specimens, often in collaboration with a CAP- or NABL-certified laboratory in India. Beyond this, the analytical and clinical specificity, sensitivity/limit of detection (LOD), repeatability, and reproducibility of the test need to be established.
As part of this validation, the reproducibility and accuracy of the test must also be demonstrated through inter-run, intra-run, and inter-individual (testing personnel) analysis.
CAP provides data specific to the different sequencing technology platforms (by Illumina or Thermo Fisher Scientific). This blinded survey helps assess bioinformatics pipelines/workflows, which vary across laboratories, on some important factors: (1) basic QC, which ensures that only good-quality sequencing data are considered for downstream analysis; and (2) variant calling for both somatic and germline workflows.
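The performance characteristics named above are usually summarized from counts of true and false calls against a reference result. The following is a minimal sketch of that arithmetic, not a prescribed method: the function names are illustrative, and treating inter-run concordance as the shared fraction of all variants called in two runs (a Jaccard index) is an assumption; laboratories may define repeatability and reproducibility differently.

```python
def validation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Summarize assay performance from true/false positive and negative counts."""
    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a category has no observations.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positives among all real variants
        "specificity": ratio(tn, tn + fp),   # true negatives among all real negatives
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def concordance(run_a: set[str], run_b: set[str]) -> float:
    """Fraction of variants shared by two runs (Jaccard index; an assumed definition)."""
    union = run_a | run_b
    return len(run_a & run_b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(validation_metrics(tp=95, fp=2, tn=898, fn=5))
    print(concordance({"var1", "var2", "var3"}, {"var2", "var3", "var4"}))
```

The same `concordance` helper can be applied to intra-run replicates, to separate runs, or to runs performed by different personnel.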
Basic Assay Validation
Platform
There are two major technology platforms for massively parallel sequencing of DNA and RNA: sequencing by synthesis (SBS; Illumina) and Ion Torrent semiconductor sequencing (Thermo Fisher Scientific).
Method (Amplicon-Based/Hybrid Capture)
Targeted sequencing of the region of interest can be achieved by two methods: (1) a PCR amplicon-based approach and (2) a hybrid capture-based approach.
For hotspot panels, where the regions of interest are predetermined, the amplicon-based approach is a scalable and inexpensive option. It was one of the early methods of choice for NGS panels in oncology and has seen high success rates with all types of FFPE DNA (ranging from 70 bp [heavily degraded sample] to 2,000 bp [average fragment size from a good-quality processed FFPE tissue block]).
Several critical factors influence the success of variant detection from amplicon sequencing. Ideally, the primers are designed so that the variant of interest lies in the middle of the PCR product. The multiplex PCR step is the heart of the amplicon-based approach and is critical to any successful validation. Primer design plays an important role, and a nonoverlapping PCR product/staggered design is one of the strategies adopted to ensure complete coverage of the region of interest when the panel is aimed at sequencing the complete CDS, or complete exons in the case of hotspots. In amplicon-based panels, this is measured by verifying the uniformity of PCR amplification across primer pairs in the multiplex PCR. The choice of enzyme, buffers and their molarity, and additives used in the multiplex PCR master mix determines the PCR yield. When the panel size increases beyond 1 Mb, the amplicon-based approach, although technically feasible, may not be economically viable compared with the probe-based approach.
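One way to quantify uniformity of amplification is to compare the read depth of each amplicon with the mean depth across the panel. The sketch below is an illustration only: the 0.2 × mean cutoff is an assumed, commonly quoted convention rather than a value specified in this guideline, and the amplicon names and depths are hypothetical.

```python
from statistics import mean


def amplicon_uniformity(depths: dict[str, float],
                        fraction_of_mean: float = 0.2) -> tuple[float, list[str]]:
    """Return the fraction of amplicons at or above a depth cutoff, plus the dropouts.

    The cutoff is `fraction_of_mean` times the mean amplicon depth
    (0.2 is an assumed default, to be set by the laboratory).
    """
    if not depths:
        raise ValueError("at least one amplicon depth is required")
    threshold = fraction_of_mean * mean(depths.values())
    # Amplicons below the cutoff point to primer pairs that amplify poorly.
    low = sorted(name for name, depth in depths.items() if depth < threshold)
    uniformity = 1.0 - len(low) / len(depths)
    return uniformity, low


if __name__ == "__main__":
    # Hypothetical per-amplicon read depths from one multiplex PCR pool.
    example = {"amp1": 1000, "amp2": 900, "amp3": 1100, "amp4": 100}
    print(amplicon_uniformity(example))
```

Amplicons flagged as low by such a check are candidates for primer redesign or for rebalancing within the multiplex pool.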
In the probe-based approach, the template DNA is PCR-amplified to increase the number of template copies. The regions of the genes of interest are then captured by a hybridization assay. The captured template copies are further enriched by PCR, and the resulting libraries are subjected to NGS. Unlike in the amplicon-based method, where a wide range of PCR amplicons may be generated, in the probe-based approach the template remains untouched after the initial fragmentation step. Probes produced by the available commercial design algorithms generally range from 60 to 120 bp in length.
Probe-based chemistry can be challenging with poor-quality FFPE specimens. This may be due to adduct formation in the tumor DNA, which leads to improper probe binding, or to the presence of other impurities that interfere with the hybridization process.
Description in the Final Validation Report
A brief description of the validation process, or the peer-reviewed publication containing it, must be included separately toward the end of every NGS report.
Raw Data Storage, Consent for Reuse, Traceability
The storage of sensitive data and patient consent for use of the generated data beyond the purview of the primary indication are important concerns to be addressed by the stakeholders and end users of the technology. We recommend that the laboratory and health care facility adhere to the clauses of the Digital Personal Data Protection Act 2023, published in the Gazette of India as CG-DL-E-12082023–248045 under the Ministry of Law and Justice (https://www.meity.gov.in/writereaddata/files/Digital%20Personal%20Data%20Protection%20Act%202023.pdf).
Universal sharing of de-identified genetic data promotes the exchange of information, research, and the identification of specific mutations/alterations in different ethnic groups. Such information, centralized and shared internationally, could help identify new pathogenic genetic alterations/targets for future drug research. This will also influence daily practice in the community.
The issues with such data storage, such as identifier anonymization, consent of the patient/carrier, quality of stored data, authorities overseeing data storage and transfer, encrypted access to data, and logistics, are to be addressed by a national body, as they are outside the purview of this guideline.