Work Package 3
Lead Beneficiary BRFAA
Production of transcriptomics/epitranscriptomics, proteomics, peptidomics datasets for elucidation of MM onset and progression.
Description of work:
Representative samples (total n=180) from the following groups will be analysed by omics approaches: MGUS (n=60), sMM (n=60) and active MM (n=60). The MGUS and sMM groups will each comprise two subgroups: 30 patients who remained stable and 30 who progressed to active MM, based on at least 5 years of follow-up.
The analysis of bone marrow plasma cell samples by transcriptomics and proteomics will provide insights into the molecular features associated with the onset and progression of MM.
Analysis of the urine samples (n=180) will allow the creation of peptide biomarker panels for monitoring MM onset and progression.
Lists of omics molecular features with quantity ratios will be generated for changes associated with the onset of active MM by comparing: i) MGUS with sMM, MGUS with MM, and sMM with MM; ii) samples from MGUS progressors with samples from MGUS individuals who did not develop active MM; iii) samples from sMM progressors with samples from sMM patients who did not develop active MM.
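As an illustration only, the comparisons listed above can be written as explicit contrasts, with one quantity ratio computed per molecular feature. The sketch below assumes hypothetical subgroup labels (`MGUS_stable`, `MGUS_progressor`, and so on) and defines the ratio as the quotient of group mean abundances. The work package does not prescribe a ratio definition, so this is one possible choice rather than the project's method.

```python
from statistics import mean

# Hypothetical subgroup labels; the work package defines the groups, not these names.
GROUPS = {
    "MGUS": ["MGUS_stable", "MGUS_progressor"],
    "sMM": ["sMM_stable", "sMM_progressor"],
    "MM": ["MM"],
}

# (numerator, denominator) pairs mirroring comparisons i) to iii).
CONTRASTS = [
    ("sMM", "MGUS"),
    ("MM", "MGUS"),
    ("MM", "sMM"),
    ("MGUS_progressor", "MGUS_stable"),
    ("sMM_progressor", "sMM_stable"),
]


def members(label):
    """Expand a group label into its subgroup labels (a subgroup maps to itself)."""
    return GROUPS.get(label, [label])


def quantity_ratio(abundances, labels, numerator, denominator):
    """Ratio of mean abundance between two groups for a single feature.

    Assumption: ratio of arithmetic means; medians or log-scale
    differences would be equally valid choices.
    """
    num = [a for a, l in zip(abundances, labels) if l in members(numerator)]
    den = [a for a, l in zip(abundances, labels) if l in members(denominator)]
    return mean(num) / mean(den)


def all_ratios(abundances, labels):
    """Quantity ratios of one feature for every contrast."""
    return {
        f"{num}/{den}": quantity_ratio(abundances, labels, num, den)
        for num, den in CONTRASTS
    }
```

Calling `all_ratios` once per feature (transcript, protein or peptide) would give one row of the ratio list for that feature.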
Task 3.1: Transcriptomics analysis of Bone Marrow Plasma Cells (NKUA, M6-30)
Following RNA extraction and ribosomal RNA depletion in the 180 individual samples, RNA libraries will be constructed and subjected to next-generation sequencing (NGS). Initial RNA-seq data will be analysed through a standardized and well-documented pipeline combining several bioinformatic tools. Raw reads will be aligned to the human reference genome using Partek® Flow®, followed by assembly of long RNA molecules and quantification of mRNA, long non-coding RNA and small RNA (focusing on microRNA) molecules using Partek® Genomics Suite®. Once read count tables have been obtained for all RNA-seq data, differential gene expression will be determined with DESeq2 on the basis of the available clinical annotation, and the RNA molecules composing the molecular signatures will be selected accordingly. Targeted third-generation (long-read) sequencing based on nanopore technology, with bioinformatic analysis using in-house Perl scripts, will be used to determine the full-length sequences of selected mRNA and long non-coding RNA molecules and to identify alternative splicing products of genes of interest. Quantitative real-time PCR arrays and/or multiplex digital PCR will be used to quantify each component of the RNA signature. Additionally, mRNA and microRNA sequencing data will be integrated using Partek® Pathway™ to explore complex biological relationships and pathways between genes.
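DESeq2 is an R/Bioconductor package, and the differential expression analysis itself would be run there. Before testing, it scales each sample's read counts by a median-of-ratios size factor. The Python sketch below illustrates only that normalization step on a plain count table, to show what happens between the read count tables and the statistical test. The function names are hypothetical and the code is not a substitute for DESeq2.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw read counts (one per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue
        # Log of the geometric mean of this gene across all samples.
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # One factor per sample: the median ratio to the per-gene reference.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide every count by the size factor of its sample."""
    factors = size_factors(counts)
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]
```

In a toy table where one sample was sequenced twice as deeply as another, the two size factors differ by a factor of two and the normalized counts of unchanged genes coincide.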
Associated deliverables: D3.1
Task 3.2: Proteomics analysis of Bone Marrow Plasma Cells (BRFAA, M6-30)
Bone marrow plasma cell (BMPC) samples will be analyzed by BRFAA via nanoflow chromatography coupled to a Bruker timsTOF Pro 2 mass spectrometer operated in Data Independent Acquisition (DIA) mode. Abundance will be determined by label-free quantification, and protein functional annotation will be performed using multiple bioinformatics tools, as applicable, to place the findings within the context of current MM knowledge and to predict dysregulated processes. Lists of proteins with quantity ratios will be generated for changes associated with the onset of symptomatic MM by performing the comparisons mentioned above for the MGUS, sMM and MM groups.
Associated deliverables: D3.2
Task 3.3: Epitranscriptomics analysis of Bone Marrow Plasma Cells (CNRS, M6-30)
After BMPC RNA purification, detection and quantification of RNA modifications (26 modified nucleosides) by mass spectrometry (LC-MS/MS) will be conducted by the clinical proteomics platform of the University Hospital of Montpellier. A total of 400 ng of RNA will be digested with 5 U of RppH. Decapped mRNA will then be digested with 1 U of Nuclease P1. Dephosphorylated nucleotides will then be filtered and the solution will be injected into the LC-MS/MS system. The nucleosides will be separated by reverse-phase ultra-performance liquid chromatography on a C18 column with online mass spectrometry detection using an Agilent 6490 triple-quadrupole LC mass spectrometer in multiple reaction monitoring (MRM) positive electrospray ionization (ESI) mode.
Using this approach, CNRS will decipher epitranscriptomic modifications in 180 individual samples covering MGUS, sMM and MM. The data will be compared with the epitranscriptomic characterization of the normal B-cell to plasma-cell differentiation stages. Using RNA-seq data, we will characterize the enzymes involved in epitranscriptomic remodeling during myelomagenesis and during MGUS/sMM progression to symptomatic MM. Mass spectrometry-based quantification of RNA marks will be performed to monitor the evolution of the epitranscriptome in myelomagenesis and disease progression, and will be integrated with the other omics data in order to identify the downstream transcriptional impact involved in the progression to active MM.
Based on the most significant RNA modifications identified, we will perform MeRIP (methylated RNA immunoprecipitation) sequencing to analyze the target RNAs involved in MGUS/sMM progression to MM.
Associated deliverables: D3.3
Task 3.4: CyTOF and multispectral flow cytometric analysis of the Bone Marrow microenvironment-immune system interplay (BRFAA, UKW, NKUA, M6-30)
In total, we will use CyTOF technology to analyse 150 bone marrow biopsies with matched PBMC samples (n=50 per group). Power calculations indicated that a sample size of at most 50 per comparison group would allow detection of a 20-30% difference in cellular abundance between the MGUS, sMM and MM cohorts with 90% power, assuming a 50% CV. This sample size will also ensure robustness for the algorithms we plan to use, such as CITRUS, which requires a minimum of 8-10 samples per comparison group. Furthermore, a sample size of 50 is comparable to those used in other proteomics and transcriptomics studies. Cells will be processed and stained with a pre-validated high-dimensional mass cytometry panel for the detection of targets on malignant cells, the bone marrow microenvironment and immune cells. We will scale the yield of acquired cells to obtain at least 100 events for the rarest population to be examined. Samples with low cell recovery and low viability (<85%) will be excluded from the analysis. Stained cells (in batches of 5-10 samples) will be acquired on a Helios CyTOF instrument in the BRFAA mass cytometry unit. Acquired data will be normalized and cleaned on the basis of the Gaussian parameters. Nucleated live singlet events will then be selected to feed the downstream analyses. We will perform dimensionality reduction, clustering and differential analysis using widely used and verified algorithms such as viSNE, FlowSOM, PhenoGraph, SPADE, CATALYST, diffcyt and CITRUS.
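The relationship between group size, CV, power and detectable difference can be sketched with a normal-approximation formula for comparing two group means. The work package does not state which test, significance level or sidedness was used in the power calculations, so `alpha`, `two_sided` and the two-sample setting below are assumptions, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_difference(n_per_group, cv, power=0.90, alpha=0.05,
                              two_sided=True):
    """Smallest relative difference in means detectable between two groups.

    Normal approximation for two independent groups of equal size and
    equal standard deviation, with SD expressed as cv * mean. The result
    is a fraction of the reference mean (0.30 means 30%).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    z_beta = z.inv_cdf(power)
    return cv * (z_alpha + z_beta) * sqrt(2 / n_per_group)


if __name__ == "__main__":
    # Settings quoted in the text: 50 samples per group, 50% CV, 90% power.
    for two_sided in (True, False):
        d = min_detectable_difference(50, 0.50, power=0.90, two_sided=two_sided)
        print(f"two_sided={two_sided}: {d:.1%}")
```

The detectable difference shrinks with the square root of the group size, so it is sensitive to the assumed CV and to the choice of a one-sided or two-sided test.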
Specific multicolor panels will be designed for validation of the CyTOF results by multispectral flow cytometric analysis in an independent cohort of 150 samples (50 MGUS, 50 sMM, 50 MM).
Associated deliverables: D3.4
Task 3.5: Peptidomics analysis of urine samples (MOS, ATOS, M6-30)
Urine samples (n=180) will be analysed by CE-MS at MOS. Sample preparation will follow the established SOP. CE-MS analysis and data processing are performed according to ISO 13485 standards, yielding quality-controlled urinary datasets. The resulting data are highly comparable and are integrated into the database system, which allows comparisons across a wide spectrum of clinical applications. Peptides are deposited, matched, and annotated in a Microsoft SQL database for further analysis. Peptide sequences are determined using a Q Exactive mass spectrometer. Protein matching and data analysis are based on Proteome Discoverer 2.4. The amino acid sequences are matched to the ion peaks acquired by CE-MS through mass correlation with liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis. The peptide identifications obtained are further validated by assessing their charge and CE migration time.
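The mass-correlation step can be pictured as assigning each CE-MS peak to the sequenced peptide whose mass lies closest within a tolerance. The sketch below is a simplified illustration under stated assumptions: the 50 ppm tolerance, the function names and the example sequences are hypothetical, and the charge and migration-time validation described above is not modelled.

```python
def ppm_error(observed_mass, reference_mass):
    """Signed mass deviation in parts per million."""
    return (observed_mass - reference_mass) / reference_mass * 1e6


def match_peaks(ce_ms_masses, sequenced_peptides, tol_ppm=50.0):
    """Assign each CE-MS peak mass to the closest sequenced peptide.

    ce_ms_masses: masses (Da) of peaks detected by CE-MS.
    sequenced_peptides: list of (sequence, mass) pairs from LC-MS/MS.
    tol_ppm: assumed tolerance; the text does not specify a value.
    Returns a dict mapping each peak mass to a sequence, or None if no
    sequenced peptide falls within the tolerance.
    """
    matches = {}
    for mass in ce_ms_masses:
        candidates = [
            (abs(ppm_error(mass, ref_mass)), sequence)
            for sequence, ref_mass in sequenced_peptides
            if abs(ppm_error(mass, ref_mass)) <= tol_ppm
        ]
        # Keep the candidate with the smallest absolute deviation.
        matches[mass] = min(candidates)[1] if candidates else None
    return matches


if __name__ == "__main__":
    peaks = [1000.01, 1500.0]
    library = [("PEPTIDEA", 1000.0), ("PEPTIDEB", 2000.0)]  # toy entries
    print(match_peaks(peaks, library))
```

Peaks left unmatched stay in the dataset as unsequenced features and can still be compared between groups by their mass and migration time.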
CE-MS analysis to develop urinary peptide biomarker panels (classifiers)
Case-control statistical comparisons will be conducted to identify significant peptide biomarkers specific to each clinical scenario. The datasets will be divided into discovery and validation sets according to the ‘2/3 – 1/3 rule’. The peptide profiles will be compared for differences in the urinary excretion level of individual peptides by applying the Wilcoxon rank-sum test. Only identified peptides that remain significant (p<0.05 after Benjamini-Hochberg adjustment) are selected for further consideration. For those, a fold regulation threshold of 1.2 is applied to further select for highly valid peptide biomarkers associated with the health-to-disease transition in MM. The urinary peptide marker panel is optimized in the training set using the SVM (Support Vector Machine)-based MosaCluster software (version 1.7.0). Validation is performed in an independent validation set. The sensitivity and specificity estimates for the SVM-based peptide marker pattern are calculated from the number of correctly classified samples. Receiver operating characteristic (ROC) plots and the respective 95% confidence intervals (CI) are calculated in MedCalc 188.8.131.52 (Mariakerke, Belgium). Statistical comparisons of the validation cohort's classification scores between i) symptomatic MM cases and MGUS/sMM controls and ii) MGUS/sMM progressors and MGUS/sMM individuals who did not develop symptomatic MM will be performed by the Kruskal-Wallis test using MedCalc [3,6].
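The peptide selection step (Benjamini-Hochberg adjustment followed by the fold threshold) can be sketched as follows. The raw p-values are assumed to come from the Wilcoxon rank-sum test computed upstream. The input layout and function names are hypothetical, and applying the 1.2 threshold symmetrically (up- or down-regulation) is an assumption, since the text does not say how the threshold treats decreased peptides.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_peptides(peptides, alpha=0.05, min_fold=1.2):
    """Keep peptides passing both the adjusted p-value and fold filters.

    peptides: list of dicts with keys 'id', 'p' (raw p-value) and
    'fold' (case/control ratio of mean excretion level).
    Assumption: a peptide counts as regulated if fold >= min_fold
    or fold <= 1 / min_fold.
    """
    adjusted = benjamini_hochberg([pep["p"] for pep in peptides])
    selected = []
    for pep, q in zip(peptides, adjusted):
        regulated = pep["fold"] >= min_fold or pep["fold"] <= 1 / min_fold
        if q < alpha and regulated:
            selected.append(pep["id"])
    return selected
```

In this sketch the adjustment is computed over all peptides tested in the discovery set, and only the selected identifiers would then be passed on to the SVM-based classifier optimization.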
Associated deliverables: D3.5