Thursday 27 April 2017

Cancer On The Lung

Matthew Meyerson: All right, thank you, Lou, for that very kind introduction, and it's a real honor to be here at this symposium and just to be part of this just wonderful explosion of data generation and data analysis that has been TCGA over the last several years and really coming to more and more fruition this year.

So I'm going to be talking on behalf of, first, my co-chairs, Ramaswamy Govindan and Steve Baylin, both of whom are also here at this meeting. And I'm trying to get to the next slide here. Let's try that. There we go. On behalf of all of our colleagues in the Cancer Genome Atlas Lung Cancer Analysis Working Group, many of the prominent members of whom are named here on this slide in front of us, and I'll be thanking various people individually as we go through the presentation.

I think as all of you are very familiar with, lung cancer accounts for over 25 percent of cancer deaths in the United States each year and is the leading cause of cancer death both in men and in women in the United States. In total, lung cancer kills more than 150,000 Americans each year and more than one million people worldwide. The major lung cancer histologies, as, again, I think most of you are very familiar with, are adenocarcinoma of the lung, squamous cell carcinoma of the lung, and small cell lung carcinoma. We now have projects in each of these areas as part of TCGA, and as most of you know, we've recently published the first squamous cell lung carcinoma marker paper from TCGA in Nature earlier this fall.

Among these cases, lung adenocarcinoma accounts for on the order of 40 percent of lung cancer diagnoses within the United States and about 65,000 deaths per year in the United States, and the percentage may be even a little bit higher worldwide. And so the number of deaths per year from lung adenocarcinoma probably is over 500,000, and while lung cancer is generally associated with smoking, really lung adenocarcinoma, uniquely among lung cancer histologies, does often occur in non-smokers. And lung adenocarcinoma of non-smokers is especially prevalent in women, in younger patients, and in patients from East Asia or of East Asian origin.

Lung adenocarcinoma has really become a paradigm for molecular subtyping, as in recent years the treatments for lung adenocarcinoma have shifted from histology-based strategies to molecular-based strategies. And we've really made major advances in treatment for lung adenocarcinoma with targeted inhibitors of both EGFR, such as gefitinib or erlotinib, and ALK, such as crizotinib, thanks to genomic discoveries. My group was fortunate enough to be able to participate, Bill Sellers, Bruce Johnson and I, in the discovery of EGFR mutations in 2004, along with the Pao and Varmus group, and the groups of Tom Lynch and Daniel Haber. And this is -- shown here is just one example of a patient with lung adenocarcinoma with a somatic EGFR deletion mutant in exon 19, and you can see here multiple nodules in the right side of the lung in this transverse CT section, and then complete clearing of these nodules after two months of erlotinib treatment, a slide from our colleague Bruce Johnson at Dana-Farber Cancer Institute.

There have been a number of previous comprehensive genomic studies of lung adenocarcinoma, and I'll just mention a few of these here. A copy number study by Weir and colleagues, and a mutational study by Li Ding and Gaddy Getz and colleagues, were both part of the Tumor Sequencing Project spearheaded by the NHGRI as a kind of precursor to TCGA, identifying amplifications of genes such as NKX2-1 and TERT, and mutations of NF1, ATM, and APC, among others. The Director's Challenge -- NCI Director's Challenge expression classification project, and a recent report by Ramaswamy Govindan and colleagues with Rick Wilson at Washington University on whole genome sequencing, identifying smoking and non-smoking signatures. And the work of Marcin Imielinski, Alice Berger, and colleagues, working together with my group at Dana-Farber and the Broad, both whole exome and whole genome sequencing, identifying recurrent mutations in RNA splicing genes including RBM10 and U2AF1. And then finally, a recent report from Jun Sung Seo's group in Seoul National University identifying transcriptome alterations including recurrent MET splicing alterations.

So this has led to the definition of a large number of potential therapeutic targets, and this is adapted from a paper by Pao and Hutchinson showing KRAS mutations, EGFR alterations, ALK fusions, and a number of other alterations, ERBB2 mutations, and most recently the identification of ROS1 fusions and the KIF5B-RET fusion reported by numerous groups earlier this year. But it also says to us that the leading driver in many lung adenocarcinomas remains unknown and still remains to be uncovered, and this is one of the major goals of our TCGA effort.

If we look at our current project status, we've actually reached a full accrual of the estimated 500 cases to the BCR, but there were 303 samples for which there were comprehensive molecular data at the time of our data freeze, October 2nd. But really working very closely with Bill Travis at MSKCC, who did the histological confirmation, we excluded many samples due to pathology review, and we're left with 230 samples for the lung adenocarcinoma data freeze. And the remaining cases that were excluded as not being adenocarcinoma will nevertheless be included in a subsequent pan-non-small cell lung cancer study from TCGA that will also include cases initially reported as squamous cell lung carcinoma and excluded.

So we have high quality data across multiple platforms for all of the samples, including DNA sequencing, RNA sequencing, SNP array-based copy number, methylation array data, proteomic analysis, and fusion discovery by low-pass sequencing and RNA sequencing. We expect to have 38 sample pairs with whole genome sequencing data, which I won't be discussing today. Our first face-to-face meeting is tomorrow, and our goal is to prepare data for a manuscript submission sometime early next year.

First, I want to speak about copy number analysis of lung adenocarcinoma, led by Andy Cherniack and Gaddy Getz, as part of the Broad Institute Genome Characterization Center. So I just want to point out, first of all, you can see overall here red is copy number gain, blue is copy number loss, and white is neutral-ish; samples in this dimension, chromosomal position in this dimension. You can see many overall similarities between lung squamous carcinomas and lung adenocarcinomas: gain of chromosome arm 1q, gain of 7, loss of 8p, gain of 8q, loss of 9, et cetera. Probably the most striking difference is that chromosome arm 3q is almost never gained in lung adenocarcinoma and is frequently gained in squamous cell carcinoma.

If we look at focal alterations, first I'll speak about the deletions. The predominant focal homozygous deletion in lung adenocarcinoma is the CDKN2A cyclin-dependent kinase inhibitor gene locus, and if we look at focal amplifications, these include a number of cyclin and cyclin-dependent kinase genes, cyclin D1 and cyclin D3 as well as the cyclin-dependent kinase gene CDK4 itself. It includes the telomerase genes, the catalytic subunit TERT and the RNA subunit TERC. It includes receptor tyrosine kinase genes, EGFR, MET, ERBB2; signal transduction downstream, KRAS; the NKX2-1 lineage-specific transcription factor; and finally, the MYC transcription factor.

If we look at exome and RNA sequencing analysis, led by Juliann Chmielecki and Mara Rosenberg working at Dana-Farber and Broad, Matt Wilkerson at UNC, Marcin Imielinski, Bryan Hernandez, Michael Lawrence, Neil Hayes, and Gaddy Getz, we see first -- this is the famous slide that Gaddy and Mike Lawrence frequently show. It shows the variance in mutation frequency; so different types of cancer across this dimension, mutation frequency on a log scale in this dimension here. And you can see the highest mutation rate tumors are the carcinogen-induced tumors, melanomas, squamous cell lung carcinomas, and lung adenocarcinomas. This very high somatic mutation rate poses a major problem in identifying significantly mutated genes. And we heard from Petar Stojanov at the Broad yesterday about approaches to really identify significantly mutated genes and overcome these challenges, and similarly, we heard today from Nickolay Khazanov from Compendia about other approaches for this.

So just some issues that we see. First of all, known recurrently mutated genes, such as ERBB2 or β-catenin, do not show up as significant in this dataset, at least to date, regardless of the method that we've used. We see a lot of spuriously mutated genes, genes like olfactory receptors. This can be eliminated by a number of signal-to-noise analyses, as well as expression filtering, but I think we'll still need to consider a variety of alternative approaches including inclusion of functional significance analysis and two-stage statistical analysis. Furthermore, in the end, a much larger sample size may be required for elucidation of the full population of lung adenocarcinoma causative mutations.

This is a list of the top 21 mutated genes in lung adenocarcinoma, expression-filtered, generated by Juliann Chmielecki and Mara Rosenberg and their colleagues. And I just want to point out at the top a number of known genes. Most of these were present previously in the Ding and Getz et al. manuscript: TP53, STK11, KRAS, EGFR, RB1, BRAF. So, again, recurrent drivers from that pie chart, KRAS, EGFR and BRAF, are all here. Others were identified by more recent papers such as the Imielinski et al. paper: RBM10, ARID1A, and U2AF1. But I want to highlight here with stars some of the candidate novel genes, and just point out a couple of these here. BCL9L is homologous to the B-cell lymphoma translocated gene, BCL9, and is reported to encode a protein interacting with β-catenin, also frequently mutated in lung and other cancers. MGA encodes a reported suppressor of the MYC pathway, and this gene has recently been reported to be subjected to inactivating mutations in B-cell leukemias and lymphomas. And MKI67IP encodes a protein that interacts with Ki-67, the well-known histological proliferation marker, which is encoded by MKI67, which has been found in the TCGA study to be recurrently mutated in endometrial cancer.

And this is just a kind of diagram of correlation of gene mutations. Just a couple of features that jump out here. First of all, EGFR mutations, frequently insertion/deletion mutations, shown in yellow here, are frequently in samples with low overall mutation rates. p53 in lung adenocarcinoma, as in squamous cell lung carcinoma, as in small cell lung carcinoma, is the most frequently mutated gene. And the other point that I want to make on this slide, though it's a little bit difficult to see here, is mutual exclusion between mutations in KRAS, EGFR, and BRAF. We also see recurrent loss-of-function mutations in SWI/SNF complex chromatin remodeling genes, most notably ARID1A and SMARCA4. SMARCA4 was originally reported to be mutated in lung adenocarcinoma by Montse Sanchez-Cespedes from Barcelona.

Expression-based classification of lung adenocarcinoma, really some elegant work done by Matt Wilkerson and Neil Hayes at the University of North Carolina. They did expression clustering, showing reproducible classes, the bronchioid, magnoid, and squamoid classes, and using a subtype predictor based on their previous study of over 1,000 lung adenocarcinoma expression profiles, they identified the same subtypes in the TCGA dataset. And then went on to perform integrative analyses, and you can see that the bronchioid subtype is most enriched for those patients who are non-smokers. They are most enriched for EGFR mutation as well as ALK, RET, and ROS1 fusions, which are also occasionally seen in the squamoid subtype. In addition you can see the SMARCA4 mutations and KEAP1 mutations are both enriched in the magnoid subtype.

Low-pass whole genome sequencing analysis, from Raju Kucherlapati's group at Brigham and Women's Hospital and Harvard Medical School, work led by Angela Hadjipanayis, in cooperation with Matt Wilkerson and Neil Hayes at UNC. And I should also mention here, in fusion analysis, Xiaoping Su at the MD Anderson Cancer Center. First, from RNA sequencing, able to identify known fusions, ALK, ROS1 and RET fusions, in about 4 percent of cases, as I mentioned predominantly in the bronchioid subtype. One of the more intriguing novel fusions, identified by Angela Hadjipanayis' analysis, is a fusion between VMP1 and the ribosomal protein S6 kinase subunit, RPS6KB1, in several percent of cases, and these fusions generally preserve the bulk of the catalytic domain of the protein kinase and are potentially activating. In addition, there are some intriguing peptidase fusions as well that also bear further exploration.

I want to briefly touch on DNA methylation analysis, led by Leslie Cope, Ludmila Danilova, and Steve Baylin and Jim Herman at Johns Hopkins. And I think one of the findings to date, we saw CDKN2A, and this is similar again to squamous cell lung carcinoma, CDKN2A is one of the most frequently inactivated genes by mutation, one of the most frequently inactivated by copy number alteration and by methylation. So we see, again, multiple means of inactivation of CDKN2A.

Work of Gordon Robertson and Andy Chu at the British Columbia Cancer Agency has identified microRNA signatures, in which, in particular, expression of miR-21 defines a large subset of lung adenocarcinoma.

Finally, a group led by Alice Berger, Eric Collisson, who's really been playing a major role in the entire project, William Lee, and Marc Ladanyi, has examined mutations in those tumors that lack receptor tyrosine kinase and downstream signaling events: RAS mutations, EGFR, ERBB2 and BRAF mutations, and the ALK, RET, and ROS1 fusions. And what they found by looking at genes mutated in the oncogene-positive group -- uniquely enriched in the oncogene-positive group so far was RBM10; uniquely enriched in the oncogene-negative group, interestingly, was NF1, suggesting that NF1 loss of function potentially could substitute for the receptor tyrosine kinase pathway signaling oncogene gain of function. And I think this is the first analysis that's really been powered sufficiently and has comprehensive enough data to address this question.

Finally, in terms of integrative cross-platform analysis, I'm just going to show one slide out of many, work of a number of people including Chad Creighton, Eric Collisson, Ron Bose, Niki Schultz, Ted Goldstein, and Sam Ng. Just want to point out the significant, well-known -- and I apologize a little bit for the fuzziness of this slide -- a significant deregulation of multiple genes in the RTK/RAS/RAF and PI3 kinase pathways, including, in addition to the genes that we've shown, mutations of MET and of CBL.

Finally, Gordon Mills, Lauren Byers, and Lixia Diao have been doing a reverse phase protein array analysis, which has given groups that we will be able to compare with the mutational analysis, but in particular, groups with a very strong signature of receptor tyrosine kinase activity, MAP kinase pathway activity, and DNA repair pathway activity.

So just conclusions thus far. Both lung adenocarcinoma and squamous cell lung carcinoma have very similar copy number profiles. There's a very high mutation rate, which makes it a real challenge to identify novel candidate mutated genes, including MGA. Three distinct expression subtypes identified from RNA sequencing data with interesting mutational correlations. Multiple fusions including a number of novel fusions expressed in lung adenocarcinoma. Multiple mechanisms for CDKN2A inactivation. Distinct microRNA and proteomic clusters that we will be working to integrate with the expression, as well as mutational and copy number based clusters. And finally, mutational differences between the oncogene-negative and oncogene-positive subtypes, including the enrichment of NF1 mutation in the oncogene-negative group.

There's an enormous number of people to really thank for their work in this project. I just want to call out a few people. Again, my co-chairs, Ramaswamy Govindan and Steve Baylin; Angela Hadjipanayis, doing the fusion analysis; Juliann Chmielecki, who's led the exome sequence analysis; Carrie Sougnez, who's put together the data freeze; and Neil Hayes and Matt Wilkerson leading the RNA sequencing analysis; and spectacular contributions from Bill Travis on the histopathology side. Thank you very much for your attention.

[applause]

Lou Staudt: Questions for Matt?

Male Speaker: Yeah, there's been a number of groups interested in the possible ER positivity of some subsets of lung adeno. Is there any signal of this in the TCGA dataset?

Matthew Meyerson: That's a great question, and there's evidence from Jill Siegfried's group and others for a role for ER signaling in lung adenocarcinoma. We haven't asked that question, but we have our face-to-face meeting tomorrow and we will put that on our list of topics to examine in the RNA sequencing data.

Lou Staudt: I had a question, Matt. So the expression subgroups, do they indicate anything about different types of lung cells that they may have come from? Are these embryonic signatures, what do they look like?

Matthew Meyerson: Lou, I think that's a great question. The bronchioid subgroup, I think, has a lot of expression features of alveolar type II pneumocytes, and is, you know, likely to represent growth from that lineage. I think I would turn to Matt Wilkerson for comments on the other subtypes if he wants to add anything. So if he does, he can come forward.

Male Speaker: I have a quick question. So in your previous paper you reported the difference in mutation spectrum between smokers and non-smokers. I wonder if you also see a difference in expression signature, expression subtype for smokers.

Matthew Meyerson: I think that's a great question as well. We're still roughly -- and I don't have the numbers at the tip of my fingers -- roughly 15 percent of these cases are from never smokers. We're just starting to do analysis between the never smokers and the smokers, and I think expression differences as well as mutational differences will be important to look at.

Male Speaker: Just a follow-up question, so do you see any difference between the smokers and the non-smokers regarding fusions?

Matthew Meyerson: I'm sorry -- regarding mutations?

Male Speaker: Fusions.

Matthew Meyerson: Regarding fusions. The well-known recurrent fusions, the ALK, RET, and ROS1 fusions, generally arise in tumors from non-smokers. But we haven't systematically queried the incidence of fusions between the two subtypes -- between the two populations.
