daniel kastner:
all right. hi, i’m dan kastner. i’m the scientific director of the national human genome research institute, and it’s my enormous pleasure to welcome you this afternoon to the jeffrey m. trent lecture. jeff trent, as many of you know, was the first scientific director of the nhgri, and he was, of course, the founding scientific director, starting in 1993, and really, jeff, over the course of a nine-year span, built the intramural research program of the nhgri into, i think, a real powerhouse of human genetics and genomics. jeff, of course, went on, after his very successful scientific directorship, to be the founder of tgen, the translational genomics research institute in phoenix, arizona. jeff couldn’t be with us here today, but is still very actively
involved in his chosen field of cancer genetics, and in 2003, the jeffrey m. trent lecture in cancer research was initiated by the intramural program of the nhgri in order to invite leading figures in the world of cancer genetics who would embody some of the same ideals of energy, and enthusiasm, and imagination, and really raw intelligence to the world of cancer genetics. and so, it’s my enormous pleasure to introduce this year’s jeffrey m. trent lecturer, and that is dr. stephen chanock, who is the director of the division of cancer epidemiology and genetics of the national cancer institute, and steve is truly someone who falls into that mold of the jeffrey m. trent lectureship. steve is someone who has been, for a number of years, the leading figure in the study of susceptibility loci for human cancers,
who has been a leader of the efforts of the cancer institute, and is also, i would say, a renaissance man, and a real mensch, as well. so, dr. chanock got his undergraduate degree, actually, at princeton university in 1978, and his a.b. was actually in music, and i understand that he may be breaking into song at some point -- [laughter] -- in the course of his presentation. in any case, he went on to harvard medical school, and got his doctorate in medicine in 1983, and then went on to do his further clinical training in boston, in pediatrics, and in pediatric infectious diseases, and in pediatric hematology-oncology, and he did research at boston children’s hospital, and the dana-farber
institute, with the legendary and still-active stuart orkin, studying molecular biology and human genetics, really, at its inception of the modern era. so, dr. chanock then came to the nih in 1991 as a medical staff fellow, or senior staff fellow, i guess it was, in the pediatric oncology branch. he went on to become tenure-tracked, and in 2001, received tenure. and over the course of the subsequent years, he’s advanced, in terms of his career, in the national cancer institute, and in 2013 -- august of 2013, he was named to his current position as the director of the dceg. really, he has, as i’ve mentioned, done enormously in terms of his science, and in terms of his administrative things, but i think that something that really stands out,
in my mind, is something that’s a real measure of the man, and that is that since 1995, he has been the medical director of camp fantastic, which is a camp for children with various pediatric cancers that’s held every summer for a week, and it’s run by the national cancer institute, and special love, incorporated. and so, anyways, i think that that’s just a telling sign of just the complete man, dr. steve chanock. so, steve, take it away. [applause]

steve chanock:
thank you very much for that lovely introduction. i want to assure you, i will not break into song, i will not play the guitar, and i will leave it at that.
it is really an honor to be the recipient of this lectureship named in honor of jeff trent, someone i know very well, who is a very dynamic and lively character, who i worked with on and off over the years. and energy and passion are two words that certainly come to mind when thinking about jeff, and his vision for where and how cancer research should go forward has been infused with those two, i think, very important qualities, along with his scientific rigor, and i think, you know, i’m sorry that he’s not here, but, you know, i’ve corresponded with him quite a bit by email in the last few days, and he was very happy to know that someone who was interested in germ line, and primarily the germ line, would be speaking, because that’s a part of jeff’s portfolio. and today i will speak almost exclusively
about the germ line, but i think, in the context of understanding how the germ line informs our understanding of the different types of cancer that we encounter. so, let me start with the question of heritability in cancer. we know that, going back to 1866, paul broca, the famous neurobiologist, had observed heritability based in his own family, with breast cancer, with a number of women in his family: sisters, aunts, mothers, and the like, and had actually published and described this familial cluster, and then, in the interim, there were, you know, legions of twin, family, and sibling studies that began to really assess what would really be the risk, if you had one particular cancer in a family, that another family member would have that, and that’s still an important
bulwark, i think, of how and in what way we prosecute the question of the genetic basis of cancer. in 1969, joe fraumeni with fred li observed the familial clustering of multiple cancers in families, not just one cancer, but a number of them, and it was subsequently identified that mutations in the tp53 gene are responsible for a high fraction of the li-fraumeni patients, and they were ascertained through these familial studies, and i’m going to come back to this question of looking at li-fraumeni-like mutations in the population, particularly in osteosarcoma. and then, al knudson postulated the two-hit hypothesis for retinoblastoma, really a central tenet, i think, of how and in what way we look at germ line genetics, thinking of the diploid as the model, but, as i’ll talk a little bit later, the genome
does come apart, and we certainly know that there are many different ways in which copy number states can vary. and then, finally, the chase using the technologies of the 90s -- the 80s and 90s, led to the first positional cloning of the familial breast cancer gene in ‘91, and then subsequently, by ‘94, it was described as brca1. this is in the background of looking at heritability from an epidemiologic point of view, where twin registries were really quite valuable -- or have been quite valuable, so this is an important table that i will come back to when i show you the current status of how we are looking at the genome-wide association studies, to be able to explain a fraction of what we think would be the heritability. but, the key issue here is looking at the heritable factors for prostate, colon, bladder,
breast, and lung: five of the major cancers that we face, and this issue of shared versus non-shared is an important question, which really underlines, we think, the importance of the germ line genetic susceptibility in -- you know, in the context of the different kinds of exposures, and i’ll come back to that. so, at this point, i would say, why do we study germ line susceptibility? well, we can try and explain the heritability of cancer. we certainly know about it clustering in families and distinct populations, but now, with the genome project behind us, and the annotation of hapmap, and knowing what genetic variation looks like in the common, we now have the tools to begin to really ask the question in sporadic cancers, which represent probably
90, 92, 93 percent, depending on your definition. how can we explain genetic predisposition? we know that within families there are increased risks of breast and colon -- you know, for one and a half to twofold increase, just in the general population. but i think the ability to look at genetic susceptibility is really crucial to begin to try and pull apart the many, many things that are contributing to genetic susceptibility. and then, of course, the value of this in using it for risk assessment for individuals, which, i would say, we are very far away from being able to do other than the familial, and i think we have to be very careful not to oversell precision prevention, precision medicine, at this time, with respect to predicting individuals’ specific risk for cancer. that’s a place
where we all want to go, but we have a lot of distance to traverse. but i think the population-based screening issue becomes very important, and how we use the information that we have now, and that we’re about to have in front of us, to begin to think about stratification, that may have public health implications, for using screening trials, screening tools, and the like, and i think genetic variation is helpful for that. we get tremendous insights into the etiology of cancer, the opportunity to look at gene and environment interactions, and, particularly, as i mentioned at the beginning, how the germ line informs somatic alterations. and then, of course, everyone is excited about pharmacogenomics, but this is a very difficult thing to pursue, and it’s, you know, we’re sort of at odds
with most of the industry because it’s not in their best interest for us to identify the 30 percent of the women who should get herceptin and exclude the 70 percent from being prescribed that drug. so, this is something that sort of at a cultural level, as well as a scientific level, is really lagging behind. there are some very exciting examples but i am not going to really focus my talk on that today. i do want to start my talk, and really separate the spaces. when we think about cancer genetics, we really have at least four different spaces, as demarcated here. one is the germ line, which is where i’m going to spend most of my time talking, but we also have the somatic. those are the alterations, the actual tumors we see in the ncis-like sequencing from tcga,
where we see all of the large-scale events that have taken place. we know that we’re heavily in the range of discovery, but with very little clinical action at this time, and i think we have to be very careful in not overselling what we can do with, particularly, the germ line information other than in very select circumstances. i think we all want to get to the next stage but we still have quite a ways to go. so, when we think about the germ line -- this is sort of the outline of what i’m going to talk about today -- we know of at least 110 to maybe 115 cancer syndromes, where we know that there is a very important mutation in the germ line, that explains the familial clustering of cancers in a family or sets of families, and these are very important in giving us very good insights into the cancer
biology, cancer drivers, and i’ll come back to that. through the genome-wide association studies, which i’ll talk about, we know that there are some 470 regions, and there are literally thousands more to be discovered as we supersize, and i’ll try to make the argument for why we should continue doing that. we know that the somatic -- when you look at the tcga and the icgc, the cancer genome atlas, the nhgri and nci joint effort has been a resounding success in beginning to get a portrait of the landscape of genetic alterations, and we see that there are these drivers, those things that we think are very important, and we use both frequency and biologic investigation in the laboratory, but we also recognize that there is heterogeneity and [unintelligible], that are real challenges.
now, if we go to the clinically actionable, of all the things that we see on the upper-left-hand corner, only in a small fraction can we really go into the clinic and advise or talk to someone in what, we think, is a really, you know, sustainable and supportable position. in the same way, only a small number of agents have really come to market -- it’s larger than this, this is an example of targeted therapy, where, i think, the precision medicine initiative that dr. varmus has certainly been talking about is a very important way of being able to target the alterations in the tumors, so that we would be able to actually intercede, and either stop or slow down the growth of those particular cancers. and we do this in the context of looking at tcga, where we’ve had extraordinary lessons that have come from
looking up and lining up all of the different mutations, and seeing the spectrum. we can see, literally, a four-order-of-magnitude difference in the number of mutations, and then the types of mutations. how many are real drivers versus how many are passengers, and this is something that’s an important element that’s come out of the somatic sequencing, that i’ll come back to in the germ line in a minute. so, for instance, if we look at lung adeno c-a, lung adenocarcinoma, one of the most common cancers, heavily driven by smoking, the attributable risk is somewhere, depending on who you’re talking to, 75 to 85 percent for this particular cancer, and we can explain a fair number of the cases that we see having these mutations in genes that are quite disruptive, that we understand
something about the biology, or see the frequency thereof, tells us that this is an important event. we have a number of agents that are very important that could be used and are going into clinical trials right now that are very exciting for lung adeno c-a, but this is probably further ahead than just about any of the other cancers. when we then look hard, and ask the question next, of how, and in what way, does this really line up with what we understand of the genetic architecture -- i want to take a step back, and we’re going to first talk about this space here, the rare alleles that are causing mendelian disease, or familial clusterings of cancers, the brcas, the tp53s, the patched, and the like. and currently, at this time, we know of about 115, and they’re scattered
all across the genome, and the interesting thing is that they are almost all ascertained in families. they’re rare mutations, with very strong effects. from an evolutionary point of view, it’s very hard to sustain them in a population. there’s a lot of very interesting population genetics, and there’s a whole other lecture on the brca founder mutations that we certainly see, and then, we certainly can see oncogenes and tumor-suppressor genes. so, those are important with respect to the kind of, sort of, classical genetic models of whether you’re looking at autosomal-dominant, or autosomal-recessive, but we certainly can see these in these 115 genes or so that have been identified. if we look here at a classic brca1 pedigree, we know that, in most cases, having a brca mutation is not necessarily a
100 percent likelihood that that woman is going to develop breast cancer or ovarian cancer. depending on the specific mutation -- this is a very important point about where in the gene the mutation takes place, and then, there are 24,990 other genes, not to mention the environmental exposures. so, you know, there are large consortiums that are identifying what are the important clinical modifiers as well as biologic modifiers of brca1 and 2, and i think that this is a very exciting thing that’s going forward, and we’re just beginning to identify maybe the first 10 or 12 regions of the genome that are interacting with brca1 that may be very important in identifying -- in contributing to the risk of breast cancer. but, the question is in what type of breast cancer, because
we know that there’s heterogeneity, there are different types of breast cancers that do track with the brca1 and 2, but not perfectly. again, this notion of genetic determinism we have to get past, it’s a more complex scenario, and i think the further we get into this, the more we recognize these factors that we really need to be able to identify, and put together these critical, sort of, compendiums, or catalogs, to be able to then look in larger studies. so, if we look at these -- excuse me, 110 genes, pardon me, interestingly enough, about a year ago, naz rahman in the uk published a very nice paper in “nature,” in which she looked at, at that time, this question of what fraction of those genes were identified
in the germ line as explaining a familial cancer that had already been identified in a somatic setting, not necessarily in the gene, or not necessarily in the cancer that was being identified in that family, but nonetheless was considered to be an important driver. and interestingly enough, about 50 percent, at that time -- and now, if you revisited this, it’s probably up to about 65 percent of the familial cancers of these 115 -- have, you know -- the mutations are lying in a place that we know somatically is very important in one or more cancers. so, these have high frequencies, and the somatic mutations, again, are the important drivers, but i want to keep separate the idea of what’s driving the cancer, once it starts, as opposed to where the germ line is, you know, a susceptibility
factor, and this is where we have to invoke knudson’s two-hit hypothesis in the retinoblastoma model, that you can start with a germ line, and then add a somatic event that happens on top of that, in certain key genes, and then there’s a very high risk of cancer. so, when we looked at, you know, this particular list, we certainly, in dceg, have had a long-standing interest in tp53 and the li-fraumeni syndrome, and the characterization thereof, and we had done a large, genome-wide association study of osteosarcomas and found a few regions that looked to be very important to risk for osteosarcoma, but one of our junior faculty, lisa mirabello, had a terrific idea to take those key samples and sequence p53 with next-generation sequencing, not illumina, but ion torrent. there are other technologies that do work, you can be published,
for the young that are out there. it’s important to recognize that, hard as it is to get past that, but the issue was to look at our distribution of osteosarcoma. i show this because the punch line is going to basically show that there is a divide in terms of where we think the genetic susceptibility with respect to p53 is factoring. so, we also know that there are some other things, such as the recql4, the rothmund-thomson syndrome, hereditary retinoblastoma. we know that osteosarcoma can arise in other settings. so, it’s very interesting that we went and sequenced all of the p53 exons, the utrs, intronic flanking regions, and then we classified, on the basis of the existing international classification, the li-fraumeni syndrome mutations that are already in the iarc database, and then ones that are likely to be those
on the basis of predicted deleterious mutations, and then rare exonic variants, which had very low maf in the public database. so, when we looked at this data, what was interesting to see was roughly 10 percent of the children and young adults that had osteosarcoma, none of whom had been ascertained through family studies and we had no evidence of family history in these 765, were harboring one or more -- not more, but one of these p53 mutations. and so, when we looked in a different way here, as you can see, the p53 mutations by age overall, as opposed to looking at zero to 10, 10 to 19, 20 to 30, we could see that there was a break by the time of age 30. so, in other words, when we look at p53 and those mutations and osteosarcoma,
if they’re going to happen, they’re going to happen earlier in life. and this is an important issue that’s very difficult with the studies that we have in hand now to really start to layer on age in terms of, when is the risk really an important risk and when does it go away and you can say that someone is, pretty much, out of the woods. so, we thought that our findings were very important -- and they are getting published in jnci in the next week or two, and kudos to lisa for really pushing through on this -- that the young onset of osteosarcoma has a distinct germ line genetic etiology compared to the adults, which we know. and then here’s the question, considering genetic counseling in p53 mutation testing, should this be introduced into the children’s oncology groups? so,
there’s an active discussion now and we’ve shown this data in europe, and the europeans are looking at this as well, asking the question in the pediatric oncology clinic where you’re seeing, you know, children with osteosarcoma and young adults, should you be considering, you know, a more formal type of genetic counseling? we’re moving that way but, you know, this is discovery. this is very new. one is not saying that this means that every osteosarcoma patient tomorrow should get screened for p53. i mean, we have to think hard about how we are going to study and validate this, but we find that these are the kinds of exciting things that we’re getting in this discovery series and how and in what way we move to the next level is clearly very important, because the other question is what other germ line
mutations lurk in the, you know, in the young-onset osteosarcomas, and we have to ask the question, should we be sequencing more than p53, and the answer is, of course, yes. so we are moving down that road towards exome sequencing now and, potentially, whole genome sequencing. so i think, you know, it’s important to recognize that there is a whole spectrum of how we’re doing these analyses. so, if we come back to this figure here and we see that there are these very rare, highly-penetrant, strong mutations that we see and then, as we bleed down into these lower frequency with some moderate effects, these are the hard ones to be able to identify, and then, of recent, the genome-wide association era has really focused on common variants that have been implicated in gwas. and so, i do want
to talk about these because these are very important in building a polygenic model, in seeing that there are many, many small effects that are contributing to the risk of both common and uncommon cancers. so, as we go forward, and we see, really, what is a genome-wide association study for those who have not conducted those -- this is not for the weak of stomach. it’s not dissimilar to a vaccine study, where you start, you go a long period of time, and then there is a p value or set of p values that you are either really happy with, or you’re really depressed. so it’s not for the weak of heart going forward and doing these large-scale studies. but i think that the key issue here is starting with case control studies, and whether you’re
in cohorts or, as we’ve learned, very good case controls and sometimes not-so-good case controls, we can use the large agglomeration of cases to be able, with adequate controls, to identify the big, top signals, but we don’t really have the full panoply of all of the polygenic signals that are part of that. but as we go forward, we know that we go from the cases and their dna, to scanning them, to doing the qc steps and the agnostic analysis, and this is really what’s different about genome-wide association studies. you look and the data tells you where there’s an interesting area, you don’t say “gee, i always thought that p53 was interesting in osteosarcoma, so i’m going to therefore look at that.”
that’s a very dangerous thing to do. and then we have our manhattan plots and our replication, and these are stages that take a number of years but they are now part of the fabric of genetic susceptibility studies. and as you can see here, as of december, the field now has about 475 that have been published in some 29 cancers, only one of them with a copy number variation, and interestingly enough only about 8 percent are shared, which raises this question of how much is the bias of how these studies have been conducted versus what is the shared heritability. and i’ll come back to this concept of shared heritability as we look at the large set of gwas that we have, and try and compare them in some very sophisticated analyses.
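[editor's sketch] The agnostic scan described above (test every snp the same way, then keep only what clears a stringent threshold before replication) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the allele counts are invented, the function names are hypothetical, the test is a plain 1-degree-of-freedom allelic chi-square rather than whatever models the studies actually used, and the 5e-8 cutoff is the conventional genome-wide significance level, which the talk does not state.

```python
import math

# Conventional genome-wide significance level; an assumption, not from the talk.
GENOME_WIDE_ALPHA = 5e-8


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Return (odds_ratio, p_value) for a 2x2 table of allele counts.

    Uses the 1-df Pearson chi-square statistic; all four counts are
    assumed to be non-zero.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    cross = case_alt * ctrl_ref - case_ref * ctrl_alt
    denom = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
             * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    chi2 = n * cross ** 2 / denom
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return odds_ratio, p_value


def scan(snp_counts):
    """Test every SNP identically and return hits sorted by p-value."""
    hits = []
    for snp_id, counts in snp_counts.items():
        odds_ratio, p_value = allelic_test(*counts)
        if p_value < GENOME_WIDE_ALPHA:
            hits.append((snp_id, odds_ratio, p_value))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
example = {
    "snp_a": (2400, 7600, 2000, 8000),
    "snp_b": (3050, 6950, 3000, 7000),
}
for snp_id, odds_ratio, p_value in scan(example):
    print(snp_id, round(odds_ratio, 3), p_value)
```

The point of the design is that no snp is given prior weight: a candidate region is reported only if the data push it past the threshold, which is the discipline the talk contrasts with picking a favorite gene in advance.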
and the other thing, what’s very interesting is, of these 475 or so, not -- almost none are associated with outcomes. so, it’s telling us, i think, an important question that there are certain genetic regions that are very important for risk for developing the cancer, but whether you go on and develop an aggressive form of prostate cancer, or whether you survive breast cancer, ovarian cancer -- they are not the same regions. it tells us the complexity of these things going on over an extended period of time is really driven by a number of factors and, again, we are just beginning to pull apart what they are. so, if we take prostate cancer, as we can look here and see there’s some hundred different regions -- it’s just like shotgun across the genome. they’re scattered but very few differentiated between
aggressive and non-aggressive. a number have been published, most of them died on the altar of attempted replication, and maybe two or three that are able to actually survive, i think, stringent statistical significance, which is very important, because whatever our findings are, there are wonderful post-docs and pre-docs who work on each of those regions, and you don’t want to send people off to work on false positives, and so, there is value in the genome-wide significance. so, for prostate, we really don’t see a whole lot of measure of things really coming together, whereas in testicular cancer, which has the highest familial risk -- what’s really interesting here is just about all the hits, and they come at a much faster rate
relative to the number of cases scanned -- we’re at 21, with another, i think, 11 getting ready to be published from conglomerating the studies, and you can see all of them localize to genes that are important in telomere regulation, germ cell development, sex differentiation, which has really been quite striking. it’s the one cancer with the highest heritability as we know from the twin studies. monozygotic twins are at a 75-fold increase. dizygotics are 25 to 30. brothers are eightfold, so there’s clearly a very strong -- it’s a very rare cancer, so it does raise this question of the issues of absolute risk, and we’ll come back to that. and i think, you know, it’s also the one genome-wide association hit that’s of a high enough effect size from the kit-ligand, which is an interesting
gene under selection having to do with hair color, at this moment, but there have been some very eloquent papers in “cell” on this, that it’s strong enough for a genetic counselor to actually think about wanting to give someone advice on the basis of being a homozygote. now, again, thinking about it, i don’t think at this point it’s ready to go into the clinic and, you know, most genetic counselors would not jump on this and say that this is something that we’re going to do today, but again, this is where the discovery engine is now pushing and the structure is how and in what way are we going to be able to address those questions. when we look a bit further -- so, for instance, one region becomes six. here, around the telomerase
gene, there are now ten different cancers, and the interesting thing that [unintelligible] and xiaoming [spelled phonetically] had really identified is that in each one of these regions, we saw a kind of pleiotropy, where all six of the independent regions had some that are protective and some that are susceptible. so, in other words, the exact same allele in one disease can be a protection and in the other can be susceptible, which really underscores the importance of looking at other genes and, more importantly i think, the environment. and the place where this is most confusing is if you look at skin cancer. basal cell and melanoma are in opposite directions, pancreatic and lung, as well. so, you know, this is a very interesting region that we know that there are highly-penetrant mutations
that are very important, as well, there. so, when we assess these gwas regions, you know, there is this tension between wanting to look at all of them for risk purposes versus wanting to look at individual ones and go and do the fine mapping, and we do both, and where we know one region becomes five, often they’re hidden in there and this is helping to explain a little bit more the heritability. some regions harbor alleles for many different cancers, and then we have to start asking the question, what about accounting for exposures like smoking and lung cancer, smoking and bladder cancer, and i’ll come back to bladder cancer in a minute, and then the question of the value of these large sets of snps, as we put together these catalogs, the idea is, at what point
will they be of sufficient use to be able to think about in terms of public health. at this point, it’s really very clear that they’re not ready for individuals, even though 23andme, decodeme -- a number of groups have tried to sell these over a period of time. it’s, you know, genetic snake oil. enough said on that. so, when we look at the gwas cancer susceptibility hits themselves, when we look at these 475, we see a very different kind of biology underlying these particular regions. so, there are 470 unique regions, only about 25 percent of which have been explained. twenty percent are in regions where there’s no gene that is in any way connected to any of the correlated variants. so in other words, it’s doing something for some funny rna, or something
that’s important in genomic regulation. when we looked at this, less than five percent of the genes that fall under the peaks of these genome-wide association studies mapped to the cosmic database, unlike what i showed you with the highly-penetrant mutations, where 50 percent was reported, and it looks like it’s moving closer to 65 percent. nearly all of the gwas hits that we’re looking at are looking at things that are really perturbations, that are changing pathways, but that are not necessarily altering a particular gene. there are very few that are coding when we really explain them. so, mitch makela [spelled phonetically] in the lab spent quite a bit of time looking at the first blush of about 265, going, really very carefully, through all the genes and looking at cosmic, and we could
see that there was really no difference in the kinds of mutations, the number and the types of mutations, between those genes that were under the peaks of gwas and those that would be randomly permuted based on genomic location, gc content, the kinds of things that we think of in terms of the locale that may be important in contributing to the risk for random or real mutations. so, now, we come back to this architecture of genetic susceptibility of cancer and we can see that we really have perturbations of key pathways and the common variants, and each one is making a small dent that’s neither sufficient nor required for developing the cancer, unlike what we see in the familial settings, where we really do have those damaging
drivers. so, as we look at this map, and see that, well, with linkage studies and family studies, and, you know, historically we’re filling in this part of the space of any given cancer, and with gwas we’re going here, but in between these low-frequency variants that are part of this oligogenic model are very tough to get at. and this is where next-generation sequencing and laboratory investigation really have to hit the road. so, what i would now submit is that each major cancer has a unique underlying genetic architecture and we are at different points in being able to build up the catalog of understanding what contributes to breast versus what contributes to prostate. there may be a few shared things, but nonetheless, they really are in our minds very important to build these models, because,
as we build these catalogs there may be increasing clinical utility as well as the obvious value of being able to use these catalogs, particularly in common cancers, where, if you were able to stratify and ask the question of the changes in absolute risk for a common cancer, that may have real public health implications. for a very rare cancer, a shift in the absolute risk is a harder sell at this point. and so, i think, if we had, you know -- with infinite money, i would submit, if we are going to really make an effort to have a complete catalog of what we think genetic variation looks like, one could make the argument that some of the common cancers are the places where we should put our money and, in fact, we’re doing that with the game-on and the oncoarray, and the like. we’re doing that with breast,
and prostate, and lung, and colon, and ovarian.

so if we look at the genetic predisposition to breast cancer, you know, i mentioned the cloning of brca1 in 1994, and shortly thereafter brca2, we can now look and see this kind of sweep, of having a number of genes that are highly penetrant, but they are very rare, and then the gwas era has put a number there, but we have very little in this space here, that we are now going after with exome sequencing, where we have to try and put together very, very clever stories. similarly, we can see the doubling of the number of hits tells us that we can explain about 35 to 40 percent of the familial risk. and familial risk is defined here, we think, in these common cancers, as between a one-and-a-half and twofold increase.
so, if there’s someone with a prostate cancer diagnosis or a breast cancer diagnosis in the family, there’s an increased risk that someone else in that family has that. and this is a statistic to try and explain how and in what way can the snps -- the space of common variants -- explain a fraction of that familial risk, because therein lies an opportunity to be able to then have stratification based on what we would see as a useful shift in the -- in that. so, prostate is looking very different. here, as you can see, there is virtually nothing in the high-risk, and, you know, for the snps it has been very, very exciting. so, for quite some time, nilanjan chatterjee, a colleague in dceg, and others have been looking at this question of, what are the limits of using snps? because, at
one point, there was a lot of excitement, people were jumping all over, and suggesting, like in crohn’s disease, where we have very high sibling relative risk, that you could use snps to be able to move the auc, the area under the curve, to be able to get close enough to a place where we would think it could be clinically implemented.

now, if we look at the common cancers, of breast, prostate, and colon -- when we were looking, nilanjan and park have made some very, very important estimates based on empiric information, as well as now being borne out by the larger consortium, that there are clearly limits to looking at the sibling risk model of about two, for breast and prostate. and, in fact, he just yesterday was very kind to be able to provide a very important new slide,
here, looking at it, we know that we can explain 35 to 40 percent of the familial risk of breast cancer, which again, the issue is in the absolute risk for screening, and the question of how we do that is still very much on the table. but, as you can see in these curves, as we go from what’s empirically known to what would potentially be known from going ahead and looking at as many as 500,000 cases and 500,000 controls, which is something that the world is moving towards. right now, there’s a big study of roughly 150,000 breast cancers and 150,000 controls and that will continue on, and that allows us to really identify what we think are, at least, on the order of, estimated to be, 3,000 or more different snps. now, each of these may have their own story but it’s going to be decades before we understand what
those stories are, but the question of using this information in terms of risk assessment is a different question, and i think that we will be most effective in having as comprehensive a setup as possible. because, even if we just look at prostate cancer risks -- as here from antones [spelled phonetically] [cough] in the uk -- excuse me -- if you look at the 76 snps that have been identified, if you start looking at the risk factors, and -- i mean, you look at the risk for developing the disease, age is very important. and again, as we do these studies, we’ve been really two-dimensional in our genome-wide association studies and many of our familial studies, and not really being able to assess the value of age and, particularly, of secondary factors, whether they’re environmental and/or, particularly,
other genetic effects. excuse me, for just a second.

so, with that, i think, there is an important story to just delve into for a minute, with respect to looking at smoking. so, out of the genome-wide association studies of bladder, interestingly enough, a gene that had been identified by the candidate gene world, one of the four or five that had survived the 10,000 attempted publications that hadn’t really survived replication, was nat2, the slow acetylation gene, and, interestingly enough, you only see the effect in individuals who are smokers. so, this really led us -- particularly with nilanjan, and nat, and montse garcia-closas -- to look very closely at this question of the cumulative 30-year absolute risk, and some of the cohorts that are available to us, to
ask the question, what would be the risk for a 50-year-old male in the u.s., by looking at smoking and 12 snps, including the nat and the nat2? as you can see, the r.d. here does separate quite nicely between the low-risk and the high-risk. this is not to say that we should start screening every potential smoker for bladder cancer. there are many other comorbid conditions and the like, but let’s just do a thought experiment. suppose we had 100,000 smokers with high genetic risk who stopped smoking, that we had effective cessation. if we are thinking about bladder cancer, where we have enough known about the genetic susceptibility, we would eliminate 5,400 cases. this is just focused on only bladder cancer. if we then look at 100,000 smokers with low genetic risk, and they stopped,
we would eliminate 1,500 cases. so it raises this question, of how and where we would apply these kinds of risk reduction strategies. so it’s a possible example, really, of how genetic and environmental stratification may translate into targeted prevention, the so-called “precision prevention,” but we are a long way from doing that, i think. the next sets of studies need to really assess and confirm this and the question is, what are all the co-factors and the comorbidities of lung cancer, cardiovascular disease, and the like. so it’s really an important question but i think, you know, it’s that sort of exciting opportunity that really tells us that we need to look very, very closely, and why we need to have, i think, these very large compendiums. [coughs] excuse me one second.
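
[editor’s sketch, not part of the talk: the arithmetic behind the bladder cancer thought experiment above can be made explicit. the only inputs taken from the lecture are the 100,000 smokers per group and the 5,400 and 1,500 cases avoided; the variable and function names are illustrative, and reading those figures over the 30-year horizon mentioned earlier is an assumption.]

```python
# Figures quoted in the talk: bladder cancer cases avoided if 100,000 smokers
# in each genetic-risk group quit. The 30-year horizon is an assumption.
COHORT_SIZE = 100_000
CASES_PREVENTED = {"high genetic risk": 5_400, "low genetic risk": 1_500}


def summarize(cases_prevented: int, cohort_size: int = COHORT_SIZE) -> dict:
    """Absolute risk reduction and number of quitters per case avoided."""
    arr = cases_prevented / cohort_size
    return {
        "absolute_risk_reduction": arr,
        "quitters_per_case_avoided": 1 / arr,
    }


summary = {group: summarize(n) for group, n in CASES_PREVENTED.items()}
ratio = CASES_PREVENTED["high genetic risk"] / CASES_PREVENTED["low genetic risk"]

for group, stats in summary.items():
    print(
        f"{group}: absolute risk reduction = "
        f"{stats['absolute_risk_reduction']:.1%}, "
        f"quitters per case avoided = {stats['quitters_per_case_avoided']:.1f}"
    )
print(f"high/low ratio of cases avoided = {ratio:.1f}")
```

[under these figures, about 18.5 high-genetic-risk smokers, versus about 66.7 low-genetic-risk smokers, would need to quit to avoid one bladder cancer case, a 3.6-fold difference. this is only the quoted numbers rearranged, not a result from the underlying study.]
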
so, really, what fraction of the polygenic component contributes to each cancer? so we’ve done scanning of over 100,000 individuals and 15 different cancers, and so josh sampson in our program, one of the young, terrific biostatisticians, looked at 13 cancers, used the genotyped snps to explain anywhere from 10 to 50 percent of the variability on the liability scale. so, going back to this kind of study that i showed at the beginning for the twins, you can see at this point, for gwas, we can explain some fraction and, depending on the cancer, it can be anywhere from 15 to 50 percent of what we can see in terms of familial risks with the large number of snps that are available at this time. so, the shared heritability, when you start looking at these and lining them up there are some
very interesting things that are starting to come out, where you can see very strong correlations, where there’s overlap between cancers that you would expect, like testes, and kidneys, and cll, and large-cell b-cell lymphoma, but interestingly enough, dlbcl, a type of lymphoma, and osteosarcoma share a heritability that was not anticipated. so again, this discovery element of giving us new clues and ways of thinking about diseases is very important -- not different from how the tcga analyses in looking at certain mutation signatures have told us things about bladder cancer and potential viral pathogens in some of the, you know, gastric cancer, and ebv. these are the kinds of things that, looking across these large sweeps, give us new places to go.
so, lastly, really, when we talk about gwas, what’s in a gwas region? we know that it can inform our understanding of the somatic changes, particularly in that region. we have many correlated snps that have to be mapped, and we have to choose the best variants for laboratory evaluation. so, of those 475, only about 25 are explained right now, and a small fraction of them may allow us to really look at environmental or other genetic alterations. so, we have a wonderful resource that, again, nhgri was really the driving force of. it really has, i think, transformed how and in what way to prosecute these and this gives us the opportunity to be able to map and go after each region.

so, let me just give you an example here, where there’s a potential clinical implication,
i think, that mila prokunina-olsson had identified with one of the bladder signals on chromosome eight, the psca gene, where she mapped it, figured out the very best snps, statistically and analytically in the laboratory. you can see that the functional snp changed the expression of this particular gene, and it just so happened there’s a humanized antibody to this particular gene being tested in other cancers, and so, we’ve been trying, unsuccessfully so far, but still pushing hard, to ask the question, is this the disease where this particular humanized antibody should be tested? and here is a potential translational application of identifying a particular biologic story that comes out of looking at a particular gwas region. this is not to promise that all gwas regions will look like that, but
i think there is very interesting biology in terms of the regulation and the changes in very important pathways, in particular genes, that are critical for cancer.

so, we come back to the architecture of genetic susceptibility of cancer and this sort of middle space here of the low-frequency variants that have the intermediate effects. and, actually, jeff trent was central to one of the examples i’m going to show in a minute; kevin brown, who we hired from jeff as an intramural investigator, was able to do this. and it’s very important that the laboratory activity is there, because it really does help us get past the signal-to-noise ratio. for those that have looked at exome sequencing and seen 5,000, 2,000, 10,000 interesting-looking variants, it’s very hard to know which ones are the
right ones and to do the agnostic search for rare variants is very difficult, as eric lander and others have suggested, that we’re going to need 20, 30 thousand cases just to begin to identify, and get the right signals to come out of this, and, you know, this gets at some very profound questions.

so, kevin looked at the mitf gene, the e318k mutation, that had not perfectly segregated in families with melanoma, nor was it that strong of an effect in the population, but it was clearly there. but the interesting thing was when they went into the laboratory, as did a group in france in parallel, [cough] they published -- excuse me -- in “nature,” this very interesting study where they -- there was a lot of biology of sumoylation of that particular moiety on the mitf gene, sort of giving the scientific laboratory corroboration that this is an important risk factor. similarly, we’ve gone ahead, and so, terry landi and jianxin shi in our program, looking at pot1, an important gene in telomere stability, identified, very similarly, families and populations where the effects were not segregating like our classical mendelian families, and they weren’t quite strong enough, as we saw in the genome-wide association studies, but they had signals in both, and when we went to the laboratory, we were able to identify very important laboratory experiments that really sealed the deal, so to speak. and i’m afraid that this space is going to be one that is very difficult, it’s going to be much slower to get to,
where we’re going to have to use laboratory evaluation as well as statistical operations to be able to get there. so, we know that there are many difficult regions to get at, and we won’t necessarily get to all of them, and the low region here individually, but the polygenic models of snps will allow us to get that, and then we have, occasionally, these very rare things.

so, i think at this point, the report card on cancer genomics in 2015 is that it’s really just the start. we really have a ways to go. and it’s been a very exciting discovery period, but it should not end. the battle is clearly not over. we know that discovery is in progress and we’re defining very different structures for the underlying genetic architecture.
we know that the current profiles are better suited for risk stratification, a form of precision prevention, but we haven’t figured out how to develop those studies yet to confirm those. and then the discovery of new biologic insights into cancer is clearly very important but the next real frontier, i think, is to get to the environment and molecular heterogeneity.

so with that, i’d like to acknowledge -- i’m going to go on for about five more minutes, dan, because i want to talk about how the genome is falling apart, but i wanted to identify and particularly call out the remarkable colleagues that i’ve had the pleasure of working with, particularly, joe fraumeni and bob hoover, and peggy tucker, and certainly a number of other individuals within dceg, and, you know, the slides would go on like a hollywood movie
for three minutes of music if i were to show the 400 collaborators, and all of their associations, but this is really the value of team science in a really very special way. so, let me just say a word or two about genome screening, you know -- is it promising? is it haunting? is it dangerous?

it’s probably some of each, at this point. and, you know, as we go forward with next-generation sequencing of tumors, that means we are sequencing normal tissue, and it’s a very hard issue ethically, how not to look at that information. and when we start looking at that information, we ask very important questions about what’s going on with our genome. are we going to have to sequence our genome more than once? and, i would say, we may be faced with having
to make those decisions, because i would say aging is tough. it really is. we fall apart with arthritis, obesity, heart disease, cognitive changes, we have decreased oxygen consumption, immunosenescence, fewer neurons, telomere attrition, genetic mosaicism. and so, one of the things that we’ve noticed, as we were doing six years of gwas, we looked at over 100,000 samples in 45 different population studies. we kept seeing these, sort of funky q.c. failures, but we saw enough of them that a couple of these astute people in the laboratory said, “hm, [spelled phonetically], there may be something there.”

so, what we unexpectedly found was large chromosomal abnormalities. so, about one and a half to two percent of the population over the age of 50 is walking around with a sub-population
of cells in their blood or their buccal component that are these very ugly-looking mosaic events. and i think this is something that sort of triggered this whole world of asking the question of the dynamic genome. and we know that we can see this in many different populations. we’ve extended this now to over 125,000 individuals, and we know that the phenotypic expression of mosaicism has certainly been around. classical genetics told us about eye disease, you know, calico cats, neurofibromatosis, and the like, and we know in the extreme it’s an age-old explanation for a subset of neurofibromatosis and trisomy 21s, and turner’s. we know that it’s also very important for rare, highly penetrant mutations that lead to variegated aneuploidy, that tell us a lot about stability of chromosomes, the bub1b families, and the cep57 families,
as rare as they are. and we also know that, you know, the complex syndromes, where we see -- and some of the leading investigators at nhgri have identified some of these very perplexing situations where what we would say in the cancer world, an akt1 mutation of lung cancer or neuroblastoma, you see in the germ line, that is expressed only in a particular tissue, you know, giving rise to proteus syndrome or ollier’s disease. it’s really quite striking.

so, we clearly know about this, but when we start looking at the large -- here is the chromosome, you know, the homunculus chromosome of 1 to 22 of the autosomes, and, if we look at 127,000 individuals, this is a paper that mitch has led that will be published very
shortly, we can see that there are different kinds of events that are gains, or copy-neutral, or losses, at very large scale. and we know that this is really the tip of the iceberg, and we know, unfortunately, that this increases with aging. so, if we look at all of our cohorts from age 50 to 75 these events increase in size. and the question is, are they harbingers of neurodegenerative disease, diabetes, cancer, and the like, and this is really a very difficult question. we don’t have evidence at this point that they are strong risk factors for developing cancer, per se. there are some papers looking at complications of cardiovascular disease and diabetes, but, again, this is still the start of it. if we look by chromosome, we can see that the x chromosome was hit very hard, interestingly enough, and the y has
the greatest number of mosaicisms. so for men, 15 to 20 percent of the men at age 60 are missing a good part of their y chromosomes, which is very interesting -- i’ll show you data on that. but here, it’s why the x chromosome is very interesting, and then there are these regions on chromosome 13 and 20 that are very important in the hematologic cancers, that you see normal individuals walking around with. so, the hypermutation of the inactive x is a very active field in the somatic world. so, it’s telling us something about the replication of the inactive x being the last chromosome to be replicated, that there’s more error that takes place in the icgc and the tcga data. and so, we’re asking this question, could this be restricted to the inactive x, and we’re not sure, but we’re
moving forward on that right now.

there have been a couple of papers looking at y loss as cause or consequence of aging and smoking. the swedish group has suggested that it’s important to susceptibility to cancer, and have actually started a company, you know, i mean, you may as well flush your money down the toilet, the best we can tell, because the data is not very strong. when we look at our large cohorts we see absolutely no effect for susceptibility, but we do see age. and here, you can see, by the time people are in their 60s to late-70s, 20 percent of the men have lost almost all, or all of their y chromosome, and they’re mosaic thereof, and so it’s really important. similarly, the survival analysis has been suggested, but
we don’t see any effect whatsoever when we look in our large cohorts that we followed for a number of years, and the probability of an event decreases, interestingly enough, if you stop smoking; you have less risk of having that y chromosome mosaicism. so, in our minds, this is sort of the tip of the iceberg, looking at these large-scale events. and a number of other groups have started to look at this with sequencing, looking at favored hematologic genes. there are age-related mutations that are associated with clonal hematopoietic expansion and malignancies, looking at the tcga data, age-related clonal hematopoiesis, again, of looking at individual sequence-based changes seen in populations of cells.
so, to take a step back, the hematologic cancers are very different, in our mind, and we’ve looked very closely in our scanning, in our cohorts, and are able to see -- as did geneva -- that there’s an increase of the number of these kinds of events that are seen particularly in individuals who go on and later develop a cancer. so, in other words, is this a potential biomarker? could this be exploited to screen people who might be at high risk? but, we don’t know what all those risk factors are but we certainly can look and see, you know, untreated leukemias, particularly, have an increased risk of having one of the events of 13 or 20 per se. so, when we looked at the cll gwas that we did, and we particularly looked at individuals who had blood drawn anywhere from two to ten years
or more before their diagnosis in these prospective cohorts, we can identify a series of mutations that were seen in mosaic states in these individuals as long as 14 years before their actual frank diagnosis of cll. we don’t have the full, you know, the full karyotyping of all of the events, but we can see that they are all events that are reported in cll. the reds are those that are of poor prognosis. so, it does raise this question, of being able to see, well, you know, as much as 10, 12 years in advance of a diagnosis of cll, that somebody could have one or more clones that are identified at a high-enough fraction that could be detected by our current technologies, which is about five percent of the circulating cells. we can -- that’s our discriminatory difference.
so, our implications for aging, really, in our minds, are very important in thinking about this sort of global concept of genomic instability that has been described by others, and it gives us clues to what can be tolerated, or potentially selected in cancer. and as we go forward in thinking about precision prevention, i think this notion of mosaicism is going to be particularly important in monitoring individuals up to a certain point. the question is, are those changes specific to the cancer, or are they part of a sort of global system that’s sort of falling apart, and this is a very important hypothesis that i think large-population studies really need to look at. and we certainly know the role of this in non-cancer diseases that track particularly with age, and there have been some reports recently, and i think
we will see more and more as they come along.

so, let me just end by saying, you know, to be able to do this work, it’s really a pleasure to work with so many people -- to quote a great princetonian president who happened to then move to the white house, woodrow wilson, you know, but to paraphrase, we use not only all the brains we have but all that we can borrow, and in doing that, for instance with the mosaicism, we have these huge consortiums with hundreds of individuals. and, again, these are the kinds of things that are critical to be able to make these large, population-based observations. it’s going to drive us back to the laboratory and make us think hard about the next public health questions, and where we want to implement these particular observations.
so with that, i will stop, and thank you again, and i very much appreciate the honor of giving the trent lecture. thank you.

daniel katsner:
well, stephen, thank you very, very much for that spectacular lecture, and, limited as we are in terms of the kinds of gifts that we can bear, at least publicly, i have this small token of our appreciation from the national human genome research institute that commemorates this wonderful lecture.

all right, and i think we have a little bit of time for some questions, and, of course, there is a wonderful spread of food out in the library awaiting us after the last question is asked. so, anyways --
male speaker 1:
just two brief questions. as a pathologist, i’ve seen problems with prostate and breast. in particular, prostate is a natural occurrence in older males --

steve chanock:
right.

male speaker 1:
-- and, in -- all through the spectrum when you find a low gleason-grade prostate carcinoma, or what appears to be one, you don’t know if it really is something that is just going to be there for the guy’s life or what’s going to go on. so i suggest to you that the definition of what is a prostate carcinoma is important in terms of biological behavior. and the second thing is, in terms of breast,
you get what’s called an atypical ductal hyperplasia and nobody knows what it is. is it going to become cancer, or isn’t it? i understand your markers, but if a pathologist is going to start calling them when they get to a certain point of atypia, “this is a cancer,” it’s not going to be helpful in terms of you defining cancer risk. so, i think definitions by pathologists, in terms of those two cancers specifically, are needed before you can move forward in defining cancer risk. you have to define which biologically or which -- what is your definition of a cancer for prostate, and what is your definition for breast cancer versus atypical ductal hyperplasia?

steve chanock:
well, thank you. your point is very well taken,
and i would like to assure you that there is a tremendous amount of effort focused on this. so, in a number of the scans that have been done, both in breast and prostate, the question -- the issue of limiting them to individuals that have particular states of gleason, seven and above, and a tremendous amount of effort has gone into trying to standardize in a lot of the international studies.

this has been one of the challenges. what’s very interesting about prostate cancer is, when you look at the regions that you find from scanning or analyzing seven or even eight and above, and then you bring back in the five -- the rare fives and the six, you see, basically, the same regions are lighting up. and this has been a big disappointment. this
has been a hard thing, the idea of, are there genetic determinants or genetic factors that contribute to the risk of having very aggressive types of prostate cancer? we have been very hard-pressed to find them. we have one or two that we are just putting to press and another group has one or two that have withstood the test of time of being replicated over a number of studies, that spend a lot of time going through this issue of pathologic review and pathologic, sort of, transmissibility from one study to the next.

for breast cancer, the breast cancer consortium have been very careful in trying to separate out the early lesions from what -- in situ from what would be considered to be frank carcinoma. and a number of the studies are now moving on using molecular characterization
of saying, you know, luminal a, luminal b, triple negatives, her2, end up in those types of molecularly-defined types, to be able to more -- to better refine the analyses and to be sure that we are looking at comparable things, because a disease like breast cancer may be as many as nine or 10 different types of breast cancers. and so, i think as we accumulate these very large studies we do have opportunities to look at the sub-type specific effects and how they are, but i think, you know, this epidemiologic world works very closely with, you know, scores or hundreds of pathologists worldwide to address these questions.

male speaker 1:
and just one brief comment, your aging hypothesis has been given credibility by another pathologist here who gave a lecture last week, saying
that the bcl2 is known to be elevated with age.

male speaker 1:
bcl2 being the, what is it, b-cell lymphoma gene that is known to cause immortality in cells, and antiapoptotic. thank you very much.

steve chanock:
thank you, sir. thank you.

male speaker 2:
so, what is it about the y chromosome, and why is it disappearing faster?

steve chanock:
good. my wife’s here -- you can ask her [laughs]. well, the y chromosome is -- there are two aspects that are very interesting about it. one is, if you actually look at the topography
of the y, there’s a lot -- you know, there’s the pseudoautosomal region, shared with the x chromosome. so, the amount of real estate that we consider to be unique is a much smaller amount that we’re looking at, and, you know, there are relatively fewer genes that are involved in things other than sex determination and fertility that are on the y chromosome. so, you know, there is a lot of hand waving but i don’t think anybody has a very concrete definition of, you know, gc content, or something about the actual structure of where those breaks would take place. you know, it is a remarkable phenomenon that that’s tenfold more common in men than you see in women. remember i showed you that the women had higher rates of these larger events, but again, that’s still down in the weeds compared to where the men are.
male speaker 2:
so, what happens to the function of the people that are most, i guess losing most of it, most of the y chromosome?

steve chanock:
well, many of them go on, and may be president of the united states, or do whatever.

i mean, it’s a high enough fraction. if we look at the plco, for instance, a cohort where there are a lot of very-high-performing individuals, we haven’t done the mapping against their professional and their personal outcomes, and i wouldn’t care to do that, but it is, you know, an interesting evolutionary question, at what point does the y chromosome become less important biologically, and i think the mosaicism certainly does press
that question.

male speaker 2:
thank you. good luck in exploring.

daniel katsner:
all right. well, i think we probably ought to call it to a close, and please do join us for the reception over in the nih library, sponsored by our friends at the faes. thank you very much for coming.

[unintelligible dialog]

[end of transcript]