U.S. Department of Health and Human Services

Animal & Veterinary


MUMS Research - Species Grouping

Dr. John Babish  

DR. BABISH:  Thank you very much, Alistair.  I am going to go through a lexicon of words this morning that are probably new for most of you.  What I would like to do is to introduce some of the concepts of bioinformatics that have been developing over the last five years, primarily in the areas of cancer research.  They include cluster analysis, self-organizing maps, and mathematical techniques to illustrate connectivity.  So the first word that I want to reintroduce to the lexicon is species grouping.  If you do a Medline search on species grouping, what will come up as a synonym is crop grouping.  I would suggest that, historically, that term comes from the IR-4 program and the plant and agricultural people.
  
I believe species grouping for what we will be talking about and what we would like to do in MUMS is more appropriate for a couple of reasons.  One, I don’t believe there is any one of us that would like our dog or cat to be referred to as a crop.  Secondly, I think that using the term crop grouping imposes a limit to how you define the problem.  For example, if you are thinking of catfish as a crop, you might think of a catfish fillet as what you are interested in grouping with a chicken wing.  So crop grouping is not exactly a term that allows you to be expansive in your thinking.  You will see that expansive thought is an important part of cluster analysis.
  (Slide.)
  
Now let’s take a look at the problem.  We spent yesterday seeing what the problem was worldwide, and what I was struck by was the fact that there is an enormous similarity in minor drug use around the world, and I was also struck by the lack of movement in addressing the issues we have identified.  We are all pretty much stuck in the same place.  The figures we will see in the next couple of slides are basically from the NRSP-7 program, and they represent 23 years of work in minor species.
  
The economics of species grouping is very interesting because, taken individually, these minor uses or minor species all represent very small markets.  But if you look at those markets in aggregate, they become very big.  For example, one set of numbers will bring the total of this diverse aggregate to about 1.5 billion dollars in the United States.  The second thing that most analyses of the market impact frequently ignore is the accessory industries that are built around these niche markets, and those will multiply this figure by about three-fold.  Another problem in trying to get your hands around these figures is that they are very loose.  You can make these figures go anywhere from about 1.5 to 4.5 billion dollars, or you can make this figure almost double.  So there is a range of estimates.  I tried to take the lower estimates of each of these, but I also want to give you what I have seen as some of the upper estimates.
  (Slide.)
  
So in reality we have a large aggregate market with a large number of associated accessory industries.  As we saw yesterday, this is a low figure for a new animal drug, but it is close to the median on the figure.  It ranges from 50 million or so down to about eight million.  We have been told in our review last year that the cost of adding a new species to the label is about two to eight million dollars.  The NRSP-7 cost for a species is about $300,000.  That figure is artificially low, because it doesn’t include the contribution of the states.  There aren’t any state officials here, so I can say that what they are paying is not included.  With that contribution, the figure is at least $500,000.
  
Currently we have 13 active projects.  Again, that is a slight underestimate because some of those species-drug combinations include several species.  For example, in aquaculture an oxytetracycline fish project may involve four species.  So there are really about 20 active projects and 41 potential projects, again a slight underestimate.  That might include 50 unique species-drug combinations.  So if we look at this, the cost becomes relatively high, even using the low figure for drug approval, but what is most important to me, and I think is important to the industry, is the time that this would take.  NRSP-7 has been in existence for 23 years, and we have 31 approvals.  A lot of those approvals came early on before Steve and his crew were involved, and now it is very difficult and the bar is raised.
  (Laughter.)

Don’t put that in the transcript.
  (Laughter.)
  
Off the record.  No.  As we heard yesterday, worldwide the criteria are becoming more and more strict, so these figures are underestimates.  If we want to address drug approvals in the manner that we have done in the past as single species drug approval, then it is going to take us anywhere from 25 years to somewhere into the 22nd century.  This estimate is based upon what we know now, and doesn’t include the shifting sands of regulation.  It also doesn’t include the changes in disease pattern or the addition and loss of minor species.  So the regulatory problem is big; the time frame is unacceptably large.  In my opinion, the only effective answer is species grouping or clustering.
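The scale of the time problem can be checked with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted in the talk (31 approvals in 23 years, about 50 pending species-drug combinations, at least $500,000 per species under NRSP-7). The `slowdown` multiplier is a hypothetical parameter introduced here to stand in for stricter criteria, and this is not necessarily how the speaker arrived at his own range.

```python
# Figures quoted in the talk.
YEARS_OF_PROGRAM = 23           # NRSP-7 has existed for 23 years
APPROVALS_TO_DATE = 31          # approvals obtained in that time
PENDING_COMBINATIONS = 50       # unique species-drug combinations still to do
COST_PER_SPECIES_USD = 500_000  # NRSP-7 cost including state contributions


def years_to_clear(pending, approvals, years, slowdown=1.0):
    """Years needed to approve `pending` combinations one at a time.

    Uses the historical rate (approvals per year). `slowdown` is a
    hypothetical multiplier (> 1 means approvals now take longer).
    """
    rate_per_year = approvals / years
    return pending / rate_per_year * slowdown


def total_cost(pending, cost_per_approval):
    """Total cost if every combination is approved separately."""
    return pending * cost_per_approval


baseline_years = years_to_clear(
    PENDING_COMBINATIONS, APPROVALS_TO_DATE, YEARS_OF_PROGRAM
)
slower_years = years_to_clear(
    PENDING_COMBINATIONS, APPROVALS_TO_DATE, YEARS_OF_PROGRAM, slowdown=2.5
)
baseline_cost = total_cost(PENDING_COMBINATIONS, COST_PER_SPECIES_USD)

print(f"At the historical rate: {baseline_years:.1f} years")
print(f"With a 2.5x slowdown (assumed): {slower_years:.1f} years")
print(f"Cost at the NRSP-7 figure: ${baseline_cost:,}")
```

At the historical rate the backlog alone takes a few decades, and any plausible slowdown pushes it much further out, which is consistent with the range described above.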
  (Slide.)
  
Now this information is from the NRSP-7 database, where we have represented 10 aquatic species, six avian species, five ruminant species, and three fur-bearing species.  We want to start somewhere and want to be as cost-effective as possible, so we will be looking at the possibility of species grouping in these avian and aquatic species.
  (Slide.)
  
Now in the approaches to species grouping, here are some new words that cover some old technology or technology that has been invented by the bioinformatics people; and they are derived from the words genomic or genome.  Okay.  So we all know what the genome is.  It is our DNA sequences.  The first approach is the characterization and clustering of metabolites of representative drugs in the species of interest.  Now we have seen the literature.  I have been to presentations, and this type of characterization or clustering technique is being applied, and it is the one most familiar to most of us.  That is where we take a model drug and look at the range of metabolites among various species.  Then we look at the similarities of those metabolites and cluster or group those species with similar metabolites.  That is now called metabolomics.  Then there is the characterization and clustering of xenobiotic transformation capacity in the species of interest.  We can do that now with genomics, proteomics, and model metabolomics.  The advantage of model metabolomics is that, from looking at the metabolism of a model substrate, you can characterize the biotransformation system in a general way that allows you to ask what-if questions, and it is not drug specific.
  (Slide.)
  
This is just a quick review of biology.  Here the genomics area covers not only the DNA sequencing; what is most important is not the DNA, but the transcription of the DNA.  It is not your genome that is important, but your transcriptome.  That is what gives you your individuality.  Your transcriptome is the message, the messenger RNA.  That can be measured on a high-throughput basis with RT-PCR, real-time polymerase chain reaction.  That way you can quantify the message.  Proteomics looks at the active protein or primary protein products.  Then these active proteins produce metabolites, which are measured through metabolomics.
  (Slide.)
  
Now this sounds pretty complicated, and it is really complicated and really expensive when you are doing things like trying to understand various cancers or when you are doing whole genome analysis.  Whole genome analysis will look at the expression of 20,000 to 30,000 genes under various conditions, either normal or diseased states, and in these situations you have a series of impossible questions with no real answers.  However, we want to apply this technology to species grouping, and species grouping is specific for looking at the ability to metabolize drugs, absorb drugs, distribute drugs, or excrete drugs.  The question becomes relatively simple with respect to bioinformatics.
  (Slide.)
  
Here is a schematic of what we want to look at.  What we want to look at is determining drug action across species, and that can be divided into mechanism of action, absorption, distribution, metabolism, and excretion.  For all of these processes, mechanism of action falls out on a single-drug basis; but for the other processes we can identify those genes, transcripts of those genes, and proteins associated with those transcripts that relate to the possibility of drug metabolism in a particular species and among species.  For our example today we will take metabolism, and within metabolism we will examine the phase one transformation system. 
  (Slide.)
  
The question and the range of the question become even more straightforward when you look at drug biotransformation by the cytochrome P-450 system.  You can see here, in this pie chart for human drugs, the extent to which their metabolism is associated with individual isozymes of P-450.  You can see here that CYP3A, sometimes known as the garbage disposal of drug metabolism, accounts for more than half of all the drugs that undergo biotransformation.  Looking at CYP3A expression and protein in the species of interest and grouping on those drugs that are metabolized by CYP3A is a very important part of one of our variables.
  (Slide.)
  
This is just a breakdown again.  The cytochrome P-450 system is very, very well described in the genetics field.  There are 230 genes identified as cytochrome P-450 genes.  Those genes are divided into families.  The families are indicated by the Arabic number, 1, 2, and 3 here, and then the subfamilies of P-450 by the capital letter; then individual isozymes within the subfamily are designated by the number following the subfamily letter.  Here we see those that are important, and we have characterized what types of molecules each one of these members of subfamilies can metabolize.  Now that is important, as you will see at the very last slide when we get to a process called annotation of these variables.
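The naming scheme can be made concrete with a small parser. This is a minimal sketch, assuming the standard convention in which a name such as CYP3A4 reads as family 3, subfamily A, isozyme 4. The `CypName` type and `parse_cyp` function are hypothetical helpers written for illustration and are not part of any existing library.

```python
import re
from typing import NamedTuple, Optional


class CypName(NamedTuple):
    """Parts of a cytochrome P-450 name (hypothetical helper type)."""
    family: int             # Arabic number, e.g. 3
    subfamily: str          # capital letter, e.g. "A"
    isozyme: Optional[int]  # number after the letter, None if absent


# CYP + family number + subfamily letter + optional isozyme number
_CYP_PATTERN = re.compile(r"^CYP(\d+)([A-Z])(\d+)?$")


def parse_cyp(name: str) -> CypName:
    """Split a name like 'CYP3A4' or 'CYP3A' into its parts."""
    match = _CYP_PATTERN.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised CYP name: {name!r}")
    family, subfamily, isozyme = match.groups()
    return CypName(
        int(family),
        subfamily,
        int(isozyme) if isozyme is not None else None,
    )


for example in ("CYP1A2", "CYP3A4", "CYP3A"):
    print(example, "->", parse_cyp(example))
```

Parsing names this way makes it straightforward to group measurements by family or subfamily (for example, pooling everything in 3A) before any annotation or clustering step.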
  (Slide.)
  
Here we have an example of a proteomic and metabolomic study that was conducted by Dr. Bowser at Cornell.  What he has been doing in his species grouping program is to characterize the cytochrome P-450 system by western blotting and model substrate enzyme analysis.  The example that we will go through today will include the tilapia, the summer flounder, the hybrid striped bass, and the channel catfish.
  (Slide.)
  
The study that I am going to show you some of the results from included treating 50 fish.  The fish were treated with oxytetracycline for 10 days, and then on post-treatment days one, six, 10, and 21, body weights, liver weights, and hepatic P-450 1A1 and 3A4 were assessed.  The 3A family is the garbage can of P-450 isozymes, and the 1A2 is very selective.  Now oxytetracycline is not metabolized, but the intent of this study was to look at the effect of oxytetracycline administration on the amount of these isozymes that are present as well as their activity.  That in effect would influence the metabolism of other drugs or chemicals by the fish.  These 3A and 1A2 activities were monitored predosing and on days one, six, 10, and 21.
  (Slide.)
  
Here is a bar graph indicating the 95-percent confidence interval of three model substrates.  One of them, CEC, represents the 1A2; BFC and benzyl resorufin both represent 3A4.  What you see here is that with the hybrid striped bass following oxytetracycline treatment there are no differences at all in the post-observational period.  You see in the hybrid striped bass that the administration of oxytetracycline does not change drug metabolism by CYP3A or CYP1A2 in the post-observational period.
  (Slide.)
  
In the channel catfish undergoing the same protocol, we do see differences in the 3A4 isozyme.  Those differences are reflected early and then they become positive, or they are significant, only with the BFC model substrate.
  (Slide.)
  
Immature tilapia undergoing the same protocol of treatment now do show a dramatic decrease in activity of 1A2, showing that after an oxytetracycline treatment tilapia would metabolize drugs by 1A2 at a much slower rate.  Many environmental contaminants are metabolized by 1A2, and this may have consequences for the growth of the fish in the presence of these contaminants.
  (Slide.)
  
Now that is an example of metabolomics.  Now what is different is that Paul also looked at proteomics.  He looked at the expression of CYP1A2 again during this post-observational period, and here you can see those results on a western blot.  This is where the hepatic proteins are put into a polyacrylamide gel that separates them on the basis of molecular weight, and then you visualize them with an antibody to a particular protein.  The density of the band is proportional to the amount of protein present, so here you can see how much CYP1A2 is present.  One of the things we look for is how many bands are visualized, and with fewer bands you infer a good association of the antibody to the protein.  This antibody is against a human 1A2, so it is interesting and it is also nice to see that you get clean bands here.  This graph shows the relative density of those blots with particular outlier fish indicated.  The point here is that we can start using materials that are made for human isozymes against other species.  In this case in the hybrid striped bass it is a very clean blot.
  (Slide.)
  
However, the 3A4 gives us a relatively nice band where we would expect the protein but also shows a lot of other bands.  So this antibody is not as specific as the 1A2 antibody was.  The point of this is to say that in measuring the transcriptome, the proteome, and the metabolome there is a validation process that has to be performed across all the species.  This is done in spades for the human and for mice and rats.  The zebrafish is becoming a popular model in toxicology, so the data on the zebrafish are being developed.
  (Slide.)
  
How do we put this all together?  Well, the ultimate in species grouping can occur when you make a gene expression database from the transcriptome, and a protein expression database.  In this slide you have both a 2D gel and a 1D gel.  This one is from my laboratory.  Finally, you need a database from the metabolome.  The metabolome can best be measured by model substrates, and the information on gene expression comes from messenger RNA.  You can also get information from animal tissue banks containing historical histology.  Histological slides can be probed with antibodies or can be probed for message, and that information can be put into a database.  Here is where the bioinformatics people come in.  There are any number of algorithms that will allow you to cluster gene expression on the basis of whatever theory you want.  That depends, however, on gene annotation, protein annotation, and the knowledge of protein interactions.
  
This process becomes very difficult when you are trying to understand the mechanism of cell transformation, a normal cell transforming into a tumor cell.  The process can also be used to identify gene function.  When you are looking at the expression of 30,000 genes you really don’t know what most of them do.  So cluster analysis is used and gene annotation is applied to establish why genes cluster into particular groups.  For example, we were working on breast cancer development and looked at cell cycling in a breast cancer cell, and clustered genes to find functions of several unknown genes.  This is a typical example of the use of cluster analysis.  It is a very difficult process with a lot of unknowns, and annotation becomes very important.  Here the problem is relatively simple, even if you don’t believe it at this point in the presentation.  The process is simple because annotation is known.  You know the genes you are going to characterize.  They represent a relatively small subset of the 30,000 in the genome.  At the most, we may as a group come up with 500 genes and transcripts that we are interested in.  Gene annotation for all of those would be related to physiological processes, biotransformation, or exclusion processes.  So gene annotation, which is a very, very difficult part of cancer research, becomes a relatively simple and straightforward step.  A second point on this is that many algorithms are available for clustering gene expression, protein expression, and activity.  But there are very few that can cluster all three, and what we have been working on is the clustering of the phospho-proteome.  This -ome consists of phospho-proteins that are expressed in various disease states by the transcriptome.  This clustering becomes difficult, but clustering of any selected proteins, metabolites, or transcripts is a very straightforward process.
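To illustrate how simple the clustering step is once the variables are chosen, here is a minimal sketch of agglomerative (average-linkage) clustering on model-substrate activity profiles. The species labels and every number in `PROFILES` are invented for illustration and imply nothing about any real species; the choice of Euclidean distance and average linkage is also an assumption, since the talk does not name a specific algorithm.

```python
from itertools import combinations
from math import dist

# HYPOTHETICAL activity profiles in arbitrary units, invented for
# illustration only. Columns: one 1A2-type model substrate and two
# 3A4-type model substrates.
PROFILES = {
    "species_A": (0.9, 2.1, 1.8),
    "species_B": (1.0, 2.0, 1.9),
    "species_C": (3.2, 0.6, 0.5),
    "species_D": (3.0, 0.8, 0.4),
}


def average_linkage(group_a, group_b, profiles):
    """Mean Euclidean distance over all cross-group pairs of species."""
    total = sum(
        dist(profiles[x], profiles[y]) for x in group_a for y in group_b
    )
    return total / (len(group_a) * len(group_b))


def cluster(profiles, n_groups):
    """Merge the two closest groups repeatedly until n_groups remain."""
    groups = [frozenset([name]) for name in profiles]
    while len(groups) > n_groups:
        a, b = min(
            combinations(groups, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        groups = [g for g in groups if g not in (a, b)] + [a | b]
    return groups


groups = cluster(PROFILES, n_groups=2)
for group in groups:
    print(sorted(group))
```

The same procedure applies unchanged if the profile columns are transcript levels or western blot densities instead of enzyme activities, provided the variables are scaled comparably first.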
  (Slide.)
  
So in conclusion -- how is that for timing?  Alistair is thankful.  Species grouping, through characterization and clustering of biotransformation or excretion processes, all of the variables associated with the determinants of drug activity, is a relatively straightforward process and can be more rapid and cost-effective than simply doing metabolite characterization.  Metabolite characterization is important and we have a lot of valuable information right now, but by adding a clustering algorithm we can move forward much more rapidly.  Model substrates and model inhibitors are better than specific drugs for comparing species due to their selectivity in characterizing specific isozymes of cytochrome P-450.
  
The first impulse for a clinician or even a researcher is to use the drug of choice, but it is much better to use a series of model substrates, and there are well-characterized, highly fluorescent model substrates that can be used to characterize the proteome in biotransformation and the determinants of drug activity.  Antibodies for characterizing the proteome are more problematic.  They would have to be developed.  Similarly, primers for the message and RT-PCR methodology would have to be developed on a species-by-species basis.  But the addition of genomic and proteomic characterization could allow for the development of a query database that can be cost-effective and the only solution to a problem that will stretch across into the 22nd century.  I want to be the first one to mention problems extending into the 22nd century in a public forum; and I thank you very much for your attention.
  (Applause.)
  
DR. WEBB:  Thank you, Dr. Babish.  Our next speaker is Dr. Ronald Baynes.  In his bio, he didn’t give away where he trained -- I believe it was Georgia.  I apologize, ---.  Anyhow, he currently is an associate professor of pharmacology at North Carolina State University.  He is also an assistant director of the Eastern Regional Access Center of FARAD, which stands for Food Animal Residue Avoidance Databank, and his main focus of research today is on the drug and chemical food safety aspects of strategies for extrapolating drug residue withdrawal intervals for extra-label drug use in food animals.  With that, we will --.