2008/07/14 10:45 David C. Schwartz, "The New Biology", ISSS Madison 2008

ISSS Madison 2008, 52nd Annual Meeting of the International Society for the Systems Sciences

This digest was created in real-time during the meeting, based on the speaker's presentation(s) and comments from the audience. The content should not be viewed as an official transcript of the meeting, but only as an interpretation by a single individual. Lapses, grammatical errors, and typing mistakes may not have been corrected. Questions about content should be directed to the originator. The digest has been made available for purposes of scholarship, posted on the ISSS web site by David Ing.

Intro by Gary Metcalf

David Schwartz

  • Professor of Chemistry and Genetics
  • Gary went to the TED conference, and heard Craig Venter mention David Schwartz, and some of the history that they worked on
  • Research on development of new genomic models:  nano- and microfluidics, bioinformatics ...

20080714_1045_ISSS_Schwartz.jpg


Want to enable universal human genome analytics

Problem #1:  Maybes

  • Information technology, simulations, analytical tools, can generate thousands of maybes per day
  • Findings in cyberspace that need experiments to become stories that become real scientific findings

Example:  a spreadsheet with thousands of entries

  • Out of thousands, some genes may be associated with cancer
  • Pick a few findings, working in a lab is really slow

Second problem:  biological space

  • 1/3 computer science, math and statistics
  • 1/3 chemistry, physics and engineering
  • Balance is biology and genetics
  • At Wisconsin, training the "new biologist"
  • Train them to move around the disciplines
  • Teach them to embrace team work
  • Use sophisticated instrumentation to plumb biological complexity

The new systems

  • Up to a few years ago, could write a Ph.D. thesis on one gene
  • Over the last few years, need to think about many different genes, think of whole genomes, e.g. 6 feet of DNA for humans
    • Instead of dealing with one gene, work with many genomes
    • When multiple genomes, then can deal with whole populations
    • Rapidly getting ability to sequence everyone
  • Many cells:  cell biologist, measure many cells, sensitivity is important, so need lots of cells
    • Have been doing a bulk sample, now can do on individual cells
    • To get statistical significance, have to take many measurements
  • Lots as individual molecules
    • e.g. PSA test, want sensitivity
    • Now, ability of ultimate sensitivity, at the single molecule level
  • Single amino acid, from mass spectrometry:  measure weight down to a hydrogen atom
    • Weigh and then weigh again, could lose an eyelash, flake of dandruff
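
The note above about needing many measurements for statistical significance can be made concrete with the standard error of the mean. This is a textbook relation, not something shown in the talk, and it assumes n independent measurements sharing a common standard deviation σ:

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
n = \left(\frac{\sigma}{\mathrm{SE}(\bar{x})}\right)^{2}
```

Under that assumption, halving the uncertainty in the mean takes four times as many measurements.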

Still have too many measures

Old approach:  tubes of DNA

  • DNA samples, go into a test tube with a robot
  • Suppose a million samples, then need a huge room just to house the samples
  • Brute force approach

New approach:  Take single DNA molecules

  • Trivial to do sophisticated measurements on each DNA molecule

Modes of inquiry

  • Have been able to do discovery science as large scale screens, based on chance
  • Also hypothesis-driven theories, based on mind
  • Combine discovery and hypothesis, so chance favours the prepared mind

New biology:  single cell, single molecule systems, high multidimensional databases

Biological impedance matching

  • Given advances in IT, the number of maybes has increased, while the number of stories is about the same
  • Are massive candidates possible? No

The loop:  hypothesis generation by an individual or small group, produce candidates

  • For each candidate, do an experiment
  • Get results, which are rarely as expected, so this leads to more experiments, leading to exponential growth
  • This is not sufficient, now
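
The loop above can be sketched as a simple count. This is only an illustration, assuming that every result spawns the same number of follow-up experiments; the function name and the branching factor of 3 are hypothetical, not figures from the talk.

```python
def experiments_needed(followups_per_result: int, rounds: int) -> list[int]:
    """Experiments per round when each result spawns a fixed number of follow-ups.

    The fixed branching factor is an assumption made for illustration.
    """
    counts = [1]  # round 0: one experiment for one candidate
    for _ in range(rounds):
        # every result from the previous round leads to new experiments
        counts.append(counts[-1] * followups_per_result)
    return counts


per_round = experiments_needed(followups_per_result=3, rounds=5)
print(per_round)       # [1, 3, 9, 27, 81, 243]
print(sum(per_round))  # 364
```

Even for a single candidate, five rounds of surprises at this rate already mean hundreds of experiments, which is why a small group cannot keep up with thousands of maybes per day.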

Article from Wired magazine, The End of Science

  • Don't think it will capture everything
  • Biology will remain an experimental science
  • Can imagine infinite complexity

Look at physicists

  • There are theoreticians, and then those who do the experiments (e.g. CERN)

Large data sets, e.g. CERN can detect 800 million proton-proton collisions a second, as much data as the entire European telecom network, and can pick out one collision

Biology doesn't work at such high energies

Generating large datasets involves automation, multiplexing and parallelization

  • This is a hard problem
  • Need to be able to rapidly put together complex, multidimensional experiments

Have engineer envy:  love their tools, CAD/CAM

  • e.g. creating a plastic money clip
    • After engineer finishes design, then manufacturing:  a solid printer lays down layer upon layer of polymer
    • At end, get a set of money clips
    • Gives detailed iteration, could test as an object
    • Fast from cyberspace into physical space
    • Why can't we do this in biology?

Would like biological CAD/CAM

  • From database, create simulations
  • Have visualizations, to create an experimental assembler
  • The components of cells, peptides, nucleic acids
  • Compare results against hypotheses, continue until satisfied
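
The cycle listed above (simulate, assemble an experiment, compare against the hypothesis, continue until satisfied) might look like the following as a control loop. This is a toy sketch, not the speaker's system: `simulate`, `run_experiment`, the noise level, the tolerance and the halfway update rule are all hypothetical stand-ins, and a single number stands in for a whole experimental result.

```python
import random


def simulate(hypothesis: float) -> float:
    """Hypothetical stand-in for a simulation built from a database."""
    return hypothesis  # the prediction is just the current hypothesis


def run_experiment(true_value: float, rng: random.Random) -> float:
    """Hypothetical stand-in for an experimental assembler: a noisy measurement."""
    return true_value + rng.gauss(0.0, 0.05)


def design_loop(true_value, initial_hypothesis, tolerance=0.1,
                max_iterations=50, seed=0):
    """Iterate simulate -> experiment -> compare until prediction matches measurement."""
    rng = random.Random(seed)
    hypothesis = initial_hypothesis
    for iteration in range(1, max_iterations + 1):
        predicted = simulate(hypothesis)
        measured = run_experiment(true_value, rng)
        if abs(measured - predicted) <= tolerance:
            return hypothesis, iteration  # satisfied: stop iterating
        # revise the hypothesis halfway toward what was measured
        hypothesis += 0.5 * (measured - hypothesis)
    return hypothesis, max_iterations


final_hypothesis, rounds_used = design_loop(true_value=1.0, initial_hypothesis=0.0)
print(final_hypothesis, rounds_used)
```

The point of the sketch is the shape of the loop: the faster each pass from cyberspace to physical space and back, the more iterations fit into the time one experiment takes today.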

Gutenberg's time, 1400s, invention of movable type

  • One page, carved out one plate
  • This is how we do experiments today
  • Gutenberg came up with the idea of movable type, can change, and don't have to throw away

Movie:  water droplets

  • Green experimental protein, red experimental protein, could view each
  • Bring these together in juxtaposition, look at interactions

Approach the ability, in biological experimentation, to do movable type

What's missing:  operating system

  • What to do with the measurements
  • The new biologists have to handle this

Questions

Component-based programming and interfaces.  Movable type is simple, in biology want interactions.  Interesting, but a long way away.

  • Mindful of pitfalls
  • When we put systems together, if we don't get answers, we don't get funding
  • There are some emerging companies, e.g. microdroplet is getting commercial, think may become universal
  • Other ways to represent experimental motifs
  • Can represent gene sequences on optical fibres
  • IBM to hack circuit boards, could be used

Epigenetics, autogenetics?

  • Epigenetics is the great frontier
  • Environment and parts that aren't genetic
  • Biological field is in infancy
  • DNA bases, ACTG as describing a blueprint isn't correct
  • Little notations on DNA molecule, DNA manipulation
  • Up to 2 months ago, didn't have proper means to see where the manipulation takes place

Molecular biology papers are filled with proteins affecting cellular processes.  New causation?  Network causation, how do we review these papers?

  • Everything is in turmoil, similar to change in 19th/20th century from classical mechanics to quantum mechanics
  • Think it's real, doesn't fit in frameworks
  • Systems biology, pharmaceuticals have been doing this since the 20th century
  • Problem, too much interpretation and too few measurements, trying to fit into old paradigms

Network theory has been involved in systems biology, but there are other systems processes that other scientists aren't taking advantage of.

  • Directly contact investigators and make suggestions
  • Then how to write a grant review

New experiments.  Hierarchies and modularity.  Conceive of experiments as hierarchical and modular.

  • Thinking in terms of experimental motifs