The man whose code – and attitude to life – brought much happiness to chemists

An image showing Dave Weininger

Dave Weininger (1952–2016)

Source: Courtesy of Johanan Weininger

American cheminformaticist who invented the Smiles chemical line notation

How do you become a legend? Intellectual brilliance is certainly part of it. But having wide interests and immense energy can really help to swell the reputation. The chemical informaticist Dave Weininger is just such an example of one who acquired a larger-than-life aura.

Weininger was born in Brooklyn, New York, the eldest of three sons. His father, Joseph, was an Austrian-born electrochemist and chess master who worked on polymeric electrolytes for batteries at General Electric. Growing up, the children sang science songs from vinyl LPs that their uncle gave them, and their mother, Marion, took them to lots of extracurricular activities. They spent many days at a camp in a forest in nearby Rensselaerville conducting science investigations and exploring the woods. When Dave was about 15, his mother enrolled him on a computer programming course at a local technical college. This led to him tinkering with computers at home and becoming an early digital native. Tragically, Weininger’s mother died in 1976, leaving a huge hole in the family.

The legend is that Weininger didn’t finish high school, but in reality he was offered an early place at the University of Rochester, US. He spent his junior year abroad in Bristol, UK, where he rode a Triumph Trident called Nigel that he may, or may not, have ridden in the Isle of Man TT. He later did a PhD in environmental engineering at the University of Wisconsin, Madison, studying the bioaccumulation of Monsanto’s extremely long-lived and lipophilic polychlorinated biphenyls by various species of fish in the Great Lakes. He would later joke that he was too lazy to fish all the trout from Lake Michigan and developed a computational model instead, using data provided by the newly established Environmental Protection Agency (EPA). Crucial to his models were the partition coefficients for pollutants. Looking at his thesis today, one is struck by his complete mastery of computer graphics in an age when punched cards and tape were still used for data entry and storage.

On graduation he was hired by the EPA to develop similar models for other chemicals. Weininger began to work with the chemicals databases that collected structure-activity relationships, an idea that had been developed by chemist Corwin Hansch at Pomona College in California. Hansch had started to relate the octanol/water partition coefficients of molecules to their chemical structure and, in turn, to their biological activity.

At the EPA, Weininger became extremely frustrated by the difficulties associated with entering chemical structures into the databases and searching for them. The Iupac nomenclature system is OK for a human to use for small molecules but rapidly becomes impenetrable past about eight carbons. It was also impossible for computers to deal with. On the other hand, the Wiswesser line notation, developed in 1949, was compact and worked fine for computers, but its complex code of letters and numbers was hard for a person to learn and difficult to read on the page.

Weininger decided to develop his own system that represented each atom by its atomic symbol, written in upper case for aliphatic atoms or lower case for aromatic ones. The text strings were based around a central chain, with branches enclosed in parentheses and ring connections indicated with numbers. Most importantly, it was based on a linguistic system with a tiny vocabulary and only six rules for the grammar. Molecules were written as a word that embedded chemical information; it could be read as easily by a computer as by a human. Dave thought that a high school student could learn it in about 15 minutes; experienced chemists were slower as they needed to unlearn traditional chemical nomenclature.
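To give a flavour of how small that vocabulary is, here is a minimal Python sketch that splits a Smiles string into its building blocks: aliphatic atoms, aromatic atoms, bonds, branches and ring-closure numbers. It is an illustration only, not Weininger's or Daylight's code. The names TOKEN, tokenize and summarize are hypothetical, and the pattern covers just a simple subset of the notation (no stereochemistry, no disconnected fragments, and bracket atoms are recognised but not interpreted).

```python
import re

# Token pattern for a small, illustrative subset of Smiles (an assumption,
# not the full grammar). Hydrogens are implicit and are not counted here.
TOKEN = re.compile(r"""
    (?P<bracket>\[[^\]]+\])            |  # bracket atom such as [NH4+], kept as one token
    (?P<aliphatic>Cl|Br|[BCNOPSFI])    |  # upper case symbol: aliphatic atom
    (?P<aromatic>[bcnops])             |  # lower case symbol: aromatic atom
    (?P<bond>[-=\#:])                  |  # explicit single, double, triple or aromatic bond
    (?P<branch>[()])                   |  # parentheses open and close a branch
    (?P<ring>%\d\d|\d)                    # ring-closure label (one digit, or % plus two)
""", re.VERBOSE)


def tokenize(smiles):
    """Return a list of (kind, text) pairs, or raise ValueError on unknown input."""
    tokens = []
    pos = 0
    while pos < len(smiles):
        match = TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(
                f"unsupported character {smiles[pos]!r} at position {pos}"
            )
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def summarize(smiles):
    """Count the main features of a Smiles string in the supported subset."""
    tokens = tokenize(smiles)
    kinds = [kind for kind, _ in tokens]
    return {
        "aliphatic_atoms": kinds.count("aliphatic"),
        "aromatic_atoms": kinds.count("aromatic"),
        "bracket_atoms": kinds.count("bracket"),
        "branches": sum(1 for _, text in tokens if text == "("),
        # Each ring closure uses its label twice: once to open, once to close.
        "rings": kinds.count("ring") // 2,
    }


if __name__ == "__main__":
    for name, smiles in [
        ("ethanol", "CCO"),
        ("acetic acid", "CC(=O)O"),
        ("benzene", "c1ccccc1"),
        ("aspirin", "CC(=O)Oc1ccccc1C(=O)O"),
    ]:
        print(name, smiles, summarize(smiles))
```

Run as a script, this prints a feature count for each of the four molecules. Aspirin, for example, comes out as seven aliphatic and six aromatic atoms, two branches and one ring. A production toolkit would go much further, building a full molecular graph and checking valences, which this sketch deliberately leaves out.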

Weininger called the system Smiles: Simplified Molecular Input Line Entry System, in part because it seemed to make his colleagues so happy. After a telephone call with his son, Joseph turned to Dave’s younger brothers and said ‘Your brother is on to something. He won’t win the Nobel prize but it’s big.’ But Weininger didn’t write up his work – he simply put Smiles out there for others to use. Eventually, his father got so frustrated by his son’s failure to publish the scheme in an academic journal that he wrote the paper for him. The acknowledgments thank Joseph L Weininger for ‘editorial assistance’.

An image showing a molecule and its SMILES code

Source: Courtesy of Wikimedia Commons

Ciprofloxacin written as a Smiles string (bottom) 

Weininger would not stay with the EPA for long. Career progression meant doing more management, which he hated, and less programming, which he loved. After three years in Duluth, Minnesota, he moved to California, joining Hansch and Albert Leo at Pomona College to code their algorithm that predicted log P for a molecule. The software package cLogP, released in 1983, started from Smiles to calculate log P and draw the structure on the screen, an amazing innovation at the time. Weininger and his brother Art also added Thor and Merlin, fast and elegant substructure lookup routines. The software began to sell so well that Pomona College worried that its charitable status might be in jeopardy.

Dave and Art then set up a company called Daylight with a business associate, Yosi Taitz. Annual user group meetings in Santa Fe were informal gatherings where the larger-than-life Dave, burly, smiling and charming, injected joy and energy into everything. When he spoke to someone they knew that they had his full attention; nothing else mattered. Ideas and collaborations flourished, with Dave always pushing for openness and sharing.

Around Daylight, numerous other informatics businesses flourished and Weininger would coin the phrase ‘Info Mesa’ to describe this cluster of creativity. The breadth of his interests is hard to believe. He was an outstanding chef, played folk banjo and guitar, built a small public observatory at his home, and collected and flew several aircraft, one of them a jet. He died of pancreatic cancer in 2016, leaving his family and many friends bereft. He left a huge legacy in chemical informatics and myriad stories. What is wonderful is how many of them were true.

Acknowledgments

This article would not have been possible without help from Yvonne Martin, Ant Nicholls, Jeff Blaney and Johanan Weininger, each of whom offered stories and reminiscences.