Chemists and biologists have different expectations when it comes to data, says Derek Lowe

How much do medicinal chemists and their biology colleagues really trust each other’s data? In the end, they have to, because drug discovery is a team sport. But the interpretations that each group lends to the available numbers can vary widely. And in this case, I’m going to have to fault my fellow chemists (and myself) for being more often in the wrong.

There are mitigating circumstances. The biggest errors in this area are typically made early in a person’s career, and by chemists who are not used to working with biological assay data. If you’re accustomed to the level of detail that’s possible in chemistry, such as high-res mass spec data down to four decimal places, you may be set up for two complementary mistakes. The first one is to overestimate your own precision and accuracy, and the second is to believe that everyone else attains that same high level, too.


Chemists tend to think of themselves as able to work to pretty exacting levels when the conditions call for it. Some readers may be familiar with a particular (and extremely annoying) style of writing up experimental details for publication: ‘1.06 equivalents of the base was added dropwise at –53°C, and the resulting mixture stirred for 47 minutes. The temperature was raised to –28°C, and the reaction quenched by addition of pH4.16 buffer solution (freshly prepared)…’ That sort of thing. The implication is that the authors are capable of working to these standards, and that you obviously aren’t. That has to be the case, since you didn’t get the reaction yields they claimed, now did you? Clearly, no other explanation is possible. Write to the authors and ask them, if you’d like to be told that straight from the source.

But reaction yields vary, of course, and there are lots of factors to chase down. Process chemists appreciate this, as the economics (and regulatory status) of large-scale drug synthesis depend on these things being well understood and controlled. Standard bench chemistry, though, tends to be quite a bit less robust, which is worth remembering, especially for a new member of the medicinal chemistry department encountering screening data for the first time. They will see that compound 23204 has a half-maximal inhibitory concentration (IC50) of 50nM, while its close analogue, compound 23205, comes back as 100nM. Obviously, they will note, the second compound is half as potent.


Ah, but not so fast. Screening data can come out without error bars on it, so you’ll need to supply them yourself, on the fly. A good rule of thumb, for most in vitro protein assays, is that everything is the same within a factor of two (at the very least). So there’s no particular reason to assume, in the absence of other data, that ’204 and ’205 are any different at all. If you really want to infer anything about structure–activity relationships, you’ll want to have them both tested again.
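To make that rule of thumb concrete, here is a minimal sketch in Python. The function names and the default tolerance of 2.0 are illustrative assumptions rather than part of any real screening package, and the factor of two is only the rough guide described above, not a statistical test. The sketch checks whether two IC50 values sit within the assumed assay noise, and converts them to pIC50 (the negative log of the molar IC50) to show how small the gap looks on a log scale.

```python
import math


def within_assay_noise(ic50_a_nm, ic50_b_nm, fold_tolerance=2.0):
    """Return True if two IC50 values (in nM) differ by no more than
    fold_tolerance. The default of 2.0 is the rule-of-thumb assumption,
    not a measured error bar."""
    if ic50_a_nm <= 0 or ic50_b_nm <= 0:
        raise ValueError("IC50 values must be positive")
    fold_difference = max(ic50_a_nm, ic50_b_nm) / min(ic50_a_nm, ic50_b_nm)
    return fold_difference <= fold_tolerance


def pic50(ic50_nm):
    """Convert an IC50 in nM to pIC50, i.e. -log10 of the IC50 in mol/L."""
    return 9.0 - math.log10(ic50_nm)


# The two compounds from the example: 50 nM versus 100 nM.
same_within_noise = within_assay_noise(50.0, 100.0)
log_gap = pic50(50.0) - pic50(100.0)  # about 0.3 log units
print(same_within_noise, round(log_gap, 2))
```

On these assumptions, ’204 and ’205 come out as indistinguishable: a twofold difference is only about 0.3 log units, which is why a single run of each says little about the structure–activity relationship.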

The situation gets even fuzzier when compounds are evaluated in a cellular assay, and if you think those numbers are woolly, then just wait for the whole-animal data. The number of variables increases, and increases again, and sometimes things just don’t work for reasons that are hard to clarify. The best hope is to control for everything reasonable, and always have a standard (in every new experimental run) for comparison. By the time you get to rats and mice, the variability really does surpass anything that an organic chemist will have been used to. In case you’re wondering, the situation does not improve on moving to humans, as any clinician will tell you in a resigned tone of voice.

Biology really is less precise than chemistry; there’s no escaping it. There are more variables, and too many of them are just unknown. We chemists will always do well to keep that in mind, but that doesn’t give us the right to look down on our colleagues, either. We can be glad (or even proud) that our own science is more easily controlled – well, for the most part – but a living cell is more complicated than anything that almost any other science has to offer. The biologists aren’t thrilled by lack of reproducibility, either, but they know that it’s the price of working at what is often the edge of human understanding.

Derek Lowe (@Dereklowe) is a medicinal chemist working on preclinical drug discovery in the US and blogs at In the Pipeline