Artificial intelligence is moving into all areas of engineering, science, business and industry; indeed, AI is now the dominant approach, pushing others to the background.

Recently, DeepMind, owned by Google, demonstrated an algorithm called AlphaFold that predicts the three-dimensional structure of a protein from its amino-acid sequence. This is a fundamental problem in biology. Laboratory methods are laborious, and therefore progress has been slow. AlphaFold would make the process very fast and thereby greatly accelerate important applications such as discovering new drugs.

While protein-structure prediction grabs one’s attention, AI is being used to address many practical problems, such as text mining and facial recognition. The basic building block of AI is a deep neural network.

These are extremely complex mathematical structures consisting of very simple computational elements assembled into numerous layers to produce computational machines. Data are fed in at one end, and the variables determining the desired structure come out at the other.

A single neural network can have hundreds of billions of parameters. These act like many billions of numerical knobs to turn, thereby allowing neural networks to represent extreme complexity.
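
To make the picture concrete, here is a minimal sketch in Python using the PyTorch library. The layer sizes and feature counts are arbitrary choices for illustration, not anything taken from AlphaFold or any production system; the point is only that a network is a stack of very simple elements whose weights are the "knobs" available for tuning.

```python
# A small feed-forward network: simple elements (linear maps and
# nonlinearities) assembled into layers. Sizes are illustrative only.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 512),  # input layer: 128 features in, 512 out
    nn.ReLU(),            # a very simple nonlinear element
    nn.Linear(512, 512),  # hidden layer
    nn.ReLU(),
    nn.Linear(512, 64),   # output layer: 64 values describing the target
)

# Every weight and bias is one numerical "knob"; count them all.
n_params = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {n_params:,}")
```

Real networks scale this same construction up by many orders of magnitude, which is why the parameter counts run into the billions.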

None of this would be possible without high-performance computing. Major tech companies and government agencies such as the US Departments of Energy and Defense continue to enhance their computational power so that ever larger (deeper) networks can be built.

We are looking at controlling highly complex systems, such as climate and cells, and at developing new sources of energy and military weapons, so one should not be surprised that China is investing heavily in AI. AI is also being used to study Covid-19 and other diseases, such as cancer.

OK, but the picture is not totally clear. AI bypasses the classical scientific method of trying to understand physical mechanisms and representing those mechanisms mathematically in a model of nature. Given such a model for protein structure, one could in principle derive the structure of an individual protein from its amino-acid sequence.

Such profound understanding of nature at multiple levels has generally not been forthcoming; the experimental and theoretical difficulties are prohibitive. For instance, even with the amino-acid sequence identified, protein structure has had to be determined by laboratory methods.

On the other hand, a neural network “learns” from data. Given data from previous work that has determined the structures of some proteins, the network is “trained” on them, which means its billions of parameters are tuned so that the network produces the desired structures from the amino-acid sequences.

Finding an appropriate neural network and a satisfactory learning process requires substantial skill on the part of the computer scientists designing the network. 

Once designed, a neural network is tested on other (test) data. It has to “generalize” from the data on which it was trained to data it has not seen before.

Many problems arise during training and testing. A deep network is extremely complex and can “overfit” the training data, meaning that it works well on the training data but does not generalize. Even if it works on the test data, does it work on data that are structurally different from the test data?
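
As a toy illustration of training, testing and overfitting, here is a sketch using the scikit-learn library in Python. The data set is synthetic, and the network size, sample size and seeds are invented for the example; an over-sized network is fitted to a small sample deliberately, to invite the overfitting described above.

```python
# Train on one portion of the data, test on a held-out portion, and
# compare the two fits to spot overfitting. Everything here is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                       # 200 samples, 20 features
y = X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Hold out data the network never sees during training.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A deliberately over-sized network on a small sample invites overfitting.
net = MLPRegressor(hidden_layer_sizes=(512, 512), max_iter=5000, random_state=0)
net.fit(X_train, y_train)

print("fit on training data:", net.score(X_train, y_train))  # typically close to 1
print("fit on test data:    ", net.score(X_test, y_test))    # usually noticeably lower
```

If the second number is much worse than the first, the network has memorized the training data rather than generalized; and even a good test score says nothing, by itself, about data that differ structurally from the test set.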

In classical science, a scientific model is tested on data randomly sampled from the population of interest (in practice, the full model is not tested but only components of it). Such testing is impossible if one is confined to using existing data.

Derek Lowe, a drug researcher, recently commented in Science,

“I hate to be this blunt about it, but things like coding and hardware design are almost completely human-bounded activities.

“Now, I appreciate that mathematics sets up fences in algorithmic optimization and the like, and that chips are constrained by things like die size, photolithography technology, heat dissipation and so on. But once inside those sorts of things, you have relatively free rein for your imagination and talents, at least compared to what happens in drug discovery and its allied fields.

“It’s a playground compared to the trackless swamps of a living cell or a whole organism, most of which we don’t even understand in enough detail to even grasp why we failed that last time.” 

But the point of big-data-driven AI is that we do not need to understand this detail. We will never fully understand such complex systems, and we desire to control disease and affect the climate, so we bypass the understanding by cleverly using data. Is this reasonable? A more relevant question may be: Is there a practical alternative, especially when we have access to such massive computational power?

There is certainly a discussion to be had here. It is not pleasant to contemplate giving up on scientific knowledge or applying tools whose domain of application is not known. To address these issues, it is important to develop new theories concerning big data and AI that will provide an epistemic foundation for the kind of knowledge they yield.

For instance, in classical science probability theory is used to characterize the domain in which a model applies. Often this takes the form of testing on a random sample from a population of data. Can we develop a theory that applies when a neural net is tested on previously collected data sets that have not been drawn according to any statistical protocol?
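
The difficulty can be seen in a toy example, again in Python with scikit-learn. The "population", the simple model (standing in for any predictor) and the selection rule are all invented; the point is that an error estimate from a random sample and one from a data set gathered under a selection rule need not agree, and only the former is covered by standard sampling theory.

```python
# Compare an error estimate from a random sample of the population with one
# from an "existing" data set collected under a selection rule.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
x = rng.normal(size=(20000, 1))                       # the whole "population"
y = x[:, 0] ** 2 + rng.normal(scale=0.3, size=20000)  # true mechanism is nonlinear

model = LinearRegression().fit(x[:1000], y[:1000])    # a deliberately crude model

# (a) error estimated on a random sample drawn from the whole population
idx = rng.choice(20000, size=1000, replace=False)
print("random-sample MSE:", mean_squared_error(y[idx], model.predict(x[idx])))

# (b) error estimated on a data set recorded only for extreme cases
sel = x[:, 0] > 1.5
print("selected-set MSE: ", mean_squared_error(y[sel], model.predict(x[sel])))
# The two estimates differ markedly; only (a) carries the usual guarantees.
```

A theory of the kind suggested above would have to say what, if anything, estimate (b) tells us about performance on the population at large.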

It is also critical that existing scientific knowledge be integrated with data, so that the AI machinery is constrained by what we do scientifically know and can be applied fruitfully where data are limited by availability and cost. Here one might think of having only 50 tissue samples, or of a single data point that costs $1 million.

As an example, we may have substantial knowledge of gene regulation in a complex genetic model, and it would behoove us to use that knowledge in conjunction with new data to design a network. In this sense, we are joining two types of knowledge: scientific knowledge and data. This is another place in which substantial new theory is needed.

We should not be bypassing traditional scientists, because skilled experimentalists are needed to obtain the most useful data, and when understanding can be applied it greatly enhances the AI computational machinery.

Unfortunately, Lowe’s opinion gets little traction as classical science and engineering, which require both bench scientists and mathematicians, are being starved for funding. If you are a young person with strong computer-science skills, are you not likely to be tempted by the gigantic salaries and resources provided by big tech to work on big data and ever deeper neural networks?

Does AI deserve all the hype? Yes, but with reservations.

Edward Dougherty is distinguished professor of engineering at Texas A&M University.