Infectious Diseases and the Evolution of Virulence

Elliott Sober

We all are anxious to know when the COVID-19 pandemic will end, but what will happen to the disease’s virulence between now and then? Do infectious diseases inevitably decline in virulence? The answer is no. Scientists have observed some infectious diseases whose virulence has held steady, and others whose virulence has increased. These observations are fine as far as they go, but they fail to describe the causal factors that govern how virulence evolves. You need to grasp what those causal factors are if you want to predict what will happen. I’m going to present a simple mathematical model that describes those causes. The simplifications make the logic clearer, but they also make the model unrealistic. The lack of realism is harmless, however, since the conclusions I’ll draw concerning the evolution of virulence remain in place when the needed complexities are taken into account.

R and Epidemics

R-zero (R0) is a fundamental quantity in the epidemiology of infectious diseases; it’s the average number of individuals that someone with the disease infects when the disease is new in the population. Although R0 characterizes only what is true when a disease is new, there is a more generally applicable quantity whose value can change as the disease spreads. It’s called R. The R value of a disease in a population at a given time is the average number of individuals that someone with the disease infects at that time. As a disease spreads, many individuals may die, many may become immune after they are infected, and many may increasingly avoid crowds. If so, the R value will decline, since there now are fewer potential hosts. R can start off large and become small.

There is a simple rule of thumb for epidemics: a disease with R > 1 will spread, and a disease with R < 1 is on its way out. To eradicate an infectious disease, you have to drive its R value below 1, and keep it there. Keeping R below 1 is key. Crossing from 1.1 to 0.9 doesn’t throw a magic switch from on to off; it merely signals that the number of new cases, at the moment, is shrinking.
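The rule of thumb can be made concrete with a minimal sketch. It assumes, purely for illustration, that R stays constant and that infections occur in discrete generations, so that each generation's new cases equal R times the previous generation's. The function name and the starting figure of 1,000 cases are hypothetical.

```python
def new_cases_by_generation(initial_cases, r, generations):
    """Return the number of new cases in each generation, assuming a constant R."""
    cases = [initial_cases]
    for _ in range(generations):
        # Each infected individual infects r others on average.
        cases.append(cases[-1] * r)
    return cases


growing = new_cases_by_generation(1000, 1.1, 10)    # R just above 1
shrinking = new_cases_by_generation(1000, 0.9, 10)  # R just below 1

print([round(c) for c in growing])
print([round(c) for c in shrinking])
```

With R = 0.9 the number of new cases falls, but only gradually: after ten generations there are still about 349 new cases. That is why R has to be kept below 1, not merely pushed there once.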

I so far have talked about an infectious disease as if a disease is a single entity. In fact, an infectious disease is a population of parasitic individuals. Those parasites are not carbon copies of each other; they differ from each other owing to mutation and genetic recombination. This variation is a necessary condition for natural selection to cause the infectious disease to evolve.

Virulence Defined

To think clearly about whether the virulence of a disease will increase or decline as it spreads through a population of hosts, we need to define our terms.  I define virulence as the average effect the disease has on the host’s lifespan after the host is infected. Individuals infected by a more virulent disease die sooner, on average, than individuals infected by a less virulent disease. This definition is pretty standard in epidemiology, in part because it is useful in calculating R values, but outsiders to the science sometimes think that a disease’s virulence is simply the number of people the disease kills. This other definition does not distinguish a disease that kills a total of 500,000 people, who die the day after they’re infected, from a disease that kills the same total number of people after they’ve had the disease for a decade. I mention this other definition so you’ll be clear on the definition I’m discussing. Note also that virulence as I define it doesn’t have to do with how many people get infected; the question is what happens to people once they are infected.

High- and Low-Virulence Strains and the Three Factors That Determine a Strain’s R Value

In order to construct a simple model that describes how a disease’s virulence evolves by natural selection, I’ll suppose that the disease has two strains, one more virulent than the other. I’ll call these the high-virulence and the low-virulence strains, though please keep in mind that virulence is a matter of degree. To keep my analysis simple, I’ll assume that a host infected by the disease either has the high-virulence strain or the low strain, but never both. This simplifying assumption is harmless, since the take-home lessons remain the same when the assumption is dropped, as I’ll explain.

Virulence is one of three factors that influence a strain’s R value, as shown in Table 1. If Dhigh is the average number of days a host lives after being infected by the high-virulence strain, and Dlow is the average number of days a host lives after being infected by the low-virulence strain, then, by definition, Dhigh < Dlow. From now on, I’ll restrict my attention to viruses, even though many infectious diseases aren’t caused by viruses.

Table 1. The R values of two viral strains, one more virulent than the other

Table 1 describes how you can calculate the R value a strain has at a given time. For example, suppose that, at a given time, the average host, once infected by the high-virulence strain, will live another 30 days. Suppose further that this average host, once infected, sneezes and coughs 1,000,000 high-virulence organisms into the air each day. That means that in 30 days the average host has sent 30,000,000 high-virulence organisms into the environment. Finally suppose that 1 out of every 5,000,000 of these airborne organisms infects a new host. This means that the average host, once infected by a high-virulence strain, will infect 6 hosts. The R value of this high-virulence strain at the time in question is 6. A similar calculation can be carried out for the low-virulence virus strain, once values for the three quantities represented in Table 1 are supplied. The low strain will have its host stay alive for more than 30 days (on average), but its host may expel fewer than 1,000,000 virus particles per day (or more), and low-virulence and high-virulence airborne viruses may differ in their probability of infecting a new host.
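The arithmetic in this example can be checked with a short sketch. It reads the calculation as a product of the three quantities (days alive, particles emitted per day, fraction of particles that infect a new host), which is how the worked numbers above combine. The function and parameter names are illustrative, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def r_value(days_alive, emitted_per_day, fraction_infecting):
    """R for one strain at a given time, computed as D * E * F."""
    total_emitted = days_alive * emitted_per_day   # D * E: particles sent out over the host's life
    return total_emitted * fraction_infecting      # (D * E) * F: new hosts infected


# Numbers from the worked example for the high-virulence strain.
r_high = r_value(30, 1_000_000, Fraction(1, 5_000_000))
print(r_high)  # 6
```

The same function can be applied to the low-virulence strain once its three values are supplied.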

The example just described takes you from the infected hosts at one time to the individuals those hosts infect a bit later. Those newly infected hosts begin a second step in the process, and their values for D, E, and F can be used to calculate what happens next. The outcome of this second step can be used to calculate what will happen in the third. In this way, the model can cover stretches of time of any length you choose to consider, since the long run is a sequence of short runs. However, there is a complication: the D, E, and F values may change as the process unfolds.

How the Virulence of a Disease Evolves by Natural Selection

When I introduced R, I said that the fate of a disease is determined by whether its R value is greater or less than 1. Here ‘fate’ means whether the disease spreads to more and more hosts or gradually disappears. I now am discussing the fates of two strains of a disease, but the present question is not whether the number of individuals infected by each strain will increase. Rather, the question is whether the frequency of infected hosts who have the high-virulence strain will increase (and so the frequency of infected hosts with the low-virulence strain will decline). To see how the mix of high and low strains will change, we need the following two-part criterion:

The high-virulence strain will increase in frequency (and the low strain will therefore decline) if Rhigh > Rlow.

The high-virulence strain will decline in frequency (and the low strain will therefore increase) if Rhigh < Rlow.

Notice that these two inequalities say nothing about whether the R values are greater or less than 1. A host population can change its mix of high and low strains while the number of new cases is growing, or shrinking, or staying the same size. The disappearance of a disease and a reduction in its virulence need not go hand-in-hand.
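A minimal sketch shows why the two-part criterion is independent of whether the disease is spreading. It assumes, for illustration only, that each strain's R value stays constant from step to step and that every infected host is replaced at the next step by the hosts it infects. The starting counts and R values are hypothetical.

```python
def high_strain_frequency(hosts_high, hosts_low, r_high, r_low, steps):
    """Track the share of infected hosts that carry the high-virulence strain."""
    frequencies = []
    for _ in range(steps + 1):
        frequencies.append(hosts_high / (hosts_high + hosts_low))
        # At each step, every infected host is replaced by the hosts it infects.
        hosts_high *= r_high
        hosts_low *= r_low
    return frequencies


# Both strains spreading (R > 1); the high strain has the larger R.
spreading = high_strain_frequency(100, 100, 3.0, 2.0, 5)
# Both strains dying out (R < 1); the high strain still has the larger R.
dying_out = high_strain_frequency(100, 100, 0.9, 0.6, 5)

print([round(f, 3) for f in spreading])
print([round(f, 3) for f in dying_out])
```

In both runs the high strain's share of infected hosts rises, even though the total number of cases grows in the first run and shrinks in the second. Only the ratio of the two R values matters for the mix.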

Although it’s true by definition that Dhigh < Dlow, this doesn’t answer the question of whether Rhigh > Rlow. The question is left open because the number of days a host lives after being infected is only one of the three factors that determine a strain’s R value. That’s the simple reason why there is no unconditional guarantee that a disease will evolve a greater virulence, or that it will do the opposite.

That said, the fact remains that the high-virulence strain automatically starts the process depicted in Table 1 with one strike against it. Strike one is the fact that Dhigh < Dlow. If the high-virulence strain is to have the bigger R value, then either Ehigh > Elow or Fhigh > Flow. If neither of these inequalities holds, there are three strikes against the high-virulence strain. In baseball, three strikes means you are out. Here, three strikes does not mean that the high-virulence strain instantly disappears; rather, it means that the strain declines in frequency—it is on its way out.

Virulence and Expulsion

I introduced D (the average number of days a host is alive after it is infected), E (the average number of particles expelled by the host per day), and F (the fraction of emitted disease particles coming from a host that infect new hosts) as separate quantities, but the fact of the matter is that the first two are often associated, and this is especially pertinent to virus infections. A more virulent virus often reproduces more rapidly inside its host’s body than a less virulent virus does, and a host infected by rapid reproducers will often send those particles into the air at a higher rate than a host infected by slow reproducers will do. In this case, Dhigh < Dlow and Ehigh > Elow. This pair of inequalities is illustrated in Figure 1. For simplicity, I’m assuming that the rates at which high and low viruses are emitted by their hosts don’t change in the host’s lifetime.

Figure 1. An example in which a host infected by a high-virulence virus has a shorter subsequent lifespan than a host infected by a low virus, but high-virulence viruses are emitted into the air at a greater rate than low viruses are. The area of a rectangle represents the total number of viruses emitted by a host during its lifespan.

I said that hosts infected with a high-virulence strain ‘often’ emit particles into the air at a higher rate than hosts infected by a low strain. Why isn’t this relationship always true? The reason is that there are causes of high virulence that don’t entail a higher emission rate. For example, suppose that the high-virulence strain is better able to disarm the host’s immune system. Successful sabotage doesn’t always increase the rate of sneezes and coughs.

Virulence and Host-Finding

In order to characterize the role played by the third factor that affects a strain’s R value—the fraction F of the viruses emitted by a host that infect new hosts—let’s focus on the case in which the high and low strains send the same total number (n) of particles into the air. This means that the two rectangles in Figure 1 have the same area. Under what circumstances will Fhigh be greater than Flow, and in what circumstances will the reverse be true?

Viruses launched into the air have a time-bomb ticking. If they don’t find a host fairly quickly, they die without reproducing. Hosts housing the high-virulence strain send their n particles into the air over a shorter time span than hosts housing the low-virulence strain. This fact about temporal spread also applies to space. Since hosts infected with a low-virulence strain live longer on average, their sneezes and coughs will often be dispersed over a wider territory than the emissions coming from a host with the high-virulence strain. The high-virulence strain puts all of its n ‘eggs’ into a narrow spatiotemporal basket; the low strain spreads its n investments across a wider chunk of space and time.

When potential hosts are few and far between, a spatiotemporally diffuse cloud of n low-virulence viruses will be more successful in finding new hosts than a more compact cloud of n high-virulence viruses. For example, recall the hypothetical example described earlier of a high-virulence strain that kills its host after 30 days. If new potential hosts appear in the host’s vicinity only every 40 days, the high-virulence virus cloud has a higher risk of extinction than a longer-lived cloud of low-virulence viruses that is spread over 50 days.

On the other hand, if there are lots of potential hosts around, the advantage goes to the high-virulence group of viruses. This may sound puzzling. After all, if there are lots of hosts, won’t both groups of airborne particles easily find hosts to colonize?  Indeed they will, but the high-virulence group does so more quickly, and that represents an evolutionary advantage. Virus clouds that infect 6 new hosts every 30 days will outcompete clouds that infect 6 new hosts every 50 days.
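The advantage of speed can be seen in a small calculation. The sketch assumes, as a simplification, that infections happen in non-overlapping cycles: each infected host infects 6 new hosts by the end of its cycle, and the process starts from a single host. The 150-day horizon is a hypothetical choice that both cycle lengths divide evenly.

```python
def newest_generation_size(day, hosts_per_cycle, cycle_days):
    """Size of the most recent generation of infected hosts after `day` days."""
    completed_cycles = day // cycle_days
    return hosts_per_cycle ** completed_cycles


fast = newest_generation_size(150, 6, 30)  # 5 completed cycles
slow = newest_generation_size(150, 6, 50)  # 3 completed cycles
print(fast, slow)  # 7776 216
```

Both kinds of cloud infect 6 hosts per cycle, yet after 150 days the faster one has multiplied 36 times further.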

This analysis is illustrated in Figure 2. Each viral strain has a higher F value the higher the spatiotemporal density is of potential hosts. That’s why both lines in the figure have positive slopes. Notice, however, that the two lines cross. A drop in the density of potential hosts does not automatically signal that the low strain now has the higher F value. What is true is that if the potential host density drops and keeps on dropping, eventually the low strain will have the higher F value.

Figure 2. If a host with the high-virulence virus and a host with the low virus emit the same number of particles into the air (so they have the same D × E value), their F values depend on the density of potential hosts.
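The crossing pattern can be sketched with two straight lines. Everything numerical here is a hypothetical assumption made for illustration: the intercepts, the slopes, and the arbitrary units of host density are not taken from the figure. The only features carried over from the text are that both lines slope upward and that the low strain has the higher F value at low density.

```python
from fractions import Fraction


def f_value(intercept, slope, host_density):
    """Hypothetical straight-line F value as a function of potential-host density."""
    return intercept + slope * host_density


# Hypothetical (intercept, slope) pairs: the low strain starts higher,
# the high strain rises more steeply.
LOW = (Fraction(2, 10**7), Fraction(1, 10**7))
HIGH = (Fraction(0), Fraction(3, 10**7))


def crossover_density(low, high):
    """Density at which the two lines give the same F value."""
    return (low[0] - high[0]) / (high[1] - low[1])


crossing = crossover_density(LOW, HIGH)
for density in (Fraction(1, 2), crossing, Fraction(2)):
    print(density, f_value(*LOW, density), f_value(*HIGH, density))
```

Below the crossing density the low strain has the higher F value, and above it the high strain does. A drop in density that stays above the crossing point leaves the high strain ahead.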

There’s more to infecting hosts than bumping into them. Once bumping occurs, a virus needs to stick to its host and then get inside the host’s cells. High and low strains can differ in both these ways; if they do, these post-bump considerations will affect which strain has the higher F value.

Removing the Main Idealization

The model I have described involves the assumption that an infected host houses either the low-virulence or the high-virulence strain, but never both. This assumption is unrealistic because viruses mutate, and a host can be infected multiple times. To remove the idealization, we need to consider what happens inside a host that contains a mixture of high- and low-virulence particles, and we also need to compare hosts that house different mixes of high- and low-virulence particles. I addressed part of this second topic when I compared a group of viruses that is 100% high with a group that is 100% low, but these are the extreme cases; we additionally need to compare groups of viruses that are internally heterogeneous.

If a host is infected by a mix of high- and low-virulence particles, the D value for that host is lower the more high-virulence particles the host initially contains. In addition, during that host’s lifetime, high-virulence particles may replicate faster than low, in which case the high-virulence strain that lives in the host will increase in frequency as the host ages.

As noted, when each host contains just one of the two virus strains, Dlow − Dhigh > 0 (by definition). This inequality remains true when hosts contain both strains, as long as the different hosts don’t all have the same mix of high and low. However, the size of the difference between Dlow and Dhigh declines the more mixing there is. This means that the first strike against the high-virulence strain remains in place, but its magnitude is smaller when there are mixed groups.
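One way to see how mixing shrinks the gap between Dlow and Dhigh is a toy calculation. It rests on assumptions that are mine, not the model's: a host's lifespan falls linearly with the share of high-virulence particles it carries, hosts hold equal numbers of particles, and a strain's D is the average host lifespan weighted by how much of that strain each host carries. The 30-day and 50-day figures are illustrative.

```python
def host_lifespan(share_high, d_pure_low=50.0, d_pure_high=30.0):
    """Hypothetical D for one host: linear in its share of high-virulence particles."""
    return d_pure_low - share_high * (d_pure_low - d_pure_high)


def strain_average_d(shares_high):
    """Return (D_low, D_high) across equally sized hosts.

    Each strain's D is the host lifespan averaged over hosts, weighted by
    the share of that strain in each host.
    """
    high_weight = sum(shares_high)
    low_weight = sum(1 - s for s in shares_high)
    d_high = sum(s * host_lifespan(s) for s in shares_high) / high_weight
    d_low = sum((1 - s) * host_lifespan(s) for s in shares_high) / low_weight
    return d_low, d_high


# Two hosts per scenario: unmixed, partly mixed, and identically mixed.
for shares in ([0.0, 1.0], [0.25, 0.75], [0.5, 0.5]):
    d_low, d_high = strain_average_d(shares)
    print(shares, d_low - d_high)
```

Under these assumptions the gap is largest when hosts are unmixed, smaller when they are partly mixed, and zero when every host has the same mix.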

Similar points apply to E and F.

Host Response

The model I’ve presented shows how parasite virulence can evolve, but it says nothing about how hosts may respond to parasite invasions. They too can evolve, but hosts typically evolve much more slowly than parasites. This is especially true when the host is human and the parasite is a virus. The human generation time is measured in years; by comparison, virus replication is lightning fast. In addition, virus populations are much bigger than human populations, so virus populations will have more mutations and recombination events. The resulting genetic variation provides the raw materials on which natural selection goes to work.

If genetic evolution is slower in humans than it is in viruses, it may seem that we humans are sunk. Not so, since we have a second line of defence. Hosts can change their behaviours quickly, even though their genes evolve slowly. An increase in social distancing doesn’t require a new gene to evolve that makes us stay away from each other. Ideas can spread fast. Ideas also can lead to the invention of vaccines and cures. Social distancing and vaccines reduce the number of infections, but it is a further question whether they will also reduce virulence.

Four Conceptual Comments

1. According to the model just described, a disease can change its virulence even if the different strains of the disease never change their levels of virulence. This is possible because the virulence of a disease is an average of the virulence of the different strains, and strains can change their frequencies.

2. Although the model concerns the evolution of a virus’s virulence, the model focuses on counting hosts, not counting viruses. This point was visible from the start, when R was defined. The R value of a viral strain is the average number of people that a person housing the strain infects, and a change in strain frequency occurs when one strain has a higher R value than the other. We’re not counting the number of low- and high-virulence virus particles and comparing them; we’re counting the number of new hosts that have each infection and comparing them. To visualize this point, consider a simple example. Suppose that one viral strain has an R value of 3 while a second has an R value of 2, but the hosts housing the former strain have a much smaller number of virus particles inside them than the hosts housing the latter. The first strain still increases in frequency relative to the second, because what the model compares is the number of newly infected hosts, not the number of virus particles those hosts contain.

This point has an analogue in the biological and philosophical literature on the units of selection problem. To conceptualize group selection, you need a notion of group fitness. The question is whether the fitness of a group of organisms is measured by the number of individual organisms the group produces, or by the number of new groups that the group founds (Okasha [2006]). Discussion of group selection has mostly focused on the evolution of altruism and selfishness, which are traits of individual organisms, so the first measure of group fitness is the one that usually gets used (Sober and Wilson [1998]). Even so, the second measure is interesting and may have important applications.

3. The model I’ve described provides a lesson concerning the following line of reasoning: ‘If a virus is maximally virulent, it will kill its host before the virus has a chance to infect new hosts. Therefore, an infectious disease must evolve in the direction of reduced virulence.’ This argument is fallacious because it overgeneralizes. It’s true that a maximally virulent virus will promptly disappear from the host population. However, it does not follow that a higher-virulence strain will automatically be replaced by a strain of lower virulence when the higher-virulence strain is less than maximally virulent. The mistake involved here might be called the fallacy of fixating on the most extreme case.

4. It’s one thing to describe what it takes for the virulence of an infectious disease to increase or decline; it’s something quite different to predict what will happen in a given epidemic. The reason for this gap is that the values of D, E, and F for the different strains of an infectious disease can be hard to estimate, and those values can vary in space and time, so that good estimates for here-today may be way off when it comes to there-tomorrow.
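The first of these comments can be illustrated with a weighted average. The sketch measures virulence by D, the average number of days a host lives after infection (so a lower average means a more virulent disease). It holds each strain's D fixed and changes only the share of infected hosts carrying the high strain. The 30-day and 50-day values and the three frequencies are hypothetical.

```python
def average_days_alive(freq_high, d_high, d_low):
    """Population-average D when a share `freq_high` of infected hosts has the high strain."""
    return freq_high * d_high + (1 - freq_high) * d_low


# Strain-level values never change; only the mix of strains does.
for freq_high in (0.2, 0.5, 0.8):
    print(freq_high, average_days_alive(freq_high, d_high=30, d_low=50))
```

As the high strain becomes more common, the average survival time of infected hosts falls, so the disease as a whole becomes more virulent although neither strain has changed.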

An Empirical Example

The evolution of the myxoma virus in Australian rabbits has been an enormously influential example in epidemiology. It has been a poster-child for the idea that virulence inevitably declines. Europeans brought rabbits to Australia in the nineteenth century, with disastrous ecological consequences. In the 1950s, the government tried a remedy; it introduced an extremely virulent strain of myxoma, which, as expected, more than decimated the rabbit population. The virus spread rapidly and inflicted prompt and grisly deaths in more than 99% of infected rabbits. It is no surprise that rabbits evolved a greater resistance to the virus. A second effect was more remarkable—myxoma evolved a lesser virulence. Frank Fenner and Francis Ratcliffe ([1965]) were key figures in this epidemiological experiment.

The case has recently been re-examined, using gene-sequencing techniques that were unavailable when the decline in virulence was first observed. The reported decline did occur, but virulence then increased in several myxoma lineages. The average virulence curve for the virus’s different strains plunged from its initial high value in the 1950s to something very low in the 1960s, and then gradually climbed back up to a middling value by the 1990s (Geoghegan and Holmes [2018]).

Concluding Comments

Next time you hear someone assert that the virulence of infectious diseases always declines, beware! And also beware of the opposite assertion, that infectious diseases always become more virulent. Both these unconditional pronouncements are wrong. Reduction in virulence is a possible evolutionary outcome, but so is increase. The devil is in the details. These details are summarized in Table 2.

Table 2. A scorecard comparing high and low strains for the three factors that together determine which strain’s R value is higher.

The strains of an infectious disease compete with each other, and there are three arenas (D, E, and F) in which that competition takes place. If one strain beats the other in all three arenas, it increases in frequency and the other declines. Alternatively, if one strain beats the other in a given arena, but the reverse is true in another arena, you need more quantitative information about what happens in the three arenas to determine which one wins overall. In either case, winning today does not guarantee winning tomorrow, when the three-fold competition recurs.
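The scorecard idea can be sketched as a comparison in each arena followed by a comparison of the products D × E × F. The function name, the dictionary layout, and the numbers for the two strains are hypothetical; only the structure of the comparison comes from the discussion above.

```python
from fractions import Fraction


def compare_strains(high, low):
    """Score the high strain against the low strain on D, E, and F, then compare R.

    `high` and `low` are dicts with keys "D", "E", and "F".
    Returns (scorecard, overall_winner).
    """
    scorecard = {}
    for factor in ("D", "E", "F"):
        if high[factor] > low[factor]:
            scorecard[factor] = "high"
        elif high[factor] < low[factor]:
            scorecard[factor] = "low"
        else:
            scorecard[factor] = "tie"
    r_high = high["D"] * high["E"] * high["F"]
    r_low = low["D"] * low["E"] * low["F"]
    if r_high > r_low:
        winner = "high"
    elif r_high < r_low:
        winner = "low"
    else:
        winner = "tie"
    return scorecard, winner


# Hypothetical strains: the high strain loses on D but wins on E.
high = {"D": 30, "E": 2_000_000, "F": Fraction(1, 5_000_000)}
low = {"D": 50, "E": 1_000_000, "F": Fraction(1, 5_000_000)}
scorecard, winner = compare_strains(high, low)
print(scorecard, winner)
```

Here the arenas are split, so the scorecard alone cannot settle the outcome. The products (12 for the high strain against 10 for the low) show that the high strain wins overall with these particular numbers.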

Infectious diseases don’t just spread or disappear. They also evolve. To think about how they evolve, the indispensable first step is to see that infectious diseases are populations that contain variation for many characteristics, virulence included.


I thank Ethan Bier, Craig Callender, Hayley Clatterbuck, Daniel Hausman, Stephen Hedricks, Don Moskowitz, Samir Okasha, William Roche, Larry Shapiro, Alan Sidelle, Athena Skaleris, Jacob Stegenga, Michael Stern, and Eric Winsberg for comments.

Elliott Sober
University of Wisconsin-Madison


Fenner, F. and Ratcliffe, F. [1965]: Myxomatosis, Cambridge: Cambridge University Press.

Geoghegan, J. and Holmes, E. [2018]: ‘The Phylogenomics of Evolving Virus Virulence’, Nature Reviews Genetics, 19, pp. 756–69.

Okasha, S. [2006]: Evolution and the Levels of Selection, Oxford: Oxford University Press.

Sober, E. and Wilson, D. [1998]: Unto Others: The Evolution and Psychology of Unselfish Behavior, Cambridge, MA: Harvard University Press.