# 11.3 Random Networks

## MY BRAIN IS OPEN

• There are multiple ways to define a random network.
• As edges are added randomly to a collection of nodes, groups of connected components become larger, resulting in a "connectivity avalanche."

Paul Erdös was a Hungarian mathematician famous for both his exceptional mind and his rather extensive list of collaborators. After receiving his doctorate in the 1930s, he proceeded to work diligently throughout much of the 20th century until his death in 1996. He was famous for his habit of showing up on colleagues' doorsteps, with a suitcase that contained all of his worldly possessions, and greeting his future collaborator by proclaiming, "My brain is open." This was his way of letting colleagues know that he was interested in collaborating with them on some difficult problem of the day. Erdös was sort of an itinerant mathematician, hopping from one collaboration to the next, connecting to many in the math world. Because of his ability to work with people and forge numerous connections, it seems fitting that some of his most influential work was in the study of networks and their graphs.

Erdös was one of the most prolific mathematicians in history, authoring or co-authoring more papers than anyone except Euler. One of those collaborations, with Alfréd Rényi, resulted in one of the key ideas in modern graph theory, the random graph.

As mathematicians work to model real-world networks, an issue that arises is that of determining a general taxonomy of networks. Perhaps there is a hierarchy of structure, and if so, where do real-life networks fit in the hierarchy? Is there some attribute that characterizes real networks? Some real-life networks, such as those that make up crystals—physical structures in which atoms are connected by chemical bonds—are extremely ordered. Such regular networks can be modeled by graphs known as lattices.

Other networks exhibit very little regularity, their connections seeming to be haphazard and unplanned. When we call such groupings "random networks," what exactly do we mean by that term?

Erdös and Rényi gave two different definitions of a random network. An action-oriented description of their first definition is: for a given number of elements, N, imagine the set of all the possible ways in which they could be connected and select one of these at random. To figure out how many graphs there are to choose from, we can use the C(n,k) function from our previous unit on combinatorics. Because an edge is a connection between two nodes, the number of possible edges between N nodes is equal to the number of ways to select two out of N things, C(N, 2).

To figure out how many possible graphs can be created involving C(N,2) or fewer edges, we can treat each of the possible edges as either present or not present. This is the exact same logic we applied in the unit on combinatorics when defining a bijection between the number of subsets of N elements and the number of binary strings (000101, 011110, etc.) of length N. In the case of the binary strings, we found that there are 2N strings. Because we have C(N,2) edges, the number of possible graphs is 2C(N,2). Each of these graphs has a chance of being randomly selected via this method.

The second method that Erdös and Rényi described for constructing a random graph is an incremental process. We consider each of the potential edges between N nodes in turn. For each edge, flip a coin. If the coin lands heads up, we make the connection; if it lands heads down, we leave the pair of nodes unconnected and move on to the next pair.

This second method of construction provides a good way to glimpse what happens as a random network is constructed. A useful question to consider is: When does the network become connected? Let's explore this process by imagining a bunch of buttons strewn about on the floor.

We can use strands of thread to connect buttons, and we can use the coin flip method of determining whether or not to connect a pair of buttons. Early on in this process, we will likely have a bunch of pairs of buttons, mostly disconnected from each other. Gradually, as the process continues, many of these connected pairs will become connected to each other, forming connected components. One can think of the connected component as all of the buttons that would be attached to a certain button if you were to pick it up. Usually, before we have attached too many threads, each button will be a part of a connected component, and there might be several connected components among the whole system of buttons and threads.

At this stage, the network of buttons as a whole cannot be said to be connected. Their grouping into multiple connected components represents an intermediate stage between utter isolation and complete connectedness. The size of the largest among the connected components depends on how many threads have been attached thus far. The nature of this correspondence is quite interesting.

When we first add a thread, the largest connected component consists of just two buttons. As a fraction of the total possible connections, this is close to zero. As we add a few more threads, any system of connected components that arises will most likely be a tree, and there will still be a fair amount of isolated buttons. This type of structure arises due to the high probability that, in the early stages of network evolution, each new connection is either with a previously isolated button or with a button that has, at most, one other connection. Eventually, as the number of connecting threads increases and the number of isolated buttons decreases, the odds shift so that we are more likely to connect two buttons that already have connections to others. When we reach this stage of growth, the addition of a new thread is likely to join connected components, thereby creating ever larger components, the largest of which is sometimes called the giant component. As we approach the situation in which the average button has at least one connection, the giant component grows quickly to incorporate nearly the whole system.

The rapid transformation from a few separate connected components to the giant component is sometimes called a "connectivity avalanche," and it is an example of a phase transition. Phase transitions occur all the time in nature, such as when water turns to ice, or when a material becomes magnetized—any time the condition of a system changes almost instantaneously.

## AROUND THE WORLD

• The average distance is one way to classify different types of graphs.

Recall from earlier in this unit that distance on a graph is a measure of the least number of edges needed to get from one node to another. Average distance is the mean of all the individual distances. In a random graph, we can assume that, given a certain number of average links per node, each node is just as likely to be directly connected (i.e., connected by only one edge) to one node as any other. Therefore, we should be able to come up with a relationship that represents the average distance between nodes in a random graph.

Suppose we have a graph with N nodes, each of which has k links, on average, to other nodes. This means that from any starting node, we can, again on average, get to k other nodes within one step. It also means that we could get to k(k−1) nodes within two steps.

Continuing this thinking, we could get to k(k−1)2 nodes within three steps, k(k−1)3 within four steps, and so on until we have k(k−1)(d−1) nodes at a distance of d steps. In a connected random graph, the maximum number of accessible nodes, k(k−1)(d−1), at a distance d must be equal to the total number of nodes, N. We therefore get:

N = k(k−1)(d−1)

Solving this for d, the average distance between nodes, we get:

This formula gives us the average distance between nodes on a random graph in which each node has k connections. We are able to do this only with random graphs because we require any two nodes to be equally likely to be directly connected. This makes for convenient mathematics, but how applicable is it?

Let's say that the six billion or so people of our world were randomly connected, with each person having 1,000 acquaintances. Using these figures, each of a person's acquaintances would have a chance of knowing any of the other acquaintances of that person. This might seem odd, because most people have friends who are friends with one another. This suggests a level of structure in human connections that is more than random. Obviously, our connections are not as regular as a lattice; no one is assigned a given number of acquaintances from birth. The random meeting on the street, or the friendships that develop out of any number of unforeseen difficulties, suggest that the networks that we experience as humans are not overly-structured and yet not completely random either; they fall somewhere in between. This type of network is significantly more difficult for mathematicians to explain, but meaningful progress has been made. What mathematicians have found, which we might intuitively guess were we to run into a classmate from kindergarten while on vacation in Antarctica, is that we live in a small world.