Information about the Erdös Number Project

We are pleased to announce a source of information for research mathematicians and others interested in the phenomenon of collaboration in mathematical research.

Our primary data are several fairly comprehensive lists of certain coauthor relationships. These lists can provide fun, and they also offer a vehicle for more serious studies of the dynamics of collaboration and a fairly large “real-life” graph for combinatorialists to study. The lists are available on this site as text files.

The files will be updated about once every five years to reflect corrections and additional information as it becomes available. The current version is dated August 7, 2020, and is intended to be fairly complete through mid-2020. Various analyses of these and related data are also provided.

We have also accumulated a wealth of material relating to mathematical collaboration (papers, references, links, etc.), as well as a lot of related information, especially material about Paul Erdös.

Most practicing mathematicians are familiar with the definition of one’s Erdös number. [The name is properly written with a long Hungarian umlaut over the “o”, but we represent it here by the ordinary two-dot umlaut widely available in HTML, since the true Hungarian umlaut isn’t visible in some browsers. Here is what it looks like if your browser supports it: Erdős.] Paul Erdös (1913–1996), the widely traveled and incredibly prolific Hungarian mathematician of the highest caliber, wrote hundreds of mathematical research papers in many different areas, many in collaboration with others. Erdös’s Erdös number is 0. Erdös’s coauthors have Erdös number 1. People other than Erdös who have written a joint paper with someone with Erdös number 1 but not with Erdös have Erdös number 2, and so on. If there is no chain of coauthorships connecting someone with Erdös, then that person’s Erdös number is said to be infinite. (The first published source about the idea of an Erdös number is a 1969 article by Casper Goffman in The American Mathematical Monthly, volume 76, page 791.)

In graph-theoretic terms, the mathematics research collaboration graph C has all mathematicians as its vertices; the vertex p is Paul Erdös. There is an edge between vertices u and v if u and v have published at least one mathematics article together. (There is no reason to restrict this to the field of mathematics, of course.) We will usually adopt the most liberal interpretation here, and allow any number of other coauthors to be involved; for example, a six-author paper is responsible for 15 edges in this graph, one for each pair of authors. Other approaches would include using only two-author papers (we do consider this as well), or dealing with hypergraphs or multigraphs or multihypergraphs. The Erdös number of v, then, is the distance (length, in edges, of the shortest path) in C from v to p. The set of all mathematicians with a finite Erdös number is called the Erdös component of C. It has been conjectured that the Erdös component contains almost all present-day publishing mathematicians (and has a fairly small diameter), but perhaps not some famous names from the past, such as Gauss. (We have some information about the conjecture on this site.) Clearly, any two people with a finite Erdös number can be connected by a string of coauthorships of length at most the sum of their Erdös numbers, obtained by joining their two shortest paths to Erdös.
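The definitions above can be tried out on a small example. The following is a minimal Python sketch, not the project's own software: the paper list and the author labels are invented for illustration, and build_graph and erdos_numbers are hypothetical helper names. Each paper contributes one edge per pair of its authors, and Erdös numbers are computed as breadth-first-search distances from the vertex p.

```python
from collections import deque
from itertools import combinations

def build_graph(papers):
    """Return an adjacency map; each paper adds one edge per pair of its authors."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for u, v in combinations(authors, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph

def erdos_numbers(graph, source):
    """Breadth-first search: distance in edges from `source` to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist  # vertices missing from `dist` have infinite Erdös number

# Hypothetical toy data: the labels are placeholders, not real records.
papers = [
    ["P", "A"],                           # A is a coauthor of P
    ["A", "B"],                           # B reaches P only through A
    ["B", "C1", "C2", "C3", "C4", "C5"],  # six authors, hence 15 edges
    ["X", "Y"],                           # no chain of coauthorships to P
]
graph = build_graph(papers)
numbers = erdos_numbers(graph, "P")
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
```

In this toy graph, numbers assigns 0 to P, 1 to A, 2 to B, and 3 to each of C1 through C5, while X and Y do not appear because their Erdös numbers are infinite.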

While there had been much informal discussion of the properties of the collaboration graph [see, for example, “On Properties of a Well-Known Graph, or, What is Your Ramsey Number?” by Tom Odda (alias for Ron Graham) in Topics in Graph Theory (New York, 1977), pp. 166–172], there had been no comprehensive set of data gathered prior to our work. As we compiled our lists, it became evident why this is so. For one thing, the database is quite large. For another, until fairly recently, most of the information has not been available electronically. Even more of an obstacle, however, is the serious problem of identity: determining whom a given character string (such as “J. Smith”) really represents.

Further information is contained in Grossman and Ion’s paper, “On a Portion of the Well-Known Collaboration Graph”, Congressus Numerantium 108 (1995) 129–131; Grossman’s paper, “Paul Erdös: The Master of Collaboration”, in The Mathematics of Paul Erdös (R. Graham and V. Nesetril, eds., Springer, 1997); and De Castro and Grossman’s paper, “Famous Trails to Paul Erdös”, The Mathematical Intelligencer 21, no. 3 (Summer 1999), 51–63. The Springer book is a two-volume collection that also includes an updated list (as of 1996) of Erdös’s publications (numbering over 1400). Further updates to this list are posted on this website; the total now stands at 1525.

We provide six lists:

    • Erdos0 is a list of the (currently 511) people with Erdös number 1, one name per line, single-spaced, last name first, in alphabetical order, ALL CAPS, followed by an asterisk if the person is known to be deceased. The name occupies the first 40 characters of each line (including trailing blanks if necessary). The rest of each line contains the year this person’s first joint paper with Paul Erdös was published. If they have published more than one joint paper, then the number of joint papers is also given.

    • Erdos0d is similar to Erdos0, except that the date comes first and the list is sorted by year of first joint publication (alphabetical within the same year).

    • Erdos0p is similar to Erdos0d, except that it is sorted by the number of joint papers and contains only those 202 people with more than one joint paper with Erdös. Secondary sort is by year of first paper, most recent first.

    • Erdos1 contains the same information as Erdos0, together with a list of each author’s collaborators following his or her name. These coauthors are listed one per line, single-spaced, each indented by a tab, last name first, in alphabetical order; those who have Erdös number 1 are in ALL CAPS, and those who have Erdös number 2 are in Normal Capitalization. A blank line follows each such sublist. Again, an asterisk following the name of an Erdös coauthor indicates “no longer alive”, but no attempt is made to use this convention on the people with Erdös number 2.

    • Erdos2 is a kind of inverse of Erdos1. It is an alphabetical list of the (currently 11,002) people with Erdös number 2, left-justified, each followed by a sublist of his or her coauthors with Erdös number 1 (each line indented by a tab). The capitalization convention explained above is maintained. Note that only those coauthors with Erdös number 1 are listed for these people.

    • ErdosA is simply a list of all 11,514 people with Erdös number less than or equal to 2, in alphabetical order, one per line, with the same capitalization convention (with Paul Erdös listed in spaced caps, as well).
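The layout described for Erdos0 and Erdos1 can be read with a short parser. The following Python sketch is only an illustration under stated assumptions: it assumes the input holds record lines only (any explanatory preamble removed), it takes the first integer after column 40 to be the year and a second integer, if present, to be the number of joint papers, and the sample names and the function names parse_header and parse_erdos1 are invented.

```python
import re

def parse_header(line):
    """Parse one Erdos0-style line: 40-character name field, then year and optional count."""
    name_field, rest = line[:40].rstrip(), line[40:]
    deceased = name_field.endswith("*")
    name = name_field.rstrip("*").rstrip()
    # Assumption: the year is the first integer after column 40 and the
    # number of joint papers, when present, is the second.
    nums = [int(n) for n in re.findall(r"\d+", rest)]
    return {
        "name": name,
        "deceased": deceased,
        "first_year": nums[0] if nums else None,
        # No count on the line means exactly one joint paper.
        "joint_papers": nums[1] if len(nums) > 1 else 1,
    }

def parse_erdos1(text):
    """Parse Erdos1-style text into records, each with its tab-indented coauthor sublist."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None                      # a blank line ends the sublist
        elif line.startswith("\t"):
            if current is not None:
                current["coauthors"].append(line.strip())
        else:
            current = parse_header(line)
            current["coauthors"] = []
            records.append(current)
    return records

# Hypothetical sample in the described layout; the names are invented and
# the "year: count" separator is an assumption.
sample = "\n".join([
    "DOE, JANE*".ljust(40) + "1950: 3",
    "\tROE, RICHARD",
    "\tSmith, Alex",
    "",
    "ROE, RICHARD".ljust(40) + "1962",
    "\tDOE, JANE",
    "",
])
records = parse_erdos1(sample)
```

Because coauthors with Erdös number 1 are written in ALL CAPS, a test such as name.isupper() on each entry of a coauthor sublist is one simple way to separate them from coauthors with Erdös number 2.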

One more note about notation: Numbers preceded by carets follow the convention used by Mathematical Reviews in MathSciNet to distinguish people with the same names.

Users unable to download these files from the Erdös Number Project website may e-mail Jerry Grossman and arrange an alternative means of obtaining them.

Here are the procedures, rules, conventions, and assumptions we used in creating these lists. In most cases, our source is MathSciNet, the database of the American Mathematical Society’s Mathematical Reviews (MR). Secondary sources include the Mathematics Genealogy Project, zbMATH, the Electronic Research Archive for Mathematics Jahrbuch Database, the Computer Science Bibliography (DBLP), and the Hypertext Bibliography Project. In some cases we have used obituary articles in mathematical journals or similar sources. Finally, we thank the countless mathematicians, especially the coauthors of Paul Erdös, for providing information for this site.

Our criterion for inclusion of an edge between vertices u and v is some research collaboration between them resulting in a published work. Any number of additional coauthors is permitted. Not normally included are joint editorships, introductions to books written by others, technical reports, problem sessions, problems posed or solved in problem sections of journals, seminars, very elementary textbooks, books on history, memorial or other tributes, biography, translations, bibliographies, or popular works. Pseudonyms (such as Mutt and G. W. Peck) are usually taken at face value, as if they were real people. When MR lists two people with the same name using superscripts, we follow this convention, using a caret, as in Liu, Zhen Hong^1. (Indeed, there are actually two people named Paul Erdös, the other being a physicist who has published mathematical papers. “Our” Paul is Paul Erdos^1 to MR. Also, one must not confuse Paul Erdös with Peter L. Erdös, who sometimes publishes under P. L. Erdös; he has Erdös number 2.) We have tried to include as full a name as possible in all cases. As for spelling, all accents are omitted, but apostrophes and hyphens are retained.
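Anyone matching their own records against these lists may want to apply the same spelling and caret conventions. The following Python sketch shows one possible way to do so; it is not the procedure used to build the lists, and strip_accents and split_caret are hypothetical helper names. It drops combining accent marks via Unicode decomposition, leaves apostrophes and hyphens alone, and separates a trailing caret number from a name.

```python
import re
import unicodedata

def strip_accents(name):
    """Drop combining marks so that accented letters reduce to their base letters."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def split_caret(name):
    """Split a trailing ^n disambiguator from a name; returns (name, n or None)."""
    match = re.fullmatch(r"(.*?)\^(\d+)", name)
    if match:
        return match.group(1), int(match.group(2))
    return name, None

# Both spellings of the name reduce to the same unaccented form.
plain_names = [strip_accents("Erdős"), strip_accents("Erdös")]
# The example of the caret convention given in the text.
base_name, disambiguator = split_caret("Liu, Zhen Hong^1")
```

One caveat of this approach: letters that Unicode does not decompose into a base letter plus an accent (for example the Scandinavian “ø”) pass through unchanged and would need a separate substitution table.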

There are bound to be mistakes in our data. We urgently request people who know of mistakes to report them to us so that the errors can be corrected in subsequent versions. Please tell us of incorrect or incomplete names (we want as full a name for each individual as possible), coauthorships we have missed, and entries that should be modified or deleted, including those caused by confusion over distinct people with the same or similar names or initials. Conversely, note that whenever we know that two names identify the same person, those names appear identically in these lists; if you have information that, say, Jones, Albert is the same person as Jones, A., then please bring it to our attention, since we do not know this and are assuming that they are separate people. When sending us information, please provide citations or other documentation. As in the past, we will forward information as necessary to Mathematical Reviews so that they can correct their database.

As a corollary to our work, we issue a plea to authors: please use as complete and consistent a name as possible when you publish a paper. Too many people have too many similar names and initials, and confusion reigns! Mathematical Reviews has an interesting explanation of how it identifies authors amid all this confusion. (A beneficial side effect of our project has been the correction of hundreds of author-identification errors in the Mathematical Reviews database.)

Finally, let us suggest a few uses for these lists. Many of them require that the lists be downloaded or scanned electronically with a word processor or editor.

One obvious thing to do is to compute your own Erdös number. If you are on the list, there is no problem. If not, then perhaps one of your coauthors is on the list, giving you an Erdös number of 3. Otherwise, you can look in electronic versions of MR or other databases and compile a list of the coauthors of your coauthors, and repeat the process until you find a name on the list. If you have been thorough, then you will have an exact value for your Erdös number. For example, Andrew Wiles has Erdös number at most 3, because he is a coauthor of Chris M. Skinner, who has written with ANDREW ODLYZKO, who has written with Erdös. (Warning: your number, if greater than 2, might decrease over time, especially if you or your colleagues write more papers!) If you want help computing your Erdös number, contact Jerry Grossman and provide your name and the names of your collaborators who might have written papers in mathematics or related areas (theoretical physics, statistics, etc.). For an even faster way to compute an approximation to your Erdös number, see the suggestion on our Computing Your Erdös Number page using MathSciNet’s automatic collaboration distance calculator.
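The manual procedure just described amounts to a level-by-level search outward from yourself until a listed name is reached. Here is a minimal Python sketch of that idea, assuming a dictionary known of listed names with their Erdös numbers and a lookup function coauthors_of that you would have to fill in yourself from MR or another database; both names, the function erdos_number_via_list, and the name spellings used in the example are hypothetical.

```python
def erdos_number_via_list(person, known, coauthors_of, max_depth=10):
    """Search outward from `person` level by level until a listed name is reached.

    known        -- dict mapping listed names to their Erdös numbers (0, 1 or 2)
    coauthors_of -- function returning the set of coauthors of a name
    Returns the smallest (depth + listed number) found, or None if nothing is reached.
    """
    if person in known:
        return known[person]
    seen = {person}
    frontier = {person}
    for depth in range(1, max_depth + 1):
        next_frontier = set()
        for name in frontier:
            next_frontier |= set(coauthors_of(name)) - seen
        hits = [known[n] for n in next_frontier if n in known]
        if hits:
            return depth + min(hits)
        if not next_frontier:
            return None          # ran out of coauthors without reaching the list
        seen |= next_frontier
        frontier = next_frontier
    return None

# Only the coauthorships stated in the text are entered here, so the
# result is an upper bound rather than the outcome of a complete search.
known = {"ODLYZKO, ANDREW": 1, "Skinner, Chris M.": 2}
coauthor_table = {"Wiles, Andrew": {"Skinner, Chris M."}}
bound = erdos_number_via_list(
    "Wiles, Andrew", known, lambda name: coauthor_table.get(name, set())
)
```

With this input, bound is 3, matching the example in the text. The value is exact only if both the lists and the coauthor lookups are complete; otherwise it should be read as an upper bound.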

A more casual thing to do is simply to read through Erdos1, noting the wide range of collaboration that exists (we were surprised by its extent). For example, Paul Erdös is not the only person presented here with more than 100 coauthors. Paul Erdös has made contributions in many different areas of mathematics; and by the time you go one or two more levels down the tree, essentially all areas of mathematics are represented (as well as computer science, physics, and other natural and social sciences).

Finally, we offer our data as a fairly large graph on which to test algorithms, in the spirit of Donald Knuth’s The Stanford GraphBase (Addison-Wesley, 1993). (For this purpose it is probably best for now to restrict oneself just to the people with Erdös number 1, because our data do not show coauthorships between people with Erdös number 2.) Perhaps there is not as much intrigue in the relationships shown here as in, say, Professor Knuth’s graph of encounters between characters in Tolstoy’s Anna Karenina (or maybe there is . . .), but connectivity, covering, clique, or other analyses may yield some interesting insights. We would be interested in hearing of any results you obtain.
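As one possible starting point for such experiments, the following Python sketch restricts Erdos1-style data to the people with Erdös number 1 and computes the connected components of the resulting graph. It is an illustration only: the input dictionary is invented, the function names number_one_subgraph and connected_components are hypothetical, and Paul Erdös himself is left out because including him would trivially connect everyone.

```python
def number_one_subgraph(coauthors):
    """Keep only edges between Erdös-number-1 people (names written in ALL CAPS)."""
    graph = {name: set() for name in coauthors}
    for name, others in coauthors.items():
        for other in others:
            if other.isupper() and other in graph:
                graph[name].add(other)
                graph[other].add(name)
    return graph

def connected_components(graph):
    """Return the connected components as a list of sets, found by depth-first search."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(graph[u] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical Erdos1-style data: keys have Erdös number 1; names are invented.
coauthors = {
    "DOE, JANE": ["ROE, RICHARD", "Smith, Alex"],
    "ROE, RICHARD": ["DOE, JANE"],
    "POE, EDGAR": ["Smith, Alex"],
}
graph = number_one_subgraph(coauthors)
components = connected_components(graph)
```

In this toy input, DOE and ROE form one component and POE is isolated, since the shared coauthor in normal capitalization has Erdös number 2 and is therefore excluded from the restricted graph.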

initial version: May 25, 1995

latest revision: August 7, 2020

This page was last updated on September 15, 2020.