Past Meetings

Talks presented at RSS West Midlands Local Group meetings have been summarised by volunteers and archived on this page.

Stats in the Pub: The 'Hot Hand' in Darts

Marius Oetting – April 23, 2020

The West Midlands Local Group didn’t let the Covid-19 lockdown stop us having a talk. Thanks to the initiative of our group secretary and the tech-savvy skills of his predecessor, we were able to host the talk live via our YouTube channel, with attendees joining from as far apart as Los Angeles and Wolverhampton. Our original plan was to hold this meeting in a pub (the subject matter was relevant!), but the lockdown made this impossible, so we created a virtual pub on YouTube called the “Box and Whiskers”.

Marius Oetting of Bielefeld University in Germany presented a very informative talk on determining whether such a thing as a “hot hand” exists in darts; that is, a run of successful dart throws in a row (a winning streak). A successful throw is defined as a triple or the inner or outer bullseye, for roughly the first part of the game, where the primary objective is to score as many points as possible with each dart.

A successful throw is designated with a 1 and an unsuccessful throw with a 0. Hence the set-up falls naturally into a binary time series with Markov chain modelling (transitioning from a successful state to an unsuccessful state, and vice versa). A hot hand can be measured by serial correlation in the sequence of 1s and 0s. The talk was based on Marius’ recent paper in Series A of the Society’s journal. Some authors dismiss the hot-hand theory as a cognitive illusion in most sports. Darts was chosen for the study because of the relative lack of physical interaction between players. The data for the study were obtained from http://live.dartsdata.com, which records the throwing sequences of all major darts tournaments.
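For readers who want to experiment, the short Python sketch below illustrates the basic idea only; it is not the model from the paper, which uses a more elaborate Markov formulation, and the throw sequence shown is hypothetical.

```python
import numpy as np

def hot_hand_summary(throws):
    """Summarise a 0/1 throw sequence: first-order Markov transition
    probabilities and the lag-1 serial correlation.
    Illustrative only -- not the model from the paper, which uses a more
    elaborate Markov formulation.
    """
    x = np.asarray(throws)
    pairs = list(zip(x[:-1], x[1:]))
    p11 = np.mean([b for a, b in pairs if a == 1])  # P(next success | success)
    p01 = np.mean([b for a, b in pairs if a == 0])  # P(next success | miss)
    lag1_corr = np.corrcoef(x[:-1], x[1:])[0, 1]    # serial correlation
    return p11, p01, lag1_corr

# Hypothetical throw sequence: a hot hand would show P(1|1) > P(1|0)
# and a positive lag-1 correlation.
p11, p01, rho = hot_hand_summary([1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0])
print(f"P(1|1)={p11:.2f}  P(1|0)={p01:.2f}  lag-1 corr={rho:.2f}")
```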

In summary, Marius uncovered some evidence of hot-hand sequences in darts. One improvement to the model now under consideration is the idea of “marker darts”, where the first throw of three has the aim of locating the following two throws.

A question and answer session via the chat window on YouTube resulted in some thoughts on improving the model.

Finally, the YouTube recording can be viewed below, for those who wish to check the accuracy of this written report.

Report by Tim Davis | Chair, West Midlands Local Group

Data Science at the National Physical Laboratory

Kavya Jagan - National Physical Laboratory

The West Midlands Local Group are this year celebrating our 75th anniversary. Our first meeting in 1945 enjoyed a talk from the National Physical Laboratory, and so it was with particular pleasure that we welcomed NPL back in our 75th year to help us celebrate, in the form of their senior data scientist Kavya Jagan. NPL are the custodians of how we measure the seven fundamental units of the natural world: they ensure that different facilities around the country get the same answer when measuring the same thing, in part to maintain a regulatory system to control trade based on defined quantities, and they develop new measurement standards and techniques. These seven fundamental quantities (with units) are mass (kilogram, kg), length (metre, m), time (second, s), temperature (kelvin, K), amount of substance (mole, mol), luminous intensity (candela, cd) and electric current (ampere, A). Every derived quantity is a power function of these fundamental units (for example, the volt has units kg m² s⁻³ A⁻¹). Previous employees of NPL include Alan Turing, who was instrumental in developing the world’s first Automatic Computing Engine there from 1946, and radar was co-developed at NPL in 1935.

There is much more to NPL than just measuring stuff, and they work with a number of outside organizations (including industry), in areas such as advanced manufacturing, energy & environment, and life sciences & health, where the work is focussed on measurement traceability and uncertainty analysis. Kavya illustrated this extracurricular activity via a number of case studies including:

  • Decision making on the degree of severity of myocardial blood flow abnormalities, utilizing techniques in image analysis and machine learning, for better diagnosing heart disease
  • Clustering satellite data of ground movement, where the aim is to preserve important features of the data in a lower-dimensional space, with a view to monitoring air quality
  • Noise monitoring with data from acoustic sensor networks, using Gaussian process modelling of acoustic noise in urban environments (a minimal sketch of this kind of model follows this list).
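NPL's actual noise-monitoring models were not presented in detail; as an illustration of the kind of Gaussian process regression mentioned in the last case study, here is a minimal sketch using entirely synthetic sound-level data (all values are placeholders).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Illustrative sketch only: smooth a noisy, synthetic sound-level series
# with a Gaussian process of the kind mentioned in the bullet above.
rng = np.random.default_rng(0)
t = np.linspace(0, 24, 60)[:, None]                            # hour of day
noise_db = 55 + 8 * np.sin(2 * np.pi * t[:, 0] / 24) + rng.normal(0, 2, 60)

kernel = RBF(length_scale=3.0) + WhiteKernel(noise_level=4.0)  # smooth trend + sensor noise
gp = GaussianProcessRegressor(kernel=kernel).fit(t, noise_db)

t_new = np.linspace(0, 24, 200)[:, None]
mean, sd = gp.predict(t_new, return_std=True)                  # smoothed level with uncertainty
```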

Wednesday 19th February 2020, University of Warwick.

Event summary by Tim Davis (timdavis consulting ltd.)

Statistics Education in the Social Sciences

Presentation 1:

Number is not enough; the analytic problem in UK social science

Professor Malcolm Williams - Cardiff University (Q-Step)

Two cultures have developed in British social science, particularly sociology: the analytic and the critique, with research showing that the critique culture is the larger of the two. A 2013 survey showed that 77% of sociology students would rather write an essay than analyse data. This does not appear to be due to a fear of number, but predominantly to the cultural traditions in which they are taught (traditions which have generated an epistemological relativism and can include a degree of animosity from the critique culture towards the quantitative culture), with many students not being exposed to numerical and analytical approaches.

A response to this is Q-Step which is a programme aimed at developing analytical approaches in social science education in a number of British universities. It is too early to tell what the impact of the programme has been, but there is a distinct need to engender a pluralist approach to research within sociology.

Questions and answers:

The main lines of questioning revolved around what Q-Step is trying to achieve and what is meant by a pluralist approach. It was answered that the aim is to achieve a plurality of methods within British social science (to make social scientists more like scientists) and that by plurality is meant that the most appropriate research method is used for the research question at hand.

Presentation 2:

Approaches to research in social sciences

Dr. Philippe Blanchard - University of Warwick (Q-Step)

Dr. Blanchard agreed with many of the points raised in the first presentation. He noted that many of the critique research approaches referred to originally arose from good solid work, but have been used incorrectly to promote epistemological relativism. Research methods are subject, through teaching, to long-term processes which create an inertia leading to a reluctance to use analytical methods (with the critique tradition not necessarily having a good understanding of quantitative methods). Whilst there has been some small rise in the use of analytical methods, including mixed methods, the divide is still very much there, with the increased use of mixed methods not necessarily equating to plurality. This continuing divide is more evident in sociology than in politics / political science. Overall, a similar picture is present in the French social sciences, although French sociology is often more ‘radical’ and there is less of a drive to increase the use of analytical approaches. Suggestions for engendering change (in British social sciences) included changing the marking scheme to include marks for analytical work.

Questions and Answers:

Questions included “are these religious wars not seen in other disciplines?” and whether some of the suggested approaches (e.g. changes to marking schemes) would not impose a particular viewpoint upon others. It was answered that the approaches outlined would hopefully speed up the process of change and lead to greater plurality in the British social sciences.

The speakers were thanked for their presentations, including for providing a very honest auto-critique of their discipline.

Wednesday 5th February 2020, University of Warwick.

Event summary by Davin Parrott, West Midlands Police.

The Statistics and Metaphysics of Causation

Professor David Papineau - King's College London & City University New York

Professor Papineau began by discussing the history of the study of causation in statistics, highlighting the recent prominence of Judea Pearl, especially since the publication of The Book of Why, but also the prior work of HA Simon, HM Blalock, Ronald Fisher, Sewall Wright, and Tinbergen, Frisch and Haavelmo. His claim was that this work in recovering causal structure from correlational structure (where he took correlation in the broader sense of dependence) pointed towards a metaphysical moral: that causal structure just is correlation structure.

He sought to explain the ‘reductive basis’ for which he was arguing, explaining that he considered there to be two levels of causal inference: first, the inference of ‘lawlike’ population-level correlations from sample data, and second, the inference of causal structure from these lawlike correlations. He made clear that it was the second of these in which he was interested.

In this context he claimed that what statistics showed was that the philosophical claim “no causes in, no causes out” was simply incorrect. He used some simple examples of directed acyclic graphs to explain how causation could be inferred from correlation. A murmur was heard from the statisticians in the audience when he went on to claim that “B can only be spuriously correlated with C if it is itself an effect of one of C’s true causes.”
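As a purely illustrative aside (not an example from the talk), the short simulation below shows the point behind such directed acyclic graph diagrams: if B and C share a common cause A, they are correlated even though neither causes the other, and adjusting for A removes the association.

```python
import numpy as np

# Illustrative simulation (not from the talk): A -> B and A -> C.
# B and C are then "spuriously" correlated even though neither causes the
# other; removing the common cause A's contribution kills the association.
rng = np.random.default_rng(1)
n = 100_000
A = rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)
C = 0.8 * A + rng.normal(size=n)

print("corr(B, C):", np.corrcoef(B, C)[0, 1])          # clearly nonzero
resid_B = B - 0.8 * A                                   # adjust for A
resid_C = C - 0.8 * A
print("corr after adjusting for A:", np.corrcoef(resid_B, resid_C)[0, 1])  # close to 0
```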

He then returned to explaining that philosophers predominantly think of causes and effects independently of these considerations of correlation. But he claimed that these considerations make for cause and effect, and that this was a reduction of causation. Returning to Pearl, he noted that Pearl had described the statistical signatures of causation as “a gift from the gods”, but he expressed that with his reductive view of the metaphysics of causation no such divine recourse was necessary. His challenge to philosophers was, as he put it, “to explain why the statisticians’ arrows coincide with the causal arrows; why are event types probabilistically independent if and only if they are causally independent?”, with his own conclusion being that an answer to the question must hinge somehow on the nature of causation.

The engagement of the audience was expressed in the numerous questions that Professor Papineau fielded. These included queries on topics including the role of agents, on the role of randomness and uncertainty, and on the implication that observation alone (as opposed to designed experiments) might be the best means of inferring causation.

Thursday 5th December 2019, University of Warwick

Event summary by Ian Hamilton (Warwick)

Monitoring the World - the sustainable development goals and what they mean to statisticians

Phil Crook - International Development Section, RSS

Phil Crook began by explaining how he came to be working as a statistical consultant within the field of International Development via the Navy, the Seychelles and DFID. In explaining the SDGs themselves, he broke down his analysis into a period of “unmeasured aspirations” from 1946–1990; the Millennium Development Goals; the launch of the SDGs; and the current opportunities presented by the SDGs.

The Universal Declaration of Human Rights in 1948 was identified as a key point in defining the aspirations of International Development work. He explained how over the following decades the UN produced several other declarations identifying particular areas (racial discrimination, gender discrimination, rights of the child, education, the environment etc.) but none of these included measurable targets. Meanwhile, in the 80s and 90s, there had been efforts towards statistical capacity building globally, particularly led by the former colonial powers - UK, France, and Portugal - and some of the northern European countries – Germany, Sweden, Norway – along with the World Bank and IMF, with the latter's practice of making loans conditional on good data proving an important incentive.

With the Millennium Development Goals in 2000, measurable outcomes were finally included alongside the aspirations, with the eight Goals being broken down into Targets and Indicators. The MDGs were widely seen as successful both in motivating and focusing attention, and ultimately in contributing to positive changes. However, they were subject to some criticism, with, for example, the Indicators often being those that could be readily measured rather than those that were the most relevant, and the Goals and Targets being imposed in a top-down manner.

With this in mind the Sustainable Development Goals were launched in 2015 with heavy involvement from statisticians, especially those from the UK. These addressed many of the concerns but at the cost of having an increase to 17 Goals, 169 Targets and 244 Indicators. Phil gave some examples of these, including some where the committee decision-making progress was evident. He also discussed the relevance to the UK, which will also be required to make efforts to meet the SDGs.

He finished by discussing the opportunities that the SDGs represented for statisticians, and answering questions on subjects including on the extractive dimension of statistics in Africa, and the relative roles of statisticians and economists in the development of the SDGs.

Thursday 14th November 2019, University of Warwick

Event summary by Ian Hamilton (Warwick)

Judging a book by its cover - How much of REF research quality is really journal reputation?

Professor David Firth, David Selby - University of Warwick

Professor Firth began the talk by explaining what the REF is and the way it is assessed. He explained that there are three areas of assessment: quality of output, impacts outside academia, and research environment, of which it was the quality of output that would be investigated in this talk. These were assessed across 36 ‘Units of Assessment’ in 2014, with the talk focusing on four: Mathematical Sciences, Economics & Econometrics, Physics, and Biological Sciences, selected for this study because their main unit of publication is papers (rather than books). It was explained that the quality of output measure was made by attributing each submitted paper to one of the 4*, 3*, 2*, 1* or unclassified buckets, with the aggregate proportions in each bucket being published, along with the journals of the papers submitted, but that the categorisation at the paper level was not published.

At this point David Selby picked up the baton. He chose to focus on Economics and Econometrics, as this was a field that was relatively self-contained, had a widely recognised ‘top 5’ of highly prestigious journals, and had been the subject of previous study. He explained that the goals of the study were twofold: to measure the extent to which REF2014 output profiles are associated with journals, and to make an inference about each journal’s REF star rating. To do this, ecological inference techniques are required. A simple version of these was explained using an election example; the history of such techniques goes back to at least the 1950s. The model used in this case was a Poisson-Binomial, with the parameters referring to the probability of a paper from a particular journal being rated 4*. An appropriate prior was chosen for these probabilities and the model fitted using Stan.
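The speakers' Stan code was not shown; the hedged Python sketch below merely illustrates the likelihood structure described above, with hypothetical journal probabilities and submission counts. The count of 4* papers from an institution is a Poisson-binomial sum of Bernoulli trials with journal-specific probabilities, and it is this quantity that links the published aggregate profiles to the unknown journal-level parameters.

```python
import numpy as np

# Hedged sketch of the data structure and likelihood described above, not the
# authors' actual Stan model. Journal names and all numbers are hypothetical.
p_journal = {"Journal A": 0.65, "Journal B": 0.35, "Journal C": 0.15}  # P(4* | journal)

# One institution's (hypothetical) submissions: papers per journal.
submissions = {"Journal A": 4, "Journal B": 10, "Journal C": 6}

# The number of 4* papers is Poisson-binomial: a sum of independent Bernoulli
# trials with journal-specific probabilities. Build its pmf by convolution.
pmf = np.array([1.0])
for journal, n in submissions.items():
    p = p_journal[journal]
    for _ in range(n):
        pmf = np.convolve(pmf, [1 - p, p])

# The published REF data give only the aggregate 4* count/proportion for the
# institution; this pmf is the likelihood linking it to the journal-level
# probabilities, which are the unknown parameters in the Bayesian model.
observed_4star = 7
print("P(exactly 7 four-star papers) =", pmf[observed_4star])
```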

These probability estimates showed wide estimate intervals but with means in reasonable agreement to the anecdotal reputations. In order to assess the fit of this model, it was then used to predict the REF2014 output and funding profiles for each institution based just on the journals of their submissions. These showed a good fit. In Mathematical Sciences the output profile was less close but the funding profile prediction was very accurate. The results across the four Units of Assessments were compared with Biological Sciences showing notably more uncertainty.

A discussion was provoked, with the presenters noting that there was good reason to believe that the quality of output measure could not be reproduced based solely on the journals, with reputations of journals changing over time and the large uncertainty in the estimation being particular challenges. A number of audience participants were interested in whether the preferences and backgrounds of the assessor panel could be discerned through related methods, perhaps with more data than this study considered.

Wednesday 30th October 2019, University of Warwick

Event summary by Ian Hamilton (Warwick)

Rugby vs Statistics

Stuart Farmer - Rugby statistician, SFMS Ltd; Marc Turner - Senior Performance Analyst, Harlequins RFC; Professor Phil Scarf - University of Salford

Stuart Farmer began the event talking about his experiences of the development of the use of data within rugby union. He explained how his own involvement had begun when he had persuaded a manager of the company where he was working at the time that rugby data might be a good way to test their new machine - an early, and very large, computer. Over time he built up a database of results information, originally consisting of just the most basic information: match dates and scores. He explained that during the mid-80s, newspaper coverage started to standardise so that further details such as team line-ups, scorers, times of score, referees, attendance etc. became more reliably available. He contrasted this with the vast amount of data now available at the top end of rugby, with second-by-second data available on in-game events (tackles, breaks, passes etc.), and highlighted some of the ways all these different forms of data are used by governing bodies, media, insurers, performance analysts, and gamblers. He finished by highlighting some statistics from the World Cup: the number of tries in the group-stage matches has remained reasonably constant over the last few World Cups, but the number of penalties has reduced. He also questioned Ireland’s ability to beat New Zealand, noting that they had never come back from more than three points down in a Rugby World Cup match.

Next Marc Turner spoke on his work as a performance analyst at a professional rugby club. He began by describing the sources and nature of the data, explaining that companies, principally Opta, collect the data and then sell on to clubs and others. He showed some of the high levels of detail that are available in these data and explained how they are used to influence decisions, for example highlighting strengths or weaknesses of an opposition and what that can mean for strategy and team selection for any given match. This opposition analysis was broken down further into a consideration of Who (did) What, Where (on the pitch), When (during the match), and Why. He also presented some headline findings from a principal component analysis approach to working out which of the high number of factors on which they had data were most influential, with some of the findings obvious and some less so. Finally he spoke about match day, and the high pressure environment and quick turnarounds involved. Throughout he emphasised the importance of thinking about how the data and conclusions were communicated, given the highly varying levels of engagement of the players and coaches.

Finally Phil Scarf spoke about his work on the “Uncertainty of Outcome Hypothesis”. Drawing on his Welsh background, he engaged the audience with a clip of the great Gareth Edwards Barbarians try and went on to explain that in general there are more upsets in soccer than in rugby, and that a primary cause of this is the lower frequency of scoring in soccer. He widened his scope to also include an analysis of netball and Australian rules football. Netball proves to be particularly interesting as most possessions result in scores and restarts alternate independent of who last scored, so it represents a sport where there can be high competitive balance despite high scores. For the statisticians in the audience he explained how he modelled matches in these sports using Poisson models of scoring, and used this approach to calculate a measure of the separation of team abilities. He discussed how competitive balance may be encouraged extrinsically with measures such as salary caps and drafts, but concluded that it was desirable to promote competitive balance intrinsically in rugby union, and to do that, in contrast to administrators who tend to wish to increase scoring, he proposed that rugby should revert to a system where points were awarded only for kicks, and tries became an opportunity for a kick (in line with the original meaning of the term ‘try’).
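A small simulation (not Phil Scarf's actual model; the rates below are invented) makes the scoring-frequency point concrete: with Poisson-distributed scores, the probability that the weaker team wins falls as the typical score rises.

```python
import numpy as np

# Illustrative sketch, not the speaker's model: two teams with Poisson scores,
# the stronger team expected to score 25% more. The upset probability shrinks
# as scoring rates grow -- the "more upsets in low-scoring sports" point.
rng = np.random.default_rng(2)
n_matches = 200_000

for base_rate in [1.5, 4.0, 25.0]:     # soccer-like, rugby-like, netball-like scoring
    strong = rng.poisson(1.25 * base_rate, n_matches)
    weak = rng.poisson(base_rate, n_matches)
    upset = np.mean(weak > strong)     # weaker team outscores the stronger one
    print(f"mean score {base_rate:5.1f}: P(upset) = {upset:.3f}")
```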

A number of questions were asked. A particularly interesting one touched on how rugby compared to other professional sports in its analysis of data. Marc explained that one big constraint in rugby union was the availability of spatio-temporal data. In soccer, in top leagues and internationals, the position of all players at all times is recorded by cameras, and the data for all players (their own and opposition) is available to analysts at all clubs (if they pay). It is expensive to collect this data in any case and the technology to be able to interpret individual player position in the melee of a tackle area on a muddy pitch is not well developed. Thus it is not available to rugby analysts. It was also noted that, even if it were, given the more lateral nature of the sport due to the rules of the game, the spatial element may not be as prominent as in sports such as soccer.

The event was well attended, with a notably diverse audience, representing fifteen different affiliations, including local schools and rugby clubs.

Wednesday 16th October 2019, University of Warwick

Event summary by Ian Hamilton (Warwick)

Anthropomorphic Learning

Professor Ganna Pogrebna - University of Birmingham, Alan Turing Institute

Understanding and modelling human behaviour is one of the major tasks facing industry and academia of the future. This task is especially important when we consider interactions between humans and technology. Decision support systems, suggestion systems, automation, etc. – all these technologically intense aspects of human life require accurate predictions of what people like, what people prefer, and where people need help of automated agents. Under these circumstances, recent advances in computer science, statistics, and mathematics offer several methods which try to model human behaviour. Specifically, the methodology of machine learning and, more recently, deep learning allows us to generate predictions useful for many different facets of human life. Yet, there are many aspects of human life and decision making where machine learning and deep learning fail to provide reliable and accurate results. One of the most notorious examples is suggestion systems: many of us regularly shop online using different platforms (such as Amazon) and receive suggestions for future purchases. Yet, very few of us find these suggestions helpful. One of the reasons why AI fails in many cases to correctly anticipate human behaviour is that AI algorithms tend to ignore existing insights from decision theory and behavioural science. By combining behavioural science models with AI algorithms, we are able to significantly improve and simplify predictions of human behaviour in a wide variety of contexts. The resulting methodology which we label anthropomorphic learning allows us to develop more functional systems which better understand humans. This methodology is explainable, traceable, requires smaller training sets and, generally, outperforms existing algorithms by generating more accurate predictions.

Thursday 20th June 2019, University of Warwick

Abstract supplied by Professor Pogrebna

How are we doing? - Measuring personal and economic wellbeing

Sunny Valentineo Sidhu & Ed Pyle - Office for National Statistics

The talk began with Sunny Sidhu explaining the genesis of the interest in measures of wellbeing. He described what GDP is, what its deficiencies are and how, through the work of the Commission on the Measurement of Economic Performance and Social Progress and others, there has been a recognition of the need for alternative measures, highlighting in particular some studies on trends in mortality in the USA. He explained that in the UK the process of addressing this challenge began in 2010 under the coalition government, with the Measuring National Wellbeing programme being established with the aim of monitoring and reporting on “how the UK is doing”. For this purpose both objective (e.g. unemployment) and subjective (e.g. happiness) measures are utilised, and he introduced the dashboard developed by the ONS for monitoring the various aspects, with both personal and economic wellbeing measures.

At this point Ed Pyle took over to give an insight into what the future of measuring wellbeing might look like, some of the methodology used, and what the results showed. He began by introducing ‘the capitals’ – human capital, social capital, and natural capital – all of which have value to society but may not be captured in classic economic measures. He gave a quick coverage of equality, showing a chart of various measures of inequality for the UK, which rose steeply from the late 1970s until 1990 but have followed a slightly declining though volatile pattern since, as well as the current decile levels of income and wealth in the UK. He then moved on to describe some of the ONS work on personal wellbeing, showing an example of the question and answer format used as well as the characteristics considered. From there the audience were asked to guess what seemed to be the major determinants of life satisfaction in the UK, eliciting a large spread of votes for the categories. He answered this question by showing a chart broken down by different factors, highlighting the large impacts due to age, economic activity, and marital status, but most of all, of self-reported health. Some of these were looked at in more detail, and the impact of age in particular provoked discussion from the audience around the impact of the wealth of the baby boomer generation on this analysis.

The talk finished with a caution over the numbers presented, that they appeared to account for only around 20% of the variation in life satisfaction that is seen, and so it is an area of ongoing interest and research.

Monday 17th June 2019, University of Warwick

Talk summary by Ian Hamilton (Warwick)

Modelling genes: the backwards and forwards of mathematical population genetics

Professor Alison Etheridge OBE FRS - University of Oxford

When Mendelian genetics was rediscovered at the beginning of the 20th Century, it was widely believed to be incompatible with Darwin's theory of natural selection. The mathematical sciences, in the hands of pioneers such as Fisher, Haldane and Wright, played a fundamental role in the reconciliation of the two theories, and the new field of theoretical population genetics was born. But fundamental questions remained (and remain) unresolved. The genetic composition of a population can be changed by natural selection, mutation, mating, and other genetic, ecological and evolutionary mechanisms. How do they interact with one another, and what was their relative importance in shaping the patterns that we see today? Whereas the pioneers of the field could only observe genetic variation indirectly, by looking at traits of individuals in a population, researchers today have direct access to DNA sequences, but making sense of this wealth of data presents a major scientific challenge and mathematical models play a decisive role. In this lecture we'll discuss how to distill our understanding into workable models and then explore the remarkable power of simple mathematical caricatures in interrogating modern genetic data.

Thursday 16th May 2019, University of Warwick

Abstract supplied by Professor Etheridge

Real-time monitoring of vaccine effectiveness for foot & mouth disease control

Professor Chris Jewell, University of Lancaster

Chris gave an informative talk on controlling foot & mouth disease through vaccination. Setting the scene, he provided some data on the 2001 epidemic in the UK, which cost an estimated £8 billion to the economy as a whole and resulted in 10% of the UK cattle herd being slaughtered (about 6.5 million animals). There was another smaller outbreak in 2007 confined to Surrey, although this was orders of magnitude less than the 2001 crisis, and vaccination was discussed at the time as a possible disease control option, but it wasn’t until a 2010 outbreak in Japan that vaccination was deployed during a foot & mouth outbreak.

In the 2010 Japanese event, vaccination was not considered until after the outbreak was underway, so was used in a reactive, not preventative way. Animals vaccinated in a reactive way fall into two categories: either the animals are vaccinated to live beyond the end of the epidemic or they are vaccinated to be culled after the epidemic is over. But vaccines are not robust in the sense that their effectiveness depends on many variables outside the control of the veterinary authorities trying to manage the spread of the epidemic (for example animal-to-animal contact and local spatial dispersion of farmyard locations). And so statistical models are required to monitor and assess the effectiveness of the vaccination program.

A good starting point is the S-I-R compartmental model (Susceptible, Infected, and Removed or Recovered): each member of the population progresses through three “compartments”. This progression can be modelled with differential equations describing the rate at which individuals move through each stage. Analysis of these differential equations leads to an estimate of the proportion of animals that need to be vaccinated, which is itself a function of the rate at which the disease spreads from a single animal. To be effective, the vaccination program has to control this rate so that the epidemic dies out. The practical difficulties of doing this meant that in the case of the 2010 Japanese outbreak the vaccination strategy was to create a “firewall” of vaccinated animals so that the spread was contained within a relatively small geographic area. The disease should then not be able to progress beyond this firewall (the rule of thumb being to vaccinate 10 km out from the initial outbreak and move inwards).
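For the curious, here is a minimal S-I-R sketch in Python with illustrative parameter values (not those from the talk). The ratio R0 = beta / gamma drives the standard vaccination threshold of 1 - 1/R0.

```python
from scipy.integrate import solve_ivp

# Minimal S-I-R sketch with illustrative parameters (not from the talk).
# beta is the transmission rate, gamma the removal rate; R0 = beta / gamma
# gives the standard critical vaccination fraction 1 - 1/R0.
beta, gamma = 0.5, 0.2
R0 = beta / gamma
print(f"R0 = {R0:.1f}; critical vaccination fraction = {1 - 1 / R0:.2f}")

def sir(t, y):
    S, I, R = y
    dS = -beta * S * I
    dI = beta * S * I - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

# Proportions of the herd: nearly everyone susceptible, a small infected seed.
sol = solve_ivp(sir, (0, 120), [0.99, 0.01, 0.0])
S_end, I_end, R_end = sol.y[:, -1]
print(f"proportion ever infected by the end of the run: {R_end:.2f}")
```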

Chris showed an animation of the effectiveness of the vaccination strategy in containing the Japanese epidemic (which lasted a relatively short three-month period from March to June 2010). He also discussed extensions to the S-I-R model that include a Detected or Noticed compartment. This additional state, or compartment, allows the incubation period of the virus to be modelled, because symptoms can take a while to show after infection; in addition, the effectiveness of a vaccination is highly sensitive to the delay between receiving the vaccine and becoming immune. Readers who are interested in following up on some of the mathematical details can learn more here.

Thursday 8th March 2018, University of Warwick

Talk summary by Tim Davis (Tim Davis Consulting)

Alt-3: The Real League Table

Professor David Firth, University of Warwick

David Firth gave a characteristically elegant talk. Alt-3 is a league table that works by adjusting the positions of football teams in the Premier League by taking into account the strength of the teams played against so far.

The mathematics required is based on the Bradley–Terry model, modified to allow for draws and a home-advantage effect. David emphasised that the method is not predictive but rather retrodictive.
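As an illustration only (not necessarily the exact Alt-3 specification), a Davidson-type extension of the Bradley–Terry model with a home-advantage multiplier can be sketched as below; the team strengths and tuning constants are hypothetical.

```python
import math

def match_probabilities(pi_home, pi_away, home_adv=1.3, nu=0.5):
    """Davidson-type extension of Bradley-Terry with a home-advantage
    multiplier. Illustrative only: not necessarily the exact Alt-3 model.
    pi_* are team strength parameters; home_adv inflates the home team's
    strength; nu controls how common draws are.
    """
    h = home_adv * pi_home
    a = pi_away
    tie = nu * math.sqrt(h * a)
    denom = h + a + tie
    return {"home win": h / denom, "draw": tie / denom, "away win": a / denom}

# Hypothetical strengths: a stronger side at home against a weaker visitor.
print(match_probabilities(pi_home=2.0, pi_away=1.0))
```

Fitting such a model to a season's results by maximum likelihood is what allows each team's strength, and hence an "adjusted" table, to be estimated.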

One consequence of the analysis is that Everton have had a relatively tough first part of the season, and so David suggested that the former Everton manager (Ronald Koeman) has been hard done by in being sacked. Presumably Sam Allardyce will reap the benefits.

The method has some particular mathematical interest because the points structure is asymmetrical (three points for a win, one for a draw and zero for a loss: 3-1-0), compared with the 2-1-0 structure the Premier League used to use. Three points for a win was initially championed by Jimmy Hill, a former manager of Coventry City, and due homage was paid to him in the talk. There is a website dedicated to the alternative league table at alt-3.uk.

David supports West Bromwich Albion, but in spite of that he remained cheerful enough to take some interesting questions at the end of the talk.

Thursday 7th December 2017, University of Warwick

Talk summary by Tim Davis (Tim Davis Consulting)

How Forensic Statistics Nailed the Identity of the “Last Plantagenet”

Professor Kevin Schürer, University of Leicester

Richard III was the last king of the House of York and the last English monarch to die in battle. Until recently, the location of his remains was a mystery that historians thought they might never solve.

Historian Philippa Langley campaigned for the 2012 excavation of a Leicester car park, thought to be the site of the friary where Richard was buried. Archaeologists found a skeleton, but it was not immediately identified. Remarkably, in 2013 the University of Leicester confirmed the bones belonged to King Richard III, citing almost indisputable probabilistic evidence from DNA testing and historical records.

The king was killed at the Battle of Bosworth in 1485. Records stated that after the battle, his body was carried off and unceremoniously buried at Greyfriars Friary in nearby Leicester. Following the dissolution of the monasteries in 1538, the friary was sold off and demolished, with subsequent development on the site leaving no trace of the original building above ground.

Archaeologists had a good idea of the area where the friary had been located in the city, but they were only allowed to excavate 17% of this, due to the surrounding buildings. Time constraints further limited the permitted search area to trenches covering just 1% of the whole site. As luck would have it, the first trenches established the exact position of the friary. Conditioning on this, the archaeologists excavated the choir, the most likely burial site for an important person. Beyond all expectations, the team discovered a skeleton with battle injuries and deformities consistent with those described in historical texts. The next problem was proving beyond reasonable doubt that these bones in fact belonged to Richard III.

Using contemporary sources, the team established a number of facts about Richard, including his appearance, build, age and genealogical history, aiming to test the hypothesis that the skeleton matched these characteristics and was therefore that of the king. Mitochondrial DNA extracted from the remains provided considerable evidence that the skeleton was at least related to Richard III. This, combined with additional information about the body's scoliosis (curved spine), carbon dating and osteological analysis, gave very strong evidence in support of the hypothesis that the bones were Richard's.
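The published analysis combined these strands of evidence through likelihood ratios; the sketch below shows the arithmetic of that approach with entirely hypothetical placeholder numbers, not the values from the Leicester study, and it assumes the strands are roughly independent.

```python
# Illustrative sketch of how independent strands of evidence combine via
# likelihood ratios (Bayes' theorem in odds form). Every number below is a
# hypothetical placeholder, NOT a value from the Leicester analysis.
prior_odds = 1 / 50          # hypothetical prior odds that the skeleton is Richard III

likelihood_ratios = {
    "mitochondrial DNA match": 500,
    "radiocarbon date consistent with 1485": 10,
    "age and sex of skeleton": 5,
    "scoliosis": 20,
    "battle injuries": 5,
}

posterior_odds = prior_odds
for evidence, lr in likelihood_ratios.items():
    posterior_odds *= lr     # assumes the strands are (roughly) independent

posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"posterior odds approx {posterior_odds:,.0f} to 1 (probability {posterior_prob:.6f})")
```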

Professor Schürer gave a fascinating insight into the role that statistics played in identifying Richard III, while presenting a captivating overview of the gruesome history of the Wars of the Roses.

Thursday 12th October 2017, University of Warwick

Talk summary by Nick Tawn (Warwick)

From Lotteries to Polls to Crimes to Monte Carlo

Professor Jeff Rosenthal, University of Toronto

There was an excellent turnout for Jeff's entertaining and captivating talk, which highlighted some of the misconceptions people have about unlikely events. The idea of Monte Carlo algorithms was introduced and motivated in a way that was accessible to the whole audience.

Jeff began by comparing the probability of winning the lottery to other unlikely events — it is about as likely as the person sitting next to you being a future prime minister! This was then followed by discussion of why certain rare events, often seen as highly remarkable or even fate, can actually be far more likely than they first appear.

Coin flip simulations were used to motivate the Law of Large Numbers. Using this, Jeff discussed some recent notable election results and how the pollsters managed to get their predictions so wrong in each case, giving insight into the response biases and non-independent nature of polling.

A highlight of the talk showed the real impact that statistics can have: Jeff described some work that exposed fraudulent lottery claims in the “Ontario Lottery Scandal”. This being a “statistical win” according to Jeff, he then followed with a “statistical fail”, recounting the sad case of Sally Clark and her wrongful conviction for infanticide. This exemplified the pitfalls of incorrectly assuming statistical independence and the so-called Prosecutor's Fallacy.

The final part of the talk gave a nice insight into the casino game of craps. Though the game seems reasonably fair, Jeff showed that even just a tiny bias in favour of the house yields a long-term win for the casino, with gambler's ruin becoming more certain as stakes increase.
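The classical gambler's-ruin formula, sketched below with an illustrative near-fair win probability (the figures are not taken from the talk), shows how quickly a tiny house edge turns into near-certain ruin over long play.

```python
def ruin_probability(p_win, start, target):
    """Classical gambler's-ruin formula: probability of losing everything
    before reaching `target`, betting one unit per round with win
    probability p_win. (Standard textbook result, not from the talk.)
    """
    if p_win == 0.5:
        return 1 - start / target
    r = (1 - p_win) / p_win
    reach_target = (1 - r**start) / (1 - r**target)
    return 1 - reach_target

# p_win = 0.493 is roughly the craps pass-line bet (illustrative figure).
for start, target in [(50, 100), (500, 1000)]:
    print(f"start {start}, target {target}: P(ruin) = {ruin_probability(0.493, start, target):.3f}")
```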

The talk concluded with a description of Monte Carlo techniques and their wide range of uses in complex situations: simulations act like polls, and as we collect more and more samples we become increasingly certain of the true result.

Wednesday 3rd May 2017, University of Warwick

Talk summary by Nick Tawn (Warwick)

Measuring Sustainable Development

Matthew Powell, Oxford Policy Management

This RSS meeting, led by Matthew Powell, senior economic statistician at Oxford Policy Management, was devoted to the Sustainable Development Goals (SDGs), officially known as “Transforming our world: the 2030 Agenda for Sustainable Development”. Before discussing the SDGs in more detail, Matthew outlined their concept and origin: in 2015, the 194 countries of the UN General Assembly adopted the 2030 Agenda for Sustainable Development. Following the adoption, UN agencies decided to support a campaign that introduced 17 aspirational “Global Goals”, stretching from “No Poverty” and “Climate Action” to “Peace, Justice and Strong Institutions”.

Matthew explained that the aforementioned goals are accompanied by 169 targets and 230 global indicators. Whereas targets are unambiguously defined (e.g. “to reduce child mortality by a third in the next 15 years”), the indicators are split into three different categories (tiers) depending on the methodology and availability of data. The global indicators were developed following a series of Inter-Agency and Expert Group (IAEG) meetings that took place between 2015 and 2017. Interestingly, the IAEG meetings resulted in a call for a “Data Revolution” in order to devise and improve instruments to tackle the indicator system.

In his talk, Matthew related the SDGs to another set of UN goals, the Millennium Development Goals (MDGs) set in 2000. Some MDG achievements are impressive: for example, during 2000–2015 the number of people living in extreme poverty decreased by nearly a half. At the same time, the MDG experience revealed various issues. Many of them were political obstacles, but there were also outstanding statistical challenges along the way. For instance, it was only in 2014 that the number of observations for some indicators exceeded 16 for the majority of countries. In addition, some developed countries such as the UK have a “leave no one behind” policy. Under this policy, collected data should include information about every group of people in the area where a survey is conducted, so that no community is overlooked in assessing progress towards the Global Goals.

Giving his concluding remarks, Matthew emphasised the importance of global partnership, which has not lost its role in a world of advanced statistical methods and increasing amounts of collected data. How do we write a summary of 200 different indicators? How can one exploit the results on the national and global scales? How can we control for attempts by certain governments to game the system? How do we know that everything possible is being done in order to achieve the SDGs in societies where many indicators are improving anyway, regardless of policies undertaken? These and many other issues can be resolved only through worldwide collaboration of governments and international organisations.

Thursday 9th March 2017, University of Warwick

Talk summary by Cyril Chimisov (Warwick)

The Arctic tragedy of the Franklin expedition of 1845: statistical and forensic insights

Professor Keith Millar, University of Glasgow

Keith's talk shared the story of a national tragedy: the disastrous 1845 Royal Navy expedition by Captain Sir John Franklin to discover a North-West Passage through the Canadian Arctic. Just decades before Scott's ill-fated Antarctic journey, the Franklin expedition resulted in the deaths of 129 men, though it is a tale perhaps less well-remembered by the public today.

In the mid-19th century, the Arctic was uncharted and unknown. Sir John Barrow, Second Secretary to the Admiralty, made it his life's work to discover a North-West Passage, which would greatly shorten journey times for shipping. Captain Sir John Franklin was greatly respected at the time, famous as "the man who ate his boots" on an earlier expedition. Two steam auxiliaries, HMS Erebus and HMS Terror, set sail on 19 May 1845. They were last sighted in July 1845. Nothing more was heard from the expedition for the next five years.

The Admiralty, facing growing pressure from Lady Franklin, Parliament and the media, eventually offered a £20,000 reward and dispatched several search parties. In 1850, it was discovered that the expedition had wintered on Beechey Island in 1845–46, leaving behind the graves of three crewmen who had succumbed to tuberculosis. It was not until 1854 that Dr John Rae returned from the Arctic with grim news: the entire expedition was lost, and there were even signs the crew had resorted to cannibalism. A note found on King William Island in 1859 revealed the ships had become trapped in ice, and abandoned, with Franklin himself dying on 11 June 1847. It is assumed that all of the crew perished by the winter of 1848.

Several theories have been posited over the years to explain why the expedition failed so catastrophically. High concentrations of lead in skeletal remains point to lead poisoning, perhaps from tinned food, as a cause. However, at the lead levels experienced by Franklin's crew, their mental competency would not be affected significantly more than the general population, which in industrial Victorian Britain was exposed to large amounts of lead. Moreover, the search parties were identically provisioned and faced no such calamity, as analysis of their medical logs (stored in the National Archives) shows. See Swanston et al. (2016) and Christiansen et al. (2016).

Other proposed explanations are scurvy (but there were no signs in recovered skeletons, and few cases among the search party), tuberculosis (again, not a major cause of deaths among the search party) or polar bears (the crew were extremely well armed). Perhaps a more likely cause was the climate: ice core analysis shows 1846–1850 was exceptionally cold, with much less summer melting of the ice than usual.

No satisfying conclusion has yet been reached, but the discovery in 2014 of the wreck of HMS Erebus, and in 2016 of the Terror, provide the possibility of recovering paper records that could settle the mystery once and for all.

Thursday 8th December 2016, University of Warwick

Talk summary by David Selby (Warwick)

Improving statistical quality in published research: The clinical experience

Professor Martin Bland, University of York

On Wednesday 9th November we were delighted to be joined by the eminent Emeritus Professor Martin Bland who gave a talk entitled ‘Improving statistical quality in published research: The clinical experience’.

Martin began his talk by describing the state of medical research in 1972, summarising the findings of a review of articles published in the Lancet and British Medical Journal. For one month in 1972, Martin found the median sample size of published studies was approximately 30. Martin also looked at the number of studies from this period using hypothesis tests and reporting p-values: approximately half of these studies reported p-values, but the use of confidence intervals was rare.

Martin then told the audience about the situation when reviewing papers published in the Lancet and British Medical Journal in 2010, again for a period of one month. The situation had changed somewhat, with fewer medical research studies being published. Those published in 2010 had dramatically increased sample sizes compared with 1972: for the Lancet papers the median was approximately 1,600, and for the British Medical Journal papers the median sample size was in excess of 10,000. Of the studies identified in 2010, most reported confidence intervals alongside estimates, with some also reporting p-values.

Martin then reflected on the important work that had potentially contributed to this improvement in quality. Of note was the suggestion of Peto in 1981 to concentrate on large and simple trials; the work of Gardner and Altman in 1986 advocating the use of confidence intervals rather than p-values, and journals including this in their guidance to authors; and the work of Altman et al. in 1981, and many others subsequently, involving quality assessment of journals. In addition to these, Martin recognised the potential for improvement in quality to be due to the ability of readers to critique published work, the use of statistical referees by journals, and the impact of the CONSORT statement and similar initiatives.

Martin did, however, warn that this is not the case in all areas of medical research. The Lancet and the British Medical Journal are leading general medical journals. The quality of research in specialist and non-clinical journals is still poor, as demonstrated by a review carried out by Kilkenny et al. (2009). It is the duty of statisticians to challenge this poor research by taking part in peer review, challenging published research and educating other researchers.

Wednesday 9th November 2016, University of Birmingham

Talk summary by Alice Sitch (Birmingham)

Getting the numbers out there: the public role of statisticians

Professor Sir David Spiegelhalter, University of Cambridge

This often laugh-out-loud funny talk had a serious key message: that statisticians have an important role to play in public life.

David, who holds the position of Winton Professor of the Public Understanding of Risk, gave several examples of using analogies and easily explainable concepts to help the general public grasp risk, such as the concept of a micromort (a one-in-a-million chance of sudden death). He also showed how clearly drawn expected frequency trees can help people make medical decisions, such as whether or not to undergo screening for breast cancer. Statisticians have an important role to play in working with organisations such as the NHS to give the public the tools they need to make these kinds of decisions.

David also highlighted another big role for public statisticians: stop silly stuff getting into the news. I'm sure it will not surprise readers of this summary that the newspaper headlines "'Bacon, ham and sausages have the same cancer risk as cigarettes' warn experts" and "Why going to university increases risk of getting a brain tumour" are not accurate representations of the studies being reported. Public statisticians have a job to do in helping journalists understand the statistics in such studies, and also in debunking such reporting whenever it does appear.

Concluding his talk, David pointed out the increasing demand for insight into numbers. In order to get the message out there, there need to be statisticians with media training. Developing contacts with policy makers and the media is also important (though this takes a while). As incoming president of the RSS, he affirmed the role that the Society has to play, and the steps being taken in that direction, such as the ambassador programme, responding to consultations and acting as a conduit for helping journalists.

As a final note, those who weren't there to see the (unfortunately) discarded draft cover for his book Sex by Numbers missed a treat. I will never be able to look at a three-category bar graph in the same way again. And, as he recalled, even the Winton Professor of the Public Understanding of Risk is not immune from questionable press coverage, as demonstrated by the hilarious story about how a joke he made when promoting the book made international headlines as "'Britons are having less sex and Game of Thrones could be to blame' warns Cambridge professor"!

Wednesday 26th October 2016, University of Warwick

Talk summary by Ella Kaye (Warwick)

Statistics: a data science for the twenty-first century

Professor Peter Diggle, Lancaster University

This talk shared Peter’s vision and thoughts on the discipline of statistics based upon his experience working with the Faculty of Health and Medicine at Lancaster University, the Institute of Infection and Global Health at the University of Liverpool and as one of the founding co-editors of the journal Biostatistics. Peter discussed the rise of data science and its relationship with statistics, indicating that although data science may be seen by some as a threat to statistics, it may also provide an opportunity for statisticians to promote the importance of the correct application of statistical methodology for data interpretation. This has become of particular importance in recent years as the links between information technology and statistics have strengthened, giving rise to new fields such as electronic health, or e-health, research.

Following this, Peter provided examples of some recent e-health research projects in which he has been involved, including real-time spatial surveillance of gastroenteric illness, National Health Service prescribing patterns for Ritalin, monitoring of long-term progression to end-stage kidney failure, and a programme for onchocerciasis (also known as river blindness) control in Africa. These examples highlighted areas for which statistical methodologies may be applied to interpret data and influence policy, but for which skill in informatics is necessary to first acquire the data and provide a useful solution.

In his concluding remarks, Peter highlighted the importance of interdisciplinary work which includes input from statisticians and how a University may be structured to facilitate and encourage interdisciplinary collaboration. Finally, as University students and researchers are the product of a number of education systems before they reach University, Peter also gave topic suggestions which he believes should be taught at each stage of education.

Wednesday 26th October 2016, University of Warwick

Talk summary by Alejandra Avalos Pacheco (Warwick)

Data Science: the Nexus of Mathematics, Statistics and Computer Science?

Professor Mark Girolami, University of Warwick

The RSS West Midlands Local Group invited Prof Mark Girolami to speak at the University of Warwick Department of Statistics on 5th May 2016. Mark's talk, entitled "Data Science: the Nexus of Mathematics, Statistics and Computer Science?" explored the meaning of the popular phrase "data science". Is it a hip new title, or merely a buzzword? Mark argues that unlike the mostly meaningless term "big data" espoused by marketeers, data science has a historical basis and describes the interplay between theory, application and translation of statistical, mathematical and technological ideas.

Mark posits that the first "data scientist" was Karl Pearson (1857–1936), an enthusiastic user of (manual) computing who applied mathematics and statistical analysis to many areas. Other early data scientists include Ronald Fisher (1890–1962). Both developed tools to analyse data and answer questions, relying on the latest contemporary technology such as the Brunsviga mechanical calculator.

Fast-forward to the modern day, where Mark demonstrated his own forays into data science problems: one example involved developing an adaptive, inexpensive process for assessing the authenticity and quality of banknotes deposited in NCR cash machines in the Far East. A large number of indicators were available to detect counterfeit or worn-out notes, including light sensors, note thickness, magnetic properties and logs of ATM usage patterns. The solution made use of feature extraction and probabilistic multiple kernel learning (pMKL) to classify the banknotes. While these may sound like bleeding-edge techniques, the technology is no different in substance to Fisher's discriminant analysis from the 1930s. Clearly, many data science problems are not new, but they are not resolved either.
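The pMKL system itself was not shown in detail; as a rough illustration of the idea of combining kernels built on different sensor feature groups, here is a minimal sketch with synthetic data. The fixed kernel weights are a simplification: in multiple kernel learning they would be learned from the data.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Illustrative sketch of combining kernels over different sensor feature
# groups; NOT the pMKL system from the talk. Data are synthetic placeholders
# standing in for, e.g., optical and thickness measurements of notes.
rng = np.random.default_rng(4)
n = 400
optical = rng.normal(size=(n, 5))
thickness = rng.normal(size=(n, 2))
genuine = (optical[:, 0] + thickness[:, 0] > 0).astype(int)  # synthetic label

# One kernel per sensor group; a weighted sum of kernels is itself a kernel.
# Fixed weights here -- in multiple kernel learning these would be learned.
K = 0.7 * rbf_kernel(optical) + 0.3 * rbf_kernel(thickness)

clf = SVC(kernel="precomputed").fit(K, genuine)
print("training accuracy:", clf.score(K, genuine))
```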

Mark also highlighted the role of mathematics in uncertainty quantification in models describing natural phenomena such as groundwater flow, circadian rhythms and the human heart. Such models are so complex that making inferences and quantifying uncertainty is hugely challenging: even a finite-dimensional representation can take hours to simulate.

In the twenty-first century as in the 1920s, data science problems involve the development of theory, the involvement of domain experts and the utilisation of modern computing technology. There are opportunities for mathematicians, statisticians and computer scientists in each of these areas.

Thursday 5th May 2016, University of Warwick

Talk summary by David Selby (Warwick)

Universities Superannuation Scheme: some statistical reflections on pensions

Professor Jane Hutton, University of Warwick

On 3rd March 2016, Jane Hutton spoke at the Royal Statistical Society West Midlands local group meeting. Jane Hutton is a Professor at the University of Warwick's Statistics department and a recently appointed non-executive director (NED) of the Universities Superannuation Scheme (USS).

Professor Hutton's talk was entitled 'USS: some statistical reflections on pensions'. She began with the importance and role of pensions in old age and retirement, then gave a broad overview of pensions: for example, pensions depend on and influence family and social structures, and traditionally provision depended on children.

Professor Hutton posed the following questions: 'How much should I save?', 'How much should be saved?', and 'Who should save: the state, employer or individual?' She suggested thinking about: 'how long will I work and live in retirement?' This provided the motivation for the next part of her talk on mortality.

Mortality was a central theme in Professor Hutton's talk. She displayed graphs of gender differences in life expectancy at birth in England and Wales, 1841–2009, and of mortality by social class. She also displayed and questioned the Employers Pension Forum's claims about life expectancy and compared them with actual data. She explained how actuaries used "Continuous Mortality Investigation 39" to determine longevity risk and expressed concern about discrepancies in the data and how they have been interpreted.

Professor Hutton also compared her experience as a NED of the USS with her experience in medicine as a statistician. For example, she has been concerned by the initial reluctance to share the data completely for her to examine and mentioned that she gets especially concerned when people do not share their data and respond with "trust me". Despite her concerns, she commented how impressed she was with the professionalism and dedication of many of the USS's staff.

Thursday 3rd March 2016, University of Warwick

Talk summary by Kenneth Lim (Warwick)