At the IATUL follow-up workshop on library statistics in Africa, on April 19, 2013, Tord Høivik prepared three sessions - with background papers. The sessions covered three different aspects of the subject:
All materials are also published, with a CC licence, on the web site https://sites.google.com/site/practicalstatistics/2-events/cape-town
This document includes all three papers.
Change is the key to knowledge
My three axioms are:
Many people speak about EBLIP, or evidence-based librarianship and information practice. But EBLIP and library statistics are only different names for the same animal.
The "evidence movement" developed in medicine about thirty years ago. Since then, the principles have expanded into many practical professions: nursing, social work, teaching, librarianship.
Evidence-based practice means that decisions should be based on systematic evidence
Change from below
For many years we have tried to change the statistical practices of librarians through committees, concepts and proposals from the top. This approach does not work.
Librarians are not willing to change their routines just because committees, without power or money to impose their views, say so. We have to start at the bottom. That means to improve existing data and current practices, year by year, in cooperation with the libraries that do the actual work of collecting, interpreting and applying statistical data.
We have to change from a top-down to a bottom-up approach.
Top-down work is easy to organize. You gather ten people around a table and ask them to make proposals. After one or two years the committee is finished. Implementation is left to the libraries. If they do not want to do what the committee proposes, the process stops here. The committee draws up a plan of work, but leaves the work itself to the library community.
Bottom-up work is hard to organize. Organizations don’t enjoy change. They resist change. The change agent has to find libraries and library organizations that are willing to cooperate. We have to form networks and production teams rather than committees. We have to test, to train and to argue our case. We have to struggle with the intellectual and material difficulties of statistics production. This is not for the faint of heart. It takes years of commitment and thousands of hours of work.
Change can be encouraged at the top, but must be realized at the bottom. That’s the way the world works.
The world has about two hundred countries. Fewer than twenty collect good library statistics at the national level.
This does not mean a total lack of statistics. Nearly all countries have universities. These universities have libraries. Today, most of the libraries operate digital systems. The systems can be used to generate a variety of statistical reports.
But potential access to statistical data does not, by itself, lead to data-driven decision making. Decisions combine political and rational elements. Evidence-based practice requires an evidence-based culture. Managers must respect, promote and integrate statistical data into their daily work. Staff must accept and share statistical arguments.
Statistical literacy is the ability of organizations and individuals to understand and to argue with statistics - without losing sight of other important considerations.
Statistical literacy is not widespread. More important: practical statistics is seldom recognized as a professional skill. The first step towards statistical literacy is to accept that amateurs differ from experts.
Typical amateur practices are:
I am not saying that this is the answer. But statistics is about change. The intention is to look at possible actions - to help the discussion along.
The job title was information reporter, but the job description shows that the university wanted a "library statistician". Academic libraries in the United States would probably use the title assessment librarian.
The main purpose was statistical reporting. The candidate would
The candidate should be able to
Since I believe in the re-use of good stuff, I have copied the introduction - but added my own sub-headings.
Students are not staff. Visits are different from loans. The number of seats, the number of volumes and the actual size of the building show size from different perspectives.
Single indicator values are hard to judge.
To get a balanced picture of a library we must combine several indicators into groups or sets of indicators.
It is easy for a single person to construct a nice set of indicators. It is harder to achieve consensus within a committee. It is nearly impossible to get hundreds and thousands of librarians to actually use the well-intentioned proposals.
The reasons are pretty obvious.
Statistical indicators are similar to exams. Only the best students enjoy them. The rest would rather do something else.
Most librarians dislike numbers. Their culture is partly literary and partly control oriented, but always qualitative. They tend to be great talkers, but not so great calculators. They avoid numerical arguments, since these reveal their lack of skills and interest in quantitative reasoning.
It is also the case that different libraries have different interests. Indicators are similar to sports. We prefer to compete in our favorite event. We want indicator sets that focus on our best qualities - and avoid the dark secrets that lie behind them. Doing well with a balanced set of indicators requires a broad focus. It becomes a decathlon struggle rather than a long jump or throwing the javelin.
But let us take a look at two indicator sets: the ISO standard 11620 and the German library index BIX.
ISO in Italy
The newest version of ISO 11620, from 2008, comprises forty-five indicators. I have listed them here.
The total is rather overwhelming. The ISO committee does not insist that libraries should use all the indicators, or that public and academic libraries should use the same subset. Librarians are asked to use their own judgement.
But this standard has a top-down character. It is not supported by a network of practitioners who are actually using the indicators - and reporting on their use.
Let me illustrate by a case from Italy. This is one of the few reports I've seen that describes actual use.
In 2002, Paolo Bellini published an article on performance measurement as a marketing support for libraries. He wrote:
At the Library of the University of Trento, we have been applying performance indicators since 1998 as a planning and evaluation tool and to indicate the working of the library. The ISO standard 11620 was chosen from the outset for various reasons: it was specifically designed for libraries and is the most recently developed method. ...
The main phases of the project have been:
The major difficulties encountered in carrying out the project were:
This seems to be a realistic assessment of the situation. It confirms a study I did last year, on the use of indicators in Norwegian libraries.
That paper explored the great discrepancy between recommended library indicators, on the one hand, and the actual use of statistics by libraries and librarians, on the other.
Norway has three official indicator sets: a thirteen-indicator set for public libraries, another thirty-indicator set for public libraries, and a set of twenty-four indicators for academic libraries. But the recommended sets are hardly used at all. The proposals are not taken seriously by the intended users.
BIX in Germany
The German library index BIX is, in my view, the most successful library indicator system in the world.
There are several reasons for this:
Indicators are technical tools.
ISO and BIX
ISO 11620 includes two indicators based on the number of visits:
The first one [LVPC] is widely used. The second is seldom used - and not very informative, I would add.
In either case we have to specify what we mean by a library visit.
Personally I would define a genuine library visitor as a person who enters the library premises in order to use the library's professional facilities as a user. This concept excludes:
When we set up a system to register visitors it may be impractical to distinguish customers from people who just happen to enter for other reasons. But that is not a reason to be vague about the concepts involved.
Do not confuse meaning with measurement.
In ISO 2789 a visit is simply defined (2.2.40) as a person (individual) entering the library premises.
But the procedure (6.2.10) adds: where necessary, the count should be adjusted to deduct entrances and exits of library staff, and of any persons visiting other institutions or departments situated within the library building.
This seems unclear. I would rather define a visit as
a person who enters the library premises in order to use the library's professional facilities as a user.
If the actual measurement is carried out by turnstile count or electronic counter, there are various sources of errors. For instance:
There may be additional errors due to the actual location of the turnstile or the counter. With manual counts, the errors mentioned can mostly be avoided.
If the library suspects substantial errors of measurement (more than +/- 5%, say), the counter should be calibrated by running a manual count in parallel for a suitable period of time.
Visits per user
The first indicator can be improved by dividing the population (target group, size N) into users and non-users.
LVPC and LVPU are related by a simple formula. Let A = visits per capita (LVPC), B = visits per user (LVPU), P = the proportion of users in the population and Q = 1 - P = the proportion of non-users. Then

The total number of visits = A * N = B * (P * N) + 0 * (Q * N) = (B * P) * N

Dividing both sides by N gives A = B * P.
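As a quick check of the relationship above, here is a small sketch. All the numbers (N, P, B) are invented for illustration:

```python
# Sketch of the LVPC/LVPU relationship: A = B * P.
# Symbols: N = population size, P = user share, B = visits per user (LVPU),
# A = visits per capita (LVPC). The values below are hypothetical.

N = 10_000        # target population
P = 0.40          # share of the population that uses the library
B = 12.0          # average visits per user per year (LVPU)

total_visits = B * (P * N)    # non-users contribute 0 visits
A = total_visits / N          # visits per capita (LVPC)

assert abs(A - B * P) < 1e-9  # A = B * P, as derived above
print(A)  # 4.8
```

The check confirms that visits per capita is simply visits per user scaled down by the user share of the population.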
This distinction is often fruitful in library planning. It helps us differentiate between users and non-users. Library managers would normally want
To be statistically literate you have to be comfortable with fractions and percentages.
Indicators are ratios. When we calculate visits per capita, we divide the number of visits (the numerator) by the size of the target population (the denominator).
To understand the concept of visits per capita we need to understand the denominator as well as the numerator. What do we mean, or what should we mean, by target population?
I will not go into this discussion, but simply say: this is not an easy question. Libraries often serve many different groups, in different ways. Public libraries are used by non-residents as well as residents. Academic libraries are used by staff, on-site students, part-time students, distance students, and sometimes by the local community.
It is also clear that defining the target population tends to be more difficult in the South than in the North.
My final example is a bit more technical. I refer to the role of statistical sampling in evidence-based practice.
Sampling is a dangerous area. Sampling is not difficult - if you follow the rules. But people are constantly tempted to break the rules. Most researchers are lazy. It is much easier to work with the data at hand than to collect data systematically.
Convenience sampling is the best example. I'll simply cite Wikipedia:
Accidental sampling (sometimes known as grab, convenience sampling or opportunity sampling) is a type of non-probability sampling which involves the sample being drawn from that part of the population which is close to hand. That is, a sample population selected because it is readily available and convenient. The researcher using such a sample cannot scientifically make generalizations about the total population from this sample because it would not be representative enough.
I want to measure what is happening at my library. I want to know, say, about:
Librarians cannot monitor and register everything that happens on a continuous basis. The cost would be too high.
The methods below will work whether the library operates on a digital or a manual basis (no automated catalogue and no electronic counter at the entrance). The idea of sampling is the same.
Masters of the universe
Data collection is work – hard, disciplined work – and should be kept to a minimum. Statistical sampling is a technique for collecting data that minimizes the work, while providing the answers we need. When we sample, we take a selection from a larger total – using a particular (and strictly enforced) technique – and treat the sample as if it were the total.
The total is often called the universe or the population.
Perfect accuracy is seldom needed. It does not really matter whether the library had 13,415 or 13,615 visitors in 2006. But the difference between thirteen and fifteen thousand visitors matters.
As a rule of thumb, I would say:
We can usually get information that is good enough for practical decision-making, from a sample of a few hundred items. The greater the sample, the greater the accuracy.
A very basic, and also very surprising, statistical rule is: the size of the original population does not matter. Accuracy depends only on the size of the sample.
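This rule can be illustrated with the standard error of a sampled proportion. The sketch below is my own illustration, not part of the course material; it uses the textbook formula SE = sqrt(p(1-p)/n) and a rough 95% margin of error of 1.96 * SE:

```python
# Illustration: the margin of error depends on the sample size n,
# not on the population size. For a sampled proportion p, the standard
# error is sqrt(p*(1-p)/n); a rough 95% margin of error is 1.96 * SE.
import math

def margin_of_error(p: float, n: int) -> float:
    """Approximate 95% margin of error for a sampled proportion."""
    return 1.96 * math.sqrt(p * (1 - p) / n)

# A few hundred items already give practical accuracy:
for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.5, n), 3))
# 100 -> 0.098, 400 -> 0.049, 1600 -> 0.025
```

Note that quadrupling the sample only halves the margin of error - and the size of the library, or the country, never enters the formula.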
First example: selecting books
Let me apply the idea of sampling to the book collection.
My library has, say, ten thousand books. I want to know how up-to-date my collection is, by looking at the year of publication.
Checking ten thousand cards and writing down ten thousand numbers does not appeal to me. I appeal to statistics and take a sample of – say – two hundred cards instead. This could, for obvious reasons, be called a two percent sample.
The big idea in statistical sampling lies in the way you go about selecting the sample from the total.
You should not pull two hundred consecutive cards from the nearest drawer. Nor should you rummage around, taking one here and one there as the mood takes you. The sample should
There are many ways of achieving this. The simplest is probably to take a look at every fiftieth card and write down the year of publication. I have to choose my sample from the "whole population" and in a proper "mechanical" way.
The distribution of these two hundred numbers will provide a good approximation to the true distribution based on all ten thousand publication years.
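The "every fiftieth card" procedure can be sketched in a few lines. The publication years below are simulated; in a real library they would be read off the catalogue cards:

```python
# Sketch of the systematic two percent sample described above.
# The catalogue of 10,000 publication years is invented for illustration.
import random

random.seed(1)
catalogue = [random.randint(1950, 2012) for _ in range(10_000)]  # hypothetical years

start = random.randint(0, 49)   # random starting point among the first fifty cards
sample = catalogue[start::50]   # every fiftieth card: a 2% sample

print(len(sample))                    # 200
print(sum(sample) / len(sample))     # approximates the true mean publication year
```

The random starting point matters: always beginning with card number one would still be mechanical, but a random start makes every card equally likely to be chosen.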
Second example: selecting users
Concepts are important. If we want to study users, we must first decide the limits of the population.
For instance, do I mean:
Let me first define a user as a person who is registered as a user. My population will then consist of a set of registration cards.
I want to understand the social impact of the library by looking at the demographic distribution of users in the local community.
The selection procedure is simple. Since 6,000/200 = 30, I may simply start with a number between 1 and 30 and look at every thirtieth card.
Let me now define a user as a person who has borrowed materials during the last year. My population will then consist of a subset of the registration cards (or database posts).
I want to understand the relative use of the library by different groups of students.
The selection procedure is simple. Since 6,000/200 = 30, I may simply start with a number between 1 and 30 and look at every thirtieth card/post.
If the card/post belongs to a staff member, or to a student who did not borrow anything during the last year, take the next one (or the third, fourth, ...) instead.
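The same procedure, including the skip rule for ineligible cards, can be sketched as follows. The card data (staff flags and borrowing activity) are invented for illustration:

```python
# Sketch of the second example: a systematic sample of registration
# cards, skipping staff members and inactive borrowers as described above.
# All card attributes are hypothetical.
import random

random.seed(2)
cards = [
    {"id": i,
     "staff": random.random() < 0.05,              # ~5% staff cards
     "borrowed_last_year": random.random() < 0.6}  # ~60% active borrowers
    for i in range(6_000)
]

def eligible(card):
    return not card["staff"] and card["borrowed_last_year"]

start = random.randint(0, 29)
sample = []
for i in range(start, len(cards), 30):  # every thirtieth card
    j = i
    while j < len(cards) and not eligible(cards[j]):
        j += 1                          # skip rule: take the next eligible card
    if j < len(cards):
        sample.append(cards[j])

print(len(sample))  # about 200
```

With 6,000 cards and a step of thirty, the procedure yields roughly two hundred eligible cards, which is enough for practical decision-making by the rule of thumb above.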
Third example: selecting days
My library is open – say – six days a week. We open at 9 am, take a break from noon til 2 pm, and open again from 2 till 6 pm. On Saturday, there is no afternoon session. The library is also closed for a total of four weeks during holidays.
This means that the library is open 6 * (52 – 4) = 288 days a year.
I want to know the number of visitors we have in a year. I know, from experience, that library use tends to vary systematically during the day, during the week and during the year.
If I want to know the true number of visitors,
There are, as before, many ways of achieving this. The easiest is probably to select a small number of “counting days” throughout the year. On these days all visitors are counted.
You may, for instance, start with the first Monday in January – and continue with the first Tuesday in February, the first Wednesday in March and so on. This approach will give you 12 days, or two full weeks, covering the whole year. Since the library is open 48 weeks a year, you find the total number of visitors by multiplying the observed number by 24.
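The counting-days estimate then reduces to one multiplication. In the sketch below the twelve daily counts are invented; in practice they are the manual counts taken on the counting days:

```python
# Sketch of the counting-days estimate described above.
# The visitor counts for the twelve counting days are hypothetical.

counts = [310, 280, 295, 260, 330, 150,   # first "constructed week" (Mon-Sat)
          320, 300, 285, 270, 340, 160]   # second "constructed week" (Mon-Sat)

# The 12 counting days represent two full weeks. The library is open
# 48 weeks a year, so we scale the observed total by 48 / 2 = 24.
estimated_visits_per_year = sum(counts) * 24
print(estimated_visits_per_year)  # 79200
```

Spreading the counting days over all months and all weekdays is what protects the estimate against the daily, weekly and seasonal variation mentioned above.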
The author, Tord Høivik, is a former library teacher from Norway. He has a professional background in statistics and sociology and is active on the web.
Let us move from literacy to advocacy.
My friend Ray Lyons is a specialist in library statistics. A real expert. A genuine, honest-to-goodness professional. In his blog, and other publications, he concentrates on the United States and American library statistics. He is a strong library advocate - and a severe critic of weak library statistics. We cannot defend our libraries with plastic swords.
Some good advice:
Only Trust Numbers
In Kahneman's famous book, Thinking, fast and slow, hunches belong to system 1 (fast), while quantitative analysis belongs to system 2 (slow).
Never Trust Numbers
Before we reconcile our apparently inconsistent advice, first let us explain why numbers are not worthy of your trust:
Numbers are Answers
A number only gets to be useful when considered as the answer to a question. To be a good consumer of numbers, the reader must constantly ask himself:
Inductive or descriptive
A final point.
In library schools, statistics have usually been taught as a tool for research. This means an emphasis on probabilistic models and hypothesis testing (inductive statistics). I believe in a different approach. Most librarians need management data rather than research. This requires a good understanding of the descriptive statistics that are produced on a regular and repetitive basis by ILS and other administrative systems.
I fully agree with his description of the situation with respect to advocacy:
BSLA activities in Africa include
Such a course can not be fully standardized, however. The levels of library development, and of official statistics, differ too much. Each course will take place in a particular setting and must be aimed at participants from a particular working environment. Africa is different from Europe. Northern Europe is different from Southern Europe.
Adapting to local conditions means, concretely, to adapt the course materials to:
A. Relevant conditions in the national or regional library environments, such as:
Based on our experience, advocacy training for library staff should build knowledge and skills in the following areas to be most effective.
We recommend that advocacy training for library staff follow these principles to have the greatest impact on library staff behavior.
The principles of high quality graphical data presentation have been articulated by William Cleveland, Edward Tufte, Howard Wainer and others. Good graphing practice is based on these three rules:
The actual graphics are in colour and will be shown on screens.
This page is an example of the actual teaching material for IFLA Statistics for Advocacy training course.
Through this module the participants should gain a basic understanding of:
This page is another example of the actual teaching materials for IFLA Statistics for Advocacy training course.
In statistics, the group we want to study is often called the universe or the (statistical) population.
For most public libraries, the population (of the community) is so large that we cannot contact every single individual. Instead we have to take a population sample.
A sample should represent the whole population. It should therefore be as similar to the population as possible. The sample should have about the same percentage of men and women, of young and old, of workers and non-workers, of literates and illiterates, as the population as a whole. Such a sample would be a representative sample.
It is not easy to select representative samples. Unless you follow some rather strict rules, your sample is likely to be biased. The values that you find in a biased sample can not be generalized to the population. Biased samples give skewed results.
Biased samples are very common. It is very tempting to select people that are easily available. Such samples are sometimes called convenience samples. Data from convenience samples tell very little about the population as such.
They are convenient for the "researcher", since they reduce the amount of work. They are inconvenient for the reader, who is presented with false, misleading or irrelevant pieces of information.
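A small simulation (my own illustration, not from the course material) shows how badly a convenience sample can mislead. Suppose a fifth of the community are retirees, and the "researcher" samples only the people who are easiest to reach during the morning, who happen to be mostly retirees:

```python
# Simulation: convenience sampling versus random sampling.
# The population and group shares are invented for illustration.
import random

random.seed(3)
population = (["retiree"] * 200) + (["worker"] * 800)  # true share: 20% retirees

# A proper random sample of 100 persons:
random_sample = random.sample(population, 100)

# A convenience sample: mostly the retirees who are easy to reach at 11 am.
retirees = [p for p in population if p == "retiree"]
workers = [p for p in population if p == "worker"]
convenience_sample = retirees[:80] + workers[:20]

true_share = population.count("retiree") / len(population)
print(true_share)                                   # 0.2
print(random_sample.count("retiree") / 100)         # close to 0.2
print(convenience_sample.count("retiree") / 100)    # 0.8 - badly biased
```

The random sample lands near the true share; the convenience sample is off by a factor of four, and no amount of extra convenience data would fix that.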
Define your target group
There are many different methods that may be used to gather new data about users in single libraries.
The first thing you have to decide, however, is which group of persons you want to study. The target group may for instance be:
Control your sample
To avoid bias, your sampling procedure should be based on statistical guidelines.
In academic libraries the relevant populations tend to be much smaller than in community libraries.
In small institutions, it may be possible to send questionnaires to all students, or at least to all staff members. In larger universities, sampling is still a very useful tool.
Librarians have, in general, very little systematic information about activities inside their libraries. CounT The Traffic (TTT) is a cheap and simple method to gather such data. It gives a good numerical picture of how library users actually use the various parts of the library.
TTT reveals both the quality - or the kinds of activity - and the quantity of use. Combined with data on the number of visitors it will also indicate the average length of stay.
Libraries are organizations that produce and deliver media-based services to their local communities.
From a production point of view our statistical data can be roughly divided into four main categories:
Everybody is interested in the budget. Budgets are treated as zero-sum games.
The Global Libraries program of the Gates Foundation has pointed to six areas in which libraries can make a difference.
In a global perspective there are many positive signs of growth.
A fair number of countries now make detailed library statistics available on the web. I am aware of the following – but there may be others:
Indicators and assessment
For more information about individual countries see Plinius Data.
Some countries, like Denmark and New Zealand, publish their primary data as spreadsheets. They offer access - which is good - but the files are large and difficult to work with. Other countries, like Finland, Norway and the Netherlands, provide structured databases. These are easier to manage, but actual usage still seems to be low. Librarians are more comfortable with words than with numbers.
The road from potential access to actual usage requires statistical literacy.
In Germany, the library index BIX is well established both in the public and in the academic library sector. Libraries participate on a voluntary basis. But steady work with real data over many years has made an impact. In the United States, the new LJ Index for public libraries is professionally designed and very well presented.
In academic libraries, the LibQual survey of user satisfaction is well established – and has contributed to a culture of assessment. Like BIX, LibQual is also moving into other countries. The web facilitates horizontal interaction and can support the social (as well as the centralized) approach to indicator development.
In its Global Libraries Programme, the Bill and Melinda Gates Foundation insists on systematic data collection and evaluation. A dozen countries have benefited from this hard-nosed approach - and could be used as models by other library communities. See Sawaya (2009) for an overview.
In the United States, a coalition of library and local government organizations, including the Gates Foundation, launched the Library Edge initiative in 2011. Edge is developing a rating system comprised of benchmarks and indicators designed to work as an assessment tool. The instrument looks very well designed: balanced, clear and comprehensive.
Tools for observing user behavior inside libraries have started to appear – in Canada (seating sweeps), in the US, in Sweden and in Norway (CounT The Traffic). See TTT: Bibliography for details.
In public libraries, workable indicators of web traffic have started to appear, with Denmark as the front runner (Danmarks Biblioteksindex). In academic libraries, standardized measures of database use are being developed (COUNTER).
IFLA has taken a strong interest in statistics for advocacy and has adopted a statistical manifesto. The IFLA/FAIFE World Report series is a biennial publication that reports on the state of the world in terms of freedom of access to information, freedom of expression and related issues. The reports include statistics on the number of libraries and are available online.
Libraries represent a small sector within the big picture. In Norway, which has a well-developed library system, its share of employment is about 0.2 percent. If you select one thousand workers at random, you would find one librarian and one library assistant. We can not expect politicians and managers to invest heavily in library statistics unless they invest in statistics in general. If the context improves, libraries improve.
We should therefore support the development of educational and cultural statistics. This is already happening. Better educational statistics will help us document the work of academic and school libraries. Better cultural statistics will help us document the work of public libraries.
In the South we should also encourage better community statistics. The more we know about local communities, the better we can document the need and assess the impact of school and public libraries. Much is happening in the wider field of statistics - see Global stats for some examples.
Some of the problems we face are:
Libraries must compete for attention. Our access to verbal and visual resources is overwhelming. We live in an attention economy. There is no shortage of information. There is a shortage of eyes.
Within the library sector, we who work with statistics must also compete for attention. Academic libraries are beginning to see the need for systematic data. But public libraries are less committed.
We want to change habits. That requires a combination of training, marketing and politics. We have no direct power. The customer is always right.
Creative use of social media like blogs, Twitter and even Flickr is one approach.
At the moment, the quantity and quality of statistical information on libraries varies enormously between countries. A handful of countries run advanced statistical systems with comprehensive coverage of their public, academic and special libraries. School libraries, which tend to be small and poorly staffed, are still a statistical problem, however.
Efforts have been made to establish global library statistics. I do not believe, however, that a centralized approach, with one data base, standardized reports, a fixed set of indicators, an elected governing board, and so on is sustainable under today’s conditions. A distributed network of volunteers, with a bit of coordination, is something else.
For empirical evidence I refer to the section on OCLC. Let me also add that failure is useful. Unless we accept the risk of failure, and are willing to learn from failure, we are stuck.
The most advanced countries (based on web information) seem to be Finland, Norway, Denmark, the Netherlands and New Zealand. Sweden, Canada (some provinces) and Australia (some states) could be added. Let me call them group 4.
The next group (group 3) consists of countries with well-developed library systems and some good statistics at the national level, like Great Britain, Germany, Italy, much of Eastern Europe, the United States, Chile, Singapore and a few others. The main problem, seen from abroad, is the lack of extensive, user-oriented web publishing of the data.
This group consists of countries with more uneven and fragmented library systems. The public library sector tends to have greater difficulties than academic and special libraries. But even the latter library types may suffer from a lack of national-level coordination. Most countries in the world belong to group 2.
The least developed countries, library-wise (group 1), have no national statistics whatsoever. In a few cases they publish some scattered library data in statistical yearbooks or reports on cultural statistics. But it is hard to find and hard to use such information. People do not have time to visit the few big libraries that receive such publications in order to dig out a few numbers.
The quality and quantity of library statistics depend heavily on the size and strength of library organizations at the national level. Data must be collected, processed, published and applied by someone. That someone may for instance be
In this situation, information at levels 1-3 must be gathered locally, by people who live in the country, who are familiar with its statistical system and who can follow the constant changes that are likely to occur. All these countries are trying to improve their systems, from wherever they happen to be.
Level 4 is different. Information from level 4 is available on the web to all interested parties. There is just so much data available. What we find in these countries is not a lack of data per se, but a shortage of statistical analysis, debate and practical use. There is too little processing going on.
This morning I said:
That has both a negative and a positive side:
Support from the top is welcome, of course. But the idea of uniformity is wrong.
Libraries cannot move forward at the same speed, one step at a time, like soldiers on parade. Start with networks and alliances among those who are capable and willing to invest in statistics.
As the field develops, weaker libraries can follow in their path and learn from their practices.
Library statistics are rudimentary in most countries.
Improving this situation, through workshops, lectures, publications, web debates, and so on, must also be done through local initiatives. Each country has its own library community. It is possible and useful to visit the discussions that go on in neighbouring countries.
But in Europe, at least, it is almost impossible to be an active participant in several discussion communities at the same time.
The strategy, or set of actions, that I believe in is:
A population census counts the people that live in a particular country and registers some of their characteristics. A library census does the same for libraries.
Censuses describe the whole universe and provide a general basis for economic and social planning. They are large projects and are usually conducted every ten years.
Library censuses can be very useful in low- and middle-income countries ("the South"). Let us take a look at a well-designed census. In the Dominican Republic the National Library carried out a major study of the country’s public libraries.
The first national census of Dominican libraries was carried out in 1999. The investigation was carried out by the Oficina Nacional de Estadística with the assistance of the national library. It was an important step ahead, but the study suffered from many methodological weaknesses. The second census, about ten years later, was much improved.
The long-term, or strategic, goal of the project was to establish a sound empirical basis for library planning, so that libraries could become tools for social development in a wide sense. The goal of the first phase of the study ("the pre-census") was to identify and briefly describe all existing information provision units in the country. This investigation, which covered school, public, special and academic libraries, was completed in
The goal of the second phase was to analyze the conditions and the situation of the public libraries, including their relationship with the local community.
Data collection combined qualitative and quantitative methods
Some technical details
National Information System
The data were entered into a National Register and Information System for Libraries (Sistema Nacional de Información y Registro Bibliotecario – SINIREB), which described the information collected in each library.
Note: Most of the information above is based on correspondence between the Statistics and Evaluation Section (IFLA) and the organizers of the Dominican census. The organizers presented their work at the 2012 IFLA conference in Helsinki.

People who work with library statistics can support the field by sharing their data.
The organization of library statistics differs from country to country. Mapping, using and comparing such statistics is often time-consuming.
Since I have an interest in comparative statistics, I have tried to document my own work in this field. The basic idea is simple. When I study Norwegian, or German, or South African library statistics, I must spend some time exploring the statistical system and additional time processing some of the data. If I publish my map and my processed data on the web, others can build on that.
The map may be sketchy and the data may be limited, but a small slice of the pizza is better than nothing. Some relevant statistics from neighboring fields – like media, culture and education – may also be included.
Library statistics cannot stand alone.
When we work with statistics about libraries, we will often need statistics from other fields. Population data are needed to calculate some of the most basic library indicators - for instance loans and visits per capita. Statistics about literacy and reading, health and education, infrastructure (water, roads, electricity, ...) and culture can often be used to put library performance in context.
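The arithmetic behind per-capita indicators is simple: divide an annual library count by the population served. A minimal sketch, using made-up figures for a hypothetical library system:

```python
# Hypothetical figures for one library system (all numbers assumed,
# for illustration only - not from any real census).
population = 120_000   # residents in the service area
loans = 540_000        # physical loans in one year
visits = 300_000       # library visits in one year

# Per-capita indicators: library counts divided by population served.
loans_per_capita = loans / population
visits_per_capita = visits / population

print(f"Loans per capita:  {loans_per_capita:.1f}")   # 4.5
print(f"Visits per capita: {visits_per_capita:.1f}")  # 2.5
```

The same division works for any indicator paired with the right denominator - which is why library statistics depend on good population data from outside the library sector.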
Since Global statistics for advocacy is an international rather than a national project, we have to do the same at the global level. Since the Second World War, lots of good work has been done, by the UN and others, to collect, systematize and present comparative national statistics.
During the last fifteen years, the internet has facilitated the development of sophisticated, user-friendly and interactive statistical delivery systems. The most impressive one to date is Gapminder - with the slogan Unveiling the beauty of statistics for a fact-based world view. Google has used Gapminder as a model for its own platform for Global Public Data.
The United Nations Population Information Network has created an excellent guide to population information on UN system web sites. The World Factbook - a well-known source among reference librarians - is still going strong. The data are good though the publisher (CIA) may be spooky. GeoHive - with detailed population data - is maintained by a single person (Johan van der Heyden).
In 2009 Elisha Chiware described the main problems in South Africa as follows:
He noted, however, that many academic libraries have undertaken LibQUAL surveys to solicit, track, understand and act upon users’ opinions of service quality.
A global trend
Networks are part of global development
Three examples of statistical networks are
Other potential networks of strong libraries are
as well as
The Edge initiative was launched in 2011 by a coalition of library and local government organizations, including The Bill and Melinda Gates Foundation. The goal is to develop a suite of tools that support continuous improvement and reinvestment in public technology. This approach was probably inspired by the Gates Foundation, which has always emphasized evidence-based planning and assessment.
Edge is developing a rating system composed of benchmarks and indicators. Designed to work as an assessment tool, it will help library staff understand best practices in public access technology services for their communities and determine what steps they need to take to improve those services.
The whole system looks very well designed: balanced, clear and comprehensive. The instrument has been developed for a US context, but the basic approach could well be a model for similar initiatives in other countries.
Since we are looking at statistical indicators rather than evaluation and assessment in general, I have extracted the more statistical components. Version 1.0 of the benchmarks asks for three types of statistics: survey data, book-keeping data and web traffic data.
The library surveys patrons annually about public technology use and outcomes in the following purpose areas:
The following metrics are tracked on an ongoing basis:
Today I have looked at statistics, as applied to libraries, from three points of view:
In these three sessions, and the papers that underpin them, I have tried to stay close to the evidence.
Statistics is usually taught as a set of technical instructions, which people are just asked to follow. I am more interested in the actual statistical practices. To what extent do people follow the statistical advice?
If we look at what libraries actually do, we find a surprising discrepancy between statistical norms and statistical behavior. I have tried to document and to explain this gap in a couple of conference papers (Høivik, 2003; 2012).
I am not pessimistic about statistics. We are moving forward on many fronts. But we need to understand that statistics is a hard subject. Descriptive statistics is not technically difficult. Statistics is hard because it is counterintuitive. The results often contradict cherished beliefs.
To understand the resistance to statistics I'd like to end with a reference to Kahneman's model of human reasoning. We have two thinking systems. System 1 is fast, intuitive and effortless. It is usually right, but occasionally dead wrong.
System 2 is slow, intentional and hard-working. If it involves a community of practice rather than a single practitioner, it tends to be right. Statistics, as well as scientific thinking, belongs to system 2.
System 1 (fast)
System 2 (slow)
Statistics is very practical. But it is a practical discipline that demands intellectual discipline.