Here are our three current research themes in the sampling area:
1. How many times should we try to call someone?
It has become harder to reach people who have been selected to participate in Statistics Sweden's surveys; this is a known fact. Repeated contact attempts are also expensive, because interviewers spend their time trying to reach people or find their phone numbers instead of conducting interviews.
Anton Johansson, an externally employed doctoral student at the Department of Statistics, is currently working on this problem at the interview unit at Statistics Sweden.
The aim of the research is to find a model that stays within budget while still giving good estimates. One of the issues he is studying is how to estimate the probability of getting an interview with a person who has been randomly selected.
If the probability of reaching a person turns out to be very low after, for example, eight contact attempts, the resources can instead be directed to some other type of activity.
Another research project investigates the so-called dentist strategy: sending a text message in advance to say that Statistics Sweden will make contact the next day, with the goal of getting more people to respond.
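The reasoning above can be illustrated with a minimal sketch. It is not the model used in the research. It only estimates, from call records, the share of people reached at each attempt among those still unreached, and finds the first attempt where that share drops below a chosen threshold. The function names, the threshold, and the simulated data (a mix of easy-to-reach and hard-to-reach people) are all assumptions made for illustration.

```python
import random


def contact_rate_by_attempt(first_contact, max_attempts):
    """Estimate, for each attempt k, the share of people reached at attempt k
    among those still unreached after k - 1 attempts.

    first_contact holds, per person, the attempt number at which contact was
    made, or None if the person was never reached within max_attempts.
    """
    rates = []
    still_unreached = len(first_contact)
    for k in range(1, max_attempts + 1):
        reached_now = sum(1 for a in first_contact if a == k)
        rates.append(reached_now / still_unreached if still_unreached else 0.0)
        still_unreached -= reached_now
    return rates


def first_attempt_below(rates, threshold):
    """Return the first attempt number whose estimated rate is below
    threshold, or None if no attempt falls below it."""
    for k, rate in enumerate(rates, start=1):
        if rate < threshold:
            return k
    return None


def simulate_first_contact(n_people, max_attempts, seed=1):
    """Made-up data (an assumption, not survey data): half the people are
    easy to reach (p = 0.5 per attempt), half are hard (p = 0.05)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_people):
        p = 0.5 if i % 2 == 0 else 0.05
        reached_at = None
        for k in range(1, max_attempts + 1):
            if rng.random() < p:
                reached_at = k
                break
        records.append(reached_at)
    return records


if __name__ == "__main__":
    records = simulate_first_contact(n_people=10_000, max_attempts=12)
    rates = contact_rate_by_attempt(records, max_attempts=12)
    for k, rate in enumerate(rates, start=1):
        print(f"attempt {k:2d}: estimated contact rate {rate:.3f}")
    print("first attempt below 0.06:", first_attempt_below(rates, 0.06))
```

In the simulated data the estimated rate falls with each attempt, because the people who remain unreached are increasingly the hard-to-reach ones. A real analysis would also have to account for things this sketch ignores, such as time of day, refusals, and missing phone numbers.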
2. Can balanced sampling replace random sampling?
Dan Hedlin, professor of statistics, is doing research on balanced sampling together with Beat Hulliger, professor at the University of Neuchâtel.
The idea is to draw a balanced sample rather than a random sample. The goal is to get responses from a group of people who are as similar as possible to the population about which conclusions are to be drawn.
The research is closely related to that of Peter Lundquist and Carl-Erik Särndal, both affiliated with Stockholm University, but with the important difference that theirs is based on a random sample.
3. How safe is it to extrapolate?
Dan Hedlin also does research on so-called cut-off sampling. One application is business statistics, where you might, for example, sample only companies with more than five employees.
Companies with up to five employees are not examined, because that would place too great a response burden on small businesses.
A regression model (i.e. a model of the relationship between, for example, the number of employees in a company and its orders, or whatever variables the study involves) is then fitted to the data from companies with more than five employees.
Since no sample is taken from the companies with up to five employees, the relationship estimated for the larger companies is applied directly to the smaller ones.
This means that you simply assume that the relationship looks the same for smaller companies as for larger ones; this is called extrapolating, or estimating extrapolated lines. There is very little literature on this, even though it is widely used in practice and some problems remain unresolved.
Dan Hedlin’s idea is to test how reliable this model is by using so-called bootstrapping. It involves drawing a thousand different samples of the larger companies and fitting a regression line to each of these samples. Each fitted line is then extrapolated, which gives a thousand extrapolated lines; this makes it possible to investigate the uncertainty of the model.
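The bootstrap idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the method used in the research: it uses a simple straight-line regression, resamples companies with replacement, and uses made-up data in which orders grow linearly with the number of employees. The function names, the simulated figures, and the choice of extrapolating to a company with three employees are all hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_extrapolation(xs, ys, x_new, n_boot=1000, seed=0):
    """Resample the observed (larger) companies with replacement, refit the
    line each time, and return the extrapolated value at x_new per resample."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a line cannot be fitted to a single x value
        intercept, slope = fit_line(bx, by)
        predictions.append(intercept + slope * x_new)
    return predictions


def percentile_interval(values, lower=0.025, upper=0.975):
    """Simple percentile interval from the bootstrap distribution."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * n)], ordered[min(int(upper * n), n - 1)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Made-up companies above the cut-off (more than five employees).
    employees = [rng.randint(6, 200) for _ in range(300)]
    orders = [10 + 2 * e + rng.gauss(0, 15) for e in employees]

    preds = bootstrap_extrapolation(employees, orders, x_new=3)
    low, high = percentile_interval(preds)
    print(f"extrapolated orders at 3 employees: {low:.1f} to {high:.1f}")
```

Note that the spread of the extrapolated values only reflects how much the fitted line varies between resamples of the larger companies. It says nothing about whether the relationship really is the same for companies below the cut-off, which is the assumption that extrapolation rests on.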