STRUCTURES for Building, Learning, Applying and Computing Statistical Models

Grant reference: RES-576-25-0003

Impact Report details

Impact report for LEMMA II (aka STRUCTURES for Building, Learning, Applying and Computing Statistical Models)
Impact report for project evaluation

Primary contributor

Author Fiona Steele

Additional contributors

Contributor William Browne
Contributor Harvey Goldstein
Contributor Jo-Anne Baird
Contributor George Leckie
Contributor Sally Barnes


The project has made significant contributions to the development and application of statistical methods for the analysis of complex social processes, the development of user-friendly software to make advanced statistical methods accessible to social researchers, and the provision of face-to-face and online training.

METHODS AND SOFTWARE
- Further enhancements to the MLwiN software (e.g. to improve data entry and efficiency of model estimation). 2630 users have taken advantage of the free UK academic download since the start of LEMMA 2.
- Development of multiple-imputation methods for multilevel data structures which can handle missing data on mixtures of continuous and discrete variables defined at different levels, and the free REALCOM-Impute software.
- Development of the runmlwin Stata program.
- Development of the beta version of the Stat-JR software system which allows model fitting in multiple software packages through a common interface, allows greater flexibility for estimation of new models, and improves computational efficiency of existing methods.
- A multilevel longitudinal structural equation model for estimation of reciprocal effects among individuals in a social group.

APPLIED RESEARCH
- Significant contribution to UK and international debates on the use of school league tables for evaluating school performance and informing parental choice, and on the relative effects of school, area and family on pupil progress.
- Further development of a new approach for modelling trends in social segregation.

TRAINING
- Delivery of 25 days of training, including two 3-day intensive research workshops which supported researchers in analysing their own data. We also contributed to workshops in Belfast, Edinburgh and London.
- Further development of the LEMMA Virtual Learning Environment which now has 8519 registered users.
- Provision of support and materials to enable other academics to run their own courses.

Outputs include 24 journal articles, 5 articles under review, 3 book chapters, 2 reports, the beta-release of the Stat-JR software system, the runmlwin Stata command, further development of MLwiN and REALCOM-impute (with documentation and web forums), and 8 new online training modules (4 awaiting upload onto the VLE).
- Goldstein’s work on modelling multivariate data led to development of a highly flexible framework for multiple-imputation of missing data. The methods have been made accessible through the REALCOM-Impute software (Carpenter et al. 2011, J. Stat. Software), which is integrated with MLwiN and Stata, and has been partially implemented in Stat-JR (Charlton et al. 2012) to improve efficiency.
- Leckie and Goldstein’s research on school league tables (JRSSA and Fiscal Studies) demonstrates their serious limitations for evaluating school effectiveness, e.g. because tables relate to performance of a cohort who started secondary school 7 years earlier. Other research extended traditional models of school effectiveness and made innovative use of the National Pupil Database to allow simultaneously for school, area, family and pupil characteristics on pupil progress (Rasbash et al., JRSSA 2010).
- runmlwin (Leckie & Charlton 2012), released October 2011, has already been downloaded over 2300 times and cited in 14 journal articles. It combines the best features of Stata (extensive commands for data manipulation, simulation and post-estimation predictions) and MLwiN (computational efficiency and wide range of multilevel models).
- Training workshops and online materials that cater for all levels have been provided, allowing progression from simple regression to complex multilevel models. Research workshops bridge the gap between learning concepts using example datasets and applying multilevel models to one’s own data. These provide individualised support to researchers who have mastered the basics but do not yet feel able to work independently.

Impact was achieved through presentations of research and software demonstrations, journal articles, software documentation, regular updates on the CMM website and via email to users of our software and VLE and past workshop attendees. Project research has been disseminated in over 60 presentations, including: various Royal Statistical Society meetings, the International Amsterdam Multilevel Conference, a symposium on panel data methods in Lisbon, a meeting on advances in family research in Groningen, the Social Research Association, conferences of the American Sociological Association and Modern Modeling Methods, Stata Users’ Group meeting, British Educational Research Association, Association for Educational Assessment, and international seminars (e.g. Cornell, Berkeley, Toronto, Queensland). Leckie and Goldstein organised and participated in a 1.5-day research workshop (with ADMIN and cemmap) on Measuring School Effectiveness at the 2010 Methods Festival. Presentations include awareness-raising overviews of multilevel modelling and other advanced methods, e.g. for the Advanced Quantitative Methods Network (AQMeN), the Scottish Social Survey Network and ‘What Is?’ talks at Methods Festivals. MLwiN is used in our training workshops and online modules. It is free to UK academics; others can download a training version that works with sample datasets. Technical support is provided to paying users, and others can send queries to the MLwiN user forum. There are also user forums for REALCOM and runmlwin. All software has extensive free documentation. The LEMMA VLE was promoted among MLwiN users, Stata and R user groups, and our networks of UK and international quantitative researchers. This has led to individuals and organisations adding links to the VLE from their websites (e.g. Sage Methodspace, Andrew Gelman’s “Statistical Modeling, Causal Inference and Social Science” blog at Columbia, Judith Singer’s website at Harvard, Stata, AQMeN).

Methodological research, software and training have benefited academics from across the social sciences and beyond in the UK and overseas. 2630* UK academics have obtained MLwiN free of charge. Examples of its use include school effectiveness research (e.g. P. Sammons and S. Strand, Oxford; S. Thomas, Bristol), survey research (see 2A for details), criminology (I. Brunton-Smith, Surrey; P. Marchant, Leeds Met), psychology (E. Flouri, IoE), epidemiology (K. Tilling, Bristol; A. Leyland, MRC Glasgow). 3940 MLwiN licenses were sold to international researchers. MLwiN and its documentation have been cited in over 2500 journal articles (over half published since 2008). Runmlwin has been downloaded over 2300 times and cited in 14 journal articles. Examples of its use are in epidemiology (J. Nazroo, Manchester; S. Subramanian, Harvard; G. Davey-Smith, Bristol) and psychology (N. Beretvas, Texas). There have been over 600 downloads of REALCOM-Impute. It has been used especially with large-scale longitudinal datasets where there are extensive and complex patterns of missing data; for example, to adjust for differential dropout and item nonresponse in a multilevel longitudinal survey of the effectiveness of special centres for helping overweight children (C. Law, UCL). LEMMA workshops and online training have benefited researchers from across the social sciences, especially students (55% of the 8519 VLE users and 59% of the 296 workshop participants). The VLE is also widely used outside the UK (70% of all users). 29 researchers received individual support at our research workshops (14 PhD students) on topics such as hospital effects on length of stay, evaluation of a teaching training intervention, and effects of judge characteristics on court judgements. Our training has also had an indirect impact on attendees at courses run by ‘training the trainer’ collaborators at UCL and AQMeN. *All figures refer to the period from 1 October 2008.

Methods, software and training resources developed under LEMMA have achieved economic and societal impacts through use by academics engaged in policy research and in research commissioned by government and third sector organisations. Work on league tables has raised awareness of their limitations for measuring institutional performance. Specifically:
- Research on league tables encouraged debate on their appropriateness for assessing performances of schools and other institutions, and usefulness for informing parental choice. Goldstein is co-author of a major British Academy policy report on league tables in the public sector (Foley and Goldstein 2012). The research has also contributed to debates in Australia and New Zealand (and discussions with the main teaching union in NZ).
- MLwiN has been used in survey methods research, including ESRC Survey Design and Measurement Initiative projects on nonresponse (G. Durrant with Steele, J. Bynner with Goldstein, and P. Lynn) and interviewer effects (P. Sturgis). This work has involved Natcen and ONS and informed survey practice on strategies for reducing nonresponse and interviewer bias. MLwiN is also widely used in research on educational assessment conducted for assessment agencies and examination boards (e.g. AQA and QCA). MLwiN and REALCOM-impute were used in analysis of a longitudinal survey of prisoners for the Ministry of Justice (I. Brunton-Smith, Surrey).
- Since the project start, MLwiN has been purchased by 572 non-academics. New users include the Scottish Government, Surrey County Council, Matrix Knowledge Group, Statistics Norway, UNESCO Chile and Philips Electronics. See the End of Award Report (3b) for other examples of non-academic use of MLwiN. REALCOM-Impute and runmlwin are free to all.
- The LEMMA VLE provides free training to all, reducing R&D and training costs for academics and non-academics alike. Uptake among non-academics has been high (1258 users, 15% of total).

The findings of the league tables research are summarised in 1B. The British Academy report is part of the BA’s publication series on policy research. This report, on the uses of league tables in education and policing, had a major press conference in March 2012. Its recommendations draw heavily on the LEMMA work by Leckie and Goldstein (see above). The LEMMA VLE currently contains 9 modules (with 4 further modules completed and awaiting upload), starting from ‘An Introduction to Quantitative Research’ and progressing to ‘Single-level and Multilevel Models for Ordinal Data’. Most modules include practicals in MLwiN, Stata and R which guide users through the application and interpretation of quantitative methods, and quizzes for self-assessment.

The school league tables research has been widely disseminated to non-academic audiences, including articles for the Significance, Britain in 2010 and Society Now magazines, articles in national newspapers, and at the University of Bristol’s Festival of Education (an annual series of public engagement events, attended by headteachers and parents). A large press conference was held to launch the British Academy report in March 2012, which led to Goldstein being interviewed on the Today programme and an invitation to meet with the head of Ofsted to discuss this and related research. The high non-academic use of the LEMMA VLE is likely to have led to higher non-academic use of MLwiN and runmlwin. Runmlwin was also used in 2-day courses for the Royal College of Surgeons in Ireland in Dublin (May 2012) and OECD in Paris (June 2012). The LEMMA VLE and other free training resources available from our website are always promoted in our training workshops and research presentations, especially ‘awareness-raising’ events. We regularly give research presentations at conferences attended by government researchers (e.g. meetings of the British Educational Research Association, Association for Educational Assessment, and the Royal Statistical Society). We have also given seminars for various non-academic audiences, including the Northern Ireland Statistics and Research Agency, the DfE-funded Pupil Level Annual School Census (PLASC) User Group, and the Social Research Association.

Software development, methodological research and training carried out under LEMMA directly benefited researchers and practitioners from a range of sectors, including government, international organisations (e.g. OECD, and international statistical agencies), and the private and voluntary sectors. Research on league tables had an impact on two major groups: (i) national and local government and schools with interests in measuring the performance of schools (and other institutions), and (ii) parents who use league tables to choose schools. Following the British Academy report it has also attracted interest from policing organisations and the Home Office.

Following LEMMA work on league tables, Goldstein will meet with the head of Ofsted (along with others) to discuss this and related research. Goldstein is also in discussions with the Bristol local authority about future collaborations. This will involve working with local schools to help understanding of their data and raise expertise levels in handling data such as performance measures. There are plans for a workshop in late 2012. We expect that use of the LEMMA VLE will continue to grow. Four new modules are awaiting upload, another on missing data has been written by colleagues at LSHTM, and new modules on longitudinal data analysis will be written under LEMMA 3. Impact of other methodological research will take longer to achieve. Our usual approach is to disseminate research widely at conferences and seminars, but to wait until the research has been through peer review before incorporating it into training workshops and online materials. To maximise impact, we always implement methods in existing software (including MLwiN) and often provide online appendices showing how to carry out the analysis. The Stat-JR software is our solution to difficulties in extending MLwiN. To date, as we have been starting from scratch, our main focus has been on creating a product that can estimate models available in other packages (including MLwiN). We are now working on extending its scope, and can already fit models that are not available elsewhere. For example, we have converted and extended the pioneering missing data procedures in the REALCOM-impute program to Stat-JR templates that run over 100 times faster than REALCOM-impute. The software also has unparalleled interoperability with other software packages. We began with a low-key beta release in May 2012 and will market Stat-JR along with our work on the electronic book interface (prototyped in our e-STAT DSR node).

Use of the LEMMA VLE has been much higher than expected. This has been achieved through a combination of promotion via our own networks, word of mouth (including links from websites of respected methodologists), and efforts to make the materials as accessible as possible. High uptake of introductory modules shows that there was a need to provide an entry point for beginners in statistics, rather than focusing only on advanced users. While we expected high uptake of the VLE among students and international users, non-academic uptake has been much higher than anticipated. Cost is likely to be a factor as course fees, travel and accommodation can make face-to-face training expensive. It is also easier to fit online learning around work, and resources can be accessed when there is a need rather than waiting for the appropriate course. The runmlwin Stata command was not in the original proposal, but was developed in the last year of the project in response to: (a) increasing feedback from Stata users that Stata’s own multilevel modelling commands are limited in range of model types and are often computationally slow; and (b) feedback from advanced MLwiN users who find MLwiN’s point-and-click interface limiting when carrying out extended analyses. The rapid growth in downloads (2300+) and citations (14) of runmlwin likely reflects not only its utility, but also the provision of extensive high-quality documentation and a dedicated responsive user forum (93 topics, 423 posts). We have also extensively promoted runmlwin through conference and seminar presentations (8), multilevel and other statistical mailing lists and by emailing key individual researchers in the Stata community. Late in the project it was realised that the REALCOM approach to missing data could be extended to provide new methods for record linkage of large and complex datasets. This is being developed under LEMMA 3 with ONS and researchers on the new British Birth Cohort Study.

As noted in the End of Award report, the original PI, Jon Rasbash, died in March 2010. Although we have been able to achieve the original aims and objectives of the research, his death led to project delays and a reduction in dissemination activities which have had a knock-on effect on the impacts that could realistically be achieved within a year. As Professor Rasbash had responsibility for software development, this strand of research was most affected. This primarily resulted in delays in the release and marketing of the Stat-JR software, which has affected uptake. We are actively working on marketing the software as part of the e-STAT DSR node which is still active. Steele took over as PI of the project as well as the day-to-day running of the Centre for Multilevel Modelling, and these increased commitments in Bristol reduced the amount of time available for dissemination and networking, especially at international conferences.

Cite this outcome


Steele, Fiona et al. STRUCTURES for Building, Learning, Applying and Computing Statistical Models: ESRC Impact Report, RES-576-25-0003. Swindon: ESRC

