Living Labs for Information Retrieval Evaluation
Evaluation is a central aspect of information retrieval (IR) research. In the past few years, a new evaluation methodology known as living labs has been proposed to let researchers perform in-situ evaluation. The basic idea of living labs for IR is that, rather than individual research groups independently developing experimental search infrastructures and gathering their own groups of test searchers for IR evaluations, a central and shared experimental environment is developed to facilitate the sharing of resources.
Living labs would offer substantial benefits to the community, such as: access to potentially larger cohorts of real users and their behaviours (e.g. querying behaviours) for experimental purposes; cross-comparability across research centres; and greater knowledge transfer between industry and academia when industry partners are involved. The need for this methodology is further amplified by the increased reliance of IR approaches on proprietary data; living labs are a way to bridge the data divide between academia and industry. Progress towards realising actual living labs has nevertheless been limited, the most notable contribution being that of Azzopardi and Balog at CLEF 2011. There are many challenges to be overcome before the benefits associated with living labs for IR can be realised, including challenges associated with living labs architecture and design, hosting, maintenance, security, privacy, participant recruiting, and the development of scenarios and tasks for use. In this workshop we seek to bring together, for the first time, people interested in progressing the living labs for IR evaluation methodology.
Some of the presentations made at the workshop are now available, as indicated below.
- IR Evaluation: Perspectives From Within a Living Lab. Georg Buscher, Microsoft Bing [slides]
(see Schedule for full details)
- Using CrowdLogger for In Situ Information Retrieval System Evaluation. Henry A. Feild, James Allan [slides]
- FindiLike: A Preference Driven Entity Search Engine for Evaluating Entity Retrieval and Opinion Summarization. Kavita Ganesan, ChengXiang Zhai
- Lerot: an Online Learning to Rank Framework. Anne Schuth, Katja Hofmann, Shimon Whiteson, Maarten de Rijke [slides]
- Evaluation for Operational IR Applications - Generalizability and Automation. Melanie Imhof, Martin Braschler, Preben Hansen, Stefan Rietberger
- Factors Affecting Conditions of Trust in Participant Recruiting and Retention. Catherine L Smith [slides]
- A Private Living Lab for Requirements Based Evaluation. Christian Beutenmüller, Stefan Bordag, Ramin Assadollahi
- A Month in the Life of a Production News Recommender System. Alan Said, Jimmy Lin, Alejandro Bellogín, Arjen de Vries
Best Demo Award - 750 EUR
The best demo award went to Kavita Ganesan and ChengXiang Zhai for FindiLike.
The Best Demo Award winners received an award of 750 EUR, offered by the 'Evaluating Information Access Systems' (ELIAS) ESF Research Networking Programme. The award can be used to cover travel, accommodation or other expenses in relation to attending and/or demonstrating at LL'13.
The workshop is supported by the 'Evaluating Information Access Systems' (ELIAS) ESF Research Networking Programme.