AG DANK/BCS Meeting 2013 in London


Department of Statistical Science, University College London, 8/9 November 2013, Galton Lecture Theatre. The meeting will probably start around 1:30 pm on Friday and end at about 1:30 pm on Saturday.

Focus topic: variable selection and dimension reduction in clustering and classification.

Local organisation: Christian Hennig. (c.hennig (at)

Information on the societies:
British Classification Society,
AG DANK - working group on data analysis and numerical classification of the GfKL (German Classification Society).

The meeting is hosted and funded by the UCL Centre for Computational Statistics and Machine Learning and supported by Chapman and Hall/CRC.

Invited speakers:


Participation is still possible, as long as the capacity of the lecture theatre is not exhausted. If you want to participate, please write to the local organiser at c.hennig (at)
Further presentations can unfortunately no longer be accepted.
There will be no fee for participation.


The meeting will take place in the Galton Lecture Theatre, Room 115, 1-19 Torrington Place. The closest Underground station is Goodge Street on the Northern Line; Warren Street, Euston and Euston Square are 10-15 minutes away on foot, and King's Cross/St. Pancras (Eurostar terminal and connection to Luton Airport) is a bit more than 20 minutes away.

Datasets for analysis by participants

Some time will be reserved for participants to present analyses of the following datasets at the meeting. Please keep each presentation to 5 minutes at the very most.

Spike sorting

The dataset was made available by Kenneth Harris, UCL Neuroscience (presentation is here). It comprises 20000 observations on 96 variables and an unknown number of clusters. Only some features are expected to be informative for each cluster, and different feature combinations will be relevant for different clusters.
Dataset (ASCII text; it is recommended to save the linked file rather than open it in the browser)
Information about the dataset (ASCII text)
Illustration for the information about the dataset (pdf; see the information file for an explanation)


In the dataset there is a known (artificial) true cluster (along with potentially several other real clusters). You can take part in a competition by sending an email to c.hennig (at) by Tuesday 5 November, 18:00, with an ASCII text file containing 20000 cluster memberships. See the information file for details. For the book prizes you can win, see below.
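
As a rough illustration of preparing such a submission, the following Python sketch writes 20000 cluster memberships to an ASCII text file and reads them back as a check. The layout (one integer label per line), the file name and the function names are assumptions made for this example only; the information file defines the format that is actually required.

```python
from pathlib import Path

N_OBS = 20000  # number of observations stated in the dataset description


def write_memberships(labels, path):
    """Write one integer cluster label per line (assumed layout, see information file)."""
    labels = [int(x) for x in labels]
    if len(labels) != N_OBS:
        raise ValueError(f"expected {N_OBS} memberships, got {len(labels)}")
    text = "\n".join(str(x) for x in labels) + "\n"
    Path(path).write_text(text, encoding="ascii")


def read_memberships(path):
    """Read the labels back, e.g. to verify the file before sending it."""
    return [int(tok) for tok in Path(path).read_text(encoding="ascii").split()]


if __name__ == "__main__":
    # Placeholder labels only; replace with the output of your own clustering.
    dummy = [i % 3 for i in range(N_OBS)]
    write_memberships(dummy, "memberships.txt")  # hypothetical file name
    assert read_memberships("memberships.txt") == dummy
```

Checking the length before writing guards against the most likely mistake, a membership vector that does not cover all 20000 observations.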

Bat species

The dataset was made available by Veronica Zamora-Gutierrez, Cambridge University (presentation is here). It comprises 2678 observations on 73 variables. There are eight known classes (species of bats), so this can be interpreted as a supervised classification problem, with a focus on which variables best discriminate the species. However, it is also of interest to find a clustering of the eight species into fewer clusters, which could be used as a first step towards better classification.
Dataset (ASCII text; it is recommended to save the linked file rather than open it in the browser)
Information about the dataset (docx file)
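
One simple way to get a first impression of which variables separate the known classes is to rank each variable by its one-way ANOVA F statistic. The Python sketch below does this on a tiny made-up example. It is not a method prescribed for the meeting; the function names and the toy data are hypothetical, and a univariate screen like this ignores interactions between variables.

```python
from collections import defaultdict


def f_statistic(values, labels):
    """One-way ANOVA F statistic of a single variable across the classes."""
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(float(v))
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more observations than classes")
    grand_mean = sum(float(v) for v in values) / n
    means = {g: sum(xs) / len(xs) for g, xs in groups.items()}
    between = sum(len(xs) * (means[g] - grand_mean) ** 2 for g, xs in groups.items())
    within = sum((x - means[g]) ** 2 for g, xs in groups.items() for x in xs)
    if within == 0.0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def rank_variables(rows, labels):
    """Return (column indices from most to least discriminating, F scores)."""
    n_vars = len(rows[0])
    scores = [f_statistic([r[j] for r in rows], labels) for j in range(n_vars)]
    order = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data: column 0 separates the two classes, column 1 does not.
    rows = [[1.0, 5.0], [1.1, 1.0], [3.0, 5.1], [3.1, 0.9]]
    labels = ["a", "a", "b", "b"]
    order, scores = rank_variables(rows, labels)
    print(order, scores)
```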

Book prizes

You can win book prizes donated by Chapman and Hall/CRC. The two winners of the Spike Sorting competition can pick their books first; the other two prize winners are drawn at random from those who present an analysis of the Bat Species dataset at the meeting.


Friday 8 November

Saturday 9 November


There are many hotels and bed and breakfasts around UCL (as search terms you could use Bloomsbury, Russell Square or Euston).
One that offers rather good value is the Crescent Hotel.
Other options include some of the Grange Hotels, e.g., the Lancaster Hotel or Langham Court.

More information will be added later.