ICSI
The ICSI Meeting corpus is a collection of 75 meetings recorded at the
International Computer Science Institute (ICSI) in Berkeley during the years
2000-2002 and released under the CC-BY-4.0 license. The meetings included
are "natural" meetings in the sense that they would have occurred anyway:
they are generally regular weekly meetings of various ICSI working teams,
including the team working on the ICSI Meeting Project. The dataset includes
the English audio, as well as transcripts and summaries written by humans.
In the textual summarization task, the audio portion of the dataset is not
used.
This dataset is a split of 6 meetings extracted from the corpus by the
Meetween project partner Zoom.
A. Janin, D. Baron, J. Edwards, D. Ellis, D. Gelbart, N. Morgan, et al.,
2003. The ICSI Meeting Corpus. In Proceedings of the IEEE International
Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Hong
Kong, China.