We compare the use of edited text in the form of newswire and unedited text in the form of discussion forum posts as sources for training material in a self-training experiment involving the Brown reranking parser and a test set of sentences from an online sports discussion forum. We find that grammars induced from the two automatically parsed corpora achieve similar Parseval f-scores, with the grammars induced from the discussion forum material being slightly superior. An error analysis reveals that the two types of grammars do behave differently.
Ireland -> Dublin City University
Status = Published
Publication Type = Conference or Workshop Item
Subject = Computer Science: Computational linguistics
DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
DCU Faculties and Centres = Research Initiatives and Centres: Centre for Next Generation Localisation (CNGL)
DCU Faculties and Centres = Research Initiatives and Centres: National Centre for Language Technology (NCLT)
Josef VanGenabith,
Joachim Wagner,
Ozlem Cetinoglu,
Jennifer Foster