Natural Language Annotation for Machine Learning: A guide to corpus-building for applications
James Pustejovsky, Amber Stubbs
Create your own natural language training corpus for machine learning. This example-driven book walks you through the annotation cycle, from selecting an annotation task and creating the annotation specification to designing the guidelines, creating a "gold standard" corpus, and then beginning the actual data creation with the annotation process.
Systems exist for analyzing existing corpora, but making a new corpus can be extremely complex. To help you build a foundation for your own machine learning goals, this easy-to-use guide includes case studies that demonstrate four different annotation tasks in detail. You’ll also learn how to use a lightweight software package for annotating texts and adjudicating the annotations.
This book is a perfect companion to O'Reilly’s Natural Language Processing with Python, which describes how to use existing corpora with the Natural Language Toolkit.
Year:
2012
Edition:
Early Release
Publisher:
O'Reilly Media
Language:
English
Pages:
97
ISBN 10:
1449306667
ISBN 13:
9781449306663
File:
PDF, 2.13 MB