Data Collection and Analysis of Print and Fan Fiction Classification

Channing Donaldson, James Pope

Research output: Chapter in Book/Report/Conference proceeding › Conference Contribution (Conference Proceeding)

1 Citation (Scopus)

Abstract

Fan fiction has provided opportunities for genre enthusiasts to produce their own story lines from existing print fiction. It has also introduced concerns, including intellectual property issues for traditional print publishers. An interesting and difficult problem is determining whether a given segment of text is fan fiction or print fiction. Classifying unstructured text remains a critical step for many intelligent systems. In this paper, we detail how a significant volume of print and fan fiction was obtained. The data is processed using a proposed pipeline and then analysed using various supervised machine learning classifiers. Our results show that, given a segment of 5 to 10 sentences, an accuracy of 80-90% can be achieved using traditional approaches. To our knowledge, this is the first study that explores this type of fiction classification problem.
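
As a rough illustration of the task described in the abstract (labelling a short segment of a few sentences as print or fan fiction with a traditional supervised classifier), the sketch below uses a bag-of-words multinomial Naive Bayes model written with the Python standard library only. It is not the authors' pipeline: the abstract does not name the classifiers or features used, and the helper names (`split_segments`, `tokenize`, `NaiveBayes`), the naive sentence splitter, and the four training snippets are all invented here for illustration.

```python
import math
import re
from collections import Counter


def split_segments(text, sentences_per_segment=5):
    """Group text into segments of a fixed number of sentences.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_segment])
        for i in range(0, len(sentences), sentences_per_segment)
    ]


def tokenize(segment):
    """Lowercase the segment and keep alphabetic tokens (bag of words)."""
    return re.findall(r"[a-z']+", segment.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, segments, labels):
        self.doc_counts = Counter(labels)   # segments seen per class
        self.word_counts = {}               # class -> token counts
        self.vocab = set()
        for segment, label in zip(segments, labels):
            tokens = tokenize(segment)
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, segment):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokenize(segment):
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data, far too small to say anything about real accuracy.
train = [
    ("The rain fell on the old harbour. The captain watched the ships.", "print"),
    ("The detective closed the file. The city slept below.", "print"),
    ("omg harry and draco finally talked!! author note: thanks for the reviews", "fan"),
    ("chapter two, sorry for the late update!! harry smiled at hermione", "fan"),
]
model = NaiveBayes().fit([text for text, _ in train], [label for _, label in train])
```

With this toy model, `model.predict("harry waved, sorry for the short chapter!!")` returns `"fan"`. Any realistic evaluation would need a large labelled corpus and a held-out test set, as the abstract implies.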
Original language: English
Title of host publication: Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods
Editors: Maria De Marsico, Gabriella Sanniti di Baja, Ana Fred
Publisher: SciTePress
Pages: 511-517
Number of pages: 8
Volume: 1
ISBN (Electronic): 9789897585494
Publication status: Published - 5 Feb 2022
Event: 11th International Conference on Pattern Recognition Applications and Methods
Duration: 3 May 2022 to 5 May 2022
https://icpram.scitevents.org/?y=2022

Publication series

Name: ICPRAM
Publisher: SciTePress
ISSN (Electronic): 2184-4313

Conference

Conference: 11th International Conference on Pattern Recognition Applications and Methods
Abbreviated title: ICPRAM
Period: 3/05/22 to 5/05/22

Keywords

  • Natural Language Processing
  • Text Classification
