What can network science tell us about the novels of Charles Dickens? This website is part of an interdisciplinary research project, led by Markus Luczak-Roesch and Adam Grener at Victoria University of Wellington, that uses the principles of network theory and information science to analyze the structure of Dickens’s serial novels. Written and published serially over the course of up to eighteen months, Dickens’s novels are evolving and dynamic systems that offer a blend of order and randomness suitable for information-theoretic analysis. Our SATIS Tool Prototype offers a range of visual representations of the character systems of all fifteen of Dickens’s novels, allowing you to explore their structures and even jump to the full text of each novel in your browser. Visit the Background & Methodology section for information on Dickens’s career, the Victorian serial novel, the principles of network science underlying our approach, and explanations of the visualizations in the SATIS Tool Prototype. You can also learn more about our team of researchers and contact us with questions and feedback.

Charles Dickens and the Victorian Serial Novel

Charles Dickens (1812-70) was a pioneer of the serial novel form during the nineteenth century. The publication of novels in weekly or monthly instalments, either sold on their own or included in periodicals, made literature more affordable and accessible to a growing body of readers. This format also generated unique compositional challenges for writers like Dickens, who often began composing and publishing instalments of a novel without a defined sense of its overall structure. His first two novels, The Posthumous Papers of the Pickwick Club (1836-37) and Oliver Twist (1837-39), actually began as episodic sketches before they were reimagined as longer “novels” midway through their composition. As Dickens’s career progressed, he began to think more carefully about the larger coherence and structure of his novels as he wrote. In the Preface to Martin Chuzzlewit (1843-44), for example, Dickens claimed that he “endeavoured in the Progress of this Tale, to resist the temptation of the current Monthly Number, and to keep a steadier eye upon the general purpose and design.” Scholars generally consider his next novel, Dombey and Son (1846-48), to be the first that demonstrates the overall coherence and control that would increasingly define his more mature works. Dickens’s surviving “Working Notes” for his novels show how he planned the introduction and orchestration of his characters. At the same time, the sheer size of the cast of characters in his novels (numbering up to 100) means that even at their most planned, Dickens managed his characters in unpredictable ways.


Applying Network Theory to the Serial Novel

Digital humanities scholars have been increasingly interested in analyzing literary texts using computational methods. These methods not only facilitate the analysis of large bodies of texts that cannot be “read” in the traditional manner, but also make visible linguistic and structural features of individual texts. Our project approaches novels as evolving information systems, and it is interested in understanding the properties of those systems with particular regard to the management of characters. The temporal dimension of the text is central to our analytical method: we aim to isolate and treat quantitatively the development of character networks as novels unfold. Our work is grounded in the Transcendental Information Cascade (TIC) approach, which enables us to track how character networks are created and managed during the unfolding of the text. On the most basic level, our approach tracks where characters appear in the text and how that relates to the appearance of other characters. Our networks differ from traditional character networks, in which each node represents a character and the edges that connect nodes denote connections between characters (although our SATIS Tool Prototype also includes these traditional social networks). Instead, our dynamic networks map the unfolding of the text: each node represents a segment of the text (a 1000-word “slice”) and each edge signifies shared information, in this instance the next appearance of a character within the text. See below for a description of how the dynamic networks are created and for an explanation of the various visualisations and features that you can explore using the SATIS Tool Prototype.

Creating the Dynamic Networks

The Dynamic Flow Networks in the SATIS Tool are created using two files: a file of the complete text of a novel and a “dictionary” of character names. The text is divided into “slices” of a given length—in this case, the default is 1000 words (though the end of a chapter will terminate a slice). Each node in the dynamic network represents one slice of the text (e.g., Node 1 represents the first 1000 words of the text, Node 2 represents words 1001-2000, and so on). Our algorithm scans the text of a slice and identifies which characters are present within that slice of text. Two nodes are linked if they contain consecutive appearances of a character. In the sample figure to the right, Nodes 1 and 2 are linked through the shared appearance of Abel Magwitch. Nodes 1 and 3 are linked through the shared appearance of Joe, Mrs. Joe, and Pip (and Nodes 3 and 4 are linked through the shared appearance of the same characters). Note that even though Pip appears in both Node 1 and Node 4, they are not linked because they are not consecutive appearances (i.e., he appears in Node 3 as well).
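
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind the SATIS Tool: the function names (slice_text, characters_in, build_edges), the shape of the character dictionary (a canonical name mapped to a list of name variants), and the simple whole-word matching are all hypothetical choices made for the example.

```python
import re
from collections import defaultdict


def slice_text(chapters, slice_len=1000):
    """Split each chapter into slices of at most slice_len words.

    Assumption: the novel arrives as a list of chapter strings, so a
    chapter boundary always terminates the current slice.
    """
    slices = []
    for chapter in chapters:
        words = chapter.split()
        for start in range(0, len(words), slice_len):
            slices.append(" ".join(words[start:start + slice_len]))
    return slices


def characters_in(text, dictionary):
    """Return the characters whose name variants occur in the text.

    Naive whole-word matching: a short variant such as "Joe" will also
    match inside "Mrs. Joe", so a real dictionary needs more care.
    """
    found = set()
    for character, variants in dictionary.items():
        if any(re.search(r"\b" + re.escape(v) + r"\b", text) for v in variants):
            found.add(character)
    return found


def build_edges(presence):
    """Link nodes that contain consecutive appearances of a character.

    presence is a list of character sets, one per slice; node numbers
    start at 1. Returns {(from_node, to_node): {characters}}.
    """
    last_seen = {}              # character -> node of most recent appearance
    edges = defaultdict(set)
    for node, characters in enumerate(presence, start=1):
        for character in characters:
            if character in last_seen:
                edges[(last_seen[character], node)].add(character)
            last_seen[character] = node
    return dict(edges)


# The four-node example from the text above.
example = [
    {"Abel Magwitch", "Joe", "Mrs. Joe", "Pip"},  # Node 1
    {"Abel Magwitch"},                            # Node 2
    {"Joe", "Mrs. Joe", "Pip"},                   # Node 3
    {"Joe", "Mrs. Joe", "Pip"},                   # Node 4
]
for (source, target), shared in sorted(build_edges(example).items()):
    print(source, "->", target, sorted(shared))
```

Because only the most recent appearance of each character is remembered, Node 1 and Node 4 are never linked through Pip: his appearance in Node 3 intervenes.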


Using the SATIS Tool

The SATIS Tool Prototype offers a range of visualisations for you to explore the character networks of all fifteen of Dickens’s novels. The Dynamic Character Flow Network allows you to view how the text unfolds from beginning to end. You can pause the network at any point and explore nodes and edges. Each node reveals which characters appear in that slice, and each edge reveals the matched character(s) who link its two nodes. Clicking on the Go to content button will open a new window in your browser showing the text of the novel in that slice. Once there, you can “jump” to other slices of text that are connected to that segment. The Static Character Flow Network shows the entire network and has the same features as the Dynamic Network.

The Static Social Network offers a visualisation of the relationships between characters that are extrapolated from the dynamic network. In this network, each node represents a character, and two nodes (characters) are connected if those characters appear in the same slice at any point in the novel. This network is thus “social” in a loose sense: it defines a “connection” between characters if they appear in the same slice of text in the novel.
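
The co-occurrence rule behind this kind of network can be sketched as follows. This is an illustrative sketch only, assuming that each slice has already been reduced to the set of characters it contains; the function name social_network and the sample data are hypothetical.

```python
from itertools import combinations


def social_network(presence):
    """Connect every pair of characters that share at least one slice.

    presence is a list of character sets, one per slice. Each edge is
    returned as an alphabetically ordered pair so that (A, B) and
    (B, A) count as the same connection.
    """
    edges = set()
    for characters in presence:
        for a, b in combinations(sorted(characters), 2):
            edges.add((a, b))
    return edges


# Hypothetical slices: Joe and Estella never share a slice,
# so they are not connected.
slices = [{"Pip", "Joe"}, {"Pip", "Estella"}, {"Pip"}]
print(sorted(social_network(slices)))
```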

The Statistics provided by the SATIS Tool supplement the networks and provide different ways of looking at the information they present. The First and Last Appearance tab identifies the node where each character enters the text and the node of their last appearance. The All Character Appearance tab shows how frequently each character appears. Both tabs enable you to search within a range of nodes and/or a range of appearances.
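
Statistics of this kind can be derived in a single pass over the slices. The sketch below is an assumption about how such figures might be computed, not a description of the tool's implementation; the function name appearance_stats and the sample data are hypothetical, and frequency is counted here as the number of slices in which a character appears.

```python
from collections import Counter


def appearance_stats(presence):
    """Return (first, last, frequency) for every character.

    presence is a list of character sets, one per slice; node numbers
    start at 1. Frequency is the number of slices containing the
    character, not the number of times the name is mentioned.
    """
    first, last, frequency = {}, {}, Counter()
    for node, characters in enumerate(presence, start=1):
        for character in characters:
            first.setdefault(character, node)  # kept from the first sighting
            last[character] = node             # overwritten on each sighting
            frequency[character] += 1
    return first, last, frequency


# Hypothetical data for four slices.
first, last, frequency = appearance_stats(
    [{"Pip", "Abel Magwitch"}, {"Abel Magwitch"}, {"Pip", "Joe"}, {"Pip"}]
)
print(first["Pip"], last["Pip"], frequency["Pip"])
```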

Entropy is a broad concept that is used in a range of contexts to explore the structure of systems, including thermodynamic systems, social systems, and information systems. In this context, entropy measures the distribution of characters. In a general sense, the entropy at a given node reflects how many characters have been introduced to that point and how evenly they occur. Higher entropy means that the characters are more evenly distributed, and the point of maximum entropy is the “most unpredictable” state of the character system: the more characters there are and the more evenly they are distributed, the more difficult it is to predict which character will appear. Assessing the entropy and analyzing its evolution can be a way of exploring when new characters are introduced, when some characters start appearing more frequently than others, and when characters might begin to fade out.
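
One common way to make this idea concrete is Shannon entropy computed over the cumulative appearance counts of the characters. The sketch below assumes that formulation; the page does not specify the exact measure used by the SATIS Tool, so the formula, the function name entropy_by_node, and the sample data should all be read as illustrative assumptions.

```python
import math
from collections import Counter


def entropy_by_node(presence):
    """Shannon entropy (in bits) of character appearances up to each node.

    presence is a list of character sets, one per slice. At each node,
    every character's share p of all appearances so far is computed,
    and the entropy is the sum of p * log2(1 / p) over all characters.
    """
    counts = Counter()
    entropies = []
    for characters in presence:
        counts.update(characters)
        total = sum(counts.values())
        entropy = 0.0
        for n in counts.values():
            p = n / total
            entropy += p * math.log2(1 / p)
        entropies.append(entropy)
    return entropies


# Hypothetical data: entropy rises when a second character is
# introduced and falls again as Pip comes to dominate.
print(entropy_by_node([{"Pip"}, {"Pip", "Joe"}, {"Pip"}, {"Pip"}]))
```

Under this assumption, entropy is zero while only one character has appeared, and it reaches its maximum of log2(k) bits when k characters have appeared equally often.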

Our Team

Based at Victoria University of Wellington in New Zealand, this project is led by Dr Markus Luczak-Roesch and Dr Adam Grener. It has received support and funding from VUW’s “Spearheading Digital Futures” steering group in 2016 and 2017.

Markus Luczak-Roesch is a Senior Lecturer in Information Systems at the School for Information Management, Victoria Business School, Victoria University of Wellington. Before joining Victoria, Markus worked as a Senior Research Fellow on the prestigious EPSRC programme grant SOCIAM - The Theory and Practice of Social Machines at the University of Southampton, Electronics and Computer Science (UK, 2013-2016). A computer scientist by education, Markus investigates the formal properties of information in socio-technical systems and the human factors of information and computing systems.
More information: http://markus-luczak.de

Adam Grener is Lecturer in the English Programme at Victoria University of Wellington. His main area of research is the nineteenth-century British novel, though he also has interests in the history of the novel, narrative theory, and computational approaches to literature. His work has appeared in the journals Genre, Narrative, and Modern Philology, and he is the co-editor (with Jesse Rosenthal) of the April 2017 special issue of Genre, “Narrative Against Data in the Victorian Novel.” He is completing a book on realist aesthetics and the history of probabilistic thought.
More information: http://www.victoria.ac.nz/seftms/about/staff/adam-grener

Research Assistants
Emma Fenton
Tom Goldfinch
Isabel Parker

Source code of the SATIS tool on Github: https://github.com/vuw-sim-stia/lit-cascades