Understanding University Writing

An innovative corpus database of annotated student texts, syllabi, lesson plans, and activities

A project proposed by the Department of English at Purdue University

To support: The development of a corpus database, user interface, and analytical tools to facilitate interdisciplinary, collaborative research in the teaching of writing for domestic and international students.


This project is a collaboration between scholars in two humanistic fields, Corpus Linguistics (CL) and Rhetoric and Composition (R&C), to develop an innovative tool that will allow scholars to investigate the teaching of university writing from a variety of research perspectives while also supporting the internal sharing of resources. A corpus is a collection of texts representative of a particular domain. The envisioned corpus database will include student papers, lesson plans, activities, syllabi, and other materials from composition courses taught at Purdue University. Contributors will be encouraged to annotate the materials they submit, and the interface will make it easy to attach descriptive metadata. An online searchable interface will be developed to allow scholars to interact with the documents in different ways. Full texts will be available for qualitative study, such as rhetorical and genre analysis. Quantitative linguistic analysis will also be possible, at the level of both vocabulary and grammar, through the use of computational tools including a part-of-speech tagger. The tools developed in this project will follow best practices in the fields of CL and R&C. Our database will provide a model for other universities seeking to collaborate on interdisciplinary Digital Humanities projects and offer valuable outreach to those who look to Purdue for its pedagogical innovation.
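As a rough illustration of the kind of record and query the envisioned database might support, the following Python sketch stores documents with attached metadata and annotations, filters them by search term and metadata, and counts vocabulary. It is a minimal sketch under stated assumptions, not the project's design: the names CorpusDocument, tokenize, search, and word_frequencies, the metadata keys, and the sample texts are all hypothetical, and a part-of-speech tagger is deliberately left out.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CorpusDocument:
    """Hypothetical record for one item in the corpus database."""
    doc_id: str
    doc_type: str          # e.g. "student_paper", "syllabus", "lesson_plan"
    text: str
    metadata: dict = field(default_factory=dict)     # e.g. academic level, course
    annotations: list = field(default_factory=list)  # free-text contributor notes


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def search(corpus, term, **filters):
    """Return documents containing `term` whose metadata matches every filter."""
    term = term.lower()
    return [
        doc for doc in corpus
        if term in tokenize(doc.text)
        and all(doc.metadata.get(key) == value for key, value in filters.items())
    ]


def word_frequencies(docs):
    """Count word tokens across a set of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc.text))
    return counts


# Invented sample data, for illustration only.
corpus = [
    CorpusDocument("p1", "student_paper",
                   "The argument relies on evidence from the survey.",
                   {"academic_level": "first-year", "course": "composition"}),
    CorpusDocument("p2", "student_paper",
                   "Evidence supports the claim, and the claim is clear.",
                   {"academic_level": "graduate"}),
    CorpusDocument("s1", "syllabus",
                   "Students will evaluate evidence and draft an argument.",
                   {"course": "composition"}),
]

matches = search(corpus, "evidence", course="composition")
frequencies = word_frequencies(corpus)
```

Keeping metadata as an open dictionary is one possible choice here; it lets student papers and course materials carry different descriptive fields while sharing the same search function.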

Innovation Statement:

This project is innovative because it brings together written texts from a variety of sources for interdisciplinary research on composition and writing instruction. It will be the first database we know of that combines a corpus of student papers with other material artifacts from the teaching of composition. This rich data set will allow for a thicker description of the situational characteristics of writing in composition courses at the university. In addition, the student papers will carry metadata that will provide insight into the writing of students from different countries of origin, academic levels, and genders, among other characteristics. This level of detail is not included in existing corpora or databases of composition at the university level. The development of a user interface that includes computational tools will make these data accessible to humanities scholars and will extend R&C and CL research into new interdisciplinary avenues. At the same time, it will support instructor education and professional development, especially in graduate programs, by allowing innovative instructors to share their methods with others. We do not know of any available interface that allows for this type of research and collaboration.

Humanities Significance:

The proposed database will allow scholars and teachers to engage with the texts as close readers and will afford multiple perspectives on the context of composition instruction at the university level. The availability of both course materials and examples of student writing will enable the triangulation necessary for qualitative research in writing, as well as computational analysis of these texts. Such investigations can reveal both linguistic and functional aspects of writing, contributing to a greater understanding not only of rhetorical factors (the purposes, genres, and strategies students use) but also of writing processes and the role of subject matter in writing. Computational tools can also provide insight into the ways in which instructors rhetorically guide students in effectively conveying information and ideas and the ways in which writing is used for learning and understanding. Ultimately, such research can provide administrators and teachers with a greater understanding of composition processes and reveal ways in which writing instruction can be improved and re-envisioned for various student populations.

Statement of Need:

Multiple programs in the Department of English have expressed the need for the functionality the database will provide:

Archiving Writing @ WIDE-EMU 2015