The last one hundred years of scholarship in the history of the English book trade have been dominated by catalogues: of books, of watermarks, of printers' ornaments and title-page borders. At the same time considerable effort has gone into the transcription and publication of primary documents such as those found in company archives and government repositories. Little research has been carried out, however, using the tools of quantitative analysis developed by historians and social scientists. The paucity of quantitative research is due primarily to the lack of hard data upon which to work. As a result, while we know for the most part what books were published and something about the official lives of the men and women who worked in the printing houses and bookshops, we know very little about the measurable physical, economic and material circumstances of the trade itself.
The Early English Booktrade Database (EEBD) will be the first networked electronic resource devoted to the organization and dissemination of physical and descriptive bibliographical statistics. The EEBD's goal is to collect and describe material evidence related to English printing and publishing 1475-1640 (also known as the STC period, after the Pollard and Redgrave Short-Title Catalogue of Books Printed in England, Scotland & Ireland 1475-1640). The assembled data will for the first time enable large-scale quantitative analyses of historical, industrial, sociological and literary aspects of early modern print culture. At its heart is a set of digital files constructed in XML and accompanied by a suite of analytical and data-representation tools. It is also designed to be used in conjunction with the electronic English Short-Title Catalogue (ESTC) and British Book Trade Index (BBTI). Using the methods of quantitative history, often called cliometrics, scholars will be able to explore the nuances of the English book trade at a level never before possible. For example, a book historian will be able to chart in detail the disappearance of black letter printing during the reign of Elizabeth, while a divinity scholar might investigate the flourishing trade in printed sermons and its impact on popular religious beliefs.
The EEBD consists of two distinct classes of information: new material gathered from the close physical examination of every title printed during the STC period; and relevant data from existing resources such as the STC and ESTC that have been revised and recompiled to be used analytically in correlation with the freshly gathered evidence. New material includes:
Existing data to be revised and compiled includes:
These disparate classes of data will form the core records of the EEBD. However, this ground-breaking compilation of evidence surrounding the early book trade requires an equally dynamic technical base. To support scholars from widely divergent disciplines, we will design the database structure, tool collection, and distribution network underlying the EEBD with an eye toward maximum flexibility and utility.
Perhaps the four most important challenges facing a digital resource designed to be used collaboratively are power, portability, transparency, and preservation. A research collection of the scope and sophistication envisioned by the EEBD will require a powerful query-and-analysis engine supporting it. However, as no software package can offer every approach to analysis, all data must be capable of being extracted partially or fully from their native environment and imported into a new one. Furthermore, such extraction must take place simply and without any loss of information such as special characters or complex relationships among records. Finally, the data and accompanying structures must be based on a universally recognized industry standard that will survive changes in fashion and technology. The IATH research and development team, with support from the information technology specialists at team members' institutions, will assess the nature of the data to be collected and select the most appropriate technologies for representing and exploiting it in machine-readable form. Preliminary analysis of the data suggests that the optimum maintenance environment will be object-relational database technology, with XML for communication between the database and other applications (for example, statistical analysis software) and between the database and end-users. XML appears to be the most effective technology for integrating and communicating related data from EEBD, ESTC, and BBTI.
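The portability requirement described above can be illustrated with a minimal sketch. The record layout used here (the element names `record`, `title`, `printer` and `sheets`, and the attributes `id` and `ref`) is a hypothetical stand-in, not the EEBD schema, and every value is an invented placeholder. The example shows only that an XML record containing special characters and a cross-reference can be exported and re-imported with Python's standard library without losing either.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout and invented placeholder values;
# none of this is taken from the actual EEBD data.
SAMPLE = """<eebd>
  <record id="X0001">
    <title>Placeholder title with long ſ and tilde õ</title>
    <printer ref="P01">Printer A</printer>
    <sheets>12.5</sheets>
  </record>
</eebd>"""


def extract_records(xml_text):
    """Return one flat dict per <record>, keeping text exactly as stored."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iter("record"):
        printer = rec.find("printer")
        rows.append({
            "id": rec.get("id"),
            "title": rec.findtext("title"),
            "printer": printer.text,
            "printer_ref": printer.get("ref"),  # link to another record
            "sheets": float(rec.findtext("sheets")),
        })
    return rows


def round_trip(xml_text):
    """Serialize the parsed tree and parse it again, as an export/import would."""
    root = ET.fromstring(xml_text)
    exported = ET.tostring(root, encoding="unicode")
    return extract_records(exported)


rows = extract_records(SAMPLE)
# The export/import cycle preserves special characters and the cross-reference.
assert rows == round_trip(SAMPLE)
```

The flat dictionaries returned by `extract_records` could then be written out in whatever form a statistical package expects. That step is omitted here because it depends on the package chosen.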
As far as analytical instruments are concerned, many powerful packages such as SPSS and SAS are currently available for use on common computer platforms such as Windows, Macintosh and UNIX; indeed, the EEBD relies on XML technologies in part to support the seamless transfer of data extracts to these popular statistical packages. Nonetheless, many scholars will not require the sophisticated routines offered by SPSS and SAS (nor would they have the time to learn how these packages function). A great deal can be learned from simple descriptive analyses that display measures of central tendency (e.g. averages) or of dispersion (e.g. distribution histograms), especially when applied to the EEBD's rich evidence base. In order to encourage the widest use of the EEBD data, we include as a long-term goal the creation of a suite of tools specifically designed to exploit the physical and descriptive information it contains. Accompanying these tools will be a set of display strategies based upon Edward Tufte's principles of data compression and density. With such an integrated set of query, analysis, and display strategies, researchers might map the shifting kin and financial alliances of booksellers across a diachronic representation of London.
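The kind of simple descriptive analysis mentioned above can be sketched in a few lines. The figures below are invented placeholders standing in for a measurement such as sheets per edition; they are not EEBD data, and the function names `describe` and `histogram` are illustrative only.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Invented placeholder values (e.g. sheets per edition); not real EEBD data.
sheet_counts = [2, 4, 4, 6, 8, 8, 8, 12, 20, 40]


def describe(values):
    """Basic measures of central tendency and dispersion."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": pstdev(values),  # population standard deviation
    }


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    return Counter((v // bin_width) * bin_width for v in values)


summary = describe(sheet_counts)
bins = histogram(sheet_counts, bin_width=10)

# Plain-text histogram: one '#' per observation in each bin.
for lower in sorted(bins):
    print(f"{lower:>3}-{lower + 9:<3} {'#' * bins[lower]}")
```

With these placeholder values the mean (11.2) sits well above the median (8) because of a few large items, which is the sort of skew a histogram makes visible at a glance.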
Finally, the utility of the EEBD will increase substantially when linked with the ESTC and BBTI. First conceived nearly thirty years ago, the ESTC is an enumerative listing of English printing 1470-1800, containing quasi-regularized title-page information and an annotated census of known copies for each of its records. The BBTI offers biographical and trade details for every person known to have worked in the British book trade up to 1851. These two resources complement perfectly the deep statistical data embedded within the EEBD, forming a tripartite anthology of facts, figures, names, dates, and places describing the volatile world of early printing. In order to exploit fully the potential of this strategic arrangement, the EEBD will build a triangulating mechanism that will allow scholars to use one or more of the resources in whatever combination they choose. For example, the potential researchers mentioned in the previous paragraph might add biographical data from the BBTI to enrich and extend their data map, then use the ESTC census to locate presentation copies that have authorial inscriptions indicating further relationships between author, printer and bookseller.
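One way to picture the triangulating mechanism is as a merge over shared identifiers that works with whichever resources a scholar has to hand. The sketch below is an assumption about how such a merge might look, not a description of the planned implementation: the identifiers, field names, link table and the function `triangulate` are all hypothetical, and the values are invented placeholders.

```python
# Invented placeholder records keyed by hypothetical identifiers.
eebd = {"X0001": {"sheets": 12.5, "typeface": "black letter"}}
estc = {"X0001": {"title": "Placeholder title", "known_copies": 3}}
bbti = {"P01": {"name": "Printer A", "trade": "printer"}}

# Hypothetical link table pairing an item with the people involved in it.
links = [("X0001", "P01")]


def triangulate(item_id, eebd=None, estc=None, bbti=None, links=()):
    """Merge whatever the supplied resources hold about one item.

    Any resource may be omitted; the result simply lacks that section.
    """
    merged = {"id": item_id}
    if eebd and item_id in eebd:
        merged["physical"] = eebd[item_id]
    if estc and item_id in estc:
        merged["catalogue"] = estc[item_id]
    if bbti:
        merged["people"] = [
            bbti[person_id]
            for linked_item, person_id in links
            if linked_item == item_id and person_id in bbti
        ]
    return merged


# All three resources combined.
full = triangulate("X0001", eebd=eebd, estc=estc, bbti=bbti, links=links)

# The same query without the ESTC: the catalogue section is absent,
# but physical and biographical data are still joined.
partial = triangulate("X0001", eebd=eebd, bbti=bbti, links=links)
```

Treating each resource as optional keeps the combination flexible: a query returns as much as the available resources allow rather than failing when one of them is missing.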
Because the ESTC currently licenses its data to the Research Libraries Group (RLG), which in turn charges a fee for its use, the relationship among the three databases will be asymmetrical for the time being. Users with access to the ESTC will have complete access to both the EEBD and BBTI; users of the EEBD and BBTI will have full access to both of those resources but only partial access to the ESTC. While only subscribers to the ESTC will have full use of all three datasets, we will make the analysis and display tools we develop freely available to all.