Databases in Historical Sciences Workshop

2021.12.04.

On Friday, 19 November 2021, starting at 13:00, the workshop "Building - Using - Maintaining Databases in the Historical Sciences" was held at ELTE BTK, organized by the ElitData Research Group of the Department of Comparative Historical Sociology, the Research Group on the History of Professionalisation, and the Research Centre for the History of Social Change of ELTE BTK-TÁTK.

Below you will find videos of the presentations, together with a short summary of each.

Károly Halmos: Origins and background: computers in historical research

Károly Halmos, Associate Professor at the Department of Economic and Social History at ELTE, took the audience back to the beginnings of database construction. The appearance of the computer was a milestone in historiography, but early databases were created in environments of limited capacity, so compromises had to be made when entering data. Today, for many databases, the priority is no longer fitting data into constrained interfaces but recording complete data and material without compromise. In parallel with data entry, standardisation also needs to be considered, since historians collect their data for their own research, using highly specific, individual systems. In many cases, historical collections lack the collective features and rigorous systems found in sociological research. It is also common in historical databases that researchers do not record why the data were collected and how they were coded, which makes later use of the data without the original researcher's involvement almost impossible. This may also stem from the fact that much database construction has taken place on the periphery of official historiography, and the issue did not become central for a long time. The presentation emphasised that, although these flaws exist, they could easily be avoided by consciously preparing and educating university students: training digital historians would be a major step forward for the field and for the creation of future historical databases.

The presentation can be viewed by clicking here!

István Fazekas and Zsófia Kádár: Jesuit high school databases

What information do the registers of 17th-century Jesuit high schools provide? This question was answered by István Fazekas (ELTE BTK Institute of History, Department of Early Modern History, Catholic Schooling in Early Modern Hungary Research Group) and Zsófia Kádár (Catholic Schooling in Early Modern Hungary Research Group) at the workshop Building - Using - Maintaining Databases in the Historical Sciences.

The research group studying the careers of students at Jesuit high schools in Hungary has gained access to source material that was previously used mostly by local historians. Years ago, István Fazekas and Zsófia Kádár, together with their colleagues, began to study the school registers kept since the founding of the Jesuit high schools in the 17th century, thereby opening up to wider use a body of sources that until then had been available only to a narrow circle of researchers. In the 17th century, Jesuit high schools held a monopoly on Catholic education in the country, and internal communication between the schools was well developed and well regulated. As a result, registers of enrolled students have existed since the 16th century, recording their names, places of origin, nationality, social status, educational data and other notes. The study examined the student populations of Nagyszombat, Győr and Bratislava, and after digitising the surviving source base the researchers also analysed the grade structure, age distribution, movement of students between schools and the students' subsequent careers at each school. The impact of the Jesuit high schools on their towns and regions should not be neglected either.

The presentation can be viewed by clicking here!

Csaba Sasfi: High school register databases

What difficulties do researchers encounter when examining and collecting historical data? Csaba Sasfi, historian of the Research Group on the History of Professionalisation, addressed this question at the workshop Building - Using - Maintaining Databases in the Historical Sciences. He focused on 18th-19th-century secondary-school careers, using this example and the data he has collected to illustrate both the appeal and the difficulties of such research. Perhaps surprisingly, formalised, well-standardised school registers already exist for high schools from the end of the 18th century; this high-quality documentation can be seen as a continuation of the Jesuit tradition of record-keeping. Documentation was also required at the state level: schools had to send the Governor's Council the yearly, routinely produced registers containing students' personal details, institutional data, academic results, parental data and other remarks. In addition to the copy sent to the Governor's Council, schools also kept their own registers, which were usually more thorough and detailed. These records are relatively easy to digitise, and a photograph of the original document can readily be combined with a searchable online format, making the material easier to query. However, the data files created for each individual must also be able to handle changes: the appearance and disappearance of name variants and surnames requires a dynamic interface. According to Sasfi, external expertise should therefore be involved in building such databases to ensure that the data are recorded to the highest standard.
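To give a concrete sense of the kind of dynamic, per-person record the presentation alludes to, here is a minimal sketch in Python. It is an illustration of one possible data model, not the project's actual schema: name forms carry attestation years, so a change of surname can be recorded without overwriting earlier forms. The names and years in the example are invented.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NameVariant:
    """One form of a person's name, with the years for which it is attested (if known)."""
    name: str
    valid_from: Optional[int] = None   # first year the form is attested
    valid_until: Optional[int] = None  # last year the form is attested

@dataclass
class StudentRecord:
    """A per-person record linking register entries to all attested name forms."""
    person_id: str
    names: list[NameVariant] = field(default_factory=list)
    register_entries: list[dict] = field(default_factory=list)  # school, year, results, notes...

    def name_in(self, year: int) -> Optional[str]:
        """Return the name form attested for a given year, if any."""
        for v in self.names:
            if (v.valid_from is None or v.valid_from <= year) and \
               (v.valid_until is None or year <= v.valid_until):
                return v.name
        return None

# Hypothetical example: a student recorded under two surname forms over time.
rec = StudentRecord(
    person_id="example-001",
    names=[
        NameVariant("Kovács János", valid_until=1848),
        NameVariant("Kővári János", valid_from=1849),
    ],
)
print(rec.name_in(1845), "->", rec.name_in(1852))
```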

The presentation can be viewed by clicking here!

Péter Őri: Historical demographic sampling databases: the MOSAIC project

Péter Őri, deputy director of the Population Research Institute of the Hungarian Central Statistical Office (KSH), spoke about the MOSAIC project and its Hungarian database. The aim of the project is to collect historical demographic data from the countries of continental Europe, to build databases from them and to make these available to researchers, professionals and the general public. Historical demographic data collections rely either on cross-sectional data, where researchers gather every available source about the people studied at a single point in time, or on longitudinal studies, which collect less detailed data over a long period, even a lifetime. The Hungarian part of the MOSAIC project combined the two approaches: cross-sectional data were collected on family structure, while family histories were explored using longitudinal data. In Hungary, the information needed to build the database came from the 1869 census, which recorded, among other things, literacy and place of birth. Census material survived for 44 complete settlements and a few fragmentary ones, so the project could describe roughly 6,500 households. The surviving data were weighted by region and denomination to make them nationally representative, and to keep the research process consistent, an attempt was made to reconcile the terminology of earlier and later censuses. The resulting database now allows Hungarian data to be included in the MOSAIC project, opening the way to comparative analyses such as the exploration of European marriage patterns.
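As a rough illustration of what weighting by region and denomination can mean in practice, the sketch below computes simple post-stratification weights: each surviving household receives a weight equal to its stratum's national share divided by its share in the sample. This is a minimal example of the general technique, not the MOSAIC project's actual weighting procedure, and all the figures and stratum labels are invented.

```python
from collections import Counter

# Invented example data: each surviving household tagged with a (region, denomination) stratum.
sample = [
    ("Transdanubia", "Roman Catholic"),
    ("Transdanubia", "Roman Catholic"),
    ("Great Plain", "Calvinist"),
    ("Great Plain", "Roman Catholic"),
]

# Invented national shares of the same strata (they should sum to 1).
national_share = {
    ("Transdanubia", "Roman Catholic"): 0.40,
    ("Great Plain", "Calvinist"): 0.35,
    ("Great Plain", "Roman Catholic"): 0.25,
}

counts = Counter(sample)
n = len(sample)

# Post-stratification weight: national share of the stratum divided by its share in the sample.
weights = {
    stratum: national_share[stratum] / (count / n)
    for stratum, count in counts.items()
}

for (region, denomination), w in weights.items():
    print(f"{region}, {denomination}: weight {w:.2f}")
```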

The presentation can be viewed by clicking here!

Diána Bartha and Zsuzsanna Kiss: A new look for the Gábor I. Kovács prosopographic database - the ELITDATA project

What is the potential of digitising historical data? Zsuzsanna Kiss, assistant professor at the Department of Comparative Historical Sociology, ELTE TáTK, and Diána Bartha, research assistant at the Research Group on Prosopography and Family History, gave a presentation on the knowledge-elite research that has been carried out at the Department since the 1980s. This elite research, led by Gábor István Kovács, focused on the functional, structural and recruitment patterns of the political, economic, ecclesiastical, military and knowledge elites. The current project is digitising the database and transferring it to Wikidata.

Wikidata is a platform built on the principles of Wikipedia, designed for structured data storage and easy searchability. In the literature it is also referred to as a knowledge graph: the uploaded entities (places, concepts, persons, etc.) and their associated properties appear as reference links, much as in Wikipedia. In this way, the information displayed on a single page is interlinked and leads the user into the "deeper layers of the graph". This interconnectedness makes it possible to explore relationships and to carry out comparative research. The platform is used not only in the humanities; it is also an excellent tool for storing and linking the results of research in the natural sciences.
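To give a concrete sense of how such entities and properties can be queried, here is a minimal sketch, not part of the presentation, that sends a SPARQL query to Wikidata's public query service from Python. The endpoint URL is Wikidata's real query service; the identifiers used (P31 "instance of", Q5 "human", P569 "date of birth") are common Wikidata IDs chosen purely for illustration, and a prosopographic project would substitute its own entities and properties.

```python
import requests

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Example query: a handful of items that are instances of "human" (Q5),
# together with their date of birth (P569), where recorded.
QUERY = """
SELECT ?person ?personLabel ?birth WHERE {
  ?person wdt:P31 wd:Q5 .                 # instance of: human
  OPTIONAL { ?person wdt:P569 ?birth . }  # date of birth, if present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,hu". }
}
LIMIT 10
"""

response = requests.get(
    ENDPOINT,
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "workshop-demo/0.1 (example script)"},
)
response.raise_for_status()

# Print each person's label and, if available, their date of birth.
for row in response.json()["results"]["bindings"]:
    label = row.get("personLabel", {}).get("value", "?")
    birth = row.get("birth", {}).get("value", "-")
    print(label, birth)
```

Because the data are stored as such subject-property-value statements, queries like this can follow links across the graph (for example from a person to their place of birth and onward to that place's region), which is what makes the comparative analyses mentioned above possible.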

The research at ELTE aimed to collect demographic, mobility and family-history data on people who held university teaching positions in the bourgeois era. The digitisation and consolidation of these data are currently underway, together with an analysis of the geographical and social mobility of the individuals included in the database. The research offers the opportunity not only to combine historical scholarship with digital tools, but also to examine the data from a fresh sociological perspective.

The presentation can be viewed by clicking here!

Péter Gerhard: Archival (public collection) databases

Péter Gerhard, Head of the E-Archives and IT Department of the Budapest City Archives, spoke at the workshop about the importance of digitising archival data. In the first part of his presentation, the chief archivist outlined the main differences between the archives of Anglo-Saxon and continental European countries: while in the Anglo-Saxon world materials are organised at the level of collections, on the continent, including Hungary, the ordering is far more hierarchical. This principle of arrangement, which is also applied in Hungary, makes research harder: searching within the archives is not trivial, and it is not always clear which document is most relevant to a given question, since related information may be scattered across different fonds. To support research and preservation, the digitisation of archival data along a predefined data structure has now begun, but searching across entire databases is not yet possible. The almost unimaginable amount of material held in archives can only be digitised with the help of information technology: handwriting-recognition software and the expertise needed for automatic translation from foreign languages are becoming ever more valuable, and without them the digitisation of the data would be almost impossible. Digitisation, however, would not solve every problem: personal data protection law is very strict about the searchability of archival data, which creates new difficulties for researchers.

The presentation can be viewed by clicking here!

Szilvia Maróthy: Database use and construction in the spirit of Open Science: open-data databases

What can encourage researchers not only to make the publications of their research projects available, but also to share with readers the intermediate steps of the research and the databases they have created? In her presentation, Szilvia Maróthy (Institute for Literary Studies, Research Centre for the Humanities) explored the questions of open science and database use.

What is new about the open science approach? In open science, every step of the research, whether photographs, compiled databases or code, counts as a scientific result. Open science could renew our approach to scientific research in several respects. On the one hand, as open science advances, the background work, often the most painstaking part of research, would also be recognised as scientific work and as a result in its own right. On the other hand, it would make the preservation of scientific results easier and more up to date. Shared preservation repositories would avoid the well-known problem of storage media becoming obsolete: we have all seen paper-based, handwritten datasets and databases stored on floppy disks from which information can now be extracted only with great effort. With the rise of open science, the transparency of scientific work would grow, and so, thanks to accessibility, would the number of references to the research. At the same time, however, source criticism must become more rigorous: for online sources it is essential to know who created the data and, with it, the scientific results. Such a critical attitude would allow scientific results to be accessible, comprehensible and respected without multiplying fake news and pseudo-scientific claims.

The presentation can be viewed by clicking here!

Gábor Palkó: ELTE BTK TI Department of Digital Humanities - Opportunities and Perspectives

How can the humanities be connected to the field of informatics? In his presentation, Gábor Palkó (ELTE BTK Department of Digital Humanities) gave an insight into the database developments, further training and research taking place at ELTE.

The Department of Digital Humanities at ELTE BTK works at the meeting point of precisely these two fields, the humanities and informatics: its aim is to present and pass on the knowledge that connects the humanities with the digital skills essential in the 21st century. Accordingly, all undergraduate students at BTK take an introductory course designed to show them the possibilities of moving between disciplines, and within the framework of the BTK doctoral workshop they can explore the relationship between digital techniques and their own field of research. The Department also runs a number of independent research projects, including the construction of poetry and press corpora as well as research on prosopography and humanism. The Department of Digital Humanities seeks to demonstrate the novelty and feasibility of open and linked databases. The results of research conducted at BTK are uploaded to Wikidata, an independent, openly editable data network on which ad-hoc queries, visualisation and continuous expansion of the data repository are possible.

The presentation can be viewed by clicking here!