Knowledge Graphs and Data Services for Studying Historical Epistolary Data in Network Science on the Semantic Web

Communication data between people are a rich source of insights into societies and organizations, in areas ranging from historical research to investigations of fraudulent behavior. Such data are typically heterogeneous, and the communication networks between people, together with the times and geographical locations at which the communication takes place, are important aspects of them. We argue that these features make temporal communication data a promising application case for Linked Data (LD)-based methods combined with temporal network analysis. The key contribution of this paper is a framework, consisting of tools and systems, for creating, publishing, and analyzing historical LD from a network science perspective. The focus is on network analysis of epistolary network data (metadata about letters), building on recent advances in the analysis of temporal communication networks and the behavioral patterns commonly found in them. To test, evaluate, and demonstrate the usability of the framework, it has been applied to (1) the Dutch CKCC corpus (ca. 20000 letters), (2) the pan-European correspSearch corpus (ca. 135000 letters), (3) the Early Modern Letters Online data (ca. 160000 letters), and (4) the aggregated Finnish CoCo collection of more than 300000 letters from 1809--1917.
