Intimate connections : an agent-based model of the relationship between social network structure and language typology
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/82953 http://hdl.handle.net/10220/46659 |
Institution: | Nanyang Technological University |
Summary: | A long-standing question concerning natural human language is why some languages are more complex than others. More specifically, why do some languages use intricate morphological structure (in some, entire sentences often consist of a single complex word), while in others all words are void of internal structure and rely only on ordering and contextual clues to signal grammatical dependencies? There is no intrinsic reason to prefer one strategy over the other, and attested human languages span the spectrum of possibilities.
On the one hand, linguists have proposed a number of theories for why this diversity exists, most centering on the intimate social structure of the speakers as the explanatory factor for increases in morphological composition, and on the addition of second-language speakers as the explanatory factor for decreases. However, this work is based on inductive observation of extant languages, and is at present descriptive and untestable. Furthermore, the time span over which such morphological developments would take place is at least an order of magnitude greater than that reachable with current data and reconstruction methods.
On the other hand, much work in the hard sciences (particularly cosmology and astrophysics) has overcome similar time gaps using computational simulations. Indeed, sound mathematical models of social networks, and of complex networks in general, have emerged and been applied to linguistic issues such as the diffusion of lexical innovation. However, the representation used in such models is far more abstract than that used in linguistics and, even more problematically, the mechanisms of learning and change employed in these models often rest on patently unrealistic assumptions, for example, that language spreads like a virus.
In this thesis, an initial attempt is made to bridge this gap between ideas and means. First, two variables of interest are identified to quantify the notions of compositional complexity and intimacy: synthesis and the global clustering coefficient, respectively. Synthesis is the average proportion of bound morphemes per word in a representative corpus, and was developed specifically to quantify the degree of morphological composition used in a language. The global clustering coefficient measures the proportion of triangular connections in a complex network, and these triangular connections have been directly connected in the literature to the notion of intimacy.
To explore the relationship between these variables, two novel computational models are created; to my knowledge, they are the first to incorporate the fundamental observations within linguistics concerning the development of bound morphology. Specifically, both models have distinct diffusion and intergenerational transmission stages. In addition, the learning mechanisms are differentiated to account for the difference between the learning of novel lexical forms through diffusion and the slow, incremental grammatical changes that accrue through language transmission. More specifically, the three subprocesses of extension, reanalysis, and repetition, identified in the linguistics literature as driving the development of bound morphology, are independently addressed.
First, a high-level model that is computationally cheap is developed. It models the capacity of various social networks to support the repeated application of these processes responsible for creating bound morphology. However, this level of abstraction lacks the structure required to measure synthesis directly, and so a low-level model in which the processes are directly modeled on a symbolic meaning-signal structure is also presented.
The results of these two models are in agreement: the physical network measure of the global clustering coefficient supports the processes responsible for bound morphology in the high-level model, and predicts higher levels of synthesis in the low-level model. In addition, by testing realistic human social network topologies, it was determined that the hub agents that emerge as social networks grow in size counteract this effect of clustering by reshaping the input and interrupting the continuation of the transmission process. This result provides both a mechanistic account of the observed correlation between social structure and increases in synthesis, and an alternative explanation for decreases in synthesis that does not involve second-language speakers.
Beyond these initial insights, this thesis demonstrates that the computational study of typological change helps extend our knowledge beyond what can be naturally observed. In this case, it provides inroads to understanding how the intimacy of small societies mechanistically relates to the correlations observed in the typological distribution of synthesis. |
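The two measures named in the summary, synthesis and the global clustering coefficient, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes that synthesis is read as the mean, over words, of bound morphemes divided by total morphemes in each word, and that the "proportion of triangular connections" is the standard transitivity ratio (closed connected triples over all connected triples). The function names and the corpus encoding as `(morpheme, is_bound)` pairs are hypothetical.

```python
from itertools import combinations


def synthesis(corpus):
    """Mean per-word proportion of bound morphemes (assumed reading).

    corpus: list of words, each a list of (morpheme, is_bound) pairs.
    """
    proportions = [
        sum(1 for _, is_bound in word if is_bound) / len(word)
        for word in corpus
        if word  # skip empty words to avoid division by zero
    ]
    return sum(proportions) / len(proportions) if proportions else 0.0


def global_clustering(edges):
    """Closed connected triples divided by all connected triples.

    edges: iterable of undirected (u, v) pairs; self-loops are ignored.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    triples = 0
    closed = 0
    for nbrs in neighbors.values():
        # Each pair of neighbors of a node forms one connected triple
        # centered on that node; it is closed if the pair is also linked.
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in neighbors[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical two-word corpus: "dog-s" (one bound morpheme of two) and "run".
toy_corpus = [[("dog", False), ("s", True)], [("run", False)]]
print(synthesis(toy_corpus))  # 0.25

# A fully connected triad versus a star centered on a single hub.
print(global_clustering([(1, 2), (2, 3), (1, 3)]))  # 1.0
print(global_clustering([(0, 1), (0, 2), (0, 3)]))  # 0.0
```

In this toy setting, a fully connected triad scores 1.0 and a star around one hub scores 0.0, which gives a feel for how the coefficient separates tightly knit groups from hub-centered ones.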