Minoan Linguistic Resources: The Linear A Digital Corpus

This paper describes the Linear A/Minoan digital corpus and the approaches we applied to develop it. We aim to set up a suitable study resource for Linear A and Minoan. Firstly we start by introducing Linear A and Minoan in order to make it clear why we should develop a digital marked up corpus of t...

Bibliographic Details
Main Authors:	Petrolito, Tommaso; Petrolito, Ruggero; Winterstein, Grégoire; Perono Cacciafoco, Francesco
Other Authors:	School of Humanities and Social Sciences
Format:	Conference or Workshop Item
Language:	English
Published:	2016
Subjects:	Linguistics and Multilingual Studies
Online Access:	https://hdl.handle.net/10356/80398 http://hdl.handle.net/10220/40744 http://aclweb.org/anthology/sighum.html#2015_0
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-80398
record_format	dspace
spelling	sg-ntu-dr.10356-80398 2019-12-06T13:48:34Z Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH) Published version 2016-06-22T07:39:30Z 2019-12-06T13:48:34Z 2015 Conference Paper 192451 en © 2015 Association for Computational Linguistics (ACL) and Asian Federation of Natural Language Processing (AFNLP). This journal is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. 10 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	Linguistics and Multilingual Studies
description	This paper describes the Linear A/Minoan digital corpus and the approaches we applied to develop it. We aim to set up a suitable study resource for Linear A and Minoan. Firstly, we introduce Linear A and Minoan in order to make it clear why we should develop a digital marked-up corpus of the existing Linear A transcriptions. Secondly, we list and describe some of the existing resources about Linear A: Linear A documents (seals, statuettes, vessels, etc.), the traditional encoding systems (standard code numbers referring to distinct symbols), a Linear A font, and the newest (released on June 16th, 2014) Unicode Standard character set for Linear A. Thirdly, we explain our choices concerning the data format: why we decided to digitize the Linear A resources; why we decided to convert all the transcriptions into standard Unicode characters; why we decided to use an XML format; why we decided to implement the TEI-EpiDoc DTD. Lastly, we describe the development process (from the data collection to the issues we faced and the strategies we used to solve them) and a new font we developed (synchronized with the Unicode character set) in order to make the data readable even on systems that are not updated. Finally, we discuss the corpus we developed from a Cultural Heritage preservation perspective and suggest some future work.
author2	School of Humanities and Social Sciences
format	Conference or Workshop Item
author	Petrolito, Tommaso; Petrolito, Ruggero; Winterstein, Grégoire; Perono Cacciafoco, Francesco
author_sort	Petrolito, Tommaso
title	Minoan Linguistic Resources: The Linear A Digital Corpus
publishDate	2016
url	https://hdl.handle.net/10356/80398 http://hdl.handle.net/10220/40744 http://aclweb.org/anthology/sighum.html#2015_0

Similar Items