Improving Incremental Packrat Parsing

Packrat parsing, introduced by Ford in 2002 [12, 13] is a na ̈ıve, backtracking, recursive-descent parsing technique that guarantees linear parse times by using memo- ization. Most of the time, a packrat parser is specified using parsing expression gram- mars (PEG) [14], a new formalism to describe...

Full description

Saved in:
Bibliographic Details
Main Author: Guillermo, Jerwin Mark
Format: text
Published: Archīum Ateneo 2019
Subjects:
n/a
Online Access:https://archium.ateneo.edu/theses-dissertations/402
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Ateneo De Manila University
id ph-ateneo-arc.theses-dissertations-1528
record_format eprints
spelling ph-ateneo-arc.theses-dissertations-15282021-09-27T03:00:04Z Improving Incremental Packrat Parsing Guillermo, Jerwin Mark Packrat parsing, introduced by Ford in 2002 [12, 13] is a na ̈ıve, backtracking, recursive-descent parsing technique that guarantees linear parse times by using memo- ization. Most of the time, a packrat parser is specified using parsing expression gram- mars (PEG) [14], a new formalism to describe programming language syntax, similar to the extended Backus-Naur form [6]. In 2017, incremental packrat parsing was in- troduced by Dubroy and Warth [9]. It takes advantage of the memoization table of the previous parse, and reuses it in the current parse, thereby reducing parse times. But such parser still suffers from some slowdowns brought about by sufficiently long inputs. Ad- ditionally, parse times are not reflective of the size of the change in input. Moreover, incremental packrat parsing incurs an additional 12% heap memory usage as compared to its non-incremental variant. This study aimed to improve incremental packrat parsing by further reducing its parse time and heap memory usage. This was done through the use of techniques from experimental algorithmics namely, algorithm tuning, and code tuning, as well as three existing optimizations to packrat parsing namely, grammar simplification, inlining, and transient optimization. Parsing performance of the improved incremental packrat parser was then compared to the performance of the existing one. The improved parser was significantly faster (t(890) = 49.62, p = 8.33×10−259), with about 68% lower average parse time. It also used significantly lower (t(25) = 4.79, p = 3.60×10−5 ) heap memory, with about 26% reduction in size on the average. These results prove that after incorporating the aforementioned techniques and optimizations, the improved incremental packrat parser fared better than the existing parser, in terms of parsing performance. 2019-01-01T08:00:00Z text https://archium.ateneo.edu/theses-dissertations/402 Theses and Dissertations (All) Archīum Ateneo n/a
institution Ateneo De Manila University
building Ateneo De Manila University Library
continent Asia
country Philippines
Philippines
content_provider Ateneo De Manila University Library
collection archium.Ateneo Institutional Repository
topic n/a
spellingShingle n/a
Guillermo, Jerwin Mark
Improving Incremental Packrat Parsing
description Packrat parsing, introduced by Ford in 2002 [12, 13] is a na ̈ıve, backtracking, recursive-descent parsing technique that guarantees linear parse times by using memo- ization. Most of the time, a packrat parser is specified using parsing expression gram- mars (PEG) [14], a new formalism to describe programming language syntax, similar to the extended Backus-Naur form [6]. In 2017, incremental packrat parsing was in- troduced by Dubroy and Warth [9]. It takes advantage of the memoization table of the previous parse, and reuses it in the current parse, thereby reducing parse times. But such parser still suffers from some slowdowns brought about by sufficiently long inputs. Ad- ditionally, parse times are not reflective of the size of the change in input. Moreover, incremental packrat parsing incurs an additional 12% heap memory usage as compared to its non-incremental variant. This study aimed to improve incremental packrat parsing by further reducing its parse time and heap memory usage. This was done through the use of techniques from experimental algorithmics namely, algorithm tuning, and code tuning, as well as three existing optimizations to packrat parsing namely, grammar simplification, inlining, and transient optimization. Parsing performance of the improved incremental packrat parser was then compared to the performance of the existing one. The improved parser was significantly faster (t(890) = 49.62, p = 8.33×10−259), with about 68% lower average parse time. It also used significantly lower (t(25) = 4.79, p = 3.60×10−5 ) heap memory, with about 26% reduction in size on the average. These results prove that after incorporating the aforementioned techniques and optimizations, the improved incremental packrat parser fared better than the existing parser, in terms of parsing performance.
format text
author Guillermo, Jerwin Mark
author_facet Guillermo, Jerwin Mark
author_sort Guillermo, Jerwin Mark
title Improving Incremental Packrat Parsing
title_short Improving Incremental Packrat Parsing
title_full Improving Incremental Packrat Parsing
title_fullStr Improving Incremental Packrat Parsing
title_full_unstemmed Improving Incremental Packrat Parsing
title_sort improving incremental packrat parsing
publisher Archīum Ateneo
publishDate 2019
url https://archium.ateneo.edu/theses-dissertations/402
_version_ 1712577846120022016