
Machine-readable grammatical resources for Indonesian

This project produced grammatical resources for Indonesian, both to guide the development of computer-implemented grammars and to establish a standard against which grammar coverage can be measured. The resources consist of a set of 52 machine-readable (plain text) files containing acceptable and unacceptable sentences of Indonesian, their translations, and comments on their grammatical structure. Each file constitutes an in-depth investigation into the grammatical structure of one aspect of Indonesian, or of the interactions among two or more constructions.

Our project connects with the project "Understanding Indonesian: developing a machine-usable grammar, dictionary and corpus", funded by the Australian Research Council, with which PI Dalrymple is associated as a partner investigator.

The ARC-funded project will produce a broad-coverage grammar, lexicon, and balanced corpus of Indonesian as part of the Parallel Grammar Project (PARGRAM), an international consortium of research institutions that develops computational grammars and lexicons within the shared linguistic framework of Lexical Functional Grammar (LFG).

The test suites have guided the development of the grammar, ensuring coverage of less common as well as of basic constructions, testing the full paradigm of constructions and their interactions, and testing the "tightness" of the grammar in excluding impossible analyses as well as producing well-formed analyses for the constructions under examination.
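
As an illustration of how a test suite of acceptable and unacceptable sentences can measure both coverage and "tightness", the following is a minimal sketch, not the project's actual tooling. It assumes each suite item has already been read into a record with a sentence and an acceptability flag, and that the grammar is exposed through a function returning the number of analyses for a sentence. The names `SuiteItem`, `evaluate`, and `count_parses` are hypothetical, and the sentence strings are placeholders, not real test-suite data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class SuiteItem:
    """One test-suite entry (hypothetical structure, not the project's file format)."""
    sentence: str
    acceptable: bool  # True if the suite marks the sentence as acceptable


def evaluate(items: Iterable[SuiteItem],
             count_parses: Callable[[str], int]) -> Dict[str, float]:
    """Return coverage and overgeneration rates for a grammar.

    coverage:       share of acceptable sentences that receive at least one analysis
    overgeneration: share of unacceptable sentences that receive at least one analysis
    """
    items = list(items)
    good = [i for i in items if i.acceptable]
    bad = [i for i in items if not i.acceptable]
    parsed_good = sum(1 for i in good if count_parses(i.sentence) > 0)
    parsed_bad = sum(1 for i in bad if count_parses(i.sentence) > 0)
    return {
        "coverage": parsed_good / len(good) if good else 0.0,
        "overgeneration": parsed_bad / len(bad) if bad else 0.0,
    }


if __name__ == "__main__":
    # Placeholder items; a real run would load them from the suite files.
    suite = [
        SuiteItem("acceptable-1", True),
        SuiteItem("acceptable-2", True),
        SuiteItem("unacceptable-1", False),
        SuiteItem("unacceptable-2", False),
    ]
    # Stand-in for a real parser: a lookup table of analysis counts.
    toy_counts = {"acceptable-1": 1, "acceptable-2": 0,
                  "unacceptable-1": 0, "unacceptable-2": 2}
    print(evaluate(suite, lambda s: toy_counts.get(s, 0)))
```

In this sketch a tight grammar would show high coverage together with low overgeneration. Reporting the two rates separately keeps a grammar that accepts everything from appearing to perform well.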