Project details
Description
For centuries, most language scientists have agreed that all languages share some universal building blocks, and that the categories of nouns and verbs are among them. All languages are expected to have nouns, from concrete ones such as 'table' to abstract ones such as 'strife'. All languages are likewise expected to have verbs, words such as 'go', 'eat', or 'love'. However, a persistent thread of research maintains that some languages do not have these categories. The polysynthetic languages investigated in this project are at the top of this list. In a polysynthetic language, words are composed of many parts (morphemes) which have independent meaning but may not be able to stand alone. If polysynthetic languages indeed lack noun-verb distinctions, that would make them highly unusual and would pose new challenges to our understanding of universal principles of cognition and speech.
This project explores noun-verb distinctions in polysynthetic languages by developing new methods in computational linguistics that promote and facilitate cross-linguistic comparisons. The specific questions this research addresses are: (1) are there universal word class distinctions, particularly between nouns and verbs, and if yes, at what level does such a distinction exist? (2) can we uncover universal diagnostics for noun-verb distinctions? To answer these questions, two key issues must be addressed computationally: morphological segmentation and part-of-speech tagging of lexical items in context. The researchers propose a novel computational approach to morphological segmentation based on Adaptor Grammars that is unsupervised and is able to include linguistic knowledge as inductive bias. They also develop an unsupervised cross-lingual transfer approach for part-of-speech tagging that will be applied to a range of polysynthetic languages. As a result, the project will assemble computational tools and primary linguistic data from a diverse set of polysynthetic languages. Aside from their computational and linguistic value, the project's results will also have significant societal impact as many polysynthetic languages are spoken in areas that are key for international security, language revitalization, and health concerns. In-depth work on grammar and corpora of polysynthetic languages will serve as the basis of new pedagogical materials to be used for the teaching and revitalization of low-resource languages.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Status | Finished |
---|---|
Effective start/end date | 9/18/17 → 12/31/22 |
Funding
- National Science Foundation: US$249,302.00
Keywords
- Language and Linguistics
- Computational Mathematics
- Linguistics and Language
- Psychobiology
- Cognitive Neuroscience
- Mathematics (all)
- Physics and Astronomy (all)
- General