Michael Bramante

I believe that I have designed and built a properly normalized relational database that can store, retrieve and manipulate the content and structure of any written natural language.

I have a Master of Science degree in Computer Science, and I have been designing and building software and large-scale corporate database systems since 1993.


http://home.comcast.net/~bramante/NLDB/NLDB_WEB.htm

[Discussion] Your Thoughts On: A Properly Normalized Relational Database Designed To Store, Retrieve and Manipulate the Content and Structure of Any Written Natural Languages
The purpose of this post is to get your feedback on the general concept of a Natural Language Database (NLDB), which I have designed, built and partially populated. NLDB is a properly normalized relational database designed specifically to store and retrieve the content, rules and structures of natural languages. Its design is granular and general enough that its data can be queried in any conceivable way. NLDB could be used for true Natural Language Understanding, accurate Machine Translation and other Natural Language Processing applications.

NLDB is a sophisticated, application-independent, general-purpose, data-driven natural language knowledge base. NLDB is currently capable of storing all of the contents of the annotated SUSANNE Corpus and WordNet 2.1, and most of the content from the Longman Dictionary of Contemporary English, including phonograms and syllabification. I can prove that NLDB can be the fundamental data source and foundation upon which viable, practical Natural Language Processing applications can be built.

Natural language content, structures and rules are represented in the content and structure of NLDB tables and in their relationships with one another. NLDB can store and retrieve any natural language structure, including grammatical, morphological, syntactic, semantic, lexical, phonological and predicate-argument structures. NLDB data can be imported and integrated from currently available machine-readable dictionaries, thesauri, lexical data files and annotated corpora, then independently checked and refined by hand to achieve 100% accuracy. NLDB data can be accessed and manipulated using standard Structured Query Language (SQL). NLDB documentation consists of a data model and a data dictionary. NLDB is composed of twelve core tables.
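
To make the idea of querying language data with standard SQL more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It does not show the actual NLDB schema or any of its twelve core tables: the table names (lexeme, word_form, lexeme_relation), the columns, the sample rows and the part-of-speech tags are all hypothetical and exist only to illustrate the general approach of storing lexical data in normalized tables and retrieving it with joins.

```python
import sqlite3

# Hypothetical tables for illustration only; NOT the actual NLDB schema.
SCHEMA = """
CREATE TABLE lexeme (
    lexeme_id INTEGER PRIMARY KEY,
    lemma     TEXT NOT NULL,
    language  TEXT NOT NULL
);
CREATE TABLE word_form (
    form_id   INTEGER PRIMARY KEY,
    lexeme_id INTEGER NOT NULL REFERENCES lexeme(lexeme_id),
    form      TEXT NOT NULL,
    pos       TEXT NOT NULL          -- placeholder part-of-speech tag
);
CREATE TABLE lexeme_relation (
    from_id  INTEGER NOT NULL REFERENCES lexeme(lexeme_id),
    to_id    INTEGER NOT NULL REFERENCES lexeme(lexeme_id),
    relation TEXT NOT NULL,          -- e.g. 'synonym'
    PRIMARY KEY (from_id, to_id, relation)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO lexeme VALUES (?, ?, ?)",
        [(1, "run", "en"), (2, "sprint", "en")],
    )
    conn.executemany(
        "INSERT INTO word_form VALUES (?, ?, ?, ?)",
        [(1, 1, "run", "VB"), (2, 1, "ran", "VBD"), (3, 2, "sprinted", "VBD")],
    )
    conn.execute("INSERT INTO lexeme_relation VALUES (1, 2, 'synonym')")
    conn.commit()
    return conn


def related_forms(conn, lemma, relation, pos):
    """Return word forms with tag `pos` of lexemes related to `lemma`."""
    rows = conn.execute(
        """
        SELECT wf.form
        FROM lexeme AS src
        JOIN lexeme_relation AS r ON r.from_id = src.lexeme_id
        JOIN word_form AS wf ON wf.lexeme_id = r.to_id
        WHERE src.lemma = ? AND r.relation = ? AND wf.pos = ?
        ORDER BY wf.form
        """,
        (lemma, relation, pos),
    ).fetchall()
    return [form for (form,) in rows]


conn = build_demo_db()
print(related_forms(conn, "run", "synonym", "VBD"))  # ['sprinted']
```

The point of the sketch is that a question such as "give me the past-tense forms of the synonyms of run" becomes an ordinary three-table join once lemmas, word forms and lexical relations each live in their own table. A real design would need many more attributes and relationships than shown here.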

by bramante    22 comments   