10/6/03 - Abstract
The project will produce a metaprogram that generates a program in the form of a set of patterns that can be used to analyze English sentences. The generated program should be simple enough that a skilled linguist can modify it by hand and transform it into a set of rules that extract meaningful information from a body of English text.
Essentially, the rules will be patterns that identify sentences, and the variable parts of those sentences, that are worth investigating (knowledge discovery). For example, I would like the software, upon reading an appropriate corpus of text, to produce a rule like "?NOUN is a ?NOUN", where each ?NOUN slot stands for a word that probably belongs to the word class NOUN. I would like to use something like WordNet to classify words.
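As a rough illustration of what such a rule might do, here is a minimal sketch in Python (chosen only for brevity; it is not the planned implementation language). The `WORD_CLASS` table is a hypothetical, hand-written stand-in for a WordNet-style lookup, and the `match` function is an assumed name, not part of any existing tool.

```python
# Hypothetical stand-in for a WordNet-style lookup: word -> word class.
WORD_CLASS = {
    "dog": "NOUN",
    "cat": "NOUN",
    "mammal": "NOUN",
    "runs": "VERB",
}

def match(pattern, sentence):
    """Return the words bound to ?CLASS slots, or None if there is no match."""
    p_tokens = pattern.split()
    s_tokens = sentence.lower().rstrip(".").split()
    if len(p_tokens) != len(s_tokens):
        return None
    bindings = []
    for p, s in zip(p_tokens, s_tokens):
        if p.startswith("?"):
            # A slot matches only if the lexicon assigns the word that class.
            if WORD_CLASS.get(s) != p[1:]:
                return None
            bindings.append(s)
        elif p != s:
            # Literal tokens must match exactly.
            return None
    return bindings

print(match("?NOUN is a ?NOUN", "Dog is a mammal."))
```

The bound words are the "variable parts" of the sentence: for the call above, the two nouns on either side of "is a". A word that is missing from the lexicon never fills a slot, which is a deliberately conservative choice in this sketch.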
I would like the software to also automatically generate classifications based on patterns by detecting groups of patterns that could be collapsed if a sufficient classification existed for some term, and generating the classification. There are inherent dangers here, because I must prevent the software from deciding that all terms should fall under a single, simple pattern. For now, to keep these problems at bay, I will use some pre-existing set of classifications, such as that provided by WordNet.
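One possible way to detect collapsible groups, sketched in Python under strong simplifying assumptions: patterns are plain token sequences, two patterns belong to the same group when they differ in exactly one position, and a group is collapsed only when every word at that position shares a class in a pre-existing classification. The `WORD_CLASS` table, the `collapse` function, and the `min_group` threshold are all hypothetical names introduced for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for a pre-existing classification such as WordNet.
WORD_CLASS = {"dog": "NOUN", "cat": "NOUN", "mammal": "NOUN", "runs": "VERB"}

def collapse(patterns, min_group=2):
    """Collapse patterns that differ in one position into a ?CLASS slot."""
    groups = defaultdict(set)
    for pat in patterns:
        tokens = tuple(pat.split())
        for i in range(len(tokens)):
            # Key on everything except position i; collect the words seen there.
            key = (tokens[:i], tokens[i + 1:])
            groups[key].add(tokens[i])
    collapsed = set()
    for (left, right), words in groups.items():
        classes = {WORD_CLASS.get(w) for w in words}
        # Guard against over-generalizing: require enough examples and a
        # single known class shared by every word in the varying position.
        if len(words) >= min_group and len(classes) == 1 and None not in classes:
            slot = "?" + classes.pop()
            collapsed.add(" ".join(left + (slot,) + right))
    return collapsed

print(collapse(["dog is a mammal", "cat is a mammal", "dog runs a mile"]))
```

Requiring a shared, already-known class and a minimum group size is one way to keep the generalization from swallowing every term into a single pattern; it does not generate new classifications, which remains the harder problem described above.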
The metaprogram will be written in Lisp or Caml (I would prefer Caml but do not know it very well, so I may fall back on Lisp), and the generated program will likely take the form of a set of Maude rewrite rules or equations.
9/11/03 - Welcome!
Page posted.