Python 2.7 - NLTK RegEx Chunker - Wildcard match any POS tag?
I'm using NLTK's RegexpParser to chunk phrases from POS-tagged words. Example:
import nltk

grammar = """
    found: {<NNP>+<CD>+<,>+<CD>}
    ...
"""
pos_tagged_words = [('February', 'NNP'), ('14', 'CD'), (',', ','), ('1993', 'CD')]
result = nltk.RegexpParser(grammar).parse(pos_tagged_words)
Is there a way to match a wildcard tag? If it worked, I'd be looking for something like this:
found: {<NNP>?<.>*<VBZ>}
where <.> is a wildcard.
Edit:
I found a pretty bad way that doesn't include all characters. I'd still appreciate a dedicated wildcard character.
found: {<NNP>?<[A-Z]+|[:punct:]+>*<VBZ>}
Try this:
{<NNP>?<.*>*<VBZ>}
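The reason this works is that, inside the angle brackets, a tag pattern behaves like a regular expression over the tag name, so <.> only matches one-character tags (such as , or .), while <.*> matches a tag of any length. Below is a minimal sketch of the pattern in use. The sample sentence and its tags are made up for illustration, and the chunk label found is taken from the question.

```python
import nltk

# Hypothetical hand-tagged sentence, used only to illustrate the pattern
tagged = [('John', 'NNP'), ('often', 'RB'), ('runs', 'VBZ'), ('fast', 'RB')]

# <.*> matches any single POS tag; the trailing * repeats it zero or more times
grammar = "found: {<NNP>?<.*>*<VBZ>}"
parser = nltk.RegexpParser(grammar)
tree = parser.parse(tagged)

# Collect the tokens of every chunk labelled 'found'
chunks = [subtree.leaves() for subtree in tree.subtrees()
          if subtree.label() == 'found']
print(chunks)
```

Note that <.*>* is greedy: if the sentence contains several VBZ tokens, the chunk is likely to extend to the last one, so a narrower pattern may be preferable when that matters.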