Feature Manipulation for Maxent model in NER

Manoj B. Narayanan

2017-12-19 13:27:14 UTC

Hi all,

I have been varying the custom features we provide to the model, and I have
a few questions about how it handles them.

1. Will the probability for a particular feature be affected if I add it
multiple times?
E.g., if I add the feature 'pos=NN' multiple times, will it have an impact
on model performance?

2. What if I add the same feature under two different names?
E.g., if I add both 'pos=NN' and 'partsOfSpeech=NN', what will the impact be?
These two always co-occur, so how will the model treat them?

3. How does the model learn the features? Please give a small example.

4. What if we could attach a class to each feature?
E.g., certain features can take only a certain set of values. If we are
able to label them, can we make the model learn the features according to
those labels?
Say I have a) a pos feature and b) a dictionary feature.
If the probability is first calculated with respect to the corresponding
class (pos / dictionary) and the overall probability is then calculated,
how will the model behave?
Instead of giving a single string as a feature, what if we give a
(key, value) pair as the feature?
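
To make questions 1, 2 and 4 concrete, here is a minimal Python sketch of
how I picture the scoring. Everything in it is an assumption for
illustration: the weights are made up, the names WEIGHTS, to_strings and
probabilities are hypothetical, and counting a repeated string as a larger
feature value is only one possible behaviour, not necessarily what the
library actually does.

```python
import math
from collections import Counter

# Made-up weights keyed by (feature string, outcome); purely illustrative.
WEIGHTS = {
    ("pos=NN", "person"): 0.4,
    ("pos=NN", "other"): 0.1,
    ("partsOfSpeech=NN", "person"): 0.4,
    ("partsOfSpeech=NN", "other"): 0.1,
    ("dict=firstname", "person"): 1.2,
}
OUTCOMES = ["person", "other"]


def to_strings(pairs):
    """Flatten (key, value) pairs into plain 'key=value' feature strings."""
    return [f"{key}={value}" for key, value in pairs]


def probabilities(features):
    """Softmax over summed weights of the active features (maxent form).

    Assumption: a feature string that appears n times contributes n times
    its weight. Unknown (feature, outcome) pairs contribute 0.
    """
    counts = Counter(features)
    scores = {
        outcome: sum(n * WEIGHTS.get((feat, outcome), 0.0)
                     for feat, n in counts.items())
        for outcome in OUTCOMES
    }
    z = sum(math.exp(s) for s in scores.values())
    return {outcome: math.exp(s) / z for outcome, s in scores.items()}


once = probabilities(["pos=NN"])
twice = probabilities(["pos=NN", "pos=NN"])                # question 1
renamed = probabilities(["pos=NN", "partsOfSpeech=NN"])    # question 2
keyed = probabilities(to_strings([("pos", "NN"), ("dict", "firstname")]))  # question 4
```

Under these assumptions, twice and renamed give the same distribution,
because both simply add the same weight a second time. What I would like to
know is whether the real model behaves like this, or whether training
compensates for the duplication.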

I look forward to discussing these.

Thanks,
Manoj.