How to Apply Named Entity Recognition (NER) to Concise Concepts

Jun 20, 2021 | Data Science

Named Entity Recognition (NER) identifies key elements in text, but training a model for a handful of narrow, domain-specific concepts usually means labeling a lot of data. Few-shot NER based on word embedding similarity sidesteps much of that work: you supply a few example words per label, and similar words are found for you. In this blog post, we’ll explore how to use the *Concise Concepts* library, which builds such a pipeline on top of spaCy and can also attach a score to each entity it finds.

Getting Started with Installation

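Before we dive into the code, install the concise-concepts Python package, which provides the pipeline component used below. The example also loads spaCy's `en_core_web_md` model, which is downloaded separately with `python -m spacy download en_core_web_md`.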

pip install concise-concepts

Understanding the Code: An Analogy

Think of the NER pipeline as a kitchen where ingredients (data, text, models) are combined into a dish (meaningful insights). The *Concise Concepts* library acts like a skilled chef who draws on experience (word embedding similarities) to cook something good from only a few listed ingredients (few-shot learning).

Here’s how it works:

  • Ingredients Preparation: You define your groups (like fruits, vegetables, meat) in a `data` dictionary that maps each label to a few example words. These are the ingredients.
  • Recipe Following: The recipe (the NER code) expands those examples into matching patterns and applies them through the spaCy EntityRuler, which plays the role of the cooking method.
  • Final Tasting: The result still needs tasting (entity scoring) to check the flavor, measured by how closely each matched entity resembles the examples for its label.
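The three steps above can be sketched in plain Python. This is a minimal illustration of the idea, not the library's implementation: the three-dimensional vectors are invented for the example, and `expand`, `score`, and `to_patterns` are hypothetical helper names. Only the pattern dictionaries follow the format that the spaCy EntityRuler accepts.

```python
import math

# Toy 3-dimensional "embeddings", invented purely for illustration.
# A real pipeline would look these up in a trained word-vector model.
TOY_VECTORS = {
    "apple": (0.9, 0.1, 0.0),
    "pear": (0.8, 0.2, 0.0),
    "banana": (0.88, 0.12, 0.02),
    "broccoli": (0.1, 0.9, 0.0),
    "spinach": (0.15, 0.85, 0.05),
    "carrot": (0.12, 0.88, 0.03),
    "beef": (0.0, 0.1, 0.9),
    "chicken": (0.05, 0.1, 0.88),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def centroid(words):
    """Component-wise mean of the vectors for the given words."""
    vectors = [TOY_VECTORS[w] for w in words]
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def expand(seeds, topn=1):
    """Return the topn non-seed words closest to the seeds' centroid."""
    center = centroid(seeds)
    candidates = [w for w in TOY_VECTORS if w not in seeds]
    candidates.sort(key=lambda w: cosine(TOY_VECTORS[w], center), reverse=True)
    return candidates[:topn]


def score(word, seeds):
    """Assumed scoring rule: similarity of a word to its label's seed centroid."""
    return cosine(TOY_VECTORS[word], centroid(seeds))


def to_patterns(data, topn=1):
    """Build EntityRuler-style token patterns from seeds plus expanded words."""
    patterns = []
    for label, seeds in data.items():
        for word in list(seeds) + expand(seeds, topn):
            patterns.append({"label": label, "pattern": [{"LOWER": word}]})
    return patterns


data = {"fruit": ["apple", "pear"], "vegetable": ["broccoli", "spinach"]}
patterns = to_patterns(data)
```

With these toy vectors, "banana" is added to the fruit patterns and "carrot" to the vegetable patterns, even though neither appears in `data`. The scoring rule shown is only one plausible choice; the library's own score may be computed differently.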

Creating the spaCy Pipeline

Now, let’s build our NER pipeline utilizing the *Concise Concepts* library. Below is a code example.

import spacy
from spacy import displacy
import concise_concepts  # makes the "concise_concepts" component available to spaCy

# A few example words per label; the library expands these via embedding similarity.
data = {
    'fruit': ['apple', 'pear', 'orange'],
    'vegetable': ['broccoli', 'spinach', 'tomato'],
    'meat': ['beef', 'pork', 'turkey', 'duck'],
}

text = "Heat the oil in a large pan and add the onion, celery and carrots. Then, cook over a medium–low heat for 10 minutes, or until softened. Add the courgette, garlic, red peppers and oregano and cook for 2–3 minutes. Later, add some oranges and chickens."

# Disable the built-in NER so that only our custom entities are produced.
nlp = spacy.load('en_core_web_md', disable=['ner'])
nlp.add_pipe(
    'concise_concepts',  # components are added by their registered string name
    config={
        'data': data,
        'ent_score': True,  # attach a score to each entity (read later as ent._.ent_score)
        'verbose': True,
        'exclude_pos': ['VERB', 'AUX'],
        'exclude_dep': ['DOBJ', 'PCOMP'],
        'include_compound_words': False,
        'json_path': '.fruitful_patterns.json',
        'topn': (100, 500, 300)
    }
)

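doc = nlp(text)

# displaCy options: one color per label, and the list of labels to display.
options = {
    'colors': {
        'fruit': 'darkorange', 
        'vegetable': 'limegreen', 
        'meat': 'salmon'
    },
    'ents': ['fruit', 'vegetable', 'meat']
}

# Relabel each entity as "label (score%)" and give the new label
# the same color as the label it was derived from.
ents = doc.ents
for ent in ents:
    new_label = f"{ent.label_} ({ent._.ent_score:.0%})"
    options['colors'][new_label] = options['colors'].get(ent.label_.lower(), None)
    options['ents'].append(new_label)
    ent.label_ = new_label

doc.ents = ents
# In a Jupyter notebook this displays the highlighted text;
# elsewhere displacy.render returns the markup as an HTML string.
displacy.render(doc, style='ent', options=options)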
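The relabeling loop above is easier to follow in isolation. The sketch below reproduces only its string formatting and options bookkeeping with plain values, so it runs without spaCy; `scored_label` and `register_label` are hypothetical helper names, and the score 0.87 is an invented sample value rather than real output.

```python
def scored_label(label, score):
    """Format a label and a 0-1 score as 'label (NN%)', as the loop above does."""
    return f"{label} ({score:.0%})"


def register_label(options, label, score):
    """Add the scored label to the displaCy options, reusing the base label's color."""
    new_label = scored_label(label, score)
    options["colors"][new_label] = options["colors"].get(label.lower())
    # Unlike the loop above, skip labels that are already listed.
    if new_label not in options["ents"]:
        options["ents"].append(new_label)
    return new_label


options = {"colors": {"fruit": "darkorange"}, "ents": ["fruit"]}
print(register_label(options, "fruit", 0.87))  # fruit (87%)
```

Because every distinct score produces a distinct label, each one needs its own entry in `colors` and `ents`; otherwise displaCy would have no color for it or would not display it.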

Troubleshooting Common Issues

If you encounter issues while setting up your NER pipeline, consider the following troubleshooting tips:

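  • Module Not Found: Ensure all required packages are installed; `pip list` shows what is present. The script also needs `import concise_concepts` so that spaCy can find the component by name.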
  • Incorrect Model Path: Double-check your custom embedding model path to ensure accuracy.
  • Version Compatibility: Verify that your spaCy version is compatible with the concise-concepts library.
  • Configuration Errors: Review your configuration settings. Sometimes, omitting or incorrectly typing an option can cause issues.
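The first and third checks can be scripted with the standard library. The sketch below is one way to do it, assuming the packages were installed with pip under the names shown, including the spaCy model as `en_core_web_md`; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them if your environment differs.
    report = installed_versions(["spacy", "concise-concepts", "en_core_web_md"])
    for name, version in report.items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

A missing entry points to an installation problem, while the reported versions can be compared against the requirements published for concise-concepts.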
