If you’re venturing into the exciting world of intent discovery using deep learning techniques, you’ve landed at the right place! We will explore the deep aligned clustering method, which allows you to discover new intents effectively. Buckle up as we guide you through a step-by-step journey that will help you set up and run this method with ease!
Introduction
The essence of this approach lies in its integration with the open intent discovery module. You can find the implementation on the official GitHub repository for [open intent discovery](https://github.com/thuiar/TEXTOIR/tree/main/open_intent_discovery). Additionally, you can explore the scalable framework [TEXTOIR](https://github.com/thuiar/TEXTOIR). Let’s dive deeper!
Dependencies
Before we dive into using the model, we need to set up our environment. Here are the steps to get everything ready:
- Create a Python environment using Anaconda:
conda create --name your_env_name python=3.6
conda activate your_env_name
- Install the required packages:
pip install -r requirements.txt
Model Preparation
Next, let’s prepare our model:
- Download the pre-trained BERT model.
- Convert it into PyTorch format.
- Set the path of the uncased BERT model in init_parameter.py.
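To make the last step concrete, here is a hypothetical sketch of how init_parameter.py might expose the BERT path through argparse; the actual argument names and defaults in the repository may differ, and the path shown is a placeholder.

```python
# Hypothetical sketch of init_parameter.py's interface (argument names and
# the example path are illustrative, not the repository's exact values).
import argparse

def init_parser():
    parser = argparse.ArgumentParser()
    # Directory of the uncased, PyTorch-format BERT model prepared above
    parser.add_argument("--bert_model", type=str,
                        default="/path/to/uncased_bert_model",
                        help="Path to the pre-trained uncased BERT model.")
    parser.add_argument("--dataset", type=str, default="banking",
                        help="Dataset to run on: clinc or banking.")
    return parser

if __name__ == "__main__":
    args = init_parser().parse_args()
    print(args.bert_model)
```

Point `--bert_model` (or whatever the repository's equivalent option is called) at the directory containing the converted checkpoint before running the experiments.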
Usage
Now that your environment is set up, you can run the experiments:
- Execute the following command:
sh scripts/run.sh
The script accepts the following options:
- dataset: clinc, banking
- factor_of_clusters: 1 (default), 2, 3, 4
- known_class_ratio: 0.25, 0.5, 0.75 (default)
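For intuition, the options above span a small experiment grid. The sketch below (not the actual run.sh, and the command-line flag names are illustrative) enumerates the configurations such a script would sweep over:

```python
# Illustrative enumeration of the experiment grid implied by the options
# above; the flag names in the printed commands are hypothetical.
from itertools import product

datasets = ["clinc", "banking"]
known_class_ratios = [0.25, 0.5, 0.75]   # 0.75 is the default
factors_of_clusters = [1, 2, 3, 4]       # 1 is the default

grid = list(product(datasets, known_class_ratios, factors_of_clusters))
print(len(grid))  # 2 datasets x 3 ratios x 4 factors = 24 configurations
for dataset, ratio, factor in grid[:2]:
    print(f"python run.py --dataset {dataset} "
          f"--known_class_ratio {ratio} --factor_of_clusters {factor}")
```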
Understanding the Code Through an Analogy
Imagine sorting different types of fruit into several baskets based on their characteristics: color, size, and taste. Each basket represents a cluster of similar fruits. You start by sorting on color alone, but as you delve deeper, you realize that size also plays a crucial role. In much the same way, the deep aligned clustering method evaluates multiple traits to group data into more meaningful clusters, uncovering deeper insight into the intents expressed in natural language.
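One detail the analogy glosses over: every time you re-sort the fruit, the basket numbers are arbitrary, so "basket 2" this round may hold what "basket 0" held last round. Deep aligned clustering resolves this by aligning cluster indices across iterations, typically via a Hungarian matching, so pseudo-labels stay consistent during training. The sketch below illustrates that alignment step in isolation (a minimal sketch, not the repository's implementation):

```python
# Minimal sketch of the label-alignment idea: cluster indices from two
# clustering rounds are arbitrary, so we align them with the Hungarian
# algorithm before reusing them as pseudo-labels.
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_labels(prev_labels, curr_labels, num_clusters):
    # Confusion matrix counting overlap between the two label sets
    overlap = np.zeros((num_clusters, num_clusters), dtype=int)
    for p, c in zip(prev_labels, curr_labels):
        overlap[p, c] += 1
    # Maximize total overlap = minimize its negation
    row_ind, col_ind = linear_sum_assignment(-overlap)
    mapping = {c: p for p, c in zip(row_ind, col_ind)}
    return np.array([mapping[c] for c in curr_labels])

prev = np.array([0, 0, 1, 1, 2, 2])
curr = np.array([2, 2, 0, 0, 1, 1])   # same partition, permuted indices
print(align_labels(prev, curr, 3))     # -> [0 0 1 1 2 2]
```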
Model Architecture
The model architecture of Deep Aligned is illustrated in the figure provided in the official TEXTOIR repository.
Results
The results obtained from the experiments are quite compelling. For more detailed results, check out known_intent_ratio_results.csv and k_results.csv. The performance of the different methods on the NMI, ARI, and ACC metrics is summarized in the following table:
| Method | CLINC NMI | CLINC ARI | CLINC ACC | BANKING NMI | BANKING ARI | BANKING ACC |
|---|---|---|---|---|---|---|
| KM | 70.89 | 26.86 | 45.06 | 54.57 | 12.18 | 29.55 |
| AG | 73.07 | 27.70 | 44.03 | 57.07 | 13.31 | 31.58 |
| SAE-KM | 73.13 | 29.95 | 46.75 | 63.79 | 22.85 | 38.92 |
| DEC | 74.83 | 27.46 | 46.89 | 67.78 | 27.21 | 41.29 |
| DCN | 75.66 | 31.15 | 49.29 | 67.54 | 26.81 | 41.99 |
| DAC | 78.40 | 40.49 | 55.94 | 47.35 | 14.24 | 27.41 |
| DeepCluster | 65.58 | 19.11 | 35.70 | 41.77 | 8.95 | 20.69 |
| PCK-means | 68.70 | 35.40 | 54.61 | 48.22 | 16.24 | 32.66 |
| BERT-KCL | 86.82 | 58.79 | 68.86 | 75.21 | 46.72 | 60.15 |
| BERT-MCL | 87.72 | 59.92 | 69.66 | 75.68 | 47.43 | 61.14 |
| CDAC+ | 86.65 | 54.33 | 69.89 | 72.25 | 40.97 | 53.83 |
| BERT-DTC | 90.54 | 65.02 | 74.15 | 76.55 | 44.70 | 56.51 |
| DeepAligned | 93.89 | 79.75 | 86.49 | 79.56 | 53.64 | 64.90 |
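If you want to reproduce these metrics for your own runs, NMI and ARI are available directly in scikit-learn, while clustering accuracy (ACC) requires a Hungarian matching because cluster indices are arbitrary. Here is a standard sketch (the toy labels are illustrative, not data from the experiments):

```python
# Computing the three table metrics for a clustering result. ACC needs a
# one-to-one matching between predicted clusters and true classes.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

def clustering_accuracy(y_true, y_pred):
    n = max(y_true.max(), y_pred.max()) + 1
    w = np.zeros((n, n), dtype=int)
    for t, p in zip(y_true, y_pred):
        w[t, p] += 1
    row, col = linear_sum_assignment(-w)  # best one-to-one label matching
    return w[row, col].sum() / len(y_true)

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])  # perfect partition, permuted labels
print(normalized_mutual_info_score(y_true, y_pred))  # 1.0
print(adjusted_rand_score(y_true, y_pred))           # 1.0
print(clustering_accuracy(y_true, y_pred))           # 1.0
```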
Troubleshooting
While everything should go smoothly, here are some troubleshooting ideas if you encounter hiccups:
- Environment Issues: Ensure your Anaconda environment is activated.
- Model Loading: Verify that the path to the BERT model in init_parameter.py is correct.
- Script Errors: Double-check the syntax in your shell script; minor typos can cause major roadblocks.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.