How to Use Doccano: Your Guide to Text Annotation

Jul 3, 2023 | Data Science

Doccano is an open-source text annotation tool for creating labeled data for tasks such as sentiment analysis, named entity recognition, and text summarization. Its collaborative, user-friendly interface lets a team build a dataset in hours. This guide walks you through installing Doccano and getting it running.

Getting Started with Doccano

To get started with Doccano, you have three primary options for installation:

  • pip (Python 3.8+)
  • Docker
  • Docker Compose

Installing Doccano Using pip

Follow these steps to install Doccano using pip:

pip install doccano
pip install 'doccano[postgresql]'  # Optional, for PostgreSQL support (quoted so shells such as zsh do not expand the brackets)

By default, Doccano uses SQLite 3 as its database. To use PostgreSQL instead, set the DATABASE_URL environment variable to match your PostgreSQL credentials. Once Doccano is installed, initialize the database, create an admin user, and start the web server:

doccano init
doccano createuser --username admin --password pass
doccano webserver --port 8000

In another terminal, start the task queue, which handles file upload and download:

doccano task

Now, access Doccano by navigating to http://127.0.0.1:8000.
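
For the PostgreSQL option mentioned above, here is a minimal sketch of how DATABASE_URL might be assembled. The user, password, host, port, and database name are placeholder assumptions, not values from this guide, and sslmode=disable is only reasonable for a local database:

```shell
# Placeholder credentials for illustration; replace them with your own.
POSTGRES_USER=doccano
POSTGRES_PASSWORD=doccano
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=doccano

# Assemble the connection URL and export it so child processes can read it.
export DATABASE_URL="postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@${POSTGRES_HOST}:${POSTGRES_PORT}/${POSTGRES_DB}?sslmode=disable"

# Print the result so you can verify it before initializing the database.
echo "$DATABASE_URL"
```

Export the variable in each terminal before running the doccano commands there, so that the web server and the task queue point at the same database.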

Installing Doccano Using Docker

If you prefer Docker, follow these steps:

docker pull doccano/doccano
docker container create --name doccano -e "ADMIN_USERNAME=admin" -e "ADMIN_EMAIL=admin@example.com" -e "ADMIN_PASSWORD=password" -v doccano-db:/data -p 8000:8000 doccano/doccano
docker container start doccano

Access Doccano at http://127.0.0.1:8000. To stop the container, use:

docker container stop doccano -t 5

Installing Doccano Using Docker Compose

To use Docker Compose, first clone the repository:

git clone https://github.com/doccano/doccano.git
cd doccano

Create a .env file in the repository root with the necessary settings, then run:

docker-compose -f docker/docker-compose.prod.yml --env-file .env up

Access Doccano at http://127.0.0.1.
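
As a rough sketch of the .env file mentioned above, the following writes a starter file in the repository root. The variable names are given as best recalled from the project's docker/.env.example and every value is a placeholder, so treat both as assumptions and compare against that file in your checkout before relying on them:

```shell
# Write a starter .env in the current directory (values are placeholders).
cat > .env <<'EOF'
# platform settings
ADMIN_USERNAME=admin
ADMIN_PASSWORD=password
ADMIN_EMAIL=admin@example.com

# rabbit mq settings
RABBITMQ_DEFAULT_USER=doccano
RABBITMQ_DEFAULT_PASS=doccano

# database settings
POSTGRES_USER=doccano
POSTGRES_PASSWORD=doccano
POSTGRES_DB=doccano
EOF
```

Change the passwords before exposing the deployment to anyone else, and keep the file out of version control.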

Understanding Doccano’s Features

Doccano offers a variety of features:

  • Collaborative annotation
  • Multi-language support
  • Mobile support
  • Emoji support
  • Dark theme
  • RESTful API

Troubleshooting and Support

If you run into any issues during setup or operation, here are some common troubleshooting tips:

  • Check Dependencies: Ensure all required packages and dependencies are correctly installed.
  • Database Setup: Verify your database configurations, especially after modifying the DATABASE_URL.
  • Container Issues: If the Docker container fails to start, ensure no other services are using port 8000.
  • Network Problems: Ensure your network settings allow Docker to access the internet so that it can pull the doccano/doccano image.
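
For the port conflict case above, one way to check whether something is already listening on port 8000 is sketched below. The check_port helper is hypothetical (it is not part of Doccano or Docker) and it relies on bash's /dev/tcp pseudo-device, so in other shells it will always report the port as free:

```shell
#!/usr/bin/env bash
# check_port PORT: report whether something accepts TCP connections on 127.0.0.1:PORT.
# Hypothetical helper for illustration; requires bash for the /dev/tcp redirection.
check_port() {
  # Try to open a connection in a subshell; the descriptor closes when the subshell exits.
  if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
    echo "port $1 is in use"
  else
    echo "port $1 is free"
  fi
}

check_port 8000
```

If the port is in use, either stop the other service or publish the container on a different host port, for example by passing -p 8080:8000 to docker container create and then browsing to http://127.0.0.1:8080.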


Wrap Up

Doccano is a practical tool for text annotation that helps teams speed up their data labeling. Its flexibility and collaborative features suit a wide range of projects.

