InvoiceNet is a tool that uses deep neural networks to extract information from invoice documents. It accepts PDF, JPG, and PNG invoices and offers both a graphical interface and command-line scripts for training models and extracting data. This guide walks through installation, data preparation, custom fields, training, and prediction.
Installation
Before extracting anything, you need to install InvoiceNet. The steps depend on your operating system.
Installing on Ubuntu 20.04
- First, open your terminal and run the following commands:
git clone https://github.com/naiveHobo/InvoiceNet.git
cd InvoiceNet
# Run installation script
./install.sh
source env/bin/activate
Installing on Windows 10
- For Windows users, it’s best to use Anaconda. Open your command prompt and execute:
git clone https://github.com/naiveHobo/InvoiceNet.git
cd InvoiceNet
# Create conda environment and activate
conda create --name invoicenet python=3.7
conda activate invoicenet
# Install InvoiceNet
pip install .
# Install poppler
conda install -c conda-forge poppler
Data Preparation
To train your custom models effectively, prepare your data as follows:
- Arrange your invoice files and their corresponding JSON label files in a single directory in this format:
train_data/
    invoice1.pdf
    invoice1.json
    nike-invoice.pdf
    nike-invoice.json
    12345.pdf
    12345.json
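The pairing convention above (every invoice file next to a JSON file with the same base name) is easy to break by accident, so it can help to check a directory before training. The following is a minimal sketch, not part of InvoiceNet: the function name `find_unpaired` is made up for illustration, and the extension set assumes only the PDF, JPG, and PNG formats mentioned in this article.

```python
from pathlib import Path

# Invoice formats named in this article; adjust if your setup differs.
INVOICE_EXTENSIONS = {".pdf", ".jpg", ".png"}


def find_unpaired(data_dir):
    """Return (invoices without labels, labels without invoices) as sorted file names."""
    files = [p for p in Path(data_dir).iterdir() if p.is_file()]
    # Index both kinds of file by base name so they can be compared.
    invoices = {p.stem: p.name for p in files if p.suffix.lower() in INVOICE_EXTENSIONS}
    labels = {p.stem: p.name for p in files if p.suffix.lower() == ".json"}
    missing_labels = sorted(invoices[s] for s in invoices.keys() - labels.keys())
    orphan_labels = sorted(labels[s] for s in labels.keys() - invoices.keys())
    return missing_labels, orphan_labels


# Example use (requires an existing directory):
#   missing, orphans = find_unpaired("train_data")
#   print("Invoices without a label file:", missing)
#   print("Label files without an invoice:", orphans)
```

Both lists should be empty for a directory laid out like the one shown above.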
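- Each JSON label file holds the field values for its invoice, for example:
{
  "vendor_name": "Nike",
  "invoice_date": "12-01-2017",
  "invoice_number": "R0007546449",
  "total_amount": "137.51"
}
Add any other fields you need in the same way.
Adding Custom Fields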
InvoiceNet supports adding custom fields to match your specific requirements:
- Edit the invoicenet/__init__.py file to define your fields.
- There are four predefined field types: general, optional, amount, and date. Here’s how you would add a field:
# Add lines like the following at the end of invoicenet/__init__.py
FIELDS["total_amount"] = FIELD_TYPES["amount"]
FIELDS["invoice_date"] = FIELD_TYPES["date"]
FIELDS["tax_id"] = FIELD_TYPES["optional"]
FIELDS["vendor_name"] = FIELD_TYPES["general"]
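A mistyped field type in these assignments only surfaces as a KeyError when the module is loaded. The sketch below shows one way to guard against that. It is self-contained and makes some assumptions: `FIELD_TYPES` and `FIELDS` here are stand-in dictionaries with placeholder values, not the ones shipped in invoicenet/__init__.py, and `register_field` is a hypothetical helper that does not exist in InvoiceNet.

```python
# Stand-ins for the dictionaries defined in invoicenet/__init__.py.
# The integer values are placeholders, not the library's actual values.
FIELD_TYPES = {"general": 0, "optional": 1, "amount": 2, "date": 3}
FIELDS = {}


def register_field(name, type_name):
    """Hypothetical helper: add a field, rejecting unknown field types."""
    if type_name not in FIELD_TYPES:
        valid = ", ".join(sorted(FIELD_TYPES))
        raise ValueError(f"Unknown field type {type_name!r}; expected one of: {valid}")
    FIELDS[name] = FIELD_TYPES[type_name]


register_field("total_amount", "amount")
register_field("tax_id", "optional")
```

With this guard, a call such as `register_field("total_amount", "money")` fails immediately with a message listing the four valid types.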
Using the GUI for Training and Extraction
InvoiceNet features a user-friendly GUI for training models and extracting information:
- To run the trainer GUI, execute:
python trainer.py
- To run the extractor GUI, execute:
python extractor.py
Using the CLI for Training and Prediction
For those who prefer command-line operations:
Training Your Model
- Prepare your data:
python prepare_data.py --data_dir train_data
- Train a model for one field:
python train.py --field enter-field-here --batch_size 8
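Because train.py takes a single --field value, training several fields means repeating the command once per field. A small wrapper can generate those invocations. This is a sketch under some assumptions: it is run from the InvoiceNet repository root with the environment activated and the data already prepared, it uses only the --field and --batch_size flags shown above, and the helper names `build_train_commands` and `run_commands` are made up for illustration.

```python
import subprocess
import sys


def build_train_commands(fields, batch_size=8):
    """Build one train.py invocation per field, using only the flags shown above."""
    return [
        [sys.executable, "train.py", "--field", field, "--batch_size", str(batch_size)]
        for field in fields
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the loop at the first failed training run.
            subprocess.run(command, check=True)


# Dry run: prints the commands without starting any training.
run_commands(build_train_commands(["vendor_name", "total_amount"]))
```

Passing `dry_run=False` would start the training runs one after another; keeping the dry run as the default makes it harder to launch a long job by accident.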
Prediction
To extract fields from invoices using your trained model:
- For a single invoice:
python predict.py --field enter-field-here --invoice path-to-invoice-file
- For all invoices in a directory:
python predict.py --field enter-field-here --data_dir predict_data
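The two predict.py forms differ only in whether the target is passed with --invoice or --data_dir. If you script predictions, that choice can be made from the path itself. The following is a minimal sketch: `build_predict_command` is a hypothetical helper, and it assumes that anything that is not an existing directory should be treated as a single invoice file.

```python
import sys
from pathlib import Path


def build_predict_command(field, target):
    """Use --data_dir for an existing directory and --invoice for anything else."""
    target = Path(target)
    flag = "--data_dir" if target.is_dir() else "--invoice"
    return [sys.executable, "predict.py", "--field", field, flag, str(target)]


# Example: build (but do not run) a command for one invoice file.
print(" ".join(build_predict_command("total_amount", "invoice1.pdf")))
```

The returned list can be passed to `subprocess.run` from the repository root once a model for that field has been trained.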
Troubleshooting
If you encounter any issues during installation or usage, consider the following troubleshooting tips:
- Ensure all dependencies are installed correctly.
- Check that your CUDA, cuDNN, and TensorFlow versions are compatible with each other and with your operating system.
- If you’re unable to prepare your data or train your model, revisit the data preparation steps.
- For specific issues, consult the community or reach out via email if you have a dataset to share: sarthakmittal2608@gmail.com.
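The first tip, confirming that dependencies are installed, can be partly automated. The sketch below checks whether Python modules can be located and whether a command-line tool is on the PATH. It rests on several assumptions: the module names `tensorflow` and `invoicenet` and the poppler binary name `pdftoppm` are guesses at what a working install exposes, the function `check_environment` is made up for illustration, and the check says nothing about CUDA or cuDNN versions.

```python
import importlib.util
import shutil
import sys


def check_environment(modules=("tensorflow", "invoicenet"), tools=("pdftoppm",)):
    """Return a dict mapping each check to True (found) or False (missing)."""
    report = {}
    for module in modules:
        # find_spec locates a top-level module without importing it.
        report[f"module:{module}"] = importlib.util.find_spec(module) is not None
    for tool in tools:
        # shutil.which searches PATH the way the shell would.
        report[f"tool:{tool}"] = shutil.which(tool) is not None
    return report


print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for name, ok in check_environment().items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

Run it inside the activated environment; a MISSING entry points at the installation step to repeat.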
Conclusion
InvoiceNet extracts structured fields from PDF, JPG, and PNG invoices and lets you add custom fields to match your own documents. At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

