Welcome to the world of artificial intelligence, where we explore Deep Reinforcement Learning (DRL) and its application to building a dynamic recommender system. This blog outlines the implementation of a DRL-based recommender system inspired by the paper Deep Reinforcement Learning based Recommendation with Explicit User-Item Interactions Modeling by Liu et al., using the DDPG algorithm along the way. Let's embark on this journey!
Dataset
We will be working with the MovieLens 1M Dataset. Before diving into the implementation, ensure you unzip the ml-1m.zip file to access the data.
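Once the archive is extracted, the ratings can be loaded with pandas (assumed to be available alongside the listed requirements). MovieLens 1M uses `::` as its field separator, so pandas needs the python engine; the path below assumes the default extraction layout:

```python
import pandas as pd

def load_ratings(path="ml-1m/ratings.dat"):
    """Load MovieLens 1M ratings; fields are '::'-separated:
    UserID::MovieID::Rating::Timestamp."""
    return pd.read_csv(
        path,
        sep="::",
        engine="python",  # the multi-char separator requires the python engine
        names=["user_id", "movie_id", "rating", "timestamp"],
    )
```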
Procedure
Our goal is to enhance the performance of the RL-based recommender system through various methods:
- Utilizing the actor network with an embedding layer
- Mitigating the overestimation of Q values
- Incorporating several pretrained embeddings
- Applying Prioritized Experience Replay (PER) as outlined in this paper
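To make the overestimation point concrete, here is a minimal numpy sketch of two standard countermeasures: slowly tracking target networks (Polyak averaging, as in DDPG) and a clipped double-critic TD target (a TD3-style technique, which is our assumption about how the clipping is done, not a claim about the paper's exact method):

```python
import numpy as np

def soft_update(target_w, online_w, tau=0.005):
    """Polyak-average online weights into the target weights (DDPG-style)."""
    return [tau * w + (1.0 - tau) * tw for w, tw in zip(online_w, target_w)]

def clipped_td_target(rewards, dones, q1_next, q2_next, gamma=0.99):
    """TD target using the minimum of two target-critic estimates
    to damp Q-value overestimation (TD3-style clipped double-Q)."""
    q_next = np.minimum(q1_next, q2_next)
    return rewards + gamma * (1.0 - dones) * q_next
```

The slow-moving target networks keep the bootstrap target stable, while taking the minimum over two critics biases the estimate downward, counteracting the max-induced overestimation.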
Additionally, we will create new embedding files: the previous embeddings were built from entire interaction timelines, which could leak future information into training and mislead the model. The training and evaluation processes will also be updated to fit our particular implementation, diverging from the original procedures described in the paper.
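The PER scheme mentioned above can be sketched as follows. This is a simplified array-based version (the class name and structure are illustrative, not the project's actual code); a production implementation would use a sum tree for O(log n) sampling:

```python
import numpy as np

class SimplePER:
    """Proportional Prioritized Experience Replay (array-based sketch)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.buffer, self.priorities = [], []

    def add(self, transition):
        # New transitions get max priority so they are sampled at least once.
        p = max(self.priorities, default=1.0)
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append(p)

    def sample(self, batch_size, beta=0.4):
        prios = np.asarray(self.priorities) ** self.alpha
        probs = prios / prios.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the non-uniform sampling bias.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.buffer[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        for i, e in zip(idx, td_errors):
            self.priorities[i] = abs(e) + self.eps
```

After each critic update, the sampled transitions' priorities are refreshed with their new absolute TD errors, so surprising transitions are replayed more often.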
Results
After training and evaluation, the full results are available in the following report: Experiment Report (Korean).
Here is a snapshot of our evaluation metrics:
- precision@5: 0.479
- ndcg@5: 0.471
- precision@10: 0.444
- ndcg@10: 0.429
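For reference, these two metrics can be computed with binary relevance as follows (a standard formulation, sketched here for clarity; function names are our own):

```python
import numpy as np

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    topk = recommended[:k]
    return sum(1 for item in topk if item in relevant) / k

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance nDCG: DCG of the top-k list over the ideal DCG."""
    dcg = sum(1.0 / np.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0
```

Unlike precision, nDCG rewards placing relevant items near the top of the list, which is why the two metrics can diverge at the same cutoff.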
Usage
Training
To kick-start the training process, simply run the following command:
python train.py
This will save models for both the actor and the critic once training completes successfully.
Evaluation
Make sure that the saved models are located in the correct directory. Next, launch Jupyter Notebook and execute the evaluation.ipynb file to check the evaluation results.
Requirements
For smooth sailing, make sure you have the following libraries installed:
- tensorflow==2.5.0
- scikit-learn==0.23.2
- matplotlib==3.3.3
Troubleshooting
If you encounter any issues during the setup or implementation, consider the following troubleshooting ideas:
- Ensure that all dependencies are correctly installed according to the versions specified.
- Verify that your dataset is properly extracted, and the directories contain the necessary files.
- Check paths in your Jupyter Notebook to ensure they point to the right model directories.
- If results appear aberrant, experiment with different hyperparameters to observe changes in performance.
For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.
Conclusion
To make our approach concrete with an analogy, think of the recommender system as a keen librarian who learns your preferences every time you visit. Initially, the librarian guesses based on past library visits. With each book you borrow, a deeper understanding of your taste develops, until the librarian can precisely curate your next reading list. This is what DRL does: it continuously learns from your interactions to optimize recommendations.
At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.

