This guide introduces ChunJun, a distributed data integration framework built on Apache Flink that handles data synchronization and computation across heterogeneous data sources. It covers building the project, fixing a common build error, and running your first tasks in local and standalone mode.
What is ChunJun?
Originally named FlinkX, the project was rebranded as ChunJun on February 22, 2022. It has been deployed at thousands of businesses to move data between otherwise incompatible data sources. The official ChunJun website offers further resources.
Key Features of ChunJun
- Real-time computing capability via the Flink engine, utilizing JSON templates and SQL scripts.
- Supports a variety of submission methods, including flink-standalone, yarn-session, and yarn-per-job.
- Docker one-click deployment and compatibility with Kubernetes.
- Integrates seamlessly with over 20 heterogeneous data sources like MySQL, Oracle, and Hive.
- Supports full and incremental synchronization, as well as offline and real-time processing.
- Provides dirty data storage and metrics monitoring.
- Ensures robustness with Flink’s checkpoint mechanism for disaster recovery.
Building and Compiling ChunJun
To kickstart your journey, you need to clone the ChunJun repository and build it. Here’s how:
Step 1: Get the Code
Use the following command to clone the ChunJun repository:
git clone https://github.com/DTStack/chunjun.git
Step 2: Build the Project
Navigate to the project directory and execute the following command:
./mvnw clean package
Alternatively, you can use the build script:
sh build.sh
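The Maven wrapper needs a JDK on the PATH, and cloning needs git. As a minimal sketch, the helper below checks for required tools before you start a long build. The function names `require_cmd` and `preflight` are hypothetical and not part of ChunJun, and the tool list you pass is an assumption about your environment.

```shell
#!/bin/sh
# Hypothetical preflight helper; not shipped with ChunJun.

# require_cmd NAME: succeed if NAME is on PATH, otherwise report it and fail.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing required command: $1" >&2
    return 1
}

# preflight NAME...: check every tool and fail if any one of them is missing,
# so all missing tools are reported in a single run.
preflight() {
    missing=0
    for tool in "$@"; do
        require_cmd "$tool" || missing=1
    done
    return "$missing"
}
```

A typical use would be `preflight git java && ./mvnw clean package`, which skips the build when a tool is missing.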
Troubleshooting Common Issues
During your setup, you may encounter some common issues. One notable error is:
[ERROR] Failed to read artifact descriptor for com.google.errorprone:javac-shaded
This occurs when Maven cannot download the javac-shaded artifact from its configured repositories. To resolve it:
- Download javac-shaded-9+181-r4173-1.jar from a Maven repository.
- Then install it locally using:
mvn install:install-file -DgroupId=com.google.errorprone -DartifactId=javac-shaded -Dversion=9+181-r4173-1 -Dpackaging=jar -Dfile=path_to_javac-shaded-9+181-r4173-1.jar
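The install command above fails with a less obvious message if the path to the jar is wrong. The sketch below wraps it in a hypothetical function, `install_javac_shaded`, that checks the file first. The `DRY_RUN` variable is also an assumption of this example: when set to 1, the command is printed instead of executed. Only the `mvn install:install-file` arguments come from the step above.

```shell
#!/bin/sh
# Hypothetical wrapper; only the mvn arguments come from the guide.

# install_javac_shaded JAR: install the downloaded jar into the local Maven repo.
# Set DRY_RUN=1 to print the command instead of running it.
install_javac_shaded() {
    jar=$1
    if [ ! -f "$jar" ]; then
        echo "jar not found: $jar" >&2
        return 1
    fi
    # Build the command as positional parameters so the path stays one argument.
    set -- mvn install:install-file \
        -DgroupId=com.google.errorprone \
        -DartifactId=javac-shaded \
        -Dversion=9+181-r4173-1 \
        -Dpackaging=jar \
        "-Dfile=$jar"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running it once with `DRY_RUN=1` lets you confirm the resolved path before Maven touches your local repository.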
Understanding ChunJun with an Analogy
Think of ChunJun as a skilled maestro in a symphony orchestra. Just as a maestro synchronizes different instruments to create harmonious music, ChunJun orchestrates data from various sources, ensuring that they work together seamlessly. The database readers, writers, and lookup plugins function as different instruments, each playing its part. The trumpet might represent MySQL, while the cello can embody Oracle; together under ChunJun’s direction, they produce a beautiful data melody without missing a beat.
Quick Start: Running Tasks with ChunJun
ChunJun supports multiple run modes to cater to varying needs. Here’s a brief overview of how to start tasks in different environments:
Local Mode
Local mode requires no external dependencies. With $SCRIPT_PATH pointing to your job script, execute:
sh bin/chunjun-local.sh -job $SCRIPT_PATH
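If `$SCRIPT_PATH` is empty or points to a missing file, the local-mode command above receives a bad `-job` argument. A minimal sketch of a guard follows. The function `run_local` and the `DRY_RUN` variable are hypothetical, and the sketch assumes you run it from the directory that contains `bin/`.

```shell
#!/bin/sh
# Hypothetical wrapper around bin/chunjun-local.sh; not part of ChunJun.

# run_local JOB_FILE: validate the job file, then start a local-mode task.
# Set DRY_RUN=1 to print the command instead of running it.
run_local() {
    job=${1:-}
    if [ -z "$job" ]; then
        echo "usage: run_local <job-file>" >&2
        return 2
    fi
    if [ ! -r "$job" ]; then
        echo "job file not readable: $job" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sh bin/chunjun-local.sh -job $job"
    else
        sh bin/chunjun-local.sh -job "$job"
    fi
}
```

Called as `run_local "$SCRIPT_PATH"`, it stops early with a clear message when the variable is unset or the file is missing.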
Standalone Mode
For this mode, first copy ChunJun jars to Flink’s lib directory:
cp -r chunjun-dist $FLINK_HOME/lib
Then, start the Flink cluster and submit your task:
sh bin/chunjun-standalone.sh -job chunjun-examples/json/stream/stream.json
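One caveat with the copy step in this mode: if `FLINK_HOME` is unset, the shell expands `$FLINK_HOME/lib` to `/lib`, so `cp` targets a system directory instead of your Flink installation. The sketch below guards against that. The function name `stage_chunjun_dist` is hypothetical, and the default source directory `chunjun-dist` is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper; not part of ChunJun or Flink.

# stage_chunjun_dist [SRC]: copy SRC (default: chunjun-dist) into $FLINK_HOME/lib,
# refusing to run when FLINK_HOME is unset or either directory is missing.
stage_chunjun_dist() {
    src=${1:-chunjun-dist}
    if [ -z "${FLINK_HOME:-}" ]; then
        echo "FLINK_HOME is not set" >&2
        return 2
    fi
    if [ ! -d "$FLINK_HOME/lib" ]; then
        echo "not a directory: $FLINK_HOME/lib" >&2
        return 1
    fi
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    cp -r "$src" "$FLINK_HOME/lib"
}
```

After it succeeds, the jars sit under `$FLINK_HOME/lib/chunjun-dist`, which is the same result as the plain `cp -r` command.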
Resources for Connectors
For detailed documentation on connectors, visit ChunJun Documentation.
Conclusion
ChunJun stands out as a reliable framework for data synchronization, equipped with various features to meet your needs.

