Welcome to my article on building a MySQL to S3 pipeline using Vanus! In this post, we'll explore how to set up a reliable and scalable pipeline that exports data from your MySQL database to Amazon S3 using Vanus, a powerful data integration platform.
If you’re using MySQL to store critical data for your business, it’s important to ensure that you have a backup and archiving solution in place. Storing data in S3 provides many benefits, including durability, scalability, and cost-effectiveness. With Vanus, you can easily set up a pipeline to extract data from your MySQL database and write it to S3.
By the end of this blog post, you’ll have a solid understanding of how to set up a MySQL to S3 pipeline using Vanus, and you’ll be equipped with the tools and knowledge you need to ensure that your data is backed up and archived in a reliable and scalable manner. So let’s dive in!
Using Vanus, we can automatically deliver changes from a MySQL database to an S3 bucket without writing a single line of code.
Amazon S3 (Simple Storage Service) is a cloud-based object storage service provided by Amazon Web Services (AWS). S3 provides a scalable and secure way to store and retrieve data from anywhere on the internet. It is designed to be highly available and durable, with an SLA that promises 99.999999999% (11 nines) of object durability.
S3 enables users to store and retrieve any amount of data, from anywhere in the world, through a simple web interface, APIs or AWS CLI. S3 stores data as objects, which are organized into buckets. A bucket is a container for objects, and you can think of it as a folder in a file system. Each object in S3 has a unique key, which identifies the object within the bucket.
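As a quick illustration of the bucket/object/key model, here is how creating a bucket and uploading an object look with the AWS CLI; the bucket and file names below are placeholders, not part of the pipeline we build later:

# create a bucket (the container for objects)
aws s3 mb s3://my-example-bucket
# upload a local file; backups/users.sql becomes the object's key within the bucket
aws s3 cp users.sql s3://my-example-bucket/backups/users.sql
# list objects under a key prefix
aws s3 ls s3://my-example-bucket/backups/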
S3 supports various features such as versioning, lifecycle policies, access control, security and compliance, and data transfer acceleration. S3 can also integrate with other AWS services such as Amazon Elastic Compute Cloud (EC2), AWS Lambda, AWS Glue, and more, to build powerful data processing and analysis pipelines.
S3 is widely used for a variety of use cases such as data backup and archive, big data analytics, media storage and distribution, and mobile and web application hosting, among others. S3 offers a pay-as-you-go pricing model, which makes it a cost-effective solution for businesses of all sizes.
There are several reasons to back up your MySQL database to Amazon S3: S3's eleven-nines durability protects you against data loss, storage scales with your data under a pay-as-you-go pricing model, an off-site copy supports disaster recovery, and backups stored in S3 can be picked up by other AWS services such as AWS Lambda and AWS Glue for further processing and analysis.
Overall, backing up your MySQL database into Amazon S3 provides a reliable and scalable solution for protecting your critical data, while also enabling you to leverage the power of AWS services to process and analyze your data in the cloud.
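For comparison, backing up MySQL to S3 by hand usually means scripting something like the sketch below (mysqldump plus the AWS CLI, with placeholder database and bucket names) and scheduling it yourself; the Vanus pipeline described next handles the extraction and delivery for you:

# dump the database to a local file (placeholder names)
mysqldump -u root -p my_database > my_database.sql
# copy the dump to S3
aws s3 cp my_database.sql s3://my-backup-bucket/mysql/my_database.sql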
Setting up a MySQL to S3 pipeline with Vanus takes three steps: deploy Vanus on Kubernetes, start the MySQL source connector, and start the S3 sink connector.
By following these three steps, you can set up a MySQL to S3 pipeline using Vanus and begin copying your data from MySQL to S3. Once the pipeline is set up, you can automate the data transfer process and customize it according to your specific business needs.
Install Vanus by typing the following command:
kubectl apply -f https://download.linkall.com/all-in-one/v0.6.0.yml
Verify that Vanus has been deployed successfully:
$ watch -n2 kubectl get po -n vanus
NAME                             READY   STATUS    RESTARTS   AGE
vanus-controller-0               1/1     Running   0          96s
vanus-controller-1               1/1     Running   0          72s
vanus-controller-2               1/1     Running   0          69s
vanus-gateway-8677fc868f-rmjt9   1/1     Running   0          97s
vanus-store-0                    1/1     Running   0          96s
vanus-store-1                    1/1     Running   0          68s
vanus-store-2                    1/1     Running   0          68s
vanus-timer-5cd59c5bf-hmprp      1/1     Running   0          97s
vanus-timer-5cd59c5bf-pqkd5      1/1     Running   0          97s
vanus-trigger-7685d6cc69-8jgsl   1/1     Running   0          97s
Install vsctl (the command line tool).
curl -O https://download.linkall.com/vsctl/latest/linux-amd64/vsctl
chmod ug+x vsctl
mv vsctl /usr/local/bin
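A quick way to confirm vsctl is installed and on your PATH is to print its version:

vsctl version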
Set the endpoint for vsctl.
export VANUS_GATEWAY=192.168.49.2:30001
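The address above comes from the playground setup; on your own cluster you can look up the gateway service and its NodePort with a standard kubectl query:

kubectl get svc -n vanus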
Create an Eventbus to store your events.
$ vsctl eventbus create --name mysql-s3-scenario
+----------------+-------------------+
|     RESULT     |     EVENTBUS      |
+----------------+-------------------+
| Create Success | mysql-s3-scenario |
+----------------+-------------------+
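You can confirm the Eventbus exists by listing all Eventbuses:

vsctl eventbus list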
Create a MySQL user with the privileges the MySQL source connector needs:

CREATE USER 'vanus'@'%' IDENTIFIED WITH mysql_native_password BY 'PASSWORD';

GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'vanus'@'%';
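Optionally, you can sanity-check the new user and confirm that binary logging is enabled (the REPLICATION SLAVE and REPLICATION CLIENT privileges above imply the connector reads change events from the MySQL binlog). These are standard MySQL statements:

SHOW GRANTS FOR 'vanus'@'%';
SHOW VARIABLES LIKE 'log_bin';   -- should report ON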
Create the configuration file for the MySQL source connector:

cat << EOF > config.yml
target: http://192.168.49.2:30002/gateway/mysql-s3-scenario   # Vanus in Playground
name: "quick_start"
db:
  host: "MYSQL_HOST"       # IP address of the MySQL server
  port: MYSQL_PORT         # port of the MySQL server
  username: "vanus"        # username
  password: "PASSWORD"     # password previously set
database_include: [ "<DATABASE_NAME>" ]   # the name of your database
# format is vanus_test.tableName
table_include: [ "TABLE_NAME" ]           # the name of your table
store:
  type: FILE
  pathname: "/vanus-connect/data/offset.dat"
db_history_file: "/vanus-connect/data/history.dat"
EOF
Run the MySQL source connector with Docker (from the directory containing config.yml):

docker run -it --rm --network=host \
  -v ${PWD}:/vanus-connect/config \
  -v ${PWD}:/vanus-connect/data \
  --name source-mysql public.ecr.aws/vanus/connector/source-mysql &
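The connector runs in the background (note the trailing &). If events do not show up later, you can inspect its output with a standard Docker command:

docker logs -f source-mysql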
Create the Kubernetes manifest for the S3 sink connector:

cat << EOF > s3-sink.yml
apiVersion: v1
kind: Service
metadata:
  name: sink-aws-s3
  namespace: vanus
spec:
  selector:
    app: sink-aws-s3
  type: ClusterIP
  ports:
    - port: 8080
      name: sink-aws-s3
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: sink-aws-s3
  namespace: vanus
data:
  config.yml: |-
    port: 8080
    aws:
      access_key_id: AKIAIOSFODNN7EXAMPLE
      secret_access_Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    region: "us-west-2"
    bucket: "mybucket"
    scheduled_interval: 10
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sink-aws-s3
  namespace: vanus
  labels:
    app: sink-aws-s3
spec:
  selector:
    matchLabels:
      app: sink-aws-s3
  replicas: 1
  template:
    metadata:
      labels:
        app: sink-aws-s3
    spec:
      containers:
        - name: sink-aws-s3
          image: public.ecr.aws/vanus/connector/sink-aws-s3
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
          volumeMounts:
            - name: config
              mountPath: /vanus-connect/config
      volumes:
        - name: config
          configMap:
            name: sink-aws-s3
EOF
Open s3-sink.yml in a text editor such as vi, replace the example AWS credentials, region, and bucket name with your own values, and save the file before exiting.
vi s3-sink.yml
To save and exit vi, press ESC, then type :wq and press Enter.
Apply the manifest:

kubectl apply -f s3-sink.yml
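You can verify the sink pod started correctly; this is a plain kubectl query on the app label defined in the manifest above:

kubectl get pods -n vanus -l app=sink-aws-s3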
Here is a short recap of the three steps involved in setting up a MySQL to S3 pipeline using Vanus:
Step 1: Deploy Vanus - Use the playground's all-in-one manifest to install Vanus on Kubernetes.
Step 2: Start MySQL Connector - Configure and start the MySQL connector by providing the necessary database connection details, such as the hostname, port, username, and password, and configuring any necessary data filters.
Step 3: Start S3 Connector - Configure and start the S3 connector by providing the S3 bucket details, such as the bucket name and region, and configuring any necessary data transforms.
To connect the Vanus Eventbus to the S3 Sink, you will need to create a subscription that specifies the Eventbus and the target Sink. This is done with vsctl, the Vanus command-line tool.
Here is the command to create a subscription:
vsctl subscription create \
  --name quick-start \
  --eventbus mysql-s3-scenario \
  --sink 'http://sink-aws-s3:8080'
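To confirm the subscription was created, you can list existing subscriptions:

vsctl subscription list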
Learn more about the subscription specification for filtering and transformation here.
Let's test everything by inserting a row into MySQL (replace DATABASE and TABLE with the database and table you configured for the connector):

INSERT INTO DATABASE.TABLE VALUES ();
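For example, assuming a hypothetical vanus_test.user table (the table name and columns here are placeholders, not part of the setup above):

-- placeholder table and row, purely for illustration
CREATE TABLE vanus_test.user (id INT PRIMARY KEY, name VARCHAR(50));
INSERT INTO vanus_test.user VALUES (1, 'alice');

Once the sink's scheduled_interval (10 seconds in the configuration above) has elapsed, the change event should land as an object in the bucket, which you can confirm with the AWS CLI:

aws s3 ls s3://mybucket/ --recursive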
In conclusion, the pipeline from MySQL to AWS S3 using Vanus is an efficient and streamlined solution for storing and managing data. The use of Vanus allows for easy integration with various cloud services and APIs, making it an ideal choice for developers and data analysts alike. By leveraging the power of Vanus, organizations can improve their data management and analysis capabilities, ultimately leading to better business outcomes. With the right implementation and configuration, this pipeline can unlock a wealth of insights from the data generated by MySQL databases.
If you have any questions about backing up MySQL to S3, feel free to ask and we will be happy to answer them.