

Image Source: https://webassets.mongodb.com/_com_assets/cms/mongodbkafka-hblts5yy33.png

Introduction

On a project I recently worked on, I had to sync data between two databases, one in a physical location and the other in the cloud. While looking for solutions, I had to take internet outages into consideration: how do I ensure my data is replicated when the internet comes back up, without losing anything? That’s when I came across Kafka.

Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

In this tutorial, I’ll demonstrate how to create a pipeline to sync data between two MongoDB clusters. We will create two cluster deployments (replica sets) and sync data from one to the other. We will also touch a bit on Kafka topics and the MongoDB Kafka connector. You can find the code and config files on Github to follow along.

Glossary

  • Kafka – Kafka is a distributed streaming platform which can be used to publish and subscribe to streams of records. It can also store streams of data and process them in real-time.
  • MongoDB – MongoDB is a general-purpose, document-based, distributed database built for modern application developers and for the cloud era.
  • Change Streams – Change streams allow applications to access real-time data changes in MongoDB without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them.

Prerequisites

I’ll be using the NodeJS driver for MongoDB in this project, so you’ll need some knowledge of the following to follow along.

  • NodeJS
  • MongoDB

Setting up MongoDB

For us to share data properly between our databases, we’ll be creating each MongoDB deployment as a cluster. You can do this either by creating a sharded cluster or a replica set. This is important because Kafka uses MongoDB’s change stream feature to listen for change events, and change streams only work with replica sets or sharded clusters. You can read more on change streams here. For this tutorial, I created a database cluster using a replica set.
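If you’d rather host the source database on-premise, as in the scenario from the introduction, a minimal sketch of a local single-node replica set looks like this (assuming mongod is installed and the /data/rs0 directory is writable; adjust the paths to your machine):

> mkdir -p /data/rs0
> mongod --replSet rs0 --port 27017 --dbpath /data/rs0
# in a second terminal, initialize the replica set once mongod is up
> mongo --eval "rs.initiate()"

The rest of the tutorial uses MongoDB Atlas, which provisions its free-tier clusters as replica sets out of the box.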

MongoDB Atlas Setup

To quickly set up a MongoDB cluster, go to https://cloud.mongodb.com and create two free clusters. Make sure you sign up first if you do not already have a MongoDB Atlas account. Then follow the steps below to create a cluster.

Step 1

mongodb setup
Create a new project on Atlas

First, you’ll need to create a project in which you will house your clusters. After creating your project, go ahead and create a new cluster in the next screen as shown below. Remember to give your new cluster a descriptive name.

Step 2

Click to create a new cluster in your project

MongoDB Atlas comes with a variety of plans, including a free plan which we can use for this tutorial. If you feel you need higher specifications, you can go for a higher plan. The next image shows you how to select a plan.

mongodb step 2
Select your plan and click Create Cluster to complete

Step 3

It takes about a minute or two for the cluster to be up and running. When it’s ready, you can configure your connection settings. As shown in the following image, click on Connect to configure the connection settings for your database.

 step3 -1
New MongoDB cluster – click connect

The modal that pops up after we click on connect gives us the opportunity to set access control and user credentials for our new cluster.

mongodb step3 -2
Configure your connection settings.

For the purpose of this tutorial, set your IP address to accept inbound connections from all IPs by setting it to 0.0.0.0/0. Next, create a user and password for your cluster, which you’ll use to connect to your instance. After creating the user and setting the IP address, go ahead and click on choose a connection method.

Step 4

mongodb step 4
View and collect your cluster URL

Click on Connect your application to view your connection string. Then, copy the connection string and keep it for later. Replace <dbUsername> and <dbPassword> with the username and password you created.

Follow the steps above to create the second cluster, and save its connection string as well; we will use it when connecting with Kafka.

Download and Setup Kafka

After setting up our MongoDB clusters, let’s install Kafka on our server to handle communication between the two clusters. First, download Kafka from the official website: click here to download Kafka. Download the 2.5.0 release and un-tar it in your terminal with the following commands on Unix systems, then navigate into the Kafka directory.

> tar -xzf kafka_2.12-2.5.0.tgz
> cd kafka_2.12-2.5.0

Note: Kafka requires the Java JDK to be installed on your machine in order to work.
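If you’re not sure whether a JDK is available, you can check from the terminal:

> java -version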

Configure Kafka and Zookeeper

ZooKeeper is a centralized service for maintaining configuration information, naming, and providing distributed synchronization and group services. Kafka uses ZooKeeper to manage configuration data.

Inside the Kafka directory, we find three main directories: bin, config, and libs. The bin directory contains all the scripts needed to run Kafka, the config directory contains all the necessary config files, and the libs directory contains .jar packages.

First, create a data directory inside the Kafka root directory to house both Kafka and Zookeeper data, then reference the newly created folder in the Kafka and Zookeeper config files. Navigate to the config directory and open the server.properties file with your favorite command line text editor or gedit. Look for the Log Basics section and edit the log.dirs line to point to your Kafka data directory.

############################# Log Basics #############################

# A comma separated list of directories under which to store log files.
log.dirs=/home/mcdavid/kafka_2.12-2.4.1/data/kafka
# The default number of log partitions per topic. More partitions allow greater

Also, do the same for the Zookeeper config file zookeeper.properties: find and edit the dataDir line with the path to your Zookeeper data directory as shown below.

# the directory where the snapshot is stored.
dataDir=/home/mcdavid/kafka_2.12-2.4.1/data/zookeeper
# the port at which the clients will connect

To run Kafka you need to start Zookeeper first. To do this, run the following command from inside your Kafka installation root directory: `bin/zookeeper-server-start.sh <path-to-zookeeper-config-file>`, as shown below.

 ~/kafka_2.12-2.4.1> bin/zookeeper-server-start.sh ~/kafka_2.12-2.4.1/config/zookeeper.properties


[2020-05-23 10:21:18,359] INFO Reading configuration from: /home/mcdavid/kafka_2.12-2.4.1/config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-05-23 10:21:18,410] INFO clientPortAddress is 0.0.0.0:2181 (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2020-05-23 10:21:18,411] INFO secureClientPort is not set (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
..........

Then, start the Kafka server with the command bin/kafka-server-start.sh <path-to-kafka-config-file> as shown below.

~/kafka_2.12-2.4.1> bin/kafka-server-start.sh ~/kafka_2.12-2.4.1/config/server.properties

[2020-05-23 11:31:32,588] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$)
[2020-05-23 11:31:34,674] INFO Registered signal handlers for TERM, INT, HUP (org.apache.kafka.common.utils.LoggingSignalHandler)
[2020-05-23 11:31:34,677] INFO starting (kafka.server.KafkaServer)
[2020-05-23 11:31:34,680] INFO Connecting to zookeeper on localhost:2181 (kafka.server.KafkaServer)
[2020-05-23 11:31:34,796] INFO [ZooKeeperClient Kafka server] Initializing a new session to localhost:2181. (kafka.zookeeper.ZooKeeperClient)

To shut down Kafka, stop the Kafka broker before Zookeeper so you don’t lose data.
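Kafka ships with stop scripts for this, so a graceful shutdown from the Kafka root directory looks roughly like this:

> bin/kafka-server-stop.sh
> bin/zookeeper-server-stop.sh

Next, let’s touch briefly on MongoDB change streams.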

MongoDB Change Streams

MongoDB’s Kafka connector uses change streams to listen for changes on a MongoDB cluster, database, or collection.

According to the MongoDB change streams docs, change streams allow applications to access real-time data changes without the complexity and risk of tailing the oplog. Applications can use change streams to subscribe to all data changes on a single collection, a database, or an entire deployment, and immediately react to them. One thing to take note of is that change streams only work with replica sets or sharded clusters.

Connect to MongoDB Database

Let’s demonstrate how change streams work with a little example. We will be using mongodb, the NodeJS driver for MongoDB.

First, create a new directory mongodb-kafka-example and cd into it. Then run npm init -y as shown below.

> mkdir mongodb-kafka-example
> cd mongodb-kafka-example
/mongodb-kafka-example> npm init -y

Next, we’ll install our mongodb package from NPM.

> npm install -S mongodb

After the installation is complete, create a new file called changeStreamData.js and paste the following code in it.

/**
 * This script can be used to create, update, and delete sample data.
 * This script is especially helpful when testing change streams.
 */
const { MongoClient } = require('mongodb');

async function main() {
  /**
   * Connection URI. Update <username>, <password>, and <your-cluster-url> to reflect your cluster.
   * See http://bit.ly/NodeDocs_lauren for more details
   */
  const uri = "mongodb+srv://dbUser:dbUserPassword@cluster0-jj6uu.mongodb.net/test?retryWrites=true&w=majority";

  /**
   * The Mongo Client you will use to interact with your database
   * See bit.ly/Node_MongoClient for more details
   */
  const client = new MongoClient(uri);

  try {
    // Connect to the MongoDB cluster
    await client.connect();

    // Make the appropriate DB calls
    const operaHouseViews = await createListing(client, {
      name: "Opera House Views",
      summary: "Beautiful apartment with views of the iconic Sydney Opera House",
      property_type: "Apartment",
      bedrooms: 1,
      bathrooms: 1,
      beds: 1,
      address: {
        market: "Sydney",
        country: "Australia"
      }
    });

    const privateRoomInLondon = await createListing(client, {
      name: "Private room in London",
      property_type: "Apartment",
      bedrooms: 1,
      bathroom: 1
    });

    const beautifulBeachHouse = await createListing(client, {
      name: "Beautiful Beach House",
      summary: "Enjoy relaxed beach living in this house with a private beach",
      bedrooms: 4,
      bathrooms: 2.5,
      beds: 7,
      last_review: new Date()
    });

    await updateListing(client, operaHouseViews, { beds: 2 });

    await updateListing(client, beautifulBeachHouse, {
      address: {
        market: "Sydney",
        country: "Australia"
      }
    });

    const italianVilla = await createListing(client, {
      name: "Italian Villa",
      property_type: "Entire home/apt",
      bedrooms: 6,
      bathrooms: 4,
      address: {
        market: "Cinque Terre",
        country: "Italy"
      }
    });

    const sydneyHarbourHome = await createListing(client, {
      name: "Sydney Harbour Home",
      bedrooms: 4,
      bathrooms: 2.5,
      address: {
        market: "Sydney",
        country: "Australia"
      }
    });

    await deleteListing(client, sydneyHarbourHome);
  } finally {
    // Close the connection to the MongoDB cluster
    await client.close();
  }
}

main().catch(console.error);

/**
 * Create a new Airbnb listing
 * @param {MongoClient} client A MongoClient that is connected to a cluster with the sample_airbnb database
 * @param {Object} newListing The new listing to be added
 * @returns {String} The id of the new listing
 */
async function createListing(client, newListing) {
  // See http://bit.ly/Node_InsertOne for the insertOne() docs
  const result = await client.db("sample_airbnb").collection("listingsAndReviews").insertOne(newListing);
  console.log(`New listing created with the following id: ${result.insertedId}`);
  return result.insertedId;
}

/**
 * Update an Airbnb listing
 * @param {MongoClient} client A MongoClient that is connected to a cluster with the sample_airbnb database
 * @param {String} listingId The id of the listing you want to update
 * @param {object} updatedListing An object containing all of the properties to be updated for the given listing
 */
async function updateListing(client, listingId, updatedListing) {
  // See http://bit.ly/Node_updateOne for the updateOne() docs
  const result = await client.db("sample_airbnb").collection("listingsAndReviews").updateOne({ _id: listingId }, { $set: updatedListing });
  console.log(`${result.matchedCount} document(s) matched the query criteria.`);
  console.log(`${result.modifiedCount} document(s) was/were updated.`);
}

/**
 * Delete an Airbnb listing
 * @param {MongoClient} client A MongoClient that is connected to a cluster with the sample_airbnb database
 * @param {String} listingId The id of the listing you want to delete
 */
async function deleteListing(client, listingId) {
  // See http://bit.ly/Node_deleteOne for the deleteOne() docs
  const result = await client.db("sample_airbnb").collection("listingsAndReviews").deleteOne({ _id: listingId });
  console.log(`${result.deletedCount} document(s) was/were deleted.`);
}

Then, replace the value of uri on the line below with the connection string of the first cluster you created.

const uri = "mongodb+srv://dbUser:dbUserPassword@cluster0-jj6uu.mongodb.net/test?retryWrites=true&w=majority";

The code above creates a database called sample_airbnb and a collection called listingsAndReviews, then creates, updates, and deletes documents in it. We will use change streams to read changes from the collection in real time. To do this, create another file called index.js and paste the following code in it.

const { MongoClient } = require('mongodb');

async function listDatabases(client) {
  const databasesList = await client.db().admin().listDatabases();
  console.log("Databases:");
  databasesList.databases.forEach(db => console.log(` - ${db.name}`));
}

function closeChangeStream(timeInMs = 60000, changeStream) {
  return new Promise((resolve) => {
    setTimeout(() => {
      console.log("Closing the change stream");
      changeStream.close();
      resolve();
    }, timeInMs);
  });
}

async function monitorListingsUsingEventEmitter(client, timeInMs = 60000, pipeline = []) {
  const collection = client.db("sample_airbnb").collection("listingsAndReviews");
  const changeStream = collection.watch(pipeline);
  changeStream.on('change', (next) => {
    console.log(next);
  });
  await closeChangeStream(timeInMs, changeStream);
}

async function main() {
  /**
   * Connection URI. Update <username>, <password>, and <your-cluster-url> to reflect your cluster.
   * See https://docs.mongodb.com/ecosystem/drivers/node/ for more details
   */
  const uri = 'mongodb+srv://dbUser:dbUserPassword@cluster0-jj6uu.mongodb.net/test?retryWrites=true&w=majority';
  const client = new MongoClient(uri);
  try {
    // Connect to the MongoDB cluster
    await client.connect();

    // Only watch for insert operations (change this to 'update' or 'delete' to watch those instead)
    const pipeline = [
      {
        '$match': {
          'operationType': 'insert',
          // 'fullDocument.address.country': 'Australia',
          // 'fullDocument.address.market': 'Sydney'
        },
      }
    ];

    // Make the appropriate DB calls
    await listDatabases(client);
    await monitorListingsUsingEventEmitter(client, 30000, pipeline);
  } catch (e) {
    console.error(e);
  } finally {
    await client.close();
  }
}

main().catch(console.error);

Just as in our first file, change the uri to the connection string of the first cluster you created.

The first function, listDatabases, takes in the client (the instance of our MongoDB cluster) as an argument and lists all the databases attached to it.

The next function, closeChangeStream, closes our change stream after 60 seconds (or whatever time is passed in).

Then there is the monitorListingsUsingEventEmitter function, which uses the watch method to listen for events emitted by the change stream. To determine which operation types to listen for, change streams support MongoDB aggregation pipelines, so monitorListingsUsingEventEmitter takes in a pipeline which is passed as an argument to the watch method. It also calls the closeChangeStream function created earlier to disconnect from the stream.

Lastly, we have our main function, which puts all the previous functions together. This is where we declare our connection string and connect to our cluster. The pipeline is also declared here; it’s set to listen for insert operation types, and you can change it to watch delete or update operation types as well, as sketched below.
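For example, a pipeline that watches several operation types at once might look like this (a sketch; the operationType values come from the change events documentation):

const pipeline = [
  {
    $match: {
      operationType: { $in: ['insert', 'update', 'delete'] }
    }
  }
];

await monitorListingsUsingEventEmitter(client, 30000, pipeline);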

Testing the MongoDB connection

To test this out open your terminal and navigate to the project directory. Start the index app by running node index.js. This should log something like this on your terminal.

 ~/Desktop/Projects/mongoDb-test-project>  node index.js
              
(node:108891) DeprecationWarning: current Server Discovery and Monitoring engine is deprecated, and will be removed in a future version. To use the new Server Discover and Monitoring engine, pass option { useUnifiedTopology: true } to the MongoClient constructor.
Databases:
 - sample_airbnb
 - admin
 - local

One thing to remember is that this terminates after a short while (30 seconds, the value passed to monitorListingsUsingEventEmitter). Open another terminal instance and run node changeStreamData.js to send data to our database cluster on Atlas. This will also generate logs on the terminal like the following.

 ~/Desktop/Projects/mongoDb-test-project> node changeStreamData.js

(node:108906) DeprecationWarning: current Server Discovery and Monitoring engine is deprecated, and will be removed in a future version. To use the new Server Discover and Monitoring engine, pass option { useUnifiedTopology: true } to the MongoClient constructor.
New listing created with the following id: 5ec969ccdc25eca96b1f709c
New listing created with the following id: 5ec969ccdc25eca96b1f709d
New listing created with the following id: 5ec969ccdc25eca96b1f709e
1 document(s) matched the query criteria.
1 document(s) was/were updated.
1 document(s) matched the query criteria.
1 document(s) was/were updated.
New listing created with the following id: 5ec969cddc25eca96b1f709f
New listing created with the following id: 5ec969cddc25eca96b1f70a0
1 document(s) was/were deleted.

Look at the terminal where we ran index.js to watch for data changes with the insert operation type. The change stream event for each insert operation has been logged to the terminal.

Also, check the collection on your Atlas MongoDB cluster; you should see the added data in the listingsAndReviews collection.

Kafka Connectors

We have set up our first MongoDB cluster and sent data to it; the next step is to use Kafka to reflect the same data on our second cluster. Before that, let’s talk a bit about Kafka connectors.

Kafka Connect is a framework for connecting Apache Kafka® with external systems such as databases, key-value stores, search indexes, and file systems. You can use existing connector implementations for common data sources and sinks to move data into and out of Kafka, or write your own connectors.

There are two major types of connectors: source connectors and sink connectors.

Source connectors help you get data from external systems into Kafka topics, while sink connectors move data from Kafka topics into external systems.

For this project we will be making use of the MongoDB Kafka Connector provided by MongoDB. It serves as both a source and a sink connector, meaning we can use it to get data from MongoDB into Kafka and from Kafka into MongoDB.

Kafka Connect uses the connect-standalone script to start your connectors and, like other Kafka commands, it has its own config file.

Let’s configure Kafka Connect to point to our plugins directory (we will create it in the next section). To do this, open the connect-standalone.properties file in the config folder inside your Kafka directory and edit the plugin.path line to point to your plugins directory. This should look like the file below.



# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include 
# any combination of: 
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples: 
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/home/mcdavid/kafka/plugins

Download and Setup MongoDB Kafka Connector

Confluent is a great source for downloading connectors, and one of the connectors they host is the MongoDB Kafka Connector. Go to this link to download it. When the download is complete, extract the archive to your home directory.

Navigate to your Kafka installation and create a plugins directory. Then copy the jar file located in the lib directory of your extracted MongoDB Kafka connector and paste it into the newly created plugins folder, as sketched below.
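Assuming Kafka lives at ~/kafka_2.12-2.5.0 and the connector was extracted to ~/mongodb-kafka-connect-mongodb-1.0.1 (adjust both paths to match your machine and your plugin.path setting), the commands look roughly like this:

> mkdir -p ~/kafka_2.12-2.5.0/plugins
> cp ~/mongodb-kafka-connect-mongodb-1.0.1/lib/*.jar ~/kafka_2.12-2.5.0/plugins/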

Configure MongoDB Kafka Source Connector

Next, let’s configure our source and sink connectors.

First up is our source connector, which takes data from our MongoDB cluster into Kafka. The config file can be found in the etc directory of our MongoDB Kafka Connector installation directory, with the file name MongoSourceConnector.properties. Open it and paste the content below inside it.

name=mongo-source
connector.class=com.mongodb.kafka.connect.MongoSourceConnector
tasks.max=1
# Connection and source configuration
connection.uri=mongodb+srv://dbUser:dbUserPassword@cluster0-jj6uu.mongodb.net/test?retryWrites=true&w=majority
database=sample_airbnb
collection=
topic.prefix=mongo
poll.max.batch.size=1000
poll.await.time.ms=5000
# Change stream options
pipeline=[{"$match": { "$or": [{"operationType": "insert"},{"operationType": "update"}]}}]
batch.size=0
change.stream.full.document=updateLookup
publish.full.document.only=true
collation=

Our configuration is divided into four parts. The first part gives the source connector’s Java class properties. The second part is where we add our database config details; these include the connection string of our first cluster, the name of the database we want to connect to, and the name of the collection. If we want to listen for changes on all collections, we can leave the value of collection empty. Then we add Kafka configurations, which include the prefix for our topics. By default, the connector uses the prefix to create a new topic for each collection in this format: <prefix>.<database name>.<collection name>. Lastly, we have our change stream configuration; this is where we declare our pipeline and configure other change stream options. There are different ways to configure the source connector with change streams, and you can read more in the official documentation here.
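Once the source connector is running (we start it in the Testing it all section below), you can confirm that a topic following this naming convention was created using Kafka’s topic tool, for example:

> bin/kafka-topics.sh --list --bootstrap-server localhost:9092

The list should include mongo.sample_airbnb.listingsAndReviews once change events start flowing through.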

Configure MongoDB Kafka Sink Connector

Finally, let’s configure our sink connector. This will take data from Kafka into our target MongoDB cluster. Still in the etc directory of our MongoDB Kafka Connector installation directory, open the MongoSinkConnector.properties file and paste the following inside.

name=mongo-sink
topics=mongo.sample_airbnb.listingsAndReviews
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
tasks.max=1
# Message types
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true
key.converter.schema.registry.url=http://localhost:8081
value.converter.schema.registry.url=http://localhost:8081
# Specific global MongoDB Sink Connector configuration
connection.uri=mongodb+srv://dbUser:dbUserPassword@cluster0-kppha.mongodb.net/test?retryWrites=true&w=majority
database=sample_airbnb
collection=listingsAndReviews
max.num.retries=2147483647
retries.defer.timeout=5000
## Document manipulation settings
key.projection.type=none
key.projection.list=_id
value.projection.type=none
value.projection.list=_id
field.renamer.mapping=[]
field.renamer.regex=[]
document.id.strategy=com.mongodb.kafka.connect.sink.processor.id.strategy.ProvidedInValueStrategy
post.processor.chain=com.mongodb.kafka.connect.sink.processor.DocumentIdAdder
# Write configuration
delete.on.null.values=false
writemodel.strategy=com.mongodb.kafka.connect.sink.writemodel.strategy.ReplaceOneDefaultStrategy
max.batch.size = 0
rate.limiting.timeout=0
rate.limiting.every.n=0
# Change Data Capture handling
change.data.capture.handler=
# com.mongodb.kafka.connect.sink.cdc.debezium.mongodb.MongoDbHandler
# Topic override examples for the sourceB topic
topic.override.sourceB.collection=sourceB

In the first part of the config, we declare the topics to watch under the key topics; the value is a comma-separated list of all the topics. As shown in the file, the topic we are watching is mongo.sample_airbnb.listingsAndReviews, which follows the naming convention from our source connector config: <prefix>.<database name>.<collection name>. In the next part of the config, we define the format we want our data converted to from Kafka; we have the option of using either the Avro format or the JSON format.

Then, in the next part, we define our database configuration. It includes options for the connection uri, database name, and collection name. This is where we add the connection string of the second cluster we created.

Next is the document manipulation settings. This part is important, as it determines how each document is structured before it is inserted into the new database cluster. The id strategy we used is the ProvidedInValueStrategy, which keeps the id of the document coming from the first cluster. Then we have the post-processor chain, which declares the order in which each document is processed.

Last is the write configuration. Here we choose whether writes coming through Kafka create a new document or replace the existing document based on its id. The strategy used here is the ReplaceOneDefaultStrategy.

There are different ways we can configure our sink connector to act on data coming from Kafka, you can read more on the official documentation here.

Testing it all

To test our implementation, first clear all the data from our two clusters.
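One way to clear the collection is from the mongo shell; a sketch, assuming you replace <connection-string> with each cluster’s connection string and run it once per cluster:

> mongo "<connection-string>" --eval 'db.getSiblingDB("sample_airbnb").listingsAndReviews.drop()'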

Then start the Zookeeper and Kafka servers. Next, start the Kafka Connect standalone server.

To start the standalone server with the MongoDB Kafka Connector from inside our Kafka directory, the command is bin/connect-standalone.sh <path-to-connect-standalone-config-file> [...<paths-to-our-connector-configs>], as shown below.

~/kafka_2.12-2.4.1> bin/connect-standalone.sh config/connect-standalone.properties ~/mongodb-kafka-connect-mongodb-1.0.1/etc/MongoSinkConnector.properties ~/mongodb-kafka-connect-mongodb-1.0.1/etc/MongoSourceConnector.properties

[2020-05-24 11:27:32,369] INFO Kafka Connect standalone worker initializing ... (org.apache.kafka.connect.cli.ConnectStandalone:69)

Open a new terminal instance and navigate to the MongoDB project we created earlier, then run node changeStreamData.js. Now look at the data in both of our clusters and you’ll find that they are identical.
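If the data doesn’t show up in the second cluster, a quick way to check whether the source side is producing messages is to read the topic directly with the console consumer (a sketch, assuming the defaults used in this tutorial):

> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic mongo.sample_airbnb.listingsAndReviews --from-beginning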

That’s all for our Kafka setup with MongoDB and Kafka Connector.

Learning Tools

I am still testing the waters and possibilities with Kafka. Below is a list of resources that helped my learning process.

Learning Strategy

I learned Kafka by working on the problem I had, so implementing as I learned made it easier to understand the basic concepts.

Reflective Analysis

If you are trying to sync an on-premise database cluster with one in the cloud, it’s okay to have the Kafka server running on-premise too. But if all your clusters are running in the cloud, there are cloud solutions you can leverage.

Another thing to keep in mind is that the data flowing from Kafka into our clusters might not always arrive in the same order in which it was sent into Kafka.

Conclusion

Kafka has a lot of use cases beyond connecting and syncing data stores; it can also be used to handle and stream Big Data. It has its own built-in query language, KSQL, which can be used to act on data stored in Kafka and perform analysis on it.

The code and config files for our project are hosted on Github; you can take a look at them anytime for reference.

If you are looking to use another service to handle data transmission, I wrote an article on using RabbitMQ as a queuing service.
