Bi-directional database syncing for Postgres and MongoDB

Let's say I have a local server running, and I also have an identical server already running on Amazon.
Both servers can perform CRUD operations on their own databases.
Note that the servers use both `postgres` and `mongodb`.
Now, when no one is using the Wi-Fi (usually at night), I would like to sync both the Postgres and MongoDB databases so that all writes made to each database on the server are properly applied to the corresponding database on the local machine.
I don't want to use Multi-Master because:
MongoDB does not support this architecture itself, so perhaps I will need a complex alternative.
I want to control when and how much I sync both databases.
I do not want to use network bandwidth when others are using the internet.
So can anyone point me in the right direction?
Also, if you can list some tools that solve this problem, that would be very helpful.
Thanks.

We have several drivers that could help you with this process. I'm presuming some knowledge of software development and will showcase our ADO.NET Provider for MongoDB, which uses the familiar-looking MongoDBConnection, MongoDBCommand, and MongoDBDataReader objects.
First, you'll want to create your connection string for connecting to your cloud MongoDB instance:
string connString = "Auth Database=test;Database=test;Password=test;Port=27117;Server=http://clouddbaddress;User=test;Flatten Objects=false";
You'll note that we have the Flatten Objects property set to false; this ensures that any JSON/BSON objects contained in the documents will be returned as raw JSON/BSON.
After you create the connection string, you can establish the connection and read data from the database. You'll want to store the returned data in a way that lets you access it easily later.
List<string> columns = new List<string>();
List<object> values;
List<List<object>> rows = new List<List<object>>();

using (MongoDBConnection conn = new MongoDBConnection(connString))
{
    // create a WHERE clause that will limit the results to newly added documents
    MongoDBCommand cmd = new MongoDBCommand("SELECT * FROM SomeTable WHERE ...", conn);

    // declare the reader and a row counter
    MongoDBDataReader rdr = cmd.ExecuteReader();
    int results = 0;

    while (rdr.Read())
    {
        values = new List<object>();
        for (int i = 0; i < rdr.FieldCount; i++)
        {
            // capture the column names once, on the first row
            if (results == 0)
                columns.Add(rdr.GetName(i));
            values.Add(rdr.GetValue(i));
        }
        rows.Add(values);
        results++;
    }
}
After you've collected all of the data for each of the objects that you want to replicate, you can configure a new connection to your local MongoDB instance and build queries to insert the new documents.
connString = "Auth Database=testSync;Database=testSync;Password=testSync;Port=27117;Server=localhost;User=testSync;Flatten Objects=false";
using (MongoDBConnection conn = new MongoDBConnection(connString))
{
    foreach (var row in rows)
    {
        // code here to create comma-separated strings for the columns
        // and values to be inserted in a SQL statement
        String sqlInsert = "INSERT INTO backup_table (" + column_names + ") VALUES (" + column_values + ")";
        MongoDBCommand cmd = new MongoDBCommand(sqlInsert, conn);
        cmd.ExecuteQuery();
    }
}
At this point, you'll have inserted all of the new documents. You could then change your filter (the WHERE clause at the beginning) to filter based on updated date/time and update their corresponding entries in the local MongoDB instance using the UPDATE command.
Things to look out for:
Be sure that you're properly filtering out new/updated entries.
Be sure that you're properly interpreting the type of each value so that you surround it with quotes (or not) when building the values for the SQL query (see the sketch below).
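To make that second point concrete, here is a minimal, self-contained sketch (plain C#, not tied to any particular driver) of one way to turn the captured column names and a row of values into the comma-separated column_names and column_values strings used in the INSERT above. The type handling is deliberately simplified; where the driver supports it, parameterized commands are the safer option.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

static class SqlLiteralHelper
{
    // Render a single CLR value as a SQL literal, quoting and escaping strings.
    public static string ToSqlLiteral(object value)
    {
        if (value == null || value is DBNull)
            return "NULL";
        if (value is string)
            return "'" + ((string)value).Replace("'", "''") + "'"; // escape embedded quotes
        if (value is bool)
            return (bool)value ? "1" : "0";
        if (value is DateTime)
            return "'" + ((DateTime)value).ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture) + "'";
        // numeric and other simple types: no quotes, invariant formatting
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // Build the comma-separated column fragment for the INSERT statement.
    public static string BuildColumnNames(List<string> columns)
    {
        return string.Join(", ", columns);
    }

    // Build the comma-separated, properly quoted value fragment for one row.
    public static string BuildColumnValues(List<object> row)
    {
        return string.Join(", ", row.Select(ToSqlLiteral));
    }
}

You would call BuildColumnNames(columns) once and BuildColumnValues(row) for each row before composing the INSERT statement.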
We have a few drivers that might be useful to you. I demonstrated the ADO.NET Provider above, but we also have a driver for writing apps in Xamarin and a JDBC driver (for Java).

Related

Mirth Database Reader failed to process row retrieved from the database in channel (index out of range)?

I have a Mirth (v3.10) Database Reader channel source that grabs some test records (from an SQL Server source) using the query...
select *
from [mydb].[dbo].[lab_test_MIRTHTEST_001]
where orc_2_1_placer_order_number
in (
'testid_001', 'testid_002', 'testid_003'
)
Even though the channel appears to function properly and messages are getting written to the channel destination, I am seeing SQL errors in the server logs in the dashboard when deploying the channel:
[2020-12-16 08:16:28,266] ERROR (com.mirth.connect.connectors.jdbc.DatabaseReceiver:268): Failed to process row retrieved from the database in channel "MSSQL2SFTP_TEST"
com.mirth.connect.connectors.jdbc.DatabaseReceiverException: com.microsoft.sqlserver.jdbc.SQLServerException: The index 1 is out of range.
at com.mirth.connect.connectors.jdbc.DatabaseReceiverQuery.runPostProcess(DatabaseReceiverQuery.java:233)
at com.mirth.connect.connectors.jdbc.DatabaseReceiver.processRecord(DatabaseReceiver.java:260)
...
I can run this query fine in SQL Server Management Studio itself (and the messages seem to be transmitting fine), so I'm not sure why this error is popping up, but I'm concerned there is something I'm missing here.
Does anyone with more experience know what is going on here, and how to fix it?
The issue looks to be in the post-process SQL section of the Database Reader, so it makes sense that the messages appear to be working.
Did you intend to enable the post-process section at the bottom of your source tab?
Kindly share the code that you are using to process data in the result set. In the meantime, you can consider the code below as a starting point. You can place this in a JavaScript transformer step in the source connector of your channel.
// Declaring variables to hold column values returned from the result set
var variable1;
var variable2;

// Defining the SQL read command
var Query = "select * from [mydb].[dbo].[lab_test_MIRTHTEST_001]";
Query += " where orc_2_1_placer_order_number in";
Query += " ('testid_001', 'testid_002', 'testid_003')";

// where dbconn is your database connection object
var result = dbconn.executeCachedQuery(Query);

// Looping through the results
while (result.next())
{
    variable1 = result.getString("variable1");
    variable2 = result.getString("variable2");
}

// Optionally place the returned values in a channel map for use later
$c('variable1', variable1);
$c('variable2', variable2);

How does read-through work in Ignite?

My cache is empty, so SQL queries return null.
Read-through means that if the cache misses, Ignite will automatically go down to the underlying DB (or persistent store) to load the corresponding data.
If new data is inserted into the underlying DB table, do I have to take down the cache server to load the newly inserted data from the DB table, or will it sync automatically?
Does this work the same as Spring's @Cacheable, or does it work differently?
It looks to me like the answer is no. The cache SQL query doesn't work since there is no data in the cache, but when I tried cache.get I got the following results:
case 1:
System.out.println("data == " + cache.get(new PersonKey("Manish", "Singh")).getPhones());
result ==> data == 1235
case 2 :
PersonKey per = new PersonKey();
per.setFirstname("Manish");
System.out.println("data == " + cache.get(per).getPhones());
throws an error, shown in the attached screenshots (error image, image2).
Read-through semantics can be applied when there is a known set of keys to read. This is not the case with SQL, so if your data is in an arbitrary 3rd-party store (RDBMS, Cassandra, HBase, ...), you have to preload the data into memory prior to running queries.
However, Ignite provides native persistence storage [1], which eliminates this limitation. It allows you to use any Ignite API without having anything in memory, and this includes SQL queries as well. Data will be fetched into memory on demand while you're using it.
[1] https://apacheignite.readme.io/docs/distributed-persistent-store
When you insert something into the database and it is not in the cache yet, then get operations will retrieve missing values from DB if readThrough is enabled and CacheStore is configured.
But currently it doesn't work this way for SQL queries executed on cache. You should call loadCache first, then values will appear in the cache and will be available for SQL.
When you perform your second get, the exact combination of firstname and lastname is looked up in the DB. It is converted into a CQL query containing a lastname=null condition, and it fails because lastname cannot be null.
UPD:
To get all records that have firstname column equal to 'Manish' you can first do loadCache with an appropriate predicate and then run an SQL query on cache.
// load only the entries we are interested in (firstname = 'Manish') into the cache
cache.loadCache((k, v) -> v.firstname.equals("Manish"));

SqlFieldsQuery qry = new SqlFieldsQuery("select firstname, lastname from Person where firstname='Manish'");

try (FieldsQueryCursor<List<?>> cursor = cache.query(qry)) {
    for (List<?> row : cursor)
        System.out.println("firstname:" + row.get(0) + ", lastname:" + row.get(1));
}
Note that loadCache is a heavy operation that requires running over all records in the DB, so it shouldn't be called too often. You can provide null as the predicate, in which case all records will be loaded from the database.
Also, to make SQL run fast on the cache, you should mark the firstname field as indexed in the QueryEntity configuration.
In your case 2, have you tried specifying lastname as well? By your stack trace it's evident that Cassandra expects it to be not null.

Code first migration for a SQL Server CE database file

MigrateDatabaseToLatestVersion is used. The database that is stored within SQL Server Express is updated.
When opening a locally stored .sdf file (SQL Server CE database) with a valid path and file name, this file is not updated.
Database.SetInitializer(new MigrateDatabaseToLatestVersion<DTDataContext, Configuration>());
var connection = DTDataContext.GetConnectionSqlServerCE40(fullPathName);
dataBaseContext = new DTDataContext(connection, true);
dataBaseContext.Database.Initialize(true);
The MigrationHistory entries will be made in SQL Server Express and not in the local SQL Server CE database file.
What would be the easiest way to update a local SQL Server CE database file?
After a few experiments, an adequate solution was found (one which fits my purpose).
The question was focused on the old sdf(s) that had previously been written, but with an older model than the current code.
I decided not to migrate the old files (which serve as a kind of backup).
Only reading will be done within those files. Obviously, it is possible that newer sdf(s) will be read once in the future, but that's not a big deal.
Before reading data for an entity that might not exist (in an sdf), its existence is checked via SqlQuery and count(*).
[System.Diagnostics.CodeAnalysis.SuppressMessage("Microsoft.Design", "CA1031:DoNotCatchGeneralExceptionTypes")]
private bool TestIfTableExists(string tableName, DTDataContext dataContext)
{
    try
    {
        int cnt = dataContext.Database.SqlQuery<int>("select count(*) from " + tableName).First();
        return cnt > 0;
    }
    catch (Exception)
    {
        // the available SqlCeException assembly does not fit --- table does not exist
        return false;
    }
}
By the way, when using SqlCeException (v3.5), which could be referenced via the assembly search, the above situation would fail (unhandled exception!). I have not tested it with v4 because I want to avoid a 'manual' reference that must be checked in (to avoid any path problems on other workstations).
Concerning writing a sdf:
When writing a new sdf with the current model, this is not a problem at all.
Database.CreateIfNotExists() was applied.
In my case, updating an sdf was not necessary, and a quick solution for that was not found.
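For the writing case, here is a minimal sketch of how that can look, reusing the question's own DTDataContext and GetConnectionSqlServerCE40 helper (fullPathName is assumed to point at the target .sdf file):

// Sketch: create a fresh .sdf from the current code-first model when the file is missing.
var connection = DTDataContext.GetConnectionSqlServerCE40(fullPathName);
using (var dataBaseContext = new DTDataContext(connection, true))
{
    // builds the database for the current model if the file does not exist;
    // it does nothing (and performs no migration) when the file is already there
    dataBaseContext.Database.CreateIfNotExists();
}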

C# Comparing lists of data from two separate databases using LINQ to Entities

I have 2 SQL Server databases, hosted on two different servers. I need to extract data from the first database, which is going to be a list of integers. Then I need to compare this list against data in multiple tables in the second database. Depending on some conditions, I need to update or insert some records in the second database.
My solution:
(WCF Service/Entity Framework using LINQ to Entities)
Get the list of integers from the 1st DB; this takes less than a second and returns 20,942 records.
I use the list of integers to compare against a table in the second DB using the following query:
List<int> pastDueAccts; //Assuming this is the list from Step#1
var matchedAccts = from acct in context.AmAccounts
where pastDueAccts.Contains(acct.ARNumber)
select acct;
The above query is taking so long that it gives a timeout error, even though the AmAccount table only has ~400 records.
After I get these matchedAccts, I need to update or insert records in a separate table in the second db.
Can someone help me do step #2 more efficiently? I think the Contains function makes it slow. I tried brute force too, by putting in a foreach loop in which I extract one record at a time and do the comparison. It still takes too long and gives a timeout error. The database server shows that only 30% of the memory has been used.
Profile the SQL query being sent to the database by using SQL Server Profiler. Capture the SQL statement sent to the database and run it in SSMS. You should be able to see the overhead imposed by Entity Framework at this point. Can you paste the SQL statement emitted in step #2 into your question?
The query itself is going to have all 20,942 integers in it.
If your AmAccount table will always have a low number of records like that, you could just return the entire list of ARNumbers, compare them to the list, then be specific about which records to return:
List<int> pastDueAccts; //Assuming this is the list from Step #1

// materialize the full list of ARNumbers from the second database
List<int> amAcctNumbers = (from acct in context.AmAccounts
                           select acct.ARNumber).ToList();

// Get a list of integers that are in both lists
var pastDueAmAcctNumbers = pastDueAccts.Intersect(amAcctNumbers).ToList();

var pastDueAmAccts = from acct in context.AmAccounts
                     where pastDueAmAcctNumbers.Contains(acct.ARNumber)
                     select acct;
You'll still have to worry about how many ids you are supplying to that query, and you might end up needing to retrieve them in batches.
UPDATE
Hopefully somebody has a better answer than this, but with so many records and doing this purely in EF, you could try batching it like I stated earlier:
// Suggest disabling auto detect changes
// Otherwise you will probably have some serious memory issues
// With 2MM+ records
context.Configuration.AutoDetectChangesEnabled = false;

List<int> pastDueAccts; //Assuming this is the list from Step #1
const int batchSize = 100;

for (int i = 0; i < pastDueAccts.Count; i += batchSize)
{
    // guard against running past the end of the list on the final batch
    var batch = pastDueAccts.GetRange(i, Math.Min(batchSize, pastDueAccts.Count - i));

    var pastDueAmAccts = from acct in context.AmAccounts
                         where batch.Contains(acct.ARNumber)
                         select acct;

    // process/update the matched accounts for this batch here
}
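The question also asks about updating or inserting records in a separate table once the matches are found. A rough sketch of that step inside the batch loop follows; TargetRecords, TargetRecord, and the Status field are hypothetical placeholders for your real target DbSet and columns.

// Hypothetical upsert step for one batch of matched accounts.
foreach (var acct in pastDueAmAccts)
{
    var existing = context.TargetRecords
                          .FirstOrDefault(t => t.ARNumber == acct.ARNumber);
    if (existing != null)
    {
        existing.Status = "PastDue";
        // AutoDetectChangesEnabled is off, so mark the entity as modified explicitly
        context.Entry(existing).State = System.Data.Entity.EntityState.Modified;
    }
    else
    {
        context.TargetRecords.Add(new TargetRecord { ARNumber = acct.ARNumber, Status = "PastDue" });
    }
}
// Persist once per batch to keep the change tracker small
context.SaveChanges();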

SQL Server CE. Delete data from all tables for integration tests

We are using SQL Server CE for our integration tests. At the moment, before every test we delete all data from all tables, then re-seed the test data. And we drop the database file when the structure changes.
For deletion of data we need to go through every table in the correct order and issue `Delete from table blah`, and that is error-prone. Many times I simply forget to add a delete statement when I add a new entity. So it would be good if we could automate data deletion from the tables.
I have seen Jimmy Bogard's goodness for deleting data in the correct order. I have implemented that for Entity Framework and it works in full-blown SQL Server. But when I try to use that in SQL CE for testing, I get an exception saying
System.Data.SqlServerCe.SqlCeException : The specified table does not exist. [ ##sys.tables ]
SQL CE does not have supporting system tables that hold required information.
Is there a script that works with SQL CE version that can delete all data from all tables?
SQL Server Compact does in fact have system tables listing all tables. In my SQL Server Compact scripting API, I have code to list the tables in the "correct" order, which is not a trivial task! I use QuickGraph; it has an extension method for sorting a DataSet. You should be able to reuse some of that in your test code:
public void SortTables()
{
    var _tableNames = _repository.GetAllTableNames();
    try
    {
        var sortedTables = new List<string>();
        var g = FillSchemaDataSet(_tableNames).ToGraph();
        foreach (var table in g.TopologicalSort())
        {
            sortedTables.Add(table.TableName);
        }
        _tableNames = sortedTables;
        //Now iterate _tableNames and issue a DELETE statement for each
    }
    catch (QuickGraph.NonAcyclicGraphException)
    {
        _sbScript.AppendLine("-- Warning - circular reference preventing proper sorting of tables");
    }
}
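The "Now iterate" comment can then be filled in with something along these lines, using the standard System.Data.SqlServerCe classes (connectionString is assumed to point at the test .sdf; reverse the list if your foreign keys require child tables to be emptied before their parents):

// Sketch: clear every table in the sorted order produced by SortTables.
using (var conn = new System.Data.SqlServerCe.SqlCeConnection(connectionString))
{
    conn.Open();
    foreach (var tableName in _tableNames)
    {
        using (var cmd = new System.Data.SqlServerCe.SqlCeCommand(
                   "DELETE FROM [" + tableName + "]", conn))
        {
            cmd.ExecuteNonQuery();
        }
    }
}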
You must add the QuickGraph DLL files (from Codeplex or NuGet) and you can find the implementation of GetAllTableNames and FillSchemaDataSet here http://exportsqlce.codeplex.com/SourceControl/list/changesets (in Generator.cs and DbRepository.cs)