What is the benefit of accessing Amazon Redshift from SQL Server Management Studio? - amazon-redshift

I have seen a video on YouTube about accessing data in Amazon Redshift from SSMS. Could someone tell me how this will benefit me in terms of money and performance? Is this feasible, or is there any disadvantage to accessing data stored in Redshift from SSMS and performing queries and analysis?

SSMS is a SQL client and, like any other SQL client, it simply forwards the SQL queries that you write. Your queries affect the cost and performance of your database solution, not the SQL client.
There's no advantage or disadvantage; it's your choice which SQL client to use, if any.

Related

Stored procedure implementation instead of hard coded queries In Postgres

Aurora Postgres 11.9
In SQL Server we strictly follow the good programming practice that every single call the application makes to the database is a stored procedure rather than an ad hoc query. In Oracle we haven't followed the same approach, perhaps because SELECT stored procedures require additional cursors, and so on.
Can any Postgres expert advise me what practice we should follow in Postgres in this regard, and what the pros and cons are?
In addition, in SQL Server we use "rowversion" for data sync with BI and other external modules. Is there a built-in alternative in Postgres, or do we have to implement it with manual triggers?

Expose Redshift tables through REST API

I am currently contemplating how to expose data present in Redshift tables in a meaningful and consistent way through a REST API.
The way I want it to work is that the caller calls the API and then we do some kind of dynamic querying on the tables. I am worried about the latency, as the queries could range from simple to very complicated. Since Redshift requires connecting to the database as a client, some of the approaches we could take are:
Create a lambda function connecting to Redshift, which is invoked through API gateway
Using OData to create RESTful APIs. However, I don't think Redshift supports OData out of the box.
I am inclined towards OData since it has advanced filtering options along with pagination.
I am seeking advice: will OData be enough, and if so, how exactly does one integrate OData with Redshift?
Any other advice/approaches are welcome too.
Thanks!
Let me go over the different options:
Redshift Data API
The Redshift Data API lets you invoke queries and get their results asynchronously.
You can use the API directly from front-end, or you can put it behind API Gateway.
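As a rough sketch (not a drop-in implementation), the Data API flow with boto3 looks like this; the cluster name, database, and secret ARN below are placeholders:

```python
import time

# Keys the Data API uses for typed field values in GetStatementResult.
_FIELD_KEYS = ("stringValue", "longValue", "doubleValue", "booleanValue")

def rows_from_result(result):
    """Flatten a GetStatementResult payload into plain Python tuples."""
    rows = []
    for record in result["Records"]:
        row = []
        for field in record:
            if field.get("isNull"):
                row.append(None)
            else:
                # Each field dict carries exactly one typed value.
                row.append(next(field[k] for k in _FIELD_KEYS if k in field))
        rows.append(tuple(row))
    return rows

def run_query(sql):
    """Submit a query through the Data API and poll until it finishes."""
    import boto3  # imported lazily so the helper above stays dependency-free
    client = boto3.client("redshift-data")
    stmt = client.execute_statement(
        ClusterIdentifier="my-cluster",  # hypothetical cluster name
        Database="dev",                  # hypothetical database
        SecretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:demo",  # hypothetical
        Sql=sql,
    )
    while True:
        desc = client.describe_statement(Id=stmt["Id"])
        if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(1)
    if desc["Status"] != "FINISHED":
        raise RuntimeError(desc.get("Error", desc["Status"]))
    return rows_from_result(client.get_statement_result(Id=stmt["Id"]))
```

The front-end (or an API Gateway integration) would call `run_query` and render the tuples; for large result sets you would page through `get_statement_result` with `NextToken` instead of reading it in one call.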
Lambda
If you trust your users and can set up proper authentication, you can simply invoke the Lambda directly from the front-end and pass it some SQL to run, or generate the SQL based on the parameters. You can potentially swap this with Athena using federated query. Optionally you can add API Gateway in front for some additional features, like rate limiting and different forms of authentication. Keep in mind that both Lambda and API Gateway have limits on the amount of data returned and on execution time.
For long-running queries I would suggest that the Lambda, API Gateway, or even the front-end itself invoke an AWS Glue Python Shell job, which uses an UNLOAD query to drop the results in S3. The front-end can poll for when the job is done.
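The UNLOAD step such a Glue job would run can be sketched like this; the bucket, IAM role, and inner query are all hypothetical:

```python
def build_unload(sql, s3_prefix, iam_role):
    """Wrap a SELECT in Redshift's UNLOAD so the results land in S3 as files."""
    # Single quotes inside the inner query must be doubled inside UNLOAD's
    # quoted string literal.
    escaped = sql.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT PARQUET ALLOWOVERWRITE"
    )

# Hypothetical job: submit this statement (e.g. through the Data API) and
# let the front-end poll S3 for the output files.
statement = build_unload(
    "SELECT region, SUM(amount) FROM sales GROUP BY region",  # hypothetical query
    "s3://my-results-bucket/jobs/1234/",                      # hypothetical prefix
    "arn:aws:iam::123456789012:role/redshift-unload",         # hypothetical role
)
```

Parquet output keeps the files compact, and `ALLOWOVERWRITE` lets a retried job reuse the same prefix.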
If you only have a few types of queries, then you can build a proper REST API.
Instead of Lambda, you can also use Amazon Athena Federated Query, which you can actually query with directly from the front-end.
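For the "few types of queries" case, a minimal Lambda handler sketch might map whitelisted query names to parameterized SQL instead of accepting raw SQL from the caller; the query names, SQL, and cluster details here are all made up:

```python
import json

# Hypothetical whitelist: each API "query type" maps to fixed, parameterized
# SQL, so callers never send raw SQL to the database.
QUERIES = {
    "sales_by_region": (
        "SELECT region, SUM(amount) FROM sales WHERE year = :year GROUP BY region"
    ),
    "top_customers": (
        "SELECT customer_id, SUM(amount) FROM sales "
        "GROUP BY customer_id ORDER BY 2 DESC LIMIT :limit"
    ),
}

def build_request(query_name, params):
    """Validate the query name and shape the Data API parameter list."""
    if query_name not in QUERIES:
        raise ValueError(f"unknown query: {query_name}")
    return {
        "Sql": QUERIES[query_name],
        "Parameters": [{"name": k, "value": str(v)} for k, v in sorted(params.items())],
    }

def lambda_handler(event, context):
    """API Gateway proxy handler: submit the query through the Data API."""
    import boto3  # available in the Lambda runtime; imported lazily here
    body = json.loads(event.get("body") or "{}")
    req = build_request(body["query"], body.get("params", {}))
    client = boto3.client("redshift-data")
    stmt = client.execute_statement(
        ClusterIdentifier="my-cluster",  # hypothetical
        Database="dev",                  # hypothetical
        SecretArn="arn:aws:secretsmanager:us-east-1:123456789012:secret:demo",  # hypothetical
        **req,
    )
    # Return the statement id so the caller can poll for the results.
    return {"statusCode": 202, "body": json.dumps({"id": stmt["Id"]})}
```

Returning 202 with a statement id keeps the handler well under the Lambda and API Gateway time limits mentioned above; a second endpoint would poll for results.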
OData Implementation
There are third-party OData implementations for Redshift; just google it. With a front-end library that consumes OData (I used Kendo UI in the past) you can potentially build a working, feature-rich front-end in days. The main concern with this option is that the tool costs may exceed your budget. Of course, the hours you spend building things are also a cost, but it really depends on your actual requirements.
So how to choose?
Depending on your requirements, I would suggest simply going through the options and selecting one based on cost, time to implement, performance, reliability, and security.
How about Redshift performance?
This is the most difficult part about Redshift and on-demand queries. Redshift has no indexes, data can be compressed, and the data is stored in a columnar fashion. All of this can make Redshift slower than your average relational database for a random query.
However, if you make sure that your table is sorted, with a distribution style that matches your queries, and your queries use the columnar storage to their advantage (not all columns are requested), then it can be faster.
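To make that concrete, here is a hypothetical table whose distribution and sort keys are chosen to match the expected queries (all table and column names are made up):

```python
# Hypothetical DDL: distributing and sorting on the columns the queries
# actually join and filter on lets Redshift collocate joins and skip blocks.
DDL = """
CREATE TABLE sales (
    sale_date   DATE,
    region      VARCHAR(16),
    customer_id BIGINT,
    amount      DECIMAL(12, 2)
)
DISTKEY (customer_id)  -- collocate joins on customer_id
SORTKEY (sale_date);   -- range filters on sale_date prune sorted blocks
"""

# A query that plays to those choices: it filters on the sort key and
# touches only two of the four columns, so the columnar scan stays small.
GOOD_QUERY = """
SELECT region, SUM(amount)
FROM sales
WHERE sale_date BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY region;
"""
```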
Another thing to keep in mind is that Redshift doesn't handle concurrency well. I believe by default there can only be 8 concurrent queries; you can increase this, but you definitely wouldn't want to go above 20.
If your users can wait for their queries (I've seen bad queries go over 2 hours, and I'm sure you can make them take longer), then Redshift is fine. If not, you could try putting Postgres in front of Redshift using external tables, and then use ordinary indexes in front of it to speed things up.

Migrating to a NoSQL DB from Oracle

I have a large code base for an online charging application that is tightly coupled to Oracle and relies extensively on SQL queries, PL/SQL procedures, etc.
If we were to migrate to a NoSQL-based DB, would all the code need to be rewritten, or are there libraries/drivers that translate SQL queries to NoSQL queries automatically, simply by having us define a mapping between the current Oracle schema and the new underlying NoSQL schema (designed afresh)?
Thanks
You are going to rewrite a lot of things.
Relational databases and NoSQL stores are very different, and most NoSQL databases are not transactional, except for some document stores.
You can save money by moving to MySQL or PostgreSQL (the latter suggested), but you will still have to implement a lot of things, and you will need to look into proxies and connection pooling when you need to scale.
However, you can save a lot of work with Postgres Plus Advanced Server from EnterpriseDB: http://www.enterprisedb.com/products-services-training/products/postgres-plus-advanced-server
They claim you can switch databases without changing a single line of code, and save money.
You also get access to features like partitioning, which would cost a lot in the Enterprise Edition of Oracle.

Copy Large Result Set From SQL Server to Redis or a NoSql Database

We have a large table in SQL Server (100+ million rows) that we would like to export to either Redis or another NoSQL database like RavenDB or MongoDB for efficient caching.
We would like this export to happen once or twice a day.
What would be the best way to do this and make sure that neither SQL Server nor the caching layer experiences performance issues during the process?
Note: we are C# developers but do not have the option of using NServiceBus.

Should PostgreSQL or SQL Server 2008 R2 be used with a .Net application using Entity Framework?

I have a database in PostgreSQL with millions of records, and I have to develop a website that will use this database via Entity Framework (using the dotConnect for PostgreSQL provider in the case of PostgreSQL).
Since SQL Server and .Net are both native to the Windows platform, should I migrate the database from PostgreSQL to SQL Server 2008 R2 for performance reasons?
I have read some blogs comparing the two RDBMS' but I am still confused about which system I should use.
There is no clear answer here, as it's subjective; however, this is what I would consider:
The overhead of learning a new DBMS and its tools.
The SQL dialect each RDBMS uses, and whether you are using that dialect currently.
The cost (monetary and time) required to migrate from PostgreSQL to another RDBMS.
Do you or your client have an ongoing budget for the new RDBMS? If not, don't make the mistake of developing an application against an RDBMS that will never see the light of day.
Personally, if your current database is working well, I wouldn't change. Why fix what isn't broken?
You need to find out if there is actually a problem, and if moving to SQL Server will fix it before doing any application changes.
Start by ignoring the fact that you've got .Net and are using Entity Framework. Look at the queries your web application is going to make, and try them directly against the database. See if it returns the information quickly enough.
Only if, after you've tuned indexes etc., you can't make the answers come back in a time you're happy with should you decide the database is the problem. At that point it makes sense to try the same tests against a SQL Server database, but don't just assume SQL Server will be faster. You might find that neither can do what you need, and that you need faster disks, more memory, etc.
The mechanism you use to talk to the database (dotConnect or the Microsoft drivers) will likely be a very minor performance consideration, since the amount of information flowing (SQL statements in one direction and result sets in the other) is going to be almost identical for both technologies.