Using Slick with Kudu/Impala - scala

Kudu tables can be accessed via Impala and thus through its JDBC driver. Thanks to that, they are accessible via the standard Java/Scala JDBC API. I was wondering if it is possible to use Slick for this. If not, is there any other high-level Scala database framework that supports Impala/Kudu?

Slick can be used with any JDBC database
http://slick.lightbend.com/doc/3.3.0/database.html
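For example, Slick only needs a JDBC URL and a driver class on the classpath. A minimal sketch; the Impala driver class name and URL here are assumptions to adjust for your setup:

import slick.jdbc.JdbcBackend.Database

// Connect Slick's backend to an arbitrary JDBC driver; the Impala driver
// class and URL below are placeholders for whatever your environment uses.
val db = Database.forURL(
  url = "jdbc:impala://host:21050/default",
  driver = "com.cloudera.impala.jdbc41.Driver"
)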

At least for me, Slick is not fully compatible with Impala/Kudu. Using Slick, I cannot modify db entities: I cannot create, update, or delete any item. It works only for reading data.

There are two ways you could use Slick with an arbitrary JDBC driver (and SQL dialect).
The first is to use low-level JDBC calls. The SimpleDBIO class gives you access to a JDBC connection:
val getAutoCommit = SimpleDBIO[Boolean](_.connection.getAutoCommit)
That example is from the Slick manual.
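To actually run such an action, you need a profile's api import in scope (it provides the SimpleDBIO alias) and a Database to execute against. A sketch, assuming db is a JdbcBackend Database such as one from Database.forURL, blocking only for demonstration:

import scala.concurrent.Await
import scala.concurrent.duration._
// Any profile's api import provides SimpleDBIO; the profile choice doesn't
// matter for this low-level action.
import slick.jdbc.H2Profile.api._

val getAutoCommit = SimpleDBIO[Boolean](_.connection.getAutoCommit)
val autoCommit: Boolean = Await.result(db.run(getAutoCommit), 10.seconds)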
However, I think you're more interested in working at a higher level than that. In that case, for Slick, you'd need to implement a custom Profile. If Impala is similar enough to an existing database profile, you may be able to extend an existing profile and adjust it to account for any differences. For example, this would allow you to customize how SQL is formatted for Impala, how timestamps are represented, and how column names are quoted. The documentation on Porting SQL from Other Database Systems to Impala would give you an idea of what needs to change in a driver.
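To give a flavour of what that could look like, here is a hypothetical starting point. It assumes Impala's backtick identifier quoting and that Slick's generic JdbcProfile is close enough to build on, which you would need to verify:

import slick.jdbc.JdbcProfile

// Hypothetical Impala profile: inherit the generic JDBC profile and
// override only what differs, e.g. identifier quoting with backticks.
trait ImpalaProfile extends JdbcProfile {
  override def quoteIdentifier(id: String): String = "`" + id + "`"
}
object ImpalaProfile extends ImpalaProfile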
If not, is there any other high-level Scala database framework that supports Impala/Kudu?
None of the mainstream libraries seem to support Impala as a feature. Having said that, the Doobie documentation mentions customising connections for Hive. So it may be worth quickly trying Doobie to see if you can query and insert, for example.
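If you do try it, a sketch of what that experiment might look like; the driver class, URL, and table name are assumptions, and the exact Transactor.fromDriverManager signature varies between Doobie versions:

import cats.effect.IO
import doobie._
import doobie.implicits._

// Build a Transactor straight from the JDBC driver, then attempt one read
// and one write to see what the dialect actually supports.
val xa = Transactor.fromDriverManager[IO](
  "com.cloudera.impala.jdbc41.Driver", // assumed driver class
  "jdbc:impala://host:21050/default",  // assumed URL
  "user", "password")

val read = sql"select count(*) from my_kudu_table".query[Long].unique.transact(xa)
val write = sql"insert into my_kudu_table values (1, 'x')".update.run.transact(xa)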

Related

Is it possible to evaluate a Postgres expression without connecting to a database?

PostgreSQL has excellent support for evaluating JSONPath expressions against JSON data.
For example, this query returns true because the value of the nested field is indeed "foo".
select '{"header": {"nested": "foo"}}'::jsonb #? '$.header ? (#.nested == "foo")'
Notably this query does not reference any schemas or tables. Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance. Is it possible to run PostgreSQL in such a way that it doesn't have schemas or tables, but is still able to evaluate "standalone" queries?
Some other context on the project: we need to evaluate JSONPath expressions against JSON data in both a Postgres database and a Python application. Unfortunately, Python does not have any JSONPath libraries that support enough of the spec to be useful to us.
Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance.
Well, it is open source. You could always pull out the source code for the functionality you want and adapt it to compile on its own. But that seems like a large and annoying undertaking, and I probably wouldn't do it. And short of that, no.
Why do you need this? Are you worried about scalability, ease of installation, performance, or what? If you are already using PostgreSQL anyway, spinning up a dummy connection just to fire some queries at the JSONB engine doesn't seem too hard.
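To illustrate the dummy-connection idea, a sketch over plain JDBC from Scala; host, database, and credentials are placeholders, and the query touches no schemas or tables:

import java.sql.DriverManager

// Evaluate a JSONPath expression against literal JSON; no tables involved.
val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost/postgres", "postgres", "secret")
val rs = conn.createStatement().executeQuery(
  """select '{"header": {"nested": "foo"}}'::jsonb @? '$.header ? (@.nested == "foo")'""")
rs.next()
println(rs.getBoolean(1)) // true
conn.close()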

How to use Solr on PostgreSQL and index a table

I am new to Solr, with the specific need to crawl an existing database table and generate results.
Every online example/tutorial I have found so far only explains how you give Solr documents and they get indexed, with no indication of how to do the same with a database.
Can anyone please explain the steps to achieve this?
Links like this wiki show everything with a JDBC driver and MySQL, so I even doubt whether Solr supports this with .NET. My tech boundaries are C# and PostgreSQL.
You have stumbled over the included support for JDBC already, but you have to use the Postgres JDBC driver. The example will be identical to the MySQL one, but you'll have to use the proper URL for Postgres instead and reference the JDBC driver (which will depend on which Postgres JDBC driver you use).
jdbc:postgresql://localhost/test
This is a configuration option in Solr, and isn't related to .NET or other external dependencies.
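For illustration, a minimal data-config sketch for Solr's DataImportHandler, assuming that is the JDBC support the wiki page describes; the table and column names are placeholders, and only the data source differs from the MySQL example:

<dataConfig>
  <dataSource driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost/test"
              user="solr" password="secret"/>
  <document>
    <entity name="item" query="select id, name, description from items">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>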
However, the other option is to write the indexing code yourself; this can often be a good solution, as it makes it easier to pre-process the content and apply certain logic before storing it in Solr. For .NET you have SolrNet, a Solr client that makes it easy both to query Solr and to submit documents to it.

Play Framework Anorm for different databases

I am new to Scala as well as the Play framework 2.0 with Scala. I like the idea of writing the SQL code myself and having full control rather than depending on an ORM tool. But does Anorm SQL work across different database vendors like MySQL and Oracle? I am writing an application which should be able to work with any relational database: some customers might have Oracle and some might have MySQL, so my code should be DB-agnostic. Is this possible in Scala, given that queries which run on MySQL will not run on Oracle?
Thanks in Advance,
Pradeep
Short answer: NO.
Long answer: Anorm is just a library for dispatching your SQL queries to the database through JDBC, retrieving the results and delivering them to you. It does not understand the differences between different databases because it relies on JDBC for connection handling, and on you for writing queries.
You either have to handle different DB engines yourself or have an ORM handle that for you.
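To make that concrete, a sketch of the handle-it-yourself approach with Play 2.0-era Anorm; the table and the vendor-detection mechanism are placeholders:

import anorm._
import anorm.SqlParser.scalar
import play.api.db.DB
import play.api.Play.current

// Anorm ships whatever SQL string you give it over JDBC, so dialect
// differences (here: row limiting) are yours to branch on.
def firstTenIds(vendor: String): List[Long] = DB.withConnection { implicit c =>
  val query = vendor match {
    case "mysql"  => "select id from users order by id limit 10"
    case "oracle" => "select id from (select id from users order by id) where rownum <= 10"
  }
  SQL(query).as(scalar[Long].*)
}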
PS: Unless you really need a DB-agnostic application (and fully understand its implications), I'd suggest you simply target two or three popular engines and avoid the future complications.

MongoDB access through SQL-like syntax

Is there any library that lets me access MongoDB using SQL-like syntax?
Example:
use db
select * from table1
insert into table1 values (a,b,c)
delete from table
select a,b,count(*) from table1 group by a,b
select a.field1,b.field2 from a,b where a.id=b.id
Thanks
Raman
The learning curve is small only if you are doing extremely simple SQL queries. If the extent of your SQL querying is "select * from X", then MongoDB looks like a brilliant idea to cut through all that too-complicated SQL. But if you need to perform left outer joins, test for null, check for ranges, use subselects, or do grouping and summation, then you will soon end up with a round concave dent in your desk after being moved to Mongo. The sick punchline is that half the time, the thing you are trying to do can't be done in the Mongo interface. Mongo represents a bold new world where, instead of the database doing things like aggregation and query optimization, it just stores data, and all the magic is done by retrieving everything, slowly, storing it in app memory, and doing all that stuff in code instead.
YES!
A company called UnityJDBC makes a JDBC driver for MongoDB. Unlike the Mongo Java driver, this JDBC driver allows you to run SQL queries against MongoDB, and it works with any Java application that uses JDBC.
To download this driver, go to...
http://www.unityjdbc.com/mongojdbc/mongo_jdbc.php
It's free to download, too!
Hope this helps.
MoSQL might satisfy your needs. It'll require you to run a new PostgreSQL instance but from there you can query your entire Mongo dataset with SQL.
"MoSQL imports the contents of your MongoDB database cluster into a PostgreSQL instance, using an oplog tailer to keep the SQL mirror live up-to-date. This lets you run production services against a MongoDB database, and then run offline analytics or reporting using the full power of SQL."
Have a look at this recent project: http://www.mongosql.com/. I've been looking at it over the last few weeks and it looks very promising.
For those of you who have questioned the usefulness of SQL against MongoDB, consider the large number of not-very-technical users in many organizations, like business analysts, who may know SQL, but don't want to make the leap to JavaScript and JSON. Tools like mongoSQL can help push the adoption of MongoDB in an organization.
There are a few solutions out there, but nearly all of them fail to truly represent the MongoDB data model in the way that "relationally" minded ODBC/JDBC applications and users desire/require. A recent commercial product was released that addresses these challenges:
ODBC:
http://www.progress.com/products/datadirect-connect/odbc-drivers/data-sources/mongodb
JDBC:
http://www.progress.com/products/datadirect-connect/jdbc-drivers/data-sources/mongodb
To address the need for ODBC/JDBC (SQL) access... While there are strong arguments for writing new applications using Mongo's clients, there is still a strong need in the marketplace for quality ODBC/JDBC and SQL-based access to MongoDB. This need largely arises from all the reporting, analytics, and BI applications that rely on ODBC/JDBC connectivity and do not offer native integration with MongoDB.
The free NoSQL Viewer supports conversion of SQL queries to MongoDB shell syntax. Furthermore, in its SQL Viewer you can even use SQL SELECT statements to query MongoDB collection data without knowing the MongoDB query syntax. Check out NoSQL Viewer here: www.spviewer.com/nosqlviewer.html
MongoDB and its current drivers do not support a direct SQL-like syntax.
However, all of the operations above are easily doable with the driver-specific operations.
Here is a brief mapping of MongoDB operations to the corresponding SQL queries:
http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart
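For example, with the official MongoDB Scala driver, several of the SQL lines from the question map directly onto collection methods. A sketch using the names from the question (note the calls return Observables, which execute only when subscribed to or awaited):

import org.mongodb.scala._
import org.mongodb.scala.model.Accumulators._
import org.mongodb.scala.model.Aggregates._

val client = MongoClient("mongodb://localhost")
val table1 = client.getDatabase("db").getCollection("table1")

table1.find()                                  // select * from table1
table1.insertOne(Document("a" -> 1, "b" -> 2)) // insert into table1 values (...)
table1.deleteMany(Document())                  // delete from table1
// select a, b, count(*) from table1 group by a, b
table1.aggregate(Seq(group(Document("a" -> "$a", "b" -> "$b"), sum("count", 1))))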
There are a couple of projects underway to emulate a SQL interface for MongoDB. While they provide a familiar interface, in general they should be avoided: they operate on the fundamentally flawed premise of parsing strings and translating them into method calls.
Once you work with MongoDB, you will find the approach of using classes and methods a much more accessible interface, as it works exactly like all other parts of your application. Yes, there is a small learning curve when you first start, but for the most part the interface in MongoDB works how you would expect it to.

Experiences with PostgreSQL Java/JDBC Copy API for bulk inserts

With version 8.4, PostgreSQL finally integrated a proprietary API into its JDBC driver which allows stream-based inserts and selects. The so-called Copy API grants access to the COPY TO/COPY FROM SQL commands, which read text data from a stream/reader into one table at a time, or write text data to a stream/writer from one table. Constraints and triggers are honored for insert operations. Basic transformations (delimiter, quotation, null values, etc.) are available. The performance gain is quite impressive, probably because of less object instantiation and a much simpler protocol between the client and the server backend.
Does anyone have experience with this API, good or bad? Is it production-ready? Are there any pitfalls to be aware of? BTW: the fact that it is a proprietary API is a non-issue for me.
The COPY API has been present in the PostgreSQL C library for at least six years. It is very stable.
See: http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
and http://www.postgresql.org/docs/9.0/interactive/sql-copy.html
The JDBC implementation should have the same properties, but I haven't used it myself.
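For reference, the JDBC side is exposed through org.postgresql.copy.CopyManager. A minimal sketch of streaming CSV rows into a table; the table, columns, and credentials are placeholders:

import java.io.StringReader
import java.sql.DriverManager
import org.postgresql.PGConnection

// Stream two CSV rows into a table via COPY FROM STDIN.
val conn = DriverManager.getConnection(
  "jdbc:postgresql://localhost/test", "postgres", "secret")
val copy = conn.unwrap(classOf[PGConnection]).getCopyAPI
val rows = new StringReader("1,foo\n2,bar\n")
val inserted = copy.copyIn("COPY items (id, name) FROM STDIN WITH CSV", rows)
println(s"inserted $inserted rows")
conn.close()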
PS. I think there is a misunderstanding when you call this "proprietary". Both the protocol specification and the server/client/driver source code are free (as in freedom).