I'm working with OpenStreetMap data and importing it into a Postgres database with the usual tools. One key tag in OpenStreetMap is natural.
When this data is imported, one of the columns in the resulting Postgres table is named natural.
The issue is that when some clients read the table, the attribute natural is represented as "natural", which leads to problems.
Is there a way to store "natural" as natural, OR to help the client read it properly?
natural is a reserved keyword in Postgres:
https://www.postgresql.org/docs/current/sql-keywords-appendix.html
Keywords have to be quoted when they are used as identifiers. If possible, choose a different column name.
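For example, a minimal sketch (planet_osm_polygon is the default osm2pgsql table name and is an assumption here, as is the natural=water tag value):

```sql
-- Double-quote "natural" wherever it appears as an identifier
SELECT osm_id, "natural"
FROM planet_osm_polygon
WHERE "natural" = 'water';

-- Or rename the column once so clients never see the reserved word
-- (note that a re-import by the OSM tooling may recreate the old name)
ALTER TABLE planet_osm_polygon RENAME COLUMN "natural" TO natural_tag;
```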
I have a multi-tenant application which will use the SILO model to save data (each tenant will get its own database).
Because tenant names could be redundant, my databases are named with GUIDs: MyApp_[GUID].
Now I want to save simple but necessary information for each database, like the tenant name and 3 to 5 more pieces of information.
Is there a simple way to write and read these data?
The only way I can think of is to create a special table for this with only 1 row, but that seems a bit wasteful.
If you're looking for a simpler solution than a table per database (and having to deal with the awkward constraint that it must have exactly one row), you could
use a custom configuration parameter. You can change them with ALTER DATABASE. The downside is that you can only store strings, and that the settings might be overridden per session (this option and the next are sketched below).
use a COMMENT on the database. The downside is that you can only store a single string per database; the advantage is that it is automatically shown in many lists of databases, such as psql's \l+ command.
add your own columns to the pg_database system table. You should not mess with the system catalogs, so this is a spectacularly bad idea even if you know what you are doing, but in a relational model it is the closest to what you were asking for, so I mention it for completeness.
I don't really advocate any of these solutions; although they do what you were asking for, there's probably a better solution to your actual problem. It might be as simple as a table of databases, possibly with a foreign key to pg_database, in an extra database shared by all tenants.
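If you do go with one of the first two options, a minimal sketch (the database name and the myapp.tenant_name parameter are made-up examples):

```sql
-- Option 1: a custom configuration parameter, set per database
-- (takes effect for new sessions in that database)
ALTER DATABASE "MyApp_3f2504e0" SET myapp.tenant_name = 'Acme Corp';
-- Read it back from inside the database:
SELECT current_setting('myapp.tenant_name');

-- Option 2: a comment on the database, shown by psql's \l+
COMMENT ON DATABASE "MyApp_3f2504e0" IS 'tenant: Acme Corp';
-- Read it back via the shared-object description function:
SELECT shobj_description(oid, 'pg_database')
FROM pg_database
WHERE datname = 'MyApp_3f2504e0';
```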
I have a table with one or more columns of type CLOB.
This table contains duplicate rows.
Normal mechanisms like distinct and group by don't work for CLOBs in DB2.
How can I remove the duplicates on such tables?
One way of approaching this, especially if it is something you will need to do regularly, is to compare CLOB digests (hashes) instead of the CLOBs themselves.
DB2 does not have a built-in hash function available to you, so you'll need to jump through some hoops to accomplish that. For example, you could export CLOBs as files and calculate their hashes using an OS utility.
Alternatively, you could create a simple user-defined function written in Java (which has built-in MD5 and various SHA algorithm support). One such solution is described in detail here.
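Once a hash UDF is in place, the deduplication itself is straightforward. A sketch using DB2's deletable-fullselect idiom, where MD5_CLOB stands in for whatever Java UDF you create, and my_table / plain_col / clob_col are placeholders:

```sql
-- Keep one row per distinct (plain_col, CLOB-hash) combination, delete the rest
DELETE FROM (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY plain_col, MD5_CLOB(clob_col)
           ) AS rn
    FROM my_table
)
WHERE rn > 1;
```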
You could try the DBMS_LOB.COMPARE function to compare the contents of CLOB fields. DBMS_LOB is a built-in module; the supported CLOB size is up to 10 MB.
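A sketch of how that could be used to find duplicate pairs (table and column names are placeholders; note that the self-join compares every pair of rows, so this gets expensive on large tables):

```sql
-- DBMS_LOB.COMPARE returns 0 when the two LOBs are identical
SELECT a.id AS kept_id, b.id AS duplicate_id
FROM my_table a
JOIN my_table b
  ON a.id < b.id
WHERE DBMS_LOB.COMPARE(a.clob_col, b.clob_col) = 0;
```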
As part of my final thesis, I must transform a relational database into a graph-oriented database, specifically a PostgreSQL database into an embedded Neo4j database. The problem is how to do it. In Rik Van Bruggen's book Learning Neo4j, he mentions a data import process using ETL activities with the Talend and MuleSoft tools, but on their official sites there is no documentation about how to do it, neither help documentation nor examples. Apart from these tools, what other ways can I use to transform this information without writing my own code?
Some modeling advice:
A well-normalized relational model that has not yet been denormalized for performance reasons can be translated into an equivalent graph model.
Graph model shapes are mostly driven by use-cases, so there will be opportunity for optimization and model evolution afterwards.
A good, normalized Entity-Relationship diagram often already represents a decent graph model.
So if you still have the original ER diagram available, try to use it as a guide.
Here are some tips that help you with the transformation:
Each entity table is represented by a label on nodes
Each row in a table is a node
Columns on those tables become node properties.
Remove technical primary keys, keep business primary keys
Add unique constraints for business primary keys, add indexes for frequent lookup attributes
Replace foreign keys with relationships to the other table, and remove the foreign-key columns afterwards
Remove data with default values, no need to store those
Data in tables that is denormalized and duplicated might have to be pulled out into separate nodes to get a cleaner model.
Numbered column names (like email1, email2, email3) might indicate an array property
JOIN tables are transformed into relationships, columns on those tables become relationship properties
It is important to have an understanding of the graph model before you start to import data, then it just becomes the task of hydrating that model.
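As a tiny illustration of the JOIN-table tip (all names are made up): a relational join table acted_in(person_id, movie_id, role) becomes one relationship per row, with the role column carried over as a relationship property:

```cypher
// person_id and movie_id were foreign keys in the join table;
// the relationship itself replaces them
MATCH (p:Person {personId: 42}), (m:Movie {movieId: 7})
CREATE (p)-[:ACTED_IN {role: 'Neo'}]->(m);
```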
LOAD CSV might be your best option, but of course it means outputting a CSV first. Here are some great resources (a minimal sketch follows the links):
http://neo4j.com/docs/stable/query-load-csv.html
http://watch.neo4j.org/video/112447027
http://jexp.de/blog/2014/06/load-csv-into-neo4j-quickly-and-successfully/
http://jexp.de/blog/2014/10/load-cvs-with-success/
http://www.markhneedham.com/blog/2014/10/23/neo4j-cypher-avoiding-the-eager/
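A minimal LOAD CSV sketch (Neo4j 2.x syntax to match the resources above; the file names, labels and properties are assumptions):

```cypher
// Import nodes from a CSV with a header row
// (how file: URLs resolve differs between Neo4j versions)
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
CREATE (:Person {personId: toInt(row.id), name: row.name});

// Then import relationships by matching the previously created nodes
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM 'file:///acted_in.csv' AS row
MATCH (p:Person {personId: toInt(row.person_id)})
MATCH (m:Movie {movieId: toInt(row.movie_id)})
CREATE (p)-[:ACTED_IN {role: row.role}]->(m);
```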
I've also written a Ruby gem which lets you write a little Ruby code to import data from various sources. It's called neo4apis. You can look at the neo4apis-twitter gem to get an idea of how it works:
https://github.com/neo4jrb/neo4apis-twitter/
https://github.com/neo4jrb/neo4apis-twitter/blob/master/lib/neo4apis/twitter.rb
I've actually been wanting to implement a neo4apis-activerecord gem to make it easy to import from SQL with ActiveRecord.
You cannot directly export data from a relational database and import it into Neo4j, because the two have different database structures.
Relational Database -
A relational database is a set of tables containing data fitted into predefined categories. Each table (which is sometimes called a relation) contains one or more data categories in columns. Each row contains a unique instance of data for the categories defined by the columns.
Graph-oriented database -
A graph database is essentially a collection of nodes and edges. Each node represents an entity (such as a person or business) and each edge represents a connection or relationship between two nodes.
Solution to your problem:
First, you need to design the Neo4j data structure: e.g., what nodes you will require and what the relationships between the nodes will be.
After that, create a script in your application language to fetch data from the relational database and insert it into Neo4j.
LOAD CSV is an option for import/export (backup) functionality with a graph database. You cannot directly export/import data from a relational DB to a graph DB.
As part of a requirement, I need to migrate a schema from an existing database to a new schema in a different database. Part of it is already done, and now I need to compare the two schemas and change the new schema based on the gaps I find.
I am not using a tool, and was trying to work out the details using the SYSCAT catalog views, but could not get much success.
Any pointers on the best way to solve this?
Regards,
Ramakant
A tool really is the best way to solve this – IBM Data Studio is free and can compare schemas between databases.
Assuming you are using DB2 for Linux/UNIX/Windows, you can do a rudimentary compare by looking at selected columns in SYSCAT.TABLES and SYSCAT.COLUMNS (for table definitions), and SYSCAT.INDEXES (for indexes). Exporting this data to files and using diff may be the easiest method. However, doing this for more complex structures (tables with range or database partitioning, foreign keys, etc) will become very complex very quickly as this information is spread across a lot of different system catalog tables.
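For example, queries along these lines, run against both databases with the output exported and diffed (replace 'MYSCHEMA' with your schema name):

```sql
-- Column definitions, ordered deterministically so the exports can be diffed
SELECT TABNAME, COLNAME, TYPENAME, LENGTH, SCALE, NULLS
FROM SYSCAT.COLUMNS
WHERE TABSCHEMA = 'MYSCHEMA'
ORDER BY TABNAME, COLNO;

-- Index definitions
SELECT TABNAME, INDNAME, UNIQUERULE, COLNAMES
FROM SYSCAT.INDEXES
WHERE TABSCHEMA = 'MYSCHEMA'
ORDER BY TABNAME, INDNAME;
```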
An alternative method would be to extract DDL using the db2look utility. However, you can't specify the order that db2look outputs objects (db2look extracts DDL based on the objects' CREATE_TIME), so you can't extract DDL for an entire schema into a file and expect to use diff to compare. You would need to extract DDL into a separate file for each table.
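A sketch of the per-table invocation (database, schema and table names are placeholders):

```sh
# -e extracts DDL, -z restricts to a schema, -t to a table, -o writes to a file
db2look -d MYDB -e -z MYSCHEMA -t MYTABLE -o mytable.ddl
```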
Use SchemaCrawler for IBM DB2, a free open-source tool designed to produce text output that is easy to diff. You can get very detailed information about your schema, including view and stored procedure definitions. All of the information you need is output in a single file, and can be compared very easily using a standard diff tool.
Sualeh Fatehi, SchemaCrawler
Unfortunately, as per company policy, we cannot use these tools at this point in time. So I am writing a program using JDBC to get the details and do the comparison.
Caveats:
Let me first clarify that this is not a question about whether or not to use surrogate primary keys.
Also, this is NOT related to identities (SQL Server) / sequences (Oracle) and their pros / cons. I did get a fair bit of an idea about that thanks to this, this and this.
Question:
I come from a SQL Server background and have always used identity columns as surrogate primary keys for most tables.
Based on my knowledge of Oracle, I find that the nearest equivalent in Oracle is the SEQUENCE, which can be used to simulate something similar to Identity in SQL Server.
As I am new to Oracle and my database has 100+ tables, the main things I am concerned about are:
Considering I have to create a sequence for (almost) each table in Oracle, would this be the standard accepted implementation for simulating Identity, or is there a better / easier way to achieve this kind of implementation in Oracle?
Are there any specific gotchas related to having so many sequences in Oracle?
The system supports both Oracle 10G and 11G
Considering I have to create a sequence for each table in Oracle (almost), would this be the standard accepted implementation for simulating Identity or is there a better / easier way to achieve this kind of implementation in Oracle?
Yes, it is very typical in Oracle to create a sequence for each table. It is possible to use the same sequence for several tables, but you run the risk of making it a bottleneck by using a single sequence for many/all tables: see this AskTom q/a
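A typical per-table setup on 10g/11g pairs a sequence with a BEFORE INSERT trigger (all names here are illustrative):

```sql
CREATE SEQUENCE employees_seq;

-- Populate the surrogate key automatically, like IDENTITY in SQL Server
-- (SELECT ... FROM dual works on both 10g and 11g; 11g also allows
-- direct assignment with :NEW.employee_id := employees_seq.NEXTVAL)
CREATE OR REPLACE TRIGGER employees_bi
BEFORE INSERT ON employees
FOR EACH ROW
BEGIN
  IF :NEW.employee_id IS NULL THEN
    SELECT employees_seq.NEXTVAL INTO :NEW.employee_id FROM dual;
  END IF;
END;
/
```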
Are there any specific gotchas related to having so many sequences in Oracle?
None that I can think of.
100+ tables is not very many. I routinely work on databases with several hundred sequences, one for each table. The more the merrier.
It's even conceivable to have more sequences than tables - unlike identity columns in other DBMSs, sequences can be used for more than just creating surrogate key values.
An alternative is to use GUIDs - in Oracle you can call SYS_GUID to generate unique values.
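For example (table and column names are illustrative):

```sql
-- SYS_GUID() returns a 16-byte RAW value and can serve as a column default
CREATE TABLE documents (
    doc_id RAW(16) DEFAULT SYS_GUID() PRIMARY KEY,
    title  VARCHAR2(200)
);
```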
A good article, followed by comments with pros and cons for both approaches: http://rwijk.blogspot.com/2009/12/sysguid.html