What applications do you use for data entry and retrieval via ODBC? - forms

What apps or tools do you use for data entry into your database? I'm trying to improve our existing (cumbersome) system, which uses a PHP web-based front end for entering data one ... item ... at ... a ... time.
My current solution to this is to use a spreadsheet. It works well with text and numbers that are human readable, but not with foreign keys that are used to join with the other table's rows.
Imagine that I want a row of data to include what city someone lives in. The column holding this is id_city, which is keyed to the "city" table which has two columns: id (serial) and name (text).
I envision being able to extend the spreadsheet capabilities to include dropdown menus for every row of the id_city column that would allow the user to select a city (displaying the text of the city names), while actually storing the chosen city id. This way, the spreadsheet would:
(1) show a great deal of data on each screen and
(2) be exportable as a CSV file that can be passed to our existing scripts, which manually insert rows into the database.
I have been playing around with MS Excel and Access, as well as OpenOffice's suite, but have not found something that gives me the functionality I mention above.
Other items on my wish-list:
(1) dynamically fetch the name of cities that can be selected by the user.
(2) allow the user to push the data directly into the backend (not via external files/scripts).
(3) If any of the columns of the rows of data gets changed in the backend, the user could refresh the data on the screen to reflect any recent changes.
Do you know how I could improve the process of data entry? What tools do you use? I use PostgreSQL for the backend and have access to MS Office, OpenOffice, as well as web based solutions. I would love a solution that is flexible, powerful, and doesn't require much time to develop or deploy (I know, dream on...)
I know that pgAdmin3 has similar functionality, but from what I have seen, it is more of an administrative tool rather than something for users to use.

As j_random_hacker noted, I've used MS Access for years (since Access 97) to connect to an ODBC Data Source.
You can do this via linking to external tables: (in Access 2010:)
New -> Blank Database
External Data -> ODBC Database -> Link to Data Source
Machine Data Source -> New -> System Data Source -> Select Driver (Oracle, or whatever) -> Finish
Enter a new name for your DSN, then all of the connection parameters, then click OK
Select the newly created DSN, click OK.
You can do so much once Access sees your external table as a linked table, including sorting, filtering, etc. There's one caveat: as far as I can tell, ALL operations happen on the client side unless you're using a pass-through query. That's fine if you're looking at a table with 3000 records. With 2,000,000 records, that hurts. To be clear, all data in the table comes down to the workstation, for all tables being joined, and the join happens client-side, NOT server-side.

There are usually standalone tools for basic database management - e.g., for Oracle and MySQL a free tool called SQL Developer suffices for basic database data entry.
For more complex types (especially involving clobs) I can usually knock an application together in Java+SWT in a day if we already have the model and DAOs available on the Java side. Yeah, you have to put some effort in, but if it will be used regularly in the future then it is probably worth it.
In your case (well, the case where you have bulk imports of data) knocking up some Perl that reads from the CSV and does the city id lookup would be trivial to implement. Maybe a waste for a one-off thing? Depends on the amount of data to import.
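The CSV lookup idea above could look roughly like this. It is only a sketch, in Python rather than Perl, and the column names ("person", "city"), the sample city names, and the hard-coded lookup dictionary are all made up for illustration; in practice the dictionary would be filled from something like SELECT id, name FROM city.

```python
import csv
import io

# Hypothetical contents of the "city" table, as name -> id.
# In a real script this would be loaded from the database.
CITY_IDS = {"Boston": 1, "Chicago": 2, "Denver": 3}


def resolve_city_ids(csv_text, city_ids):
    """Replace the human-readable 'city' column with 'id_city'.

    Returns (resolved_rows, rejected_rows). Rows whose city name is
    not in the lookup are rejected rather than guessed at.
    """
    resolved, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row.pop("city").strip()
        if name in city_ids:
            row["id_city"] = city_ids[name]
            resolved.append(row)
        else:
            row["city"] = name
            rejected.append(row)
    return resolved, rejected


# Made-up spreadsheet export: two known cities and one unknown.
SAMPLE = "person,city\nAlice,Boston\nBob,Denver\nCarol,Atlantis\n"
rows, bad = resolve_city_ids(SAMPLE, CITY_IDS)
```

The resolved rows can then go to whatever insert script already exists, and the rejected rows go back to the person who typed them.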

I would be surprised if MS Access can't do what you're looking for -- this is basically the exact use case for it. Namely, quickly throwing together a nice UI for a simple CRUD DB application that a spreadsheet doesn't quite stretch to.

This is an answer, technically, but not a recommendation:
I've used Excel and SSIS for importing simple data entry files into MS SQL, but it's not adequate - there's very little ability to control the data, and SSIS is so very touchy, especially when working with Excel.

MS Access does not work well with some non-Microsoft databases. There is an open-source equivalent called Apache OpenOffice Base you may want to try.

Related

How to copy Tableau Data Extract logic?

Someone in my org created a Data Extract. There is an issue in one of the worksheets that uses it, and we suspect it's due to a mistake in how the Union was built.
But since it's a Data Extract, I can't see the UI for the data merge. Is there any way to take a current Data Extract and view the logic that creates it?
Download the extract from the server (I'm assuming you're using server), then open that extract using desktop. You should be able to see the details of it.
Before going too deep into extract details, note that extracts are not intended to be permanent systems of record for data - just an efficient way to work with query results for optimized reporting. So in general, you should always be able to throw away the extract and look at the original source - or recreate the extract on command. But life isn't always perfect so ...
If you use Tableau Desktop to look at your worksheet, and look at the data source icon at the top of the data pane in the left sidebar, do you see an icon for your data source that looks like two databases with one on top of (shadowing) the other? If so, you can right-click the data source icon and view its properties to see the source database table or file path. You can then even try disabling the extract to view the original source data.
If instead you see a single database icon, you have a "naked" extract where the reference to the original source has been discarded (unless it is stored in the catalog mentioned below).
If your organization purchased the Data Management Add-on for Tableau Server (strongly recommended), then if your data source is published to Tableau Server you can trace its history and origin by exploring the Tableau Catalog. That is especially valuable if the extract was built by a Tableau Prep Flow.
If instead, someone built the extract another way, say by writing a custom app using the Tableau Data Extract API, then the answer is to find that program.
One last point: in recent versions of Tableau, extracts are stored in an efficient relational database file format called Hyper. A Hyper extract can either be a single table (say, serializing the results of a query joining multiple tables) or contain multiple tables (say, caching individual tables and deferring the join for later).
That may not be relevant to your question, but could turn out to matter as you reverse engineer how the extract was created.

Talend open studio run only created or modified records among 15k

I have a job in Talend Open Studio that works fine: it connects a tMSSqlInput to a tMap and then to a tMysqlOutput, very straightforward. My problem is that I need this job to run daily, but to process only records that have been created or modified. Any help is highly appreciated!
It seems that you are searching for a Change Data Capture Tool for Talend.
Unfortunately it is only available on the licenced product.
There are several ways to implement what you need. Here are the most popular ones.
CDC from Talend
As Corentin said correctly, you could choose to use CDC (Change Data Capture) from Talend if you use the subscription version.
CDC of MSSQL
Alternatively, you can check whether you can activate or use CDC in your MSSQL server. This depends on your license. If it is possible, you can use that feature to identify new elements and process them.
Triggers
You can also create triggers on your database (if you have access to it). For example, creating triggers for INSERT, UPDATE, and DELETE would help you get the deltas. You could then store those records, or just their IDs, in a separate table.
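To illustrate the trigger idea, here is a minimal sketch using Python's built-in sqlite3 module so that it can be run anywhere. The table names (customer, customer_changes) are hypothetical, and the trigger syntax shown is SQLite's; SQL Server's CREATE TRIGGER syntax is different, so treat this as the shape of the approach rather than something to paste into MSSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

-- Change log: one row per insert/update/delete on customer.
CREATE TABLE customer_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id INTEGER,
    op TEXT
);

CREATE TRIGGER customer_ins AFTER INSERT ON customer
BEGIN
    INSERT INTO customer_changes (customer_id, op) VALUES (NEW.id, 'I');
END;

CREATE TRIGGER customer_upd AFTER UPDATE ON customer
BEGIN
    INSERT INTO customer_changes (customer_id, op) VALUES (NEW.id, 'U');
END;

CREATE TRIGGER customer_del AFTER DELETE ON customer
BEGIN
    INSERT INTO customer_changes (customer_id, op) VALUES (OLD.id, 'D');
END;
""")

# Some activity on the source table.
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer VALUES (2, 'Ben')")
conn.execute("UPDATE customer SET name = 'Anne' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 2")

# The daily job would read (and then clear) this log instead of
# scanning the whole source table.
changes = conn.execute(
    "SELECT customer_id, op FROM customer_changes ORDER BY seq"
).fetchall()
```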
Software driven / API
If your database is connected to a software and you have developers around, you could ask for a service which identifies records on insert / update / delete and shows them to you. This could be done e.g. in a REST interface.
Delta via ID
If the primary key is an auto-incrementing ID, you could also check your MySQL table for the highest ID and SELECT only those rows from the source whose ID is higher than that. This depends, of course, on the database layout.
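The delta-via-ID approach could be sketched like this. Two in-memory SQLite databases stand in for the SQL Server source and the MySQL target, and the orders table and its columns are made up for the example.

```python
import sqlite3

source = sqlite3.connect(":memory:")  # stands in for the SQL Server source
target = sqlite3.connect(":memory:")  # stands in for the MySQL target
for db in (source, target):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

# The source has four rows; the target already holds the first two.
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "pen"), (2, "ink"), (3, "pad"), (4, "cap")],
)
target.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "pen"), (2, "ink")],
)


def copy_new_rows(source, target):
    """Copy rows whose id is greater than the highest id in the target."""
    last_id = target.execute(
        "SELECT COALESCE(MAX(id), 0) FROM orders"
    ).fetchone()[0]
    new_rows = source.execute(
        "SELECT id, item FROM orders WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    target.executemany("INSERT INTO orders VALUES (?, ?)", new_rows)
    target.commit()
    return len(new_rows)


copied = copy_new_rows(source, target)
```

Note that this only picks up newly inserted rows. A row that was modified in the source keeps its old ID, so catching updates needs something extra, such as a last-modified column or one of the other options above.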

How can I fill the fields of a table in LibreOffice Base automatically?

I have a database which contains a table of cellphones. Let's say that every cellphone has 10 fields. In order to fill or modify the table I will have several forms available for the user. However, I don't want the user to modify all 10 fields every time. I want him to just give information about 4 of the fields and the rest of them will be automatically filled or modified by a program. Does someone know how to do that? :)
While possible with triggers, macros, or other coding, it's generally bad database practice to have calculated fields or duplicate data stored in tables. Related data should be stored through relationships between tables and displayed in a query, not directly in the table.
So if, say, each store only sells a single color of phone, you would have the user enter only the store. You would have another table that showed the relationship between store name and phone color. Then when you wanted a list of users and their phone colors, you would write a query that looks at the table list of users and where they bought their phones and joins it to the list of stores and what colors they sell.
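As a rough illustration of that store/color example, here is a small sketch using Python's sqlite3 module. The table and column names (store, purchase, phone_color) are invented for the example, and in Base you would write the same SELECT as a query rather than run it from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each store sells exactly one phone color, so the color lives here once.
CREATE TABLE store (name TEXT PRIMARY KEY, phone_color TEXT);

-- The user only enters where the phone was bought, never the color.
CREATE TABLE purchase (
    user_name TEXT,
    store_name TEXT REFERENCES store(name)
);

INSERT INTO store VALUES ('Downtown', 'black'), ('Mall', 'silver');
INSERT INTO purchase VALUES ('Ann', 'Mall'), ('Ben', 'Downtown');
""")

# The color is derived at query time through the relationship.
user_colors = conn.execute("""
    SELECT p.user_name, s.phone_color
    FROM purchase AS p
    JOIN store AS s ON s.name = p.store_name
    ORDER BY p.user_name
""").fetchall()
```

If a store later switches color, only one row in store changes and every query result follows automatically.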
My advice has three tiers:
Almost certainly best - redesign your database to be more normalized, meaning use relationships between tables to prevent the need for duplicate data.
If you decide you need macros, a good resource for working with OpenOffice macros is Andrew Pitonyak's book OpenOffice Macros Explained (a free download from his website).
SQL triggers are often a cleaner way of doing this (compared to macros) but are not supported by the old database engine that is the Base default. (Base itself only handles queries, forms, and reports. The tables are handled by separate software, which by default is an old version, 1.8, of HyperSQL Database, or HSQLDB, that is "embedded" inside Base.) You would need to upgrade to a newer database engine. Instructions on upgrading to HSQLDB 2.3 are in this thread: [Tutorial] Splitting an "embedded HSQL database"

DB2/400 Query - record format level identifiers for all tables in a library

We have multiple copies of the same library for testing, QA, development etc. consisting of hundreds of tables. Over time these libraries got out of sync and we run into a lot of level check problems. I would like to list all tables with a different Record Level Format Identifier from the corresponding tables in a model library. Is this possible using SQL? If not what other choices do we have?
A quick peek into SYSTABLES didn't show anything, but the QDBRTVFD API has that information in the file definition header. If APIs are not your thing, you can use DSPFD FILE(somelib/*ALL) TYPE(*RCDFMT) OUTPUT(*OUTFILE) FILEATR(*PF *LF) OUTFILE(QTEMP/RCDFMTS) to create a file you CAN use SQL on.
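Once you have one such outfile per library, the comparison itself is simple. Below is a sketch of the logic in Python on plain tuples of (file, record format, level identifier). The outfile's actual column names are not given here, and the file names and identifiers in the sample data are invented, so you would need to map the real columns onto these tuples (or write the equivalent join in SQL over the two outfiles).

```python
def level_id_mismatches(model_rows, other_rows):
    """Return (file, format, model_id, other_id) where the ids differ.

    Each input row is a (file_name, format_name, level_id) tuple.
    Files that exist in only one library are ignored here.
    """
    model = {(f, fmt): lvl for f, fmt, lvl in model_rows}
    other = {(f, fmt): lvl for f, fmt, lvl in other_rows}
    return sorted(
        (f, fmt, model[(f, fmt)], other[(f, fmt)])
        for (f, fmt) in model.keys() & other.keys()
        if model[(f, fmt)] != other[(f, fmt)]
    )


# Made-up sample data for a model library and a test library.
MODEL_LIB = [
    ("CUSTMAST", "CUSTREC", "4A1B2C3D4E5F6"),
    ("ORDHDR", "ORDREC", "1111111111111"),
]
TEST_LIB = [
    ("CUSTMAST", "CUSTREC", "4A1B2C3D4E5F7"),  # differs from the model
    ("ORDHDR", "ORDREC", "1111111111111"),     # matches the model
]

mismatches = level_id_mismatches(MODEL_LIB, TEST_LIB)
```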

Data Warehousing Postgres

We're considering using SSIS to maintain a PostgreSQL data warehouse. I've used it before between SQL Servers with no problems, but am having a lot of difficulty getting it to play nicely with Postgres. I’m using the evaluation version of the OLEDB PGNP data provider (http://www.postgresql.org/about/news.1004).
I wanted to start with something simple like UPSERT on the fact table (10k-15k rows are updated/inserted daily), but this is proving very difficult (not to mention I’ll want to use surrogate keys in the future).
I’ve attempted (Link) and (http://consultingblogs.emc.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx), which are effectively the same (except I don’t really understand the union all at the end when I’m trying to upsert). But I run into the same problem with parameters when doing the update using an OLE DB Command, which I tried to overcome using (http://technet.microsoft.com/en-us/library/ms141773.aspx), but that just doesn’t seem to work. I get a validation error:
The external columns for complent.... are out of sync with the datasource columns... external column “Param_2” needs to be removed from the external columns.
(this error is repeated for the first two parameters as well – never came across this using the sql connection as it supports named parameters)
Has anyone come across this?
AND:
The fact that this simple task is apparently so difficult to do in SSIS suggests I’m using the wrong tool for the job. Is there a better (and still flexible) way of doing this? Or would another ETL package be better for use between two Postgres databases? Other options include any listed on (http://en.wikipedia.org/wiki/Extract,_transform,_load#Open-source_ETL_frameworks). I could just go and write a load of SQL to do this for me, but I wanted a neat and easily maintainable solution.
I have used the Slowly Changing Dimension wizard for this with good success. It may give you what you are looking for:
http://msdn.microsoft.com/en-us/library/ms141715.aspx
The External Columns Out Of Sync: SSIS is Case Sensitive - I encountered this issue multiple times and it makes me want to pull my hair out.
This simple task is going to take some work either way. SSIS is by no means an enterprise class ETL product yet, but it does give you some quick and easy functionality, and is sufficient for most ETL work. I guess it is also about your level of comfort with it as well.
SCD is way too slow for what I want. I need to use set-based SQL.
It turned out that a lot of my problems were with bugs in the provider.
I opened a forum topic (http://www.pgoledb.com/forum/viewtopic.php?f=4&t=49) and had a useful discussion with the moderator/support/developer person.
Also, Postgres doesn't let you do cross-database queries, so I solved the problem this way:
Data Source from Production DB to a temp Archive DB table
Run set based query between temp table and archive table
Truncate temp table
Note that the temp table is not actually a temporary table, but a copy of the archive table schema used to hold the data temporarily.
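For anyone wanting to see the shape of those three steps, here is a minimal sketch using Python's sqlite3 module in place of Postgres. The table names (fact, fact_staging) and columns are hypothetical, and the set-based step is written as an UPDATE followed by an INSERT ... WHERE NOT EXISTS, which is one common way to do an upsert and not necessarily the exact query I ran.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact (id INTEGER PRIMARY KEY, amount INTEGER);

-- Staging table: same schema as the archive table, holds one load.
CREATE TABLE fact_staging (id INTEGER PRIMARY KEY, amount INTEGER);

-- Existing archive rows.
INSERT INTO fact VALUES (1, 10), (2, 20);

-- Step 1 (normally the data flow from production): one changed row
-- (id 2) and one new row (id 3) land in the staging table.
INSERT INTO fact_staging VALUES (2, 25), (3, 30);
""")

with conn:  # run steps 2 and 3 in a single transaction
    # Step 2a: update archive rows that also exist in staging.
    conn.execute("""
        UPDATE fact
        SET amount = (SELECT s.amount FROM fact_staging AS s
                      WHERE s.id = fact.id)
        WHERE id IN (SELECT id FROM fact_staging)
    """)
    # Step 2b: insert staging rows that are not in the archive yet.
    conn.execute("""
        INSERT INTO fact (id, amount)
        SELECT s.id, s.amount
        FROM fact_staging AS s
        WHERE NOT EXISTS (SELECT 1 FROM fact AS f WHERE f.id = s.id)
    """)
    # Step 3: empty the staging table (SQLite has no TRUNCATE).
    conn.execute("DELETE FROM fact_staging")

result = conn.execute("SELECT id, amount FROM fact ORDER BY id").fetchall()
```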
Took a while, but I got there in the end.
What enterprise ETL solution would you suggest?