I am a beginner with databases, and I decided on Postgres because I learned it works well with Python.
I have reports that I receive as HTML files. Each file contains many (100+) tables when parsed with pandas (each table comes back as its own DataFrame); there is no unique ID common to all the tables, and each table has its own set of columns.
Is it possible to import all of these tables and merge them into a single table containing ALL the columns, with each report becoming a single row, using a built-in PostgreSQL feature? Or do I have to build a data pipeline in Python and add them manually?
I hope my question is clear enough.
Thank you.
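In case the answer turns out to be the Python pipeline route, here is a rough sketch of what that side could look like with pandas and SQLAlchemy. The file names, connection string, and the flattening rule (one column per table cell, prefixed with the table's position in the report) are all assumptions, not a definitive design:

    import pandas as pd
    from sqlalchemy import create_engine

    # Assumption: a local PostgreSQL database called "reports"
    engine = create_engine("postgresql+psycopg2://user:password@localhost/reports")

    def report_to_row(path: str) -> pd.DataFrame:
        """Parse one HTML report and flatten all of its tables into a single wide row."""
        tables = pd.read_html(path)  # one DataFrame per <table>; needs lxml or bs4 installed
        pieces = []
        for i, table in enumerate(tables):
            flat = table.astype(str).stack()  # (row, column) pairs -> one long Series
            flat.index = [f"t{i}_" + "_".join(map(str, idx)) for idx in flat.index]
            pieces.append(flat)
        return pd.concat(pieces).to_frame().T  # one row, one column per original cell

    # Collect every report first so the final frame carries the union of all columns,
    # then write the whole thing to Postgres in one go.
    rows = [report_to_row(p) for p in ["report_001.html", "report_002.html"]]
    wide = pd.concat(rows, ignore_index=True)
    wide.to_sql("reports_wide", engine, if_exists="replace", index=False)

One caveat: PostgreSQL caps a table at roughly 1600 columns, so with 100+ tables per report a single wide table may not fit, and storing each flattened report as a JSONB document may be the more practical shape.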
I have a table set up in PowerApps for Teams. It has a lookup column that refers to another table, and it works as expected. I have external data that I want to populate this table with. When using the Dataflow created by the Import function offered by the UI, I cannot put data into this one column; it is not available for import. All the other data gets pushed into the table as desired, and even one of the Choice columns receives its data.
How would I get this lookup column to show up and receive data? The solutions I have found are for the full PowerApps suite, for which I don't have a license. They used alternate keys, but I cannot find a way to enable alternate keys in this stripped-down version.
I have used data from an Excel file, both in OneDrive and external.
I have used data from a SharePoint list, with the same result.
I have attempted to use an alternate key in the receiving table, but that doesn't seem to be an option in PowerApps for Teams.
I have formatted the data to be imported to match the lookup options for the receiving table.
I have a core layer with some tables, and I want to find out which tables in the source layer they are built from (the core-layer tables are created by joining some of the source-layer tables). I want to generate an Excel sheet, using code, that shows which source tables each core table is made from.
I am using PySpark on Databricks, and the code that creates the tables lives in notebooks.
Any help on how to approach this would be appreciated.
This is possible when you use Databricks Unity Catalog: as part of it, there is a feature called Data Lineage that tracks which tables and columns were used to create a specific table, as well as who its consumers are. It also includes a Lineage API that can be used to export the lineage data.
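As a rough illustration, here is a sketch of pulling upstream table lineage for a few core tables and writing it to an Excel sheet from Python. The endpoint path, request fields, and the response shape ("upstreams" / "tableInfo") reflect my understanding of the Lineage API and should be checked against the current Databricks docs; the workspace URL, token, and table names are placeholders:

    import requests
    import pandas as pd

    HOST = "https://<your-workspace>.cloud.databricks.com"     # placeholder workspace URL
    TOKEN = "<personal-access-token>"                          # placeholder PAT
    CORE_TABLES = ["main.core.orders", "main.core.customers"]  # placeholder core-layer tables

    rows = []
    for table in CORE_TABLES:
        # Assumption: the table-lineage endpoint accepts a JSON body with these fields.
        resp = requests.get(
            f"{HOST}/api/2.0/lineage-tracking/table-lineage",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={"table_name": table, "include_entity_lineage": True},
        )
        resp.raise_for_status()
        for upstream in resp.json().get("upstreams", []):
            info = upstream.get("tableInfo", {})
            rows.append({
                "core_table": table,
                "source_table": ".".join(filter(None, [
                    info.get("catalog_name"), info.get("schema_name"), info.get("name"),
                ])),
            })

    # Requires openpyxl for .xlsx output.
    pd.DataFrame(rows).to_excel("core_table_lineage.xlsx", index=False)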
I am not able to load multiple tables; I am getting this error:
Exception in component tMysqlInput_1 (MYSQL_DynamicLoading)
java.sql.SQLException: Bad format for Timestamp 'GUINESS' in column 3
One table works fine. Basically, after the first iteration, the second table tries to use the schema
of the first table. Please help me understand how to edit the component to make this
work. I am trying to load the actor and country tables from the sakila MySQL database into
another database on the same server. The image above shows a successful single-table
dynamic load.
You should not use tMysqlInput if the output schemas differ; in that case there is no way around tJavaRow and custom code. However, I cannot guess what happens in your tMap, so you should provide more details about what you want to achieve.
If all you need is to load data from one table to another without any transformations, you can do one of the following:
If your tables reside in two different databases on the same server, you can use a tMysqlRow and execute a query like "INSERT INTO catalog.table SELECT * from catalog2.table2..". You can do some simple transformations in SQL if needed (a sketch of this idea outside Talend follows below).
If your tables live on different servers, check the generic solution I suggested for a similar question here. It may need some tweaking depending on your use case, but the general idea is to replicate the functionality of INSERT INTO ... SELECT when the tables are not on the same server.
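To make the first option concrete outside of Talend, here is a minimal sketch of the same server-side copy in plain Python with mysql-connector-python. The credentials and the target database name are placeholders, and it assumes the target tables already exist with matching schemas:

    import mysql.connector

    # Placeholder credentials for the MySQL server that hosts both databases.
    conn = mysql.connector.connect(host="localhost", user="etl_user", password="secret")
    cur = conn.cursor()

    # One INSERT ... SELECT per table; the copy happens entirely on the server,
    # so no row data passes through the client and no schemas need to be declared here.
    for table in ("actor", "country"):
        cur.execute(f"INSERT INTO target_db.{table} SELECT * FROM sakila.{table}")

    conn.commit()
    cur.close()
    conn.close()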
I have the following question (even thorough research couldn't help me):
I want to import data from a (rather large) CSV/TXT file into a PostgreSQL database, but I want to filter each line (before importing it) based on specific criteria.
What command/solution can I use?
As a side note: if I am not reading from a file but from a data stream, what is the relevant command/procedure?
Thank you all in advance and sorry if this has been in some answer/doc that I have missed!
Petros
To explain the staging table approach, which is what I use myself:
Create a table (it could be a temporary table) matching your CSV structure.
Import into that table, doing no filtering.
Process and import your data into the real tables, using SQL to do the filtering and any other processing.
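A minimal sketch of those three steps with psycopg2; the file name, the staging and target table layouts, and the filter condition are assumptions:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection string
    cur = conn.cursor()

    # 1. Staging table matching the CSV structure (TEMP: dropped when the session ends).
    cur.execute("CREATE TEMP TABLE staging (id int, name text, amount numeric)")

    # 2. Bulk-load the whole file with COPY, no filtering yet.
    with open("data.csv") as f:
        cur.copy_expert("COPY staging FROM STDIN WITH (FORMAT csv, HEADER true)", f)

    # 3. Filter while moving rows into the real table.
    cur.execute(
        "INSERT INTO real_table (id, name, amount) "
        "SELECT id, name, amount FROM staging WHERE amount > 0"
    )

    conn.commit()
    cur.close()
    conn.close()

Because copy_expert accepts any file-like object, the same COPY call also covers the data-stream case from the question: pass the stream instead of an opened file.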
Now, in PostgreSQL, you could also use file_fdw to get direct SQL access to CSV files. In general the staging-table solution is usually cleaner, but file_fdw essentially lets PostgreSQL treat the file as a table by going through a foreign data wrapper.
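And a sketch of the file_fdw route, again driven from psycopg2. Creating the extension and server typically needs superuser rights, the file has to be readable by the PostgreSQL server process (the path is as seen by the server), and the column layout here is an assumption:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=postgres")  # placeholder connection string
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS file_fdw")
    cur.execute("CREATE SERVER IF NOT EXISTS csv_server FOREIGN DATA WRAPPER file_fdw")
    cur.execute("""
        CREATE FOREIGN TABLE IF NOT EXISTS csv_source (id int, name text, amount numeric)
        SERVER csv_server
        OPTIONS (filename '/path/to/data.csv', format 'csv', header 'true')
    """)

    # The file now behaves like a read-only table, so the filtering happens in the query.
    cur.execute("INSERT INTO real_table SELECT id, name, amount FROM csv_source WHERE amount > 0")
    conn.commit()
    cur.close()
    conn.close()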
I have several tables I'm importing from ODBC using the import script step. Currently, I have an import script for each and every table. This is becoming unwieldy as I now have nearly 200 different tables.
I know I can calculate the SQL statement to say something like "Select * from " & $TableName. However, I can't figure out how to set the target table without specifying it in the script. Please, tell me I'm being dense and there is a good way to do this!
Thanks in advance for your assistance,
Nicole Willson
Integrated Research
Unfortunately, the target table of an import has to be hard-coded in FileMaker up through version 12 if you're using the Import Records script step. I can think of a workaround, but it's rather convoluted, and if you're importing a large number of records it would probably significantly increase the import time.
The workaround would be to not use the Import Records script step, but to script the creation of records and the population of data into fields yourself.
First of all, the success of this would depend on how you're using ODBC. As far as I can think, it would only work if you're using ODBC to create shadow tables within FileMaker so that FileMaker can access the ODBC database via other script steps. I'm not an expert with the other ODBC facilities of FileMaker, so I don't know if this workaround would be helpful in other cases.
So, if you have a shadow table for the remote ODBC database, then you can use a script something like the following. The basic idea is to have two sets of layouts: one for the shadow tables that the information is coming from, and another for the FileMaker tables that the information needs to go to. Loop through that list of tables, pulling information from the shadow table into variables (or something like the dictionary library I wrote, which you can find at https://github.com/chivalry/filemaker-dictionary). Then go to the layout linked to the target table, create a record, and populate the fields.
This isn't a novice technique, however. In addition to using variables and loops, you're also going to have to use FileMaker's design functions to determine the source and destination of each field and Set Field By Name to put the data in the right place. But as far as I can tell, it's the only way to dynamically target tables for importing data.