I have a problem with no errors and nothing wrong in the logs.
I'm trying to repopulate the relate tables with Enterprise Manager, but the tables aren't being loaded correctly; only the default value appears.
The mechanism I'm using is simple: if one date is later than another, the process executes.
But even when I manually change the dates with a SQL query, the process doesn't give me any results.
I also created a data load with the option to repopulate and still no changes.
I even tried to apply the fix for version 10.1 and still nothing... (https://community.microstrategy.com/s/article/KB268380-In-MicroStrategy-10-Enterprise-Manager-Lookup-tables-in)
If any of you have any idea I'd be grateful.
I need to process millions of records coming from MongoDB and build an ETL pipeline to insert that data into a PostgreSQL database. However, with every method I've tried, I keep getting an out-of-memory (Java heap space) exception. Here's what I've already tried:
Connecting to MongoDB using tMongoDBInput with a tMap to process the records and output them over a connection to PostgreSQL. tMap could not handle it.
Loading the data into a JSON file and then reading from that file into PostgreSQL. The data got loaded into the JSON file, but from there on I got the same memory exception.
Increasing the RAM for the job in the settings and retrying the above two methods; still no change.
I specifically wanted to know if there's any way to stream this data or process it in batches to get around the memory issue.
Also, I know there are some components dealing with BulkDataLoad. Could anyone please confirm whether they would be helpful here, given that I want to process the records before inserting, and if so, point me to the right documentation to get that set up?
Thanks in advance!
As you have already tried all the possibilities, the only way I can see to meet this requirement is to break the job down into multiple sub-jobs, or to go with an incremental load based on key columns or date columns, considering this as a one-time activity for now.
Please let me know if it helps.
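To make the batch idea concrete, here is a minimal sketch of the pattern in plain Python. It is not Talend-specific: `batches`, `etl`, the batch size, and the list used as a sink are all hypothetical stand-ins for a MongoDB cursor on the input side and a bulk insert (e.g. `executemany` or COPY) on the output side. The point is that only one batch is ever held in memory.

```python
from itertools import islice

def batches(cursor, batch_size):
    """Yield lists of up to batch_size records from any iterable/cursor,
    so only one batch is held in memory at a time."""
    it = iter(cursor)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk

def etl(source, sink, batch_size=1000, transform=lambda r: r):
    """Stream records from source to sink in fixed-size batches,
    applying a per-record transform (the tMap-style processing step).
    Returns the total number of records moved."""
    total = 0
    for chunk in batches(source, batch_size):
        # Stand-in for a batched insert/commit against PostgreSQL.
        sink.extend(transform(r) for r in chunk)
        total += len(chunk)
    return total
```

With the real components, the source would be a MongoDB cursor (which already streams) and the sink a batched PostgreSQL insert; in Talend the equivalent, if your component versions expose them, is the cursor/fetch-size option on the input and the batch size and commit-every options on the output, rather than buffering the whole set in a tMap lookup.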
In my project I have to keep inserting new rows into a table based on some logic. After this, I want the rows of the updated table to be fetched each time an event is triggered.
But the problem is that the new rows aren't accessible: the table only appears updated after I close the current simulation. A similar case was posted last year, but the answer wasn't clear, and due to a low reputation score I am unable to comment on it. Does anyone know whether AnyLogic 8.1.0 PLE supports reading newly updated database table records at runtime, or is there some other solution?
This works correctly in AnyLogic (at least in the latest 8.2.3 version) so I suspect there is another problem with your model.
I just tested it:
set up a simple 2-column database table;
list its contents (via query) at model startup;
update values in all rows (and add a bunch of rows) via a time 1 event;
list its contents (via query) via a time 2 event.
All the new and updated rows show correctly (including when viewing the table in AnyLogic, even when I do this during the simulation, pausing it just after the changes).
Note that if you're checking the database contents via the AnyLogic client, you need to close and reopen the table to see the changes if you already had it open when starting the run. This view does auto-update when you close the experiment, so I suspect that is what you were seeing: the rows had been added (and will be there when/if you query them later in the model), but the table in the AnyLogic client only shows the changes when it is closed and reopened, or when the experiment is closed.
Since you used the SQL syntax (rather than the QueryDSL alternative syntax) to do your inserts, I also checked with both options (and everything works the same in either case).
"The table is always updated after I close the current simulation"
Do you mean when you close the experiment?
It might help if you can show the logic/syntax you are using for your database inserts and your queries.
I'm pretty new to PowerPivot and have a problem.
I created an SSIS project (.dtsx) to import around 10 million rows of data and an Analysis Services Tabular Project (.bim) to process the data model.
Up until today, everything worked as expected, but after making a schema change to add further columns to a table and updating the model, I now have a problem. When opening the existing connection in Business Intelligence Development Studio (BIDS) to update the schema changes, I was told that I would have to drop and reload the Sales and Returns tables as they were related.
Now, when I try to filter on a particular attribute, the Sales 'Sum of Units' column always displays the total sum of units for every row, instead of the correct values. I remember having this problem once when I was building the system, but it went away after re-processing the tables in BIDS... this time however, no amount of processing is making any difference.
I'm really hoping that this is a common problem and that someone has a nice easy solution for me, but I'll take whatever I can get at this stage. I'd also quite like to understand what is causing this. Many thanks in advance.
For anyone with a similar problem, I found the answer.
Basically, I had made a schema change and BIDS told me that I had to drop my SalesFact and ReturnsFact tables before updating the model with the new database schema. The problem was that I did not realise that relationships had been set up on these tables and so after re-adding them, the model was missing its relationships to the other tables... that's why all rows showed the same value.
The fix was to put the model into design view and to create relationships between the tables by clicking and dragging between them.
I knew it was something simple.
We're considering using SSIS to maintain a PostgreSql data warehouse. I've used it before between SQL Servers with no problems, but am having a lot of difficulty getting it to play nicely with Postgres. I’m using the evaluation version of the OLEDB PGNP data provider (http://www.postgresql.org/about/news.1004).
I wanted to start with something simple like UPSERT on the fact table (10k-15k rows are updated/inserted daily), but this is proving very difficult (not to mention I’ll want to use surrogate keys in the future).
I’ve attempted (Link) and (http://consultingblogs.emc.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx), which are effectively the same (except that I don’t really understand the UNION ALL at the end when I’m trying to upsert). But I run into the same problem with parameters when doing the update using an OLE DB Command, which I tried to overcome using (http://technet.microsoft.com/en-us/library/ms141773.aspx); that just doesn’t seem to work, and I get a validation error:
The external columns for complent.... are out of sync with the datasource columns... external column “Param_2” needs to be removed from the external columns.
(This error is repeated for the first two parameters as well; I never came across this using the SQL Server connection, as it supports named parameters.)
Has anyone come across this?
AND:
The fact that this apparently simple task is so difficult to do in SSIS suggests I’m using the wrong tool for the job. Is there a better (and still flexible) way of doing this? Or would another ETL package be better for use between two Postgres databases? Other options include any listed on (http://en.wikipedia.org/wiki/Extract,_transform,_load#Open-source_ETL_frameworks). I could just write a load of SQL to do this myself, but I wanted a neat and easily maintainable solution.
I have used the Slowly Changing Dimension wizard for this with good success; it may give you what you are looking for:
http://msdn.microsoft.com/en-us/library/ms141715.aspx
Regarding the "external columns out of sync" error: SSIS is case-sensitive. I encountered this issue multiple times, and it makes me want to pull my hair out.
This simple task is going to take some work either way. SSIS is by no means an enterprise class ETL product yet, but it does give you some quick and easy functionality, and is sufficient for most ETL work. I guess it is also about your level of comfort with it as well.
SCD is way too slow for what I want. I need to use set based sql.
It turned out that a lot of my problems were with bugs in the provider.
I opened a forum topic (http://www.pgoledb.com/forum/viewtopic.php?f=4&t=49) and had a useful discussion with the moderator/support/developer person.
Also, Postgres doesn't let you do cross-database queries, so I solved the problem this way:
1. Data source from the production DB into a temp table in the archive DB
2. Run a set-based query between the temp table and the archive table
3. Truncate the temp table
Note that the temp table is not actually a temp table, but a copy of the archive table's schema used to temporarily store data.
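The set-based query in step 2 can be sketched as a single upsert statement. Since I can't run PostgreSQL here, the sketch below uses Python's built-in sqlite3, which accepts the same INSERT ... ON CONFLICT ... DO UPDATE syntax as PostgreSQL 9.5+; the `fact` and `staging` table names and columns are made up for illustration. On older PostgreSQL versions (or SQL Server via SSIS), the same effect needs two set-based statements: an UPDATE ... FROM for matches, then an INSERT ... WHERE NOT EXISTS for the rest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE fact (id INTEGER PRIMARY KEY, units INTEGER)")
cur.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, units INTEGER)")

# Existing fact rows plus a daily staging load (one overlapping id, one new).
cur.executemany("INSERT INTO fact VALUES (?, ?)", [(1, 10), (2, 20)])
cur.executemany("INSERT INTO staging VALUES (?, ?)", [(2, 25), (3, 30)])

# Step 2: one set-based upsert instead of a row-by-row OLE DB Command.
# (WHERE true is required by SQLite's parser when upserting from a SELECT.)
cur.execute("""
    INSERT INTO fact (id, units)
    SELECT id, units FROM staging WHERE true
    ON CONFLICT(id) DO UPDATE SET units = excluded.units
""")

# Step 3: clear the staging table for the next load
# (PostgreSQL would use TRUNCATE; SQLite has no TRUNCATE statement).
cur.execute("DELETE FROM staging")
conn.commit()
```

After this runs, `fact` holds the updated row (2, 25) and the new row (3, 30) alongside the untouched (1, 10), and `staging` is empty, ready for the next day's 10k-15k rows.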
Took a while, but I got there in the end.
"SSIS is by no means an enterprise class ETL product yet, but it does give you some quick and easy functionality, and is sufficient for most ETL work."
What enterprise ETL solution would you suggest?
I'm working with SQL 2000 and I need to determine which of the databases on the server are actually being used.
Is there a SQL script I can use to tell me the last time a database was updated? Read from? Etc.?
I Googled it, but came up empty.
Edit: the following addresses the issue of finding, after the fact, the last access date. With regard to figuring out who is using which databases, this can be definitively monitored with the right filters in SQL Profiler. Beware, however, that Profiler traces can get quite big (and hence slow and hard to analyze) when the filters are not adequate.
Changes to the database schema, i.e. the addition of tables, columns, triggers and other such objects, typically leave "dated" tracks in the system tables/views (I can provide more detail about that if need be).
However, unless the data itself includes timestamps of sorts, there are typically very few sure-fire ways of knowing when data was changed, unless the recovery model involves keeping all such changes in the log. In that case you need tools to "decompile" the log data...
With regard to detecting "read" activity... a tough one. There may be some computer-forensics-like tricks, but again, no easy solution I'm afraid (beyond the ability to see in server activity the very last query for all still-active connections; obviously a very transient thing ;-) )
I typically run Profiler if I suspect the database is actually used. If there is no activity, I simply set it to read-only or take it offline.
You can use a transaction log reader to check when data in a database was last modified.
With SQL 2000, I do not know of a way to know when the data was read.
What you can do is put a trigger on logins to the database and track when a login is successful, along with associated variables, to find out who or what application is using the DB.
If your database is fully logged, create a new transaction log backup and check its size. The log backup will have a small, fixed length when no changes were made to the database since the previous transaction log backup, and it will be larger if there were changes.
This is not a very exact method, but it can be easily checked, and might work for you.