Pentaho ETL service unavailable from local machine - pentaho-spoon

Pretty straightforward: I have a transformation that only checks whether a certain URL is available. When I run that transformation from my local Pentaho instance, it fails. When I run it from a Carte server on AWS, it works.
I can, however, browse to the same site from my local machine and that works fine as well.
I asked, and there are no firewall rules in place that would stop me from doing this. The URL itself does not even matter (I could, for example, use stackoverflow.com as the target and the symptoms would be the same).
I'm really stumped by this one. Open to all suggestions.
In case it matters, I run the CE version of Pentaho ETL 8.3.
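One thing worth ruling out (an assumption on my part, not something stated in the question) is a proxy that the browser uses but Spoon's JVM does not know about; that would explain why the browser and the AWS Carte server both work while the local transformation fails. A quick check from the same machine, outside of Spoon:

    # Does the URL answer outside of Spoon? (stackoverflow.com used as in the question)
    Invoke-WebRequest -Uri "https://stackoverflow.com" -UseBasicParsing -TimeoutSec 10 |
        Select-Object StatusCode

    # Is a system proxy configured for the browser? Spoon's JVM won't pick this up by itself.
    Get-ItemProperty 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings' |
        Select-Object ProxyEnable, ProxyServer

If a proxy does show up, the standard Java options -Dhttp.proxyHost / -Dhttp.proxyPort (and their https equivalents) can be passed to Spoon's JVM, for example via PENTAHO_DI_JAVA_OPTIONS; verify the exact variable name against your PDI 8.3 install.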

Related

How to replicate a PostgreSQL database from local to web server

I am new to the forum and also new to PostgreSQL.
Normally I use MySQL for my projects, but I've decided to start migrating towards PostgreSQL for some good reasons I found in this database.
Expanding on the problem:
I need to analyze data via some mathematical formulas, but in order to do this I need to get the data from the software via its API.
The software, the API and PostgreSQL v11.4, which I installed on a desktop, are all running on Windows. So far I've managed to take the data via the API and import it into PostgreSQL.
My problem is how to transfer this data from the local PostgreSQL (on the PC) to a web PostgreSQL (installed on a web server) which is running Linux.
For example, if I take the data from the software via the API every five minutes and put it in the local PostgreSQL database, how can I transfer this data (automatically, if possible) to the database on the web server running Linux? I rejected a data dump because importing the whole database every time is not viable.
What I would like is to import only the five-minute data, which gradually adds to the previous data.
I also rejected the idea of a master-slave architecture because I don't know the total amount of data: the web server has almost 2 TB of disk, while the local PC has only one hard disk that serves only to take the data and then send it to the web server for analysis.
Could someone please help by giving some good advice regarding how to achieve this objective?
Thanks to all for any answers.
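Since the question is essentially a how-to, here is one common lightweight approach, sketched under explicit assumptions: a scheduled script on the Windows PC pushes only the rows that are newer than what the web server already has. The table name readings, the timestamp column captured_at, the host web.example.com and the database/user names are all placeholders; psql is assumed to be on the PATH and passwords to come from pgpass.conf. Built-in alternatives worth reading about are logical replication (publications/subscriptions, available in v11) and postgres_fdw.

    $localConn  = "host=localhost dbname=sensordata user=postgres"
    $remoteConn = "host=web.example.com dbname=sensordata user=app_user"
    # Forward slashes keep the path simple inside psql's quoted \copy argument.
    $csv = ($env:TEMP -replace '\\', '/') + '/delta.csv'

    # 1. Ask the web server for the newest row it already has.
    $lastTs = & psql -At -c "SELECT COALESCE(max(captured_at), 'epoch') FROM readings;" $remoteConn

    # 2. Export only the newer rows from the local database.
    & psql -c "\copy (SELECT * FROM readings WHERE captured_at > '$lastTs') TO '$csv' WITH (FORMAT csv)" $localConn

    # 3. Append them to the table on the web server.
    & psql -c "\copy readings FROM '$csv' WITH (FORMAT csv)" $remoteConn

    Remove-Item $csv

Run every five minutes from Windows Task Scheduler, this only ever moves the new rows, which matches the requirement of importing only the five-minute increments.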

Two master instances on same database

I want to use PostgreSQL on Windows Server 2012 R2 for one of our projects that needs 24/7 uptime.
I would like to ask the community whether I can have two master instances on two different servers, A and B, that both 'work' on the same DB located on shared file storage on the LAN. Normally only the master instance on server A would be online; when it goes offline for some reason, a PowerShell script would recognize that the PostgreSQL service has stopped and start the service on server B. The same script would continuously check that only one service on servers A and B is running, to avoid conflicts.
I'd like to ask whether this is possible, or whether there is a better approach for my configuration.
(I can't use replication, because when server A shuts down, server B is in read-only mode, which I don't want.)
If you manage to start two instances of PostgreSQL on the same data directory, serious data corruption will happen.
Normally there is a postmaster.pid file that prevents that, but a PostgreSQL server process on a different machine that accesses the same file system will happily unlink that after spewing some log messages, thinking it was left behind from a crash.
So you are really walking on thin ice with a solution like that.
Another issue you haven't considered is the script that is supposed to check whether the server is still running. What if that script fails because, for example, the network connection between the two servers is down, but the server is still up and running happily? Such a “split brain” scenario will cause data corruption with your setup.
Another word of caution: since you seem to be using Windows (Powershell?), you probably envision a CIFS file system when you are talking of shared storage. A Windows “network share” is not a reliable file system — last time I checked, it did not honor _commit.
Creating a reliable failover cluster is harder than you think, and I'd recommend that you check existing solutions before you try to roll your own.
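To make the split-brain point concrete, here is roughly what such a check boils down to (the hostname is a placeholder); note that the probe cannot distinguish "server A is dead" from "the link between A and B is broken":

    # pg_isready ships with PostgreSQL: exit code 0 = accepting connections,
    # 2 = no response at all. "No response" can also mean the network between
    # the two servers is down while server A is still up and running.
    & pg_isready -h serverA.example.local -p 5432 -t 5
    if ($LASTEXITCODE -ne 0) {
        Write-Warning "Server A did not answer, but that alone does not prove it is down."
        # Starting PostgreSQL on server B here, against the same shared data
        # directory, is exactly the corruption scenario described above.
    }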

Remotely execute PowerShell scripts to collect data

I am looking to collect data snapshots at random intervals from various machines on our network that we don't own, but that I may get access to in order to install an agent that collects this data.
These machines are either in a domain or a workgroup, and the kind of data I get is based on the role they play and the information they hold. The machines are Windows Server 2003 and above, and I do not want to install anything on them before I get started, so I thought I could use PowerShell scripts that I can remotely invoke from my server, passing in the script each machine has to run to return the data.
I was wondering whether this is possible with PowerShell scripts and, since this is supposed to run in a secure environment, whether there are any major security implications with this approach, i.e. do I need to do anything on the client machines that could make them vulnerable to security threats?
By the way, these machines are not exposed to the internet and are behind a firewall.
I would also appreciate pointers to any other alternatives that could be useful for my analysis.
Regards
Kiran
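Yes, this is what PowerShell remoting (Invoke-Command over WinRM) is for. The sketch below is only an illustration: the file machines.txt, the credential and the properties collected are assumptions, and WinRM has to be enabled on every target (winrm quickconfig / Enable-PSRemoting). The remote block sticks to Get-WmiObject and New-Object so it also works on the older PowerShell 2.0 hosts you may find on Server 2003-era machines.

    $targets = Get-Content .\machines.txt      # hypothetical list of computer names
    $cred    = Get-Credential                  # an account with rights on the targets

    $snapshot = Invoke-Command -ComputerName $targets -Credential $cred -ScriptBlock {
        # Runs on each remote machine; return objects, not formatted text.
        New-Object psobject -Property @{
            Computer   = $env:COMPUTERNAME
            OS         = (Get-WmiObject Win32_OperatingSystem).Caption
            FreeDiskGB = [math]::Round((Get-WmiObject Win32_LogicalDisk -Filter "DeviceID='C:'").FreeSpace / 1GB, 1)
            Collected  = Get-Date
        }
    }

    $snapshot | Export-Csv .\snapshot.csv -NoTypeInformation

On the security question: inside a domain, remoting authenticates with Kerberos and exposes nothing beyond the WinRM listener; for workgroup machines you have to add them to TrustedHosts and should insist on an HTTPS listener, which is the part that needs the most care.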

AWS deployment without using SSH

I've read some articles recently on setting up AWS infrastructure without enabling SSH on EC2 instances. My web app requires a binary to run, so how can I deploy my application to an EC2 instance without using SSH?
This was the article in question.
http://wblinks.com/notes/aws-tips-i-wish-id-known-before-i-started/
Although doable, as the article says, it requires you to think of servers as ephemeral. A good example of this is web services that scale up and down depending on demand: if something goes wrong with one of the servers, you can just terminate it and spin up another one.
Generally, you can accomplish this using a pull model. For example, at boot time pull your code from a Git/Mercurial repository and then execute scripts to set up your instance. The scripts also set up all the monitoring required to determine whether your server and application are up and running properly. You would still need an SSH client if you want to pull your code using SSH, although you could also do it over HTTPS.
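As a rough illustration of that pull model, assuming a Windows instance, a Git client already baked into the AMI, and entirely made-up repository, paths and service names, the bootstrap can live in the instance's user data (the <powershell> tags are the EC2 convention for Windows user data):

    <powershell>
    # Placeholders throughout: repository URL, paths and service name.
    $appDir = 'C:\app'
    git clone https://github.com/example/myapp.git $appDir   # pulled over HTTPS, no SSH involved
    & "$appDir\scripts\install.ps1"                           # hypothetical setup script
    New-Service -Name 'MyApp' -BinaryPathName "$appDir\bin\myapp.exe"
    Start-Service -Name 'MyApp'
    </powershell>

On a Linux instance the idea is identical, just with a shell script in the user data instead.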
You can also use configuration management tools that don't use SSH at all, like Puppet or Chef. Essentially, your node/server pulls all of your application and server configuration from the Puppet master or the Chef server. The Puppet agent or Chef client then performs all the configuration/deployment/monitoring changes needed for your application to run.
If you go with this model, I think one of the most critical components is monitoring. You need to know at all times whether there's something wrong with one of your servers, and in the event something goes wrong, discard that server and spin up a new one. (Even better if this whole process is automated.)
Hope this helps.

Umbraco on Azure: can I change hostname?

I've deployed a website made with Umbraco to Windows Azure, using the Windows Azure Accelerator for Umbraco.
For development and testing I used a test hostname. Now it's time to switch to the official DNS hostname.
How can I change the current hostname?
Currently I configure the hostname at deployment time (the only way I know to do this), but I can't deploy again, since many files have been changed while working on the website on Azure.
EDIT
Let me explain: at the prompt shown in the image (during website deployment) I entered "test.mywebsite.com" as the Domain Name, and configured the real DNS.
Now the website is configured, so I'd like to make mywebsite.com point to that site;
but configuring the mywebsite.com DNS isn't enough! Shall I deploy again? And will I lose any of the changes I made?
I'd like to make two comments on your question:
1) In order to host your Azure application under a custom host name, you will need to sign up with a DNS provider that supports C-NAME records (most do). I suggest someone like GoDaddy.com, because by default C-NAME records can only resolve your "www.domainname.com" queries and cannot do anything for queries where "www." is dropped from the URL. DNS providers like GoDaddy also have an option to redirect all traffic destined for "domainname.com" to a URL of your choice. This is a huge deal for Azure apps. Frankly speaking, it is somewhat disappointing that for all the PaaS and IaaS features of Azure, DNS was not included in the overall package. (A quick way to verify the record is shown at the end of this answer.)
2) I am a little worried when you say that you can no longer redeploy your app due to the changes made. Can you elaborate on that? Have you made changes to the application's code running on VMs in Azure without going through the redeployment process? If so, this is a huge no-no. Your VMs running in Azure are not "permanent". Microsoft and your redeployment process can (and will) re-stage those VMs to the original package at any given time. Microsoft will re-image your VMs at least once a month during their monthly OS upgrades, but they can also do so when they need to move your VM to another rack, etc. Whatever changes you make to your app must be stored either in source control before deployment or in a permanent storage facility like SQL Azure, Azure Storage, etc.
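A quick way to verify the C-NAME from point 1 once the registrar change has propagated (the hostnames below are just the placeholders already used in this thread):

    # Requires Windows 8 / Server 2012 or later; nslookup -type=CNAME works elsewhere.
    Resolve-DnsName -Name www.example.com -Type CNAME
    # Expected answer, roughly:
    #   www.example.com  CNAME  myacceleratorsservice.cloudapp.net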
HTH
Finally, I think the answers to my questions are:
- Shall I deploy again? Yes, I must deploy again.
- Will I lose any of the changes I made? Many changes will be maintained, since they are stored in the DB, but I have to do a lot of work to make the new website run!
This answer confirms my theory:
In my case, I created and uploaded a site with a name, let's say http://www.contoso.com, and then bought a domain from a registrar, let's say http://www.example.com. When I mapped http://MyAcceleratorsService.cloudapp.net/ to my new domain (http://www.example.com) and tried to open that domain, I got the home page of the Accelerator and not the uploaded site.
I had to upload the site again to Azure (using UploadUmbracoSite.cmd from the Accelerator application) and, when uploading, enter the same domain name as the one I registered: http://www.example.com. Then I was able to browse my uploaded site as expected.
As for your question, I would upload the site again using UploadUmbracoSite.cmd (it is in the Setup folder) and enter the new domain name when requested.
Exactly what I was trying to avoid... but the only solution, I suppose.
Well, it was not easy to publish again; I got errors of many types (I suppose tied to some components that I installed after the deploy and that are not installed in the newly deployed website). I'm going to work through them.
Edit
Completed my work:
- loads of different attempts, none of which worked
- CTP backup of the DB
- deleted the DB and the website
- new full deploy of Umbraco
- CTP restore of the DB
Finally:
- all work on content is OK
- all work on styles, pages and templates is lost
Changing the hostname is hard; don't use a test hostname, use the definitive hostname from the beginning.
If anyone has a suggestion, I'll be pleased to test it, anyway.
This is not really an answer to your question, but it might be a solution to your problem: use a CNAME record to make the production DNS name point to your development name, e.g. www.productionname.com will then point to www.testname.com. I am not sure whether everything will just work out of the box, but it seems worth a try.
This requires that your DNS provider allows you to set up CNAME records.
http://en.wikipedia.org/wiki/CNAME_record