I'm importing the Europe database into Nominatim on my server, following this post:
https://help.openstreetmap.org/questions/15505/import-more-osm-files-in-to-nominatim
Today the import is at 12 of 20 GB and it's the 6th day, so it's going very slowly. I need to enable the cURL extension for Apache & PHP, which requires restarting Apache. Does Nominatim use Apache to update and index the database? If I restart Apache, will that stop the Nominatim update?
I've never done this myself, but the import should not require a running Apache instance, as that would be a rather unfortunate approach. According to utils/update.php, it calls Nominatim and osm2pgsql directly, so restarting Apache should be fine.
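If you want to double-check before restarting, you can look at which processes are actually doing the work. This is just a quick sanity check, assuming a typical setup where the import runs osm2pgsql and the PHP update script:

    # The import should show up as osm2pgsql and/or the update script,
    # not as an Apache worker
    ps aux | egrep 'osm2pgsql|update.php' | grep -v egrep

    # Restarting Apache then only affects the web frontend
    sudo service apache2 restart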
I am running Airflow with a Postgres backend.
The web server became slow during operation.
The problem turned out to be data continually accumulating in the dag_run and log tables of the metadata database (it became faster after I connected to Postgres and deleted the data directly).
Are there any Airflow options to clean the DB periodically?
If there is no such option, we will try to delete the data directly from a DAG script.
I also find it strange that the web server slows down when there is a lot of data. Does the web server fetch all the data whenever a new window is opened?
You can purge old records by running:
airflow db clean [-h] --clean-before-timestamp CLEAN_BEFORE_TIMESTAMP [--dry-run] [--skip-archive] [-t TABLES] [-v] [-y]
(cli reference)
It is quite a common setup to include this command in a DAG that runs periodically.
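For example (the cutoff date below is arbitrary; adjust the retention window to your needs):

    # Preview what would be deleted first; --dry-run removes nothing
    airflow db clean --clean-before-timestamp '2023-01-01 00:00:00' --dry-run

    # Actually purge rows older than the cutoff; --skip-archive drops them
    # instead of keeping archive tables, -y skips the confirmation prompt
    airflow db clean --clean-before-timestamp '2023-01-01 00:00:00' --skip-archive -y

In a DAG, the same command is typically wrapped in a BashOperator on a weekly or monthly schedule.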
I'm trying to connect to an RDS cluster in AWS that's an Aurora PostgreSQL database. It's a brand-new database that I created along with the instances that I have Jira deployed to. However, when I try to connect to the instance from the Jira configuration screen, I get this error:
You have specified a database that is not empty, please specify an empty database.
I haven't touched this database at all, so why is it giving me this error? I have one reader and one writer instance in my cluster, and the "hostname" is the endpoint of my writer instance, which is what the docs say to use. Could this be an issue with the Jira version I'm using?
This is the download link I'm using in my user-data script to install Jira (I'm also using PostgreSQL version 12.11):
https://www.atlassian.com/software/jira/downloads/binary/atlassian-servicedesk-4.19.1-x64.bin
I switched to a different PostgreSQL version and now it's working. PostgreSQL 12.11 was giving me the error; switching to version 13.7 works as expected.
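If someone runs into the same error, it's also worth confirming from psql that the target database really contains no objects (the host, user, and database name below are placeholders):

    # List tables in the target database; a truly empty Jira target
    # should report that no relations were found
    psql -h <cluster-writer-endpoint> -U jira -d jiradb -c '\dt'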
I have PostgreSQL 9.5 (yes, I know it's not supported anymore) installed on Ubuntu Server 18.04, following these instructions: https://www.postgresql.org/download/linux/ubuntu/
I want to change the log path and have a separate log for every database. However, the package maintainer has configured things in such a way that the log* settings in the PostgreSQL configuration are ignored and everything is logged to files by some other mechanism, and I can't find out how. Currently it logs to /var/log/postgresql/postgresql-9.5-clustername.log; I want it to be /var/log/postgresql/clustername/database.log, but I don't know where to configure that. In PostgreSQL, log_destination is set to stderr.
The Ubuntu packages have logging_collector disabled by default, so the log is not handled by PostgreSQL, but by the startup script.
However, there is no way in PostgreSQL to get a separate log file per database, so the only way to get what you want is to put the databases in individual clusters rather than into a single cluster.
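On Ubuntu/Debian this is straightforward with the postgresql-common tools, since every cluster automatically gets its own log file (the cluster names below are just examples):

    # One cluster per database; each logs to its own file, e.g.
    # /var/log/postgresql/postgresql-9.5-sales.log
    sudo pg_createcluster 9.5 sales --start
    sudo pg_createcluster 9.5 billing --start

    # Show all clusters with their ports and log file locations
    pg_lsclusters

Each new cluster is assigned its own port, so clients have to be pointed at the right one.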
I have Apache Airflow running on an EC2 instance (Ubuntu). Everything is running fine.
The DB is SQLite and the executor is the Sequential Executor (the default). But now I would like to run some DAGs that need to run at the same time, one every hour and one every 2 minutes.
My question is: how can I upgrade my current setup to the Celery executor and a Postgres DB to get the advantage of parallel execution?
Will it work if I install and set up Postgres, RabbitMQ, and Celery, and make the necessary changes in the airflow.cfg configuration file?
Or do I need to re-install everything from scratch (including Airflow)?
Please guide me on this.
Thanks
You can, indeed, install Postgres/RabbitMQ/Celery, then update your configuration file (airflow.cfg), initialise the database, and restart the Airflow services.
However, there is a side note: if required, you'd also have to migrate data from SQLite to Postgres. Most importantly, the database contains your connections and variables. It's possible to export variables beforehand and import them again using the Airflow CLI (see this answer, and the Airflow documentation).
It's also possible to import your connections using the CLI, as described in this Airflow guide (or the documentation).
If you've just switched to the new database setup and notice something is missing, you can still easily switch back to the SQLite setup by reverting the changes to airflow.cfg.
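A rough outline of the migration as shell commands; this assumes an Airflow 2.x CLI, and the file names are placeholders:

    # 1. Export variables and connections from the old SQLite setup
    airflow variables export /tmp/variables.json
    airflow connections export /tmp/connections.json

    # 2. Edit airflow.cfg (executor, sql_alchemy_conn, broker_url),
    #    then initialise the new Postgres database
    airflow db init

    # 3. Re-import into the new database
    airflow variables import /tmp/variables.json
    airflow connections import /tmp/connections.json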
We have a four-node Hadoop cluster with HDP 2.4 and Kerberos installed. As this is our production cluster, we wanted to have HA for all the services, including the PostgreSQL database used by Hive, Ambari, and Oozie to store their metadata. However, our PostgreSQL version, 8.4.2, doesn't support Postgres's built-in streaming replication.
So, we decided to upgrade PostgreSQL to a version (9.3) that Ambari supports.
I followed this link to upgrade Postgres. Everything went well, except that we get the following error when restarting the Ambari server:
Ambari Server running with administrator privileges.
Running initdb: This may take upto a minute.
Data directory is not empty!
[FAILED]
Could someone help?
Thanks.
Your server wants to initialize the database; I guess it does not see the Ambari DB. Use ambari-server setup to restore the database connection. Then the server should start properly.
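For reference, re-running the interactive setup and restarting would look like this on the Ambari host:

    # Re-run setup and point Ambari at the existing Postgres database
    sudo ambari-server setup

    # Then bring the server back up
    sudo ambari-server restart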
I found the fix for the issue here.