Elasticsearch shows umlauts as "??" - encoding

Setup:
Ubuntu 12.04 Server installed via VMWare quick install
PostgreSQL 9.1
ElasticSearch 0.90
Mono 3.2.1
Rails 4
Nginx 1.4.2 + Passenger 4.0.16
I have a C# program that, on start, writes a new ElasticSearch index and points the alias used by the Rails applications to it. The program then keeps running and watches a Redis instance for things to update.
There is another C# program that scrapes data from web pages. Once scraped, the data is put into PostgreSQL and the index writer above is notified via Redis. Those pages have varying encodings and are converted to UTF-8.
The first appearance of this bug was when I made a mistake and encoded data that was already UTF-8 as UTF-8 again.
Investigation
Now I thought that I obviously have some data corruption going on, but the weird thing is: the umlauts are only corrupted when I start the indexing Mono process from Rails via nohup. If I kill this process and start it manually from the command line, it works perfectly fine.
When I do a backup/restore of the database it works again from the web interface, but once the server is rebooted the umlauts are again replaced with ?? when starting the Mono process from the web interface.
The first thing I did was purge the affected rows from the database and scrape the data again (without encoding it twice). That didn't help. Since the error only appears when the process runs non-interactively via nohup from the Rails application, I assumed the locale setting was the cause, so I set it to en_US.UTF-8 and en_US:en in both /etc/default/locale and /etc/environment, but that did not help either.
I really have no idea what else I can do or what exactly causes this error, any help would be appreciated.
Edit: I forgot to clarify the most important part: when umlauts are replaced with ??, ALL umlauts are replaced in every single document in the index.

Put this in the script that you use to start your process:
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
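The reason your process only picks up UTF-8 when you start it manually is that these locale settings are not system-wide: they come from the environment of your interactive shell, which a process launched from Rails via nohup typically does not inherit. I've run into this with JRuby and init.d scripts before, and the solution is to not rely on defaults for this.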
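If the process is launched from another program rather than from a wrapper script, the same idea can be applied by passing the environment explicitly. Below is a minimal sketch in Python (the question launches from Rails, so this is only an illustration of the technique); the helper names `utf8_env` and `run_with_utf8_locale` are hypothetical, and the child command is a stand-in for the real indexing process.

```python
import os
import subprocess
import sys

def utf8_env(base=None):
    """Return a copy of the environment with the UTF-8 locale variables set explicitly."""
    env = dict(os.environ if base is None else base)
    env.update({
        "LC_ALL": "en_US.UTF-8",
        "LANG": "en_US.UTF-8",
        "LANGUAGE": "en_US.UTF-8",
    })
    return env

def run_with_utf8_locale(cmd):
    """Run cmd with the locale forced instead of inheriting the parent's defaults."""
    return subprocess.run(cmd, env=utf8_env(), capture_output=True, text=True, check=True)

if __name__ == "__main__":
    # Stand-in child process: prints the LC_ALL value it actually received.
    child = [sys.executable, "-c", "import os; print(os.environ.get('LC_ALL'))"]
    print(run_with_utf8_locale(child).stdout.strip())
```

The point of the design is that the child never depends on whatever environment the web server happened to start with.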

Related

psql client failing to import dump file - the system cannot find the specified file

I'm attempting to import an SQL dump in PgAdmin 4 using the psql client. However, the error message returned is: The system cannot find the file specified.
Here is a screenshot of my psql client -
The file films.sql is currently stored on my desktop, but I suspect the default location that the psql client accesses is not my desktop. Is there any way to set the location that the client looks in, in order to resolve this?
The SQL file is viewable here: https://github.com/datacamp/courses-intro-to-sql/tree/master/datasets
I simply want to get the database on my local machine so that I don't need to store queries in an online learning platform. It would be best if this database is available locally to query and practice on.
I've attempted to execute the whole SQL file as a query on the films database, but this does not seem to work either and returns 'Asynchronous query execution/operation underway.
Query returned successfully in 388 msec.' However, the asynchronous query never seems to complete when I refresh the database.
Please can someone help?
Just give the path to your file:
psql -d my_database -f /path/to/the/file.sql
psql -d my_database -f C:/path/to/the/file.sql
Depending on whether you are on a unix/linux machine or Windows.
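As a rough illustration of "give the full path", here is a small Python sketch that resolves the file to an absolute path and checks that it exists before building the psql command. The function name `psql_import_command` is hypothetical; the `-d` and `-f` options are the ones shown above, and the sketch assumes psql is on the PATH if you actually run the command.

```python
from pathlib import Path

def psql_import_command(database, sql_file):
    """Build the psql argument list for importing sql_file into database."""
    path = Path(sql_file).expanduser().resolve()
    if not path.is_file():
        # Fail early with the full path, so it is obvious where the file was expected.
        raise FileNotFoundError(f"SQL file not found: {path}")
    # as_posix() yields forward slashes, matching the C:/path/... form used above.
    return ["psql", "-d", database, "-f", path.as_posix()]

# Possible usage (assumes a database named films and films.sql on the desktop):
#   import subprocess
#   subprocess.run(psql_import_command("films", "~/Desktop/films.sql"), check=True)
```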
Oh, and if you aren't familiar with file paths you may want to take a step back and become more familiar with general computer terminology before diving into an RDBMS. Your learning will be much easier if you have a solid foundation to build upon.
I suspect this question might be moot for the asker at this point, but for anyone else stumbling upon it like I did: the interactive connection info prompts are provided by a batch script (in Windows, I'd guess there's an analogous shell script for Unix) called runpsql.bat, which then just passes your inputs as commandline arguments to the psql.exe executable. I was getting this error because I had migrated my Postgres installation and the batch script was calling a nonexistent path for psql.exe, hence The system cannot find the file specified. I edited runpsql.bat to point to the correct location of psql.exe and that resolved the issue. So for OP, I would look into PgAdmin4 and see where it's (presumably) calling runpsql.bat, then make sure that that calls psql.exe with the correct path.

FreeTDS - TSQL fails when password given in option, but succeeds when typed?

I'm trying to connect to an MS SQL Server through PHP 5.6 from an Ubuntu 16.04 server. I'm forced to use this version of PHP to ensure compatibility with an 'ancient' application.
I installed PHP 5.6 and its modules (pdo, pdo_mysql, readline, etc.) through ondrej's PPA without any trouble, but I wasn't able to find and install the 'mssql.so' module package needed for the application I'm trying to get working.
That's why I decided I would use ODBC (and PDO_ODBC) drivers in order to connect to the database.
In order to do so, I installed FreeTDS with unixODBC and configured my files like this:
'odbcinst.ini' file:
'odbc.ini' file:
and finally the 'freetds.conf' file:
OK, I believe all my files are configured correctly, but the next part is a bit weird. When I connect to the MS SQL database using tsql without giving the password as an option, typing it when prompted instead, the connection works:
But when I give the same password as an option (-P), it doesn't work (I tried at least 10 times with the correct password):
--> tsql failed to open a session for the user 'WIPSOS-PHP'
The same problem happens when I try to use isql with one of the connectors I configured in the 'odbc.ini' file:
It seems to be related to the password, but I can't find the problem. Can you please help me?
I found a partial answer, and it was indeed related to the password: special characters in the password given to the -P option of the tsql command are interpreted by the shell before tsql ever sees them.
Therefore, depending on your shell, you have to escape these characters in the option's value. In my case, using bash, I had to use '\' to escape the special character:
tsql -S hostname -D Database -U User -P pa\$\$word
And it now works!
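For anyone scripting this, here is a small Python sketch of the same escaping problem. `shlex.quote` from the standard library produces a string that a POSIX shell passes through unchanged, and passing an argument list without a shell avoids the issue altogether. The function names are hypothetical, and the host, database, user and password values are the placeholders from the command above.

```python
import shlex

def tsql_command(server, database, user, password):
    """Argument list for tsql; run without a shell, it needs no escaping at all."""
    return ["tsql", "-S", server, "-D", database, "-U", user, "-P", password]

def tsql_shell_line(server, database, user, password):
    """The same command as a single string that is safe to paste into a POSIX shell."""
    return " ".join(shlex.quote(arg) for arg in tsql_command(server, database, user, password))

# The password is wrapped in single quotes, so bash no longer expands $$.
print(tsql_shell_line("hostname", "Database", "User", "pa$$word"))
# prints: tsql -S hostname -D Database -U User -P 'pa$$word'
```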
But when I try to escape the special characters with '\' in the 'odbc.ini' file, it still doesn't seem to work:
My next questions are: 'How do I escape a special character in an '.ini' file?' and 'Is there something wrong in my configuration?'
EDIT: I know it was a long time ago, but I had a configuration problem related to the hostname in the 'freetds.conf' file. I was using the server's local network DNS name instead of its IP address (which works).
I hope my issue will at least help some people, it isn't easy to deal with TSQL and FreeTDS...

PostgreSQL installation as a user on Cygwin, database server setup

I'm trying to install PostgreSQL on Cygwin as a user and I'm getting stuck on the part where I have to create a database server. After I execute the pg_ctl command, it just prints "server starting" and runs in the foreground indefinitely. Here's a picture of what I'm talking about:
http://postimg.org/image/oh7ucgt9h/
I'm generally a beginner with databases, so any pointers would be great.
Chances are it is hanging while allocating shared memory. Please go with the native Windows build instead. It is far easier to manage.
However if you insist, make sure the ipc-daemon is running before you run PostgreSQL. This will probably solve your problem.
You could run:
ipc-daemon --install-as-service
net start ipc-daemon
And this should do it.

MySQL Dump and Import not preserving encoding?

I am trying to copy a table from one MySQL database on a remote machine to another MySQL database on my local machine. I noticed that after importing the dump to my local machine, there were characters like â€™ instead of single quotes.
I assumed this was an encoding issue, so I went into both databases and ran show create table posts. Near the end of both, I saw CHARSET=utf8. I also ran file -i on the dump file, both before and after copying it to my local machine with scp, and both reported utf8.
However, when I import this file, I get this before:
attendees—policy makers,
and after:
attendeesâ€”policy makers,
I am not sure why this is happening, everything is using utf8, what am I missing?
EDIT: I am using mysql Ver 14.12 Distrib 5.0.75, for debian-linux-gnu (x86_64) remotely, and mysql Ver 14.14 Distrib 5.5.25a, for osx10.7 (i386) locally.
On both systems you must check that your connection encoding is correct:
SHOW VARIABLES LIKE 'character_set_%'
Characters like that are usually the result of double-encoding. Make sure the connection and client encodings match exactly. There are a number of command-line options that can facilitate this, or, if you're using a driver or client, a setting there can adjust it.
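To see what double-encoding does to the text, here is a short Python sketch. It assumes the usual cause: UTF-8 bytes being read as cp1252 (the Windows code page that MySQL calls latin1) somewhere along the connection. The function names are hypothetical, and the sample string is the one from the question.

```python
def double_encode(text):
    """Simulate the bug: the UTF-8 bytes of text are misread as cp1252."""
    return text.encode("utf-8").decode("cp1252")

def repair(text):
    """Undo one round of the misreading; raises if text was not mangled this way."""
    return text.encode("cp1252").decode("utf-8")

original = "attendees\u2014policy makers,"  # \u2014 is the em dash from the question
mangled = double_encode(original)

print(mangled)                      # the dash shows up as three junk characters
print(repair(mangled) == original)  # True: the damage is reversible
```

If the repair round-trips cleanly on your own data, that is a strong hint that the dump itself is fine and only the connection character set differs between export and import.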

Portable PostgreSQL for development off a usb drive

In order to take some development work home I have to be able to run a PostgreSQL database.
I don't want to install anything on the machine at home. Everything should run off the usb drive.
What development tools do you carry on your USB drive?
That question covers pretty much everything else, but I have yet to find a guide to getting postgresql portable. It doesn't seem easy if it's even possible.
So how do I get PostgreSQL portable? Is it even possible?
EDIT:
PostgreSQL Portable works. It's very slow on the usb-drive I have, but it works. I can't recommend doing constant development with it but for what I need it's great.
Perhaps if I pick up a full speed external drive I'll try out virtualization. Given the poor performance of just running the database off this drive, a full virtual OS running off of it would be unusable.
Here's how you can do this on your own:
http://www.postgresonline.com/journal/archives/172-Starting-PostgreSQL-in-windows-without-install.html
An alternate route would be to use something like VirtualBox and just install your development environment (database, whatever) on there.
There are 2 projects to try in 2014: http://sourceforge.net/projects/pgsqlportable/ and http://sourceforge.net/projects/postgresqlportable/?source=recommended.
I can't vouch for the second, but I'm using the first and it works right out of the box.
After unzipping using 7-zip (http://www.7-zip.org/download.html):
1) Run "start service without usuario.bat" (English translation)
2) Then run "pgadmin3.bat"
The only minor problem for me was that it's in Spanish. I was able to change the language to English by following "Change language of system and error messages in PostgreSQL". Using Google Translate, the instructions are:
Description
This is a zip to automatically run PostgreSQL 9.1.0.1 for Windows. This version already has pgagent and pldebugger. To run it you must:
1) Unzip the zip.
2) Run the "start service without usuario.bat" found in the pgsql directory within the folder you just unzipped.
3) Optional. If you want to run the PostgreSQL agent (pgagent), just run the "start pgagent.bat" found in the pgsql directory inside the folder you just unzipped.
4) Optional. To manage and/or develop the database you can run the pgadmin3.bat file.
5) Optional. To stop and/or restart the server correctly, use the file "service without stopping usuario.bat" or "restart service without usuario.bat", depending on the case.
Now with an option for Linux (.tar.gz file): PostgreSQL Portable Linux 9.2.
Please use the tickets to report bugs.
Username: postgres Password: 123
Just a note: on a new computer, to get pgAdmin III working you may need to add a database. The settings are in the attached screenshot.
Hope it helps.
I agree with the virtualization solution, but you may find this link from the Portable Freeware Collection useful. I have used this locally, though not from USB.
1. Download and extract: zip version
2. Inside the pgsql folder, create a data folder (any name will do; I used 'data')
3. Initialize the data folder: c:\pgsql\bin\initdb.exe -D c:\pgsql\data -U postgres -W -E UTF8 -A scram-sha-256
4. To start/stop, see the following cmd script that I use (press any key inside it to stop)
c:\pgsql\bin\pg_ctl.exe -D c:\pgsql\data -l logfile start
pause
c:\pgsql\bin\pg_ctl.exe -D c:\pgsql\data stop
more info
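Since a USB drive can get a different drive letter on each machine, hard-coded paths like c:\pgsql may not survive the move. Here is a hedged Python sketch that builds the same pg_ctl start/stop commands relative to wherever the pgsql folder currently is. The function name `pg_ctl_commands` is hypothetical, placing the log file inside the pgsql folder is an assumption, and the folder layout (bin, data) is the one from the steps above.

```python
from pathlib import Path

def pg_ctl_commands(pgsql_root):
    """Build pg_ctl start/stop argument lists from the location of the pgsql folder."""
    root = Path(pgsql_root)
    pg_ctl = str(root / "bin" / "pg_ctl")
    data = str(root / "data")
    # Assumption: keep the server log next to the installation on the drive.
    start = [pg_ctl, "-D", data, "-l", str(root / "logfile"), "start"]
    stop = [pg_ctl, "-D", data, "stop"]
    return start, stop

# Possible usage from a script stored inside the pgsql folder:
#   import subprocess
#   start, stop = pg_ctl_commands(Path(__file__).resolve().parent)
#   subprocess.run(start, check=True)
#   input("Press Enter to stop the server...")
#   subprocess.run(stop, check=True)
```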