PHPExcel fetching special characters from database

I'm trying to create an xls file with PHP using PHPExcel, fetching data from a MySQL database.
The sentence that gives me problems is something like "Corda Flessibile Antifiamma 1x16mm² NERO - € 1,21".
If I fetch it from the DB, PHPExcel writes "FALSE" to the file. Code like this:
$result = mysql_query($query);
$array = mysql_fetch_array($result);
$string = $array['value'];
$activeSheet->setCellValue("B1", $string); //output => "FALSE"
BUT if I type it directly in the source code I get no problems and it is written to the file. Code like:
$activeSheet->setCellValue("B1", "Corda Flessibile Antifiamma 1x16mm² NERO - € 1,21"); //output correct => "Corda Flessibile Antifiamma 1x16mm² NERO - € 1,21"
Has anyone ever encountered this problem?

PHPExcel expects strings to be UTF-8.
If you're pulling a non-UTF-8 character set value from the database, convert it to UTF-8 before writing it to PHPExcel.
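For example, a minimal sketch using the mysql_* API from the question (the source charset here is an assumption; check what your table actually uses):
// Option 1: ask MySQL to return UTF-8 directly (do this right after connecting)
mysql_set_charset('utf8');
// Option 2: convert the fetched value yourself before passing it to PHPExcel
// (source charset assumed to be Windows-1252, i.e. MySQL's latin1; adjust to your table's charset)
$result = mysql_query($query);
$array = mysql_fetch_array($result);
$string = mb_convert_encoding($array['value'], 'UTF-8', 'Windows-1252');
$activeSheet->setCellValue("B1", $string);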

Related

How to convert utf8 to text in dart

I want to convert a special character's UTF-8 (percent-encoded) value to its text.
For example, if the input is %20, the output will be a whitespace;
if the input is %23, the output will be #.
void main() {
  var raw = 'Hello%20Bebop%23yahoo';
  var parsed = Uri.decodeComponent(raw);
  print(parsed);
}
Result:
Hello Bebop#yahoo
It looks like you are converting to an ASCII-like (percent) encoding. What function are you using to get that result?
Try looking up %20 and %23 in a percent-encoding table, so you can see what you are dealing with at the moment.
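To make that concrete, a small sketch with the string from the question: %20 and %23 are just the percent-encoded forms of the space and # characters, which Uri.encodeComponent produces and Uri.decodeComponent reverses.
void main() {
  var text = 'Hello Bebop#yahoo';
  var encoded = Uri.encodeComponent(text);    // Hello%20Bebop%23yahoo
  var decoded = Uri.decodeComponent(encoded); // Hello Bebop#yahoo
  print(encoded);
  print(decoded);
}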

OCTAVE data import from PCE-VDL data logger device and conversion of decimal comma to decimal point

I have a measurement device PCE-VDL, which gives me measurements in the CSV format below, which I need to import into OCTAVE for further investigation.
In particular, I need to import the last 3 columns with the x, y, z acceleration data.
The file is in CSV format with a semicolon ";" as the delimiter.
I have tried:
A_1 = importdata ("file.csv", ";", 3);
but received
error: missing_idx(10): out of bound 9
The CSV file looks like this:
#PCE-VDL X - TableView series
#2020.16.11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020.28.10;16:16:32:0000;00:000;;;;0,0195;-0,0547;1,0039;
2020.28.10;16:16:32:0052;00:005;;;;0,0898;-0,0273;0,8789;
2020.28.10;16:16:32:0104;00:010;;;;0,0977;-0,0313;0,9336;
2020.28.10;16:16:32:0157;00:015;;;;0,1016;-0,0273;0,9297;
The numbers in the last 3 columns also use a decimal comma instead of a decimal point, so some conversion will probably be needed as well.
Thank you very much for any help.
Regards
EDIT: 18.11.2020
Thanks for the help. I have now tried the following:
A_1_str = fileread ("file.csv");
A_1_str_m = strrep (A_1_str, ".", "-");
A_1_str_m = strrep (A_1_str_m, ",", ".");
save "A_1_str_m.csv" A_1_str_m;
A_1 = importdata ("A_1_str_m.csv", ";", 8);
and still receive error: file_content(140): out of bound 139
There is probably some problem with the time format in the first columns, which I do not want to read anyway. I just need the last three columns.
After my conversion, the file looks like this:
# Created by Octave 5.1.0, Wed Nov 18 21:40:52 2020 CET <zdenek#ASUS-F5V>
# name: A_1_str_m
# type: sq_string
# elements: 1
# length: 7849
#PCE-VDL X - TableView series
#2020-16-11
#Date;Time;Duration [s];t [°C];RH [%];p [mbar];aX [g];aY [g];aZ [g];
2020-28-10;16:16:32:0000;00:000;;;;0.0195;-0.0547;1.0039;
2020-28-10;16:16:32:0052;00:005;;;;0.0898;-0.0273;0.8789;
2020-28-10;16:16:32:0104;00:010;;;;0.0977;-0.0313;0.9336;
Thanks for support!
You can first read the data with fileread, which stores the data as a string. Then you can manipulate the string like this:
new_string = strrep(string, ",", ".");
strrep replaces all occurrences of a pattern within a string. Afterwards you can save this data as a separate file, or overwrite the existing file with the manipulated data. Once that is done, you can proceed as you tried before.
EDIT: 19.11.2020
To avoid the additional header lines in the new file, you can save it like this:
fid = fopen("A_1_str_m.csv", "w");
fputs(fid, A_1_str_m);
fclose(fid);
fputs will just write the string to the file.
Then you can read the new file with dlmread:
A1_buf = dlmread("A_1_str_m.csv", ";");
A1_buf = real(A1_buf); # keep only the real part of any complex values (unlike abs, this preserves the sign)
A1_buf(1:3, :) = []; # remove the header lines
A1 = A1_buf(:, end-3:end-1); # keep only the 3 columns you're looking for
This will give you the three columns you're looking for, but the date and time data will be ignored.
EDIT 20.11.2020
Replaced abs with real, so the sign of the value will be kept.
Use csv2cell from the io package.
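A rough sketch of that route (assumptions: the io package is installed, csv2cell accepts the separator as its second argument -- check help csv2cell for your version -- and the three header lines arrive as the first three rows of the cell array):
pkg load io
C = csv2cell ("file.csv", ";");   # read everything into a cell array
acc = C(4:end, 7:9);              # aX, aY, aZ are columns 7-9; data starts after the 3 header rows
acc = strrep (acc, ",", ".");     # decimal comma -> decimal point
A = str2double (acc);             # numeric matrix with the three acceleration columns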

Unicode confusion #3423435

Once again I enter that goddamn unicode-hell ... sigh =(
There are two files:
$ file *
kreise_tmp.geojson: ASCII text
pandas_tmp.csv: UTF-8 Unicode text
I read the first file like this:
with open('kreise_tmp.geojson') as f:
    jdata = json.loads(f.read())
I read the second file like this:
pandas_data = pd.read_csv(r'pandas_tmp.csv', sep=";")
Now check out what's inside the strings:
>>> jdata['features'][0]['properties']['name']
u'Kreis Euskirchen' # a unicode string?
>>> pandas_data['kreis'][0]
'Kreis D\xc3\xbcren' # not a unicode string?
Why are the strings from the "UTF-8 Unicode text" file just normal strings and the strings from the "ASCII text" file unicode strings?
JSON strings are always Unicode.
~$ python2
>>> import json
>>> json.loads('"\xc3\xbc"')
u'\xfc'
But they are often serialized with \u escapes, so file will only see ASCII.
>>> json.dumps(_)
'"\\u00fc"'
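The pandas value, by contrast, is the raw UTF-8 byte string read from the CSV; decoding it gives you the same kind of unicode object (a quick Python 2 illustration):
>>> 'Kreis D\xc3\xbcren'.decode('utf-8')
u'Kreis D\xfcren'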
Add encoding='utf-8' when opening the files so they are decoded as UTF-8:
pandas_data = pd.read_csv(r'pandas_tmp.csv', sep=";", encoding='utf8')
You can do the same with the JSON file:
with open('kreise_tmp.geojson', encoding='utf8') as f:
    jdata = json.loads(f.read())
Also, in Python 2.7 you can add this to the top of the source file:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

I am unable to parse date info from a CSV file into IPython

I am running Python 3.5 and have imported pandas. My CSV file (payinfo.csv) looks like:
"01 DEC",1234.45,2344,11,1212.66
"01 NOV", 9898.33, 2343,12,1009.33
When I run the following:
dateparse = lambda x: pd.datetime.strptime(x,"%d %b")
pay_data = pd.read_csv('payinfo.csv', parse_dates=['Date'], date_parser=dateparse)
I always get:
ValueError: time data '“01 DEC”' does not match format '%d %b'
I am a new programmer to python, and would appreciate any help.
I think it was just the double quotes around the string that caused that error. Try stripping away any hardcoded (not 'python generated') single or double quote marks with .strip('"').
Example:
a = '"01 DEC"'
# Gives error
#a = pd.datetime.strptime(a,"%d %b")
# string without unneccessary quote marks
a = pd.datetime.strptime(a.strip('"'),"%d %b")
print(a)
Output:
1900-12-01 00:00:00
You haven't included the headers in the question. But this works:
import io
import pandas as pd
a = io.StringIO(u""""01 DEC",1234.45,2344,11,1212.66
"01 NOV", 9898.33, 2343,12,1009.33""")
dateparse = lambda x: pd.datetime.strptime(x,"%d %b")
df = pd.read_csv(a,header=None, parse_dates=[0], date_parser=dateparse)
print(df)
Output:
            0        1     2   3        4
0  1900-12-01  1234.45  2344  11  1212.66
1  1900-11-01  9898.33  2343  12  1009.33
You can append a custom year to x before converting it to datetime:
.strptime(year + x, "%Y%d %b")
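For instance, a small sketch building on the snippet above (the year string '2016' is just an assumed placeholder; use whatever year the statement actually covers):
year = '2016'
dateparse = lambda x: pd.datetime.strptime(year + x, "%Y%d %b")
df = pd.read_csv(a, header=None, parse_dates=[0], date_parser=dateparse)
# the first column now parses as 2016-12-01 / 2016-11-01 instead of defaulting to 1900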
Thank you both for your input. From your answers I modified the csv file to remove the quotes around the date entry, then things worked fine! I am puzzled because I have used the read_csv method before on similar data that looked like this:
"12/31/2016","The UPS Store","THE UPS STORE 031","10.74","debit","Business Services","Interest Checking","",""
"12/31/2016","Hospice of The East Bay","HOSPICE OF THE EAST","14.00","debit","Clara","Interest Checking","",""
and had no problems – in fact I didn't need to parse the data at all and the reader was able to correctly identify the date. Huh! I guess the real issue was that the date was stored in an unconventional format. In any case, I have the answer and thank you both for your answers.

Decode a string with both Unicode and Utf-8 codes in Python 2.x

Say we have a string:
s = '\xe5\xaf\x92\xe5\x81\x87\\u2014\\u2014\xe5\x8e\xa6\xe9\x97\xa8'
Somehow the two symbols '—', whose Unicode code point is \u2014, were not correctly encoded as '\xe2\x80\x94' in UTF-8. Is there an easy way to decode this string? It should decode to 寒假——厦门.
Manually using the replace function is OK:
t = u'\u2014'
s = s.replace('\u2014', t.encode('utf-8'))
print s
However, it is not automatic. If we extract the Unicode escape,
index = s.find('\u')
t = s[index : index+6]
then t = '\\u2014'. How can I convert it to its UTF-8 encoding?
You're missing the extra backslashes in your replace().
It should be:
s.replace("\\u2014", u'\u2014'.encode("utf-8") )
Check my warning in the comments of the question. You should not end up in this situation.
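If the string cannot be fixed at the source and you need to repair it automatically, a regular expression can expand every literal \uXXXX escape. This is only a sketch for Python 2, assuming all escaped code points are in the Basic Multilingual Plane:
import re

def expand_escapes(text):
    # replace each literal \uXXXX sequence with the UTF-8 encoding of that code point
    return re.sub(r'\\u([0-9a-fA-F]{4})',
                  lambda m: unichr(int(m.group(1), 16)).encode('utf-8'),
                  text)

s = '\xe5\xaf\x92\xe5\x81\x87\\u2014\\u2014\xe5\x8e\xa6\xe9\x97\xa8'
print expand_escapes(s)  # 寒假——厦门 (a UTF-8 byte string; call .decode('utf-8') if you need a unicode object)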