This is a continuation of this thread:
Error while importing big file in osm2pgsql
Same server, same code, same me :-) Only the .osm file I used is different.
This time I downloaded an .osm file from https://www.openstreetmap.org covering just the town of Tuam (Ireland). The import worked, and I don't know why: maybe because this file is XML rather than binary, or maybe because it is much smaller? So my first question is why it worked at all. My second question, the one in the title, is why the generated tiles look so weird: the streets look fragmented. Here is an example tile:
Tuam 1
Please help
I have an HTML file containing records shown in a table, and some of them somehow got encoded incorrectly. A large part of the file is correct and displays as expected. The HTML markup itself (all the elements etc.) is fine, but some of the values within the table cells are garbled.
For example one cell contains:
<cell>»¿è²å¼æäºæ 线æ¥å¥ç½ç»ä¸çæ³¢ææå½¢ææ¯ç 究</cell>
While it should contain:
<cell>绿色异构云无线接入网络中的波束成形技术研究</cell>
I already tried figuring out what exactly went wrong, but I can't seem to find the correct solution to completely resolve this problem for the whole file. I tried tools such as FTFY, which didn't give me any meaningful result.
These websites gave me some direction and it seems that something went wrong between Windows-1252/1251 and UTF-8. The first website seems to fix the problem but still returns some unknown characters (UTF-8 displayed as Windows-1252).
Does anyone have an idea how to fix this for the whole file? Or give me any tips to further figure it out on my own.
Thanks in advance.
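For anyone hitting the same symptom, here is a minimal sketch of the usual first thing to try, assuming the damage really is "UTF-8 bytes decoded with a single-byte code page" as suspected above. The function name `repair_mojibake` is my own, and the default of Windows-1252 is an assumption taken from the question. Note that any bytes that were dropped or replaced along the way (as seems to have happened in the cell shown above) cannot be recovered this way; they come out as replacement characters.

```python
def repair_mojibake(text, wrong_encoding="cp1252"):
    """Reverse 'UTF-8 bytes decoded as wrong_encoding'.

    Characters that cannot be mapped back to a byte become '?',
    and byte sequences that are not valid UTF-8 become U+FFFD.
    """
    raw = text.encode(wrong_encoding, errors="replace")
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    original = "绿色异构云无线接入网络中的波束成形技术研究"
    # Simulate the damage with latin-1, which maps every byte to a character.
    damaged = original.encode("utf-8").decode("latin-1")
    print(repair_mojibake(damaged, wrong_encoding="latin-1") == original)
```

Applying this per table cell rather than to the whole file avoids touching the parts that are already correct.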
(I'm new to Tesseract, so I may have misunderstood a lot of things.)
I followed this article to train Tesseract for a specific font.
Everything worked as expected, so I now have a new file, eve.traineddata, in /usr/share/tesseract-ocr/tessdata/ (it is the only file I copied, because the article didn't ask for more).
But now, when I run:
/usr/local/bin/tesseract -l eve image.png textfile
I got:
mgr->GetComponent(TESSDATA_INTTEMP, &fp):Error:Assert failed:in file adaptmatch.cpp, line 537
Segmentation fault (core dumped)
This only happens with -l eve (obviously).
I didn't find any explanation on the internet (even though it seems to be a common issue).
I would like to at least understand what is going wrong and, if possible, learn how to fix it.
Did I do something wrong when building eve.traineddata, or could it be something else?
This question is not the same as this one. We have the same error, but I don't want to bypass it, and I didn't overwrite my eng.traineddata file.
I can link traineddata file if needed, but I'm not sure it's helpful.
I was receiving this error because my .box and .tif files didn't have matching names. After making sure I have pairs of lang.fontName.countNumber.tif and lang.fontName.countNumber.box it started to work. Hope this helps
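A quick way to check for this is a small script that lists .tif files without a matching .box file and vice versa. This is only a sketch: the function name `unmatched_training_files` is my own, and it assumes all training files sit in one folder and differ only by extension.

```python
from pathlib import Path


def unmatched_training_files(folder):
    """Return (tifs_without_box, boxes_without_tif) as sorted lists of base names."""
    folder = Path(folder)
    # Path.stem drops only the last suffix, so 'eve.font.exp0.tif' -> 'eve.font.exp0'.
    tifs = {p.stem for p in folder.glob("*.tif")}
    boxes = {p.stem for p in folder.glob("*.box")}
    return sorted(tifs - boxes), sorted(boxes - tifs)


if __name__ == "__main__":
    missing_box, missing_tif = unmatched_training_files(".")
    print("tif without box:", missing_box)
    print("box without tif:", missing_tif)
```

Both lists should be empty before you build the traineddata file.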
We generate xlsx files using a Perl script. The files usually contain thousands of records, which makes spotting errors very difficult.
This process has been working for years without problems.
This week we were asked to check a file that contains errors. On opening it, Excel reported that the file contains errors and asked whether we wanted to repair them.
We do not actually want to recover the data; we want to know which part of the file is corrupt. The error should be coming from corrupt data, and we want to identify that data.
The log message shows the following:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recoveryLog xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<logFileName>error068200_01.xml</logFileName>
<summary>Errors were detected in file 'D:\Temp\20161020\file_name.xlsx'</summary>
<repairedRecords summary="Following is a list of repairs:"><repairedRecord>Repaired Records: Cell information from /xl/worksheets/sheet1.xml part</repairedRecord>
</repairedRecords>
</recoveryLog>
The error should come from corrupt data. Is there any tool/method which helps to spot this corrupt data?
I tried renaming it to a .zip file, extracting it and opening the contents in an XML editor, but I was not able to find any errors in the XML files.
We also checked that the different XML file structures are fine.
Thank you and best regards
As expected, the problem came from text cells containing numbers with an E in the middle. I used the following steps to identify the erroneous cells.
1. I wrote a small Java class to read the file. The class checked the cell type and then displayed the value. The Java program threw an exception at one line, "Cannot get a numeric value from a text cell", even though I was correctly checking the cell type before displaying the content.
2. I checked the opened Excel file at that line and found that the cell contained only 'inf'.
3. I opened the file in OpenOffice and looked at the same cells. They contained 0.
4. I debugged the program generating the data and found that these cells contain data like '914E5514'. It seems the E was interpreted by Excel as an exponent. We changed the program to use the format '#' for that cell, and this solved the issue.
Thank you.
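For others looking for a way to locate such cells without opening Excel, here is a rough sketch of the same idea in Python: read the sheet part named in the recovery log and flag every cell stored as a number whose value is not a finite number (for example 'inf', or something like '914E5514' that overflows). The function name `find_suspect_numeric_cells` is my own, and this is an assumption about how the bad cells were written, not a general xlsx validator.

```python
import math
import zipfile
import xml.etree.ElementTree as ET

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def find_suspect_numeric_cells(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Return (cell_ref, raw_value) for numeric cells that are not finite numbers."""
    with zipfile.ZipFile(xlsx) as zf:
        root = ET.fromstring(zf.read(sheet_part))
    suspects = []
    for cell in root.iter(NS + "c"):
        # A missing t attribute (or t="n") means the cell is stored as a number.
        if cell.get("t", "n") != "n":
            continue
        value = cell.find(NS + "v")
        if value is None or value.text is None:
            continue
        try:
            ok = math.isfinite(float(value.text))
        except ValueError:
            ok = False
        if not ok:
            suspects.append((cell.get("r"), value.text))
    return suspects


if __name__ == "__main__":
    import sys
    for ref, raw in find_suspect_numeric_cells(sys.argv[1]):
        print(ref, raw)
```

The cell references it prints (A1, B7, ...) can then be matched against the source data.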
Thank you very much, you helped me a lot by saying that 1 particular content item may be the root problem.
My corrupted content was https://www.example.com XYZ ... ASDAS
Solution: www.example.com XYZ ... ASDAS
This is something Excel cannot handle. It would be nice to have a list of things that do not work.
I have a fairly big .kml file (it is the property of my company, and I can legally use it), which I want to import into an .osm file downloaded from download.geofabrik.de. I can open this .osm map in software such as Marble for completely offline usage.
My question is: can I somehow merge the kml and osm files, so that when I open the merged file in Marble, the routes described by the kml are also visible? So basically, I want to merge a.kml with b.osm, resulting in c.osm, which I can use offline.
Is it possible? If yes, can you direct me in the right direction?
Any help is appreciated, thanks!
Some notes:
I have tried GPSBabel, which indicates that it can convert .kml to .osm. It generates a 65MB .osm file from my 12MB .kml, but when I open it in Marble, it does not show any routes, so it looks like a dead end. :/
The weird thing is that GPSBabel produced an output that QGIS could open. I merged the two .osm files with osmosis, but the output is invalid and nothing can open it.
JOSM does not open the original .osm file, which is 1GB in size.
ps: I have posted this on help.openstreetmap.org as well, but for now, nobody could help me, so I am trying to get some answer here, maybe... Sorry for the "repost", and thanks for the help! :)
What you want to do is honestly the wrong way to go about it, but still possible.
The first step is to use ogr2osm with the command-line flags --positive-id, --add-version, --add-timestamp and with --id 3000000000 (or some other number larger than the largest node ID in the file).
You will then have a .osm file that Osmosis or Osmconvert can merge with another file, in this case your downloaded extract. In the case of osmosis: osmosis --read-xml internal.osm --sort --read-xml extract.osm --sort --merge --write-xml combined.osm (untested)
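If you would rather not guess the starting value for --id, a small script can find the largest IDs actually used in the extract. This is a sketch only: `max_osm_ids` is a name I made up, and it assumes the extract is plain OSM XML (not .pbf or compressed).

```python
import xml.etree.ElementTree as ET


def max_osm_ids(osm_xml):
    """Stream an OSM XML file and return the largest id seen per element type."""
    largest = {"node": 0, "way": 0, "relation": 0}
    # iterparse avoids loading the whole file; clear finished elements to save memory.
    for _event, elem in ET.iterparse(osm_xml, events=("end",)):
        if elem.tag in largest:
            largest[elem.tag] = max(largest[elem.tag], int(elem.get("id")))
            elem.clear()
    return largest


if __name__ == "__main__":
    import sys
    print(max_osm_ids(sys.argv[1]))
```

Any --id value above the largest number reported should keep the converted data from colliding with existing IDs.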
A more common way would be to download the shapefiles for the region from geofabrik then use ogr2ogr and similar tools to combine them with the .kml file in the output format of your choice.
Keep in mind that if you distribute this "derivative database" you have created, it has to be licensed under the ODbL. This does not apply if you're distributing it internally only.
I sincerely apologize if this isn't the proper forum to discuss this, but I wasn't sure where to go or what would be the best option.
Basically, I'm trying to find a database-friendly list of Veterans Affairs hospitals. The closest thing that I've been able to find is www.va.gov/ofcadmin/docs/CATB.pdf as it has all the information I'm looking for:
Region
Address
City in a separate column
Zip Code in a separate column
State
Facility # (also known as StationID)
VISN
Symbol
I've tried exporting that PDF out into CSV but it's a complete nightmare to get working. So, I was curious if anyone had any ideas or insights into how I could accomplish this task.
First, here's a CSV file containing the data found in CATB.pdf. The very first line contains the column headers, and the rest of the file contains the contents.
http://tmp.alexloney.com/CATB.csv
Now, for the more detailed explanation: I took the PDF you linked to, converted it to an HTML document using Adobe Acrobat, then used a lot of regular expressions to parse the file and clean it up. Once the file was cleaned up enough, I was able to write a program to parse through the remainder of the file, grab the state and region, and spit it all out as a nicely formatted CSV.
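To give an idea of the kind of regular expression involved, here is a small illustrative sketch (not the actual program I used). It assumes a hypothetical line layout of "City, ST 12345" and splits it into the separate city, state and zip columns asked for; the function name and the sample line are made up, and the real PDF text will need its own patterns.

```python
import csv
import io
import re

# Hypothetical layout: "City, ST 12345" or "City, ST 12345-6789".
CITY_STATE_ZIP = re.compile(
    r"^(?P<city>.+?),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)$"
)


def split_city_state_zip(line):
    """Return (city, state, zip) for a matching line, or None if it does not match."""
    match = CITY_STATE_ZIP.match(line.strip())
    if match is None:
        return None
    return match.group("city"), match.group("state"), match.group("zip")


if __name__ == "__main__":
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["City", "State", "Zip"])
    parsed = split_city_state_zip("Springfield, IL 62701")  # made-up sample line
    if parsed:
        writer.writerow(parsed)
    print(out.getvalue())
```

Lines that return None are the ones worth inspecting by hand, since they usually point at a layout the pattern does not cover yet.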
Hope that helps you!
I believe that PDFill has an option that will convert a PDF file to Excel. Once in Excel, you should have no problem converting it to a CSV file.