PDF Table of Contents Parsing with iOS Quartz 2D - iphone

This question has been asked before, I know. However, nobody has answered it well. I'm wondering how to parse a PDF's "table of contents" on the iPhone. The docs tell me to use CGPDFDocumentGetCatalog but not how to use it. All they say is that it returns a dictionary. Also, I can't find any example code. Any suggestions?

The closest thing I have seen on SO is "Create a table of contents from a pdf file".

It's basically just parsing the CGPDFDictionary called "Outlines" in the document catalog (the dictionary returned by CGPDFDocumentGetCatalog).
// get outline & loop through dictionary...
CGPDFDictionaryRef outlineRef;
if(CGPDFDictionaryGetDictionary(pdfDocDictionary, "Outlines", &outlineRef)) {
}
then you start with the First element and parse your way through.
CGPDFDictionaryGetDictionary(outlineRef, "First", &firstEntry)
You want to get the Title and the Destination.
NSString *outlineTitle = PSPDFStringFromPDFDict(outlineElementRef, @"Title");
CGPDFDictionaryGetObject(outlineElementRef, "Dest", &destinationRef);
The tricky part is getting the correct destination, because there are (hooray, PDF!) several ways to store it, plus several ways that are not defined in the PDF Reference but still occur in the wild, plus several variants that are simply broken and that you have to deal with anyway.
For example, you could get the Count of the outline dictionary using
CGPDFInteger elements;
if (CGPDFDictionaryGetInteger(outlineRef, "Count", &elements)) {
    PSPDFLog(@"parsing outline: %ld elements. (Count will be ignored anyway)", (long int)elements);
} else {
    PSPDFLogError(@"Error while parsing outline. No outlineRef?");
}
But note that Count is sometimes invalid because of broken PDF creation tools. Think of PDF like HTML: even if a file is broken, parsers will do their best to display as much data as they can. So my advice is to ignore Count and parse the dictionary anyway. (A few weeks ago I encountered a document that had Count = -10. Go figure.)
I can't post the full code, as it's from my commercial PDF library PSPDFKit, and I need to make a living out of it ;) But this should get you started.
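The traversal itself does not depend on the Quartz calls. As a language-neutral sketch (written in Python, with plain dictionaries standing in for CGPDFDictionary objects; the sample data and the function name walk_outline are made up for illustration), the outline is a linked tree: each item points to its first child via First and to its next sibling via Next. The sketch ignores Count, as advised above, and guards against cyclic links, which a broken file could contain:

```python
def walk_outline(item, depth=0, seen=None):
    """Yield (depth, title, dest) for an outline item and everything after it.

    `item` is a plain dict standing in for a PDF outline dictionary with the
    standard keys Title, Dest, First (first child) and Next (next sibling).
    """
    if seen is None:
        seen = set()
    while item is not None and id(item) not in seen:
        seen.add(id(item))  # guard against cyclic First/Next chains
        yield depth, item.get("Title"), item.get("Dest")
        child = item.get("First")
        if child is not None:
            yield from walk_outline(child, depth + 1, seen)
        item = item.get("Next")


# Hypothetical sample data: two chapters, the first with one section.
section = {"Title": "1.1 Setup", "Dest": 3}
chapter2 = {"Title": "2 Usage", "Dest": 7}
chapter1 = {"Title": "1 Intro", "Dest": 1, "First": section, "Next": chapter2}
outlines = {"First": chapter1, "Count": -10}  # Count is deliberately bogus

entries = list(walk_outline(outlines["First"]))
for depth, title, dest in entries:
    print("  " * depth + f"{title} -> {dest}")
```

In the real code, each dictionary lookup would be a CGPDFDictionaryGetDictionary or CGPDFDictionaryGetObject call, and resolving Dest is where most of the work goes.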


Parsing XML and retrieving attributes from (nested?) elements

I am trying to get specific data from an XML file, namely X, Y coordinates that appear, to my beginner's eyes, to be attributes of an element called "Point" in my file. I cannot get to that data with anything other than a sledgehammer approach and would gratefully accept some help.
I have used the following successfully:
for Shooter in root.iter('Shooter'):
    print(Shooter.attrib)
But if I try the same with "Point" (or "Points") there is no output. I cannot even see "Point" when I use the following:
for child in root:
    print(child.tag, child.attrib)
So: the sledgehammer
print([elem.attrib for elem in root.iter()])
Which gives me the attributes for every element. This file is a single collection of data and could contain hundreds of data points and so I would rather try to be a little more subtle and home in on exactly what I need.
My XML file
https://pastebin.com/abQT3t9k
UPDATE: Thanks for the answers so far. I tried the solution posted and ended up with 7000 lines of output, which wasn't quite what I was after. I should have explained in more detail. I also tried (as suggested):
def find_rec(node, element, result):
    for item in node.findall(element):
        result.append(item)
        find_rec(item, element, result)
    return result

print(find_rec(ET.parse(filepath_1), 'Shooter', [])) #Returns <Element 'Shooter' at 0x125b0f958>
print(find_rec(ET.parse(filepath_1), 'Point', [])) #Returns None
I admit I have never worked with XML files before, and I am new to Python (but enjoying it). I wanted to get the solution myself but I have spent days getting nowhere.
I perhaps should have just asked from the beginning how to extract the XY data for each ShotNbr (in this file there is just one) but I didn't want code written for me.
I've managed to get the XY from this file but my code will never work if there is more than one shot, or if I want to specifically look at, say, shot number 20.
How can I find shot number 2 (ShotNbr="2") and extract only its XY data points?
Assuming that you are using xml.etree.ElementTree, you are only looking at the direct children of root.
You need to recurse into the tree to access elements lower in the hierarchical tree.
This seems to be the same problem as ElementTree - findall to recursively select all child elements
which has an excellent answer that I am not going to plagiarize.
Just apply it.
Alternatively,
import xml.etree.ElementTree as ET
root = ET.parse("file.xml")
print(root.findall('.//Point'))
Should work.
See: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
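For the updated question (only the points of shot number 2), ElementTree's XPath subset also supports attribute predicates. A minimal sketch, assuming the Point elements sit somewhere below an element that carries the ShotNbr attribute; the element name Shot, the attribute names X and Y, and the sample XML are assumptions, since the real file is only linked:

```python
import xml.etree.ElementTree as ET

# Made-up sample; the real file's element names may differ.
SAMPLE = """
<Session>
  <Shooter Name="A">
    <Shot ShotNbr="1"><Points><Point X="1.0" Y="2.0"/></Points></Shot>
    <Shot ShotNbr="2"><Points>
      <Point X="3.5" Y="4.5"/><Point X="5.0" Y="6.0"/>
    </Points></Shot>
  </Shooter>
</Session>
"""

root = ET.fromstring(SAMPLE)

# [@ShotNbr='2'] keeps only elements whose attribute matches;
# the trailing //Point then searches all of their descendants.
points = root.findall(".//Shot[@ShotNbr='2']//Point")
xy = [(float(p.get("X")), float(p.get("Y"))) for p in points]
print(xy)
```

If ShotNbr turns out to live on a differently named element, replacing Shot with * in the path keeps the predicate and matches any tag.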

Google Sheets - Retrieve "A:File1" to "A:File2" where "Sheetname:File1" = "B:File2" if "C:File2" is between "E" and "F" in "File1"

Sorry for the somewhat long title, but I was told to be as specific as possible. :D
My problem will require some explanation.
So, I have 2 spreadsheets files ("Konverteringstabeller" and "Tee Posen").
In "Tee Posen" I have a sheet named "Scores MIK" (golf scorecard and my name).
In "Konverteringstabeller" I have sheets with conversion tables for multiple golf courses, but if one works, all should.
What I need is to find out what course handicap I would get if my golf handicap is "HCP 26,0" (as shown in File 2 Picture), and in this case that result should be 29 (not visible), but you should get the point.
(example: golf hcp 10 would result in course hcp 11, because 10 is between 9,9-10,7)
While I have been able to find the right result, it has only been in the "Konverteringstabeller" spreadsheet file and that is not the place I need it.
I want to have it written in E6 in the "Scores MIK" sheet in File 2.
I should mention that in "Scores MIK : File 2", cell C2 (Ikast Golf Klub) has data validation so I can easily change between the different courses in the "Konverteringstabeller" file once I add more.
What I have been messing with is something with vlookup and importrange with concatenate in it, but I can't figure out how to do it, so I ask for your help.
And I am by no means skilled in the art of Spreadsheets, so I would very much appreciate a detailed explanation.
Picture - Scores MIK (File 2)
Picture - Ikast Golf Klub (File 1)
Thanks in advance!
// Mikkel Christensen
OK, so a couple of notes. First, to reference the static cell where you keep the sheet name while still allowing its value to change, you should add '$' around it. Also, if the rows for B8-E70 will always be in the same position on the various sheets, you need to add $ around those as well.
Here is an example of the whole formula:
=IFERROR(ARRAYFORMULA(VLOOKUP(E5:E25;IMPORTRANGE("spreadsheet key";"'"&C2&"'!$B$8:$E$70");4;TRUE)))
And lastly, using the "&" operator to concatenate is better, at least in my opinion, because CONCATENATE sometimes does not work as well with array formulas. Plus I personally find it quicker and easier than having to wrap yet another function around my stuff.
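To see what the final TRUE argument of VLOOKUP does, here is a sketch of the same approximate-match lookup in Python: with a table sorted by its first column, it returns the row whose key is the largest value less than or equal to the search value. The table values are invented for illustration (only the row starting at 9.9 and mapping to 11 comes from the question), and vlookup_approx is a hypothetical helper name:

```python
from bisect import bisect_right

# (lower bound of handicap range, course handicap); must be sorted by lower bound.
TABLE = [(9.0, 10), (9.9, 11), (10.8, 12)]

def vlookup_approx(value, table):
    """Return the result of the last row whose key is <= value, or None."""
    keys = [key for key, _ in table]
    index = bisect_right(keys, value) - 1
    return table[index][1] if index >= 0 else None

print(vlookup_approx(10.0, TABLE))
```

This is also why the lookup range has to start with the column holding the lower bounds and be sorted in ascending order; otherwise the approximate match can land on the wrong row.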

importing website to google sheets

I have tried searching everywhere online for a good answer but cannot seem to find anything that matches specifically what I am looking for.
When I use the IMPORTHTML function in Google Sheets, I end up with data that looks like:
${player.name} (${player.position}, ${team.abbrev}) ${opponent.abbrev} #${opponent_rank} ${minutes} ${pts} ${fgm}-${fga} ${ftm}-${fta} ${p3m}-${p3a} ${treb} ${ast} ${stl} ${blk} ${tov} ${pf} ${fp} $${salary} ${ratio}
The code that I am using looks like this:
=IMPORTHTML("", "table",2)
When I use the same as above (=IMPORTHTML("", "table",2)) only with "0" as my index, it pulls this:
Opp Stats
Player Team Rank Min Pts FGM/A FTM/A 3PM/A Reb Ast Stl Blk Tov Foul FP Cost Value
Basically, I am attempting to pull the table data from this website:
https://www.numberfire.com/nba/fantasy/fantasy-basketball-projections
(because of my rep I cannot post more than two links; however, my IMPORTHTML function has the above link as input in both cases)
into a Google Sheet. Please help. Any feedback is much appreciated... thanks!
Best advice is to find another Web table you can import. If you do "view source" on the page, you will find that the table content is dynamically populated from a variable named NF_DATA.
You need to create a document script to extract the data you want:
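function this_is_test() {
    // Fetch the raw page source; the table itself is filled in client-side from NF_DATA.
    var response = UrlFetchApp.fetch("https://www.numberfire.com/nba/fantasy/fantasy-basketball-projections");
    var raw_content = response.getContentText();
    // Match everything from "daily_projections":[ up to (not including) the first ].
    var re = new RegExp('"daily_projections":\\[[^\\]]+', 'i');
    var proj = raw_content.match(re);
    Logger.log(proj);
}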
It will extract all text in-between "daily_projections":[ and ], which is (as of today):
"daily_projections":[{"nba_player_id":"77","nba_game_id":"20015","date":"2016-01-19","nba_team_id":"21","opponent_id":"7","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":36.3,"fgm":"8.8","fga":"17.1","p3m":"1.9","p3a":"4.8","ftm":"6.2","fta":"6.9","oreb":"0.8","dreb":"7.2","ast":"4.7","stl":"1.1","blk":"1.2","tov":"2.7","pf":"1.8","pts":"25.3","ts":"0.628","efg":"0.655","oreb_pct":"2.6","dreb_pct":"21.4","treb_pct":"12.4","ast_pct":"23.4","stl_pct":"1.5","blk_pct":"2.4","tov_pct":"12.1","usg":"27.8","ortg":"122.2","drtg":"101.8","nerd":"22.34","star_street_fp":43.08,"star_street_salary":0,"star_street_ratio":0,"draft_street_daily_fp":39.75,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":43.85,"fanduel_salary":9900,"fanduel_ratio":4.43,"draft_kings_fp":46.55,"draft_kings_salary":9900,"draft_kings_ratio":4.7,"fantasy_feud_fp":39.75,"fantasy_feud_salary":153600,"fantasy_feud_ratio":0.26,"fanthrowdown_fp":45.2,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":44.25,"fantasy_aces_salary":7250,"fantasy_aces_ratio":6.1,"draftday_fp":45.25,"draftday_salary":18800,"draftday_ratio":2.41,"fantasy_score_fp":45.6,"fantasy_score_salary":9600,"fantasy_score_ratio":4.75,"draftster_fp":43.75,"draftster_salary":9400,"draftster_ratio":4.65,"yahoo_fp":44.8,"yahoo_salary":52,"yahoo_ratio":0.86,"treb":8},{"nba_player_id":"397","nba_game_id":"20015","date":"2016-01-19","nba_team_id":"21","opponent_id":"7","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":35,"fgm":"8.6","fga":"18.0","p3m":"1.3","p3a":"4.1","ftm":"5.9","fta":"7.2","oreb":"1.3","dreb":"5.0","ast":"8.8","stl":"2.0","blk":"0.4","tov":"3.6","pf":"2.2","pts":"24.4","ts":"0.576","efg":"0.592","oreb_pct":"4.6","dreb_pct":"15.3","treb_pct":"10.2","ast_pct":"44.4","stl_pct":"3.0","blk_pct":"0.8","tov_pct":"14.6","usg":"31.3","ortg":"117.8","drtg":"101.2","nerd":"19.75","star_street_fp":44.48,"star_street_salary":0,"star_street_ratio":0,"dr
aft_street_daily_fp":41.33,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":46.36,"fanduel_salary":10500,"fanduel_ratio":4.42,"draft_kings_fp":49.13,"draft_kings_salary":10700,"draft_kings_ratio":4.59,"fantasy_feud_fp":41.33,"fantasy_feud_salary":169800,"fantasy_feud_ratio":0.24,"fanthrowdown_fp":47.33,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":46.68,"fantasy_aces_salary":7800,"fantasy_aces_ratio":5.98,"draftday_fp":47.1,"draftday_salary":20500,"draftday_ratio":2.3,"fantasy_score_fp":48.48,"fantasy_score_salary":9900,"fantasy_score_ratio":4.9,"draftster_fp":45.38,"draftster_salary":9500,"draftster_ratio":4.78,"yahoo_fp":47.01,"yahoo_salary":59,"yahoo_ratio":0.8,"treb":6.3},{"nba_player_id":"279","nba_game_id":"20016","date":"2016-01-19","nba_team_id":"11","opponent_id":"24","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":36.7,"fgm":"7.6","fga":"18.1","p3m":"2.5","p3a":"6.9","ftm":"5.5","fta":"6.5","oreb":"1.1","dreb":"5.8","ast":"5.3","stl":"1.8","blk":"0.4","tov":"3.6","pf":"2.4","pts":"22.5","ts":"0.537","efg":"0.610","oreb_pct":"3.3","dreb_pct":"17.6","treb_pct":"10.5","ast_pct":"25.1","stl_pct":"2.5","blk_pct":"0.9","tov_pct":"15.3","usg":"29.1","ortg":"104.1","drtg":"99.2","nerd":"5.26","star_street_fp":38.55,"star_street_salary":0,"star_street_ratio":0,"draft_street_daily_fp":34.13,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":39.53,"fanduel_salary":8700,"fanduel_ratio":4.54,"draft_kings_fp":42.93,"draft_kings_salary":9200,"draft_kings_ratio":4.67,"fantasy_feud_fp":34.13,"fantasy_feud_salary":138800,"fantasy_feud_ratio":0.25,"fanthrowdown_fp":41.13,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":39.88,"fantasy_aces_salary":6500,"fantasy_aces_ratio":6.14,"draftday_fp":41.3,"draftday_salary":16600,"draftday_ratio":2.49,"fantasy_score_fp":41.68,"fantasy_score_salary":8400,"fantasy_score_ratio":4.96,"draftster_fp":39.45,"draftster_salary":8000,"draf
tster_ratio":4.93,"yahoo_fp":40.78,"yahoo_salary":47,"yahoo_ratio":0.87,"treb":6.9},{"nba_player_id":"2137","nba_game_id":"20014","date":"2016-01-19","nba_team_id":"38","opponent_id":"17","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":35,"fgm":"8.0","fga":"16.6","p3m":"0.4","p3a":"1.3","ftm":"4.5","fta":"6.0","oreb":"2.6","dreb":"7.8","ast":"2.2","stl":"1.0","blk":"2.2","tov":"1.9","pf":"2.6","pts":"20.8","ts":"0.541","efg":"0.521","oreb_pct":"8.6","dreb_pct":"24.8","treb_pct":"16.6","ast_pct":"11.5","stl_pct":"1.4","blk_pct":"4.9","tov_pct":"9.4","usg":"27.0","ortg":"107.9","drtg":"103.1","nerd":"5.60","star_street_fp":41.05,"star_street_salary":0,"star_street_ratio":0,"draft_street_daily_fp":36.55,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":41.08,"fanduel_salary":10300,"fanduel_ratio":3.99,"draft_kings_fp":44.25,"draft_kings_salary":10000,"draft_kings_ratio":4.43,"fantasy_feud_fp":36.55,"fantasy_feud_salary":149400,"fantasy_feud_ratio":0.24,"fanthrowdown_fp":41.8,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":41.6,"fantasy_aces_salary":7400,"fantasy_aces_ratio":5.62,"draftday_fp":40.43,"draftday_salary":18200,"draftday_ratio":2.22,"fantasy_score_fp":42.55,"fantasy_score_salary":9700,"fantasy_score_ratio":4.39,"draftster_fp":41.53,"draftster_salary":9200,"draftster_ratio":4.51,"yahoo_fp":41.28,"yahoo_salary":54,"yahoo_ratio":0.76,"treb":10.4},{"nba_player_id":"362","nba_game_id":"20013","date":"2016-01-19","nba_team_id":"15","opponent_id":"16","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":34.9,"fgm":"7.3","fga":"15.4","p3m":"1.4","p3a":"3.7","ftm":"4.8","fta":"6.0","oreb":"1.7","dreb":"6.1","ast":"2.3","stl":"0.7","blk":"1.0","tov":"1.7","pf":"2.0","pts":"20.6","ts":"0.571","efg":"0.594","oreb_pct":"6.2","dreb_pct":"19.9","treb_pct":"13.3","ast_pct":"12.1","stl_pct":"1.1","blk_pct":"2.2","tov_pct":"8.5","usg":"26.5","ortg":"115.3","drtg":"104.5","nerd":"11.63"
,"star_street_fp":34.93,"star_street_salary":0,"star_street_ratio":0,"draft_street_daily_fp":30.85,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":35.11,"fanduel_salary":7800,"fanduel_ratio":4.5,"draft_kings_fp":37.05,"draft_kings_salary":7600,"draft_kings_ratio":4.88,"fantasy_feud_fp":30.85,"fantasy_feud_salary":120400,"fantasy_feud_ratio":0.26,"fanthrowdown_fp":36.2,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":35.5,"fantasy_aces_salary":5900,"fantasy_aces_ratio":6.02,"draftday_fp":35.43,"draftday_salary":13950,"draftday_ratio":2.54,"fantasy_score_fp":36.35,"fantasy_score_salary":7000,"fantasy_score_ratio":5.19,"draftster_fp":35.35,"draftster_salary":7200,"draftster_ratio":4.91,"yahoo_fp":35.81,"yahoo_salary":40,"yahoo_ratio":0.9,"treb":7.8},{"nba_player_id":"2249","nba_game_id":"20014","date":"2016-01-19","nba_team_id":"17","opponent_id":"38","season":"2016","game_play_probability":"1.00","game_start":"1.00","minutes":35.7,"fgm":"7.2","fga":"16.6","p3m":"0.9","p3a":"3.2","ftm":"4.6","fta":"6.3","oreb":"1.3","dreb":"2.9","ast":"2.1","stl":"0.9","blk":"0.5","tov":"2.2","pf":"2.4","pts":"20.2","ts":"0.521","efg":"0.530","oreb_pct":"4.3","dreb_pct":"9.6","treb_pct":"6.9","ast_pct":"10.2","stl_pct":"1.3","blk_pct":"1.0","tov_pct":"10.3","usg":"26.7","ortg":"101.6","drtg":"111.5","nerd":"-7.07","star_street_fp":28.68,"star_street_salary":0,"star_street_ratio":0,"draft_street_daily_fp":23.65,"draft_street_daily_salary":0,"draft_street_daily_ratio":0,"fanduel_fp":28.99,"fanduel_salary":6700,"fanduel_ratio":4.33,"draft_kings_fp":30.75,"draft_kings_salary":6900,"draft_kings_ratio":4.46,"fantasy_feud_fp":23.65,"fantasy_feud_salary":113500,"fantasy_feud_ratio":0.21,"fanthrowdown_fp":29.65,"fanthrowdown_salary":0,"fanthrowdown_ratio":0,"fantasy_aces_fp":29.2,"fantasy_aces_salary":4900,"fantasy_aces_ratio":5.96,"draftday_fp":28.43,"draftday_salary":12000,"draftday_ratio":2.37,"fantasy_score_fp":30.3,"fantasy_score_salary":6000,"fantas
y_score_ratio":5.05,"draftster_fp":29.23,"draftster_salary":5900,"draftster_ratio":4.95,"yahoo_fp":29.44,"yahoo_salary":29,"yahoo_ratio":1.02,"treb":4.2},{"nba_player_id":"370","nba_game_id":
Note that even this is not complete. You need to somehow map nba_player_id to the appropriate name. Anyway, a lot of coding will be involved...
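Once the array text has been captured, it is plain JSON and can be parsed into records instead of being handled as one long string. A sketch of that step in Python, using a shortened sample in place of the fetched page; the exact wrapper text around NF_DATA is an assumption about the page source, and the regex is the same bracket-to-bracket match as above, so it only works while the array contains no nested ] characters:

```python
import json
import re

# Shortened stand-in for the page source; field values copied from the output above.
page = (
    'var NF_DATA = {"daily_projections":['
    '{"nba_player_id":"77","minutes":36.3,"pts":"25.3","fanduel_salary":9900},'
    '{"nba_player_id":"397","minutes":35,"pts":"24.4","fanduel_salary":10500}'
    '],"other":1};'
)

# Capture the array including both brackets so that it is valid JSON by itself.
match = re.search(r'"daily_projections":(\[[^\]]*\])', page)
projections = json.loads(match.group(1)) if match else []

for row in projections:
    # pts is stored as a string in the source data, so convert before using it.
    print(row["nba_player_id"], float(row["pts"]), row["fanduel_salary"])
```

The same idea carries over to Apps Script: capture the closing bracket as well, then pass the captured text to JSON.parse.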

Foursquare venue's category photos issue

I am having trouble with the JSON part of venues. In this picture I am trying to take the prefix and suffix and put the size between them, but when I try to put them together, the prefix + size + suffix link comes out like this -> I am storing the prefixes and suffixes in separate NSMutableArrays, but when I try to join them it does not work. Here is how I join them.
Where am I going wrong?
Are you sure that your objects in the imagePrefix and imageSuffix arrays are actually strings? Because judging from your logs it looks as if you're trying to concatenate two arrays and a string. If you let us know what is actually in those arrays you might get more helpful answers. You must be doing some conversion/manipulation from the original JSON, as in the API they get returned as dictionary items not arrays.
On an unrelated note, consider using fast enumeration (for (id item in array)) rather than writing out the for statement as you've done. Generally speaking, it's also much better to post your code as text using markdown syntax rather than as images: images make it much harder to copy/paste your code into an answer.
So, thanks to "Matthias Bauch" I figured it out, and here is my answer to my own question :)
for (NSUInteger e = 0; e < [imagePrefix count]; e++) {
    NSLog(@"%@b_32%@", [[imagePrefix objectAtIndex:e] objectAtIndex:0], [[imageSuffix objectAtIndex:e] objectAtIndex:0]);
}
Thanks guys!

Extracting data (strings) from a large string

A long time ago I had to extract data from a string, and I went with a while loop that went through the whole string char by char extracting bits of data that I need. It wasn't very efficient but it worked.
In my latest app I would like to try to do it the way a good engineer would. Are there ways to search the string for an expression, or maybe a substring?
For example out of the html in the string, there is a line that will contain a team name.
<td width="25%"><span class="teamname">Blue Bombers</span></td>
Is there a call I can make that would find the "teamname" and then extract the team name from between the > and <?
I could go char by char, saving the last 10 chars to a string until the string equals "teamname", then keep going until I hit the >, and save everything I get until I hit a < again. But I guess that's taking the easy, inefficient way.
Many Thanks
-Code
You can get the range of the string "class" as an NSRange and then apply your logic; that will probably reduce the character searching.
Your code should look like this:
if ([substring rangeOfString:@"class"].location != NSNotFound) {
    // "class" was found
} else {
    // "class" was not found
}
If that's the only part of the string you're interested in, then just find a starting point like "teamname" via -rangeOfString:. If there's more than one occurrence, make repeated calls with -rangeOfString:options:range:.
If you need more comprehensive parsing, however..
If this string is actual XHTML then you may be able to use one of the various XML parsers, e.g. TouchXML, and then find what you need via DOM lookups. However if (as seems likely) it's not pure XHTML then this is unlikely to help. In that case you might try loading up the HTML in an offscreen UIWebView and using JavaScript calls to find specific elements.
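The parser-based approach can be sketched outside of iOS as well. The following Python example uses the standard library's html.parser, which tolerates input that is not well-formed XHTML, to collect the text of every span with the class teamname. The class name TeamNameParser is made up, and the input is the line from the question:

```python
from html.parser import HTMLParser

class TeamNameParser(HTMLParser):
    """Collect the text inside <span class="teamname"> elements."""

    def __init__(self):
        super().__init__()
        self.in_team = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "span" and dict(attrs).get("class") == "teamname":
            self.in_team = True
            self.names.append("")

    def handle_data(self, data):
        if self.in_team:
            self.names[-1] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_team = False

parser = TeamNameParser()
parser.feed('<td width="25%"><span class="teamname">Blue Bombers</span></td>')
print(parser.names)
```

Compared with searching for the literal text "teamname", this keeps working if the attribute order or spacing in the markup changes.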