Simple HTML Dom Parser - Skip elements with certain ID
I am using Simple HTML Dom Parser to query Google for a specific keyword and then loop through the content. However, I do not want to query ads or the newsbox. It's easy to exclude ads, since their list elements have a different class, but the newsbox li element has the same class as a normal result plus an additional id.
Result li element
<li class="g">...</li>
Newsbox li element
<li class="g" id="newsbox">...</li>
How can I exclude li elements with the ID newsbox?
I read around here, and this is the closest I came, following another user's suggestion, but it didn't work:
$query = file_get_html('https://google.com/search?q=test');
$li_elements = $query->find('li[class=g id!=newsbox]');
Any other ideas or someone that has solved this before?
UPDATE
I am still working on this and I pretty much came to a dead end. This is my latest code:
include('simple_html_dom.php');
$html = file_get_html('https://www.google.co.uk/search?q=football');
// Find all article blocks
foreach($html->find('#res h3.r') as $article) {
$item['title'] = $article->plaintext;
$item['intro'] = $article->find('a', 0)->href;
$articles[] = $item;
}
print_r($articles);
And this is the printed array
Array
(
[0] => Array
(
[title] => BBC Sport - Football
[intro] => /url?q=http://www.bbc.co.uk/sport/0/football/&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CBQQFjAA&usg=AFQjCNGHTFqXJoRjHKBSCdKFiW_BX6eGDw
)
[1] => Array
(
[title] => News for football
[intro] => /search?q=football&ie=UTF-8&prmd=ivnsl&source=univ&tbm=nws&tbo=u&sa=X&ei=NkblU-s8h6nQBcCJgOAI&ved=0CB8QqAI
)
[2] => Array
(
[title] => Football Games, Results, Scores, Transfers, News | Sky Sports
[intro] => /url?q=http://www1.skysports.com/football/&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CCgQFjAE&usg=AFQjCNE4VP4WAHIYJAoPIBJoUx1pC-1jBA
)
[3] => Array
(
[title] => Local business results for football near London NW5
[intro] => https://maps.google.co.uk/maps?um=1&ie=UTF-8&fb=1&gl=uk&q=football&hq=football&hnear=0x48761a535791ef6f:0x493f677c231558c8,London+NW5&sa=X&ei=NkblU-s8h6nQBcCJgOAI&ved=0CC4QtQM
)
[4] => Array
(
[title] => Football news, match reports and fixtures | Football | The Guardian
[intro] => /url?q=http://www.theguardian.com/football&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CE4QFjAM&usg=AFQjCNHPhgIljb53cFPRHlb1vCa1fmWJag
)
[5] => Array
(
[title] => NewsNow: Football News | Breaking News & Search 24/7
[intro] => /url?q=http://www.newsnow.co.uk/h/Sport/Football&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CFQQFjAN&usg=AFQjCNEmmlrEayvHdebKTfPykGhHxRioLA
)
[6] => Array
(
[title] => Football365 - Football News, Views, Gossip and much more...
[intro] => /url?q=http://www.football365.com/&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CFoQFjAO&usg=AFQjCNFKIP3xgtxw9DhNtOhVfpT4pbpLPw
)
[7] => Array
(
[title] => Football - Wikipedia, the free encyclopedia
[intro] => /url?q=http://en.wikipedia.org/wiki/Football&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CGAQFjAP&usg=AFQjCNF2Fk8WH4rzEvWzmYIEUycZnjvjpg
)
[8] => Array
(
[title] => Football in London - Things To Do - visitlondon.com
[intro] => /url?q=http://www.visitlondon.com/things-to-do/whats-on/sport/football&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CGYQFjAQ&usg=AFQjCNEdSNJc-mlVpaWEY9yPjcoDSaDLIw
)
[9] => Array
(
[title] => London Football Leagues - 5-a-side - 7-a-side - 11-a-side - Midweek ...
[intro] => /url?q=http://www.londonfootball.co.uk/&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CHMQFjAR&usg=AFQjCNGnZtZQxUmUYQtDF0Tj5nJRnR2Yig
)
[10] => Array
(
[title] => Football Tickets and Event Details | Ticketmaster UK Sport
[intro] => /url?q=http://www.ticketmaster.co.uk/browse/football-catid-11/sport-rid-10004&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CHkQFjAS&usg=AFQjCNFwTfpq-klboIEf0EbhlMQWvzHeKQ
)
)
I don't understand why the second result, array[1][title], is stored in the array, since according to this line, $html->find('#res h3.r') as $article, it shouldn't be. It's neither wrapped in a div with the id #res nor in an h3 tag.
Any idea?
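Whatever the reason the selector matches the "News for football" heading, the printed array above suggests a practical filter: every ordinary result has an href starting with /url?q=, while the news and maps entries do not. The following is only a sketch under that assumption (Google's markup can change at any time). The helper name organic_target is made up for illustration, and the two sample entries are copied from the array above.

```php
<?php
// Hypothetical helper: returns the target URL of an ordinary result link,
// or null for any href that does not start with "/url?".
function organic_target($href)
{
    // Hrefs read from parsed HTML may still contain "&amp;" entities.
    $href = html_entity_decode($href);
    $prefix = '/url?';
    if (strpos($href, $prefix) !== 0) {
        return null;
    }
    // Everything after the prefix is the query string; parse_str also URL-decodes it.
    $params = array();
    parse_str(substr($href, strlen($prefix)), $params);
    return isset($params['q']) ? $params['q'] : null;
}

// Two entries copied from the printed array above.
$articles = array(
    array(
        'title' => 'BBC Sport - Football',
        'intro' => '/url?q=http://www.bbc.co.uk/sport/0/football/&sa=U&ei=NkblU-s8h6nQBcCJgOAI&ved=0CBQQFjAA&usg=AFQjCNGHTFqXJoRjHKBSCdKFiW_BX6eGDw',
    ),
    array(
        'title' => 'News for football',
        'intro' => '/search?q=football&ie=UTF-8&prmd=ivnsl&source=univ&tbm=nws&tbo=u&sa=X&ei=NkblU-s8h6nQBcCJgOAI&ved=0CB8QqAI',
    ),
);

// Keep only entries whose link points at an external result.
$organic = array();
foreach ($articles as $item) {
    $target = organic_target($item['intro']);
    if ($target !== null) {
        $item['intro'] = $target;
        $organic[] = $item;
    }
}

print_r($organic);
```

In the scraper itself, the same check could run inside the foreach loop before $articles[] = $item; is reached.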
Unfortunately, Simple HTML Dom Parser doesn't support that kind of flexibility; however, there is a workaround.
You can first remove the unwanted block, then retrieve the elements you want:
$query->find('li#newsbox', 0)->outertext = '';
$li_elements = $query->find('li.g');
Edit:
Here's an example code showing how it works:
$input = <<<_DATA_
<div class="g" id="newsbox">Bad node</div>
<div class="g">Usefull node</div>
_DATA_;
// Create a DOM object
$html = new simple_html_dom();
// Load HTML from a string
$html->load($input);
// Remove the bad node
$html->find('div#newsbox', 0)->outertext = ''; // Comment this line to print the original html content
echo $html;
Working code
simple_html_dom claims to support that so it appears to be a bug.
The proper css way to select that would be li.g:not(#newsbox) which is not supported by simple, but is supported by this one.
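If switching libraries is an option, the li.g:not(#newsbox) idea can also be written as an XPath query using PHP's bundled DOM extension. This is only a sketch: it uses DOMDocument and DOMXPath instead of Simple HTML Dom Parser, and the sample markup and the $results variable are made up for illustration.

```php
<?php
// Made-up sample markup that mirrors the structure described in the question.
$input = <<<HTML
<ul>
  <li class="g">Result one</li>
  <li class="g" id="newsbox">News box</li>
  <li class="g extra">Result two</li>
  <li class="ad">Advert</li>
</ul>
HTML;

$doc = new DOMDocument();
// Real pages are rarely valid HTML, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($input);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select <li> elements whose class list contains "g" and whose id is not "newsbox".
$query = "//li[contains(concat(' ', normalize-space(@class), ' '), ' g ')"
       . " and not(@id='newsbox')]";

$results = array();
foreach ($xpath->query($query) as $li) {
    $results[] = trim($li->textContent);
}

print_r($results);
```

The concat/normalize-space construction is the usual XPath 1.0 way to test for a single class name, so that class="g extra" still matches while class="ag" would not.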
How about this idea
// Remove HTML tags
$elements_remove = array('#newsbox','header','footer','script','css');
foreach($elements_remove as $e) {
foreach($html->find($e) as $item) {
$item->innertext = '';
}
}
That works for me. I added in some HTML elements as well in the exclusion array.
Related
facebook api php news feed get large picture
I'm adding news feed pictures to my members site and need help getting the large pictures from the news feed. This is my code to get the news feed pictures:
$json_object = @file_get_contents('https://graph.facebook.com/me/home?access_token=' . $fb_access_token);
$news_feeds = json_decode($json_object, true);
foreach($news_feeds['data'] as $news_feed) {
echo '<img src="' . $news_feed['picture'] . '" />' . '<br />';
}
This gives me the news feed pictures fine, but they are all different sizes. Any help would be good! I have looked around for a long time and still can't find the answer.
This is an example of a Facebook news feed array I am trying to get the large picture from:
[id] => 1423830694520861_1508536172716979
[from] => Array
(
[category] => Public figure
[name] => Dan Bilzerian
[id] => 1423830694520861
)
[to] => Array
(
[data] => Array
(
[0] => Array
(
[category] => Public figure
[name] => Jen Selter
[id] => 244183679061369
)
)
)
[message] => Happy 21st to Jen Selter and her #NeverEndingAss
[message_tags] => Array
(
[14] => Array
(
[0] => Array
(
[id] => 244183679061369
[name] => Jen Selter
[type] => page
[offset] => 14
[length] => 10
)
)
)
[picture] => https://scontent-b.xx.fbcdn.net/hphotos-xpf1/v/t1.0-9/p130x130/10563168_1508536129383650_2440015247017447728_n.jpg?oh=bfdb86c35b4b0ffe5a4ad9dbe65aba59&oe=54688CA7
[link] => https://www.facebook.com/danbilzerianofficial/photos/a.1425329161037681.1073741829.1423830694520861/1508536129383650/?type=1&relevant_count=1
[icon] => https://fbstatic-a.akamaihd.net/rsrc.php/v2/yx/r/og8V99JVf8G.gif
I believe I have to change _n.jpg to _b.jpg, or p130x130 to b740x740, but the replacement is not working. I have tried running it before and after the foreach loop, and it still doesn't work:
$news_feed['picture'] = str_replace("_n.jpg", "_b.jpg", $news_feed['picture']);
and also this:
$news_feed['picture'] = str_replace("p130x130", "b740x740", $news_feed['picture']);
Can someone please help? :( Thanks
Try object_id ....it worked for me :) echo '<img src="https://graph.facebook.com/' . $news_feed['object_id'] . '/picture?type=normal">' . '<br />';
SugarCRM: how to get all contacts for an account via REST API
I am trying to get all contacts for a particular account (I know the account id) from SugarCRM using the v2 REST API. I am sending a GET request with the following parameters:
input_type => 'JSON'
response_type => 'JSON'
method => 'get_entry_list'
rest_data => '{session:"some-valid-session-id", module_name:"Contacts", query:"contacts.account_id=some-valid-id"}'
I expect to get all contacts that are related to this account, but instead I get an error:
"... MySQL error 1054: Unknown column 'contacts.account_id' in 'where clause'"
However, when I try to get all contacts without providing any query (query='') I get all the contacts with all their properties, and I can see that there is an account_id property. Can anyone help?
Try query:"accounts.id=some-valid-id". It has worked for me in the past with the SOAP API.
Here's my method. It uses this wrapper class: http://github.com/asakusuma/SugarCRM-REST-API-Wrapper-Class/
/**
 * returns an array of contacts that are related to the accountId passed as a param.
 * The array returned will be an array of associative arrays.
 * @param $accountId
 * @param array $contactSelectFields optional sets the different items to return, default includes id, email1, name, title, phone_work, and description
 * @return array
 */
public function getAllContactsAtOrganization( $accountId, $contactSelectFields=array("id", "email1", "name", "title", "phone_work", "description")) {
    $sugar = new Sugar_REST( SUGAR_REST_URL, SUGAR_USER_NAME, SUGAR_PASSWORD);
    $fields = array(
        "Accounts" => array("id", "name"),
        "Contacts" => $contactSelectFields);
    $options = array( 'where' => "accounts.id='$accountId'" );
    $apiResult = $sugar->get_with_related("Accounts", $fields, $options );
    $contacts = array();
    foreach( $apiResult['relationship_list'][0]['link_list'][0]['records'] as $almostContact) {
        $curr = array();
        foreach($contactSelectFields as $key) {
            $curr[$key] = $almostContact['link_value'][$key]['value'];
        }
        $contacts[] = $curr;
    }
    //print_r($contacts);
    return $contacts;
}
Sample Return
Array
(
[0] => Array
(
[id] => 47e1376c-3029-fc42-5ae2-51aeead1041b
[email1] => johndoe@gmail.com
[name] => Blake Robertson
[title] => CTO
[phone_work] => 8881112222
[description] => Opinionated developer that hates SugarCRM's REST API with a passion!
)
[1] => Array
(
[id] => 4c8e3fcf-8e69-ed7d-e239-51a8efa4f530
[email1] => csmith@mailinator.com
[name] => Carolyn Smith
[title] => Director of Something
[phone_work] => 832-211-2222
[description] => She's a smooth operator...
)
)
For Reference Purposes
Here's the "rest-data" (nicely formatted, using print_r of the PHP array):
Array
(
[session] => 9j7fm4268l0aqm25kvf9v567t3
[module_name] => Accounts
[query] => accounts.id='e583715b-7168-5d61-5fb1-513510b39705'
[order_by] =>
[offset] => 0
[select_fields] => Array
(
[0] => id
[1] => name
)
[link_name_to_fields_array] => Array
(
[0] => Array
(
[name] => contacts
[value] => Array
(
[0] => id
[1] => email1
[2] => name
[3] => title
[4] => phone_work
[5] => description
)
)
)
[max_results] => 20
[deleted] => FALSE
)
Post Body
method=get_entry_list&input_type=JSON&response_type=JSON&rest_data={"session":"iov5a257lk5acsg9l3ll6kuej3","module_name":"Accounts","query":"accounts.id='e583715b-7168-5d61-5fb1-513510b39705'","order_by":null,"offset":0,"select_fields":["id","name"],"link_name_to_fields_array":[{"name":"contacts","value":["id","email1","name","title","phone_work","description"]}],"max_results":20,"deleted":"FALSE"}
method=logout&input_type=JSON&response_type=JSON&rest_data={"session":"iov5a257lk5acsg9l3ll6kuej3"}
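For anyone not using the wrapper class, the post body above can be assembled with plain PHP. The sketch below only builds the request string and sends nothing. The session and account ids are placeholders, the variable names are my own, and the field order simply mirrors the rest_data dump above. Generating rest_data with json_encode also quotes every key, which the hand-written rest_data in the question does not.

```php
<?php
// Placeholder values; substitute a real session id and account id.
$sessionId = 'some-valid-session-id';
$accountId = 'some-valid-id';

// Same fields, in the same order, as the rest_data dump shown above.
$restData = array(
    'session' => $sessionId,
    'module_name' => 'Accounts',
    'query' => "accounts.id='" . $accountId . "'",
    'order_by' => null,
    'offset' => 0,
    'select_fields' => array('id', 'name'),
    'link_name_to_fields_array' => array(
        array(
            'name' => 'contacts',
            'value' => array('id', 'email1', 'name', 'title', 'phone_work', 'description'),
        ),
    ),
    'max_results' => 20,
    'deleted' => 'FALSE',
);

// The four top-level parameters named in the question.
$postFields = array(
    'method' => 'get_entry_list',
    'input_type' => 'JSON',
    'response_type' => 'JSON',
    'rest_data' => json_encode($restData),
);

// URL-encode everything into a single request body.
$postBody = http_build_query($postFields);
echo $postBody, "\n";
```

Because the account id ends up inside a SQL where clause, it is worth validating it (for example, against the expected id format) before building the query.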
I'm not familiar with SugarCRM yet, but did you try with just account_id=some-valid-id? I also made a REST request to add a contact to SugarCRM, and I didn't mention the table's name, just the fields. I haven't tried this, but it seems logical to me: since you already gave the module's name, I guess Sugar knows which table(s) to look in when processing your query.
Parse Facebook API data using loop for getting fan page ID#s?
I've been learning how to parse JSON data returned from the facebook->api. I've figured out how to fetch fan pages from a specific profile id and want to parse them using a loop. Here's the code and example I have below.
This is the data I get back from the facebook->api:
Array
(
[0] => Array
(
[page_id] => XXXXXX60828
)
[1] => Array
(
[page_id] => XXXXXX0750
)
[2] => Array
(
[page_id] => XXXXXX91225
)
[3] => Array
(
[page_id] => XXXXXX1960343
)
[4] => Array
(
[page_id] => XXXXXX60863
)
[5] => Array
(
[page_id] => XXXXXX8582
)
)
I need to be able to put this data in a loop and extract the page_id#s. I'm still getting familiar with JSON and am having trouble figuring this out. How can I loop over this with foreach and pull out the page id#s?
Why not just use the array you've got:
<ul>
<?php
foreach($pages as $k=>$v) {
echo "<li>page id#: {$v['page_id']} </li>";
}
?>
</ul>
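If only the ids are needed as a flat list, array_column (available since PHP 5.5) can collect them without a hand-written loop. This is a small sketch: $pages stands in for the decoded API response, using the first three masked ids from the question, and $pageIds is a name chosen here for illustration.

```php
<?php
// Stand-in for the decoded API response shown in the question.
$pages = array(
    array('page_id' => 'XXXXXX60828'),
    array('page_id' => 'XXXXXX0750'),
    array('page_id' => 'XXXXXX91225'),
);

// Collect the value stored under 'page_id' in every row into one flat array.
$pageIds = array_column($pages, 'page_id');

foreach ($pageIds as $pageId) {
    echo "page id#: {$pageId}\n";
}
```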
Problems getting monthly active users correctly (via Facebook Graph API)
Hey there! I'm not able to fetch a reliable monthly active users number for my application. I tried it with FQL:
[...]
$end_time = date('Y-m-d', time()-(60*60*24*2)); // Don't know which is correct: sometimes *2 works, sometimes *3
$fql = $facebook->api(array(
"method" => "fql.query",
"query" => "SELECT metric, value FROM insights WHERE object_id='000000000' AND metric='application_active_users' AND end_time=end_time_date('".$end_time."') AND period=period('month')"
));
[...]
It looks like it's not possible to get an up-to-the-minute value. I want to get the same value as stated on my application page, but with this code it's different every day. I also tried this solution:
$fql = $facebook->api('/000000000/insights/application_active_users/month');
but in the response I only get outdated values that are a few days old:
Array
(
[data] => Array
(
[0] => Array
(
[id] => 000000000/insights/application_active_users/month
[name] => application_active_users
[period] => month
[values] => Array
(
[0] => Array
(
[value] => 166345
[end_time] => 2010-12-09T08:00:00+0000
)
[1] => Array
(
[value] => 167679
[end_time] => 2010-12-10T08:00:00+0000
)
[2] => Array
(
[value] => 168983
[end_time] => 2010-12-11T08:00:00+0000
)
)
[description] => Monthly Users who have engaged with your application or viewed your application (Unique Users)
)
)
[paging] => Array
(
[previous] => https://graph.facebook.com/000000000/insights/application_active_users/month?since=1291556506&until=1291815706
[next] => https://graph.facebook.com/000000000/insights/application_active_users/month?since=1292074906&until=1292334106
)
)
What am I doing wrong?
Facebook insights data is not available in real time. The insights graph will always display historic data only. If you are interested in the current monthly_active_users number, look into the applications table with FQL.
Populate data in MultiOptions element of Zend Form
Hello, I have an array like this:
Array
(
[id] => 1
[code] => Dep98
[description] => Hello World
[facility] => Array
(
[0] => FacName1
[1] => FacName2
)
)
But when I populate Zend_Form with this array, it only shows data in the textbox elements whose ids match the array keys, not in the multiselect drop-down element. For example:
'code' is also the id of the form's first textbox element,
'description' is also the id of the form's second textbox element,
'facility' is also the id of the form's third element, a MultiOptions element.
But the MultiOptions element does not show any record.
I agree with Travis; you should pass an array with the following values to populate:
$vals = array('code'=>5, 'description' => 'testing', 'facility' => array(1=>'FacName2'));
$form->populate($vals);
But note: the options must be filled in the facility form element before you attempt to populate or validate. Don't expect the facility value to be set if the facility element has an empty list of options.
What exactly do you want in the drop-down box? The array you pass to multiOptions must be in the form value => title. You may want to loop through your results and generate an options array. For example:
$options = array();
foreach ( $data as $value ) {
$options[$value['id']] = $value['description'];
}
$select = new Zend_Form_Element_Select("select_field");
$select->setMultiOptions($options);
Try this: Array ( [id] => 1 [code] => Dep98 [description] => Hello World [facility] => Array ( FacName1 => [0] FacName2 => [1] ) )