I am developing a site that aggregates news content from other sites, something like this, but without redirecting to the original host to read the article.
The problem is that I don't know the best way to get the complete content. I know I can use each site's RSS feed, but it contains only a short description of each story, not the whole text. I have also read related questions on SO such as these:
How to get the full content from the rss feed in javascript
How to extract the full content from a partial content rss
but none of them solved my problem.
So what is the best way to get the whole content of news stories from different sites, even if that means fetching it directly from them?
I am sorry for my bad English; if my question is not clear enough, I can explain further.
Thanks in advance.
You could use a web scraping library like boilerpipe to extract content from news sites, but scraping breaks easily (for example, if the target site changes its layout), and there might be legal issues with extracting full content from other sites and displaying it on yours.
Edit: I tried the boilerpipe API demo, and the library seems very smart at extracting articles from web pages.
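To give a rough idea of what this kind of extraction involves, here is a minimal sketch in Python using only the standard library. It is not boilerpipe's algorithm, just a much cruder heuristic I am assuming for illustration: collect the text of every <p> element and keep the ones long enough to look like body text. The names ParagraphCollector and extract_article_text and the min_words threshold are made up for this example.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collects the text of every <p> element, ignoring script/style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None   # pieces of text for the <p> currently open
        self._skip_depth = 0  # greater than 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "p":
            self._flush()      # an unclosed <p> ends where the next one starts
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buffer is not None and not self._skip_depth:
            self._buffer.append(data)

    def _flush(self):
        if self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def finish(self):
        """Flush a trailing paragraph that was never closed."""
        self._flush()


def extract_article_text(html, min_words=8):
    """Return paragraphs with at least min_words words (hypothetical threshold)."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    parser.finish()
    return [p for p in parser.paragraphs if len(p.split()) >= min_words]
```

A heuristic like this will misfire on many real pages (short paragraphs, text outside <p> tags), which is exactly why a dedicated library is the better choice in practice.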
The website is fully dynamic.
Meta tags, Open Graph tags, and content are created dynamically on the web pages.
I might be doing something wrong. Please guide me on getting approved for the Google AdSense program.
Google AdSense gave the reason "Insufficient content" for the rejection.
I think the only real answer is to implement some kind of partial caching. If the content you need indexed is not in the HTML source of your pages, it won't be indexed.
What exactly do you mean by "fully dynamic" and what parts do you want to be indexed?
I'm trying to determine the best way to "merge" my Orchard blog into my existing website. Currently the blog is accessed outside the site.
I threw together a quick view in my MVC site that just loads the blog into an iframe. Any other ideas?
The blog is tuned up with a great theme and tons of mods & styling that matches my main site design to a T.
On the home page of my site, I'm using the RSS feed to output a list of the last 3 blog posts. My idea is that the user will click on a blog post link and go directly to the view that hosts the blog in the inline frame.
I guess the only variable that I haven't handled yet is how to load up the correct page in the blog based on the link that the user clicked on my main site home page.
I've read other posts on this subject and it seems like the solution that is always offered is to merge all the code from the main website into Orchard which seems insane...I have a very large auction based website, taking all that logic & content and putting into Orchard is not an option.
Hope all that makes sense, thanks for the input. I can't imagine it would be a huge issue to "seamlessly" integrate my blog with my MVC site.
Orchard was never designed to be integrated into an existing application, so something like what you've done is what you have to do. The iframe however has a number of problems, such as its fixed size, and awkward navigation. It's better to integrate data than markup. It's now easy to build WebAPI controllers to expose Orchard data. You could consume that data in your application and render it there. That enables you to manipulate the data before rendering, which is of course easier than manipulating rendered HTML. For example, you can build your own link URLs so that clicking on a post's title goes to an action on your site that fetches the post contents rather than the Orchard post URL.
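To illustrate the last point, here is a minimal sketch, written in Python only to keep it short (the same idea applies in an MVC controller). Everything in it is an assumption for illustration: the JSON shape with "title" and "slug" fields, the function name build_post_links, and the local route "/blog/post/". It is not an Orchard API.

```python
import json
from urllib.parse import quote


def build_post_links(api_json, local_route="/blog/post/"):
    """Turn post data from a (hypothetical) blog API into links on the main site.

    api_json is assumed to be a JSON array of objects with "title" and "slug".
    """
    posts = json.loads(api_json)
    return [
        {
            "title": post["title"],
            # Point at an action on the main site rather than the blog's own URL.
            "url": local_route + quote(str(post["slug"]), safe=""),
        }
        for post in posts
    ]
```

The action behind each generated URL would then fetch that post's content from the blog's API and render it inside the main site's layout.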
One final comment: It is a little weird that an auction website would need to integrate a blog in the middle of its own rendering. Shouldn't the blog be a separate section of the site?
I have over 10,000 pages on my website. I just created a PHP script to automatically integrate the Facebook comments widget into each page.
However, I was wondering if there is a way to monitor the latest comments added to my website so I don't have to browse through 10,000 pages to see them.
I am also wondering whether I will be able to delete comments by other users. How can Facebook tell that I am the webmaster of the page? If some user leaves nasty comments on one of the pages, I want to be able to delete them.
Facebook's Graph API allows you to query for Facebook data, including comments. It's pretty easy to use but does require a bit of code to make web requests, parse JSON, etc. You can pass in filters, such as only wanting comments after a certain date, which makes querying for "new comments" simple.
The API will return all comments (even those deleted by you; afaik there is no way to tell whether a particular comment has been deleted). The results are returned in pages, so importing a lot of comments can take a bit (as you need multiple round-trips). Also, there is no way to use the API to delete posts - this has to go through the Facebook web pages (or some other means I don't know about).
The documentation pages are pretty exhaustive and will explain how to get started.
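As a rough illustration of the paging described above, here is a minimal Python sketch. Treat the details as assumptions to check against the documentation: the unversioned base URL, the /{object_id}/comments path, and the since and limit parameters are what I would expect, and the helper names are made up. The part the sketch relies on is that each response carries its items under "data" and the URL of the next page under "paging" -> "next".

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GRAPH_BASE = "https://graph.facebook.com"  # assumption: unversioned base URL


def first_page_url(object_id, since_unix, access_token, limit=100):
    """Build the URL of the first page of comments newer than since_unix."""
    query = urlencode(
        {"since": since_unix, "limit": limit, "access_token": access_token}
    )
    return f"{GRAPH_BASE}/{object_id}/comments?{query}"


def fetch_json(url):
    """Perform one round-trip and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_comments(start_url, fetch=fetch_json, max_pages=50):
    """Yield comments page by page, following paging.next until it runs out."""
    url = start_url
    for _ in range(max_pages):  # hard cap so a bad response cannot loop forever
        if not url:
            break
        page = fetch(url)
        yield from page.get("data", [])
        url = page.get("paging", {}).get("next")
```

Passing the fetch function in as a parameter makes it easy to try the loop against canned responses before pointing it at the real API.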
Hello. If you want to add the Facebook comment box to your website, go to Facebook's plugins page and copy the comment box code.
Now open your website's admin panel. If your website was created in Blogger, go to Layout; on other setups, go to Plugins. You will see a layout there; add your code in the (top down) bar.
If you don't understand that, here is a second way.
Go to your website and open your dashboard, then go to the website template and edit it.
You will now see some code in a new tab. Scroll down and put your code in the HTML body.
Hello, I am using GDataXML to parse RSS feeds.
However, most of today's feeds don't include the full text of the article, so most of the time I end up with just a tiny piece of the whole thing.
I see this feature in a lot of iPhone and iPad readers: they fetch the article from the web and show it in full text.
So how do I do that?
My idea is this: the root element marks the start of the article.
So if the root element is [article],
I need to go to the website, fetch the HTML between the opening and closing divs, and then display it in my app.
So how do I get the code between those divs? Regular expressions, or something else? I would like an example, thanks.
And finally, how do I display images after I get the full article in HTML format?
Thanks, guys, and regards.
Use MWFeedParser. It gives you each feed item with these fields:
identifier, title, link, date, updated, summary, content, enclosures
I use MWFeedParser as well, because it will get all the elements of a feed entry, but you are correct that it will not do a "deep dive" into all of the links in the feed entry.
If you want to bring in the full content from the link, and the full content from the enclosures (such as audio or video from a podcast), you are basically talking about saving the web page for offline viewing. For a full html page, you would have to save that HTML, plus crawl the whole page and save the images, and change the path of those images so that you would be able to load it offline. It's not really the job of the RSS applications to save HTML content for offline use, but to get the elements of the RSS feed. Once you have all the links you want to save for offline use, you need to provide the code that will take a URL and save it offline.
I did a search for ios save html offline and found this post which seems pretty positive using ASIHttpRequest to save a page offline: https://stackoverflow.com/a/6698854/1072068. I would recommend you try using something like that once you get the parts of the rss feed entry from MWFeedParser.
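To make the "save the images and change their paths" step more concrete, here is a minimal sketch in Python rather than Objective-C, just to show the logic. The function name plan_offline_images, the "images" folder, and the hash-based file names are all choices made up for this example. It only plans the work: it returns the rewritten HTML plus a map of which absolute URL should be downloaded to which local path, and leaves the downloading to you.

```python
import hashlib
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImageSrcCollector(HTMLParser):
    """Records the src attribute of every <img> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src and src not in self.sources:
                self.sources.append(src)


def plan_offline_images(html, page_url, local_dir="images"):
    """Return (rewritten_html, {absolute_image_url: local_path})."""
    collector = ImageSrcCollector()
    collector.feed(html)
    collector.close()

    downloads = {}
    rewritten = html
    for src in collector.sources:
        absolute = urljoin(page_url, src)  # resolve relative paths
        extension = posixpath.splitext(urlparse(absolute).path)[1]
        name = hashlib.sha1(absolute.encode("utf-8")).hexdigest()[:12] + extension
        local_path = f"{local_dir}/{name}"
        downloads[absolute] = local_path
        # Simplification: only rewrites double-quoted src attributes
        # whose value contains no HTML entities.
        rewritten = rewritten.replace(f'src="{src}"', f'src="{local_path}"')
    return rewritten, downloads
```

Once every URL in the returned map has been downloaded to its local path, the rewritten HTML can be loaded from disk with the images resolving locally.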
I want to present the content of a post in a post/page on another domain. The original post is updated regularly so I cannot use copy/paste. Is there a way to do this?
P.S. both blogs are mine so I have no copyright issues and access to code on both sites.
P.P.S I only need the content not comments etc.
Thanks in advance!
Since these are two separate WordPress instances, you have two options:
1. Write a piece of custom code to extract the post from the underlying WordPress database and place it into your post.
2. As posts are output as RSS, you could use that feed to display the required post in the other blog. There are plugins for RSS import, such as http://wordpress.org/extend/plugins/rss-import/
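For the RSS route, the core logic is small. Here is a minimal sketch in Python (a WordPress plugin would do the same thing in PHP). It assumes the source blog's feed is set to output full text, in which case each item carries the post body in a content:encoded element; the function name post_content_from_feed is made up, and fetching the feed XML is left out.

```python
import xml.etree.ElementTree as ET

# Namespace of the RSS "content" module, which defines content:encoded.
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"


def post_content_from_feed(feed_xml, post_link):
    """Return the body of the feed item whose <link> equals post_link.

    Falls back to <description> when the item has no content:encoded,
    and returns None when no item matches.
    """
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if (item.findtext("link") or "").strip() == post_link:
            full_text = item.findtext(CONTENT_NS + "encoded")
            if full_text is not None:
                return full_text
            return item.findtext("description")
    return None
```

Because the feed is read each time (or cached for a short period), the displayed content follows the original post as it is updated, and comments are never included since they are not part of the item.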