Xpath starting retuning None on Scrapy - dom

I'm trying to crawl a site and to do so, I'm using Scrapy. So, when doing requests to nested pages, the procedure usually gets the the information correctly on the first trials, but, on later requests the nodes starts to return None. I'm using xpath's functionality. Below I'm pasting some lines of the parse function:
(I tried this one with the approach of explicitly comparing the class value)
title = response.xpath('//span[#class="inlineFree"]/text()').extract_first()
(With this one I used the contains function)
view = response.xpath('//span[contains(#class,"count")]/text()').extract_first()
(I've also used this one when I found more suitable)
comments = response.css('div.commentMessage > span::text').extract()
Am I doing something wrong on paths?
Is there any reason for the crawler to stop reading the nodes correctly?

Cannot say what the problem is without the log messages or the spider code but..
What happens most of the time is that websites fo not follow a strict html structure .For some properties the 'title' may be inside the span
but for the next iteration it may be
span[#class="inlineFree"]/h1/text() or or any other tag
so you should check the html for those returning None

Related

TTeeGrid is not displaying the data at runtime using data from REST

I created a simple RME for TTeeGrid, a descendant perhaps of TGrid in Firemonkey. As shown below, the data are displayed at design time but not at runtime except the headers.
I've been breaking my head over this for weeks already but not luck.
Let me know if you need more details but what you see in the image are all you get.
I just need help to have the data displayed at runtime as shown in the design time.
UPDATE 1
This issue is not the case with TPrototypeBindSource. The data shown in the design time are displayed at runtime. Something is wrong somewhere.
I've never used the TeeGrid before, but the following worked fine
first time for me in Delphi Tokyo:
Download the TeeGrid trial from Steema.Com & install.
Create new multi-device app and place a TeeGrid and a FDMemTable on the form.
Load FDMemTable1 with the file Parts.Fds from the Delphi samples Data directory. Note, I did not then create any FieldDefs as I mentioned in my comment earlier as what I'm describing works without them.
Set the DataSource property of TeeGrid1 to FDMemTable1. TeeGrid1 immediately
creates columns for each of the Parts fields and populates them with data - see
screenshot below. I don't ordinarily include screenshots but in this case thought
I would as what I got was so clearly at odds with what you've reported.
Your TeeGrid etc are obviously more complicated than mine. so the best I can
suggest is that you backtrack to step 2 and see if you can replicate my result
with your data (either at design time or run time). It might be worth loading
your FDMemTable with some data at design time, as my impression is that live bindings
is less grief-prone when the datasource has some data.
Incidentally, fwiw the results of my own attempts to set up live bindings even with a regular TGrid have been rather patchy, until I discovered that instead of messing with the LB components myself, simply starting with a fresh TGrid, right-clicking on it and leaving the Live Bindings Wizard
to do its stuff consistently works fine.

Why doesn't elapsed_time() work from log_message() even if called from post_system hook?

If I try to log benchmark->elapsed_time() in post_system hook, it just logs {elapsed_time} as if I called it from a controller.
CodeIgniter documentation says:
"post_system Called after the final rendered page is sent to the browser, at the end of system execution after the finalized data is sent to the browser."
It also says you are supposed to echo the elapsed_time() in a view to show it to the user, but... how is it possible that elapsed_time() is still being calculated after sending the finalized data to the browser?
I feel being lied to.
People keep saying I should use my own marks and get the difference, but that's not the same as using this...
Turns out the documentation also says:
"If the first parameter is empty this function instead returns the {elapsed_time} pseudo-variable. This permits the full system execution time to be shown in a template. The output class will swap the real value for this variable."
I went to the Output class and found the 2 marks it is using: total_execution_time_start and total_execution_time_end and I can use those in the post_system hook.

Problems with GitHub rendering my README.rst incorrectly..?

I've got a GitHub repo/branch where I'm attempting to update the README.rst, but it's not formatting the way I expect when it comes to the bullet lists I'm including.
Everything seems ok except for my Usage section, in which I have the following:
*****
Usage
*****
- Open the template file that corresponds to the API call you'd like to make.
* Example: If we want to make a call to the RefundTransaction API we open up /templates/RefundTransaction.php
- You may leave the file here, or save this file to the location on your web server where you'd like this call to be made.
* I like to save the files to a separate location and keep the ones included with the library as empty templates.
* Note that you can also copy/paste the template code into your own file(s).
- Each template file includes PHP arrays for every parameter available to that particular API. Simply fill in the array parameters with your own dynamic (or static) data. This data may come from:
* Session Variables
* General Variables
* Database Recordsets
* Static Values
* Etc.
- When you run the file you will get a $PayPalResult array that consists of all the response parameters from PayPal, original request parameters sent to PayPal, and raw request/response info for troubleshooting.
* You may refer to the `PayPal API Reference Guide <https://developer.paypal.com/webapps/developer/docs/classic/api/>`_ for details about what response parameters you can expect to get back from any successful API request.
+ Example: When working with RefundTransaction, I can see that PayPal will return a REFUNDTRANSACTIONID, FEEREFUNDAMT, etc. As such, I know that those values will be included in $PayPalResult['REFUNDTRANSACTIONID'] and $PayPalResult['FEEREFUNDAMT'] respectively.
- If errors occur they will be available in $PayPalResult['ERRORS']
You may refer to this `overview video <http://www.angelleye.com/overview-of-php-class-library-for-paypal/>`_ of how to use the library,
and there are also samples provided in the /samples directory as well as blank templates ready to use under /templates.
You may `contact me directly <http://www.angelleye.com/contact-us/>`_ if you need additional help getting started. I offer 30 min of free training for using this library,
which is generally plenty to get you up-and-running.
For some reason, though, when you look at that on GitHub the first line of the bullet lists is coming up bold and italics and I have no idea why. Also, the sub-list where it shows Session Variables, General Variables, etc. is supposed to be all the same sub-list. I'm not sure why it's dropping into another sub when it sees General Variables.
Any information on what I've done wrong here would be greatly appreciated. Thanks!
Switch from .rst to .md and then use '#' for your headings.
## Usage
- Open the template file that corresponds to the API call you'd like to make.
* Example

Refresh or Clear GUI (Apps Script Gadget Embedded in Google Site)

I have a small form that my colleagues and I often fill out that can be seen here, with source code (view-only) here. The spreadsheet that this form posts to can be seen (view-only) here.
Basically, the apps script is solely for preventing my coworkers from having to scroll through thousands of rows of a spreadsheet (this is also used for our film collection) to check in/out inventory. It will also, soon, post detailed history of the checkout in an audit spreadsheet.
I have attempted, with no success, to clear the values of the text fields after the data has been posted to the spreadsheet. What I essentially want to do is restart the GUI so that the fields are once again blank.
I have tried accessing the text fields by id but the return type is a Generic widget, which I of course can't do anything with. And since control is passed to the handler functions, the text fields can't be accessed, or at least I can't figure out how (and I looked through the documentation and online solutions for hours last night).
So the question: how can I erase/clear the values of the text fields in the GUI after values have been posted to the spreadsheet? (return from handler function to access text fields again, or restart the GUI?
I can accomplish restarting the script runs on a spreadsheet, but I have not been able to do this when it is embedded in a Google site.
The getElementById('xxxxx').setText('') approach was the right way to do it. but the way you tried to test it doesn't work... the new value would only be available in the next handler call.
/*
This function only has the desired effect if it is called within the handler
function (not within a nested function in the handler function).
*/
function restartGUI() {
var gui = UiApp.getActiveApplication();
gui.getElementById('equipmentIDField1').setText('');
gui.getElementById('equipmentIDField2').setText('');
gui.getElementById('equipmentIDField3').setText('');
gui.getElementById('studentLastNameField').setText('');
gui.getElementById('labasstLastNameField').setText('');
return gui;
}

Trying to figure out what {s: ;} tags mean and where they come from

I am working on migrating posts from the RightNow infrastructure to another service called ZenDesk. I noticed that whenever users added files or even URL links, when I pull the xml data from RightNow it gives me a lot of weird codes like this:
{s:3:""url"";s:45:""/files/56f5be6c1/MUG_presso.pdf"";s:4:""name"";s:27:""MUG presso.pdf"";s:4:""size"";s:5:""2.1MB"";}
It wasn't too hard to write something that parses them and makes normal urls and links, but I was just wondering if this is something specific to the RightNow service, or if it is a tag system that is used. I tried googling for this but am getting some weird results so, thought stack overflow might have someone who has run into this one.
So, anyone know what these {s ;} tags are called and if there are any particular tools to use to read them?
Any answers appreciated!
This resembles partial PHP serialized data, as returned by the serialize() call. It looks like someone may have turned each " into "", which could prevent it from parsing properly. If it's wrapped with text like this before the {s: section, it's almost definitely PHP.
a:6:{i:1;a:10:{s:
These letters/numbers mean things like "an array with six elements follows", "a string of length 20 follows", etc.
You can use any PHP instance with unserialize() to handle the data. If those double-quotes are indeed returned by the API, you might need to replace :"" and ""; with " before parsing.
Parsing modules exist for other languages like Python. You can find more information in this answer.