Python Screen Scraper works within Eclipse, but not from command line

I'm writing a simple screen-scraping script using Python 2.7 with Eclipse PyDev. When I run or debug it from within Eclipse, everything works fine. However, when I run the program from the command line, the server always returns a Response 500 error code. I've tried running both the script and the compiled version from the command line, but I get the same result: Response 500. I've also tried some arbitrary things such as adding a delay and retrying the request, but I do not know what Eclipse does differently from Python run from the command line.
First, where's a good place to start digging if I encounter something like this again?
Second, any ideas on how to get this working from the command line?
Code snippet below for reference
from requests import Request, Session

# loginPage, targetPage, logOutPage, username and password are defined elsewhere
url = loginPage  # must be assigned before it is used in the Referer header
content_type = 'application/x-www-form-urlencoded'
headers2 = {"User-Agent": 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
            "Content-Type": content_type,
            "Referer": url
            }
payload = {"email": username, "password": password}
req = Request('POST', url, data=payload, headers=headers2)
prepped = req.prepare()
s = Session()
resp = s.send(prepped)
print resp  # Response 200 (good) from both within Eclipse and from cmd
resp = s.get(targetPage)
print resp  # Response 200 (good) from Eclipse, Response 500 (generic web error) from cmd
s.get(logOutPage)
s.close()

Got an answer from somebody. Thanks go to user Justinsaccount from Reddit.
First, I was using batch files to save typing, not typing the command directly at the command line.
Second, when I printed the parameters from inside the program and compared the Eclipse run with the .bat run, the .bat version was a few characters short, which was the giveaway.
One of the parameters was a URL that contained a space character: http://somewhere.com/some page
Percent-encoded, this becomes: http://somewhere.com/some%20page
When typed directly at the command line, http://somewhere.com/some%20page works just fine. In a batch file, however, % is a special character: %2 is read as the batch file's second argument, which was empty here, so what the script received was: http://somewhere.com/some0page
That is why the server threw an error: that page didn't exist. What I needed to do was escape the % character by doubling it: http://somewhere.com/some%%20page. After that change things worked just fine.
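One way to sidestep the problem described above is to keep the literal % out of the batch file entirely and let the script do the percent-encoding. The following is only a minimal sketch, not the original poster's code: the helper names encode_url and escape_for_batch are hypothetical, and the URL is the placeholder from the answer.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2.7


def encode_url(raw_url):
    """Percent-encode unsafe characters such as spaces, keeping ':' and '/' as-is."""
    return quote(raw_url, safe="/:")


def escape_for_batch(text):
    """Double every % so a .bat file passes it through literally."""
    return text.replace("%", "%%")


encoded = encode_url("http://somewhere.com/some page")
print(encoded)                    # URL as the server expects it
print(escape_for_batch(encoded))  # the same URL as it must appear inside a .bat file
```

With this approach the batch file can pass the unencoded URL in double quotes, and the script encodes it just before making the request.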

Related

SolrCloud Configset API upload returns 500 "KeeperErrorCode = NoNode"

Situation
First of all, I should mention that I'm using Solr 8.1.1 and running the default "solr -e cloud" example to do some testing, on a Windows Azure VM. I'm trying to create a PowerShell script that will do some setup on the SolrCloud. The first step is uploading a custom configset. I used https://lucene.apache.org/solr/guide/8_1/configsets-api.html as a guide, and the PowerShell command, stripped of its parameters, boils down to the following:
Invoke-WebRequest -Uri "http://localhost:8983/solr/admin/configs?action=UPLOAD&name=MyConfig" -Method Post -ContentType "application/octet-stream" -InFile "config.zip"
EDIT: For clarity, the contents of the ZIP are as follows: https://imgur.com/a/OHR1bf1
Problem
When I run the above command, however, I'm met with the following error:
Invoke-WebRequest : { "responseHeader":{ "status":500, "QTime":11}, "error":{ "msg":"KeeperErrorCode = NoNode for /configs/MyConfig/lang/contractions_ca.txt", "trace":"org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /configs/MyConfig/lang/contractions_ca.txt\r\n\tat org.apache.zookeeper.KeeperException.create(KeeperException.java:114)\r\n\tat
org.apache.zookeeper.KeeperException.create(KeeperException.java:54)\r\n\tat org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:792)\r\n\tat
org.apache.solr.common.cloud.SolrZkClient.lambda$create$7(SolrZkClient.java:415)\r\n\tat org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:71)\r\n\tat
org.apache.solr.common.cloud.SolrZkClient.create(SolrZkClient.java:415)\r\n\tat org.apache.solr.handler.admin.ConfigSetsHandler.createZkNodeIfNotExistsAndSetData(ConfigSetsHandler.java:201)\r\n\tat
org.apache.solr.handler.admin.ConfigSetsHandler.handleConfigUploadRequest(ConfigSetsHandler.java:181)\r\n\tat org.apache.solr.handler.admin.ConfigSetsHandler.handleRequestBody(ConfigSetsHandler.java:111)\r\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)\r\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:796)\r\n\tat
org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:762)\r\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:522)\r\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:397)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\r\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\r\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\r\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\r\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\r\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)\r\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\r\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)\r\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\r\n\tat
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\r\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:502)\r\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)\r\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)\r\n\tat
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)\r\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\r\n\tat
org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\r\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\r\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\r\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\r\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)\r\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n",
"code":500}}
At line:1 char:1
Observations
When I first "failed", I had created a zip file from my config that contained an additional top-level folder (i.e. instead of MyConfig/solrconfig.xml etc. my ZIP contained MyConfig/MyConfig/solrconfig.xml). With that ZIP the upload command ran successfully, but the second command (creating a collection) failed because it could not find solrconfig.xml. This tells me that the ZIP is correctly present in the request and that Solr is capable of processing it, yet once I correct it to an actual configset it fails completely?
EDIT: I was asked about this and whether using "conf" in the zip would work. As I mentioned above, this results in a successful upload (https://imgur.com/a/JHLZ8td). However, as you can see, it does not match the other config sets, and when you try to create a collection with this set you will get Error CREATEing SolrCore 'Test_shard1_replica_n1': Unable to create core [Test_shard1_replica_n1] Caused by: Can't find resource 'solrconfig.xml' in classpath or '/configs/Sitecore', cwd=C:\solr-8.1.1\server
Question(s)
What am I doing wrong? Is this a bug?
Going back through some work I did on SolrCloud a while ago, I am reminded of one annoying issue I hit:
I got odd issues uploading the schema config zip files if I had created that zip using "Send to Compressed Folder" in the Windows UI, or via Compress-Archive in PowerShell. I found that compressing the data with 7Zip did work, however.
I suspect there's something incompatible between the Windows zip code (which I think is quite old, and something they licensed ages ago?) and how Solr/ZooKeeper deals with extracting the files again?
I just ran into the same issue without using Windows zip code. I was trying to upload a configset to Solr 7.7.3 from a conf directory containing a "lang" subdirectory with a bunch of files. I got the NoNode error for /configs/_myconfigsetname_/lang/stopwords_eu.txt. The configset was being zipped on the fly through a recursive directory walk in Java, sending each filename to the Zip file using Java's ZipOutputStream. The resulting zipped bytes were then sent to Solr/Zookeeper.
This code worked fine for conf directories without subdirectories. It turned out that when there is a subdirectory, it was necessary to create a ZipEntry for the directory (e.g. lang/) before adding files to the Zip stream such as lang/stopwords_eu.txt.
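The fix described above (writing an entry for the directory itself before the files inside it) can be sketched with Python's zipfile module in place of Java's ZipOutputStream. This is an illustration under assumptions, not the answerer's code: zip_configset is a hypothetical helper, and the demo directory with placeholder files exists only to show the resulting entry order.

```python
import io
import os
import tempfile
import zipfile


def zip_configset(conf_dir):
    """Return ZIP bytes for conf_dir, with an explicit entry for every subdirectory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(conf_dir):
            dirs.sort()  # deterministic traversal order
            rel = os.path.relpath(root, conf_dir)
            if rel != ".":
                # A name ending in "/" is stored as a directory entry;
                # it is written before any file inside that directory.
                zf.writestr(rel.replace(os.sep, "/") + "/", b"")
            for name in sorted(files):
                full = os.path.join(root, name)
                arcname = os.path.relpath(full, conf_dir).replace(os.sep, "/")
                zf.write(full, arcname)
    return buf.getvalue()


# Demonstration on a throwaway directory with placeholder files.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "lang"))
for rel_path in ("solrconfig.xml", os.path.join("lang", "stopwords_eu.txt")):
    with open(os.path.join(demo_dir, rel_path), "w") as fh:
        fh.write("placeholder\n")

names = zipfile.ZipFile(io.BytesIO(zip_configset(demo_dir))).namelist()
print(names)
```

The entry names are relative to the conf directory, so the archive has no extra top-level folder. Whether a given Solr version accepts the result is not verified here.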

Print console errors (with response code 4xx or 5xx) in logger file in protractor automation

I want to log only the console errors with a 4xx or 5xx response code to a logger file while my Protractor automation script runs. Currently I use the following code in my afterEach; it prints everything from the console.
browser.manage().logs().get('browser').then(function(browserLog) {
    console.log('log: ' + require('util').inspect(browserLog));
});
The Protractor tests are executed on Node, so you can use Node's file system ('fs' module) commands, for example the appendFile method (see "How to append to a file in Node?").

IPython Failing to Run Code

I was trying out this Python code from a training website in IPython:
from bs4 import BeautifulSoup
import requests
url = raw_input("www.google.com")
r = requests.get("http://" +url)
data = r.text
soup = BeautifulSoup(data)
for link in soup.find_all('a'):
    print(link.get('href'))
and found that it ran fine on the first try, but it has not run since. I've tried restarting the kernel, opening a new notebook, and generally returning the settings to how they were when I first ran the program, with no luck. Why might IPython be failing to run the code and giving no response at all (as though I haven't clicked anything)?
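It seems that raw_input is not supported by IPython, so the cell is probably just hanging there waiting for input. Note also that the string passed to raw_input is only the prompt text, not a default value. If you change: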
url = raw_input("www.google.com")
to
url = "www.google.com"
it should work.
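If a prompt is still wanted when the script runs somewhere that supports interactive input, a default can be supplied explicitly rather than through the prompt string. This is a minimal sketch: ask_url and its read parameter are hypothetical names introduced here so that the input function can be swapped out, and "www.google.com" is the host from the question.

```python
def ask_url(default="www.google.com", read=None):
    """Prompt for a host name and fall back to `default` on empty input."""
    if read is None:
        try:
            read = raw_input  # Python 2.7
        except NameError:
            read = input  # Python 3
    answer = read("Host to fetch [%s]: " % default).strip()
    return "http://" + (answer or default)


# Non-interactive use: inject a stand-in for the input function.
print(ask_url(read=lambda prompt: ""))
```

Calling ask_url() with no arguments prompts interactively, so in an environment that cannot read input it would wait in the same way as the original code.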

SoapUI print full URL path of rest request using groovy

I'm using SoapUI with a Groovy script test step.
I want to print the full URL of my REST request.
I tried using:
myFile.append( testRunner.testCase.testSteps["My Test Name"].getProperty( "requestUri" ));
and I got null.
You will not be able to see the request info from a test step groovy script. However, the groovy script assertion has access to that information.
You can use this to easily retrieve the full endpoint:
def endpoint = messageExchange.getEndpoint()
The code below works fine for me; you can use the same code and only need to change the step name.
Note: Make sure that test step has already run before the code below executes; otherwise you will get the error
[Cannot invoke method getURL() on null object], see error log for details.
Working Code:
def tr = testRunner.testCase.getTestStepByName("TriggerRequestTransactionsReportsService_V")
String endPointUrlSave = tr.getHttpRequest().getResponse().getURL()
log.info "Your EndpointUrl is : " + endPointUrlSave

Simple JavaScript HTTP request snippet does not work

The code is modified from some tutorials as below:
var xmlhttp = new XMLHttpRequest();
var url = "get_data.pl";
xmlhttp.open("POST", url, true);
xmlhttp.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
xmlhttp.onreadystatechange = function() {
    // readyState 4 means the request has completed
    if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
        alert(xmlhttp.responseText);
    }
};
xmlhttp.send(null);
I want the get_data.pl script to be executed and to return its output (it prints the string "test"), but instead xmlhttp.responseText contains the full source of get_data.pl. What should I do?
get_data.pl is as follows:
#!C:/Perl/bin/perl.exe
use strict;
use warnings;
&main;
sub main()
{
    print "test"; # I'd like the script to be executed by the Perl interpreter and return the string "test".
}
Yes, Gar, you're right, thanks. I'm using Apache and I modified the handler line in httpd.conf (uncommented it and added .pl) as you said. That issue now seems resolved, but I get another error:
POST http://localhost/get_data.pl 403 (Forbidden)
I put get_data.pl in the htdocs folder, and its security settings (OS: Win7) already grant execute permission. So why is it forbidden? Could you help me again?
Yes, I've run the .pl from the command line without error.
Normally the .pl would be put in the cgi-bin folder, which is a sibling of "htdocs".
When I put the .pl in /cgi-bin and changed the url to "../cgi-bin/get_data.pl", I got an error 500, which I guess means the server didn't find the file. Is there any other configuration I missed in httpd.conf? Anyway, I moved it to the htdocs folder to avoid the error 500...
Please ensure that you have added a handler for .pl files in your HTTP server configuration.
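As a rough illustration of what such a handler setup can look like in Apache's httpd.conf, the fragment below is a sketch under assumptions rather than a verified fix for this setup: it assumes Apache 2.4 with mod_cgi loaded, and the directory path is hypothetical. A 403 for a script outside cgi-bin is commonly a sign that CGI execution has not been enabled for that directory.

```apache
# Treat files ending in .pl as CGI scripts
AddHandler cgi-script .pl

# Hypothetical path: replace with the real htdocs directory
<Directory "C:/Apache24/htdocs">
    # Allow CGI execution in this directory
    Options +ExecCGI
    Require all granted
</Directory>
```

Once the script is executed, Apache also expects it to print a header such as Content-type: text/plain followed by a blank line before any body text. A script that omits the header typically produces a 500 error.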