Using sed, delete everything between two characters - sed

How can I delete everything (symbols, whitespace, characters, words) between two characters in a line?
My 5-line file is:
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1)" 120.94.30.12 264 556 -
"Skype for Macintosh" 120.94.30.9 1038 482 -
-129.94.30.4 217 309 -
"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1)" 120.94.30.8 1197 747 -
"¢¢HttpClient" 120.94.30.12 594 231 -
I want to delete everything between " and " (including the " characters themselves), so that the required output is:
120.94.30.12 264 556 -
120.94.30.9 1038 482 -
-120.94.30.4 217 309 -
120.94.30.8 1197 747 -
120.94.30.12 594 231 -

You mean like this?
sed 's/"[^"]*"//' file

echo '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1)" 120.94.30.12 264 556 -' |\
sed -e 's/".*"\(.*\)/\1/g'
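For comparison, the same idea can be sketched in Python with re.sub. This is only an illustration, not part of either answer: the helper name strip_quoted is made up here, and the extra \s* (which also drops the space left behind after the quoted field) is an addition the sed commands above do not have.

```python
import re

# Three of the sample lines from the question.
lines = [
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; InfoPath.1)" 120.94.30.12 264 556 -',
    '"Skype for Macintosh" 120.94.30.9 1038 482 -',
    '-129.94.30.4 217 309 -',
]

def strip_quoted(line: str) -> str:
    """Remove the first double-quoted field and the whitespace after it."""
    # "[^"]*" matches an opening quote, any run of non-quote characters,
    # and the closing quote, so it cannot run past the first closing quote.
    # count=1 mirrors sed's s/// without the g flag.
    return re.sub(r'"[^"]*"\s*', '', line, count=1)

cleaned = [strip_quoted(line) for line in lines]
for line in cleaned:
    print(line)
```

The negated class [^"] is what keeps the match from swallowing later quotes on the same line. A greedy ".*" would extend to the last quote on the line instead.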

Related

Flutter Dio - extraneous request to server

I am debugging http requests to our server and decided to try Dio dart package. After some trials (with no difference in results from standard http packages), I decided to stop using the Dio package.
I did, however, happen to notice extraneous requests from random locations (traced back to China Telecom). Considering we are only setting up the server, and the requests started showing up only after I used Dio in my Flutter app: is Dio snooping on my server?
Seen on Server
X-Forwarded-Protocol: https
X-Real-Ip: 183.136.225.35
Host: 0.0.0.0:5002
Connection: close
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE
Accept: */*
Referer: ******
Accept-Encoding: gzip
2022-10-06 15:06:06,768 [DEBUG] root:
X-Forwarded-Protocol: https
X-Real-Ip: 45.79.204.46
Host: 0.0.0.0:5002
Connection: close
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36
Accept: */*
Referer: *****
Accept-Encoding: gzip
Traceroute on IP
4 142 ms 141 ms 153 ms 116.119.68.60
5 140 ms 138 ms 140 ms be6391.rcr21.b015591-1.lon13.atlas.cogentco.com [149.14.224.161]
6 139 ms 139 ms 139 ms be2053.ccr41.lon13.atlas.cogentco.com [130.117.2.65]
7 144 ms 142 ms 142 ms 154.54.61.158
8 191 ms 190 ms 190 ms chinatelecom.demarc.cogentco.com [149.14.81.226]
9 299 ms * 299 ms 202.97.13.18
10 * 316 ms * 202.97.90.30
11 * 317 ms * 202.97.24.141
12 * * * Request timed out.
13 317 ms 308 ms 320 ms 220.191.200.166
14 334 ms 354 ms * 115.233.128.133
15 * * * Request timed out.
16 * * * Request timed out.
17 * * * Request timed out.
18 * * * Request timed out.
19 * * * Request timed out.
20 * * * Request timed out.
21 325 ms 325 ms 333 ms 183.136.225.35

Why am I getting an SSL socket timeout connecting to Keycloak?

I think this question can be rewritten to "Using Spring Boot and keycloak-spring-boot-starter, what happens after KeycloakSpringBootConfigResolver.resolve()?"
I have a custom keycloak config resolver:
public class CustomKeycloakConfigResolver
extends KeycloakSpringBootConfigResolver {
...
@Override
public KeycloakDeployment resolve(final HttpFacade.Request request) {
LOGGER.debug("-----------------------------------------------");
LOGGER.debug("Resolving Deployment for {}", request.getURI());
...
LOGGER.trace("---------- CREATING KEYCLOAK DEPLOYMENT ---------");
KeycloakDeployment keycloakDeployment =
KeycloakDeploymentBuilder.build(adapterConfig);
LOGGER.trace("---------- /CREATED KEYCLOAK DEPLOYMENT ---------");
return keycloakDeployment;
This pulls the KeycloakDeployment configuration from our database instead of from application.properties. Testing on a local docker swarm cluster, it works like a charm (but this is only on one machine, without SSL enabled, etc).
Pushing out to our QA environment (nginx on one machine with TLS termination, REST service on another, Keycloak on another, database provided by RDS), the generation of a KeycloakDeployment goes off without a hitch. This includes reaching out to .well-known/openid-configuration and resolving all of the Keycloak URLs, and printing both of the final trace() statements.
Almost immediately after (within 10-30ms) creating the KeycloakDeployment and returning it to the keycloak-spring-boot-starter framework, I receive a SocketTimeoutException. I can't tell from the exception what the system is trying to do when it throws, and I haven't been able to tell from https://github.com/keycloak/keycloak what the workflow is after the deployment is "resolved()".
So - what happens next?
1. A secured method is accessed
2. Spring Boot auth hands off to Keycloak auth
3. Keycloak auth generates a custom KeycloakDeployment
4. KeycloakDeployment is resolved - reaches out to Keycloak service and obtains the OIDC configuration
5. KeycloakDeployment is passed back to Keycloak auth framework
6. ... Something happens that throws an immediate socket timeout exception...
7. Method is never invoked
How do I figure out what's happening in step 6? I find it hard to believe it's an actual socket timeout after only 20ms, and everything I think it should be accessing is up and responsive. But I'm willing to be wrong...
---- Original Question ----
I'm trying to get Keycloak working with a Spring Boot REST service behind an nginx proxy. TLS is terminated at the nginx server. Everything is in docker containers, but on separate machines (no kube, no swarm, etc).
Everything seems good. I can login to the master realm, create a new realm, add users, etc. However when the REST service tries to contact the Keycloak server (through the nginx proxy), I'm getting an SSL timeout error:
2022-03-22 18:59:13.384 INFO 26 --- [nio-8080-exec-3] o.keycloak.adapters.KeycloakDeployment : Loaded URLs from https://hostname/auth/realms/myrealm/.well-known/openid-configuration
javax.net.ssl|WARNING|18|http-nio-8080-exec-3|2022-03-22 18:59:13.409 UTC|SSLSocketImpl.java:1672|handling exception (
"throwable" : {
java.net.SocketTimeoutException: Read timed out
at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
...
Thing is, this "timeout" happens almost instantaneously after the .well-known/openid-configuration response, so I'm skeptical it's even actually a timeout, unless the threshold is set to like 10ms by default or something.
I cranked up javax.net logging with System.setProperty("javax.net.debug", "ssl:all");, and I can't see anything obvious:
2022-03-22 18:59:13.384 INFO 26 --- [nio-8080-exec-3] o.keycloak.adapters.KeycloakDeployment : Loaded URLs from https://hostname/auth/realms/myrealm/.well-known/openid-configuration
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.385 UTC|SSLSocketOutputRecord.java:331|WRITE: TLSv1.2 application_data, length = 11
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.385 UTC|SSLCipher.java:1770|Plaintext before ENCRYPTION (
0000: 07 00 00 00 03 63 6F 6D 6D 69 74 .....commit
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketOutputRecord.java:346|Raw write (
0000: 17 03 03 00 23 00 00 00 00 00 00 00 80 9C 76 46 ....#.........vF
0010: 44 FA F9 3A A4 9B A1 B2 D8 9B 6A 69 76 C7 1A 3D D..:......jiv..=
0020: 94 C4 40 D2 D8 F2 E4 7E ..@.....
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketInputRecord.java:488|Raw read (
0000: 17 03 03 00 23 ....#
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketInputRecord.java:214|READ: TLSv1.2 application_data, length = 35
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketInputRecord.java:488|Raw read (
0000: 15 13 12 41 88 AD 18 6F B8 5E 25 90 9D BA 23 BF ...A...o.^%...#.
0010: B3 A5 A9 5E 61 FA 77 BD AE A4 C0 57 B2 1D 5B 18 ...^a.w....W..[.
0020: E8 C7 77 ..w
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketInputRecord.java:247|READ: TLSv1.2 application_data, length = 35
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLCipher.java:1672|Plaintext after DECRYPTION (
0000: 07 00 00 01 00 00 00 00 00 00 00 ...........
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.388 UTC|SSLSocketOutputRecord.java:331|WRITE: TLSv1.2 application_data, length = 21
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.389 UTC|SSLCipher.java:1770|Plaintext before ENCRYPTION (
0000: 11 00 00 00 03 53 45 54 20 61 75 74 6F 63 6F 6D .....SET autocom
0010: 6D 69 74 3D 31 mit=1
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.390 UTC|SSLSocketOutputRecord.java:346|Raw write (
0000: 17 03 03 00 2D 00 00 00 00 00 00 00 81 6B 87 27 ....-........k.'
0010: 9C 91 53 E2 F8 70 1C D4 FA F3 4A 79 1B B0 11 05 ..S..p....Jy....
0020: 13 3E 4F 10 A8 E8 43 B3 BB FA 1E 48 82 DF 59 25 .>O...C....H..Y%
0030: CF 9D ..
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.392 UTC|SSLSocketInputRecord.java:488|Raw read (
0000: 17 03 03 00 23 ....#
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.392 UTC|SSLSocketInputRecord.java:214|READ: TLSv1.2 application_data, length = 35
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.392 UTC|SSLSocketInputRecord.java:488|Raw read (
0000: 15 13 12 41 88 AD 18 70 E3 20 7E 21 DA B0 24 28 ...A...p. .!..$(
0010: EF 6D EB BC 5C CE 5D 94 1D BC 04 BB F9 D1 3D 72 .m..\.].......=r
0020: 0C 71 83 .q.
)
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.392 UTC|SSLSocketInputRecord.java:247|READ: TLSv1.2 application_data, length = 35
javax.net.ssl|DEBUG|18|http-nio-8080-exec-3|2022-03-22 18:59:13.393 UTC|SSLCipher.java:1672|Plaintext after DECRYPTION (
0000: 07 00 00 01 00 00 00 02 00 00 00 ...........
)
javax.net.ssl|WARNING|18|http-nio-8080-exec-3|2022-03-22 18:59:13.409 UTC|SSLSocketImpl.java:1672|handling exception (
"throwable" : {
java.net.SocketTimeoutException: Read timed out
at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
...
So between the well-known configuration call and the final timeout exception:
18:59:13.384
18:59:13.409
all of 25ms passes by - seems hard to believe there would be a legitimate timeout exception thrown, but I can't seem to get any more clue as to what is causing the timeout. The keycloak service definitely IS reachable, and responds quite quickly.
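One detail in the decrypted records above may be worth checking: the plaintext reads "commit" and "SET autocommit=1", which looks like SQL rather than HTTP. The sketch below decodes the two outgoing records under an assumption that the logs themselves do not state, namely that the payloads use MySQL client/server packet framing (a 3-byte little-endian length, a 1-byte sequence number, then the payload). The function name parse_packet is made up for this illustration.

```python
# Plaintext bytes copied from the "Plaintext before ENCRYPTION" records above.
COMMIT_HEX = "07 00 00 00 03 63 6F 6D 6D 69 74"
AUTOCOMMIT_HEX = (
    "11 00 00 00 03 53 45 54 20 61 75 74 6F 63 6F 6D "
    "6D 69 74 3D 31"
)

def parse_packet(hex_dump: str):
    """Split a packet into (length, sequence, payload).

    Assumes MySQL-style framing: 3-byte little-endian payload length,
    1-byte sequence number, then the payload itself.
    """
    raw = bytes.fromhex(hex_dump)
    length = int.from_bytes(raw[0:3], "little")
    sequence = raw[3]
    payload = raw[4:4 + length]
    return length, sequence, payload

for dump in (COMMIT_HEX, AUTOCOMMIT_HEX):
    length, sequence, payload = parse_packet(dump)
    # Under the same assumption, the first payload byte is a command code
    # and the remainder is the statement text.
    print(length, sequence, payload[0], payload[1:].decode("ascii"))
```

If that framing assumption holds, the socket being read when the timeout fires may be a database connection rather than the connection to Keycloak, so the database client's timeout settings could be worth examining as well.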
Nothing at all in the Keycloak logs, and not terribly much in the nginx logs:
[--- AUTH ---] [22/Mar/2022:19:19:56 +0000] [200] "POST /auth/realms/myrealm/protocol/openid-connect/token HTTP/1.1" "https://web.mycompany.com/"
[--- AUTH ---] [22/Mar/2022:19:19:57 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:57 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
2022/03/22 19:19:57 [info] 73#73: *659 client #.#.#.# closed keepalive connection
[--- AUTH ---] [22/Mar/2022:19:19:57 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:57 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
2022/03/22 19:19:58 [info] 73#73: *670 client #.#.#.# closed keepalive connection
2022/03/22 19:19:58 [info] 73#73: *668 client #.#.#.# closed keepalive connection
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/protocol/openid-connect/certs HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:58 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:59 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:59 +0000] [200] "GET /auth/realms/myrealm/protocol/openid-connect/certs HTTP/1.1" "-"
2022/03/22 19:19:59 [info] 73#73: *677 client #.#.#.# closed keepalive connection
2022/03/22 19:19:59 [info] 73#73: *675 client #.#.#.# closed keepalive connection
[--- AUTH ---] [22/Mar/2022:19:19:59 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:59 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:19:59 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:20:00 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
2022/03/22 19:20:00 [info] 73#73: *664 client #.#.#.# closed keepalive connection
2022/03/22 19:20:00 [info] 73#73: *666 client #.#.#.# closed keepalive connection
2022/03/22 19:20:00 [info] 73#73: *672 client #.#.#.# closed keepalive connection
2022/03/22 19:20:00 [info] 73#73: *682 client #.#.#.# closed keepalive connection
2022/03/22 19:20:00 [info] 73#73: *684 client #.#.#.# closed keepalive connection
2022/03/22 19:20:00 [info] 74#74: *686 client #.#.#.# closed keepalive connection
[--- AUTH ---] [22/Mar/2022:19:20:00 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:20:00 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:20:01 +0000] [200] "GET /auth/realms/myrealm/.well-known/openid-configuration HTTP/1.1" "-"
[--- AUTH ---] [22/Mar/2022:19:20:01 +0000] [200] "GET /auth/realms/myrealm/protocol/openid-connect/certs HTTP/1.1" "-"
[--- REST ---] [22/Mar/2022:19:20:01 +0000] [401] "GET /api/v1/stuff HTTP/1.1" "https://web.mycompany.com/"
Any help as to what's going on?
Other notes:
I've gone through the Setting Up a load balancer or proxy instructions from Keycloak. Specifically, the nginx server has the following:
upstream AUTH {
server #.#.#.#:8080;
server #.#.#.#:8080;
}
...
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Host $http_host;
proxy_pass http://AUTH;
I think that covers bullet points 1 and 2.
Bullet point 3 is covered by the keycloak server. The standalone-ha.xml file has:
<subsystem xmlns="urn:jboss:domain:undertow:12.0" default-server="default-server" default-virtual-host="default-host" default-servlet-container="default" default-security-domain="other" statistics-enabled="true">
<buffer-cache name="default"/>
<server name="default-server">
<ajp-listener name="ajp" socket-binding="ajp"/>
<http-listener name="default" socket-binding="http" redirect-socket="https" proxy-address-forwarding="${env.PROXY_ADDRESS_FORWARDING:false}" enable-http2="true"/>
<https-listener name="https" socket-binding="https" ssl-context="applicationSSC" proxy-address-forwarding="${env.PROXY_ADDRESS_FORWARDING:false}" enable-http2="true"/>
<host name="default-host" alias="localhost">
<location name="/" handler="welcome-content"/>
<http-invoker http-authentication-factory="application-http-authentication"/>
</host>
</server>
and the container is launched with the environment variable
PROXY_ADDRESS_FORWARDING=true
I've verified with a false login that Keycloak is seeing the end user's IP address rather than the nginx server:
15:17:07,304 WARN [org.keycloak.events] (default task-14) type=LOGIN_ERROR, realmId=master, clientId=security-admin-console, userId=null, ipAddress=#.#.#.#, error=user_not_found, auth_method=openid-connect, auth_type=code, redirect_uri=https://host.name/auth/admin/master/console/#/realms/MyRealm/login-settings, code_id=..., username=foo, authSessionParentId=..., authSessionTabId=...
Where #.#.#.# is my IP address.

Py4JJavaError: An error occurred while calling o840.showString

I am trying to parse a log file with millions of records. It contains host name, timestamp, status code, etc. After successfully parsing the host, status code, and URL, I get an error when I try to parse the timestamp. The following is my code:
from pyspark.sql import Row
from pyspark.sql.functions import col, regexp_extract
lines=sc.textFile(filepath)
df_log= lines.map(lambda x: Row(header=x)).toDF()
timestamp_pattern= r'\[\d{2}\/\w{3}\/\d{4}\:\d{2}\:\d{2}\:\d{2}\s\S+\d{4}]'
df2=df_log.select(regexp_extract(col('header'),timestamp_pattern,1).alias("timestamp"))
Everything works fine up to here. After this, when I call df2.show(10), I get the following error:
Py4JJavaErrorTraceback (most recent call last) <ipython-input-112-5a86d13b2926> in <module>()
----> 1 df2.show(1)
/opt/cloudera/parcels/SPARK2/lib/spark2/python/pyspark/sql/dataframe.pyc in show(self, n, truncate)
316 """
317 if isinstance(truncate, bool) and truncate:
--> 318 print(self._jdf.showString(n, 20))
319 else:
320 print(self._jdf.showString(n, int(truncate)))
/opt/cloudera/parcels/SPARK2/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
1131 answer = self.gateway_client.send_command(command)
1132 return_value = get_return_value(
-> 1133 answer, self.gateway_client, self.target_id, self.name)
1134
1135 for temp_arg in temp_args:
/opt/cloudera/parcels/SPARK2/lib/spark2/python/pyspark/sql/utils.pyc in deco(*a, **kw)
61 def deco(*a, **kw):
62 try:
---> 63 return f(*a, **kw)
64 except py4j.protocol.Py4JJavaError as e:
65 s = e.java_exception.toString()
/opt/cloudera/parcels/SPARK2/lib/spark2/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
317 raise Py4JJavaError(
318 "An error occurred while calling {0}{1}{2}.\n".
--> 319 format(target_id, ".", name), value)
320 else:
321 raise Py4JError(
Py4JJavaError: An error occurred while calling o965.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, ip-20-0-31-210.ec2.internal, executor 2): java.lang.IndexOutOfBoundsException: No group 1 at java.util.regex.Matcher.group(Matcher.java:538) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231) at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)
Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1430) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1417) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1417) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:797) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:797) at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:797) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1645) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1600) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1589) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:623) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1930) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1943) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1956) at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:333) at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38) at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2378) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57) at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2772) at 
org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2377) at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2384) at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2120) at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2119) at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2802) at org.apache.spark.sql.Dataset.head(Dataset.scala:2119) at org.apache.spark.sql.Dataset.take(Dataset.scala:2334) at org.apache.spark.sql.Dataset.showString(Dataset.scala:248) at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) at py4j.Gateway.invoke(Gateway.java:280) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:214) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.IndexOutOfBoundsException: No group 1 at java.util.regex.Matcher.group(Matcher.java:538) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377) at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:231) at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:225) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:826) at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:99) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ... 1 more
Please help me with this.
The following is an excerpt of the records in the log file, for reference:
109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] POST /administrator/index.php HTTP/1.1 200 4494 http://almhuette-raith.at/administrator/ Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
46.72.177.4 - - [12/Dec/2015:18:31:08 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
46.72.177.4 - - [12/Dec/2015:18:31:08 +0100] POST /administrator/index.php HTTP/1.1 200 4494 http://almhuette-raith.at/administrator/ Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
83.167.113.100 - - [12/Dec/2015:18:31:25 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
83.167.113.100 - - [12/Dec/2015:18:31:25 +0100] POST /administrator/index.php HTTP/1.1 200 4494 http://almhuette-raith.at/administrator/ Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
95.29.198.15 - - [12/Dec/2015:18:32:10 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
95.29.198.15 - - [12/Dec/2015:18:32:11 +0100] POST /administrator/index.php HTTP/1.1 200 4494 http://almhuette-raith.at/administrator/ Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
109.184.11.34 - - [12/Dec/2015:18:32:56 +0100] GET /administrator/ HTTP/1.1 200 4263 - Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
109.184.11.34 - - [12/Dec/2015:18:32:56 +0100] POST /administrator/index.php HTTP/1.1 200 4494 http://almhuette-raith.at/administrator/ Mozilla/5.0 (Windows NT 6.0; rv:34.0) Gecko/20100101 Firefox/34.0 -
Change the last line to:
timestamp_pattern= r'\[\d{2}\/\w{3}\/\d{4}\:\d{2}\:\d{2}\:\d{2}\s\S+\d{4}]'
df2=df_log.select(regexp_extract(col('header'),timestamp_pattern,0).alias("timestamp"))
Note: I changed the group index from 1 to 0, since your timestamp_pattern defines no capturing group; index 0 returns the whole match.
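The same behaviour can be reproduced without Spark using Python's re module. This is a minimal sketch: the sample line is taken from the log excerpt in the question (with the user-agent part trimmed), and the second pattern, named grouped_pattern here, is a hypothetical alternative that adds parentheses so that group 1 exists.

```python
import re

# First record from the log excerpt, shortened after the response size.
line = "109.169.248.247 - - [12/Dec/2015:18:25:11 +0100] GET /administrator/ HTTP/1.1 200 4263 -"

# The pattern from the question: it contains no capturing group.
timestamp_pattern = r'\[\d{2}\/\w{3}\/\d{4}\:\d{2}\:\d{2}\:\d{2}\s\S+\d{4}]'
match = re.search(timestamp_pattern, line)

# Group 0 is always the whole match, including the square brackets.
whole_match = match.group(0)

# Group 1 does not exist, so asking for it fails.
try:
    match.group(1)
    group_one_error = None
except IndexError as exc:
    group_one_error = str(exc)

# Alternative: wrap the part you want in parentheses and keep index 1.
grouped_pattern = r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\s\S+\d{4})]'
inner = re.search(grouped_pattern, line).group(1)

print(whole_match)
print(group_one_error)
print(inner)
```

Which variant to prefer depends on whether the surrounding brackets should be part of the extracted value: index 0 keeps them, while a capturing group around the inner part drops them.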

RMarkdown: Creating two side-by-side heatmaps with full figure borders using the pheatmap package

I am writing my first report in RMarkdown and struggling with specific figure alignments.
I have some data that I am manipulating into a format friendly for the package pheatmap such that it produces heatmap HTML output. The code that produces one of these looks like:
cleaned_mayo<- cleaned_mayo[which(cleaned_mayo$Source=="MayoBrainBank_Dickson"),]
# Segregate data
ad<- cleaned_mayo[which(cleaned_mayo$Diagnosis== "AD"),-c(1:13)]
control<- cleaned_mayo[which(cleaned_mayo$Diagnosis== "Control"),-c(1:13)]
# Average data across patients and assign diagnoses
ad<- as.data.frame(t(apply(ad,2, mean)))
control<- as.data.frame(t(apply(control,2, mean)))
ad$Diagnosis<- "AD"
control$Diagnosis<- "Control"
# Combine
avg_heat<- rbind(ad, control)
# Rearrange columns
avg_heat<- avg_heat[,c(32, 1:31)]
# Mean shift all expression values
avg_heat[,2:32]<- apply(avg_heat[,2:32], 2, function(x){x-mean(x)})
#################################
# CREATE HEAT MAP
#################################
# Plot average heat map
pheatmap(t(avg_heat[,2:32]), cluster_col= F, labels_col= c("AD", "Control"),gaps_col = c(1), labels_row = colnames(avg_heat)[2:32],
main= "Mayo Differential Expression for Genes of Interest: Averaged Across \n Patients within a Diagnosis",
show_colnames = T)
Where the numeric columns of cleaned_mayo look like:
C1QA C1QC C1QB LAPTM5 CTSS FCER1G PLEK CSF1R CD74 LY86 AIF1 FGD2 TREM2 PTK2B LYN UNC93B1 CTSC NCKAP1L TMEM119 ALOX5AP LCP1
1924_TCX 1101 1392 1687 1380 380 279 198 1889 6286 127 252 771 338 5795 409 494 337 352 476 170 441
1926_TCX 881 770 950 1064 239 130 132 1241 3188 76 137 434 212 5634 327 419 292 217 464 124 373
1935_TCX 3636 4106 5196 5206 1226 583 476 5588 27650 384 1139 1086 756 14219 1269 869 868 1378 1270 428 1216
1925_TCX 3050 4392 5357 3585 788 472 350 4662 11811 340 865 1051 468 13446 638 420 1047 850 756 616 1008
1963_TCX 3169 2874 4182 2737 828 551 208 2560 10103 204 719 585 499 9158 546 335 598 593 606 418 707
7098_TCX 1354 1803 2369 2134 634 354 245 1829 8322 227 593 371 411 10637 504 294 750 458 367 490 779
ITGAM LPCAT2 LGALS9 GRN MAN2B1 TYROBP CD37 LAIR1 CTSZ CYTH4
1924_TCX 376 649 699 1605 618 392 328 628 1774 484
1926_TCX 225 381 473 1444 597 242 290 321 1110 303
1935_TCX 737 1887 998 2563 856 949 713 1060 2670 569
1925_TCX 634 1323 575 1661 594 562 421 1197 1796 595
1963_TCX 508 696 429 1030 355 556 365 585 1591 360
7098_TCX 418 1011 318 1574 354 353 179 471 1471 321
All of this code sits inside an RMarkdown chunk with the following header: {r heatmaps, echo=FALSE, results="asis", message=FALSE}.
What I would like to achieve is the two heatmaps side-by-side with black boxes around each individual heat map (i.e. containing the title and legend of the heatmap as well).
If anyone could tell me how to do this, or either one individually it would be greatly appreciated.
Thanks!

changing /client/BigBlueButton.html portion getting:404 Not Found nginx/1.4.6 (Ubuntu

I am trying to change the /client/BigBlueButton.html portion of the URL according to http://docs.bigbluebutton.org/support/faq.html#how-do-i-change-the-client-bigbluebutton-html-portion-of-the-url,
but I am getting:
404 Not Found
nginx/1.4.6 (Ubuntu)
my /etc/bigbluebutton/nginx/client.nginx:
#location /client/BigBlueButton.html {
# root /home/firstuser/dev/bigbluebutton/bigbluebutton-client;
# index index.html index.htm;
# expires 1m;
#}
# BigBlueButton Flash client.
location /client {
root /home/firstuser/dev/bigbluebutton/bigbluebutton-client;
index index.html index.htm;
}
my /etc/bigbluebutton/nginx/rewrite.nginx:
location /client/BigBlueButton.html {
rewrite ^ /conference permanent;
}
location /conference {
alias /var/www/bigbluebutton/client;
index BigBlueButton.html;
expires 1m;
}
$ sudo /etc/init.d/nginx restart
/var/log/nginx/bigbluebutton.access.log
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET / HTTP/1.1" 200 2852 "-" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /css/bijou.min.css HTTP/1.1" 200 2753 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /css/style.css HTTP/1.1" 200 2918 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /css/font-awesome.min.css HTTP/1.1" 200 20766 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /images/jimtalk-logo.png HTTP/1.1" 200 10251 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /images/bbb-setup-audio.jpg HTTP/1.1" 200 18876 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /images/bbb-viewer-overview.jpg HTTP/1.1" 200 21929 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:57 -0400] "GET /images/bbb-presenter-overview.jpg HTTP/1.1" 200 18309 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:37:58 -0400] "GET /fonts/fontawesome-webfont.woff?v=4.1.0 HTTP/1.1" 200 83760 "http://M_IP_Ad/css/font-awesome.min.css" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
M_IP_Ad - - [16/Aug/2015:12:38:00 -0400] "POST /bigbluebutton/api/create?name=Demo+Meeting&meetingID=Demo+Meeting&voiceBridge=72274&attendeePW=ap&moderatorPW=mp&record=false&checksum=a93ab8433532c633ab2467afc0d91e0eb1dc4e88 HTTP/1.1" 200 488 "-" "Java/1.7.0_79"
33.126.263.65 - - [16/Aug/2015:12:38:00 -0400] "GET /demo/demo1.jsp?username=%D7%99%D7%A2%D7%99%D7%A2%D7%99&action=create HTTP/1.1" 200 1003 "http://M_IP_Ad/" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:38:01 -0400] "GET /bigbluebutton/api/join?meetingID=Demo+Meeting&fullName=%D7%99%D7%A2%D7%99%D7%A2%D7%99&password=mp&checksum=3438ed39a50723be59798038f86fcba0af30b325 HTTP/1.1" 302 0 "http://M_IP_Ad/demo/demo1.jsp?username=%D7%99%D7%A2%D7%99%D7%A2%D7%99&action=create" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:38:02 -0400] "GET /client/BigBlueButton.html HTTP/1.1" 301 193 "http://M_IP_Ad/demo/demo1.jsp?username=%D7%99%D7%A2%D7%99%D7%A2%D7%99&action=create" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:38:02 -0400] "GET /conference HTTP/1.1" 301 193 "http://M_IP_Ad/demo/demo1.jsp?username=%D7%99%D7%A2%D7%99%D7%A2%D7%99&action=create" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
33.126.263.65 - - [16/Aug/2015:12:38:02 -0400] "GET /conference/ HTTP/1.1" 403 208 "http://M_IP_Ad/demo/demo1.jsp?username=%D7%99%D7%A2%D7%99%D7%A2%D7%99&action=create" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36"
Thank you
nginx reads all the files with the .nginx extension (*.nginx). I guess the problem is that nginx cannot handle the same location being defined more than once.