Apache Spark in Azure Synapse Analytics - HTTP request in notebook - pyspark

I use a Notebook in Synapse where I run my Python code.
I would like to make an API request from this notebook to Microsoft Purview to send entities.
I added the pyapacheatlas library to Spark.
On my local computer, this code works fine in Visual Studio.
I need to sign in with Microsoft, so I created the Purview client connection using a service principal. Here is the code that I am running:
from pyapacheatlas.auth import ServicePrincipalAuthentication
from pyapacheatlas.core import PurviewClient
from pyapacheatlas.core import AtlasEntity

auth = ServicePrincipalAuthentication(
    tenant_id="...",
    client_id="...",
    client_secret="..."
)

# Create a client to connect to your service.
client = PurviewClient(
    account_name="...",
    authentication=auth
)

# Get all type definitions.
all_type_defs = client.get_all_typedefs()
After the cell runs for a long time, I get this error:
"ConnectionError: HTTPSConnectionPool(host='login.microsoftonline.com', port=443): Max retries exceeded with url: /.../oauth2/token (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f00effc5250>: Failed to establish a new connection: [Errno 110] Connection timed out'))"
It turns out that I can't make any HTTP requests from the notebook.
Please advise: is outbound HTTP simply not supported here, or is there a way to make it work?
Thank you.
Even an ordinary GET like this fails:
import json
import requests
r = requests.get("http://echo.jsontest.com/insert-key-here/insert-value-here/key/value")
df = sqlContext.createDataFrame([json.loads(line) for line in r.iter_lines()])
As a result I get a ConnectionError:
"HTTPConnectionPool(host='echo.jsontest.com', port=80): Max retries exceeded with url: /insert-key-here/insert-value-here/key/value (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f90b0111940>: Failed to establish a new connection: [Errno 110] Connection timed out'))"

Related

Total open connections reached the connection limit

I'm running Python Flask with Waitress. I'm starting the server using the following code:
from flask import Flask, render_template, request
from waitress import serve

app = Flask(__name__)
app.static_folder = 'static'

@app.route("/get")
def first_method():
    ...

@app.route("/second")
def second_method():
    ...

serve(app, host="ip_address", port=8080)
I'm calling the server from a Webpage and also from Unity. From the webpage, I'm using the following example get request in jQuery:
$.get("/get", { variable1: data1, variable2: data2 }).done(function (data) {
    ...
});
In Unity I'm using the following call:
http://ip_address/get?msg=data1?data2
Unfortunately, after some time I get the error on the server: total open connections reached the connection limit, no longer accepting new connections. This happens especially with Unity. I assume that a new channel/connection is established for each GET request.
How can this be fixed, i.e. how can channels/connections be reused?
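No answer is attached to this question here, but since the message comes from Waitress's connection accounting, a sketch of its serve() tuning knobs may help; connection_limit and channel_timeout are real parameters, while the values below are only illustrative assumptions, not a confirmed fix.

from flask import Flask
from waitress import serve

app = Flask(__name__)

# Illustrative values only: allow more simultaneously open sockets and close
# idle channels sooner, so connections abandoned by clients are reclaimed.
serve(
    app,
    host="0.0.0.0",
    port=8080,
    connection_limit=1000,  # Waitress default is 100
    channel_timeout=30,     # seconds an idle connection stays open; default is 120
)

On the client side, reusing a single keep-alive connection from Unity rather than opening a fresh one per request should also slow how quickly the limit is reached.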

Writing log to gcloud Vertex AI Endpoint using gcloud client fails with google.api_core.exceptions.MethodNotImplemented: 501

I'm trying to use the Google Cloud Logging client library to write logs to GCP. Specifically, I want the log entries attached to a managed resource, in this case a Vertex AI endpoint.
Code sample:
import json
import logging

from google.api_core.client_options import ClientOptions
import google.cloud.logging_v2 as logging_v2
# Resource and json were not imported in the original snippet; these are the likely imports.
from google.cloud.logging_v2.resource import Resource
from google.oauth2 import service_account

# settings, SA_KEY_JSON and ENRICHED_FORMATTER come from elsewhere in the codebase.
def init_module_logger(module_name: str) -> logging.Logger:
    module_logger = logging.getLogger(module_name)
    module_logger.setLevel(settings.LOG_LEVEL)
    credentials = service_account.Credentials.from_service_account_info(
        json.loads(SA_KEY_JSON)
    )
    client = logging_v2.client.Client(
        credentials=credentials,
        client_options=ClientOptions(api_endpoint="us-east1-aiplatform.googleapis.com"),
    )
    handler = client.get_default_handler(
        resource=Resource(
            type="aiplatform.googleapis.com/Endpoint",
            labels={"endpoint_id": "ENDPOINT_NUMBER_ID",
                    "location": "us-east1"},
        )
    )
    # Assume we have the formatter.
    handler.setFormatter(ENRICHED_FORMATTER)
    module_logger.addHandler(handler)
    return module_logger
logger = init_module_logger(__name__)
logger.info("This Fails with 501")
And I am getting:
google.api_core.exceptions.MethodNotImplemented: 501 The GRPC target is not implemented on the server, host: us-east1-aiplatform.googleapis.com, method: /google.logging.v2.LoggingServiceV2/WriteLogEntries. Sent all pending logs.
I thought the API needed to be enabled, but I was told it already is, and that we have the https://www.googleapis.com/auth/logging.write scope.
What could be causing the error?
As mentioned by @DazWilkin in the comments, the error occurs because the API endpoint us-east1-aiplatform.googleapis.com does not have a method called WriteLogEntries.
That endpoint is used to send requests to Vertex AI services, not to Cloud Logging. The endpoint to use is logging.googleapis.com, as shown for the entries.write method; refer to this documentation for more info.
ClientOptions() should be given logging.googleapis.com as the api_endpoint parameter. If the client_options parameter is not specified, logging.googleapis.com is used by default.
After changing the api_endpoint parameter, I was able to successfully write the log entries. The ClientOptions() is as follows:
client = logging_v2.client.Client(
    credentials=credentials,
    client_options=ClientOptions(api_endpoint="logging.googleapis.com"),
)
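Equivalently (a small sketch of my own, not from the original answer), since logging.googleapis.com is the default endpoint, client_options can simply be omitted:

# Same credentials as above; the client falls back to logging.googleapis.com.
client = logging_v2.client.Client(credentials=credentials)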

Azure Bicep Error - Error while attempting to retrieve the latest Bicep version

I'm trying to deploy using Bicep from a PowerShell terminal in VSCode behind a corporate proxy.
My command line is:
az deployment sub create -f .\main.bicep -l uksouth
If I'm joined to the VPN then I get the following error:
Error while attempting to retrieve the latest Bicep version: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /repos/Azure/bicep/releases/latest (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x04BC0058>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond')).
If I drop off the VPN then the error is:
Error while attempting to retrieve the latest Bicep version: HTTPSConnectionPool(host='api.github.com', port=443): Max retries exceeded with url: /repos/Azure/bicep/releases/latest (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])"))).
Any help on how to configure VSCode / PowerShell to make this work?
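No answer is recorded here, but both errors point at the corporate proxy: on the VPN the CLI cannot reach api.github.com at all, and off the VPN TLS verification fails, which usually means an inspecting proxy whose root certificate is not in the CA bundle used by the CLI's Python stack. As a hedged sketch (assuming the Azure CLI honors the documented HTTPS_PROXY and REQUESTS_CA_BUNDLE environment variables), you can locate the bundle that would need the corporate root certificate appended:

import certifi

# Path to the CA bundle used by Python's requests/certifi stack. Appending the
# corporate root certificate here, or pointing REQUESTS_CA_BUNDLE at a bundle
# that already contains it, is the usual workaround for
# "certificate verify failed" behind a TLS-inspecting proxy.
print(certifi.where())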

Download Sharepoint Online file using Python

I am trying to automate some work processes by using Python to download an Excel file from our SharePoint site. Some background is that I'm fairly new to Python and am mostly self-taught. I have read over many examples here using sharepy, office365, requests, etc. and none seem to work. Below is what I have tried (basically copied from another example) with the error below that:
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File

# Inputs
url_shrpt = "https://company.sharepoint.com/sites/teamsite"
username_shrpt = "user@company.com"
password_shrpt = "password"
folder_url_shrpt = "sites/teamsite/library"

# Authentication
ctx_auth = AuthenticationContext(url_shrpt)
if ctx_auth.acquire_token_for_user(username_shrpt, password_shrpt):
    ctx = ClientContext(url_shrpt, ctx_auth)
    web = ctx.web
    ctx.load(web)
    ctx.execute_query()
    print("Authenticated into sharepoint as: ", web.properties['Title'])
else:
    print(ctx_auth.get_last_error())

# Function for extracting the file names from a folder
global print_folder_contents
def print_folder_contents(ctx, folder_url):
    try:
        folder = ctx.web.get_folder_by_server_relative_url(folder_url)
        fold_names = []
        sub_folders = folder.files
        ctx.load(sub_folders)
        ctx.execute_query()
        for s_folder in sub_folders:
            fold_names.append(s_folder.properties["Name"])
        return fold_names
    except Exception as e:
        print('Problem printing out library contents: ', e)

######################################################
# Call the function by giving your folder URL as input
filelist_shrpt = print_folder_contents(ctx, folder_url_shrpt)

# Print the list of files present in the folder
print(filelist_shrpt)
Here is the error I receive:
Error: HTTPSConnectionPool(host='login.microsoftonline.com', port=443): Max retries exceeded with url: /extSTS.srf (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
After collaborating with some more knowledgeable coworkers, I was told it might be a permissions issue; but I'm not sure how to verify that. Any help and/or advice would be appreciated.
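One way to check (a sketch of my own, not from the original post): the WinError 10060 timeout happens before any authentication, so it looks like a network problem (a firewall or proxy blocking login.microsoftonline.com) rather than SharePoint permissions; a permissions failure would come back quickly as an HTTP 401/403 instead of timing out.

import requests

# If these also time out, the issue is connectivity, not SharePoint permissions.
try:
    for url in ("https://login.microsoftonline.com", "https://company.sharepoint.com"):
        r = requests.get(url, timeout=10)
        print(url, "reachable, HTTP", r.status_code)
except requests.exceptions.RequestException as exc:
    print("Network-level failure:", exc)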

Can't connect to Overpass

I can't connect to Overpass!
import osmnx as ox
ox.plot_graph(ox.graph_from_place('Modena, Italy'))
gives:
ConnectionError: HTTPConnectionPool(host='overpass-api.de', port=80): Max retries exceeded with url: /api/interpreter (Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',))
I also noticed that osmnx gets more nodes than OpenStreetMap for the same bounding box; how can this happen?
Thanks in advance!
It sounds like the Overpass API was probably down. I just tested now and was able to connect without a problem. I'd suggest trying again.
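If it happens again, the Overpass status endpoint can confirm whether the server is reachable and whether your client is currently rate-limited (a small sketch of my own, not part of the original answer):

import requests

# The status endpoint reports available request slots and any rate limiting.
r = requests.get("https://overpass-api.de/api/status", timeout=30)
print(r.status_code)
print(r.text)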