Building lambda function to connect to PostgreSQL, getting: Unable to import module 'lambda_function': No module named 'psycopg2' - postgresql

I am testing my API Gateway to call a Lambda function. The test was successful. I then tried to make a connection to PostgreSQL through the same Lambda:
import json
import psycopg2

db_host = "hostname"
db_port = 5432
db_name = "db name"
db_user = "user"
db_pass = "password"

def connect():
    conn = None
    try:
        conn = psycopg2.connect("dbname={} user={} host={} password={}".format(db_name, db_user, db_host, db_pass))
    except psycopg2.Error:
        print("connection error")
    return conn
print("Loading function")
def lambda_handler(event, context):
# paring query from the string
name = event['queryStringParameters']['name']
action = event['queryStringParameters']['action']
print('name = '+name )
print('action = '+action)
# body of the response object
transactionResponse = {}
transactionResponse['name'] = name
transactionResponse['action'] = action
transactionResponse['message'] = 'Lambda called from api_gateway'
# construting Http response
responseObject = {}
responseObject['statusCode'] = 200
responseObject['headers'] {}
responseObject['headers']['Content-Type'] = 'application/json'
responseObject['body'] = json.dumps(transactionResponse)
# return the response object
return responseObject
When I tried to trigger it through the API endpoint I got:
Unable to import module 'lambda_function': No module named 'psycopg2'
I then went ahead and built my Lambda package by downloading the required module and uploading a zip file. When I tried to trigger the Lambda the same way, I got:
Runtime.ImportModuleError: Unable to import module 'lambda_function': No module named 'lambda_function'
I don't know what lambda_function refers to here.
Could anyone suggest a way out of this slump, or provide me a way to connect to RDS through Lambda from an API Gateway trigger?
This is my build package:

The issue is no longer there.
Get the psycopg2 build library from https://github.com/jkehler/awslambda-psycopg2. It was built for Python 3.6, so make sure you rename the folder to psycopg2 when uploading your code to AWS Lambda, select Python 3.6 as the runtime environment, and it should work.

Can you check the lambda_handler setting and ensure it is correctly set to represent your function?

You should check the Lambda handler name in the console. This error is likely caused by the handler name referring to lambda_function.foobar while the file inside the zip is not named lambda_function.py.
Ensure the name is in the format filename.function_name.
In this example, if the file is named lambda_function.py then the handler value should be lambda_function.lambda_handler, as in the sketch below.
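For illustration, a minimal file matching a handler setting of lambda_function.lambda_handler might look like this (a sketch; the response body is a placeholder, not from the question):

# lambda_function.py  <- the file name supplies the "lambda_function" part of the handler
def lambda_handler(event, context):  # the function name supplies the "lambda_handler" part
    return {"statusCode": 200, "body": "ok"}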
The directory structure does not currently include the psycopg2 module, so the import will still fail. To solve this, either of the following solutions is applicable:
Add the dependency via pip install, zip everything up again, and redeploy.
Add a Lambda layer with this dependency already installed.

Related

How can I run pytesseract / tesseract in Foundry Code Repositories?

I am trying to use the function image_to_string from the library pytesseract in a repository to perform OCR of PDFs. However, I am getting the following error:
From the checks I would assume the library was loaded correctly:
Does anyone have an idea how to troubleshoot here?
It seems like Foundry is not respecting / running the environment activation script
https://github.com/conda-forge/tesseract-feedstock/blob/main/recipe/activate.sh
that sets the TESSDATA_PREFIX environment variable automatically. However, we can infer the value manually and provide it to the pytesseract API calls.
Define the following helper function:
def _get_tessdata_directory_path():
    import sys
    from pathlib import Path
    env_root = Path(sys.executable).parent.parent
    share_dir = env_root / 'share' / 'tessdata'
    assert share_dir.exists(), 'tessdata directory does not exist in <envroot>/share/tessdata'
    return str(share_dir)
and use it as shown in the following snippet:
tessdata_dir_config = f'--tessdata-dir "{_get_tessdata_directory_path()}"'
pytesseract.image_to_string(image, ..., config=tessdata_dir_config)
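Putting the two pieces together, a complete call might look like this (a sketch; the image path and the PIL usage are placeholder assumptions, not from the original answer):

import sys
from pathlib import Path

import pytesseract
from PIL import Image

def _get_tessdata_directory_path():
    # infer <envroot>/share/tessdata from the interpreter location
    env_root = Path(sys.executable).parent.parent
    share_dir = env_root / 'share' / 'tessdata'
    assert share_dir.exists(), 'tessdata directory does not exist in <envroot>/share/tessdata'
    return str(share_dir)

# pass the inferred tessdata directory explicitly since TESSDATA_PREFIX is not set
tessdata_dir_config = f'--tessdata-dir "{_get_tessdata_directory_path()}"'
text = pytesseract.image_to_string(Image.open('page.png'), config=tessdata_dir_config)  # 'page.png' is hypothetical
print(text)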

How to mock an S3AFileSystem locally for testing spark.read.csv with pytest?

What I'm trying to do
I am attempting to unit test an equivalent of the following function, using pytest:
def read_s3_csv_into_spark_df(s3_uri, spark):
    df = spark.read.csv(
        s3_uri.replace("s3://", "s3a://")
    )
    return df
The test is defined as follows:
def test_load_csv(self, test_spark_session, tmpdir):
    # here I 'upload' a fake csv file using the tmpdir fixture and moto's mock_s3 decorator
    # (a sketch of this setup follows below)
    # now that the fake csv file is uploaded to s3, I try to read it into a spark df using my function
    baseline_df = read_s3_csv_into_spark_df(
        s3_uri="s3a://bucket/key/baseline.csv",
        spark=test_spark_session
    )
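For reference, the elided moto setup might look roughly like this (a sketch assuming a moto version that provides mock_s3; the bucket, key, and CSV contents are placeholders):

import boto3
from moto import mock_s3

@mock_s3
def test_load_csv(self, test_spark_session, tmpdir):
    # create a fake bucket and upload a small csv, with moto intercepting the boto3 calls
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket="bucket")
    s3.put_object(Bucket="bucket", Key="key/baseline.csv", Body="a,b\n1,2\n")
    baseline_df = read_s3_csv_into_spark_df(
        s3_uri="s3a://bucket/key/baseline.csv",
        spark=test_spark_session
    )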
In the above test, the test_spark_session fixture used is defined as follows:
@pytest.fixture(scope="session")
def test_spark_session():
    test_spark_session = (
        SparkSession.builder.master("local[*]").appName("test").getOrCreate()
    )
    return test_spark_session
The problem
I am running pytest on a SageMaker notebook instance, using python 3.7, pytest 6.2.4, and pyspark 3.1.2. I am able to run other tests by creating the DataFrame using test_spark_session.createDataFrame, and then performing aggregations. So the local spark context is indeed working on the notebook instance with pytest.
However, when I attempt to read the csv file in the test I described above, I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling o84.csv.
E : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3a.S3AFileSystem not found
How can I, without actually uploading any csv files to S3, test this function?
I have also tried providing the S3 uri using s3:// instead of s3a://, but got a different, related error: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3".
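No accepted answer is recorded here, but the ClassNotFoundException indicates the hadoop-aws JARs (which provide S3AFileSystem) are missing from the local Spark classpath. A commonly suggested route, sketched below and not verified on SageMaker: pull hadoop-aws via spark.jars.packages (the version should match the Hadoop build bundled with your pyspark; 3.2.0 is the assumption here for pyspark 3.1.2), and note that the JVM reads S3 directly, so the in-process mock_s3 decorator will not intercept it. You would instead run moto's standalone server (available via the moto[server] extra) and point the S3A endpoint at it:

from pyspark.sql import SparkSession

test_spark_session = (
    SparkSession.builder.master("local[*]")
    .appName("test")
    # hypothetical version; align with your pyspark/Hadoop build
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.2.0")
    # point the S3A connector at a locally running moto server instead of real S3
    .config("spark.hadoop.fs.s3a.endpoint", "http://127.0.0.1:5000")
    .config("spark.hadoop.fs.s3a.access.key", "testing")
    .config("spark.hadoop.fs.s3a.secret.key", "testing")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)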

ModuleNotFoundError: No module named 'gspread' on python anywhere

What I am trying to achieve:
Run a Python script saved on the pythonanywhere host from Google Sheets on a button press.
Check the answer by Dustin Michels.
Task of each file:
app.py: contains the code of the REST API made using Flask.
runMe.py: contains code that gets values from the Google Sheet (cells A1:A2), sums both values, and sends the sum back to A3.
main.py: contains code for a GET request with the filename as an argument (runMe.py); the filename may change if the user wants to run another file.
I made the API using Flask. It works perfectly both online and offline, but feel free to recommend anything related to app.py. Here is the code of app.py:
from flask import Flask, jsonify
from flask_restful import Api, Resource
import os

app = Flask(__name__)
api = Api(app)

class callApi(Resource):
    def get(self, file_name):
        my_dir = os.path.dirname(__file__)
        file_path = os.path.join(my_dir, file_name)
        file = open(file_path)
        getvalues = {}
        exec(file.read(), getvalues)
        return jsonify({'data': getvalues['total']})

api.add_resource(callApi, "/callApi/<string:file_name>")

if __name__ == '__main__':
    app.run()
Here is the code of runMe2.py:
import gspread
from oauth2client.service_account import ServiceAccountCredentials
# use creds to create a client to interact with the Google Drive API
scopes = ['https://www.googleapis.com/auth/spreadsheets', "https://www.googleapis.com/auth/drive.file", "https://www.googleapis.com/auth/drive"]
creds = ServiceAccountCredentials.from_json_keyfile_name('service_account.json', scopes)
client = gspread.authorize(creds)
# Find a workbook by name and open the first sheet
# Make sure you use the right name here.
sheet = client.open("Demosheet").sheet1
# Extract and print all of the values
list_of_hashes = sheet.get_all_records()
print(list_of_hashes)
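The snippet above only reads the sheet; to match the description of runMe.py (sum A1 and A2 and send the result back to A3), a hedged continuation using gspread's cell API could look like this. Defining total also matters because app.py pulls getvalues['total'] after exec:

# read the two input cells, sum them, and write the result back to A3
a1 = int(sheet.acell('A1').value)
a2 = int(sheet.acell('A2').value)
total = a1 + a2
sheet.update_acell('A3', total)
print(total)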
Below is the main.py code:
import requests
BASE = 'https://username.pythonanywhere.com/callApi/test.py'
response = requests.get(BASE)
print(response.json())
main.py output
{'data': 54}
Test.py code
a = 20
b = 34
total = a+b
print(total)
THE PROBLEM IS
If I request runMe2.py, I get the error below (check the runMe2.py code above; app.py is hosted on https://www.pythonanywhere.com/):
ModuleNotFoundError: No module named 'gspread'
However, I installed gspread on pythonanywhere using the command line, but it's not working.
You either haven't installed the gspread package in your current Python environment, or it is installed somewhere else (e.g. in a different virtual env) and your script can't find it.
Try installing the package inside the environment you're running your script in, using pip3:
pip3 install gspread
You can try something like this on Windows
pip install gspread
or on Mac
pip3 install gspread
If you're running on Docker, or building with a requirements.txt, you can try adding this line to your requirements.txt file:
gspread==3.7.0
Any other instructions for this package can be found here => https://github.com/burnash/gspread
Download gspread here:
Download the tar file gspread-3.7.0.tar.gz from the link above.
Extract the file, convert the folder to a zip, then upload it back to the server.
Open a bash console and run:
$ unzip gspread-3.7.0
$ cd gspread-3.7.0
$ python3.7 setup.py install --user

Loading JDBC connector in Eclipse Jython plugin

I'm working with an Eclipse-based tool which provides an interactive Jython shell for scripting and data analysis against an internal data model.
I'm trying to write a script which exports results to some form of database, so I'm trying to use the built-in com.ziclix.python.sql package in Jython to provide the interface and the Xerial JDBC connector for SQLite (https://github.com/xerial/sqlite-jdbc) to provide the backend.
The script below works perfectly when run outside of the third-party tool using a standard command line Jython interpreter, including relying on the importJar() hack which is commonly used to work around Jython not always using the user CLASSPATH when run using java -jar <blah>:
from com.ziclix.python.sql import zxJDBC
from java.net import URL, URLClassLoader
from java.lang import ClassLoader
from java.io import File

JDBC_URL = "jdbc:sqlite:test.db"
JDBC_DRIVER = "org.sqlite.JDBC"
JDBC_JAR = "E:/sqlite-jdbc-3.21.0.jar"

# Import Jar file into local class path
def importJar(jarFile):
    m = URLClassLoader.getDeclaredMethod("addURL", [URL])
    m.accessible = 1
    m.invoke(ClassLoader.getSystemClassLoader(), [File(jarFile).toURL()])

def main():
    try:
        importJar(JDBC_JAR)
        dbConn = zxJDBC.connect(JDBC_URL, None, None, JDBC_DRIVER)
        cursor = dbConn.cursor()
        # Do something useful
        cursor.close()
        dbConn.close()
    except zxJDBC.DatabaseError, msg:
        print msg

if __name__ == '__main__':
    main()
... but fails when run from the plugin inside Eclipse, where the zxJDBC.connect() call errors with:
driver [org.sqlite.JDBC] not found
If I add the Jar file to the JYTHONPATH environment I can do import org.sqlite.JDBC successfully in the Python script, but the connect call still fails in the Java-side of the JDBC driver manager.
For the sake of completeness, the full path to the Jar file is on the CLASSPATH, PYTHONPATH, and JYTHONPATH environment variables ...
Any ideas?
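No accepted answer is recorded here, but one workaround worth trying (a sketch, untested inside the Eclipse plugin): java.sql.DriverManager refuses to hand out drivers loaded by a class loader other than the caller's, which is consistent with the import succeeding while the connect fails. Instantiating the driver through its own URLClassLoader and calling its connect() method bypasses DriverManager entirely; the jar path, driver class, and URL below are the ones from the question:

from java.net import URLClassLoader
from java.io import File
from java.util import Properties

JDBC_URL = "jdbc:sqlite:test.db"
JDBC_DRIVER = "org.sqlite.JDBC"
JDBC_JAR = "E:/sqlite-jdbc-3.21.0.jar"

# load the driver class through a dedicated class loader instead of the system one
loader = URLClassLoader([File(JDBC_JAR).toURL()])
driverClass = loader.loadClass(JDBC_DRIVER)
driver = driverClass.newInstance()

# connect directly via java.sql.Driver.connect(), bypassing DriverManager
rawConn = driver.connect(JDBC_URL, Properties())
print rawConn  # a java.sql.Connection on success

Note that the returned object is a plain java.sql.Connection, so you would work through its Java API (createStatement(), etc.) rather than the zxJDBC cursor interface.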

pytest implementing a logfile per test method

I would like to create a separate log file for each test method, and I would like to do this in the conftest.py file and pass the logfile instance to the test method. This way, whenever I log something in a test method it would go to a separate log file and be very easy to analyse.
I tried the following.
Inside the conftest.py file I added this:
logs_dir = pkg_resources.resource_filename("test_results", "logs")

def pytest_runtest_setup(item):
    test_method_name = item.name
    testpath = item.parent.name.strip('.py')
    path = '%s/%s' % (logs_dir, testpath)
    if not os.path.exists(path):
        os.makedirs(path)
    log = logger.make_logger(test_method_name, path)  # make_logger takes care of creating the logfile and returns the python logging object
The problem here is that pytest_runtest_setup does not have the ability to return anything to the test method; at least, I am not aware of it.
So I thought of creating a fixture method inside the conftest.py file with scope="function" and calling this fixture from the test methods. But the fixture method does not know about the pytest Item object. In the case of the pytest_runtest_setup method, it receives the item parameter, and using that we are able to find out the test method name and test method path.
Please help!
I found this solution by researching further upon webh's answer. I tried to use pytest-logger, but its file structure is very rigid and it was not really useful for me. I found the following code working without any plugin. It is based on set_log_path, which is an experimental feature.
Tested with pytest 6.1.1 and Python 3.8.4.
# conftest.py
# Required modules
import pytest
from pathlib import Path

# Configure logging
@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_setup(item):
    config = item.config
    logging_plugin = config.pluginmanager.get_plugin("logging-plugin")
    filename = Path('pytest-logs', item._request.node.name + ".log")
    logging_plugin.set_log_path(str(filename))
    yield
Notice that the use of Path can be substituted by os.path.join. Moreover, different tests can be set up in different folders and keep a record of all tests done historically by using a timestamp on the filename. One could use the following filename for example:
# conftest.py
# Required modules
import pytest
import datetime
from pathlib import Path

# Configure logging
@pytest.hookimpl(hookwrapper=True, tryfirst=True)
def pytest_runtest_setup(item):
    ...
    filename = Path(
        'pytest-logs',
        item._request.node.name,
        f"{datetime.datetime.now().strftime('%Y%m%dT%H%M%S')}.log"
    )
    ...
Additionally, if one would like to modify the log format, one can change it in pytest configuration file as described in the documentation.
# pytest.ini
[pytest]
log_file_level = INFO
log_file_format = %(name)s [%(levelname)s]: %(message)s
My first stackoverflow answer!
I found the answer I was looking for.
I was able to achieve it using a function-scoped fixture like this:
@pytest.fixture(scope="function")
def log(request):
    # os.path.splitext removes the .py extension; str.strip('.py') would eat trailing 'p'/'y' characters
    test_path = os.path.splitext(request.node.parent.name)[0]
    test_name = request.node.name
    node_id = request.node.nodeid
    log_file_path = '%s/%s' % (logs_dir, test_path)
    if not os.path.exists(log_file_path):
        os.makedirs(log_file_path)
    logger_obj = logger.make_logger(test_name, log_file_path, node_id)
    yield logger_obj
    handlers = logger_obj.handlers
    for handler in handlers:
        handler.close()
        logger_obj.removeHandler(handler)
In newer pytest versions this can be achieved with set_log_path:
@pytest.fixture(autouse=True)
def manage_logs(request):
    """Set log file name same as test name"""
    request.config.pluginmanager.get_plugin("logging-plugin")\
        .set_log_path(os.path.join('log', request.node.name + '.log'))
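For completeness, a minimal usage sketch (the module and test names are hypothetical): with the autouse fixture above in conftest.py, any test that logs through the standard logging module writes to its own file under log/. You may also need log_file_level set in pytest.ini, as shown earlier, for records to be captured.

# test_example.py -- hypothetical test module
import logging

log = logging.getLogger(__name__)

def test_something():
    # with the autouse fixture active, this record lands in log/test_something.log
    log.info("running test_something")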