Compiling Python Redshift DB adapter for AWS Lambda env.

AWS Lambda has gained huge momentum in the last couple of years and has enabled software architects and developers to build FaaS (Function as a Service) applications. As much as Lambda helps in scaling applications, it has limitations such as maximum execution duration and available memory. For long-running jobs, typically backend or batch processing, the 5-minute execution limit can be a deal breaker. But with appropriate data partitioning and architecture, it is still an excellent option for enterprises to scale their applications cost-effectively.

In a recent project, I architected a pipeline that loads data from a data lake into Redshift. The data is produced by an engine in batches and pushed to S3. The data is partitioned by time, and a consumer Python application loads it at regular intervals into a Redshift staging environment. For a scalable solution, the data lake can be populated by multiple producers, and similarly one or more consumers can drain the data lake queue and load into Redshift. The data from the multiple staging tables is then loaded into the final table after deduplication and data augmentation.
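To make the consumer side concrete, here is a minimal sketch of how a loader might turn a time partition into a Redshift COPY statement. The bucket name, prefix layout, staging table, IAM role, and the gzipped-CSV file format are all assumptions made for illustration; the actual project's layout is not described here.

```python
from datetime import datetime, timezone

# Hypothetical names, for illustration only.
BUCKET = "example-datalake"
STAGING_TABLE = "staging.events"
IAM_ROLE = "arn:aws:iam::123456789012:role/example-redshift-copy"


def partition_prefix(ts: datetime) -> str:
    """Return the S3 key prefix of the hourly partition that contains ts."""
    return ts.strftime("events/%Y/%m/%d/%H/")


def build_copy_sql(ts: datetime) -> str:
    """Build a COPY statement that loads one partition into the staging table.

    Assumes the producer writes gzipped CSV files under the partition prefix.
    """
    source = f"s3://{BUCKET}/{partition_prefix(ts)}"
    return (
        f"COPY {STAGING_TABLE} "
        f"FROM '{source}' "
        f"IAM_ROLE '{IAM_ROLE}' "
        "FORMAT AS CSV GZIP;"
    )


if __name__ == "__main__":
    print(build_copy_sql(datetime(2017, 9, 1, 13, tzinfo=timezone.utc)))
```

Because each consumer only needs a timestamp to know which prefix to load, several consumers can work on different partitions in parallel without coordinating on individual files.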

The Lambda environment comes with many tools, applications, and libraries, including boto3, Python, Perl, etc., but not psycopg2, a Python DB adapter for PostgreSQL that also works with Redshift. So one has to package psycopg2 along with the function (service application) and upload it when creating the Lambda function. Following are the steps I took to compile a statically linked library on a system compatible with the AWS Lambda AMI.
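For context, a function that relies on the bundled library might look like the sketch below. The environment variable names and the handler name are hypothetical; only `psycopg2.connect()` and its keyword arguments come from the library itself, and 5439 is Redshift's default port.

```python
import os


def connection_kwargs(env=os.environ):
    """Collect psycopg2.connect() keyword arguments from environment variables.

    The REDSHIFT_* variable names are assumptions made for this sketch.
    """
    return {
        "host": env["REDSHIFT_HOST"],
        "port": int(env.get("REDSHIFT_PORT", "5439")),  # Redshift's default port
        "dbname": env["REDSHIFT_DB"],
        "user": env["REDSHIFT_USER"],
        "password": env["REDSHIFT_PASSWORD"],
    }


def lambda_handler(event, context):
    # Imported here so the module still loads where psycopg2 is absent;
    # inside Lambda it resolves to the psycopg2 directory bundled in the zip.
    import psycopg2

    conn = psycopg2.connect(**connection_kwargs())
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")  # simple connectivity check
            return {"ok": cur.fetchone()[0] == 1}
    finally:
        conn.close()
```

If the import fails inside Lambda, the usual cause is that the psycopg2 directory in the zip was built on a system that does not match the Lambda runtime, which is exactly what the build steps here are meant to avoid.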

The Lambda environment is a Linux system, and the currently available AMI is amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2. First, start an instance of this AMI and download the needed tools and source code.

# Install git and lynx...
> sudo yum install git
> sudo yum install lynx

# Download Postgres...
> mkdir ~/postgres
> cd ~/postgres
> lynx
saved postgresql-9.4.3.tar.gz

> gunzip post*gz
> tar xvf post*tar
... [More than 6,200 files]

# Download psycopg2...
> mkdir ~/psycop
> cd ~/psycop
> lynx
> gunzip *gz
> tar xvf *tar

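# Configure, build and install Postgres
> cd ~/postgres/postgresql-9.4.3
> ./configure --prefix ~/postgres/postgresql-9.4.3 --without-readline --without-zlib
> make
> make install
# After install check for pg_config, psql, etc. executables in the prefix's bin directory

# Make config changes to psycopg2 so that it builds a statically linked lib
# (typically: point pg_config at the pg_config built above and set static_libpq=1)
> cd ~/psycop
> vim psycopg2-2.6.1/setup.cfg

# Build psycopg2
> cd psycopg2-2.6.1
> python setup.py build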

# Zip lambda function and library
> zip -r [] *.py [LIBRARY] -x *.pyc -x [ANYTHING_NOT_NEEDED]
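The same packaging step can also be scripted, which helps when the bundle is rebuilt often. The sketch below uses Python's standard `zipfile` module; the function name `build_package` and the directory layout are hypothetical, and it assumes the handler and the built psycopg2 directory have already been copied into a single source directory.

```python
import os
import zipfile


def build_package(src_dir, zip_path, skip_suffixes=(".pyc",)):
    """Zip src_dir recursively, skipping unwanted files; return archived names."""
    names = []
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Skip excluded suffixes and the archive itself if it lives in src_dir.
                if name.endswith(tuple(skip_suffixes)) or os.path.abspath(full) == zip_abs:
                    continue
                # Store paths relative to src_dir so the handler sits at the zip root.
                arcname = os.path.relpath(full, src_dir)
                zf.write(full, arcname)
                names.append(arcname)
    return names
```

Storing paths relative to the source directory matters because Lambda looks for the handler module and the psycopg2 package at the top level of the archive.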


A precompiled build is available on GitHub:
