Automate Python with AWS Lambda

AWS Lambda is a serverless compute service from Amazon Web Services that lets you automate Python code in the cloud easily and efficiently. When we needed to automate the Data Science pipeline for our Fantasy Basketball Optimization project, we chose AWS Lambda functions along with several other AWS services for our cloud solution (almost all of which stayed within the free tier limits!).

The 5 other AWS services that we paired with Lambda are:

  1. Elastic Container Registry (ECR) to supply code in Docker images to the function
  2. S3 buckets to store files that the function can read from and write to
  3. Secrets Manager to securely store sensitive info like API keys and passwords
  4. EventBridge to schedule functions to run automatically
  5. Identity and Access Management (IAM) to manage permissions as we connect everything together

Most of these services have free tiers for a set amount of usage, which actually allow you to do quite a bit (for instance, Lambda gives you 400,000 GB-seconds of compute per month). We managed to run our Fantasy Basketball Optimization pipeline mostly on free tiers (with a few exceptions that cost us only about $1.35 per month total). The full list of offerings can be found on the AWS Free Tier page.

Rather than dive into everything at once, we’ve put together a few examples that start simple and grow in complexity a little each time:

  1. Simple Lambda Function
    • Create your first Lambda function by simply writing code in the online editor
  2. Lambda with Additional Packages
    • Deploy code through a Docker image so that you can use Python packages that aren’t already included in the Lambda runtime
  3. Full Automation Setup
    • Connect other AWS services so that you can read/write data to cloud storage and schedule functions to run automatically

Let’s get into it, starting with just a simple Lambda function.

Simple Lambda Function

We’ll start with a very basic Lambda function where we simply edit the code right in the browser window. We’ll make a quick demo function to perform a few mathematical calculations and print the results.

There are a few basic steps to create our simple function:

  1. Navigate to the AWS Lambda page (by searching for “Lambda” in the console)
  2. Click the ‘Create function’ button on the top right
  3. Name your function and select the ‘Python 3.13’ runtime (or whichever version of Python you prefer)
  4. Click the ‘Create function’ button in the bottom right, and you have your first Lambda function!

Deploying and Testing the Simple Function

Once your function is created, you can scroll down to the online editor to change the function code. Note that the main code executes from the lambda_handler() function, but you can create other Python functions outside of it that are called by lambda_handler().
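A minimal version of such a function might look like the sketch below: a helper that does a few mathematical calculations, called from the lambda_handler() entry point (the "numbers" event key is just an illustrative input, not anything Lambda requires):

```python
import json
import math


def calc_stats(numbers):
    """Helper called by lambda_handler: a few simple calculations."""
    return {
        "sum": sum(numbers),
        "mean": sum(numbers) / len(numbers),
        "max": max(numbers),
        "sqrt_of_sum": math.sqrt(sum(numbers)),
    }


def lambda_handler(event, context):
    # Lambda invokes this entry point; `event` carries the input payload
    numbers = event.get("numbers", [1, 2, 3, 4])
    stats = calc_stats(numbers)
    print(stats)  # appears in the execution results / CloudWatch logs
    return {"statusCode": 200, "body": json.dumps(stats)}
```

The return value shows up in the test output window, and anything printed appears in the function logs.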

Once your code is ready to go, click the ‘Deploy’ button on the left, and you can test your code over on the ‘Test’ tab. Once you hit the ‘Test’ button, a green window should pop up showing the execution results, including any console output.

Lambda with Additional Packages

Now that we’ve created a simple Lambda function, we’re ready to turn it up a notch by adding additional Python packages. While the Lambda runtime provides a lot of base Python packages, there are often packages we need that it doesn’t have (for instance, the Python Pandas package).

When adding extra packages, the deployment process is a little more complex. We have to use a tool called Docker, which basically bundles our Python code and packages together into a single container image that gets deployed to Lambda.

For this workflow, we start by creating the code locally (in this case, we’ll use the Visual Studio Code IDE), deploying it to an AWS Elastic Container Registry (ECR) repo using Docker Desktop, and then setting our Lambda function to read the image from the ECR repo. Note that we’ll also utilize the AWS Identity and Access Management (IAM) service to allow our Lambda function to access the ECR repo.

Creating an ECR Repo

To start, we’ll create the ECR repo for us to deploy code to. This can be done in a few simple steps:

  1. Navigate to the ECR page (by searching “elastic container registry” in the console)
  2. Click the ‘Create repository’ button on the top right
  3. Give your repo a name
  4. Click the ‘Create’ button on the bottom right, and you’re done!

Deploying Code to the ECR Repo

With our repo created, we’re ready to write and deploy our code to it. There are 4 main local files that we need (all saved in the same folder):

  1. An app.py file with the Python code for our function (note that it still needs the lambda_handler() function to execute)
  2. A requirements.txt file that lists out all of the Python packages we need
  3. A Dockerfile that tells Docker how to bundle everything together
  4. A deploy_image.sh script that contains all the shell commands for building the image and deploying it to ECR

Note that the full source code for these files is available in our GitHub repo.
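For reference, a minimal Dockerfile for this setup typically looks something like the sketch below. It assumes app.py and requirements.txt sit in the same folder, and the base image tag should match whichever Python version you chose:

```dockerfile
# Start from the AWS-provided base image for Python Lambda functions
FROM public.ecr.aws/lambda/python:3.13

# Install the extra packages listed in requirements.txt
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy in the function code and point Lambda at the handler
COPY app.py ${LAMBDA_TASK_ROOT}
CMD ["app.lambda_handler"]
```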

Once you have all your files together, you can deploy the image by running your deploy_image.sh script in a Bash terminal (we usually use Git Bash). Note that Docker Desktop has to be running before you run the script.

The script may take a bit to run, but once it finishes, you should see your Docker image show up in your ECR repo (note that you may have to refresh the page).
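If you’re curious what a deploy_image.sh script contains, a rough sketch is below. The account ID, region, and repo name are placeholders to substitute with your own values; the commands themselves are the standard ECR login/build/tag/push sequence:

```bash
#!/bin/bash
# Placeholders -- substitute your own account ID, region, and repo name
ACCOUNT_ID=123456789012
REGION=us-east-1
REPO=my-lambda-repo
REGISTRY=$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com

# Authenticate Docker with your ECR registry
aws ecr get-login-password --region $REGION | \
    docker login --username AWS --password-stdin $REGISTRY

# Build for arm64 (matching the architecture selected in Lambda), tag, and push
docker build --platform linux/arm64 -t $REPO .
docker tag $REPO:latest $REGISTRY/$REPO:latest
docker push $REGISTRY/$REPO:latest
```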

Creating an IAM Role

With our code deployed, the next thing to do is give our Lambda function access to it. This can be done through an Identity and Access Management (IAM) role. We’ll create a role that has access to the ECR repo, then assign the role to the Lambda function in the next part.

Here are the steps for creating the role:

  1. Navigate to the IAM page (by searching “IAM” in the console)
  2. Select the ‘Roles’ page on the left
  3. Click the ‘Create role’ button on the top right
  4. Make sure that the ‘AWS service’ option is selected
  5. Select the ‘Lambda’ service from the dropdown
  6. Add the ECR Read Only permission (by searching for “ElasticContainerRegistry” and checking the Read Only one)
  7. Give your role a name
  8. Click the ‘Create role’ button in the bottom right, and you’re done!

Creating the Lambda Function from ECR Image

With everything set up, we’re ready to put it all together into our final Lambda function. This will be very similar to how we created our simple Lambda function, but with a few differences in some of the settings:

  1. Select the ‘Container image’ option (instead of the ‘Author from scratch’ default)
  2. Name the function again, but this time click the ‘Browse images’ button below it
  3. In the window that pops up, find your new ECR repo in the dropdown and select the image you deployed
  4. Select the ‘arm64’ architecture (this aligns with what we specified in the Dockerfile)
  5. Expand the ‘Change default execution role’ section, check ‘Use an existing role’, and select your new role from the dropdown
  6. Click the ‘Create function’ button on the bottom right, and you’re done!

Full Automation Setup

Now that we’ve learned how to add extra packages to our Lambda function, there are a few other AWS services we’ll connect for our full automation setup:

  • AWS S3 buckets
    • Blob storage that the function can read from and write to
  • AWS Secrets Manager
    • Secure place to store secrets that the function uses (such as API keys and passwords)
  • AWS EventBridge
    • Scheduling the function to run on a regular cadence

Creating an S3 Bucket

We’ll start with the S3 bucket, which is basically just a blob storage directory in the cloud. We can create a new bucket with a few simple steps:

  1. Navigate to the S3 page (by searching “S3” in the console)
  2. Click the ‘Create bucket’ button in the top right
  3. Make sure that the ‘General purpose’ option is selected and give your bucket a name
  4. Click the ‘Create bucket’ button in the bottom right, and you’re ready to go!

Creating Secrets in the Secrets Manager

Next, we’ll create a few secrets in the Secrets Manager, which is a secure way to store sensitive information like API keys and passwords. Note that each top-level “secret” can hold multiple secret values (specified as key/value pairs).
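For example, a single secret (with a hypothetical name like fantasy-bball-secrets) could hold several key/value pairs at once, with the key names entirely up to you:

```json
{
  "api_key": "YOUR_API_KEY_HERE",
  "db_password": "YOUR_PASSWORD_HERE"
}
```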

Here are a few steps to create a new secret:

  1. Navigate to the Secrets Manager page (by searching “Secrets Manager” in the console)
  2. Click the ‘Store a new secret’ button in the top right
  3. Select the ‘Other type of secret’ option for miscellaneous secrets
  4. Add your secrets as key/value pairs
  5. Give your overall secret a name
  6. Click the ‘Store’ button on the bottom right, and you’re done!

Referencing other Services with Lambda

With both our S3 bucket and secrets created, we’ll show you how to reference both of them with your Lambda function. Both services can be accessed with your Python code through the boto3 package (see examples below).

Once you’ve updated your function code, you’ll need to re-deploy it to the repo by running the shell script again. Note that every time you re-deploy the image, you’ll also need to point your Lambda function at the new image (on your Lambda function page, hit the ‘Deploy new image’ button and select the new image).

Since the function now references other AWS services, we’ll need to add permissions to our role to allow access to them. You can do this by adding the Secrets Manager Read/Write and S3 Full Access permissions to your role (the same way we added the ECR permission earlier).

Lastly, depending on what your function is doing, you might need to adjust the function timeout limit for longer running tasks. The default timeout is only 3 seconds (it will error out if it runs for longer than that), so you may want to increase this limit depending on the expected run time of your code. You can adjust this under the ‘Configuration’ tab by clicking ‘Edit’ under ‘General configuration’.

Scheduling the Function with EventBridge

With our function set up with everything we need, the last step is to automate it on a schedule (in this case we’ll run it daily at 9:30 AM). This is done through the AWS EventBridge service, where we can create events that trigger things like Lambda function execution.
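As an aside, EventBridge also supports cron-based schedules as an alternative to the rate-based option we use below. A daily 9:30 AM run would look like this in EventBridge’s cron syntax (minutes, hours, day-of-month, month, day-of-week, year):

```
cron(30 9 * * ? *)
```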

Here’s a list of steps to setup our schedule:

  1. Edit your role’s trust policy to allow the EventBridge scheduler (i.e. navigate to the IAM role you created and hit ‘Edit trust policy’ from the ‘Trust relationships’ tab)
  2. Add scheduler.amazonaws.com as another principal in the trust policy and click the ‘Update policy’ button
  3. Navigate to the EventBridge page (by searching “EventBridge” in the console)
  4. Select the ‘EventBridge Schedule’ option in the ‘Get started’ section on the right and click ‘Create schedule’
  5. Give the schedule a name
  6. Set the schedule settings:
    • Recurring, rate-based schedule (i.e. the ‘Recurring schedule’ and ‘Rate-based schedule’ options)
    • Your local time zone (from the dropdown)
    • Schedule rate (e.g. every 1 day)
    • Time window flexibility (whether the code should run perfectly on-time, or if the timing can be off by X minutes)
  7. Specify a start and end time for the schedule
    • Note that with a daily schedule, the start time will dictate the time at which the function will run every day (9:30 AM in our example)
  8. Specify Lambda as the target
  9. Select your Lambda function from the dropdown to invoke
  10. Select ‘Use existing role’ under Permissions and select your role
  11. Click the ‘Create schedule’ button in the bottom right, and you’re done!
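For reference, the updated trust policy from steps 1–2 typically looks like this: the lambda.amazonaws.com principal is already there from when the role was created, and scheduler.amazonaws.com is the addition:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "scheduler.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```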

Conclusion

AWS Lambda is a great tool for automating Python in the cloud, and the free tier limits allow you to do quite a bit for free (or nearly free). After walking through these examples, we hope you feel confident enough to go out and create your own automated cloud pipelines for whatever you need.

We provided the code source files in our Daily Data GitHub repo, so feel free to download those and try it out yourself.

We hope you learned something today, and feel free to drop a comment about what tasks you might automate with Lambda!
