AWS - Lambda and selenium

What ?

I have recently started a slow and awkward dance with Amazon and their serverless solutions. And no, I did not forget you OpenFaaS, daddy still loves you.

Long story short, I decided to automate a task I often do at work and to learn Lambda at the same time. This ended up being quite a journey. It was successful in the end, but it almost died in its infancy: the first step I needed to automate was logging in to one specific page that has a very "cleverly" made login process and can't be handled with the requests library. But I knew it could be done with Selenium and Chromedriver.


Lambda functions in AWS are great, and if Synthetic Canaries could return an actual custom value instead of only failed / not failed, I would not have to bother with all this. The issue is that you are limited in how much you can deploy into a single Lambda function, and that limit is 250 MB.

Remember, you need to squeeze:

You see the issue now? 250 MB vs 425 MB+
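
If you want to check how far over the limit your own package is before uploading, you can measure the unpacked size of the build directory. This is only a minimal sketch: the helper names are made up for illustration, and the 250 MB figure is the limit quoted above.

```python
import os

LAMBDA_UNZIPPED_LIMIT_MB = 250  # limit quoted in the text above


def dir_size_mb(path):
    """Return the total size of all regular files under path, in MB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total += os.path.getsize(full)
    return total / (1024 * 1024)


def fits_in_lambda(path, limit_mb=LAMBDA_UNZIPPED_LIMIT_MB):
    """True if the directory is within the given size limit."""
    return dir_size_mb(path) <= limit_mb
```

Calling fits_in_lambda() on the folder that holds Selenium, Chromedriver and Chrome should tell you quickly whether a plain zip deployment is even worth trying.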

You can try to fit it in with layers, and I was able to get Selenium, Chromedriver and Chrome into Lambda with layers... but! In the background, it still runs in a small container that simply does not have dependencies for Chrome browser. You will end up with:


chromedriver unexpectedly exited. Status code was: 127
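
Exit status 127 is what you typically get when a program cannot be started at all, and the dynamic loader also exits with it when a shared library is missing, which is most likely what happens here. One way to confirm it is to run ldd on the binary and look for "not found" entries. A rough sketch, assuming ldd is available in the environment; the function names are mine:

```python
import subprocess


def parse_missing_libs(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_shared_libs(binary_path):
    """Run ldd on a binary and list its unresolved shared libraries."""
    result = subprocess.run(
        ["ldd", binary_path], capture_output=True, text=True, check=False
    )
    return parse_missing_libs(result.stdout)
```

Running missing_shared_libs() against the Chromedriver and Chrome binaries inside the Lambda environment should list whatever the base image lacks.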

A solution

Well, since there is no way to fit it into pure Lambda, there is another Lambda option: using your own container. Here the limits are a bit more relaxed. Your container image can be up to 10 GB in size.

So I assume you have some basic knowledge of Lambda, the AWS CLI, or SAM (I sure did not know I could use my own container as a Lambda function...)

Creating basic function

Let's look at the basic file and folder structure of a simple Lambda function that is able to invoke Chromedriver and Chrome. I'm going to use SAM to generate the scaffolding for this function.

vladoportos@Odin:~/amazon_aws/test$ sam init
Which template source would you like to use?
        1 - AWS Quick Start Templates
        2 - Custom Template Location
Choice: 1

Cloning from

Choose an AWS Quick Start application template
        1 - Hello World Example
        2 - Multi-step workflow
        3 - Serverless API
        4 - Scheduled task
        5 - Standalone function
        6 - Data processing
        7 - Infrastructure event management
        8 - Machine Learning
Template: 1

 Use the most popular runtime and package type? (Nodejs and zip) [y/N]: N

Which runtime would you like to use?
        1 - dotnet5.0
        2 - dotnetcore3.1
        3 - go1.x
        4 - java11
        5 - java8.al2
        6 - java8
        7 - nodejs14.x
        8 - nodejs12.x
        9 - python3.9
        10 - python3.8
        11 - python3.7
        12 - python3.6
        13 - ruby2.7
Runtime: 9

What package type would you like to use?
        1 - Zip
        2 - Image
Package type: 2

Based on your selections, the only dependency manager available is pip.
We will proceed copying the template using pip.

Project name [sam-app]: sample-app

    Generating application:
    Name: sample-app
    Base Image: amazon/python3.9-base
    Architectures: x86_64
    Dependency Manager: pip
    Output Directory: .
    Next steps can be found in the README file at ./sample-app/

    Commands you can use next
    [*] Create pipeline: cd sample-app && sam pipeline init --bootstrap
    [*] Test Function in the Cloud: sam sync --stack-name {stack-name} --watch

This will generate the basic folder structure and files. The important part is to choose Image as the package type.

vladoportos@Odin:~/amazon_aws/test/sample-app$ ls

#cd to hello_world
vladoportos@Odin:~/amazon_aws/test/sample-app/hello_world$ ls

Above you can see the folder structure generated for us.

Most important files are:

  • template.yaml  - Telling SAM how to deploy our function.
  • hello_world/Dockerfile - Docker file, we are going to change this.
  • hello_world/ - Our function, I will show you what to add there.
  • hello_world/requirements.txt - Classic requirements file for a Python app; add your libraries there and they will be installed into your app automatically.


Change template.yaml to look like this:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: >
  Sample SAM Template for sample-app

Globals:
  Function:
    Timeout: 60

Resources:
  HelloWorldFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      MemorySize: 512
      Timeout: 300
      Architectures:
        - x86_64
      Environment:
        Variables:
          URL: ''
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: ./hello_world
      DockerTag: python3.9-v1

I have removed the API and Outputs sections from it; we don't need them now.


  • If you want to rename your function, change "HelloWorldFunction" here as well. The name of the folder where your function lives can also be changed here under "DockerContext: ./hello_world", just don't forget to rename the folder as well 🙂
  • I have added an environment variable with URL that I will refer to in the code later.
  • Also I have increased RAM for that function to 512 MB
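
Reading that environment variable inside the function can be done with a small helper that fails loudly when the value is missing. This is just a sketch: get_target_url is a name I made up, and only the variable name URL comes from the template.

```python
import os


def get_target_url(default=""):
    """Read the URL variable set in template.yaml, or fall back to default."""
    url = os.environ.get("URL", default).strip()
    if not url:
        # An empty value would only fail later, inside the browser call.
        raise ValueError("URL environment variable is empty or not set")
    return url
```

Failing early like this makes a forgotten or empty variable show up as a clear error in the logs instead of an obscure WebDriver exception.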


In requirements.txt I have just Faker, which I use to generate a random user-agent for Chromedriver. It's a nifty library for generating fake data that looks convincing 🙂 Look into the Faker Documentation for details.



Notice I did not put selenium in there... spooky, right ?


Our pièce de résistance, let's look at it:

#based on
FROM as build

RUN yum install -y unzip && \
    curl -Lo "/tmp/" "" && \
    curl -Lo "/tmp/" "" && \
    unzip /tmp/ -d /opt/ && \
    unzip /tmp/ -d /opt/

RUN yum install atk cups-libs gtk3 libXcomposite alsa-lib \
    libXcursor libXdamage libXext libXi libXrandr libXScrnSaver \
    libXtst pango at-spi2-atk libXt xorg-x11-server-Xvfb \
    xorg-x11-xauth dbus-glib dbus-glib-devel -y

RUN pip install selenium

COPY --from=build /opt/chrome-linux /opt/chrome
COPY --from=build /opt/chromedriver /opt/

COPY requirements.txt ./

RUN python3.8 -m pip install -r requirements.txt -t .

# Command can be overwritten by providing a different command in the template directly.
CMD ["app.lambda_handler"]


I used the Python 3.8 image despite choosing Python 3.9 for the function, for the simple reason that I have read some comments on the internet saying Python 3.9 had some issues with Selenium and Chromedriver. This is completely unverified, so try 3.9 first and switch to 3.8 if there are issues.

I used the official python3.8 image from Amazon, and extended it with all we need. This was based on this repo: Github Repo, credit where credit is due.

You can see that it downloads Chromium and Chromedriver, unzips them and puts them in /opt, installs selenium, etc.
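
Since everything depends on those two binaries ending up in the right place, a quick check at startup can save some head scratching. A minimal sketch, assuming the paths used in this post (/opt/chrome/chrome and /opt/chromedriver); the helper name is hypothetical:

```python
import os

# Paths used by the Dockerfile and the function in this post.
CHROME_BINARY = "/opt/chrome/chrome"
CHROMEDRIVER_BINARY = "/opt/chromedriver"


def missing_executables(paths):
    """Return the paths that are not existing, executable files."""
    return [
        p for p in paths
        if not (os.path.isfile(p) and os.access(p, os.X_OK))
    ]
```

Calling missing_executables([CHROME_BINARY, CHROMEDRIVER_BINARY]) before starting the browser and logging the result tells you whether the image was built the way you expect.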

In our function, we need to keep some special settings for Chrome so that it runs OK.

from tempfile import mkdtemp
import logging
import sys
import os
from faker import Faker
import random
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, WebDriverException, TimeoutException, NoSuchWindowException

language_list = ['en']

# This part makes logging work both locally when testing and in Lambda (CloudWatch)
if logging.getLogger().hasHandlers():
    # Lambda already attaches a handler to the root logger, so only set the level
    logging.getLogger().setLevel(logging.INFO)
else:
    logging.basicConfig(level=logging.INFO)

# Function that sets up the browser parameters and returns the browser object.
def open_browser():
    fake_user_agent = Faker()
    options = webdriver.ChromeOptions()
    options.binary_location = '/opt/chrome/chrome'
    options.add_experimental_option("excludeSwitches", ['enable-automation'])
    # Assumption: flags commonly needed to run Chrome inside the Lambda container
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--single-process')
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument('--user-data-dir=' + mkdtemp())
    options.add_argument('--lang=' + random.choice(language_list))
    options.add_argument('--user-agent=' + fake_user_agent.user_agent())
    # Selenium 4 takes the driver path through a Service object
    chrome = webdriver.Chrome(service=Service("/opt/chromedriver"), options=options)

    return chrome

# Our main Lambda function
def lambda_handler(event, context):
    try:
        # Open browser
        browser = open_browser()

        # Clean cookies
        browser.delete_all_cookies()

        # Open web
        logging.info("Opening web " + os.environ['URL'])
        browser.get(os.environ['URL'])

        # Get text from google
        target_XPATH = '/html/body/div[1]/div[3]/form/div[1]/div[1]/div[3]/center/input[1]'
        target = WebDriverWait(browser, 20).until(EC.presence_of_element_located((By.XPATH, target_XPATH)))
        return_val = target.text

        # browser.close() might not be required since the container is destroyed anyway after done.
        return {
            "return": return_val
        }

    except AssertionError as msg:
        logging.error(msg)
    # The specific Selenium exceptions go before WebDriverException, their base class
    except TimeoutException:
        logging.error('Request Time Out')
    except NoSuchWindowException:
        logging.error('Window is gone, somehow...- NoSuchWindowException')
    except NoSuchElementException:
        logging.error('------------------------ No such element on site. ------------------------', exc_info=True)
        logging.error('------------------------ No such element on site. END ------------------------')
    except WebDriverException:
        logging.error('------------------------ WebDriver-Error! ---------------------', exc_info=True)
        logging.error('------------------------ WebDriver-Error! END ----------------')
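
One detail about the except chain is worth spelling out: in Selenium, TimeoutException, NoSuchWindowException and NoSuchElementException all derive from WebDriverException, and Python runs the first except clause that matches, so the specific handlers only fire if they are listed before the generic one. The sketch below shows the effect with stand-in classes (hypothetical names, not Selenium's own) so it runs without a browser:

```python
class WebDriverError(Exception):
    """Stand-in for Selenium's WebDriverException (hypothetical name)."""


class ElementMissingError(WebDriverError):
    """Stand-in for a specific subclass such as NoSuchElementException."""


def classify(exc):
    """Specific clause first: each exception reaches its own handler."""
    try:
        raise exc
    except ElementMissingError:
        return "specific"
    except WebDriverError:
        return "generic"


def classify_wrong_order(exc):
    """Base class first: the specific clause below can never run."""
    try:
        raise exc
    except WebDriverError:
        return "generic"
    except ElementMissingError:  # unreachable, already caught above
        return "specific"
```

With the base class first, every Selenium error would be logged as a generic WebDriver error, which hides the more useful message.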

Additional requirements.

Before you push this baby out into the wild world of AWS, you need to create an Amazon Elastic Container Registry (ECR) repository where the container image will be stored. The image is 537.96 MB in size, so it's up to you whether you create a private repo (500 MB free) or a public repo, where you get 50 GB free. SAM will ask you for this repo name when deploying your Lambda.

Build, Test, Publish

You can test locally (see the Guide) if you have set up SAM and the AWS CLI (look those up on Google) and have Docker installed. After that, you can publish it to AWS and have fun with Selenium, Chrome and Chromedriver in a Lambda function.
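
For the local test, the usual SAM flow is sam build followed by sam local invoke with a test event. Below is a small sketch that prepares an empty event file and builds the command line from Python. The file name event.json and the helper names are my own choices, and HelloWorldFunction is the default resource name from the generated template, so adjust it if you renamed yours.

```python
import json
import subprocess


def write_event(path, payload=None):
    """Write a JSON test event for the function and return the path."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload or {}, fh)
    return path


def local_invoke_command(function_id, event_path):
    """Build the `sam local invoke` command line as a list of arguments."""
    return ["sam", "local", "invoke", function_id, "--event", event_path]


def run_local_test(function_id="HelloWorldFunction", event_path="event.json"):
    """Build the image and invoke the function locally (needs SAM and Docker)."""
    write_event(event_path)
    subprocess.run(["sam", "build"], check=True)
    subprocess.run(local_invoke_command(function_id, event_path), check=True)
```

The handler in this post ignores the event, so an empty JSON object is enough. Call run_local_test() from the project folder once SAM and Docker are installed.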

I hope you liked this short guide and got something useful out of it. Take a break, grab some beverage and maybe one for me too.

Last update: February 7, 2022