A Web Socket Primer
Oct 11, 2023 14 min read
A Web Socket primer! Everything you need to know about API Gateway Web Sockets to get started building!

The AWS API Gateway has had web socket capability since December 2018. Web Sockets can be used for instant notifications, enabling backend services to send messages or events to one or more listeners connected to the Web Socket service. The AWS web socket offering provides connection management and the capability for sending messages to connected listeners. The rest is up to you! It is a great managed service primitive enabling a multitude of use cases.

This article is intended as a Web Socket primer. It goes through the different components of the AWS API Gateway implementation, what you need to know when building with this service, and how each component behaves, so that you can understand the service and build solutions effectively.

How Web Sockets Work

Web sockets enable a persistent, bi-directional connection between your client system and backend service. This persistent connection creates a messaging conduit that enables data to flow in both directions, so the communication is asynchronous rather than synchronous. Because the message flow is bi-directional, you need to consider designing a message protocol to make sense of the data flowing from the front end to the back end and vice versa. It is also important to note that when the client sends a message, all it does is write the data to its outbound connection, which carries the data to the server. The send call does not return a response from the server; anything the backend sends back arrives later as a separate incoming message.

A really important distinction between API Gateway HTTP requests (synchronous) and API Gateway WebSocket requests (bi-directional) is described in the AWS documentation for Web Sockets:

In the HTTP protocol, in which requests and responses are sent synchronously; communication is essentially one-way. In the WebSocket protocol, communication is two-way. Responses are asynchronous and are not necessarily received by the client in the same order as the client's messages were sent. In addition, the backend can send messages to the client.

You should do some design work around defining a protocol for messaging between the front and back end components - this is not as simple as building a synchronous REST API.
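As a starting point for that design work, here is a minimal sketch of a message envelope. Everything in it is an assumption made for illustration: the field names `correlationId` and `data`, and the helpers `encodeMessage`, `decodeMessage` and `createPendingTracker`, are hypothetical and not part of any AWS API. The `action` field mirrors the `$request.body.action` selector discussed under Custom Routes. The tracker shows one way to match an asynchronous reply to the request that caused it, whatever order the replies arrive in.

```javascript
// Hypothetical envelope: { action, correlationId, data }.
function encodeMessage(action, data, correlationId) {
  return JSON.stringify({ action, correlationId, data });
}

// Parse an incoming frame and reject anything that is not an envelope.
function decodeMessage(raw) {
  try {
    const parsed = JSON.parse(raw);
    if (parsed === null || typeof parsed !== 'object' || typeof parsed.action !== 'string') {
      return { ok: false, error: 'missing action' };
    }
    return { ok: true, message: parsed };
  } catch (err) {
    return { ok: false, error: 'not JSON' };
  }
}

// Keep callbacks keyed by correlation id so replies can arrive in any order.
function createPendingTracker() {
  const pending = new Map();
  return {
    track(correlationId, onReply) {
      pending.set(correlationId, onReply);
    },
    // Returns true when the message matched a tracked request.
    resolve(message) {
      const onReply = pending.get(message.correlationId);
      if (!onReply) return false;
      pending.delete(message.correlationId);
      onReply(message.data);
      return true;
    },
  };
}
```

A client would call `track` just before sending a message and `resolve` from its message listener. Messages that do not match a tracked request can then be treated as backend-initiated notifications.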

Web Socket Parts

The AWS Web Socket implementation has a set of routes you can define and then integrate with various backend services, similar to AWS API Gateway REST routes. Web Sockets provide the following standard routes: $connect, $disconnect and $default, plus one or more custom routes. Each route has a specific function; together, they provide the building blocks for almost anything! The $connect route also enables Authorization and/or Authentication to provide security to your web socket service. We will cover this in a deeper dive at a later time.

[Diagram: Web Socket Parts]

The diagram above shows AWS Lambda Proxies as the integration target, but the AWS API Gateway Web Socket implementation also enables other integration types (just like REST API Gateway routes):

  • Lambda function (Proxy Integration)
  • Lambda function (direct integration)
  • HTTP Integration to integrate to an existing HTTP endpoint
  • AWS Service - for a direct service integration
  • VPC Link - to integrate with a service that isn't accessible over the internet.

A Note about Routes

You do not have to define an integration for every route in the Web Socket gateway. The AWS documentation page for $connect states that this route is optional, and it is. The $disconnect and $default routes are also optional. However, you need at least one route to be defined with an integration for your Web Socket instance to be active and available. The route you define could also be a custom route, and then your Web Socket API gateway will exist!

A common misunderstanding with Web Sockets is that you can only send data to the socket connection asynchronously, as a notification or broadcast through the API Gateway @connections endpoint. This is not the only way: the integrations behind the $default route and your custom routes can also return a JSON payload. That payload is not delivered as the result of the client's send call, as it would be in a synchronous request/response cycle. The client receives it in the same way as a backend broadcast message. Returning a response this way makes backend route development similar to writing an API Gateway request/response handler and does away with requiring permission to use the @connections API, which simplifies the backend configuration and implementation. However, you still need to consider the design of the messages that will flow between the clients and the backend of your system.
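To make this concrete, here is a minimal sketch of a Lambda proxy handler for a hypothetical custom route that replies by returning a body. The handler name `onEcho` and the message fields (`type`, `correlationId`, `data`) are assumptions for illustration only. It also assumes the route has been configured to return a response; without that, the returned body would not reach the client.

```javascript
// Hypothetical handler for a custom route. Export it in whatever way your
// module system and Lambda configuration expect.
const onEcho = async (event) => {
  let request;
  try {
    request = JSON.parse(event.body ?? '');
  } catch (err) {
    request = null;
  }

  // Report problems in the body, since the body is what the client receives.
  if (request === null || typeof request !== 'object') {
    return {
      statusCode: 200,
      body: JSON.stringify({ type: 'error', reason: 'invalid JSON' }),
    };
  }

  // The returned body arrives at the client as an ordinary incoming message.
  return {
    statusCode: 200,
    body: JSON.stringify({
      type: 'echo.reply',
      correlationId: request.correlationId,
      data: request.data,
    }),
  };
};
```

On the client side, nothing distinguishes this reply from a broadcast, which is why the sketch echoes a correlation id back.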

$connect Route

This is the first route web socket clients call to connect and create the open web socket channel for two-way communication between client and server. As mentioned, it is also where you can add authorisation and authentication to secure your socket service, but we are not covering this detail here. The $connect route is also optional, meaning you do not have to provide an integration if you don't need it. In most situations, you will require a $connect integration handler when:

  • You want to be notified when clients connect
  • You want to throttle connections or control who connects
  • You want your backend to send messages back to clients using a callback URL
  • You want to store the connectionId in a backend data store like DynamoDB.
  • You want to enable clients to specify a subprotocol for the connection by using the Sec-WebSocket-Protocol field. You can dive deeper into Web Socket subprotocols here.

When you add a route for a Web Socket, the CDK construct exposes a returnResponse option. This boolean determines whether the route should send a response to the client. It does not affect the $connect route, which will never return a response body to the web socket client. The $connect route integration must return a statusCode field in the integration response payload. The statusCode is a standard HTTP status code that tells the API Gateway WebSocket service how to respond to the client request. A WebSocket $connect request will only be successful if a 200 HTTP Status Code is returned, as shown in the following Lambda code snippet:

export const onConnect = async (event) => {
  console.log('Lambda Event', JSON.stringify(event));
  // Process connection, if OK return 200
  return {
    statusCode: 200,
  };
};

When the $connect route is called, the following integration event is received by your backend service.

{
  "headers": {
    "Host": "p1var95tx6.execute-api.region.amazonaws.com",
    "Sec-WebSocket-Extensions": "permessage-deflate; client_max_window_bits",
    "Sec-WebSocket-Key": "a5zBM+ejSE03/ImUVVJvSQ==",
    "Sec-WebSocket-Version": "13",
    "X-Amzn-Trace-Id": "Root=1-6522959c-7af4af697cdee9f30abfbf1a",
    "X-Forwarded-For": "1.145.209.9",
    "X-Forwarded-Port": "443",
    "X-Forwarded-Proto": "https"
  },
  "multiValueHeaders": {
    "Host": ["p1var95tx6.execute-api.region.amazonaws.com"],
    "Sec-WebSocket-Extensions": ["permessage-deflate; client_max_window_bits"],
    "Sec-WebSocket-Key": ["a5zBM+ejSE03/ImUVVJvSQ=="],
    "Sec-WebSocket-Version": ["13"],
    "X-Amzn-Trace-Id": ["Root=1-6522959c-7af4af697cdee9f30abfbf1a"],
    "X-Forwarded-For": ["1.145.209.9"],
    "X-Forwarded-Port": ["443"],
    "X-Forwarded-Proto": ["https"]
  },
  "queryStringParameters": { "query": "hello" },
  "multiValueQueryStringParameters": { "query": ["hello"] },
  "requestContext": {
    "routeKey": "$connect",
    "eventType": "CONNECT",
    "extendedRequestId": "MewPAHivSwMFWOg=",
    "requestTime": "08/Oct/2023:11:42:20 +0000",
    "messageDirection": "IN",
    "stage": "dev",
    "connectedAt": 1696765340601,
    "requestTimeEpoch": 1696765340602,
    "identity": { "sourceIp": "1.145.209.9" },
    "requestId": "MexQhHljSwMFa7A=",
    "domainName": "p1var95tx6.execute-api.region.amazonaws.com",
    "connectionId": "Mevp8c4gSwLDJJk=",
    "apiId": "p1var95tx6"
  },
  "isBase64Encoded": false
}
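Building on the event above, here is a rough sketch of a $connect handler that controls who connects and records the connectionId. The admission rule (a `query=hello` query string, borrowed from the sample event) and the names `connections` and `onConnectChecked` are hypothetical. The `Map` is only a stand-in for a real data store such as DynamoDB, because an in-memory map does not survive across Lambda instances.

```javascript
// In-memory stand-in for a real data store (illustration only).
const connections = new Map();

// Hypothetical $connect handler: admit the client only when the expected
// query string parameter is present, then remember the connection.
const onConnectChecked = async (event) => {
  const { connectionId, connectedAt } = event.requestContext;
  const params = event.queryStringParameters ?? {};

  if (params.query !== 'hello') {
    // Anything other than a 200 means the connection is not established.
    return { statusCode: 403 };
  }

  connections.set(connectionId, { connectedAt });
  return { statusCode: 200 };
};
```

Only the statusCode matters here, so there is no point in returning a body from this route.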

$disconnect Route

The $disconnect route is a funny one. It processes a disconnection from the AWS API Gateway Web Socket server. The documentation is very explicit about how this works: it is a best-effort event that may not reach your backend integration, since the actual socket connection is already closed. A WebSocket connection can be disconnected by either the client or the server. The client disconnects by closing the connection, while the backend service can close a Web Socket connection by calling the @connections API with the DELETE verb. Use this link to dive deeper into the @connections API. The $disconnect route does not return anything to the client (because the client is no longer there). This route should clean up any data stored by the $connect route to ensure the web socket connection is no longer registered as connected.

Your service receives the following event when the $disconnect route is called.

{
  "headers": {
    "Host": "p1var95tx6.execute-api.region.amazonaws.com",
    "x-api-key": "",
    "X-Forwarded-For": "",
    "x-restapi": ""
  },
  "multiValueHeaders": {
    "Host": ["p1var95tx6.execute-api.region.amazonaws.com"],
    "x-api-key": [""],
    "X-Forwarded-For": [""],
    "x-restapi": [""]
  },
  "requestContext": {
    "routeKey": "$disconnect",
    "disconnectStatusCode": 1000,
    "eventType": "DISCONNECT",
    "extendedRequestId": "MeyLFHBTSwMF3_Q=",
    "requestTime": "08/Oct/2023:11:48:35 +0000",
    "messageDirection": "IN",
    "disconnectReason": "",
    "stage": "dev",
    "connectedAt": 1696765340601,
    "requestTimeEpoch": 1696765715476,
    "identity": { "sourceIp": "1.145.209.9" },
    "requestId": "MeyLFHBTSwMF3_Q=",
    "domainName": "p1var95tx6.execute-api.region.amazonaws.com",
    "connectionId": "MexQhePZSwMCGmQ=",
    "apiId": "p1var95tx6"
  },
  "isBase64Encoded": false
}
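Because $disconnect is best-effort, cleanup probably should not rely on it alone. The sketch below pairs a hypothetical $disconnect handler with a sweep that removes entries older than the 2 hour connection limit described in the quotas section, since such a connection cannot still be open. The names `registry`, `onDisconnect` and `pruneStale` are assumptions, and the `Map` again stands in for a real data store.

```javascript
// In-memory stand-in for the store written to by $connect (illustration only).
const registry = new Map();

// Maximum connection duration for API Gateway Web Sockets: 2 hours.
const TWO_HOURS_MS = 2 * 60 * 60 * 1000;

// Hypothetical $disconnect handler. Deleting is idempotent, so a repeated or
// unexpected event does no harm.
const onDisconnect = async (event) => {
  const { connectionId, disconnectStatusCode, disconnectReason } = event.requestContext;
  const existed = registry.delete(connectionId);
  console.log('Disconnected', JSON.stringify({ connectionId, disconnectStatusCode, disconnectReason, existed }));
  // Nothing is returned to the client; this only reports success to API Gateway.
  return { statusCode: 200 };
};

// Safety net for missed $disconnect events: drop anything past the hard limit.
function pruneStale(store, nowMs) {
  let removed = 0;
  for (const [id, info] of store) {
    if (nowMs - info.connectedAt > TWO_HOURS_MS) {
      store.delete(id);
      removed += 1;
    }
  }
  return removed;
}
```

With DynamoDB, a similar effect could be achieved with an expiry attribute on each item rather than a sweep, but that is a design choice outside the scope of this sketch.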

$default Route

This is the Web Socket catch-all route. When a JSON message is sent across the Web Socket connection with no handler, this route handles it. The $default route will also handle ALL non-JSON traffic, which is sent through the Web Socket - so if you are building a service that transfers non-JSON data, this is the route you need to implement to handle your task. If the $default route is not defined then calling an invalid route will result in a forbidden error as shown here:

{
  "message": "Forbidden",
  "connectionId": "Mid3qegySwMCEhg=",
  "requestId": "Mifl6EhjSwMF3Kw="
}
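A $default handler is a natural place to tell clients that a message was not understood, rather than leaving them with a Forbidden error. The following is a minimal sketch under a few assumptions: the handler name `onDefault` and the reply shape are hypothetical, the route is configured to return a response, and the body is treated as plain text.

```javascript
// Hypothetical $default handler: classify whatever arrived and tell the
// client it was not routed anywhere more specific.
const onDefault = async (event) => {
  const raw = event.body ?? '';

  // JSON that matched no custom route and non-JSON traffic both end up here.
  let kind;
  try {
    JSON.parse(raw);
    kind = 'json-without-route';
  } catch (err) {
    kind = 'non-json';
  }

  return {
    statusCode: 200,
    body: JSON.stringify({ type: 'unhandled', kind, length: raw.length }),
  };
};
```

A service that really does transfer non-JSON data would replace the classification with its own parsing at this point.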

Custom Routes

Custom routes enable you to create a specific set of handlers based on the content of JSON messages sent through the web socket from connected clients. By default, the route selector uses $request.body.action, but you can set this to many different options. Read through the detailed documentation on the AWS API Gateway docs here to understand the nuances of how web sockets route messages to your custom Lambda code. There are a lot of flexible options.

Here is an example message and the integration event it produces. The simplest configuration uses $request.body.action to route to your Lambda function. What is interesting about the routeSelectionExpression is that you can combine multiple attributes into a concatenated key to create nested routes, e.g. ${request.body.service}/${request.body.action}, which produces a routeKey of "order/test" (shown in the JSON example below).

{
  "requestContext": {
    "routeKey": "order/test",
    "messageId": "MewPAdCASwMCIWw=",
    "eventType": "MESSAGE",
    "extendedRequestId": "MewPAHivSwMFWOg=",
    "requestTime": "08/Oct/2023:11:35:21 +0000",
    "messageDirection": "IN",
    "stage": "dev",
    "connectedAt": 1696764684191,
    "requestTimeEpoch": 1696764921301,
    "identity": { "sourceIp": "1.255.255.255" },
    "requestId": "MewPAHivSwMFWOg=",
    "domainName": "p1var95tx6.execute-api.region.amazonaws.com",
    "connectionId": "Mevp8c4gSwLDJJk=",
    "apiId": "p1var95tx6"
  },
  "body": "{ \n \"service\": \"order\",\n \"action\": \"test\",\n \"data\": {\n \"item\": \"value\"\n }\n}",
  "isBase64Encoded": false
}
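To build intuition for how a route key is derived from a message, here is a small local simulation. It is not API Gateway's implementation, just a sketch of the behaviour described above: evaluate the selection expression against the JSON body and fall back to $default when the body is not JSON, an attribute is missing, or no route matches. The function name `resolveRouteKey` is hypothetical, and the expression syntax it accepts is limited to the two forms used in this article.

```javascript
// Local simulation of route selection (illustration only, not AWS code).
function resolveRouteKey(expression, rawBody, definedRoutes) {
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch (err) {
    return '$default'; // non-JSON traffic always goes to $default
  }

  let failed = false;
  // Substitute both "$request.body.x" and "${request.body.x}" references.
  const key = expression.replace(/\$\{?request\.body\.([A-Za-z0-9_.]+)\}?/g, (match, path) => {
    const value = path
      .split('.')
      .reduce((node, part) => (node == null ? undefined : node[part]), body);
    if (typeof value !== 'string') failed = true;
    return String(value);
  });

  return !failed && definedRoutes.includes(key) ? key : '$default';
}

const routes = ['order/test', 'order/create'];
const message = JSON.stringify({ service: 'order', action: 'test', data: { item: 'value' } });

console.log(resolveRouteKey('${request.body.service}/${request.body.action}', message, routes));
// → order/test
console.log(resolveRouteKey('${request.body.service}/${request.body.action}', 'plain text', routes));
// → $default
```

Keep in mind that when no $default route is defined, the fallback case results in the Forbidden error shown earlier rather than a handler invocation.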

Summary of Lambda Response Modes

  • $connect: only cares about the statusCode returned by your Lambda response. All other data is ignored.
  • $disconnect: doesn't care about any response data - the client is already gone, so do your cleanup here!
  • $default: only cares about the body in the Lambda response, and even then, only when you have explicitly configured the route with returnResponse. The statusCode and headers returned in the Lambda response will be ignored.
  • custom: the same as the $default route - only the body matters, and only when the returnResponse configuration option is set to true.

API Gateway Web Socket Quotas and Limits

We can't complete the primer without listing the standard set of quotas and limits that apply to Web Sockets, which are critical for building production-ready solutions. A summary of the main critical ones is here. For the exhaustive list, see quotas and limits in the AWS documentation.

  • Message size is limited to a maximum of 128KB. Anything larger will result in the web socket being disconnected with a 1009 Message too big error.
  • Connection duration limit is 2 hours. After 2 hours, the web socket connection will be closed, and the client must reconnect.
  • Idle Connection timeout - 10 minutes. After 10 minutes of idle time, the web socket connection will be terminated, and the client must reconnect.
  • API Gateway integration processing limit is 29 seconds for a Web Socket message. When your backend integration takes longer than 29 seconds, the client will receive an error message like the following:
{
  "message": "Endpoint request timed out",
  "connectionId": "MiWsnfkgywMCF9A=",
  "requestId": "MiXn3HRWSwMFUHA="
}

Important note on timeouts: as with REST API Gateway integrations, API Gateway returns a timeout error to the Web Socket connection, but the backend integration may continue to process up to its own limit.

To test this further, I set up a Lambda route with a Lambda timeout of 2 minutes and made the function sleep for 45 seconds. After 29 seconds, the web socket client received the API Gateway request timeout error, but the Lambda itself continued and completed its execution, as expected.
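The limits above translate fairly directly into client-side guards. The sketch below is one possible shape, with several assumptions worth stating: 128KB is taken to mean 128 × 1024 bytes, any message sent by the client is assumed to count as activity for the idle timeout, and the safety margins and function names are arbitrary choices made for illustration.

```javascript
// Quotas from the list above, expressed as constants.
const MAX_MESSAGE_BYTES = 128 * 1024; // assumption: KB means 1024 bytes
const IDLE_TIMEOUT_MS = 10 * 60 * 1000;
const MAX_CONNECTION_MS = 2 * 60 * 60 * 1000;

// The size limit applies to bytes, not characters, so measure the UTF-8 encoding.
function fitsInOneMessage(text) {
  return new TextEncoder().encode(text).length <= MAX_MESSAGE_BYTES;
}

// True when it is time to send a keep-alive message before the idle timeout.
function needsHeartbeat(lastActivityMs, nowMs, safetyMarginMs = 60 * 1000) {
  return nowMs - lastActivityMs >= IDLE_TIMEOUT_MS - safetyMarginMs;
}

// True when the client should reconnect on its own terms, before the hard
// 2 hour limit closes the connection for it.
function shouldReconnect(connectedAtMs, nowMs, safetyMarginMs = 5 * 60 * 1000) {
  return nowMs - connectedAtMs >= MAX_CONNECTION_MS - safetyMarginMs;
}
```

A client could call these from a periodic timer, for example once every 30 seconds, and split or reject any payload that does not fit in one message.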

Summary

Web Sockets are a great bi-directional message channel enabling efficient front to back-end communications. They use a connection-oriented protocol, so you will need a strategy in your overall design to deal with critical notifications when the client is not connected to the web socket. This is quite an issue with apps built entirely for the mobile phone market where the networks are prone to drop-outs or moments of disconnections due to signal blockage. Design with connection drop-outs in mind, and with the limits in mind such as web socket disconnect timeouts (after 10 minutes of inactivity) or hard connection limit timeouts (2 hours maximum connection time).

A common use case for Web Sockets is one where backend services send broadcasts to all connected clients. This use case requires a back-end data store to hold the connection details needed to send the broadcasts out. If your solution is for receiving notifications about a back-end task you kicked off, you don't need to create a database. You can pass the connectionId from the request context (requestContext.connectionId in the events shown earlier) along with the task, and use it to send the job status updates to the client that started the back-end task.
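As a rough illustration of the second case, the sketch below builds the @connections URL for the client that started a task, using only fields from the request context. The helper names `connectionUrl` and `jobStatusMessage` and the message shape are hypothetical. It also assumes the default execute-api domain; with a custom domain name the path may differ. An actual POST to this URL has to be signed with AWS credentials, which is usually left to the AWS SDK.

```javascript
// Build the @connections URL for one client from the request context.
function connectionUrl(requestContext) {
  const { domainName, stage, connectionId } = requestContext;
  // Connection ids can contain characters such as "=", so encode them.
  return `https://${domainName}/${stage}/@connections/${encodeURIComponent(connectionId)}`;
}

// Hypothetical status update sent to the client as the job progresses.
function jobStatusMessage(jobId, status) {
  return JSON.stringify({ type: 'job.status', jobId, status });
}

const url = connectionUrl({
  domainName: 'p1var95tx6.execute-api.region.amazonaws.com',
  stage: 'dev',
  connectionId: 'Mevp8c4gSwLDJJk=',
});
console.log(url);
// → https://p1var95tx6.execute-api.region.amazonaws.com/dev/@connections/Mevp8c4gSwLDJJk%3D
```

The route handler that starts the task would hand these request context fields to the task, so the task can report back without looking anything up.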

Like all AWS Managed Service offerings, understand the quotas and limits that apply to the service and ensure your design considers these constraints as first-class citizens - this is the key to success when building with a Serverless First mindset.

Take into account the routes you need to define. You only need a single route to be defined for the WebSocket API Gateway to be initialised and deployed as a working entity. See the "A Note about Routes" section for more details. If you need to secure your web socket connections, you will need a $connect route defined, and you will need to add security controls to this API Gateway (security will be covered in a separate post).

More Reading

  • If you want to go deeper into learning about Web Sockets, there is a more technical description available here.
  • Learn more about registered WebSocket Sub Protocols here.
  • Allen Helton (AWS Serverless Hero) has also published a great series covering Web Sockets on his Ready Set Cloud Blog.