Meet Charlize, our Slack bot intern
How to get started with SlackOps to improve DevOps flows in a small team working within many AWS environments.
Amazon EC2
Amazon API Gateway
Amazon DynamoDB
AWS Tools and SDKs
Boto3 (AWS SDK for Python)
Slack
Python
We launched Chaos Gears as a bunch of nerds eager to help companies adopt cloud solutions as an accelerator for their innovations. In doing so, we automate as much as possible, to save time and stay DRY.
Moving from our prior jobs was not only a technological shift but, probably more importantly, a shift in culture. We realized that the main enemy in just about everything we do is time itself. Every hour saved on routine work can be spent on innovation and internal development, which improves your chances of success.
As a startup that, among other things, augments external teams in different kinds of AWS projects, we aim to automate even tiny workflows, tasks and actions, as long as they are repetitive. We continuously improve and expand those automations.
This approach isn’t a silver bullet, but being able to pinpoint the best candidates for automation goes a long way. And so we did.
Building a serverless Slack bot
Humans are always the bottleneck in a process, no matter how big the team is. Every company working with technology struggles with failures, and we noticed that we kept repeating the same “remedy” tasks whenever something broke.
Instead of having to ask colleagues about those AWS environments, or asking others to invoke a Lambda via API, restart an instance or do some other job, our team decided to combine AWS services with Slack and build a bot that gives us easy access to the automations we put in place.
We’ve called her Charlize — and she was bound to become our new synthetic team member.
Registering a Slack app
Creating a Slack app (which will end up being a bot) begins with 3 easy steps:
Go to https://api.slack.com/apps and click the big green Create new app button. Enter a name for your app and click the next prominent green button.
The Basic Information link on the left-hand side of your app’s settings page contains information you’ll need, such as the CLIENT_ID and CLIENT_SECRET needed to authenticate all requests made by your app.

Then it’s time to set up a Redirect URL for your app. This is the endpoint to which Slack passes a unique temporary code whenever a user installs your app in their workspace. Your server then sends this code back to Slack, along with your CLIENT_ID and CLIENT_SECRET, exchanging it for an access token.
The Redirect URL must be accessible via https (i.e. TLS). Slack (incorrectly) insists that localhost by itself is not secure. For local development/testing, this means you will need either:
- a public proxy (something as simple as redirectmeto.com can do the trick),
- a backend supporting TLS connections,
- or a public tunnel which supports TLS connections (software like Ngrok).
We used API Gateway as the entry point to our AWS backend, with a Lambda handler that processes all install events:
functions:
  install:
    name: ${self:custom.app_function}-install
    description: Install Slack integration
    handler: gear_install.handler
    role: LambdaRole
    environment:
      tablename: ${self:custom.app_function}-tokens-${self:custom.stage}
    events:
      - http:
          path: /install
          method: get

While there are many OAuth scopes you could request for a full-blown integration with Slack, simple integrations typically get by with just the incoming-webhook access scope. Keep in mind, though, that it only allows for unidirectional communication.
The incoming-webhook scope is designed to allow you to request permission to post content into the user’s Slack workspace. It intentionally does not grant read access, making it perfect for services that want to send posts or notifications to Slack workspaces that might not want to give read access to messages.
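To illustrate the unidirectional nature of that scope, here is a minimal sketch of posting through an incoming webhook using only the standard library. The function names are ours and the webhook URL is a placeholder; the real URL is the one Slack issues when the scope is granted.

```python
import json
import urllib.request


def build_webhook_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_webhook(webhook_url, text):
    """Send the message and return Slack's plain-text reply."""
    with urllib.request.urlopen(build_webhook_request(webhook_url, text)) as resp:
        return resp.read().decode("utf-8")


# Placeholder URL for illustration only:
# post_to_webhook("https://hooks.slack.com/services/EXAMPLE", "Deployment finished")
```

Note that nothing comes back from the workspace apart from the HTTP reply, which is why a bot that has to react to messages needs more than this scope.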
We, however, are going to build an installable Slack App that is not tied to a specific workspace — and we will be requesting the actually needed scopes in a slightly different (i.e. installation) flow, explained further down.
For now, this is our starter config:

When a user clicks your Add to Slack button, your CLIENT_ID gets sent along with the request to Slack’s servers. Slack then redirects the user back to your Redirect URL, along with a single-use code parameter in the query that we need to process.
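The two ends of that round trip can be sketched as follows. This assumes the classic OAuth flow (the authorize endpoint that pairs with oauth.access) and an API Gateway proxy event on the receiving side; the function names and example values are ours.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Classic OAuth authorize endpoint (assumption: the app uses the pre-v2 flow)
AUTHORIZE_URL = "https://slack.com/oauth/authorize"


def build_add_to_slack_url(client_id, scopes, redirect_uri, state=None):
    """Build the link behind an Add to Slack button."""
    params = {
        "client_id": client_id,
        "scope": ",".join(scopes),
        "redirect_uri": redirect_uri,
    }
    if state:
        # Optional opaque value echoed back by Slack, useful against CSRF
        params["state"] = state
    return AUTHORIZE_URL + "?" + urlencode(params)


def extract_code(api_gateway_event):
    """Return the temporary code from the redirect request, or '' if absent."""
    # API Gateway sets queryStringParameters to None when there is no query string
    query = api_gateway_event.get("queryStringParameters") or {}
    return query.get("code", "")
```

The code returned by extract_code() is short-lived and single-use, so it should be exchanged for a token right away.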
We’ve coded a Lambda function which uses AWS Systems Manager (SSM) Parameter Store to keep the CLIENT_ID and CLIENT_SECRET, well, secret and centralized. It sends a request with them and the code to Slack’s oauth.access RPC endpoint. Once we have that token, we store it in an Amazon DynamoDB table via put_dynamo_items().
import datetime
import json
import logging
import os
import urllib.parse
import urllib.request

import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    logger.info("------Event: {0}".format(event))
    # API Gateway passes None here when the request has no query string
    query = event.get('queryStringParameters') or {}
    code = query.get('code', '')
    tablename = os.environ['tablename']
    token_response = get_token(code, tablename=tablename)
    return token_response


def get_token(code, tablename):
    logger.info("------Getting token..")
    if code == '':
        logger.error("------Code value is null")
        output = {
            "statusCode": 400,
            "body": "---Error. Code value is null"
        }
        return output
    else:
        # Exchange the temporary code for an access token
        url = 'https://slack.com/api/oauth.access'
        payload = {
            "client_id": get_param('CLIENT_ID'),
            "client_secret": get_param('CLIENT_SECRET'),
            "code": code
        }
        data = urllib.parse.urlencode(payload).encode("utf-8")
        req = urllib.request.Request(url)
        info = urllib.request.urlopen(req, data)
        logger.info("------Getting info from url: %s", info.geturl())
        response = json.loads(info.read().decode('utf-8'))
        if not response['ok']:
            logger.error("------Problem with the token: %s", response['error'])
            output = {
                "statusCode": 400,
                "body": response['error']
            }
            return output
        else:
            logger.info("------Putting token into the DynamoDB table: %s", tablename)
            table = SlackArmyDynamo(tablename)
            table.put_dynamo_items(
                item=response['user_id'],
                team=response['team_name'],
                team_id=response['team_id'],
                token=response['access_token'],
                bot=response['bot']
            )
            output = {
                "statusCode": 200,
                "body": "Token validated. Put into DynamoDB"
            }
            return output


def get_param(item):
    # Read a (possibly encrypted) value from SSM Parameter Store.
    # Returns None when the lookup fails.
    try:
        ssm = boto3.client('ssm')
        parameter = ssm.get_parameter(Name=item, WithDecryption=True)
        param = str(parameter['Parameter']['Value'])
        return param
    except ClientError as err:
        logger.critical("----Client error: {0}".format(err))
        logger.critical("----HTTP code: {0}".format(err.response['ResponseMetadata']['HTTPStatusCode']))


# gearDynamo is our DynamoDB helper base class (defined elsewhere, not shown here)
class SlackArmyDynamo(gearDynamo):
    def put_dynamo_items(self, item, team, team_id, token, bot):
        try:
            response = self.table.put_item(
                Item={
                    'team_id': team_id,
                    'team': team,
                    'customer_id': item,
                    'date': datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                    'bot': bot,
                    'token': token
                }
            )
            return response
        except ClientError as err:
            logger.critical("----Client error: {0}".format(err))
            logger.critical("----HTTP code: {0}".format(err.response['ResponseMetadata']['HTTPStatusCode']))
In other words, this is more or less a typical OAuth flow: we simply exchange the temporary code for a proper access token, which comes in a response with the following shape:
{
  "ok": true,
  "access_token": "TOKEN",
  "scope": "identify,bot",
  "user_id": "USER_ID",
  "team_name": "Chaos Gears",
  "team_id": "TEAM_ID",
  "bot": {
    "bot_user_id": "BOT_USER_ID",
    "bot_access_token": "BOT_ACCESS_TOKEN"
  }
}
Now we can use this token to make API calls.
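As a rough sketch of what such a call could look like with the standard library: the helper below builds a form-encoded request for any Web API method and passes the token in the Authorization header. The helper names are ours, and the token in the usage comment is a placeholder.

```python
import json
import urllib.parse
import urllib.request

SLACK_API = "https://slack.com/api/"


def build_api_request(method, token, **params):
    """Build a form-encoded POST for a Slack Web API method such as auth.test."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(
        SLACK_API + method,
        data=data,
        headers={"Authorization": "Bearer " + token},
        method="POST",
    )


def call_api(method, token, **params):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_api_request(method, token, **params)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (placeholder token): check that the stored bot token still works
# call_api("auth.test", "BOT_ACCESS_TOKEN")
```

Every Web API response carries an "ok" flag, so callers can check it the same way the install handler does.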
If you encounter a code_expired error response instead, you could try:
- reinstalling the app from your Slack configuration: within OAuth & Permissions, click Reinstall App,
- reinstalling your app in your workspace via Add to Slack.
Integrating our bot with Slack’s Events API
First of all, Charlize has to somehow become aware of the things happening in a Slack workspace (i.e. events) in order to be able to react to them. Slack provides several protocols that allow us to achieve that, including real-time bidirectional communication. However, those require a permanently running service, while the purpose of Charlize was to occasionally support a relatively small team of engineers.
The word “occasionally” is a good indicator that you might want to consider an asynchronous operating model instead. Slack accommodates this case via an Events API which triggers your selected endpoint(s) via HTTP POST requests whenever any of the events your app is actively subscribed to happens.
We opted for precisely this path — it ties in neatly with our existing infrastructure and allows us to deal with the “occasional” nature of those calls via serverless functions.

As such, in our case, the Request URL points to an API Gateway endpoint sitting in front of our Lambdas, which we will cover in a moment.
This endpoint must correctly echo the challenge sent to it by Slack in order for the installation to succeed.
Our gear_events Lambda performs verification before calling any other internal “action” (another serverless function) and responds with the value of challenge.
  events:
    name: ${self:custom.app_function}-events
    description: Provides verification before invoking internal functions
    handler: gear_events.handler
    role: LambdaRole
    environment:
      tablename: ${self:custom.app_function}-tokens-${self:custom.stage}
      function_ec2: ${self:custom.app_function}-ec2-actions
      table_instances: ${self:custom.t_instances}
      region: ${self:custom.region}
    events:
      - http:
          path: /events
          method: post
And here’s the part of gear_events which is responsible for echoing challenge:
def get_challenge(body):
    logging.info("------Checking event challenge value from slack")
    if body['type'] == 'url_verification':
        logging.info("------Sending challenge back to Slack")
        response = {
            "statusCode": 200,
            "body": body['challenge']
        }
        return response
We’ve opted to subscribe to the app_mention event, so whenever Charlize gets mentioned (i.e. someone writes @charlize in a Slack channel), Slack will notify our endpoint about that:
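For orientation, here is a sketch of what such a notification looks like once it arrives through an API Gateway proxy integration, and how the interesting parts could be pulled out of it. The sample values are placeholders and extract_mention() is a hypothetical helper, not part of our bot.

```python
import json

# Trimmed-down example of an Events API delivery wrapped in an API Gateway event
SAMPLE_EVENT = {
    "body": json.dumps({
        "token": "VERIFICATION_TOKEN",
        "team_id": "TEAM_ID",
        "type": "event_callback",
        "event": {
            "type": "app_mention",
            "user": "U_SENDER",
            "text": "<@U_BOT> find running EC2 instances",
            "channel": "C_CHANNEL",
            "ts": "1515449522.000016",
        },
    })
}


def extract_mention(api_gateway_event, bot_user_id):
    """Return channel, user and command for an app_mention, else None."""
    body = json.loads(api_gateway_event["body"])
    if body.get("type") != "event_callback":
        return None
    event = body.get("event", {})
    if event.get("type") != "app_mention":
        return None
    # Everything after the first "<@BOT_ID>" mention is treated as the command
    mention = "<@{0}>".format(bot_user_id)
    _, _, command = event.get("text", "").partition(mention)
    return {
        "channel": event["channel"],
        "user": event["user"],
        "command": command.strip().lower(),
    }
```

The bot is addressed in the message text by its user ID rather than its display name, which is why the bot_user_id stored during installation matters later on.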

In essence, we wanted mentions like…

… to result in the bot recognizing them as commands and executing the respective actions necessary to fulfil a given request.

Let’s walk through some of the code necessary to get that going. First up is our Lambda handler which receives event notifications from Slack and defines the flow:
import json
import logging
import os
import re
import urllib.parse
import urllib.request

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    logger.info("------Event: {0}".format(event))
    body = json.loads(event['body'])
    tablename = os.environ['tablename']
    region = os.environ['region']
    t_instances = os.environ['table_instances']
    function_ec2 = os.environ['function_ec2']
    # Regular event: verify the token, then route the command
    if token_verification(body)['statusCode'] == 200 and body['type'] != 'url_verification':
        response = {
            "statusCode": 200
        }
        data = get_service_action(body, tablename, t_instances)
        if len(data['data']) > 0:
            logging.info("------Command successfully extracted")
            if data['data'][0] == 'ec2':
                function = Lambda(region)
                logging.info("------Invoking Lambda function: %s", function_ec2)
                function.invoke_function(functioname=function_ec2, payload=data)
                return response
            else:
                logging.info("------AWS service not bound with any function")
                return response
        else:
            logging.info("------Problem with extraction of the data")
            return response
    # URL verification: echo the challenge back to Slack
    elif token_verification(body)['statusCode'] == 200:
        logging.info("------Verification of the url slack challenge")
        response = get_challenge(body)
        return response
Before we actually handle anything, we need to check whether the right token values have been given to us in the request payload coming through API Gateway. In other words, we’re checking whether the message we got legitimately comes from Slack’s Events API.
def token_verification(body, param='VERIFICATION_TOKEN'):
    logging.info("------Checking Verification Token")
    if body['token'] != get_param(param):
        raise ValueError('InvalidToken')
    else:
        response = {
            "statusCode": 200,
            "body": "TokenVerified"
        }
        return response
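Slack has since deprecated verification tokens in favour of signing secrets, so an alternative worth considering is to check the request signature instead. The sketch below is not what our bot does; it assumes the signing secret is available to the function, and that the raw request body plus the X-Slack-Request-Timestamp and X-Slack-Signature headers are read from the API Gateway event.

```python
import hashlib
import hmac
import time


def verify_slack_signature(signing_secret, timestamp, raw_body, signature,
                           max_age=300, now=None):
    """Return True if the signature matches and the request is recent."""
    now = time.time() if now is None else now
    try:
        # Reject old requests to limit replay attacks
        if abs(now - int(timestamp)) > max_age:
            return False
    except (TypeError, ValueError):
        return False
    # Slack signs the string "v0:<timestamp>:<raw body>" with HMAC-SHA256
    basestring = "v0:{0}:{1}".format(timestamp, raw_body).encode("utf-8")
    digest = hmac.new(signing_secret.encode("utf-8"), basestring, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking information through timing
    return hmac.compare_digest(expected.encode("utf-8"), (signature or "").encode("utf-8"))
```

The check has to run against the body exactly as received, before any JSON parsing, because re-serializing the payload can change the bytes being signed.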
Next we check whether the message we got in Slack actually is a command we understand and can translate into a known action. Depending on the action, we also search through EC2 instance metadata in a DynamoDB table containing information about EC2 instances deployed in our clients’ environments (which helps us simplify operations on them).
def get_service_action(body, tablename, t_instances):
    message = body['event']['text']
    logging.info("------Got the message from Slack: '%s'", message)
    # Look the team up once and reuse the stored bot credentials
    bot = get_team_id(body, tablename)['bot']
    botUserId = bot['bot_user_id']
    botAccessToken = bot['bot_access_token']
    # Everything after the "<@BOT_USER_ID>" mention is treated as the command
    command = re.split(re.escape('<@' + str(botUserId) + '>'), message)[1].lower()
    logging.info("------Extracted command: '%s'", command)
    temp, flag = action_selector(command, t_instances)
    response = {
        "command": command,
        "bot_user_id": botUserId,
        "bot_access_token": botAccessToken,
        # sendResponse() reads the channel from body['event'], so pass the event along
        "event": body['event'],
        "data": temp
    }
    logging.info("------Bot %s is mentioned in: %s", botUserId, message)
    if len(temp) == 0 and flag == '1':
        text = "Excuse me Sir, I didn't understand the command: " + command
    elif len(temp) == 0 and flag == '5':
        text = "Excuse me Sir. Instance you've mentioned is not recognized"
    elif len(temp) == 0 and flag == '2':
        text = "Excuse me Sir. No AWS service found in the command"
    else:
        text = "Hello Sir. I've got the command: " + str(command)
    sendResponse(response, text, data=command)
    return response


def get_team_id(body, tablename):
    table = gearDynamo(tablename)
    logging.info("------Getting info from DynamoDB")
    item = table.get_item('team_id', body['team_id'])
    return item['Item']
sendResponse() is then responsible for actually invoking Slack’s RPC API (chat.postMessage) to send Charlize’s response to the channel the request originated on.
At this stage, the message we send is not the final response yet. It’s merely an ACK of sorts: whether Charlize understood the command or not, and whether it’s going to be processed or not.
def sendResponse(body, text, data):
    params = {
        # Slack expects attachments as a JSON-encoded string in form-encoded requests
        "attachments": json.dumps([{
            "title": "Charlize's response",
            "author_name": "ChaosGears",
            "text": text,
            "color": "#2eb886"
        }]),
        'token': body['bot_access_token'],
        'channel': body['event']['channel'],
    }
    url = 'https://slack.com/api/chat.postMessage'
    logging.info("------Requesting: '%s'", url)
    payload = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(url)
    info = urllib.request.urlopen(req, payload)
    logging.info("------Getting info from url: %s", info.geturl())
    response = json.loads(info.read().decode('utf-8'))
    logging.info("------Slack response: %s", response)
Finally, if our handler() gets any further data to process, it passes that along to other action Lambdas asynchronously. We use a helper class to wrap and simplify their invocation.
class Lambda(object):
    def __init__(self, region, service='lambda'):
        try:
            self.region = region
            self.client = boto3.client(service, region_name=self.region)
        except ClientError as err:
            logging.error("------ClientError: %s", err)

    def invoke_function(self, functioname, payload, invoke_type='Event'):
        # InvocationType 'Event' means asynchronous, fire-and-forget invocation
        try:
            self.client.invoke(FunctionName=functioname, InvocationType=invoke_type, Payload=json.dumps(payload))
        except ClientError as err:
            logging.error("------ClientError: %s", err)
In essence, we had appropriate code paths for numerous actions on EC2 instances of varying states:
{
  "services": [
    {
      "name": "ec2",
      "actions": ["check", "stop", "kill", "terminate", "restart", "reboot", "find"],
      "states": ["stopped", "running", "terminated"]
    }
  ]
}
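To give a feel for how a definition like this can drive command recognition, here is a simplified sketch that matches the words of a command against the known services, actions and states. parse_command() is a hypothetical stand-in for illustration; our actual selector also consults instance metadata and is not shown here.

```python
CONFIG = {
    "services": [
        {
            "name": "ec2",
            "actions": ["check", "stop", "kill", "terminate", "restart", "reboot", "find"],
            "states": ["stopped", "running", "terminated"],
        }
    ]
}


def parse_command(command, config=CONFIG):
    """Return the matched service, action and state, or None if unrecognized."""
    words = command.lower().replace(",", " ").split()
    for service in config["services"]:
        if service["name"] not in words:
            continue
        # Pick the first word that is a known action or state for this service
        action = next((w for w in words if w in service["actions"]), None)
        state = next((w for w in words if w in service["states"]), None)
        if action is None:
            return None
        return {"service": service["name"], "action": action, "state": state}
    return None
```

A plain keyword match like this is deliberately naive: it ignores word order and would need extra rules before being trusted with destructive actions such as terminate.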
But Charlize isn’t really chatty about anything else:

@Charlize, do something for me
With our gateway handler in place and able to translate recognized commands into actions, it’s time we took a closer look at how we implement such action handlers.
  action-ec2:
    name: ${self:custom.app_function}-ec2-actions
    description: Provides actions regarding EC2 service
    handler: gear_ec2.handler
    environment:
      regions: ${self:custom.regions}
      tagkey: ${self:custom.tagkey}
      tagvalue: ${self:custom.tagvalue}
      table_roles: ${self:custom.t_roles}
      table_instances: ${self:custom.t_instances}
      dest_account: ${self:custom.dest_account}
    role: EC2Role
The function uses an EC2Role definition which allows it to assume a cross-account IAM role whenever we want to do something on a customer’s account. Our IAM roles, along with account IDs and customer metadata, are stored in DynamoDB, which we query to get the name of the respective role.

Pretty simple IAM role for an example action Lambda.
Policies:
  - PolicyName: GearSlack-Army-EC2
    PolicyDocument:
      Version: '2012-10-17'
      Statement:
        - Effect: Allow
          Action:
            - sts:AssumeRole
          Resource:
            - arn:aws:iam::${self:custom.dest_account}:role/${self:custom.dest_rolename}
        - Effect: Allow
          Action:
            - dynamodb:PutItem
            - dynamodb:GetItem
          Resource:
            - arn:aws:dynamodb:${self:custom.region}:*:table/${self:custom.app_function}-roles-${self:custom.stage}
            - arn:aws:dynamodb:${self:custom.region}:*:table/${self:custom.app_function}-instances-${self:custom.stage}
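With sts:AssumeRole granted, switching into a customer account boils down to one STS call. The following is a minimal sketch under our own naming: the helper takes an STS client as an argument so it can be exercised without AWS access, and the session name is an arbitrary label.

```python
def assume_role_credentials(sts_client, account_id, role_name, session_name="charlize"):
    """Assume a cross-account role and return keyword arguments for boto3.client()."""
    role_arn = "arn:aws:iam::{0}:role/{1}".format(account_id, role_name)
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    # Temporary credentials, valid only until the session expires
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


# Usage inside a Lambda (boto3 ships with the AWS Lambda Python runtime):
#   import boto3
#   kwargs = assume_role_credentials(boto3.client("sts"), dest_account, dest_rolename)
#   ec2 = boto3.client("ec2", region_name=region, **kwargs)
```

The role in the destination account must also trust the account the Lambda runs in, otherwise the AssumeRole call is denied regardless of this policy.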
The action Lambda is able to find EC2 instances and act on them within the right environment and with the right permissions.
Once the action is performed, we finish up with a response sent to Slack:
def sendResponse(body, text, data):
    attachment = {
        "title": "Charlize is saying:",
        "text": text,
        "color": "#2eb886"
    }
    # Only add the "Response" field when the action produced some output
    if data != '':
        attachment["fields"] = [{
            "title": "Response",
            "value": data,
            "short": False
        }]
    params = {
        # Slack expects attachments as a JSON-encoded string in form-encoded requests
        "attachments": json.dumps([attachment]),
        'token': body['bot_access_token'],
        'channel': body['event']['channel']
    }
    url = 'https://slack.com/api/chat.postMessage'
    logging.info("------Requesting: '%s'", url)
    payload = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(url)
    info = urllib.request.urlopen(req, payload)
    logging.info("------Getting info from URL: %s", info.geturl())
    response = json.loads(info.read().decode('utf-8'))
    logging.info("------Slack response: %s", response)
With all of that implemented and running, we can now ask Charlize:
- “@charlize find number of running EC2 instances”
- “@charlize check number of stopped EC2 instances”
- “@charlize find running EC2 instances”
- “@charlize find running EC2 instances in customer environment”
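As an illustration of what sits behind a request like the first one, the sketch below tallies instances by state from the response pages of EC2's describe_instances call. The function names and the reply wording are ours, and the pages are passed in so that the logic can be tried without AWS access.

```python
from collections import Counter


def count_instances_by_state(describe_instances_pages):
    """Tally instances per state across describe_instances response pages."""
    counts = Counter()
    for page in describe_instances_pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                counts[instance["State"]["Name"]] += 1
    return counts


def format_reply(counts, state):
    """Build the text of the reply sent back to the channel."""
    return "Found {0} {1} EC2 instance(s)".format(counts.get(state, 0), state)


# With boto3, the pages could come from a paginator:
#   pages = ec2.get_paginator("describe_instances").paginate()
#   reply = format_reply(count_instances_by_state(pages), "running")
```

Paginating matters in larger accounts, where a single describe_instances response may not contain every instance.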


Until next time, Charlize
We’ve shown one task that Charlize is capable of doing and which saves us time. In our case, quite a lot of it. Obviously this is just one potential idea for how you can integrate your AWS environments with Slack in order to boost your productivity.
SlackOps/ChatOps aren’t a new thing per se. We’ve used solutions akin to Charlize ever since the days of IRC, and even earlier, and they are certainly here to stay, especially with the advent of natural language processing breakthroughs driving a new generation of AI that is bound to revolutionize the way we interact with “bots”. We haven’t even touched on this here, or on all the possibilities it brings.
That said, start small. Start with a simple task like we did, and then evolve your own Charlize-bot with new “skills”. You’re going to have a lot of fun and gain a new team member capable of performing tasks the humans in your team are tired of.