Better, Simpler Ops and Security with Serverless


ZibaSec is a cybersecurity company, and that extends to how we build our underlying application infrastructure. One of the fundamental decisions we made early on was to build our platform entirely on serverless technologies. Specifically, we set out to build an enterprise-grade offering that does not use any bare metal, virtual machines, or containers that we (ZibaSec) are directly responsible for maintaining. We’ll explain our reasoning.

Opting for Serverless

We bet that going the serverless route would be the best way to manage the constraints we describe below.

All of the foundational AWS services we would use (e.g., API Gateway, Lambda, SQS, and S3) were already FedRAMP authorized. A large portion of FedRAMP controls would be significantly easier to meet without any VM or container infrastructure: there is no need for Nessus or anti-virus if you have no operating systems to deal with. Applying and managing CIS Benchmark compliance is also much easier when all you have to deal with is the hypervisor (AWS Console) layer.

Engineering efforts could be focused (mostly) on software development for our products, thanks to lower compliance overhead and a straightforward deployment environment. With Function-as-a-Service mechanisms, the runtime environment is predictable in almost every way, which limits “works on my machine” problems.

Specifically, we use Terraform and Serverless Framework for our deployment stack, and it has worked out so far. Serverless Framework is really good at application-level deployments, meaning it handles all the wiring between the pieces needed to make an application run. However, it’s not great for supporting components (e.g., AWS Config, CloudWatch, and DNS), which is where we use Terraform.

Constraints

As a small company (only 3 of us in late January 2020), we faced a number of constraints in building an enterprise-grade SaaS application.

FedRAMP Compliance

At the start, we knew we were going to need to achieve FedRAMP compliance given our desire to enter the Federal market. Therefore, anything we built needed to use components that wouldn’t run afoul of our goal. This constrained which third-party services we could leverage as part of our architecture.

Engineering Team Size

We started with 2, and were at roughly 5 people by the time we started going through the FedRAMP audit process. This meant we couldn’t afford to spend time on more traditional cloud engineering efforts, whether EC2/VM-based or container-based (e.g., Kubernetes). We also couldn’t afford the overhead that comes with keeping VMs and containers compliant. Engineering needed to be focused on writing and shipping application code.

A Need For Scale

From the start, we needed to support a hyper-scale customer (Department of Justice) with a staff of roughly 200 thousand individuals. That meant we needed to build something that could handle that type of traffic. We didn’t have the luxury of designing for scale later on; we needed it right away.

Budget

We were, and still are, a seed-stage startup, and we’re not exactly swimming in excess cash. We needed to build something that was as cost-effective as possible while also letting us deal with the above constraints. However, we didn’t take this to mean “build everything ourselves”; it meant “build as little as possible”.

Challenges

Going the serverless route came with its own unique challenges. We’re still working out how best to address them over the long haul, but they are worth sharing.

Compliance

In the enterprise, serverless architectures are still a very new concept, and while many shops are currently running some workloads on various FaaS platforms, nobody (except us) is exclusively serverless. This makes compliance interesting: you need to take extra care with the narrative you share with auditors and regulators so that they can understand the differences. Some of these differences are nuanced (e.g., we can’t install anything at the OS level), and others are more pronounced (we have no IP addresses).

Local Development

When all the fundamental building blocks of your backend are FaaS- or PaaS-based, it becomes impossible to test locally in a way that is true to production. We currently address this with development AWS environments that almost precisely mimic production. We tried various approaches, but ultimately it was best to use the same Terraform and Serverless Framework code, applied to development AWS accounts.

Tracing and Observability

This was initially unaddressed. In 2020, there was no solid, serverless-friendly monitoring solution that was FedRAMP authorized. As much as I would have liked to use Honeycomb or other providers, they were either not FedRAMP authorized or only worked with VM-based workloads. As of this writing, AWS X-Ray is FedRAMP In Process, and we’ll likely standardize on it for our tracing needs.

Benefits

In contrast to the challenges, we gained many benefits. We feel these outweigh the challenges we faced (and still face).

Resilience and Disaster Recovery

Because we are not responsible for the underlying infrastructure, operating system, or network, we have a very high level of resiliency built into our platform. We can sustain large surges in traffic without much concern, and near-perfect uptime is only ever undercut by the occasional blip on AWS’s side.

Just as important, recovering from a disaster is also simple. We have read replicas of our databases in a different region, and we’re able to spin up our entire serverless application stack in that region in less than 30 minutes. This is something we test several times a year.

Cost Optimization

Our workloads occur in random batches and as a result we benefit from the ephemeral nature of AWS Lambda functions. When things are quiet, we pay almost nothing. When things really start picking up, we pay very little. It is this hyper optimization that allows us to pass on the savings to our customers, providing the best pricing in the industry.
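To make the pay-per-use point concrete, here is a rough sketch of the arithmetic behind Lambda billing, which charges per request and per GB-second of compute. The two price constants, the `Workload` type, and the sample workloads are assumptions for illustration only; they are not ZibaSec figures, and current AWS pricing should be checked before relying on them.

```python
from dataclasses import dataclass

# Illustrative prices only (assumptions): check current AWS Lambda pricing.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


@dataclass(frozen=True)
class Workload:
    invocations: int       # invocations per month
    avg_duration_s: float  # average billed duration per invocation, in seconds
    memory_mb: int         # memory configured for the function


def monthly_lambda_cost(w: Workload) -> float:
    """Estimate monthly Lambda cost in USD, ignoring the free tier and other services."""
    request_cost = w.invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    gb_seconds = w.invocations * w.avg_duration_s * (w.memory_mb / 1024)
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


# Hypothetical workloads: a quiet month and a busy one.
quiet = Workload(invocations=10_000, avg_duration_s=0.5, memory_mb=512)
busy = Workload(invocations=5_000_000, avg_duration_s=0.5, memory_mb=512)

print(f"quiet month: ${monthly_lambda_cost(quiet):.2f}")
print(f"busy month:  ${monthly_lambda_cost(busy):.2f}")
```

The cost scales linearly with invocations and drops to zero when nothing runs, which is the property a bursty, batch-style workload benefits from.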

Better Security

Being all-in on serverless has allowed us to achieve a strong security posture. AWS has published a deep dive into AWS Lambda security. Here are some of the benefits:

  • Every interaction initiated by an end user is encapsulated by a singular Lambda function invocation that is not shared with any other tenant.
  • Each Lambda invocation runs for an average of less than 1 second, and no more than 30 seconds. This means that even if an attacker were to somehow compromise the underlying systems they would have almost no time to go further.
  • We are certain that AWS is far more proficient at system patching and flaw remediation than we are, and by leveraging managed services we defer that responsibility to them.
  • Lambda workers run on a fleet of EC2 servers that we don’t manage.
  • Those workers run on Nitro: the Nitro Hypervisor, Nitro Security Chip, Nitro Cards, and Nitro Enclaves.
  • Workers have a maximum lease time of 8 hours.
  • Function code is zipped and encrypted with AES-GCM.
  • Functions created as containers are chunked and encrypted with a combination of AES-CTR, AES-GCM, and a SHA-256 MAC.
  • OS isolation is done with cgroups, dedicated namespaces, seccomp-bpf, iptables, and chroot.
  • /tmp is not accessible across execution environments.
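The short, bounded invocations described in the list above can be sketched as a minimal Python handler. This is an illustration under stated assumptions, not our production code: the event shape, the `SAFETY_MARGIN_MS` value, the placeholder work, and the `FakeContext` test stand-in are all hypothetical. The only Lambda-provided piece is the context object’s `get_remaining_time_in_millis()` method.

```python
import json
import time

# Hypothetical safety margin: stop work this many milliseconds before the timeout.
SAFETY_MARGIN_MS = 500


def handler(event, context):
    """Handle one request per invocation and keep no state between invocations."""
    items = json.loads(event.get("body") or "[]")
    processed = []
    for item in items:
        # get_remaining_time_in_millis() is provided by the Lambda context object.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break  # return partial results rather than being cut off at the timeout
        processed.append(str(item).upper())  # placeholder for real work
    body = {"processed": processed, "complete": len(processed) == len(items)}
    return {"statusCode": 200, "body": json.dumps(body)}


class FakeContext:
    """Local stand-in for the Lambda context object, for unit tests only."""

    def __init__(self, timeout_s: float = 30.0):
        self._deadline = time.monotonic() + timeout_s

    def get_remaining_time_in_millis(self) -> int:
        return max(0, int((self._deadline - time.monotonic()) * 1000))


response = handler({"body": json.dumps(["a", "b"])}, FakeContext())
print(response["body"])
```

A stand-in like `FakeContext` is only good for exercising handler logic; it does not reproduce the real execution environment, which is why true-to-production testing still happens in AWS accounts.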

Going all-in on serverless isn’t for every use case, team, or organization. However, as a fast-moving cybersecurity company, we found it to be an exceptionally useful way to build a product quickly and in a compliant way, while boosting our overall security posture. We haven’t figured out everything, and probably never will, but we’ve learned a lot over the past 18 months and we hope to continue sharing our findings with you for years to come.


If you’d like to learn more about ZibaSec, reach out to us at https://zibasec.com or directly via email at info@zibasec.io.

