November 15, 2019
Ever since the inception of FaaS platforms, blogs and benchmarks measuring and analyzing "cold start" have emerged. It is no wonder that the most searched serverless-related term on Google is "cold start". However, the term "cold start" distracts us from what we really care about: invocation overhead. Cold starts are part of invocation overhead, but focusing solely on them is misleading. As developers, measuring cold start duration alone won't support our development decisions; understanding the exact overhead added to our services will, and it can guide our design.
There are quite a few benchmarking tools and blog posts covering FaaS platforms. However, after using these tools ourselves, we found that most of them are misleading, so we offer a few words of caution:
When developing our own FaaS platform we needed a reliable tool to benchmark both our platform and others - that's why we developed FaaSbenchmark (read more about it here). One of the reasons we decided to open-source FaaSbenchmark and launch FaaStest.com is to provide an accurate, professional way to benchmark FaaS platforms. FaaSbenchmark was built around a standard that takes into account all the elements of FaaS platform benchmarking, and the results are presented in a comprehensible way on FaaStest.com.
It's commonly believed that a "warm start" means the same container/sandbox is ready to receive a new connection - but that's not accurate. We are here to highlight the differences and explain what we believe the right terminology is.
A FaaS platform might have some or all of the invocation types above. Depending on the load pattern, we might encounter different ratios of invocation types. For example, consider a simple FaaS platform that keeps a container up for an hour after an invocation is finished. When benchmarking this provider, we might encounter a mix of warm and cold starts, with a ratio that results from the load pattern we are testing. In reality, most FaaS platforms probably use prediction algorithms and complex heuristics to optimize their invocation overheads. Optimizations often include partial/full container reuse, and in many scenarios the ratio of cold starts we spot is small.
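The example above can be sketched as a toy model. The code below is our own illustration, not any real provider's policy: a single-container platform that keeps a container warm for a fixed keep-alive window after each invocation, so the cold/warm ratio falls directly out of the arrival pattern.

```python
# Toy model (hypothetical policy): the platform keeps one container warm
# for KEEP_ALIVE seconds after each invocation finishes. Given invocation
# arrival times, count how many start cold vs. warm.

KEEP_ALIVE = 3600  # seconds; the one-hour keep-alive window from the example

def classify_invocations(arrival_times, keep_alive=KEEP_ALIVE):
    """Return (cold, warm) counts for a list of arrival timestamps."""
    cold = warm = 0
    warm_until = None  # the container stays warm until this timestamp
    for t in sorted(arrival_times):
        if warm_until is not None and t <= warm_until:
            warm += 1
        else:
            cold += 1
        warm_until = t + keep_alive  # keep-alive timer resets after each run
    return cold, warm

# A bursty load pattern: three bursts spaced two hours apart.
arrivals = [0, 10, 20, 7200, 7210, 14400]
print(classify_invocations(arrivals))  # -> (3, 3): each burst starts cold
```

Changing the arrival pattern (steady trickle vs. bursts) changes the ratio, which is exactly why a benchmark's load pattern shapes the mix of invocation types it observes.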
Another example of a large penalty that is not part of the traditional "cold start" is Azure bindings, specifically the blob output binding. After a function has finished, whatever is left in the blob output binding is dumped to blob storage. This happens AFTER the user code is done; it's not part of the cold start, but it is part of the invocation overhead.
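For readers unfamiliar with Azure Functions bindings, a blob output binding is declared in the function's function.json. The fragment below is a hypothetical example (the container path and binding names are ours); the runtime flushes whatever the function wrote to `outputBlob` only after the user code returns.

```json
{
  "bindings": [
    {
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": ["post"]
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputBlob",
      "path": "samples-output/{rand-guid}",
      "connection": "AzureWebJobsStorage"
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    }
  ]
}
```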
When talking about concurrency in FaaS platforms we should define three terms:
* Concurrent functions - this term is divided into two
* Concurrent invocations = the total amount of events - active functions
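The arithmetic behind the last definition can be made concrete. The numbers below are a hypothetical snapshot we made up for illustration:

```python
# Hypothetical snapshot of a platform's state (the numbers are invented):
total_events = 1200      # events that have arrived and not yet completed
active_functions = 950   # function instances currently executing user code

# Per the definition above:
concurrent_invocations = total_events - active_functions
print(concurrent_invocations)  # -> 250 events still waiting for an instance
```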
As the serverless ecosystem evolves, we believe our terminology should evolve with it. The latency we should talk about and measure is "Invocation Overhead".
Invocation overhead: the time it takes to call the user's function and return the response.
* In async functions, response time is not always relevant.
The illustration below depicts the full flow of a generic invocation. As you can see, there is a lot more going on than just the cold start. As FaaS platform users, it is very hard to accurately separate cold start overhead from warm start overhead without access to the platform's internals. The benchmark code runs as close as possible to the actual platform servers (for example, an AWS EC2 instance for Lambda) to minimize network latency, and the test code measures the actual duration inside the user function.
We consider the load time of user code as part of the cold start. In simple terms, the full invocation overhead will be calculated as such:
Invocation overhead = First Byte Response Time - Request Sent Time - Function Duration
Note that in order to accurately calculate the invocation overhead, we take the latest possible "Request Sent Time" by eliminating other time-consuming operations. These operations include, but are not limited to:
* DNS resolution
* TCP handshake
* TLS handshake
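The measurement above can be sketched in a few lines. This is our own illustration, not FaaSbenchmark's actual code; it assumes an HTTP-triggered function that reports its own wall-clock duration in a response header named "X-Function-Duration" (a header name we invented for the example). Establishing the connection before starting the timer removes DNS resolution and the TCP/TLS handshakes from the measurement.

```python
import http.client
import time

def invocation_overhead(request_sent, first_byte, function_duration):
    # Invocation overhead = First Byte Response Time - Request Sent Time
    #                       - Function Duration
    return (first_byte - request_sent) - function_duration

def measure(host, path):
    """Measure one invocation's overhead against a hypothetical endpoint."""
    conn = http.client.HTTPSConnection(host)
    # DNS resolve + TCP/TLS handshakes happen here, before timing starts.
    conn.connect()

    request_sent = time.monotonic()
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read(1)  # block until the first response byte arrives
    first_byte = time.monotonic()

    # Assumed: the test function reports its own duration (in seconds)
    # in a header we invented for illustration.
    duration = float(resp.getheader("X-Function-Duration", "0"))
    conn.close()
    return invocation_overhead(request_sent, first_byte, duration)
```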
In other scenarios, the trigger that invokes a function on other FaaS providers can be, for example, the UDP DNS packet or a second HTTP GET request. For our internal tests, when measuring concurrency, we use sleep functions. The benchmark results show the invocation overhead as described above.
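The sleep-function idea works like this: fire N requests at a function that only sleeps for a fixed time. If the platform runs them fully concurrently, total wall time stays close to the sleep duration; queueing shows up as extra time. The sketch below simulates this locally with a thread pool standing in for the platform (a setup we invented for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor

SLEEP = 0.2   # seconds each "function" sleeps
N = 16        # number of concurrent invocations

def sleep_function():
    time.sleep(SLEEP)
    return SLEEP

def measure_concurrency(n=N):
    """Fire n invocations at once; return total wall-clock time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n) as pool:
        list(pool.map(lambda _: sleep_function(), range(n)))
    return time.monotonic() - start

wall = measure_concurrency()
# With full concurrency, wall time is close to SLEEP rather than N * SLEEP.
print(f"{wall:.2f}s for {N} invocations of a {SLEEP}s sleep function")
```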
A deep understanding of FaaS platforms' performance will let you account for this significant factor when evaluating FaaS platforms and deciding whether one is the right choice for your project.
Other factors that you should take into account are:
If you've decided not to go the FaaS way (pun intended) because of performance, lack of application security, or insufficient visibility, Nuweba is the solution for you. Nuweba is a fully cloud-integrated, ultra-fast FaaS platform with advanced application security and deep visibility into your functions. Nuweba is 10x faster than current FaaS platforms. How much is 10x exactly? For now, you can check the screenshot below to see how much faster Nuweba is.