csci-5802-tooldemo-flamegraphs created by GitHub Classroom
| Description | Use cases | Results | Setup | Run | Evaluation |
Flame graphs are for you if you’re doing something at scale - a website, server, program, whatever. If something is done at a large enough scale, the time spent finding and fixing inefficiencies pays dividends at that same scale. A program also benefits from flame graph profiling if it is simply very computationally intensive - even if it isn’t scaled out in the typical distributed, web-based fashion.
Flame graphs are something you use after you’ve written the code and have it running. They are not very helpful during design, development, and testing. By using profilers and tools like flame graphs you can develop with the philosophy “code now, optimize later”. You rarely know what is actually an optimization during development anyway. Another benefit of this philosophy is that when optimization is not a priority during design and development, “clever” tricks to “speed up” the program can stay out of the code (they often do little besides confuse future devs anyway).
Snap, Inc. (Snapchat’s parent company) reportedly spends $1 billion USD annually on AWS infrastructure.
Salesforce reportedly spends $400 million USD annually on AWS infrastructure.
Other larger cloud users like Google, Amazon, and Netflix do not publish statistics on this.
Basically, infrastructure, especially cloud infrastructure, is not cheap, even though it effectively eliminates hardware and IT staff costs. Companies spend a lot on infrastructure, and small optimizations can have resounding effects, because optimizations = faster code = lighter server loads = smaller/fewer server instances.
If a developer, with the right tools, can cause a 1% increase in efficiency, they can potentially save the company millions of dollars a year: at the spending levels quoted above, 1% works out to roughly $4 million to $10 million annually. Not only will the developer probably see a nice bonus that year, but a change like this could also have a positive effect on the company’s stock price (which the developer probably holds some of) - a win-win scenario.
You might be thinking that a 1% efficiency increase across the board sounds far-fetched. And maybe you’re right. But imagine there is a library (internal or external) used by several services throughout the company. A 10% optimization of this library could easily have a cascading effect across the organization.
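The arithmetic behind these estimates can be sketched in a few lines. This is only a back-of-the-envelope model: it assumes infrastructure cost scales linearly with compute used, and the 10% share of compute attributed to the shared library is a made-up number for illustration. The spend figures are the ones quoted above.

```python
def annual_savings(annual_spend_usd: float, efficiency_gain: float) -> float:
    """Estimated yearly savings, assuming cost scales linearly with compute used."""
    return annual_spend_usd * efficiency_gain


def overall_gain(library_share: float, library_speedup: float) -> float:
    """Overall efficiency gain when only one library's share of the work gets cheaper.

    library_share: fraction of total compute spent inside the library (hypothetical).
    library_speedup: fraction of that library work removed by the optimization.
    """
    return library_share * library_speedup


if __name__ == "__main__":
    # Spend figures come from the text above; the library share is an assumption.
    for name, spend in [("Snap", 1_000_000_000), ("Salesforce", 400_000_000)]:
        saved = annual_savings(spend, 0.01)
        print(f"{name}: a 1% gain saves about ${saved:,.0f} per year")

    gain = overall_gain(library_share=0.10, library_speedup=0.10)
    print(f"A 10% faster library at 10% of compute gives {gain:.1%} overall")
```

Under these assumptions, a 10% improvement in a library that accounts for a tenth of all compute is already the 1% company-wide gain discussed above.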
In many unoptimized and unprofiled services, there is likely plenty of low-hanging fruit: optimizations giving 10%+ efficiency gains that take less than an hour of a developer’s time to discover and fix.
After that long-winded rant about infrastructure costs and optimization, the takeaway is this: you should profile and use flame graphs because it’s easy and the savings are worthwhile and significant. Something that takes less than an hour can have a payoff far beyond the cost of the time spent.
If you have a web server or service, batch processing system, speed-critical system, or anything that consumes a lot of CPU time, you should try flame graph profiling.
If much of your program’s work is done by external libraries, it’s worth ensuring you’re using them efficiently. It’s also not hard to imagine that a library might have an inefficiency that hasn’t been revealed until now because the library is being used at a scale it has not seen before. If this is the case, find the library’s inefficiency, fix it, and submit a pull request.
Some kinds of libraries come with recommended usage patterns that you should follow; these include JSON parsers, serialization libraries, string functions, and network/communication libraries.
A frequent source of inefficiency is not treating a library object as a resource. Suppose you need to get some data, do some work on it, then send the result over the network. The network communication requires serializing the data. A frequent but inefficient pattern is to construct a new serializer object every time you do this. You end up with many serializers across instances, threads, and method calls that all do mostly the same thing. A much more efficient solution is to create a single static serializer and reuse it.
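A minimal sketch of the two patterns, using Python’s standard `json.JSONEncoder` as a stand-in for whatever serializer your service actually uses. The function names are hypothetical, and the sketch assumes the serializer keeps no per-call state, so sharing one instance is safe; check that for the real library before sharing it across threads.

```python
import json
import timeit

# Built once at import time and reused by every call (the "static" resource).
# Assumption: the encoder holds no per-call state, so sharing it is safe.
_SHARED_ENCODER = json.JSONEncoder(sort_keys=True, separators=(",", ":"))


def serialize_fresh(record: dict) -> bytes:
    """Inefficient pattern: construct a new serializer on every call."""
    encoder = json.JSONEncoder(sort_keys=True, separators=(",", ":"))
    return encoder.encode(record).encode("utf-8")


def serialize_shared(record: dict) -> bytes:
    """Preferred pattern: reuse the single module-level serializer."""
    return _SHARED_ENCODER.encode(record).encode("utf-8")


if __name__ == "__main__":
    record = {"user": 42, "tags": ["a", "b"], "score": 0.5}
    # Both patterns must produce identical bytes; only the cost differs.
    assert serialize_fresh(record) == serialize_shared(record)
    fresh = timeit.timeit(lambda: serialize_fresh(record), number=20_000)
    shared = timeit.timeit(lambda: serialize_shared(record), number=20_000)
    print(f"fresh: {fresh:.3f}s  shared: {shared:.3f}s")
```

How much the shared version saves depends on how expensive construction is for the serializer in question. For heavyweight serializers the constructor can dominate, and that is exactly the kind of hotspot a flame graph makes visible.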
There is no catch-all tool for flame graphs yet. There is typically one tool for one (or a small handful of) language(s). Some languages are easier to profile than others.
The easiest languages at this time to profile are:
The languages with the seemingly best tools as of now are Go (tool developed by Uber) and Java (tool developed by Netflix).
GUI/Frontend programs that are not widely used and where most of the time is spent waiting on user input.
Small services - optimization has a smaller return when not done at scale.
If running your web service costs an inconsequential annual amount, it’s likely not worth your time to profile the service for inefficiencies.
These would all be great candidates for profiling - they have streaming data, perform pretty hefty computations, and run at scale. Unfortunately, these all seem to be C++ code. At present, it is not easy to profile C++ code and generate flame graphs for it. It has been done, but the tooling does not seem complete.
The systems we could have profiled here include:
The obvious candidates are any web company’s server backend, but those are not open source, so we are targeting anything that meets the following criteria:
Note that we will not be profiling all of these, but rather those that are easiest to set up and build from source.