Allocating classes on the Stack

Cristiano Rodrigues
Mar 27, 2023

More and more, we look for ways to reduce pressure on the GC and improve the performance of our applications. If we look at our code, we often use classes that, in certain sections, could easily be replaced with structs (value types), which can be allocated on the stack. However, that change can introduce other problems, such as copy semantics and boxing, and may not be worth the effort.

There is a JIT feature that can place class instances on the stack without changing the code. It is widely used in Java and has been available in .NET since .NET Core 3, but it is little known in .NET and is disabled by default.

So what is this functionality?

It’s called Escape Analysis. According to the documentation, its first prototype in .NET was developed in 2016 and it shipped in .NET Core 3.

So what is this Escape Analysis?

By default, every instance of a class is allocated on the managed heap. Now imagine that the lifetime of an object never exceeds the lifetime of the method that allocated it. We could then place that object on the stack, and that is what Escape Analysis (EA) enables: it analyzes where a reference to the instance can go, and if the reference does not escape the method, the object can be safely allocated on the stack.
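To make "escape" concrete, here is a minimal sketch. The `Point` class and the method names are hypothetical and exist only for illustration; whether the JIT actually stack-allocates the first case depends on the runtime version, inlining decisions, and the setting described later in this article.

```csharp
using System;

// Hypothetical type used only for this illustration.
public sealed class Point
{
    public int X;
    public int Y;
    public Point(int x, int y) { X = x; Y = y; }
}

public static class EscapeExamples
{
    // A static field outlives every method call.
    private static Point _last = new Point(0, 0);

    // The Point is only read inside this method, so the reference does not
    // escape. This is the kind of allocation escape analysis can move to the stack.
    public static int NoEscape(int x, int y)
    {
        var p = new Point(x, y);
        return p.X + p.Y;
    }

    // Returning the reference lets the caller keep the object alive: it escapes.
    public static Point EscapesByReturn(int x, int y)
    {
        return new Point(x, y);
    }

    // Storing the reference in a static field also makes it escape.
    public static int EscapesByStore(int x, int y)
    {
        var p = new Point(x, y);
        _last = p;
        return p.X + p.Y;
    }
}
```

All three methods compute the same values; the only difference is what happens to the reference, and that is what the analysis looks at.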

The analysis is not that simple, and there are several other constraints to consider. Some of them:

  1. Objects with finalizers cannot be allocated on the stack.
  2. Objects allocated in a loop can only be allocated on the stack if the allocation does not escape the loop iteration in which it is allocated.
  3. There must be a limit on the maximum size of objects allocated on the stack.
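The first two points can be sketched in code. All type and method names below are hypothetical, and the comments describe what the rules above imply rather than what a specific runtime version is guaranteed to do. The size limit from point 3 is not shown.

```csharp
using System;

// Hypothetical types used only for this illustration.
public sealed class WithFinalizer
{
    public int Value;
    ~WithFinalizer() { } // Point 1: the finalizer rules out stack allocation.
}

public sealed class Box
{
    public int Value;
    public Box(int value) { Value = value; }
}

public static class Conditions
{
    // Never a stack candidate under point 1, even though the reference
    // stays inside the method.
    public static int UseFinalizable(int value)
    {
        var f = new WithFinalizer { Value = value };
        return f.Value;
    }

    // Point 2, favorable case: each Box is dead before the next iteration starts.
    public static int SumPerIteration(int n)
    {
        int total = 0;
        for (int i = 0; i < n; i++)
        {
            var box = new Box(i);
            total += box.Value;
        }
        return total;
    }

    // Point 2, unfavorable case: `current` is still reachable through
    // `previous` in the next iteration, so it escapes the iteration.
    public static int SumCarried(int n)
    {
        var previous = new Box(0);
        for (int i = 1; i <= n; i++)
        {
            var current = new Box(previous.Value + i);
            previous = current;
        }
        return previous.Value;
    }
}
```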

To enable this feature, set the environment variable COMPlus_JitObjectStackAllocation to 1. From .NET 6 onward, the DOTNET_ prefix is the standard form for these runtime settings (DOTNET_JitObjectStackAllocation), and the COMPlus_ prefix continues to work. As a simple demonstration that the feature exists, we will create a class that adds two numbers and use BenchmarkDotNet to measure the allocations.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using System.Runtime.CompilerServices;

BenchmarkRunner.Run(typeof(Teste));
Console.ReadLine();

[MemoryDiagnoser]
[DisassemblyDiagnoser(printSource: true)]
[Config(typeof(ConfigWithCustomEnvVars))]
public class Teste
{
    // Runs the same benchmark twice: once with the JIT setting off, once with it on.
    private class ConfigWithCustomEnvVars : ManualConfig
    {
        private const string JitObjectStackAllocation =
            "COMPlus_JitObjectStackAllocation";

        public ConfigWithCustomEnvVars()
        {
            AddJob(Job.Default
                .WithEnvironmentVariables(
                    new EnvironmentVariable(JitObjectStackAllocation, "0"))
                .WithId("JitObjectStackAllocation Off"));
            AddJob(Job.Default
                .WithEnvironmentVariables(
                    new EnvironmentVariable(JitObjectStackAllocation, "1"))
                .WithId("JitObjectStackAllocation On"));
        }
    }

    [Params(1, 5)]
    public int A { get; set; }

    [Params(1, 5)]
    public int B { get; set; }

    [Benchmark]
    public int Calcular()
    {
        // The instance never leaves this method, so it is a candidate for stack allocation.
        var calculadora = new Calculadora(A, B);
        return calculadora.Soma();
    }
}

public class Calculadora
{
    private int v1;
    private int v2;

    public Calculadora(int v1, int v2)
    {
        this.v1 = v1;
        this.v2 = v2;
    }

    internal int Soma()
    {
        return v1 + v2;
    }
}

After running the benchmark, note that the same code with the environment variable COMPlus_JitObjectStackAllocation set to 0 shows Gen 0 activity and 24 bytes allocated per operation. In contrast, with the environment variable set to 1, it made no allocations at all and also had the shorter execution time.
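For a quick check without BenchmarkDotNet, the allocation difference can also be observed with GC.GetAllocatedBytesForCurrentThread(). The following is only a sketch under stated assumptions: `Pair` and `AllocationCheck` are hypothetical names, the measurement is far less rigorous than the benchmark above, and the outcome depends on the runtime version and on the JIT inlining the constructor and `Sum()`.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the Calculadora class above.
public sealed class Pair
{
    private readonly int _a;
    private readonly int _b;
    public Pair(int a, int b) { _a = a; _b = b; }
    public int Sum() => _a + _b;
}

public static class AllocationCheck
{
    // AggressiveOptimization asks the JIT to compile this method with full
    // optimizations right away instead of going through tiered compilation.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public static int SumOnce(int a, int b)
    {
        var pair = new Pair(a, b);
        return pair.Sum();
    }

    // Returns the bytes allocated on the current thread while calling
    // SumOnce `iterations` times; `checksum` keeps the results observable.
    public static long MeasureAllocatedBytes(int iterations, out int checksum)
    {
        SumOnce(1, 2); // make sure the method is compiled before measuring

        checksum = 0;
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            checksum += SumOnce(i, 1);
        }
        long after = GC.GetAllocatedBytesForCurrentThread();
        return after - before;
    }
}
```

Call MeasureAllocatedBytes from a Release build once with COMPlus_JitObjectStackAllocation=0 and once with it set to 1, then compare the two numbers. If the feature takes effect, the second run should report noticeably fewer allocated bytes.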

We can go a step further and look at the generated assembly code, comparing the disassembly of the job with COMPlus_JitObjectStackAllocation set to 0 against the job with it set to 1.

With COMPlus_JitObjectStackAllocation set to 0, there is more code and there are more memory accesses. With COMPlus_JitObjectStackAllocation set to 1, there is much less code: the values stored at addresses [rcx+8] and [rcx+0C] are simply moved to registers eax and edx to perform the add operation (sum).

Although it is not widely known, this feature is helpful when you need to reduce pressure on the GC. Allocating class instances directly on the stack means fewer heap allocations, which can improve the performance of your application.

The example was kept simple to demonstrate the feature. I invite you to run your own tests and share the results here.

Until next time!

References:

object-stack-allocation
Initial pull request
Egor Bogatov (Twitter)
Escape analysis for Java
Escape Analysis in the HotSpot JIT Compiler
