Most of us use malloc() without thinking about what happens underneath. I decided to implement my own memory allocator in C to understand it better. This wasn’t for production use, just to learn how allocation, fragmentation, and concurrency actually behave in practice. I also benchmarked it against glibc’s malloc to see where it stands.

Implementation Overview

My allocator currently includes:

- Thread-local cache
- Free lists (bins) for different size ranges
- Direct mmap for larger allocations
- A custom realloc() implementation

Benchmark Results

glibc malloc

Single-threaded:
- alloc + free (1M iterations): ~26 ms
- batch alloc/free: 3.40 ms / 0.95 ms
- mixed sizes: ~2.5 ms

Multi-threaded:
- 8 threads: ~57 ms

My Allocator

Single-threaded:
- alloc + free (1M iterations): ~83 ms
- batch alloc/free: 1.46 ms / 0.50 ms (faster than glibc)
- mixed sizes: ~126 ms

Multi-threaded:
- 8 threads: ~791 ms

What Worked

Batch allocation and free operations were faster than glibc.…
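
For readers who want to reproduce a batch alloc/free measurement like the one reported above, here is a minimal sketch of how such a test could be timed. It is not the harness behind the numbers in this post. The names `bench_batch`, `batch_result`, and `now_ms` are hypothetical, and the block count and block size are left as parameters because the post does not state them. The sketch times the system `malloc` and `free`; measuring a custom allocator would mean substituting its own entry points.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical result type: elapsed milliseconds for each phase. */
typedef struct {
    double alloc_ms;
    double free_ms;
} batch_result;

/* Monotonic clock reading in milliseconds. */
static double now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e3 + (double)ts.tv_nsec / 1e6;
}

/*
 * Allocate `count` blocks of `size` bytes, then free them all,
 * timing the two phases separately.
 * Returns 0 on success, -1 if any allocation fails.
 */
int bench_batch(size_t count, size_t size, batch_result *out) {
    assert(count > 0 && size > 0 && out != NULL);

    /* Pointer table that keeps every block alive until the free phase. */
    void **blocks = malloc(count * sizeof *blocks);
    if (blocks == NULL) {
        return -1;
    }

    double t0 = now_ms();
    size_t allocated = 0;
    for (; allocated < count; allocated++) {
        blocks[allocated] = malloc(size);
        if (blocks[allocated] == NULL) {
            break;
        }
        /* Write to the block so the allocation is actually used.
           Note that this write is included in the allocation time. */
        memset(blocks[allocated], 0xAB, size);
    }
    double t1 = now_ms();

    for (size_t i = 0; i < allocated; i++) {
        free(blocks[i]);
    }
    double t2 = now_ms();

    free(blocks);
    if (allocated != count) {
        return -1;
    }

    out->alloc_ms = t1 - t0;
    out->free_ms = t2 - t1;
    return 0;
}
```

Calling `bench_batch(100000, 64, &r)` and printing `r.alloc_ms` and `r.free_ms` gives one pair of timings. The count and size used here are arbitrary example values, and results vary between runs, so repeating the call several times and comparing the spread is more informative than a single measurement.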