The State of SBOM Generation for C/C++: 2026 Edition
Ritesh Noronha

If you think SBOM generation is a solved problem, you’re probably not working in C/C++.
A good SBOM generator should do three things well: be accurate, be reproducible, and represent what's actually in the binary or firmware you're shipping. For ecosystems with mature package managers and machine-readable dependency metadata, this is materially easier. JavaScript has package-lock.json, Python often has poetry.lock, Rust has Cargo.lock, and Go has go.mod with go.sum for checksums. Those files do not make SBOM generation trivial, but they usually provide a much clearer starting point than C/C++ does.
For C/C++, we have no such luxury. And in the embedded world, where C/C++ dominates, the gap between what SBOM tools produce and what's actually in your firmware is wide enough to drive a truck through.
I've spent a significant amount of time building SBOM tooling for embedded C/C++ projects, and I want to share an honest assessment of where the landscape stands today — what works, what doesn't, and why a combined approach is the only realistic path to a reasonably accurate SBOM.
The Managed Dependency Story: Better, But Not the Whole Picture
Tools like CMake, Conan, vcpkg, and Meson have brought some structure to C/C++ dependency management. If your project uses Conan, you can get a conan.lock file. If you're on vcpkg in manifest mode, you at least have a vcpkg.json manifest and version constraints. CMake's FetchContent declarations also document what is being pulled in at configure time.
Some SBOM tools can use this metadata and produce a reasonable starting point for at least part of the dependency graph. For the subset of the C/C++ world that has adopted modern dependency management consistently, the SBOM story is better than it used to be, even if it is still incomplete.
That subset is small, especially in embedded. Most embedded C/C++ projects I encounter aren't built with Conan or vcpkg. They're built with plain Makefiles — sometimes hand-written, sometimes generated by an IDE like IAR Embedded Workbench or STM32CubeIDE. CMake adoption is growing in the embedded space, but it's far from universal. And even projects that use CMake often have a mix of FetchContent, system libraries, Git submodules, and vendored source that no single tool captures completely.
The `-l` Flag Approach: Useful, But Insufficient
One common technique for identifying C/C++ dependencies is to inspect linker inputs — especially -l flags and linker search paths. If your build links against -lssl -lcrypto -lz, a tool may be able to infer OpenSSL and zlib and enrich that with package-manager metadata when those libraries came from the operating system or a known package source. This can cover both dynamic libraries (.so / .dylib) and static archives (.a) when the build names them explicitly.
For desktop and server builds, this approach can get you reasonably far. The -l flags, combined with linker search paths (-L), often tell you which libraries are being pulled in. Static .a archives linked this way are also more visible than vendored source compiled directly into the target.
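As a rough illustration of this technique, here is a minimal Python sketch that pulls -l names, -L search paths, and explicitly named .a archives out of a single link command. The function name and the sample command are my own assumptions for illustration; a real tool would also need to handle response files, -Wl, passthrough options, and compiler wrappers.

```python
import shlex

def parse_link_command(command: str) -> dict:
    """Collect -l library names, -L search paths, and explicit .a archives."""
    libs, search_paths, archives = [], [], []
    tokens = iter(shlex.split(command))
    for tok in tokens:
        if tok in ("-l", "-L"):
            # Flag and value given as two separate arguments.
            value = next(tokens, None)
            if value is not None:
                (libs if tok == "-l" else search_paths).append(value)
        elif tok.startswith("-l"):
            libs.append(tok[2:])
        elif tok.startswith("-L"):
            search_paths.append(tok[2:])
        elif tok.endswith(".a"):
            archives.append(tok)
    return {"libs": libs, "search_paths": search_paths, "archives": archives}

# Hypothetical link line, for illustration only.
SAMPLE = "gcc main.o -L/opt/ssl/lib -lssl -lcrypto -lz vendor/libfoo.a -o app"
result = parse_link_command(SAMPLE)
```

The output is only a list of names. Mapping "ssl" to a specific OpenSSL package and version still requires package-manager metadata or some other source.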
But even when this approach succeeds at identifying a component, there's a deeper problem: it often gives you a library name, and sometimes a version, but not much more. NTIA's minimum elements include supplier name, component name, version, unique identifiers, dependency relationships, author of the SBOM data, and timestamp. A -l flag may tell you a library name. A .a file may tell you only a filename like libfoo.a, with little or no machine-readable provenance about who built it, where it came from, or what upstream release it represents. If the library did not come from a package source with metadata, the remaining NTIA fields usually have to be recovered some other way — and for most C/C++ projects, there is no automated "other way."
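To make the gap concrete, the sketch below builds the only record a bare archive filename can support and reports which of the minimum elements listed above are still missing. The field names and helper functions are assumptions I chose for illustration, not part of any SBOM specification.

```python
import os

# The NTIA minimum elements named above, as hypothetical record keys.
NTIA_MINIMUM_FIELDS = (
    "supplier_name",
    "component_name",
    "version",
    "unique_identifier",
    "dependency_relationships",
    "sbom_author",
    "timestamp",
)

def record_from_archive(path: str) -> dict:
    """Derive the only thing a bare archive filename reveals: a name."""
    stem = os.path.basename(path)
    if stem.endswith(".a"):
        stem = stem[:-2]
    if stem.startswith("lib"):
        stem = stem[3:]
    return {"component_name": stem}

def missing_fields(record: dict) -> list:
    """Return the minimum elements that are absent or empty in the record."""
    return [field for field in NTIA_MINIMUM_FIELDS if not record.get(field)]

record = record_from_archive("lib/libfoo.a")
gaps = missing_fields(record)
```

For lib/libfoo.a, everything except the component name ends up in the gaps list, which is the problem in miniature.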
In the embedded world, the situation is significantly worse. Embedded firmware builds often do not use -l flags pointing to system-installed packages at all. Instead, they compile the RTOS, HAL, network stack, crypto library, and middleware from source, or they link against project-local .a files that were built earlier from vendored code with little attached metadata. The resulting binary is frequently monolithic. There may be no ldd-style runtime dependency information to inspect and no DT_NEEDED entries in the ELF dynamic section (a statically linked image usually has no dynamic section at all), and stripped symbols reduce what binary inspection can recover. A static .a file sitting in a project's lib/ directory, built months earlier from an unknown upstream version, is close to a black box for SBOM purposes, and a compliance gap for any regulation that demands more than a component name.
Vendored Code: The Real Challenge
This is where things get genuinely hard, and where I think the C/C++ SBOM problem differs most sharply from ecosystems with strong package-manager conventions.
Vendored code is everywhere in C/C++. Developers copy source files (sometimes a single .c and .h pair, sometimes entire library trees) directly into their project. Libraries like nlohmann/json, mongoose, stb_image, miniz, and lwIP are routinely integrated this way. Their own documentation often tells you to do exactly this: "copy these two files into your project." Once copied, the code carries no package metadata, and to most tooling it looks just like first-party source.
Fingerprinting is one of the main techniques for detecting vendored code, and it works well for the easy cases. If a developer has vendored an unmodified copy of a standalone library, you can hash the files, compare against a corpus of known releases, and often get a confident match. For small, self-contained libraries, this can be quite reliable.
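The easy case can be sketched in a few lines of Python. This assumes a corpus that maps the SHA-256 of each released file to a (component, version) pair. The corpus contents, function names, and file extensions below are hypothetical; a real corpus would be built from upstream release archives.

```python
import hashlib
from pathlib import Path

SOURCE_SUFFIXES = {".c", ".h", ".cpp", ".hpp"}

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def identify(data: bytes, corpus: dict):
    """Return (component, version) on an exact hash match, else None."""
    return corpus.get(sha256_hex(data))

def scan_tree(root: str, corpus: dict) -> dict:
    """Map each source file under root to its exact corpus match, if any."""
    matches = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            hit = identify(path.read_bytes(), corpus)
            if hit is not None:
                matches[str(path)] = hit
    return matches

# Hypothetical released file and a one-entry corpus built from it.
RELEASED = b"/* tinylib 1.2.0 */\nint tiny(void) { return 1; }\n"
CORPUS = {sha256_hex(RELEASED): ("tinylib", "1.2.0")}
```

An unmodified copy matches; appending a single local comment to the same bytes makes identify return None, which is where exact matching stops being useful.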
But vendoring has complications that make fingerprinting a heuristic game rather than a deterministic one.
Modifications are common. Developers vendor code and then modify it — patching bugs, adapting APIs, removing unused functionality. Once a file is modified, exact hash matching fails. You're into fuzzy matching territory: diffing against known versions, using techniques like MinHash or Winnowing for code similarity, or doing AST-level structural comparison. These approaches can work, but they introduce uncertainty. Is this a modified copy of FreeRTOS 10.4.3, or is it FreeRTOS 10.5.1 with different modifications? The answer matters for CVE correlation, and the tooling often can't tell you with confidence.
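To show what fuzzy matching looks like in practice, here is a simplified winnowing-style sketch in Python. It hashes overlapping character k-grams, keeps the minimum hash from each sliding window as a fingerprint, and compares fingerprint sets with Jaccard similarity. The parameters (k=5, window=4), the whitespace-only normalization, and the use of CRC32 are assumptions chosen for brevity; production tools tune all of these and usually tokenize the code first.

```python
import re
import zlib

def normalize(source: str) -> str:
    """Remove whitespace and lowercase so formatting changes do not matter."""
    return re.sub(r"\s+", "", source).lower()

def kgram_hashes(text: str, k: int = 5) -> list:
    """Deterministic hash of every overlapping k-character substring."""
    return [zlib.crc32(text[i:i + k].encode("utf-8"))
            for i in range(len(text) - k + 1)]

def winnow(hashes: list, window: int = 4) -> set:
    """Keep the minimum hash of each sliding window as a fingerprint."""
    if not hashes:
        return set()
    if len(hashes) < window:
        return {min(hashes)}
    return {min(hashes[i:i + window])
            for i in range(len(hashes) - window + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two winnowed fingerprint sets."""
    fa = winnow(kgram_hashes(normalize(a)))
    fb = winnow(kgram_hashes(normalize(b)))
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)
```

A score like this ranks candidate versions but does not settle the question. Two adjacent upstream releases can both score highly against a locally patched copy, which is exactly the ambiguity described above.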
Vendor SDKs add noise. Silicon vendors like ST, NXP, TI, and Renesas ship SDKs that bundle open-source components such as FreeRTOS, lwIP, mbedTLS, and FatFS alongside proprietary HAL code. These bundles often add vendor-specific copyright headers, modify directory structures, rename files, and sometimes fork the upstream code significantly. STM32CubeF4 includes a version of FreeRTOS that's been through ST's adaptation process. Renesas FSP packages its own RTOS integrations, including ThreadX. Fingerprinting these against upstream releases is unreliable without vendor-specific detection logic.
Embedding models and ML-based code similarity can help, and there is interesting academic work on using learned code representations for software component identification. But today these approaches are usually too operationally heavy for routine SBOM generation in CI at scale. They may have a role in one-time audits or high-assurance investigations, but they are not yet a straightforward replacement for simpler techniques.
The Shared Vendor Folder Problem
There's a practical wrinkle that I don't see discussed enough in SBOM conversations. Most C/C++ developers keep their vendored code in separate folders — vendor/, 3rdparty/, external/, lib/ — which makes management easier. This is good hygiene and it helps SBOM tools identify the boundary between first-party and third-party code.
But in practice, especially in embedded shops, a vendored folder is often shared across multiple projects or build targets. A vendor/ directory might contain FreeRTOS, lwIP, mbedTLS, FatFS, and a dozen other libraries. Project A uses FreeRTOS and lwIP. Project B uses FreeRTOS and mbedTLS. Project C uses all of them.
An SBOM generator that simply inventories the entire vendor folder will produce an SBOM that lists all the vendored components — but that SBOM is incorrect for any individual build artifact. It overstates what's actually in the binary. If Project A doesn't use mbedTLS, listing it in Project A's SBOM creates false positives in vulnerability scans and misrepresents the actual attack surface.
Even when everything in a vendored folder is compiled, not all of it necessarily reaches the final image. The linker pulls in only the archive members needed to resolve symbols, and with --gc-sections (typically paired with -ffunction-sections and -fdata-sections at compile time) it discards sections that are not reachable from the entry point or other retained symbols. You might compile 30 vendored libraries as part of your build, but if their functions are never called, they may contribute nothing to the final binary. Listing all of them as components in your SBOM would produce false positives.
In many embedded projects, the most reliable way to get this right is to understand what the build system actually compiles and links for a given target. Which brings us back to the build system as a key source of truth, and why build-time analysis is usually essential for embedded C/C++.
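One way to approximate "what the build actually compiles" is a compilation database. The Python sketch below assumes the build can emit a compile_commands.json (CMake can with CMAKE_EXPORT_COMPILE_COMMANDS; for plain Makefiles a wrapper tool is usually needed) and that each vendored component lives in its own directory directly under vendor/. The function name and directory layout are illustrative assumptions.

```python
from pathlib import PurePosixPath

def compiled_vendor_components(entries: list, vendor_dir: str = "vendor") -> set:
    """Return vendor subdirectories that contain at least one compiled file.

    `entries` is the parsed content of a compile_commands.json file: a list
    of objects with at least "directory" and "file" keys.
    """
    used = set()
    for entry in entries:
        # "file" may be relative to "directory" or already absolute.
        parts = (PurePosixPath(entry["directory"]) / entry["file"]).parts
        if vendor_dir in parts:
            idx = parts.index(vendor_dir)
            # Require a component directory between vendor/ and the file.
            if idx + 1 < len(parts) - 1:
                used.add(parts[idx + 1])
    return used

# Hypothetical database for Project A from the example above.
PROJECT_A = [
    {"directory": "/work/projA", "file": "src/main.c"},
    {"directory": "/work/projA", "file": "vendor/FreeRTOS/tasks.c"},
    {"directory": "/work/projA", "file": "/work/projA/vendor/lwip/src/core/tcp.c"},
]
components = compiled_vendor_components(PROJECT_A)
```

For this input the result contains FreeRTOS and lwip but not mbedTLS, even if mbedTLS sits in the same vendor/ folder. This only captures what was compiled; whether the linker kept the resulting objects is a separate question.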
The Real World Is Messy: A Combined Approach
No single technique solves the C/C++ SBOM problem. After working through the different approaches — manifest parsing, linker flag inspection, file fingerprinting, binary analysis, build-system instrumentation — I've come to believe that a combined approach is the only path to a reasonably accurate SBOM. Specifically, you need to bring together four signals:
What's being built. The build system — whether it's Make, CMake, Meson, IAR, or something else — knows which source files are compiled and which libraries are linked for a given target. Instrumenting the build process gives you ground truth about what goes into the binary. This is the foundation.
What's being vendored. Directory fingerprinting, file hashing, and version string extraction can identify known open-source components in vendor directories. The key is to correlate this against what the build system actually consumes — not just what's present on disk. A component in vendor/ that never gets compiled into your target shouldn't appear in your SBOM.
Which platform or microcontroller it's built for. The target platform constrains which vendor SDKs and HAL libraries are in play. If you're building for an STM32F4, you know that STM32CubeF4 HAL components are likely present. If you're targeting a Renesas RA6M4, you're looking at Renesas FSP. Platform awareness lets you narrow the search space and apply vendor-specific detection heuristics. It also helps with CPE/PURL mapping, since embedded component naming is highly vendor-specific.
Static dependency management. Understanding how static libraries (.a, .lib) are linked, which object files they contain, and where they originated completes the picture. Map files (.map), linker scripts, and archive contents all provide clues. For IAR projects, .map files list every object file linked into the final image. For GCC-based builds, ar -t on static archives reveals their contents.
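As one example of mining these artifacts, the Python sketch below scans a GNU ld style map file for archive(member.o) references and groups the members by archive. It assumes the map was produced by GNU ld (for example via -Wl,-Map=out.map); IAR map files use a different layout and would need their own parser. The sample text is a hand-written fragment for illustration, not real tool output.

```python
import re
from collections import defaultdict

# Matches references such as "lib/libnet.a(tcp.o)" anywhere in the map text.
ARCHIVE_MEMBER = re.compile(r"([^\s()]+\.a)\(([^\s()]+\.o)\)")

def archive_members_in_map(map_text: str) -> dict:
    """Group object files pulled from static archives, keyed by archive path."""
    members = defaultdict(set)
    for archive, obj in ARCHIVE_MEMBER.findall(map_text):
        members[archive].add(obj)
    return {archive: sorted(objs) for archive, objs in members.items()}

# Hypothetical fragment in the style of a GNU ld map file.
SAMPLE_MAP = """\
Archive member included to satisfy reference by file (symbol)

lib/libnet.a(tcp.o)
                              build/main.o (tcp_connect)
lib/libnet.a(ip.o)
                              lib/libnet.a(tcp.o) (ip_output)
"""
linked = archive_members_in_map(SAMPLE_MAP)
```

Unlike ar -t, which lists everything inside an archive, this reflects only the members the linker actually pulled in for this image.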
Each of these signals alone is incomplete. Together, they can get you to a more accurate SBOM, which is usually what matters for vulnerability management, procurement, and regulatory reporting.
Developer Context Still Matters
Automated discovery is important, but for C/C++ projects, especially embedded ones, developer-supplied context still matters a lot. The development team often knows which vendored library was copied in, where it came from, which patches were applied, and which components belong to which build variant. That context is exactly the kind of information automated scanners struggle to reconstruct reliably from source trees, archives, and stripped binaries.
That does not mean manual or developer-maintained manifests are a silver bullet. They can go stale, miss dependencies, or drift from reality if they are not tied closely to the build process. But in practice, they can be a useful complement to automated analysis, especially for provenance, supplier, licensing, and build-variant metadata that is difficult to infer after the fact.
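To illustrate how developer-supplied context can feed an SBOM, here is a minimal Python sketch that turns a small hand-maintained manifest into a CycloneDX-style JSON document. The manifest keys, the example component, and the function name are hypothetical and chosen for illustration; this is not the format used by any particular tool, and the output covers only a handful of CycloneDX fields.

```python
import json

def to_cyclonedx(manifest: list, author: str, timestamp: str) -> dict:
    """Build a minimal CycloneDX-style document from manifest entries."""
    components = []
    for item in manifest:
        component = {
            "type": "library",
            "name": item["name"],
            "version": item["version"],
        }
        # Optional provenance that scanners rarely recover on their own.
        if item.get("supplier"):
            component["supplier"] = {"name": item["supplier"]}
        if item.get("purl"):
            component["purl"] = item["purl"]
        components.append(component)
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "metadata": {"timestamp": timestamp, "authors": [{"name": author}]},
        "components": components,
    }

# Hypothetical manifest entry maintained next to the vendored code.
MANIFEST = [{
    "name": "examplelib",
    "version": "1.2.0",
    "supplier": "Example Upstream Project",
    "purl": "pkg:generic/examplelib@1.2.0",
}]
bom = to_cyclonedx(MANIFEST, "Firmware Team", "2026-01-01T00:00:00Z")
bom_json = json.dumps(bom, indent=2)
```

The supplier and identifier fields here come straight from the developer, which is what makes them trustworthy and also what makes them go stale if nobody updates the manifest when the vendored copy changes.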
Structured component metadata is the missing layer in SBOMs, and Interlynk is helping define it. With the new generate sbom functionality in sbomasm, we’re enabling teams to capture developer context where it originates, making automated scanning and build-time analysis significantly more reliable.
Where Do We Go From Here?
The C/C++ SBOM problem won't be solved by a single tool or a single technique. It requires a layered approach that combines build-system intelligence, vendored code detection, platform-aware heuristics, and community-maintained corpora of known embedded libraries and their fingerprints.
The embedded C/C++ community also needs to invest in better dependency hygiene — adopting package managers where feasible, maintaining explicit manifests for vendored code, and treating SBOM generation as a first-class build artifact rather than an afterthought.
For those of us building tooling in this space, the opportunity is clear: whoever can reliably bridge the gap between the messiness of real-world C/C++ projects and the structured, accurate SBOMs that regulations demand will be solving one of the hardest remaining problems in software supply chain security.
And that'll be an SBOM you will be able to trust and verify.
Citations
CMake, "FetchContent," official documentation: https://cmake.org/cmake/help/latest/module/FetchContent.html
NTIA, "The Minimum Elements For a Software Bill of Materials (SBOM)," July 2021: https://www.ntia.gov/sites/default/files/publications/sbom_minimum_elements_report_0.pdf
European Commission, "Cyber Resilience Act - Implementation," accessed April 9, 2026: https://digital-strategy.ec.europa.eu/en/factpages/cyber-resilience-act-implementation
FDA, "Medical Device Software Guidance Navigator," including "Cybersecurity in Medical Devices: Quality System Considerations and Content of Premarket Submissions": https://www.fda.gov/medical-devices/regulatory-accelerator/medical-device-software-guidance-navigator
The White House, "Executive Order on Improving the Nation's Cybersecurity" (Executive Order 14028), May 12, 2021: https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/
sbomasm, "Your SBOM assembler": https://github.com/interlynk-io/sbomasm