Dynamic and static linking and flatpaks and stuff

01 Sep 2023
Drew

This is gonna be notes for myself as much as anything else. I often forget what static linking is exactly and have to look it up. Otherwise this is just me working through my thoughts on containerised applications and immutability and stuff. Expect rambles. I’m not a coder, so there may be errors/simplifications.

Dynamic linking

This is the ‘norm’ on Linux. This is where (ideally) there’s one copy of ffmpeg’s libraries installed and any program that wants to make use of ffmpeg (i.e. anything that wants to play or encode video and audio) loads that same shared copy at run time.
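
Here’s a toy example of what that looks like in practice - the file names are made up and the maths library is just standing in for something bigger like ffmpeg’s libraries:

    /* hello_dynamic.c - a toy program that uses sqrt() from the maths
     * library.
     *
     * Build it the normal (dynamic) way:
     *     gcc hello_dynamic.c -o hello -lm
     *
     * Then `ldd ./hello` lists the shared libraries the program will load
     * at run time - typically libc.so.6 and libm.so.6. Every dynamically
     * linked program on the system points at those same single copies.
     */
    #include <math.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        (void)argv;
        double x = argc + 1.0;          /* not a compile-time constant */
        printf("sqrt(%f) = %f\n", x, sqrt(x));
        return 0;
    }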

The big upside of this way of doing things is that it’s efficient. One copy of each library - what’s the point of having shared libraries if they’re not shared?

The (considerable) downside is that managing this - keeping track of the web of dependencies - is very complex, and each distro must do this itself. That’s a lot of duplicated effort. Dynamic linking necessitates complex, and sometimes fragile, package management systems.

A graph of dependencies on my fairly sparse (~650 packages) system, generated with pacgraph:

An upside of having that complex package management system, though, is that we can update everything on our system all in one go, and that process is (ideally) very efficient since nothing is duplicated.

But…

What happens if two programs need different versions of ffmpeg?

Well, there are two basic approaches to this: accommodate that or don’t.

Arch, for example, takes the general approach of having only one version (the latest) of each library. If the latest version of mpv doesn’t work with the latest version of ffmpeg then this is a bug, and must be fixed.

The advantage of Arch’s approach is simplicity from a packaging point of view. The drawbacks are that, within this system, you can’t keep using old software, and that there’s a lack of stability (meaning “things keep changing”, not “things keep crashing”).

Debian takes the other approach - allowing different versions of libraries to be installed concurrently. This allows for the use of old software, for updating or installing one application without having to update the whole system, and for stability. But it compounds the complexity of package management.
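
The mechanism that makes that coexistence workable is the versioned ‘soname’ - programs ask the loader for a specific major version of a library rather than just ‘the library’. A rough sketch (libm is just a convenient library that’s guaranteed to be present; the libavcodec version numbers are illustrative):

    /* soname_demo.c - programs don't ask for "the maths library", they ask
     * for a versioned soname like "libm.so.6". That's what lets a distro
     * ship, say, libavcodec.so.58 and libavcodec.so.60 side by side: each
     * program loads the major version it was built against.
     *
     * Build:
     *     gcc soname_demo.c -o soname_demo -ldl
     */
    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        /* Ask the dynamic loader for a specific major version by soname. */
        void *handle = dlopen("libm.so.6", RTLD_NOW);
        if (!handle) {
            printf("couldn't load libm.so.6: %s\n", dlerror());
            return 1;
        }
        printf("loaded libm.so.6 by its versioned name\n");
        dlclose(handle);
        return 0;
    }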

From the point of view of a user, dynamic linking is kinda ideal. It makes the most efficient use of storage, memory and bandwidth and allows for the easy installation and maintenance of all software from the OS itself through to applications.

For the developer and maintainer, though, it’s potentially a nightmare. It means that a package has to be created for every piece of software on every distro. Not so bad for established software - the process will often be largely automated and will be the responsibility of each distro’s maintainer for that package. But for a developer making new or niche software which is not yet in repos, it’s quite a burden. To distribute their software for Linux they’d have to create and maintain a package for every distro (or at least a big few) - a process with which they’re likely to be entirely unfamiliar and which will differ from distro to distro. They’ll also be distributing that package outside of the package manager’s repos (PPAs and Arch’s AUR aside), which is far from ideal.

Bundling

This is where you bundle the needed libraries with the software. The libraries don’t get installed on the system, they just sit there in the application’s directory. Which seems like a good idea until you realise that, unless you include everything up-to-but-not-including the kernel, those libraries are still making use of system libraries. And at some point, that’ll stop working (as system libraries get updated).

On Linux I’ve only ever seen this approach with proprietary software. It’s how GOG games used to be distributed, for example. I usually take it as a signal that the distributors of the software have no intention whatsoever of maintaining the thing, which held true for GOG.

AppImages are bundles packed up into a single, self-contained file. They’re convenient, in that you can just download that one file and run it without having to worry about dependencies and so on, but they’re maximally inefficient, fiddly to maintain, and don’t confer any of the broader benefits of ‘proper’ containerisation.

I do use a couple of AppImages - one is LibreOffice, which is a piece of software I resent having to have installed for the rare times someone sends me a .docx or .xlsx and expects me to see it how they’re seeing it. I don’t care if LibreOffice is up to date, I don’t want its dependencies cluttering up my package management and I don’t enjoy or like it as software. I want to forget it’s there as far as I possibly can, and AppImages suit that.

But really, bundling is the worst of all worlds.

Static linking

This is some special magic whereby the bits of a library that a program uses are copied into the program itself when it’s built (at link time, the final step of compilation). If the program makes use of ffmpeg then the bits of ffmpeg’s libraries that it uses are pulled out of their static archives (.a files, which are bundles of object files) and baked into the executable, meaning that a system running the software doesn’t even need to have ffmpeg installed.

(ffmpeg is probably a terrible example in this case but this is just illustrative)

This approach involves some redundancy - every program carries its own copy of the library code, and if you’ve tweaked your system-wide ffmpeg the program can’t make use of that - but it simplifies packaging and reduces dependencies, making life easier for developers.
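
Same toy program as before, built statically this time - a rough sketch, assuming the static (.a) versions of the libraries are installed:

    /* hello_static.c - exactly the same toy program as before.
     *
     * Build it statically (assumes libc.a and libm.a are installed, e.g.
     * via your distro's glibc or libc-dev packages):
     *     gcc -static hello_static.c -o hello_static -lm
     *
     * The bits of libc and libm the program actually uses are now copied
     * into the executable itself. `ldd ./hello_static` just says "not a
     * dynamic executable", and the binary runs on a system with none of
     * those libraries installed - at the cost of a much bigger file, and
     * no benefit from shared library updates.
     */
    #include <math.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        (void)argv;
        double x = argc + 1.0;          /* not a compile-time constant */
        printf("sqrt(%f) = %f\n", x, sqrt(x));
        return 0;
    }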

This is like the cleverer version of bundling. It shares, but reduces the impact of, bundling’s drawbacks while retaining its advantages.

Flatpaks (containerisation)

Remember where, under ‘Bundling’, I said:

unless you include everything up-to-but-not-including the kernel

Well, this is kinda that.

A container, in this context, is like another copy of a Linux operating system running on your current OS’s kernel. This may sound like a virtual machine, but it’s not really. A virtual machine emulates (or virtualises) a whole computer - CPU, memory, devices - and runs an entire separate OS, with its own kernel, on top of that. A container is just a bunch of processes that believe they’re running on their own Linux OS, but are actually running on your current kernel and hardware. VMs involve much more overhead than containers.
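
Under the hood that isolation is built out of kernel namespaces - tools like Flatpak and distrobox sit on top of them. A minimal sketch of the idea, assuming unprivileged user namespaces are enabled (they are on most desktop distros):

    /* namespace_demo.c - the kernel feature containers are built from.
     *
     * Build and run as an ordinary user:
     *     gcc namespace_demo.c -o namespace_demo && ./namespace_demo
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/utsname.h>

    int main(void)
    {
        /* Ask the kernel for a new user namespace and a new UTS (hostname)
         * namespace. Same kernel, same hardware - just an isolated view. */
        if (unshare(CLONE_NEWUSER | CLONE_NEWUTS) != 0) {
            perror("unshare");
            return 1;
        }

        /* Inside the new namespace we can change "the system's" hostname
         * without touching the host's real hostname at all. */
        const char *name = "containerish";
        if (sethostname(name, strlen(name)) != 0) {
            perror("sethostname");
            return 1;
        }

        struct utsname u;
        uname(&u);
        printf("hostname in here: %s\n", u.nodename);
        return 0;
    }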

Here’s a hopefully illustrative example. I’m using distrobox to run Debian in a container:

As you can see from the output, they (optionally) share the host’s home directory but otherwise Debian’s filesystem is its own - it is fully Debian. Both distros see the same hardware and kernel but they are otherwise separate.

That’s roughly how Flatpaks work. Flatpak applications run within a containerised copy of Linux, running on your ‘host’ OS’s kernel and sharing its hardware. This obviously involves a significant amount of duplication but, within Flatpak itself, duplication is minimised through shared ‘runtimes’ that multiple applications can use.

This approach is, effectively, a universal package system for all Linux distros, which is great for developers.

It also provides some ‘sandboxing’ - separation between applications and the base system: you can restrict which parts of the filesystem and which devices a containerised application has access to.

Going further (immutability)

So if you take that idea of isolating applications from the base system to its natural conclusion you could completely isolate the base system from applications. Make the base system essentially ‘read-only’ - applications cannot alter it in any way.

This is immutability, and there are a few ways to get there. One way is to make all applications containers; this is the path Ubuntu is going down with its Snaps. It would work with Flatpaks, too.

NixOS takes a different approach. On NixOS, every package is installed to its own directory which means that multiple versions of dependencies can exist at the same time. A program will use the exact dependencies it is built against. This isn’t containerisation as such but shares some of the benefits, while also retaining the benefits of dynamic linking.

Everything in NixOS - which packages are installed and also runtime configurations - is (or at least can be) configured in one place. The system can easily be ‘rolled back’ to a previous state and that configuration can easily be copied to another machine to produce the exact same setup.

Immutability is an attractive idea - you can have a very stable base system and run whatever bleeding edge applications you like without ever risking the integrity of that base system.

Conclusion

I used to be very dubious about containerised applications. The efficiency of dynamic linking seems so intrinsically linuxy and, while package management was a hard problem to solve, it seems to work great now from a user’s perspective.

Flatpaks seemed to be a move to appease developers of proprietary software, which is something I have no interest in at all.

But then I realised I could install Krita without pulling in half of KDE. And I could install Steam and play games without having 32-bit libraries - which were only used by Steam and games - installed in my base system. All that stuff is still there, of course, but it’s tucked away where I don’t have to look at it, and it doesn’t get updated every time I update my base system.

Valve have gone with an immutable base on the Steam Deck for good reason: immutability is very appealing. You get the advantages of running Debian stable plus the advantages of, say, Arch, with some additional advantages on top. Of course, if immutability is implemented through containers then it carries with it the disadvantages of containers in general, and if it’s implemented in the NixOS-y way then you still have complex package management. But the upsides seem well worth it to me, and I expect distros with an immutable base to become the norm over the next decade or so.