Java bloating up applications

Hello,

I recently installed the PolyMC package (org.polymc.PolyMC), a Minecraft client.
Minecraft itself uses Java. One thing I noticed, however, is that the package is huge, not because of the size of the application itself, but because it bundles the OpenJDK 17 and OpenJDK 8 extensions (for newer and older versions of the game). I would expect Java to be stored separately, but that does not seem to be the case.

This also means that every time Java gets updated, it shows up as an application update and not as a Java update.

Does that mean that for every application using the OpenJDK extension, I have multiple copies of the same Java runtime stored on my system? And that they all have to be updated separately?

Created relevant issue: Use flathub version of openjdk · Issue #18 · flathub/org.polymc.PolyMC · GitHub

Yes, that is exactly what that means. Each Flatpak package bundles its own dependencies, and in this case they’re copied from an SDK extension into the package.

It’s deduplicated by OSTree.

Right, the OSTree dedup is great! An update is still required for each package that uses the extension, right? Or are packages automatically rebuilt as soon as an SDK extension is updated? I can totally imagine an update of an extension going unnoticed for quite some time.

I can’t answer your question, but I wanted to point out this script, which will give you some more info on how the dedup in OSTree is working.

I think you posted the wrong link, as this points to an Ask Ubuntu question about the difference between upstream and downstream.

From what I understand, all unique files used by OSTree are stored only once and are probably identified by their hash. Flatpak uses the hashes to indicate which files belong to a package, and the files are then hard-linked to compose the filesystem required to run the application (Under the Hood - Flatpak documentation). So far so good, however:

This is correct, and I haven’t seen anything to indicate that all packages requiring an SDK extension are automatically rebuilt when the SDK extension is updated. So even though OSTree deduplicates, it does so purely on the hashes of the files. If no rebuilds are triggered, you may well end up with multiple (minor and/or micro) versions of the binaries produced by the SDK extension on the filesystem when one package is updated but another is not.
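
To make that concrete, here is a minimal Python sketch of content-addressed storage with hard links. It is not OSTree's actual implementation (as far as I know, OSTree's file checksums also cover metadata such as ownership and permissions, not just content). The function names `store_object` and `checkout`, the app names, and the file contents are all made up for illustration.

```python
from __future__ import annotations

import hashlib
import os
import tempfile
from pathlib import Path


def store_object(repo: Path, data: bytes) -> Path:
    """Write data once, under a path derived from its SHA-256 digest."""
    digest = hashlib.sha256(data).hexdigest()
    obj = repo / "objects" / digest[:2] / digest[2:]
    if not obj.exists():  # identical content is never stored twice
        obj.parent.mkdir(parents=True, exist_ok=True)
        obj.write_bytes(data)
    return obj


def checkout(repo: Path, dest: Path, files: dict[str, bytes]) -> None:
    """Compose an app directory by hard-linking each file to its object."""
    for rel_path, data in files.items():
        target = dest / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        os.link(store_object(repo, data), target)


def demo() -> dict[str, object]:
    # Hypothetical contents standing in for two micro versions of a JDK library.
    jvm_old = b"libjvm built from an older micro release"
    jvm_new = b"libjvm built from a newer micro release"
    lib = "jre/lib/libjvm.so"

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        repo = root / "repo"
        # app-a and app-b were built against the same extension build;
        # app-c was rebuilt after the extension was updated.
        checkout(repo, root / "app-a", {lib: jvm_old, "bin/a": b"launcher a"})
        checkout(repo, root / "app-b", {lib: jvm_old, "bin/b": b"launcher b"})
        checkout(repo, root / "app-c", {lib: jvm_new, "bin/c": b"launcher c"})

        apps = ("app-a", "app-b", "app-c")
        inode = {app: os.stat(root / app / lib).st_ino for app in apps}
        objects = [p for p in (repo / "objects").rglob("*") if p.is_file()]
        return {
            "a_b_share_jvm": inode["app-a"] == inode["app-b"],
            "a_c_share_jvm": inode["app-a"] == inode["app-c"],
            "stored_objects": len(objects),
        }


if __name__ == "__main__":
    print(demo())
```

On a filesystem that supports hard links, app-a and app-b end up pointing at a single stored copy of `libjvm.so`, while app-c brings its own because its bytes differ. The store therefore holds five objects for six checked-out files, which is the situation described above when one package is rebuilt and another is not.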

I’ve been pondering a solution for that lately. My thinking so far is that it would probably be possible to add an extra module to the manifests that use an SDK extension, and to use flatpak-external-data-checker (GitHub - flathub-infra/flatpak-external-data-checker: A tool for checking if the external data used in Flatpak manifests is still up to date) to make sure a PR is created when the repo is updated. Needless to say, that module should not really be built.

Fixed, cheers! Here’s the URL for future readers: flatpak-ostree-dedup-stats.py · GitHub

:smiley: Seems like it would be handy to have this tool as a Flatpak!

@castrojo Thanks, man! You saved the day. I was really looking for that.