I need a list of the installation names of all Flathub packages

I am using Fedora 40. Fedora has its own RPM repository, and RPM Fusion provides free and non-free RPM repositories. Besides the RPM repositories, Fedora has a Flatpak repository (Fedora Flatpaks, registry.fedoraproject.org), and there is also the Flathub repository (dl.flathub.org). I want to write a script that removes RPM packages and installs the equivalent Fedora Flatpaks or Flathub packages instead. The reason is that I want to use sandboxed packages to avoid conflicts and to avoid telemetry and data flow between applications.

The script will check the installation names of the RPM packages installed on my system from the Fedora RPM repository. If there is a Fedora Flatpaks equivalent, it will remove the RPM package and install the Fedora Flatpaks package. If there is no Fedora Flatpaks equivalent, it will look for a Flathub equivalent; if there is one, it will remove the RPM package and install the Flathub package. Then it will do the same for RPM packages installed from the RPM Fusion free and non-free repositories.

To do this, I need to identify which RPM packages correspond to which Flathub and Fedora Flatpaks packages. I need to prepare the RPM package lists, the Flathub package list and the Fedora Flatpaks package list, match the installation names across these lists, and put the exact equivalents into the script.
For example, the name of Audacity in Fedora's RPM repository is audacity. Install it with:
sudo dnf install audacity
The name of Audacity in the RPM Fusion free RPM repository is also audacity. Although it installs with the same command, the repository has to be named in the command to tell the two apart:
sudo dnf install --enablerepo=rpmfusion-free audacity
The Flathub installation name is org.audacityteam.Audacity. Install it with:
flatpak install flathub org.audacityteam.Audacity
The installation name in the Fedora Flatpaks repo is also org.audacityteam.Audacity. Install it with:
flatpak install fedora org.audacityteam.Audacity
In the name matching list I will show the equivalents as follows.
Example table row:

fedora-rpm rpm-fusion-free-rpm rpm-fusion-non-free-rpm flathub-flatpak fedora-flatpaks
audacity = audacity = - = org.audacityteam.Audacity = org.audacityteam.Audacity
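
One way to hold a row of this table inside a script is a small record with one field per repository. The sketch below is only an illustration; the class name `EquivalentRow` and its field names are hypothetical and not part of any existing tool.

```python
from dataclasses import dataclass, astuple
from typing import Optional


@dataclass
class EquivalentRow:
    """One row of the matching table; None means 'not available in that repo'."""
    fedora_rpm: Optional[str] = None
    rpmfusion_free_rpm: Optional[str] = None
    rpmfusion_nonfree_rpm: Optional[str] = None
    flathub_flatpak: Optional[str] = None
    fedora_flatpak: Optional[str] = None

    def as_line(self) -> str:
        # Render in the same "a = b = - = c = d" form as the table row above
        return " = ".join(v if v is not None else "-" for v in astuple(self))


audacity = EquivalentRow(
    fedora_rpm="audacity",
    rpmfusion_free_rpm="audacity",
    flathub_flatpak="org.audacityteam.Audacity",
    fedora_flatpak="org.audacityteam.Audacity",
)
print(audacity.as_line())
```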

In this example the names were the same, but I have analyzed hundreds of applications, and some have very different names in different repositories, so it is important to create this list. To write this script I need a list of all packages in the Flathub repository with their installation names. How can I do it?

dnf repolist
  1. List Installation Names of All Packages in the Fedora RPM Repository

You can use the dnf command to list all packages in Fedora's official RPM repositories. The following commands list them repository by repository:
1.1. Listing the Fedora 40 - x86_64 Repository:

You can use the following command for this repository:

dnf repoquery --repo=fedora > fedora_40_x86_64_packages.txt

1.2. Listing the Fedora 40 openh264 (From Cisco) - x86_64 Repository:

To list packages in the openh264 repository provided by Cisco:

dnf repoquery --repo=fedora-cisco-openh264 > fedora_40_openh264_packages.txt

1.3. List the Fedora 40 - x86_64 - Updates Repository:

You can use this command to list the updated packages in this repository:

dnf repoquery --repo=updates > fedora_40_updates_packages.txt

1.4. Listing All Enabled RPM Repositories (including RPM Fusion and any repos added later):

dnf repoquery --all

This command lists the packages from all enabled RPM repositories, not only Fedora's own. You can save the output to a file for further inspection:

dnf repoquery --all > fedora_rpm_packages.txt
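
By default `dnf repoquery` prints full name-epoch:version-release.arch strings. As a possible shortcut, the sketch below asks dnf for the bare package name through the `--qf "%{name}"` query format. The function names are hypothetical, and the sketch assumes the dnf shipped with Fedora 40; `list_package_names` only works on a machine where dnf is installed.

```python
import subprocess


def repoquery_command(repo_id):
    """Build the dnf command that prints only package names for one repo."""
    return ["dnf", "repoquery", f"--repo={repo_id}", "--qf", "%{name}"]


def list_package_names(repo_id):
    """Run dnf and return the sorted, de-duplicated package names."""
    result = subprocess.run(
        repoquery_command(repo_id), capture_output=True, text=True, check=True
    )
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})


# Show the command that would be run for the main Fedora repository
print(" ".join(repoquery_command("fedora")))
```

If this works on your system, the RPM lists would already contain names only, so the later clean-up of version strings might not be needed for them.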
  2. List Installation Names of All Packages in the RPM Fusion Free RPM Repository

2.1. RPM Fusion for Fedora 40 - Listing the Free Repository:
You can also use the dnf command to list all packages in the RPM Fusion Free repository. Here you need the --repo option so that only the packages in the RPM Fusion Free repository are listed:

dnf repoquery --repo=rpmfusion-free

This command only lists packages from the RPM Fusion Free repository. Again, you can save the results to a file:

dnf repoquery --repo=rpmfusion-free > rpmfusion_free_packages.txt

2.2. RPM Fusion for Fedora 40 - Free - List Updates Repository:

dnf repoquery --repo=rpmfusion-free-updates > rpmfusion_free_updates_packages.txt
  3. List Installation Names of All Packages in the RPM Fusion Non-Free RPM Repository

3.1. List RPM Fusion for Fedora 40 - Nonfree Repository:

You can use the following command to list all packages in the RPM Fusion Non-Free repository:

dnf repoquery --repo=rpmfusion-nonfree

This command only lists packages from the RPM Fusion Non-Free repository. If you want to save the results to a file:

dnf repoquery --repo=rpmfusion-nonfree > rpmfusion_nonfree_packages.txt

3.2. RPM Fusion for Fedora 40 - Nonfree - Listing the NVIDIA Driver Repository:

dnf repoquery --repo=rpmfusion-nonfree-nvidia-driver > rpmfusion_nonfree_nvidia_packages.txt

3.3. RPM Fusion for Fedora 40 - Nonfree - List Steam Repository:

dnf repoquery --repo=rpmfusion-nonfree-steam > rpmfusion_nonfree_steam_packages.txt

3.4. Listing the RPM Fusion for Fedora 40 - Nonfree - Updates Repository:

dnf repoquery --repo=rpmfusion-nonfree-updates > rpmfusion_nonfree_updates_packages.txt
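
The nine RPM repositories and output files used in the commands above can also be kept in one table and the commands generated from it. This is only a convenience sketch; the names `REPO_FILES` and `shell_lines` are hypothetical, while the repo IDs and file names are copied from the commands above.

```python
# Repo ID -> output file name, as used in sections 1 to 3
REPO_FILES = {
    "fedora": "fedora_40_x86_64_packages.txt",
    "fedora-cisco-openh264": "fedora_40_openh264_packages.txt",
    "updates": "fedora_40_updates_packages.txt",
    "rpmfusion-free": "rpmfusion_free_packages.txt",
    "rpmfusion-free-updates": "rpmfusion_free_updates_packages.txt",
    "rpmfusion-nonfree": "rpmfusion_nonfree_packages.txt",
    "rpmfusion-nonfree-nvidia-driver": "rpmfusion_nonfree_nvidia_packages.txt",
    "rpmfusion-nonfree-steam": "rpmfusion_nonfree_steam_packages.txt",
    "rpmfusion-nonfree-updates": "rpmfusion_nonfree_updates_packages.txt",
}


def shell_lines(repo_files):
    """Return one 'dnf repoquery ... > file' command line per repository."""
    return [f"dnf repoquery --repo={repo} > {name}" for repo, name in repo_files.items()]


for command in shell_lines(REPO_FILES):
    print(command)
```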
  4. List Installation Names of All Packages in the Fedora Flatpaks Repository

You can use the flatpak command to list all applications in the Fedora Flatpak repository. Since the Fedora Flatpak repository is OCI based, you can use flatpak remote-ls to list all packages in Fedora’s Flatpak repository:

flatpak remote-ls fedora --columns=ref

This command will show the installation names of all applications in the Fedora Flatpak repository. To save the results to a file:

flatpak remote-ls fedora --columns=ref > fedora_flatpak_packages.txt
flatpak remote-ls fedora --columns=name,ref,version > fedora_flatpak_packages_name_ref_version.txt
flatpak remote-ls fedora --columns=name,ref > fedora_flatpak_packages_name_ref.txt
  5. List the Installation Names of All Packages in the Flathub Flatpak Repository

You can use the flatpak command to list all applications in the Flathub Flatpak repository. Flathub is an OSTree-based repository rather than OCI based, but flatpak remote-ls lists its packages in the same way:

flatpak remote-ls flathub --columns=ref

This command will show the installation names of all apps in the Flathub Flatpak repository. To save the results to a file:

flatpak remote-ls flathub --columns=ref > flathub_flatpak_packages.txt
flatpak remote-ls flathub --columns=name,ref,version > flathub_flatpak_packages_name_ref_version.txt
flatpak remote-ls flathub --columns=name,ref > flathub_flatpak_packages_name_ref.txt

The 11 master list files (txt) created look unorganized and need to be converted to the same format. Each should contain only the installation name of the application; directories, versions and types should be removed. Below are the Python scripts for this.
Python code for cleaning up the Flatpak files (fedora_flatpak_packages_name_ref.txt and flathub_flatpak_packages_name_ref.txt):

# Input and output file names
input_file = "input_flatpak_packages_name_ref.txt"
output_file = "output_flatpak_packages_name_ref_duzgun.txt"

# Open and process the file
with open(input_file, "r", encoding="utf-8") as infile, open(output_file, "w", encoding="utf-8") as outfile:
    for line in infile:
        # Split the line at the tab (display name before it, ref after it)
        parts = line.split("\t")
        if len(parts) == 2:
            name = parts[0].strip()  # Part before the tab
            app_part = parts[1].strip()  # Part after the tab

            # Take the part between the first and second "/"
            app_name = app_part.split("/")[1]

            # Write to file in desired format
            outfile.write(f"{name}\t{app_name}\n")

print(f"Data was successfully written to {output_file}.")

Replace input_flatpak_packages_name_ref.txt and output_flatpak_packages_name_ref_duzgun.txt with the actual input and output file names.
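
As a small check of the ref handling, the sketch below extracts the application ID from a single name/ref line. It assumes a ref looks like either app/ID/ARCH/BRANCH or ID/ARCH/BRANCH and accepts both, since the exact form may depend on the Flatpak version; the function names and the sample lines are illustrative assumptions.

```python
def app_id_from_ref(ref):
    """Return the application ID from a Flatpak ref.

    Assumption: the ref is 'app/ID/ARCH/BRANCH', 'runtime/ID/ARCH/BRANCH'
    or just 'ID/ARCH/BRANCH'.
    """
    parts = ref.strip().split("/")
    if parts[0] in ("app", "runtime"):
        parts = parts[1:]  # Drop the leading kind, if present
    return parts[0]


def parse_line(line):
    """Split a 'display name<TAB>ref' line into (name, application ID)."""
    name, _, ref = line.rstrip("\n").rpartition("\t")
    return name.strip(), app_id_from_ref(ref)


print(parse_line("Audacity\tapp/org.audacityteam.Audacity/x86_64/stable\n"))
print(parse_line("Audacity\torg.audacityteam.Audacity/x86_64/stable\n"))
```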

Python code for cleaning up the RPM files (fedora_40_x86_64_packages.txt, fedora_40_openh264_packages.txt, fedora_40_updates_packages.txt, rpmfusion_free_packages.txt, rpmfusion_free_updates_packages.txt, rpmfusion_nonfree_packages.txt, rpmfusion_nonfree_updates_packages.txt, rpmfusion_nonfree_nvidia_packages.txt, rpmfusion_nonfree_steam_packages.txt):

import re

# Name of the file to be processed
input_file = "input_packages.txt"
# Name of the output file
output_file = "output_packages_duzgun.txt"

# Open and process the file
with open(input_file, "r", encoding="utf-8") as infile, open(output_file, "w", encoding="utf-8") as outfile:
    for line in infile:
        line = line.strip()
        if not line:
            continue  # Skip blank lines
        # Keep the part before the "-EPOCH:" separator (e.g. "-0:" or "-2:")
        package_name = re.split(r"-\d+:", line, maxsplit=1)[0]
        # Write the name of the package to the output file
        outfile.write(package_name + "\n")

print(f"Package names were written to the {output_file} file.")

Replace input_packages.txt and output_packages_duzgun.txt with the actual input and output file names.
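
As a quick check of the name extraction, the sketch below strips the epoch:version-release.arch suffix from two sample lines in the default `dnf repoquery` output format. The version numbers in the samples are illustrative assumptions, and the helper name `package_name` is hypothetical.

```python
import re

# Matches the "-EPOCH:" separator that follows the package name
EPOCH_SEPARATOR = re.compile(r"-\d+:")


def package_name(nevra_line):
    """Return the package name from a 'name-epoch:version-release.arch' line."""
    return EPOCH_SEPARATOR.split(nevra_line.strip(), maxsplit=1)[0]


samples = [
    "audacity-0:3.4.2-1.fc40.x86_64",        # epoch 0
    "vim-enhanced-2:9.1.031-1.fc40.x86_64",  # non-zero epoch, hyphen in the name
]
for sample in samples:
    print(package_name(sample))
```

Splitting on any "-digits:" separator rather than only on "-0:" also covers packages whose epoch is not zero.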

Some of the txt files contain empty lines. The script remove_empty_lines.sh deletes them:

#!/bin/bash

# Get the name of the file from which you want to delete blank lines
file=$1

# Delete empty lines and write to a temporary file
grep -v '^$' "$file" > temp_file.txt

# Overwrite original file with temporary file
mv temp_file.txt "$file"

echo "Empty lines have been deleted from '$file'."
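
For consistency with the other scripts, the same clean-up could be done in Python. This is a rough equivalent under one stated difference: it also drops lines that contain only whitespace, whereas `grep -v '^$'` removes only completely empty lines. The function name `remove_empty_lines` is hypothetical.

```python
import os
import tempfile


def remove_empty_lines(path):
    """Rewrite the file at `path` without blank lines; return how many were dropped."""
    with open(path, "r", encoding="utf-8") as infile:
        lines = infile.readlines()
    kept = [line for line in lines if line.strip()]

    # Write to a temporary file in the same directory, then replace the original
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", dir=directory, delete=False
    ) as tmp:
        tmp.writelines(kept)
    os.replace(tmp.name, path)
    return len(lines) - len(kept)
```

It could be called as `remove_empty_lines("fedora_40_x86_64_packages_duzgun.txt")` for each list file.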


The following Python script takes the application names (not the installation names) from fedora_flatpak_packages_name_ref.txt and flathub_flatpak_packages_name_ref.txt and combines them, without duplicates, into a list named combined_apps_list.txt:

# Names of two TXT files
file1 = "flathub_flatpak_packages_name_ref.txt"  # First list file
file2 = "fedora_flatpak_packages_name_ref.txt"  # Second list file
output_file = "combined_apps_list.txt"  # output file

# Create a set to store application names (a set avoids duplicates)
apps_set = set()

# Read first file
with open(file1, "r", encoding="utf-8") as f1:
    for line in f1:
        parts = line.rsplit(maxsplit=1)  # Split once from the right, at the last whitespace
        if parts:  # Skip blank lines
            apps_set.add(parts[0])  # Keep the application name, drop the ref

# Read second file
with open(file2, "r", encoding="utf-8") as f2:
    for line in f2:
        parts = line.rsplit(maxsplit=1)  # Split once from the right, at the last whitespace
        if parts:  # Skip blank lines
            apps_set.add(parts[0])  # Keep the application name, drop the ref

# Convert the results to an ordered list and write to file
with open(output_file, "w", encoding="utf-8") as out:
    for app in sorted(apps_set):  # Write the application names in order
        out.write(app + "\n")

print(f"Application names were successfully written to {output_file}.")

If you want to combine these 12 txt files into a single file so that artificial intelligence software can match the applications and write their equivalents side by side in a table, use this bash script:

for file in *.txt *.csv; do
    [ -e "$file" ] || continue              # Skip patterns that match nothing
    [ "$file" = "merged.txt" ] && continue  # Do not merge the output into itself
    echo "=============================" >> merged.txt
    echo "File: ${file}" >> merged.txt
    echo "=============================" >> merged.txt
    grep -v '^$' "$file" >> merged.txt  # Filter empty rows
    echo "" >> merged.txt
done

We have 12 lists. Each list is a separate file in txt format:
fedora_40_x86_64_packages_duzgun.txt ; fedora_40_openh264_packages_duzgun.txt ; fedora_40_updates_packages_duzgun.txt ; rpmfusion_free_packages_duzgun.txt ; rpmfusion_free_updates_packages_duzgun.txt ; rpmfusion_nonfree_packages_duzgun.txt ; rpmfusion_nonfree_updates_packages_duzgun.txt ; rpmfusion_nonfree_nvidia_packages_duzgun.txt ; rpmfusion_nonfree_steam_packages_duzgun.txt ; fedora_flatpak_packages_name_ref_duzgun.txt ; flathub_flatpak_packages_name_ref_duzgun.txt ; combined_apps_list.txt

Based on the 12 lists, we can create a Python script that reads them and creates a CSV file with the equivalents for each application side by side.

  1. Summary of Required Steps:

    Read the 12 separate lists.
    Compare the names and combine identical ones onto one row.
    For each application, find the installation names in the Fedora RPM, Fedora RPM-Updates, Fedora RPM openh264, RPM Fusion Free, RPM Fusion Free-Updates, RPM Fusion Non-Free, RPM Fusion Non-Free-Updates, RPM Fusion Non-Free Nvidia-Driver, RPM Fusion Non-Free Steam, Flathub and Fedora Flatpaks repositories and write them to a CSV file.

This can be done with Python. The step-by-step script for the process is below:

  2. Python Script

The following Python script reads the 12 lists, merges them and writes the equivalent applications in CSV format:

import csv

# Function to load lists (reads each file line by line)
def load_list(filename):
    with open(filename, "r", encoding="utf-8") as file:
        return [line.strip() for line in file]

# Names of list files
repo_names = [
    "fedora_40_x86_64_packages_duzgun.txt",
    "fedora_40_openh264_packages_duzgun.txt",
    "fedora_40_updates_packages_duzgun.txt",
    "rpmfusion_free_packages_duzgun.txt",
    "rpmfusion_free_updates_packages_duzgun.txt",
    "rpmfusion_nonfree_packages_duzgun.txt",
    "rpmfusion_nonfree_updates_packages_duzgun.txt",
    "rpmfusion_nonfree_nvidia_packages_duzgun.txt",
    "rpmfusion_nonfree_steam_packages_duzgun.txt",
    "fedora_flatpak_packages_name_ref_duzgun.txt",
    "flathub_flatpak_packages_name_ref_duzgun.txt"
]

# Read combined application list file
combined_apps_file = "combined_apps_list.txt"
combined_apps = load_list(combined_apps_file)

# Load all lists and store them in a dictionary of sets (sets make the membership tests below fast)
repos = {repo_name: set(load_list(repo_name)) for repo_name in repo_names}

# Get a unique combination of all package names (sum the names from each repo)
all_packages = set(combined_apps)  # Initially only app names in combined_apps

for package_list in repos.values():
    all_packages.update(package_list)

# write to CSV file
with open("merged_packages.csv", "w", newline="", encoding="utf-8") as csvfile:
    csvwriter = csv.writer(csvfile)
    
    # Headings (1st column represents application name, other columns represent repo names)
    headers = ["Application Name"] + [name.replace("_duzgun.txt", "") for name in repo_names]
    csvwriter.writerow(headers)

    # Navigate through all package names and create a row based on which repo each package is in
    for package in sorted(all_packages):
        row = [package]  # First column application name

        # Add the corresponding package or "-" sign to the line for each repo
        for repo_name in repo_names:
            if package in repos[repo_name]:
                row.append(package)
            else:
                row.append("-")
        
        # Write row to CSV
        csvwriter.writerow(row)

print("Packages have been successfully written to merged_packages.csv.")


  3. Python Script Description:

    The load_list(filename) function reads each .txt file and returns a list of lines.
    All names are collected in a set so that none is repeated.
    The CSV has 12 columns: the application name followed by one column per list file:
    Application Name: the name being looked up
    fedora_40_x86_64_packages, fedora_40_openh264_packages, fedora_40_updates_packages: installation name in the Fedora RPM repositories
    rpmfusion_free_packages, rpmfusion_free_updates_packages: installation name in the RPM Fusion Free repositories
    rpmfusion_nonfree_packages, rpmfusion_nonfree_updates_packages, rpmfusion_nonfree_nvidia_packages, rpmfusion_nonfree_steam_packages: installation name in the RPM Fusion Non-Free repositories
    fedora_flatpak_packages_name_ref: installation name in Fedora Flatpaks
    flathub_flatpak_packages_name_ref: installation name on Flathub
    In each repository column, the name is written if it appears in that list, or the "-" sign if it does not.
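
To see what this row-building logic produces, here is a reduced in-memory sketch with three made-up columns instead of files. The function name `build_rows` and the column names are hypothetical; the matching rule (exact string equality, "-" otherwise) is the same as in the script.

```python
def build_rows(repos):
    """repos: dict mapping a column name to the set of names in that list."""
    all_names = sorted(set().union(*repos.values()))
    return [
        [name] + [name if name in packages else "-" for packages in repos.values()]
        for name in all_names
    ]


demo = {
    "fedora_rpm": {"audacity"},
    "rpmfusion_free_rpm": {"audacity"},
    "flathub_flatpak": {"org.audacityteam.Audacity"},
}
for row in build_rows(demo):
    print(row)
```

Because only identical strings share a row, audacity and org.audacityteam.Audacity end up on two separate rows in this demo.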

  4. Run the script:

    Make sure Python is installed on your system. If not, you can install it using the following command in the terminal:

sudo dnf install python3

Put the 12 .txt files in the same directory and save the Python script above as a file (e.g. create_csv.py).

Run the script by running the following command in the terminal:

python3 create_csv.py

After running this command, the merged_packages.csv file will be created in the same directory. You can view this CSV file as a table.

The script combined these lists and created a table, but it could not match the installation names of the packages or write equivalent packages side by side. To do that, it would have to examine the individual words in each name, compare them, form a hypothesis and pick the equivalent, which takes some intelligence. It seems there is no alternative to artificial intelligence for combining and reconciling these databases. If you know an open source artificial intelligence model that can match txt lists and write the equivalents on the same line, please share it with me. I need this for data analysis and data processing. I will post the results here.
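
Short of an AI model, a plain string-similarity heuristic might already catch the easy cases. The sketch below is not an AI approach: it compares each RPM name with the last dot-separated component of each Flatpak ID, first exactly and then with `difflib.get_close_matches` from the Python standard library. The function names, the 0.85 cutoff and the sample IDs other than Audacity's are assumptions, and any match it proposes would still need manual review.

```python
import difflib


def candidate_key(app_id):
    """Last dot-separated component of a Flatpak ID, lowercased."""
    return app_id.rsplit(".", 1)[-1].lower()


def match_rpm_to_flatpak(rpm_names, app_ids, cutoff=0.85):
    """Return {rpm_name: app_id} for exact or close matches on the last ID component."""
    by_key = {}
    for app_id in app_ids:
        by_key.setdefault(candidate_key(app_id), app_id)  # First ID wins on a clash

    matches = {}
    for rpm in rpm_names:
        key = rpm.lower()
        if key in by_key:
            matches[rpm] = by_key[key]
            continue
        close = difflib.get_close_matches(key, list(by_key), n=1, cutoff=cutoff)
        if close:
            matches[rpm] = by_key[close[0]]
    return matches


flatpak_ids = ["org.audacityteam.Audacity", "org.gimp.GIMP", "org.mozilla.firefox"]
print(match_rpm_to_flatpak(["audacity", "gimp", "kernel"], flatpak_ids))
```

The fuzzy step compares every unmatched RPM name against every key, so on the full lists it may be slow; applications whose RPM and Flatpak names share no common word will not be found this way.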

Additionally, I prepared a Python script that cleans up the fedora_flatpak_packages_name_ref.txt and flathub_flatpak_packages_name_ref.txt files and exports them to csv files. Each csv row contains the application name and the application installation name. The Python code is as follows:

import csv

# Input and output file names
input_file = "input_flatpak_packages_name_ref.txt"  # input file
output_file = "output_flatpak_packages_name_ref_duzgun.csv"  # output file

# Open and process the file
with open(input_file, "r", encoding="utf-8") as infile, open(output_file, "w", newline='', encoding="utf-8") as outfile:
    csvwriter = csv.writer(outfile)
    
    for line in infile:
        # Split once from the right, at the last tab
        parts = line.rsplit("\t", 1)
        if len(parts) == 2:
            app_name = parts[0].strip()  # Application name (before the tab)
            app_part = parts[1].strip()  # Ref (after the tab)

            # Get the part between the first and second "/"
            processed_name = app_part.split("/")[1]

            # Write application name and installation name to CSV
            csvwriter.writerow([app_name, processed_name])

print(f"Data was successfully written to {output_file}.")

Replace input_flatpak_packages_name_ref.txt with the name of the input file and output_flatpak_packages_name_ref_duzgun.csv with the name of the output file, for example flathub_flatpak_packages_name_ref_duzgun.csv. Repeat the process for both txt files.
The reason I prepared these csv files was so that the artificial intelligence could at least associate the names of the Flatpak applications with their installation names.
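
Once such a csv exists, it can be loaded back into a lookup table from application name to installation name. A minimal sketch, assuming the two-column layout described above; the function name `load_name_to_id` is hypothetical and the sample row is illustrative.

```python
import csv
import io


def load_name_to_id(csv_file):
    """Read rows of (application name, installation name) into a dict."""
    mapping = {}
    for row in csv.reader(csv_file):
        if len(row) == 2:  # Ignore malformed or empty rows
            mapping[row[0]] = row[1]
    return mapping


# In-memory sample standing in for flathub_flatpak_packages_name_ref_duzgun.csv
sample = io.StringIO("Audacity,org.audacityteam.Audacity\n")
print(load_name_to_id(sample))
```

With a real file, pass `open("flathub_flatpak_packages_name_ref_duzgun.csv", newline="", encoding="utf-8")` instead of the in-memory sample.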

For scripts: google drive scripts link
For lists: google drive link - packages lists txt and csv files