[MDBF-297] Provide mirror manager service for RPM Created: 2021-11-11  Updated: 2022-04-07  Resolved: 2022-04-07

Status: Closed
Project: MariaDB Foundation Development
Component/s: None
Affects Version/s: N/A
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Daniel Black Assignee: Faustin Lammler
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: 0d
Time Spent: 7h
Original Estimate: Not Specified

Issue Links:
Relates
relates to MDBF-8 Repository tool: include all the yum ... Closed

 Description   

https://github.com/fedora-infra/mirrormanager2 would solve:

  • temporary and permanent RPM unavailability on mirrors for all.
  • not require the user to change URLs
  • and would provide a sane point in collecting statistics on downloads.


 Comments   
Comment by Faustin Lammler [ 2021-11-11 ]

Hi Daniel!
This looks like a good idea.

Here are some comments/questions:

  • How would we integrate/sync mirrormanager2 with actual repository tool configuration (just only point to mirrormanager2 for fedora/rhel)?
  • I went through the documentation and it is very poor IMO, above all I am missing the "How to not create a SPOF documentation".

Here is what I could determine from rpmfusion:

  • they use RR DNS with one HZ and one OVH machine;
  • they also seem to serve over IPv6 (good);

And some more questions:

  • How do they sync those 2 machines?
  • Is there a mechanism if those 2 machines have downtime?
  • Do you have any pointer on how the infra would look like with say 2 mirrors (EU/US)?
  • What would be required in terms of CPU/RAM (and storage above all)?
  • What maintenance tasks are needed to update/remove/add mirrors?

My guess is that this is not a very heavy service and that it should not be too complicated to set up and manage, but before jumping into an implementation, we should better define what the infra would look like IMO.

If you have any contact at rpmfusion, I would be happy to contact their SRE to get some experience feedback.

Pinging some more people to join this discussion:

I am also happy to set up a quick POC in the next few days so we can better understand how this works, let me know.

Cheers!

Comment by Daniel Black [ 2021-11-12 ]

> How would we integrate/sync mirrormanager2 with actual repository tool configuration (just only point to mirrormanager2 for fedora/rhel)?

Obviously both maintain mirror lists; I don't have a good answer on syncing these. It looks like mirrormanager2 has its own crawler: utility/mm2_crawler

On enabling a release: mirrormanager has some APIs.

The repository tool would show the repo config (example Fedora one below); it will just need the major version in the output.

/etc/yum.repos.d/fedora.repo

[fedora]
name=Fedora $releasever - $basearch
metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
enabled=1
countme=1
metadata_expire=7d
repo_gpgcheck=0
type=rpm
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
skip_if_unavailable=False
 
[fedora-debuginfo]
name=Fedora $releasever - $basearch - Debug
metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-debug-$releasever&arch=$basearch
enabled=0
metadata_expire=7d
repo_gpgcheck=0
type=rpm
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
skip_if_unavailable=False
 
[fedora-source]
name=Fedora $releasever - Source
metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-source-$releasever&arch=$basearch
enabled=0
metadata_expire=7d
repo_gpgcheck=0
type=rpm
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$releasever-$basearch
skip_if_unavailable=False
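
To illustrate the point about the repository tool only needing the major version, here is a minimal sketch of how such a tool could render a metalink-style stanza. The metalink host, the `repo=mariadb-...` parameter naming and the function name are hypothetical (modelled on the Fedora file above); no such MariaDB endpoint is implied to exist.

```python
def render_repo(major_version: str,
                metalink_base: str = "https://mirrors.example.org/metalink") -> str:
    """Render a yum/dnf .repo stanza that points at a metalink service.

    metalink_base is a hypothetical placeholder, not a real MariaDB endpoint.
    $releasever and $basearch are left untouched so yum/dnf expands them
    on the client.
    """
    lines = [
        "[mariadb]",
        f"name = MariaDB {major_version}",
        f"metalink={metalink_base}?repo=mariadb-{major_version}-$releasever&arch=$basearch",
        "gpgcheck=1",
        "enabled=1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_repo("10.7"))
```

Only the major version varies per stanza here; release and architecture stay as client-side variables, as in the Fedora example.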

> I went through the documentation and it is very poor IMO, above all I am missing the "How to not create a SPOF documentation".

RR DNS, local reverse proxy, database replication between sites (galera or multisource).

> How do they sync those 2 machines?

I assume a common database. This is more of an information service that points to mirrors at user install/update time.

> Is there a mechanism if those 2 machines have downtime?

Not that I can immediately see.

> Do you have any pointer on how the infra would look like with say 2 mirrors (EU/US)?

Two different data centers, sure.

> What would be required in terms of CPU/RAM (and storage above all)?

Storage, looks like at least one needs to be colocated with a mirror. CPU/RAM/BW, not sure.

> What maintenance tasks are needed to update/remove/add mirrors?

I'm sure the POC will answer this.

Comment by Faustin Lammler [ 2021-11-18 ]

Hi!
While trying to create a POC with mirrormanager2, I accidentally found another option that seems quite promising (and would work for both RHEL- and Debian-based distributions):
https://github.com/etix/mirrorbits

Even if the project seems to be a bit abandoned (latest commit 23 Jan), it is used by some important projects (VLC being the biggest).
So I decided to implement it and give it a try. Here is the documentation I followed, for those interested: https://jellyfin.org/posts/mirrorbits-cdn/

And here is how to use the new CDN mirror:

I have created a new DNS record mirror.mariadb.org and 2 CNAMEs (deb.mariadb.org and rpm.mariadb.org).
Those are not mandatory, but I think they may help us in the future to load balance or to read our statistics more easily.

Fedora34:

[mariadb]
name = MariaDB
baseurl = https://rpm.mariadb.org/yum/10.7/fedora34-amd64
gpgkey=https://rpm.mariadb.org/yum/RPM-GPG-KEY-MariaDB
gpgcheck=1

Debian11:

deb https://deb.mariadb.org/repo/10.5/debian bullseye main

If you want to understand the routing decision on a particular file, just add

?mirrorlist

to the URL, example:
https://rpm.mariadb.org/yum/10.7/fedora33-amd64/sha256sums.txt?mirrorlist
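
For scripting this check, a small sketch that turns any file URL into its `?mirrorlist` form. The helper name is made up for illustration, and it only builds the URL; it does not contact the server.

```python
from urllib.parse import urlsplit, urlunsplit


def mirrorlist_url(file_url: str) -> str:
    """Return file_url with its query replaced by the bare 'mirrorlist' flag."""
    parts = urlsplit(file_url)
    # Any existing query or fragment is dropped on purpose.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "mirrorlist", ""))


if __name__ == "__main__":
    print(mirrorlist_url(
        "https://rpm.mariadb.org/yum/10.7/fedora33-amd64/sha256sums.txt"))
```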

You can also check the global mirror status page: https://mirror.mariadb.org/mirrorstats.
On that page you can see that some mirrors are disabled. This happens automatically if:

  • some files are missing on the mirror (every mirror is scanned regularly);
  • the mirror is not reachable (check every minute);
  • mirrorbits was unable to scan the mirror (via ftp or rsync), 3 FTP mirrors are problematic so far.

Regarding the last point, I only added mirrors that also offer FTP or RSYNC (necessary for the scan).
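
The three disable conditions above can be summarised as a small decision function. This is only a model of the behaviour described in this comment, not mirrorbits code; the `MirrorState` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MirrorState:
    """Hypothetical summary of the latest checks against one mirror."""
    reachable: bool       # outcome of the per-minute reachability check
    scan_succeeded: bool  # whether the last FTP/rsync scan completed
    missing_files: int    # files expected but not found during the scan


def disable_reasons(state: MirrorState) -> List[str]:
    """Return the reasons a mirror would be disabled; empty means enabled."""
    reasons = []
    if state.missing_files > 0:
        reasons.append("files missing")
    if not state.reachable:
        reasons.append("unreachable")
    if not state.scan_succeeded:
        reasons.append("scan failed")
    return reasons


def is_enabled(state: MirrorState) -> bool:
    """A mirror stays enabled only when no disable condition applies."""
    return not disable_reasons(state)
```

Returning the list of reasons rather than a bare boolean makes it easier to explain why a mirror shows as disabled on a status page.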

SPOF considerations:

  • for now, there is only one Hetzner VM but we could in the future add more nodes (mirrorbits can use redis-sentinel);
  • there is a fallback mechanism if mirrorbits is down or if a file is only present on the ref mirror (osuosl.org and archive.mariadb.org are configured so far).

You can verify the fallback mechanism with the following file (that is only present on the reference mirror):
https://mirror.mariadb.org/repo/test_only_on_de_fallback.gz
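
A simplified sketch of the "file only present on the reference mirror" case: if no enabled mirror holds the file, the request goes to a fallback. The real selection in mirrorbits also weighs things like geography; this sketch ignores that. The URLs in the example are placeholders, not the configured fallbacks.

```python
from typing import Dict, List, Set


def _join(base: str, path: str) -> str:
    """Join a mirror base URL and a relative path with exactly one slash."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def pick_target(path: str,
                mirrors: Dict[str, Set[str]],
                fallbacks: List[str]) -> str:
    """Return the URL to redirect to for path.

    mirrors maps each enabled mirror's base URL to the set of relative
    paths found on it during the last scan (a hypothetical structure).
    fallbacks is the ordered list of reference mirror base URLs.
    """
    key = path.lstrip("/")
    for base, files in mirrors.items():
        if key in files:
            return _join(base, key)
    # No enabled mirror has the file: use the first fallback.
    return _join(fallbacks[0], key)


if __name__ == "__main__":
    mirrors = {"https://a.example.org/mariadb": {"yum/x.rpm"}}
    fallbacks = ["https://ref.example.org/mariadb/"]
    print(pick_target("/repo/only_on_ref.gz", mirrors, fallbacks))
```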

Some tuning is probably still necessary, above all in the nginx reverse proxy regexp, which is not my cup of tea. Here is the nginx conf:

server {
...
  # index index.html
  server_name deb.mariadb.org mirror.mariadb.org rpm.mariadb.org; # managed by Certbot
 
  location / {
    autoindex on;
  }
 
  # forwards to mirrorbits
  location ~ ^/(?<fwd_path>.*)(?<fwd_file>Release|RPM-GPG-KEY-MariaDB|\.bz|\.bz2|\.lz|\.gz|\.changes|\.ddeb|\.deb|\.dmg|\.dsc|\.exe|\.rpm|\.txt|\.xml|\.xz|\.zip)$ {
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://127.0.0.1:8080;
    proxy_buffering off;
  }
  location ~ ^/(?<fwd_path>.*)(\?mirrorlist)$ {
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://127.0.0.1:8080/$fwd_path?mirrorlist;
    proxy_buffering off;
  }
  location /mirrorstats {
    proxy_pass http://127.0.0.1:8080/?mirrorstats;
    proxy_buffering off;
  }
}
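
To see which paths the first regexp forwards to mirrorbits, it can be ported to Python for a quick offline check. The pattern is copied from the config above, with `(?<name>...)` rewritten as `(?P<name>...)` for Python's `re`. As far as I know, nginx matches `location` regexps against the URI without the query string, so the sketch takes only the path. That would also suggest the `(\?mirrorlist)$` location may never match and that `?mirrorlist` requests are handled by the first location instead; this is worth verifying on the server.

```python
import re

# Python port of the first "forwards to mirrorbits" location regexp.
FORWARD_RE = re.compile(
    r"^/(?P<fwd_path>.*)"
    r"(?P<fwd_file>Release|RPM-GPG-KEY-MariaDB|\.bz|\.bz2|\.lz|\.gz|\.changes"
    r"|\.ddeb|\.deb|\.dmg|\.dsc|\.exe|\.rpm|\.txt|\.xml|\.xz|\.zip)$"
)


def forwarded_to_mirrorbits(uri_path: str) -> bool:
    """Return True if the path (without query string) matches the regexp."""
    return FORWARD_RE.match(uri_path) is not None


if __name__ == "__main__":
    for path in (
        "/yum/10.7/fedora33-amd64/sha256sums.txt",
        "/repo/10.5/debian/dists/bullseye/InRelease",
        "/repo/10.5/debian/dists/bullseye/Release.gpg",
        "/yum/10.7/",
    ):
        print(path, forwarded_to_mirrorbits(path))
```

Note that `InRelease` matches because it ends in `Release`, while `Release.gpg` and directory listings do not match and stay with the local `location /` block.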

Comment by Daniel Black [ 2021-11-19 ]

So 302 redirects, looking pretty good (as a concept and from a quick behaviour check; I haven't checked the nginx config).

302 redirects

wget -S  'https://rpm.mariadb.org/yum/10.7/fedora33-amd64/sha256sums.txt'
--2021-11-19 07:34:14--  https://rpm.mariadb.org/yum/10.7/fedora33-amd64/sha256sums.txt
Resolving rpm.mariadb.org (rpm.mariadb.org)... 162.55.42.214, 2a01:4f8:1c17:e53d::1
Connecting to rpm.mariadb.org (rpm.mariadb.org)|162.55.42.214|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 302 Found
  Server: nginx
  Date: Fri, 19 Nov 2021 07:34:15 GMT
  Content-Type: text/html; charset=utf-8
  Content-Length: 0
  Connection: keep-alive
  Cache-Control: private, no-cache
  Link: <https://mirror.digitalpacific.com.au/mariadb/mariadb-10.7.1/yum/fedora/33/x86_64/sha256sums.txt>; rel=duplicate; pri=1; geo=au
  Location: https://mirror.aarnet.edu.au/pub/MariaDB/mariadb-10.7.1/yum/fedora/33/x86_64/sha256sums.txt
Location: https://mirror.aarnet.edu.au/pub/MariaDB/mariadb-10.7.1/yum/fedora/33/x86_64/sha256sums.txt [following]
--2021-11-19 07:34:16--  https://mirror.aarnet.edu.au/pub/MariaDB/mariadb-10.7.1/yum/fedora/33/x86_64/sha256sums.txt
Resolving mirror.aarnet.edu.au (mirror.aarnet.edu.au)... 202.158.214.106, 2001:388:30bc:cafe::beef
Connecting to mirror.aarnet.edu.au (mirror.aarnet.edu.au)|202.158.214.106|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  date: Fri, 19 Nov 2021 07:34:16 GMT
  server: Apache
  last-modified: Sat, 06 Nov 2021 22:13:23 GMT
  etag: "1323-5d0261171c44a"
  accept-ranges: bytes
  content-length: 4899
  referrer-policy: no-referrer
  x-content-type-options: nosniff
  x-frame-options: deny
  x-xss-protection: 1; mode=block
  cross-origin-embedder-policy: require-corp
  cross-origin-resource-policy: same-origin
  cross-origin-opener-policy: same-origin
  content-type: text/plain; charset=UTF-8
Length: 4899 (4.8K) [text/plain]
Saving to: ‘sha256sums.txt’

install MariaDB-server

(87/92): perl-libs-5.32.1-477.fc34.x86_64.rpm        5.8 MB/s | 2.0 MB     00:00
(88/92): perl-overloading-0.02-477.fc34.noarch.rpm    55 kB/s |  23 kB     00:00
(89/92): MariaDB-common-10.7.1-1.fc34.x86_64.rpm      73 kB/s |  89 kB     00:01
(90/92): MariaDB-server-10.7.1-1.fc34.x86_64.rpm     5.0 MB/s |  18 MB     00:03
(91/92): MariaDB-client-10.7.1-1.fc34.x86_64.rpm     684 kB/s | 8.6 MB     00:12
(92/92): galera-4-26.4.9-1.fc34.x86_64.rpm           1.0 MB/s |  12 MB     00:12
--------------------------------------------------------------------------------
Total                                                2.9 MB/s |  53 MB     00:18
MariaDB                                              3.4 kB/s | 8.2 kB     00:02
Importing GPG key 0x1BB943DB:

Comment by Faustin Lammler [ 2022-01-14 ]

Here is what I have done:

  • improve the LE certificate generator to be able to generate wildcard certificates (*.mariadb.org);
  • deploy the wildcard certificate on the new mirror server (so it can respond for every mariadb.org subdomain --> yum.mariadb.org was the goal);
  • create a 301 redirect on the new mirror (people using yum.mariadb.org now have to use rpm.mariadb.org/yum):

$ cat /etc/nginx/sites-enabled/yum.mariadb.org
server {
  listen 80 ;
  listen [::]:80 ;
  listen 443 ssl;
  listen [::]:443 ssl;
 
  server_name yum.mariadb.org;
  return 301 $scheme://rpm.mariadb.org/yum$request_uri;
}

  • tested in a fedora docker container with the /etc/hosts modification to point to the new mirror manager.
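
For reference, the effect of the `return 301` line in the server block above can be mimicked offline. This sketch only reproduces the string substitution; the function name is made up, and `request_uri` is assumed to carry the original path plus any query string, as nginx's `$request_uri` does.

```python
def redirect_target(scheme: str, request_uri: str) -> str:
    """Mimic: return 301 $scheme://rpm.mariadb.org/yum$request_uri;"""
    return f"{scheme}://rpm.mariadb.org/yum{request_uri}"


if __name__ == "__main__":
    print(redirect_target("https", "/10.6/fedora34-amd64/repodata/repomd.xml"))
```

Because the scheme is preserved, a plain http request is redirected to http on rpm.mariadb.org rather than being upgraded to https.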

Everything looks good so far, but I would like at least @Vicențiu Ciorbaru to confirm that he understands what I have done and maybe take a quick look. Once OK, I will add the mirror manager IP to yum.mariadb.org (currently 2 OVH IPs are serving it). This way, about 1/3 of the yum traffic would hit the new server through RR DNS, and we could see whether we get errors and how the new server behaves. Ping @Daniel Black also, if you have any comments...

You can use this new setup on a fedora/centos/redhat with the following line in your /etc/hosts file:

162.55.42.214   yum.mariadb.org

This is the MariaDB.repo file for yum.mariadb.org:

[mariadb]
name = MariaDB
baseurl = https://yum.mariadb.org/10.6/fedora34-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

This is the MariaDB.repo file for rpm.mariadb.org (that should replace it in the future):

[mariadb]
name = MariaDB
baseurl = https://rpm.mariadb.org/yum/10.6/fedora34-amd64
gpgkey=https://rpm.mariadb.org/yum/RPM-GPG-KEY-MariaDB
gpgcheck=1

Comment by Daniel Black [ 2022-02-14 ]

$  host rpm.mariadb.org
rpm.mariadb.org is an alias for mirror.mariadb.org.
mirror.mariadb.org has address 162.55.42.214
mirror.mariadb.org has IPv6 address 2a01:4f8:1c17:e53d::1

http://rpm.mariadb.org/repo/10.6/ has only debian and ubuntu repositories.

Comment by Faustin Lammler [ 2022-02-14 ]

It's under yum:
https://rpm.mariadb.org/yum/

Or am I missing something?

Comment by Daniel Black [ 2022-02-14 ]

my bad, was looking at yum.mariadb.org urls

Comment by Faustin Lammler [ 2022-03-31 ]

Small change in the URL: you should now use https://rpm.mariadb.org, for example:

[mariadb]
name = MariaDB
baseurl = https://rpm.mariadb.org/10.6/fedora34-amd64
gpgkey=https://rpm.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

And in the near future, probably today, the same goes for deb.mariadb.org:

deb https://deb.mariadb.org/10.7/debian bullseye main
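
For anyone updating existing repo files, here is a rough sketch of the URL changes discussed in this thread: the `/yum` prefix dropped on rpm.mariadb.org, the `/repo` prefix dropped on deb.mariadb.org, and yum.mariadb.org mapped to rpm.mariadb.org. The mapping is inferred from the examples in these comments and the helper is hypothetical; check the resulting URLs before relying on them.

```python
from urllib.parse import urlsplit, urlunsplit

# Legacy path prefixes seen earlier in this thread, per host (inferred).
LEGACY_PREFIXES = {"rpm.mariadb.org": "/yum", "deb.mariadb.org": "/repo"}


def migrate_url(url: str) -> str:
    """Rewrite an older repository URL to the newer, prefix-free form."""
    parts = urlsplit(url)
    host, path = parts.netloc, parts.path
    if host == "yum.mariadb.org":
        # Assumption: yum.mariadb.org/X corresponds to rpm.mariadb.org/X.
        host = "rpm.mariadb.org"
    else:
        prefix = LEGACY_PREFIXES.get(host)
        if prefix and (path == prefix or path.startswith(prefix + "/")):
            path = path[len(prefix):]
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(migrate_url("https://rpm.mariadb.org/yum/10.6/fedora34-amd64"))
    print(migrate_url("https://deb.mariadb.org/repo/10.5/debian"))
```

URLs already in the new form pass through unchanged, so the helper can be run repeatedly over the same file.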

Comment by Faustin Lammler [ 2022-04-07 ]

Here is a summary of the URLs:

Regarding the last one, the DNS has not been changed for the moment, see MDBF-356.

Generated at Thu Feb 08 03:36:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.