[MDBF-15] Implement a fast, complete, archive download server Created: 2020-04-06  Updated: 2022-02-01  Resolved: 2021-11-10

Status: Closed
Project: MariaDB Foundation Development
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Ian Gilfillan Assignee: Faustin Lammler
Resolution: Done Votes: 0
Labels: download
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to MDEV-22673 mariadb-backup packages are not mirro... Closed

 Description   

archive.mariadb.org contains almost all MariaDB releases (a few are still missing).

This server is very slow. http://ftp.hosteurope.de/mirror/archive.mariadb.org/ is an alternative archive, and much faster, but they unfortunately no longer mirror everything.

We need a faster version of archive.mariadb.org - usage should be low, but when someone does request an old version, it should be delivered quickly!



 Comments   
Comment by Ian Gilfillan [ 2020-04-07 ]

dbart has added the missing release (the others were pulled releases), so archive.mariadb.org is now complete.

Comment by Faustin Lammler [ 2020-06-09 ]

Will check what's happening with that server. One temporary solution could be to cache it with Cloudflare.

Comment by Faustin Lammler [ 2020-06-09 ]

Hi,
Here is my proposal for archive.mariadb.org.

1. subscribe to https://www.hetzner.com/storage/storage-box/bx50
The current size of the archive is:

faustin@hasky:~$ du -sh /ds1819/data/
3.9T	/ds1819/data/

So starting with 5TB seems reasonable.
Upgrading later to https://www.hetzner.com/storage/storage-box/bx60 is still possible in a matter of minutes, without downtime.
The traffic between our HZ VM and the storage box stays inside the HZ network and is therefore unlimited.

2. mount the storage box on the hz-www VM using SSHFS

3. configure the new archive.mariadb.org website on hz-www (nginx)
hz-www is the VM that hosts some of our static websites (collation-charts.org, hbm.mariadb.org, mirmon.mariadb.org).

I suggest we use Cloudflare caching for archive.mariadb.org because:

  • it's HTTP, so there is no dirty TLS interception by Cloudflare;
  • SSHFS comes with some latency overhead, and caching on the Cloudflare CDN will drastically improve browsing responsiveness (above all for non-EU users);
  • if for some reason there is a download peak for a particular archive package, then once it has been delivered a first time by HZ to Cloudflare, the bandwidth is going to be absorbed by Cloudflare.

4. give dbart access to hz-www so he can update the new archive.mariadb.org by scp/rsync or his preferred solution.

awren, kaj this would add a monthly fee of 26.28€ on our HZ invoice.

Comment by Faustin Lammler [ 2020-06-09 ]

To complete this, maybe greenman or dbart can do some storage projection for the future (see below)?

faustin@hasky:/ds1819/data$ du -sh mariadb-10.5.3
53G	mariadb-10.5.3

So every release adds more or less 53GB of data to the archive.
Knowing the release rhythm would tell us when 5TB and 10TB will no longer be sufficient.
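
The projection asked for here comes down to simple arithmetic, sketched below. The 3.9T and 53G figures are the `du` outputs quoted in this ticket. Everything else is an assumption: that every future release is about the size of mariadb-10.5.3, that the storage box capacity uses the same 1024-based units as `du -h`, and above all the release rate, which this ticket does not state and which appears only as a hypothetical placeholder.

```python
import math

ARCHIVE_TB = 3.9   # current archive size, from `du -sh /ds1819/data/`
RELEASE_GB = 53    # size of one release, from `du -sh mariadb-10.5.3`
GB_PER_TB = 1024   # `du -h` reports powers of 1024


def releases_until_full(capacity_tb, used_tb=ARCHIVE_TB, release_gb=RELEASE_GB):
    """Number of whole releases that still fit before capacity_tb is exceeded."""
    free_gb = (capacity_tb - used_tb) * GB_PER_TB
    return max(0, math.floor(free_gb / release_gb))


def months_until_full(capacity_tb, releases_per_month):
    """Months of headroom at an assumed release rate."""
    return releases_until_full(capacity_tb) / releases_per_month


if __name__ == "__main__":
    # Hypothetical placeholder: the real release rhythm is not given in the ticket.
    ASSUMED_RELEASES_PER_MONTH = 1.0
    for capacity in (5, 10):
        n = releases_until_full(capacity)
        m = months_until_full(capacity, ASSUMED_RELEASES_PER_MONTH)
        print(f"{capacity}TB: room for {n} more releases, about {m:.0f} months")
```

Under these assumptions, 5TB leaves room for 21 more releases of this size and 10TB for 117. Converting that into a date only requires replacing the placeholder rate with the real one.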

Comment by Faustin Lammler [ 2020-06-12 ]

Due to https://mariadb.com/kb/en/mirror-sites-for-mariadb/+comments/4623#comment_4624, I removed the Cloudflare DNS proxy for archive.mariadb.org.
If we want to keep caching but still allow rsync, we should use another subdomain (for instance archive-rsync.mariadb.org).
See: https://support.cloudflare.com/hc/en-us/articles/200169156-Identifying-network-ports-compatible-with-Cloudflare-s-proxy
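
The constraint behind the separate subdomain can be illustrated with a small port check. This is only a sketch: the port sets are my reading of the Cloudflare article linked above and should be verified against its current version, 873 is the default rsync daemon port, and the function name is hypothetical.

```python
# Ports that Cloudflare's proxy forwards, as listed in the linked article
# (assumption: verify against the current documentation).
CLOUDFLARE_HTTP_PORTS = {80, 8080, 8880, 2052, 2082, 2086, 2095}
CLOUDFLARE_HTTPS_PORTS = {443, 2053, 2083, 2087, 2096, 8443}

RSYNC_DAEMON_PORT = 873  # default port of the rsync daemon


def passes_cloudflare_proxy(port):
    """Return True if traffic on this port is forwarded by the proxy."""
    return port in CLOUDFLARE_HTTP_PORTS | CLOUDFLARE_HTTPS_PORTS


if __name__ == "__main__":
    for name, port in (("http", 80), ("https", 443), ("rsync", RSYNC_DAEMON_PORT)):
        status = "proxied" if passes_cloudflare_proxy(port) else "not proxied"
        print(f"{name} ({port}): {status}")
```

Since the rsync port is not in either set, a proxied hostname cannot serve rsync, which is why a DNS-only subdomain is needed for it.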

Comment by Daniel Bartholomew [ 2020-06-12 ]

rsync.mariadb.org is shorter and a bit easier to remember; we just need to document it on the Mirroring MariaDB page (once it is set up, of course).

Comment by Faustin Lammler [ 2020-06-12 ]

Ok! I propose we do not modify this for the moment. Once we have the new archive.mariadb.org (I note that I also have to implement the rsync service!), we will see whether it's necessary to cache it with Cloudflare (we will then have metrics of the bandwidth usage).

Comment by Faustin Lammler [ 2020-09-09 ]

A new archive server has been implemented:
https://archive.mariadb.org/

It's available over HTTPS, on both IPv4 and IPv6.
It also has an rsync server.

Generated at Thu Feb 08 03:34:50 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.