Operations/Minutes/2024-09-19
OpenStreetMap Foundation, Operations Meeting - Draft minutes
These minutes do not go through a formal acceptance process.
This is not strictly an Operations Working Group (OWG) meeting.
Thursday 19 September 2024, 19:00 London time
Location: Video room at https://osmvideo.cloud68.co
Participants
- Grant Slater (OWG)
- Tom Hughes (OWG)
- Paul Norman (OWG)
Minutes by Dorothea Kazazi.
New action items from this meeting
- Grant to create a ticket for action item [2024-08-08] OPS to evaluate Fastly Security (DDOS) Protections we could use. [Topic: Cloudflare / Fastly]
- Grant to create an IP blocklist. [2024-09-19 Reportage]
- Paul to add the OpenMapTiles application to the next agenda, together with the editor inclusion policy. [2024-09-19 Reportage]
- Grant to come up with estimates for space needs in the next 5 years and the cost of an Ironbelly replacement. [2024-09-19 topic: Ironbelly Replacement?]
- Grant to confirm that the AArnet servers will be removed and to ask the Australian community whether there is interest in hosting/providing a render server in Australia or Asia/Pacific. [2024-09-19 topic: AArnet Servers going away]
Reportage
To work on the builder for Debian packages
Related to action item [2024-08-08] Grant to work on the builder for Debian packages [Topic: apt.openstreetmap.org next steps?].
- Seems OK now, as we've got upstream package builders and, if needed, we could fork.
- We could automate the upload. There is an issue for it (#999), so the action item will get closed.
To evaluate Fastly Security (DDOS) Protections we could use
Related to action item [2024-08-08] OPS to evaluate Fastly Security (DDOS) Protections we could use. [Topic: Cloudflare / Fastly].
- We can turn on some Fastly Security (DDOS) Protections which can extend the rate-limiting.
- There are more controls than with Cloudflare.
- They use a service that seems to be provided by a third party, or that they may have acquired.
- The settings are preset, and they extend rate limiting.
- Protections are enabled to a degree on our Enterprise account.
Action item
Grant to create a ticket.
To make a reasonable evaluation whether to go with Cloudflare, Fastly or none
Related to action item [2024-07-25] OPS to make a reasonable evaluation whether to go with Cloudflare, Fastly or none. [Topic: Cloudflare keep enabled?]
- We have Cloudflare on right now.
- By disabling Cloudflare we accept the risk that there might be some outages, as previously. We should respond more quickly now that we are familiar with the necessary procedures.
On IP block lists
- The IP block lists were lost during the machine reinstallations with Debian.
- We might have some blocklists with Cloudflare and Grant has an old back-up, which might need cleaning.
Action item
Grant to create an IP blocklist.
To do capacity planning for tile.osm.org
Related to action item [2024-06-27] OPS to do capacity planning for tile.openstreetmap.org [Topic: rhaegel usage?].
Paul is looking into capacity.
On rhaegel
- Has fast disks, but the CPUs are a few generations old.
- Ysera + Odin are being upgraded, and rhaegel was significantly slower than them.
- There is no decision yet on what will be run on the system.
Grant might need to go to Slough and Amsterdam.
To revisit the OpenMapTiles application
Related to action item [2024-05-02] OPS to revisit the OpenMapTiles application.
Action item
Paul to add the OpenMapTiles application to the next agenda, together with the editor inclusion policy.
Ironbelly Replacement?
Ironbelly currently hosts imagery (more below). It runs os.openstreetmap.org and the Australian reference service, which will need to be moved to the new machine.
Ironbelly is old
- One hard disk has failed.
- Another disk is at risk of failing, as a single disk failure puts additional strain on the remaining disks.
- There have been 10 tickets on Ironbelly disks since 2017. Each ticket probably needs at least 1-2 hours of work.
Space
- There is 36 TB of data, with nearly all of it in use. Tom has copied over the logs, which have not been expiring correctly for some time. Grant will remove the logs.
Plan: Replace Ironbelly with a more modern HP machine that is good enough, reusing some existing parts.
On cost
- Do not want to spend much money on it.
- NVMe drives are nice but expensive. The price difference is GBP 6,000.
Suggestions
- Install the disks where applicable and acquire a slightly more modern machine.
- Imagery services os.osm.org, Australian service, and others need to be migrated to the new system.
- Not unreasonable to spend the money to do it.
- Assess the required storage capacity for the next five years.
Other points mentioned during discussion
- We may be spending more on remote assistance ("smart hands") and Grant's time addressing issues with old hard drives than the cost difference between older and newer drives over the past 12 years.
- Recently, two disks were replaced through remote assistance.
Imagery
- Grant wanted to test imagery on S3; however, there were not enough credits available.
- We did not agree to turn off the services over the weekend to save sufficient funds, so the testing was not carried out.
Issues related to imagery
- Source imagery needs to be stored somewhere.
- Brazil source imagery is 110 GB, once converted to a suitable format, processed and then retained.
- Suggestion: use the AWS open data program.
- In some cases we might have some imagery which was provided to us, but we can't distribute it.
Brazilian imagery
- We got access to a semi-private FTP server to download it.
- The National Mapping Service is involved in mapping activities. They own the imagery and do not consider derivative works created from it to be subject to copyright links to the original work. They have granted us permission to use it.
- We can't use the AWS Open Data program for this case.
South Africa imagery
- Total processed data: 7 TB.
- Source imagery: 16 TB.
- Licence: We have permission to use and distribute the imagery. The South African National Mapping Agency is credited on our copyright page as the source. Agencies are generally unaware of copyright.
- New imagery is published monthly. We will receive the next set, approximately 1 TB, after which we will pause, as it will cover the entire country at a resolution of 36 cm.
Surrey imagery
- It was provided to us, probably under ODbL.
Australian reference imagery
- Not used much anymore.
- The source imagery is giant TIFF files.
US imagery
- We have imagery for one of the bigger counties of Texas.
Action item
Grant to come up with estimates for space needs in the next 5 years and the cost of an Ironbelly replacement.
AArnet Servers going away
https://community.openstreetmap.org/t/aarnet-tile-cache-decomissioning/118953
Background:
A community member in Australia had contacted AARnet to get two servers for us. They were identical, and we used one as a cache server and one as a render server.
Issue:
The community member received an email from AARnet stating that the servers will get removed at the end of October. However, he did not inform us directly and instead emailed talk-au to notify them.
- Grant emailed the community member today for clarification.
- The discussion on talk-au seemed to indicate that there is academic interest in hosting and providing replacement servers.
On servers' traffic
- The current traffic is minimal.
- The absence of the servers will result in a slight increase in latency.
Suggestion: Ask the Australian community to get a render server in Australia or Asia/Pacific. Singapore would be expensive.
Action item
Grant to confirm that the AArnet servers will be removed and to ask the Australian community whether there is interest in hosting/providing a render server in Australia or Asia/Pacific.
Fastly Shielding
Shielding
With shielding, all traffic to a particular backend server goes through one Fastly location, which adds another layer of caching. Fastly picks a cache close to your backend server, and that becomes the single point of entry. Essentially, it is two-tier caching in which the inner tier, not the edge, is what accesses your server.
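The two-tier idea can be illustrated with a minimal sketch. This is a toy model, not Fastly's implementation: the `Origin` and `Cache` classes, the number of edges, and the tile path are all hypothetical and chosen only for illustration.

```python
class Origin:
    """Stands in for the backend server; counts how often it is hit."""

    def __init__(self):
        self.hits = 0

    def get(self, key):
        self.hits += 1
        return f"content for {key}"


class Cache:
    """A cache node that forwards misses to its parent (a cache or the origin)."""

    def __init__(self, parent):
        self.parent = parent
        self.store = {}

    def get(self, key):
        if key not in self.store:
            # Cache miss: fetch from the next tier and keep a copy.
            self.store[key] = self.parent.get(key)
        return self.store[key]


# With a shield: every edge sends its misses to the same inner cache.
shielded_origin = Origin()
shield = Cache(shielded_origin)
shielded_edges = [Cache(shield) for _ in range(3)]

# Without a shield: every edge sends its misses straight to the origin.
direct_origin = Origin()
direct_edges = [Cache(direct_origin) for _ in range(3)]

for edge in shielded_edges + direct_edges:
    edge.get("/tile/0/0/0.png")

print("origin hits with shield:", shielded_origin.hits)
print("origin hits without shield:", direct_origin.hits)
```

In this toy model the shielded origin is fetched from once for the three edges, while the unshielded origin is fetched from once per edge.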
- Shielding didn't improve health check failures or latencies in previous tests, even though it's going from Fastly to Fastly.
- This may have been influenced by the fact that the test was in Australia.
- There seemed to be a race with the timeouts, and we had probably changed the interval or the timeout.
- Fastly requires us to enable shielding when using their image transformation service to avoid redoing the imagery work on each edge cache.
- This issue is not a priority.
Decision: Drop the image transformation.
Suggestions
- Do an experiment with shielding in a different region; we could seek Fastly's assistance in setting it up.
- Paul would like to work on serving error tiles as images.
- Use the tester distribution and then do a migration.
Any other business
Grant's travel to Amsterdam
Still deciding on the best travel option, but leaning towards driving and taking the ferry, as:
- it costs about the same as other travel options,
- runs overnight, eliminating the need for expensive accommodation in Amsterdam,
- allows for easier transport of bulky items, such as label printers and tools.
Replacement of the motherboard that failed
- Plans to replace the motherboard in the failed machine, which was mentioned by Paul in a previous discussion.
- Gen9 model.
- For GBP 45, there’s potential to get the machine operational again, though it’s a gamble.
- NVMe drives are now cheaper than SATA drives (Enterprise), especially for larger capacity drives.
Action items reviewed at the beginning of the meeting
- 2024-08-22 Guillaume to make a limited in scale experiment to assess impact and practicality. E.g. look if there are clients that say they support WEBP and don't. [Topic: Fastly image recoding]
- 2024-08-22 Guillaume to keep OPS in the loop about what Fastly says. [Topic: Fastly image recoding]
- 2024-08-22 Grant to talk to Guillaume on setting up the testing about image recoding and shielding. [Topic: Fastly image recoding]
- 2024-08-08 Grant to work on the builder for Debian packages [Topic: apt.openstreetmap.org next steps?]
- 2024-08-08 Grant and Tom to discuss apt.osm.org supports api upload [Topic: apt.openstreetmap.org next steps?]
- 2024-08-08 OPS to evaluate Fastly Security (DDOS) Protections we could use. [Topic: Cloudflare / Fastly]
- 2024-07-25 Grant to determine the Cloudflare API call to block IPs, in order to deal with scrapers [Topic: Cloudflare keep enabled?]
- 2024-07-25 OPS to make a reasonable evaluation whether to go with Cloudflare, Fastly or none. [Topic: Cloudflare keep enabled?]
- 2024-06-27 OPS to do capacity planning for tile.openstreetmap.org [Topic: rhaegel usage?]
- 2024-05-02 OPS to revisit the OpenMapTiles application. # 2024-06-13 They haven't responded to the questions. Paul to email them again.
Action items that have been stricken-through are completed, removed, or have been moved to GitHub tickets.