Operations/Minutes/2024-10-31

OpenStreetMap Foundation, Operations Meeting - Draft minutes

These minutes do not go through a formal acceptance process.
This is not strictly an Operations Working Group (OWG) meeting.

Thursday 31 October 2024, 19:00 London time
Location: Video room at https://osmvideo.cloud68.co

Participants

Minutes by Dorothea Kazazi, including notes by Grant.

Absent

Vector tile serving

Notes by Grant:

We want another machine to run the vector backend.
Move taginfo from dribble to tabaluga.
dribble will become a vector tile server running Debian 12. (Grant to reinstall)
Maybe ship spare disks to AMS? (2x 960 GB)
New Server Quote for Vector? (2024 budget) - Paul

How do we prevent font assets from becoming a publicly used resource? Hash files? Something in a path?

Discussion: could the vector style and fonts be published as an npm package? That would be easy for osm.org to consume.

Suggestions

  • Put Paul's Vector Tiles as a featured layer on www.osm.org.
  • Set up a second machine for Vector Tiles, otherwise there will be no redundancy. Currently, Paul's VTs are running on a VM at Cloud Ferrero.
  • Put Dulcy in next year's budget.

  • VT storage requirement: 1.5 TB, which will increase.
  • Taginfo: uses 72 GB on Dribble.

Machines

  • Dribble (Vector tile server - HPE ProLiant DL360 Gen10): uses about 1 TB of around 4 TB. It stores 2 copies of the planet file (~260 GB total). Most of the used space comes from a user testing planet tiler, stored in /home (~767 GB).
  • Tabaluga (Taginfo server - HP ProLiant DL360 Gen9): does not have enough storage for vector tiles (~480 GB).

Suggestions for Dribble:

  • Delete one copy of the planet.
  • Remove the user testing planet tiler.

Other suggestions

  • Taginfo: move from Dribble (14 cores, rendering-server specs) to Tabaluga (16 cores, but inadequate storage for VT).
  • Contact Sarah Hoffmann or Jochen Topf. We could move them to Debian 12 at the same time, as all packages are properly available.
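
The storage figures above can be checked with a rough back-of-the-envelope sketch. It assumes that all figures are decimal GB, that the two planet copies are the same size, that Taginfo's 72 GB is already counted in Dribble's ~1 TB of used space, and that Tabaluga's ~480 GB is its total capacity. None of these assumptions are confirmed in the notes, and the variable and function names are illustrative only.

```python
# All figures in GB, copied from the notes above; they are approximate.
VT_REQUIRED = 1500          # vector tile storage requirement (will increase)
TAGINFO = 72                # Taginfo usage on Dribble
DRIBBLE_CAPACITY = 4000     # "around 4 TB"
DRIBBLE_USED = 1000         # "1 TB"
PLANET_COPIES_TOTAL = 260   # two planet copies together
HOME_USER_DATA = 767        # /home, mostly the planet tiler test
TABALUGA_CAPACITY = 480     # assumed to be total capacity


def free_after_cleanup(capacity, used, freed):
    """Free space once the listed amounts have been deleted or moved away."""
    return capacity - used + sum(freed)


# Assumption: deleting one of two equal planet copies frees half the total.
freed = [PLANET_COPIES_TOTAL / 2, HOME_USER_DATA, TAGINFO]
dribble_free = free_after_cleanup(DRIBBLE_CAPACITY, DRIBBLE_USED, freed)

print(dribble_free >= VT_REQUIRED)       # Dribble can hold the vector tiles
print(TABALUGA_CAPACITY >= TAGINFO)      # Tabaluga can hold Taginfo
print(TABALUGA_CAPACITY >= VT_REQUIRED)  # Tabaluga cannot hold the vector tiles
```

Under these assumptions the proposed swap is consistent with the numbers: Taginfo fits on Tabaluga, while the vector tile data only fits on Dribble.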

On setting up a second machine for Paul's Vector Tiles

Suggestion: Get a Xeon Scalable v2 HP Gen10, which is about €4,000 - within the limits of our remaining budget.

Plan for AMS

  • Move the gateway off Ironbelly (Site gateway - Supermicro X9DR3-F) and turn it off, as it's not power efficient.
  • Grant needs to work out what to do with the imagery data on Ironbelly.
  • Lockheed: decision needed whether we'll keep it or get more disks.

On Nominatim

  • Dulcy (Nominatim geocoding server): in AMS. Very low power compared to the other Nominatim servers and does not carry much load.
  • Vhagar (HPE ProLiant DL360 Gen10) is also in AMS and runs Nominatim.

Suggestion: Get a new server in US (e.g. Oregon), replacing Dulcy.

Other points mentioned during discussion

  • We need special machines only for imagery and the API database.

Software-side: on fonts

Concern: Paul wants to put a demo on vector.osm.org and is concerned about having to keep the fonts it points to available forever if other people start using them.

  • Paul is currently using a different font stack than he would like, because someone else is hosting that font.
  • The same issue will also have to be solved for the webpage.

Suggestions:

  • Put the fonts and the style behind a CDN.
  • Do it through the website, as we can handle automatic hashing of the URLs, so people hopefully won't rely on them.

If people copy the style file from the website, and ignore the fact that the font is buried inside as a reference, it will break.

  • Paul to publish a node package that includes the style file and the built fonts. OPS can then pull that package and generate Rails assets with hash parts. When Paul makes a new release, he will send a PR to update the version of the package. OPS can point people to that package or to upstream.
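
The "hash parts" idea can be sketched as follows. This is a minimal illustration of content-hash fingerprinting in general, not the Rails asset pipeline itself; the function name, the digest length, and the example file name are assumptions made for the sketch.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path, content, digest_chars=8):
    """Return the asset path with a short content hash inserted before the suffix.

    The hash changes whenever the file content changes, so a URL copied by a
    third party stops resolving after the next release.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}-{digest}{p.suffix}"))


# Hypothetical asset: the same logical file in two releases.
old_url = fingerprinted_name("fonts/example.pbf", b"release 1 bytes")
new_url = fingerprinted_name("fonts/example.pbf", b"release 2 bytes")
print(old_url != new_url)
```

Because the published URL is tied to the content, the website can update its own references automatically at build time, while anyone hard-coding the old URL gets a clear signal that it was never meant to be a stable public resource.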

Other points mentioned during discussion

  • Preferable to avoid the iD model, where a third-party project is copied completely into our code.
  • The style is built in JavaScript.

Action items

  • Move Taginfo from Dribble to Tabaluga.
  • Ask Sarah about shutting down Dulcy.
  • Grant to reformat Dribble with Debian 12.
  • Paul to get a quote for a new server, which might run Nominatim.

Cloudflare

Notes by Grant:

Issue with DNS round robin: server (s1, s2, s3) mod client IP = backend used (regardless of whether the backend is up). Team are aware of the issue now.

Plan is to move to Fastly once the evaluation of its DDOS/security protection is done.

Issue

Cloudflare's backend selection doesn't automatically fail over when a server is down while using round robin. On a paid plan you can enable backend monitoring, but they still don't do any backend retries when you're using round robin.
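
The failure mode described above can be illustrated with a minimal sketch. It assumes the backend is chosen as the client IP (taken as an integer) modulo the number of servers, which is how Grant's notes summarise it. The function names, the server labels, and the health-aware variant are hypothetical and do not describe Cloudflare's actual implementation.

```python
import ipaddress


def pick_backend(client_ip, backends):
    """Pick a backend purely from the client IP; backend health is never consulted."""
    index = int(ipaddress.ip_address(client_ip)) % len(backends)
    return backends[index]


def pick_backend_with_failover(client_ip, backends, is_up):
    """Hypothetical alternative: start at the same index but skip backends that are down."""
    start = int(ipaddress.ip_address(client_ip)) % len(backends)
    for offset in range(len(backends)):
        candidate = backends[(start + offset) % len(backends)]
        if is_up.get(candidate, False):
            return candidate
    return None  # every backend is down


backends = ["s1", "s2", "s3"]
health = {"s1": True, "s2": False, "s3": True}  # s2 is down

# 192.0.2.2 is a documentation address used only for illustration.
print(pick_backend("192.0.2.2", backends))
print(pick_backend_with_failover("192.0.2.2", backends, health))
```

With plain modulo selection, a client whose address maps to a failed backend keeps being sent there until the server is removed from the pool by hand; the second function shows what a retry on the next backend would look like.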

Suggestion

Turn off Cloudflare at some point, as the free plan is probably not suitable for us.

Other points mentioned during discussion

  • The issue might be due to requests not reaching the backend, if something is cached.
  • Tom prefers having a CDN for the static assets rather than for the live ones.

On blocking scrapers

  • For non-static assets, it's harder to block scrapers as this needs to be done at the CDN level.
  • On blocking at the machine level: We have traditionally blocked by IP, but the IP addresses we see on the machines are not those of the scraper; they are those of Fastly or Cloudflare, so we would still accept incoming traffic.
  • Both Cloudflare and Fastly have APIs to block people. For Cloudflare, you have to know the ID of a record to delete it, which essentially means querying all records first to find the ID.
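
The lookup-then-delete pattern mentioned for Cloudflare can be sketched without touching any real API. The record layout (`id`, `ip`) and the function names below are hypothetical stand-ins for whatever the provider's list and delete calls return and accept.

```python
def find_rule_ids(rules, ip):
    """Return the IDs of all block rules that match the given IP."""
    return [rule["id"] for rule in rules if rule["ip"] == ip]


def unblock(rules, ip, delete_by_id):
    """List all rules first, then delete each matching rule by its ID."""
    ids = find_rule_ids(rules, ip)
    for rule_id in ids:
        delete_by_id(rule_id)
    return ids


# In-memory stand-in for the full listing a provider would return.
rules = [
    {"id": "rule-a", "ip": "198.51.100.7"},
    {"id": "rule-b", "ip": "203.0.113.9"},
]
deleted = []
unblock(rules, "203.0.113.9", deleted.append)
print(deleted)
```

The cost of this pattern is that every removal needs a full listing first, which is why keeping a local mapping from IP to record ID at block time may be worth considering.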

On taking a server out of rotation

  • We have to wait for the DNS to expire, which is the way we did it before we moved to Cloudflare.

Plan: Grant to do the work on Fastly first.


OpenMapTiles application

Notes by Grant:

OpenMapTiles: 1) Needs a name, ask them. 2) Technical implementation review for osm.org.

Consensus seemed to be: conditionally add OpenMapTiles as a featured layer, subject to technical considerations and to the proposal of a name other than "OpenMapTiles" (as it is a style on top of OMT) or "OSM tiles".

On distinction

  • Layer is not distinct on a visual basis compared to current ones, only on technical basis.
  • Suggestion: Accept their outdoor style.
  • It might have been better if they used a different style than one cloning OSM Carto. However, they did not put another layer forward.

On novelty

  • They do languages.

On having two vector layers on www.osm.org

  • We shouldn't add a featured layer, if we're going to remove it in a few months.
  • Suggestion: Add both OMT and Paul's Vector Tiles. One will be closed and the other one open.

On the suggestion to add a distinct (V) next to the list of Vector Layers on www.osm.org

  • Preferably not, as it mostly doesn't matter to the user which back-end rendering technology is used.

Other points mentioned during discussion

  • All previous featured layers were Mapnik, but were visually distinct.
  • Layers that become featured on osm.org get a lot of feedback - example, the most recent addition.
  • There's no reason to remove an existing layer, because we're adding a new one.
  • We need to sort out the vector technology, in order to add them.
  • Paul's style is VersaTiles Colorful.

Action items

  • Paul to contact OMT about their application to be added as a featured layer on www.osm.org.
  • Paul to look where OMT pulls their fonts from and their style file.

Editor inclusion policy

Motion

Other points mentioned during discussion

  • It would remove the urgency from finalising the Editor inclusion policy soon.

Action item

Paul to email the board about the Rapid application.


Imagery

Grant sent an email some weeks ago about getting some machines for imagery.

While in Amsterdam, Grant found a machine costing less than GBP 500, excluding the disks:

  • It's a 380 with 14 bays: 2 at the back for the OS, included in the price.
  • 840 RAID controller (performance-boosted).
  • 12 bays at the front, of which 7 are used. Grant has put some of his non-enterprise disks, which are not very fast.

Suggestion:

  • Get new disks.

On budget

  • Karm and Eddie done.
  • We have EUR 2K for upgrade contingency.
  • The machine fits in the budget - ideally needs new disks to lower maintenance.
  • Biggest expense is the disks, not the machine.

Other points mentioned during discussion

  • We can get a Gen10 later.
  • The disks in Ironbelly are dead and were faulty.

Action items reviewed at the beginning of the meeting

  • 2024-09-19 Grant to create a ticket for action item [2024-08-08](https://hackmd.io/su12wMb9TR2kd1I5lLJ8vw) OPS to evaluate Fastly Security (DDOS) Protections we could use. [Topic: Cloudflare / Fastly]
  • 2024-09-19 Grant to create an IP blocklist. [Topic: Cloudflare keep enabled?][2024-09-19 Reportage] - Discussion during [2024-07-25](https://hackmd.io/iyFjUWl1RY6D_pevem8ciA) OPS to make a reasonable evaluation whether to go with Cloudflare, Fastly or none.
  • 2024-09-19 Grant to confirm that the AArnet servers will be removed and to ask the Australian community whether there is interest in hosting/providing a render server in Australia or Asia/Pacific [2024-09-19 topic: AArnet Servers going away]
  • 2024-08-22 Guillaume to make a limited in scale experiment to assess impact and practicality. E.g. look if there are clients that say they support WEBP and don't. [Topic: Fastly image recoding]
  • 2024-08-22 Guillaume to keep OPS in the loop about what Fastly says. [Topic: Fastly image recoding]
  • 2024-08-22 Grant to talk to Guillaume on setting up the testing about image recoding and shielding. [Topic: Fastly image recoding]
  • 2024-08-08 OPS to evaluate Fastly Security (DDOS) Protections we could use. [Topic: Cloudflare / Fastly]
  • 2024-07-25 Grant to determine the Cloudflare API call to block IPs, in order to deal with scrapers [Topic: Cloudflare keep enabled?]
  • 2024-07-25 OPS to make a reasonable evaluation whether to go with Cloudflare, Fastly or none. [Topic: Cloudflare keep enabled?]
  • 2024-06-27 OPS to do capacity planning for tile.openstreetmap.org [Topic: rhaegel usage?]
  • 2024-05-02 OPS to revisit the OpenMapTiles application. # 2024-06-13 They haven't responded to the questions. Paul to email them again.

Action items that have been stricken-through are completed, removed, or have been moved to GitHub tickets.