Operations/Minutes/2025-05-29

OpenStreetMap Foundation, Operations Meeting - Draft minutes

These minutes do not go through a formal acceptance process.
This is not strictly an Operations Working Group (OWG) meeting.

Thursday 29 May 2025, 19:00 London time
Location: Video room at https://osmvideo.cloud68.co

Participants

Minutes by Dorothea Kazazi, including notes by Grant.


New action items from this meeting

  • Grant to pass to the LWG the privacy question about publishing a .csv file of user descriptions for suspended or deleted accounts. [Topic: Publish osm.org spam profile dataset?]

Publish osm.org spam profile dataset?

Grant tried to create a decent word-based training set for identification of spam. He had conversations with Minh and others.

Suggestion:

  • Publish a .csv file of user descriptions for suspended or deleted accounts with 0 edits.
  • For the filter, we could use a time cut off and a size limit for the user profile.

For the ham data, the intention is to use the user descriptions of active osm.org accounts. This is effectively public data, although we don't actively publish the list of users.

The goal is for someone to configure a filter that the OWG can use.
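As an illustration of what a word-based filter trained on such spam and ham descriptions might look like, here is a minimal naive Bayes sketch. It is not the filter discussed at the meeting; the class name `WordSpamFilter`, the tokenizer, and the add-one smoothing are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


class WordSpamFilter:
    """Hypothetical word-based naive Bayes classifier for profile text."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled description ('spam' or 'ham') to the model."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def spam_score(self, text):
        """Return an estimate of P(spam | text) between 0 and 1."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_prob = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Class prior with add-one smoothing.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing so unseen words do not zero the product.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_prob[label] = score
        # Clamp the log-odds so math.exp cannot overflow on long texts.
        diff = max(min(log_prob["ham"] - log_prob["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


spam_filter = WordSpamFilter()
spam_filter.train("buy cheap watches online", "spam")
spam_filter.train("cheap casino bonus online", "spam")
spam_filter.train("I map hiking trails in Wales", "ham")
spam_filter.train("Mapping bus stops and cycle paths", "ham")
```

A score above some threshold (0.5 here, purely as an example) would mark a description as likely spam; the threshold and any additional signals would need tuning against real data.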

Potential issues:

  • We don't have a distinction between users who deleted themselves and users who we have deleted.
  • Some users link to their osm wiki profiles, so it is easy to figure out who they are. OWG don't "see" any privacy issue, but should likely get approval from LWG.

Other points mentioned during discussion

  • Among the suspended accounts, only a limited number are non-spammy.
  • Suggestion: The Rails port could query an endpoint.
  • In the future, IP addresses and email addresses may be considered for the spam score.

Spam:
SELECT u.description FROM users u WHERE u.status IN ('deleted', 'suspended') AND u.changesets_count = 0 AND u.description != '' AND u.display_name NOT ILIKE 'user\_%' AND u.creation_time >= '2024-05-01' AND length(u.description) <= 1MB

Ham:
SELECT u.description FROM users u WHERE u.status IN ('active') AND u.changesets_count >= 10 AND u.description != '' AND u.display_name NOT ILIKE 'user\_%' AND u.creation_time >= '2020-01-01'
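To make the two selection rules easier to check, the sketch below mirrors them as in-memory Python predicates and writes the matching descriptions to CSV. The `User` dataclass, the helper names, and the reading of "1MB" as 1,000,000 characters are assumptions for illustration; the real data lives in the osm.org database and would be queried there.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class User:
    """Hypothetical stand-in for the columns used in the queries."""
    display_name: str
    status: str
    changesets_count: int
    description: str
    creation_time: date


# Assumption: "1MB" in the minutes is treated as one million characters.
MAX_DESCRIPTION_CHARS = 1_000_000


def has_custom_profile(user):
    """description != '' AND display_name NOT ILIKE 'user\\_%'."""
    return (user.description != ""
            and not user.display_name.lower().startswith("user_"))


def is_spam_sample(user):
    """Suspended or deleted accounts with no edits, created since 2024-05-01."""
    return (user.status in ("deleted", "suspended")
            and user.changesets_count == 0
            and has_custom_profile(user)
            and user.creation_time >= date(2024, 5, 1)
            and len(user.description) <= MAX_DESCRIPTION_CHARS)


def is_ham_sample(user):
    """Active accounts with at least 10 changesets, created since 2020-01-01."""
    return (user.status == "active"
            and user.changesets_count >= 10
            and has_custom_profile(user)
            and user.creation_time >= date(2020, 1, 1))


def to_csv(users, predicate):
    """Return a one-column CSV of descriptions for users matching predicate."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["description"])
    for user in users:
        if predicate(user):
            writer.writerow([user.description])
    return buffer.getvalue()


sample_users = [
    User("pillshop", "suspended", 0, "Buy pills", date(2024, 6, 1)),
    User("user_12345", "deleted", 0, "leftover text", date(2024, 7, 1)),
    User("hillwalker", "active", 250, "I map trails", date(2021, 3, 9)),
    User("newbie", "active", 2, "Hello", date(2025, 1, 2)),
]
```

Using the `csv` module rather than joining strings by hand matters here because profile descriptions can contain commas, quotes, and newlines.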

Action item: Grant to pass the privacy question to the LWG.


Vector Tiles

Paul did a load test on Dribble, which is slower than Faffy. For comparison, raster tiles for the standard layer peak at 5,000 backend requests/sec. Faffy: 15,000 backend requests/sec maximum.

  • Grant to discuss cache headers with Paul once we have some realistic traffic.

Changes by Tom: Tom fixed a raster mod_tile caching config issue. Related: https://github.com/openstreetmap/chef/commit/f4d7ffa4de206890c7ed7a5e91c135d3be442099

The mod_tile issue: it doesn't properly merge the server-level config with the virtual host config. As a result, none of the tile expiry settings we had at server level were being applied in our virtual host. Every tile was just getting the random three hours plus 20% of the difference between now and when it was last rendered. Expiry settings now applied:

  • zooms 0 - 9: set at seven days.
  • zooms 10 - 13: set at one day, even though 13 is dynamic and not pre-rendered.
  • zooms >13: shorter.
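The expiry behaviour described above can be sketched as two small functions: one for the fallback that was being applied before the fix, and one for the per-zoom settings. This is an illustration of the numbers in the minutes, not mod_tile's actual code; the function names are hypothetical, and since the minutes only say "shorter" for zooms above 13, no value is filled in for that range.

```python
from datetime import timedelta


def expiry_before_fix(age_since_render):
    """Fallback applied to every tile before the fix.

    Three hours plus 20% of the time since the tile was last rendered.
    """
    return timedelta(hours=3) + 0.2 * age_since_render


def configured_expiry(zoom):
    """Per-zoom expiry as recorded in the minutes.

    Returns None for zooms above 13, because the minutes only state
    that the expiry is "shorter" without giving a value.
    """
    if zoom < 0:
        raise ValueError("zoom must be non-negative")
    if zoom <= 9:
        return timedelta(days=7)
    if zoom <= 13:
        return timedelta(days=1)
    return None


# A tile rendered 10 hours ago used to expire after 3h + 2h = 5h.
old_expiry = expiry_before_fix(timedelta(hours=10))
```

Comparing `expiry_before_fix` with `configured_expiry(5)` shows why low-zoom tiles were being re-requested far more often than intended: a few hours instead of seven days.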

On effect of changes: Some improvement in hit ratio.

  • The hit rate seems up from 82% to 87%.
  • Queue lengths seem lower, probably because the expiries are now longer.
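A five-point rise can look small, so the short calculation below shows what it means for the backend, assuming the 82% and 87% figures are cache hit ratios and that total request volume stays the same. The function names are illustrative only.

```python
def backend_share(hit_ratio):
    """Fraction of requests that miss the cache and reach the backend."""
    return 1.0 - hit_ratio


def relative_backend_reduction(old_hit_ratio, new_hit_ratio):
    """Relative drop in backend requests at constant total traffic."""
    old_miss = backend_share(old_hit_ratio)
    new_miss = backend_share(new_hit_ratio)
    return (old_miss - new_miss) / old_miss


reduction = relative_backend_reduction(0.82, 0.87)
print(f"Backend requests reduced by about {reduction:.1%}")
```

Under those assumptions the miss share falls from 18% to 13%, which is roughly 28% fewer requests reaching the render backend. That is consistent with the shorter queue lengths noted above.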


Next steps for Paul

Work on:

  • Coastlines (essential)
  • Right-to-left text on the demo page.
  • Tilekiln
  • Final review of shortbread.
  • Another pass of tile optimisation.
  • Paul will PR to osm.org when ready.

AWS backups

Grant to work on the AWS backup process.

Plan: start sending backups to respective folders: "daily", "weekly", "monthly" - each with different expiry rules. Essentially, we will keep the longer ones and expire the shorter ones automatically.
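One way the folder plan could be expressed is sketched below, assuming the backups are stored in an S3 bucket with prefix-based lifecycle rules. The retention periods, the rule for choosing a folder (first of the month, Sundays, otherwise daily), and the function names are all assumptions; the minutes do not specify them. The dictionary follows the general shape of an S3 lifecycle configuration, as accepted for example by boto3's `put_bucket_lifecycle_configuration`.

```python
from datetime import date

# Assumed retention periods in days; the minutes give no figures.
RETENTION_DAYS = {"daily": 14, "weekly": 90, "monthly": 730}


def backup_folder(day):
    """Pick the folder for a backup taken on the given date.

    Assumption: the first of the month is the monthly backup, any other
    Sunday is the weekly backup, and everything else is daily.
    """
    if day.day == 1:
        return "monthly"
    if day.weekday() == 6:  # Monday is 0, Sunday is 6
        return "weekly"
    return "daily"


def lifecycle_rules(retention=None):
    """Build one expiry rule per folder prefix."""
    retention = RETENTION_DAYS if retention is None else retention
    return {
        "Rules": [
            {
                "ID": f"expire-{folder}",
                "Filter": {"Prefix": f"{folder}/"},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
            for folder, days in retention.items()
        ]
    }


rules = lifecycle_rules()
```

Letting the storage service expire objects by prefix keeps the backup job itself simple: it only has to decide which folder to write to, and never has to delete anything.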


Action items reviewed at the beginning of the meeting

  • 2025-05-01 Grant to follow-up with Australian hosting again. [Topic: OSUOSL funding / issues]
  • 2025-05-01 Grant to see if other University offers are still available and what hardware would be required. [Topic: OSUOSL funding / issues]
  • 2025-03-20 Grant to investigate whether Karm's latency spike on 10 Jan 2025 is due to IO or network. Most likely IO. Karm may need upgrading to handle sync. [Topic: Database Server Upgrades] Done.
  • 2025-03-20 Grant to negotiate with HE.net if we can get better cost from them as a fallback link (which he had proposed), to allow budget spend elsewhere. [Topic: HE.net]
  • 2025-03-20 Grant to follow-up with two new people. Will see if can support onboarding them. 1 with Chef PR, changes to PR still pending. 1 asking about container work. [Topic: New people] Done
  • 2025-03-20 Grant to run an SQL query to identify more email providers used by spammers. [Topic: Spam]
  • 2025-03-06 Grant to present a draft budget at the next meeting.
  • 2024-09-19 Grant to create an IP blocklist script. [Topic: Cloudflare keep enabled?][2024-09-19 Reportage] - Discussion during 2024-07-25 OPS to make a reasonable evaluation whether to go with Cloudflare, Fastly or none.

Action items that have been stricken-through are completed, removed, or have been moved to GitHub tickets.