Sub4
Sub4 is the name I just made up for a project that aims to get the parse time of all pages on the RuneScape wikis below 4 seconds.
We've done a pretty good job caching pageviews via Cloudflare (and the parser cache generally helps a lot too). But because the daily Grand Exchange updates invalidate all the caches, there's a pretty hard lower bound on the number of pages we need to generate from scratch every day. A large percentage of those happen immediately after GEBot runs, which leads to very spiky (read: unstable and expensive) resource utilization, and also unexpectedly slow load times for "randomly selected" users. It's maybe a bit counterintuitive, but because a page can't be cached until it's been parsed, every request that arrives while a slow page is still being parsed is also a cache miss, so the total resource utilization for certain popular, slow pages (Money making guide) actually grows quadratically with the parse time.
At some level, we can't really prevent this spikiness (short of insane suggestions like moving most of the GE stuff outside the MediaWiki layer). It's just a limitation of being RuneScape wikis, one that most other wikis don't need to deal with. But what we do have control over is how long each of those pages takes to load.
There are a bunch of techniques we can use to make things faster, a lot of which are actually pretty easy. This improves things for everyone and probably gives us more money to waste at wikifest.
How do I see how long a page takes to load?
Just refresh the page and count the seconds. Duh!
No, okay, really what you want to do is click "edit", and then preview the page. If you scroll down to the bottom and click "Parser profiling data", there'll be a box that has a field called "Real time usage". That's how long it took.
The "CPU time usage" is also useful for tracking how much of the time was the actual server thrashing, rather than waiting for a result, although I've found it to be somewhat unreliable (for example, calling out certain waiting-time as the CPU's fault when really it was I/O-related).
How pages get generated
(Note: this is a simplified version of my own understanding. Please don't read this if you actually know how MediaWiki works!)
We have multiple layers of caching, which means that most pageviews never need to hit the database to generate content, or run DPL/SMW. But those caches get invalidated if someone edits a page, or a price changes.
When that happens, the page needs to get regenerated from the revision text in the database. The MediaWiki parser reads that text, figures out which templates/modules/etc. are used, loads those revisions from the database, and repeats the process until all the dependencies are known. Then it does a full parse on all of this content, which often involves more reads (to figure out what color links should be, how to size images, any DPL/SMW, etc.).
A page parse for an average article can often involve 500 or more total database reads (big pages can be way more!). MediaWiki does synchronous database I/O (i.e. if you need to iterate over a list of things and do a query for each of them, they happen one at a time). When the response time from the database is around a millisecond (pretty much regardless of the actual query complexity), this time spent waiting for a DB response often adds up to a majority of the total parse time.
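To put rough numbers on it: 500 sequential queries * ~1ms of round trip each is already ~0.5 seconds of the parse spent doing nothing but waiting on the database, before any actual parsing work happens.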
This is a really important takeaway: In my experience with RuneScape, page parsing time is usually dominated by waiting for database I/O, rather than anything related to the CPU usage of the parser.
Strategies to make page loads faster
Here are a few things everyone could help with:
- DPL include is evil - We should almost never, ever use the include tag from DPL. I don't fully understand how it grabs template parameters, but I think it parses each of the response pages, which is extremely slow. It's actually even worse than that, because technically any included pages count as transclusions, so any time any of those pages changes (say, you edit a single achievement), it needs to re-generate the index page, which means re-parsing all of the others too. Most of our worst page speeds are DPL includes, and there are still a lot of fairly easy wins we could get by moving them over to SMW, including Disassembly materials, rune pages and more. In the immortal words of Gaz, "DPL bad, SMW less bad".
- Reduce the number of properties read/written by SMW - "DPL bad, SMW less bad", but SMW is still pretty bad: the amount of time it takes to get a SMW #ask response (and I guess also to do a #set) is directly proportional to the number of properties (columns) in the query. This makes it worth putting as many of the properties as possible into a single "JSON" field that can be read further by Lua (JSON parsing is really fast). If you need a field not just for a printout, but also to filter the output list (e.g. [[Dropped from::Lesser demon]]), it's okay to have that separately, although you should also consider filtering using categories. There's a sketch of this after the list.
- Get rid of 5-year-old userspace DPLs - This is less about parse times proper, but still useful. Any time someone has a DPL in their userspace with an include, and one of the included pages gets edited, their DPL will get re-parsed by the job queue jobs that update page links. This is a huge drain on resources (for example, one person's DPL subpages from years ago used to be responsible for 10% of our entire network's job queue costs). Generally these DPLs were meant for one-time use, but they've just sat there for years, constantly using up our resources. I generally blank these when I come across them (unless they're very recent), and you should too.
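To make the single-JSON-field idea a bit more concrete, here's a minimal sketch of what the SMW side could look like. Apart from Dropped from, the property and template names here are made up for illustration, and the real templates obviously set and read far more than this. On each page being indexed, store the one property you actually filter on, plus a single blob for everything that's only ever printed:

<!-- In the (hypothetical) row template: one filterable property, plus one JSON blob for everything that's only printed -->
{{#set:
  Dropped from = {{{monster|}}}
  |Drop JSON = {"item":"{{{item|}}}","quantity":"{{{quantity|}}}","rarity":"{{{rarity|}}}"}
}}

The index page then asks for just one printout column instead of ten-plus:

<!-- On the index page: fetch only the blob, and hand each result row to a template -->
{{#ask: [[Dropped from::Lesser demon]]
  |?Drop JSON
  |format=template
  |template=Drop row
  |link=none
}}

where the (hypothetical) Drop row template passes the blob to a Lua module that decodes the JSON and builds the table row. Adding a new display field then only changes the blob, rather than adding yet another SMW column to every query.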
Here are a few things we should do as a network, but which are hard for a single person to help with on-wiki:
- Co-locating infrastructure better - Since page parse time is so closely related to [number_of_db_queries] * [db_round_trip_time], if we can reduce the round-trip time by 50%, then we get something fairly close to a 50% reduction in parse time. Our current round-trip time is about 1ms, which sounds good, until you need to do 1000 database queries to generate the page. It's not clear this can be improved a ton unless the resources are on the same physical machine (instead of just the same LAN). This is frustrating because it's generally considered bad practice to put your database and webservers on the same physical machine, but doing so can make a massive difference in database latency (something like 10-20 fold). This is why development wikis are often much faster than production ones: usually everything's just located on the same machine. I think this general logic applies not just to the MySQL layer but also to the Redis caches we're using.
- Replace SMW with something faster - Roughly 50% of our total database reads and 75% of our writes are coming from SMW, and the whole system is so complex that there's not a single person alive that understands the entire codebase. SMW is not optimized for our types of use cases, and moving away from it would almost certainly improve page latency. Cargo is another option, although it also has some performance characteristics that I don't love, and we'd probably need to fork/patch it to make it work well for us. It might be easier to just make our own thing (Bucket, anyone?), but I wouldn't commit to that yet.
- Cache more sub-page-level resources that don't change - We have plenty of memory on the MediaWiki pods that we're not using. If there are opportunities to avoid database queries by just storing things locally (for example, increasing SMW entity cache sizes), then that's a no-brainer. It's hard to know what those would be now, though. We can also generally afford to cache more stuff per-page than English Wikipedia (which is what MediaWiki core is most optimized for), so we should take advantage of that.
- It would also be cool if there was a way for the parser cache to not invalidate parts of the page that haven't changed (e.g. navboxes), but I don't think this exists, and it sounds very hard, and would probably have to be a core MediaWiki change.
Generally the best technique for finding things to speed up is to profile page loads on a dev wiki. There is a MediaWiki profiling tool that gives a pretty decent overview of the timing of each method call. I can also generate a list of the worst-offending pages, if that would be useful to folks. Let me know.
Significant wins
- GCP migration (Nov 2021) - The major site refactor reduced page loads by about 35% on average, with particularly good (~70%) reductions on pages with lots of SMW asks. We think this was largely due to decreased database latency, but we're not totally sure.
- RevisionStore::loadSlotRecords caching (Dec 2021) - We shamelessly stole this from upstream Wikimedia, although Kitty added some improvements for the Module namespace. It reduces unnecessary DB queries and saved us about 10% on average page load time.
- DropsLine rework (Feb 2022) - Reduced the SMW properties from 10+ to 3, simplified queries and generally reduced the amount of Lua code executed. Reduced the parse time for {{Drop sources}} by about 50%, {{Average drop value}} by about 70%, and {{DropsLine}} by about 10% (less than I was hoping...)
- LinkCache caching (Feb 2022) - Roughly half of our total database queries (~100m/day) were from MediaWiki trying to check whether a page exists, to decide how to display the link content of an image. We made it cache more of the responses, which reduced the network-wide database queries by about 40%, and reduced load time by about 10% on average (much more for pages with lots of icons in navboxes).
- Money making guide rework - Still in progress, but moving the MMGs to use SMW instead of DPL include made the main index go from 35 seconds to 2(!). This is probably the coolest one so far.
- Achievements rework - Similar story: move to SMW instead of DPL include. Makes the main list page go from 24 seconds to about 4. This might be hard to improve further without replacing SMW, since the output is a 3000-row table, which starts to actually tax the parsing CPU a bit.
- JMods rework - Another move from DPL include to SMW. This sped up the main Jagex moderator page from 8 seconds to 1 second. Each of the team pages has also been sped up by a varying amount depending on team size, but each by a noticeable amount.
- ProxySQL removal - Profiling revealed that SQL queries were running much slower than we anticipated, which led us to remove ProxySQL from the architecture. This sped up SQL queries by 2-3x and saves $500 a year.
- Mobile ParserCache fragmentation - Removed unnecessary key that made mobile pages get parsed totally separately from desktop.
- Title::newMainPage (not in production yet)
- Media rework - Yet another move from DPL include to SMW. This sped up the Livestream page from around 4 seconds to 1 second, and the Video page from around 7 seconds to 2 seconds.
- refreshLinks/cirrusSearch parser cache sharing - Grand Exchange update propagation is about twice as fast now (~88 minutes -> ~48 minutes on RSW) because the cirrusSearch indexing reuses the parser cache entry from refreshLinks.
- Titles rework - Another move from DPL include to SMW. This sped up the Titles page from around 5 seconds to 1 second.
- Music track rework - Another move from DPL include to SMW. This sped up the Music/track list page from around 10 seconds to 5 seconds. The output is a ~1400-row table, which is a little taxing on the parsing CPU.
Things you could help with, and/or ongoing projects
Ordered roughly from easiest to hardest.
- Go search and destroy include in DPL calls. It's evil!
- Find titleregexp, ignorecase, and (if not a prefix search) titlematch in DPL calls, and make sure that there's some category or template you can use to restrict the list of pages down further than just doing (e.g.) a full namespace search. There's an example below this list.
- Redo the various lists of quests on RSW to not use DPL
- Redo the music track lists to not use DPL
- Redo the update pages to not use DPL
- Figure out whether (re)moving ProxySQL would improve database latency
- Move htmlCacheUpdate jobs to be more spread out, and/or after refreshLinks, to eliminate(?) the spike in resource usage after high-use templates get edited.
- Help profile down to the millisecond how different parts of rendering contribute to total parse time.
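As a simplified, made-up illustration of the titleregexp/titlematch item above: a call like

<dpl>
  namespace = Update
  titleregexp = .*[Tt]reasure [Hh]unter.*
</dpl>

has to consider every title in the whole namespace, whereas letting a category (or a template) do the initial narrowing gives DPL a much smaller set of pages to run the title match against:

<dpl>
  namespace = Update
  category = Treasure Hunter
  titlematch = %Treasure Hunter%
</dpl>

The category and patterns here are invented for illustration; the point is just that something other than the title pattern should be doing most of the filtering.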
Members
Add ur signature if u gotta go fast. Talk to me on Discord (Cook#2222) if you want to know more.
- ʞooɔ 06:15, 14 February 2022 (UTC)
- sub4sub Gaz (talk) 22:46, 16 February 2022 (UTC)
- Christine 23:06, 16 February 2022 (UTC)
- Lenny (talk) 00:58, 17 February 2022 (UTC)
- Badassiel 00:59, 17 February 2022 (UTC)
- sub5 --Legaia 2 Pla ᴛ · ʟ · ᴄ 22:39, 17 February 2022 (UTC)
- Oppose - MrDew (talk) 02:46, 18 February 2022 (UTC)
- BlackHawk (Talk) 21:05, 18 February 2022 (UTC)