(04/28 UPDATE): I’m now running my first-pass patch right here on my own site. 🚀
WordPress is an extremely flexible piece of software, and it comes with many different settings. Some are made visible to users via Admin > Settings
and others are stored invisibly so users aren’t bothered by them, but all of them are saved in a single database table named wp_options.
Today, it looks something like this:
This database table actually has a few interesting qualities to it. Conceptually, it’s a very simple key/value approach to storing any kind of arbitrary information. It’s a distant cousin to all of the meta
database tables WordPress comes with (for posts, comments, terms, and users) and I’m a big fan of the entire meta-data API – it’s now fully implemented across all major object types (except blogmeta
and term_relationshipmeta)
and, honestly, it’s one of the few “complete” APIs you’ll interact with inside of WordPress today, aside from probably roles & rewrite rules.
The options
API, however, is actually quite a bit different from meta
, enough to warrant this blog post, and enough for me to have spent the past 4 days studying it, researching it, and generally trying to find ways to improve how it performs under stress and load. (I’ve come to appreciate meta
even more now, though I have some concerns about it that I’ll share near the end of this post.)
And for all the flexibility that WordPress allows us, once we start taking advantage by adding plugins, themes, multisite, and so on, we are likely to reach a point where the wp_options
database table becomes a graveyard of abandoned options that never get used again. “Full” is a funny (but purposeful) word avoidance choice here. A quick search of the web will tell you that the maximum number of rows a database table like wp_options
can have is around 1 billion rows, but because of how WordPress interacts with them, “full” could actually mean only a few hundred rows, or 1 huge row if you’re being terrible.
Name, Value, Autoload
This table has 2 unique keys – option_id
and option_name
– though WordPress only ever concerns itself with option_name
. A unique key means no two rows can contain the same value in that column, which is like saying “no two switches control the same light-bulb” or “no two drivers licenses identify the same one person.” In WordPress, this means that every setting only has 1 unique entry to look for, making it very easy to know that it’s doing what you’ve asked it to.
The option_value
column is a LONGTEXT
, which means under today’s typical MySQL configurations, one row could contain around 4 billion bytes (or about 4 gigabytes.) You probably don’t want to store that much information there, but I think it helps to understand the theoretical maximum.
Lastly, the autoload
column exists to minimize the number of peeks into the database WordPress needs to load all of the settings, and will only ever have a value of yes
or no.
(This is pretty silly today, and it wouldn’t be an unusual old MySQL performance hack if the autoload
column were indexed.) The reason this column is important, is because every question takes time to answer, and in this case, asking WordPress 100 separate times to look in the database for 100 options would take 99 times longer than asking it to look 1 time for all of them – if you know exactly what options to load automatically, you can get all of them at once – at least, that’s the intended logic anyways.
Load All of the Options!
I won’t bore you with all of the details, but the number of relatively complex decisions that WordPress makes just to load the correct page on a website is both impressive and dumbfounding. Before any of that can happen, though, it needs to peek into the wp_options
database table to understand how to make those decisions in the ways you’ve configured it to. Here’s a screenshot of some of the oldest and most common entries:
Because WordPress is a friendly piece of software, it doesn’t obfuscate anything in the database by default, and it uses a generally concise and simple naming scheme for almost all things – an option like use_smilies
determines whether or not to, use smilies.
Since WordPress 2.2.0, a function named wp_load_alloptions()
(previously named get_alloptions()
) is used to ask one time for all options where autoload
is equal to yes
. Inside this function is where the first option optimizations occur, including our dramatic introduction to the object cache that comes with WordPress by default. It’s another rudimentary key/value storage system, only instead of in a database table like options
or meta
, it’s in memory in PHP, so it’s locally “hot” and very, very fast (much faster than querying a database, for example.)
Every single time you load a page that’s powered by WordPress, it loads all those options so they’re “primed” and ready to be referred back to at anytime until the page has completely finished loading. The nice thing about this, is you can call the same get_option( 'use_smilies' )
function a hundred times, and it will return the same result without ever needing to ask the database whether or not that option exists.
At least, that’s the case with autoloaded options. Everything else… well… that’s another story.
Once, upon a time…
It was a pretty clever idea to load all of the options in one fell swoop, and more-so to allow certain options to be excluded from that. Some options aren’t necessary on every single page-load, so if it’s only used conditionally or periodically, it would be wasteful to always ask for it, keep it cached, and never actually refer to it.
But the autoload
column actually comes with a bunch of added complexity. What happens if you need an option that doesn’t exist in the database yet? Or if an option needs to switch from being autoloaded, or to not be? That means each option you need that isn’t autoloaded will cost 1 additional database query to go out and get. And it means switching an option’s autoload property requires updating the database and the object cache at the same time to reflect those changes in the live environment.
If the autoloaded options all live under the alloptions
cache key, not-autoloaded and nonexistent options naturally live under the notoptions
cache key. I’m not kidding – WordPress also caches the “other” options requests anyways, because if it didn’t, calling get_options( 'bedtime_story' )
100 times would hit the database 100 times looking for an option that doesn’t exist in the database.
So if all options are cached, and not-options are cached, what’s the point of having an autoload
column at all? There’s an argument to be made for abandoning its usage, but backwards compatibility isn’t a feature, it’s an ethos – it can’t just be ignored – and this code goes all the way back to the very first versions of WordPress, so we can’t easily unplug it after all these years, can we?
Oh yeah…
All of that explanation up there leads us to a hidden limitation of the options API deep down in its core. While WordPress comes with a simple local object cache, some persistent object caching drop-ins (Memcache, Redis, etc…) have limitations on how much memory 1 cache key can occupy, and those limitations are much smaller than the 4GB LONGTEXT
theoretical maximum of 1 row, let alone 1 billion theoretical maximum rows all in one alloptions
cache key.
For example, Memcached has a 1MB (default) upper limit for the entire alloptions
array of autoloaded option names & values, and another 1MB upper limit for the notoptions
names and values array. That’s not per-option, it’s per array, for the cumulative size of all items in each individual cache-key, so autoloading them together into 1 primary key becomes a huge problem very quickly.
It’s actually one of those problems that any good web hosting company has been silently solving for you forever, and subsequently a problem that I’m working on solving for them, and for you, and for WordPress going forward.
What’s Next?
13 years ago, WordPress was obviously a very different piece of software than it is today – and we were very different people, so with any luck, our approaches to solving problems back then should be different than how we’d do it today, too. I’ve spent the last 4 full work days doing the deep dive into why the current approach was good then, what seems good now, and what problems might we have going forward.
This post represents part 1 of those 3 steps.
I’ve written and rewritten hundreds of lines of code already, and determined that abandoning the alloptions
and notoptions
cache keys completely is the most desirable option, and using the option_name
as the cache key for each individual option will provide the most headway going forward. I have a few other hypotheses to test, around storing entire objects vs. only the option_value
, around having a separate cache group for notoptions
keys to reside in, and about adding a separate wp_optionmeta
database table to allow for each 1 option to have its own arbitrary attributes.
For me, I get to spend a few more days trying to implement my ideas and testing that they actually perform as good as (or better than) what WordPress is already capable of, and that it adequately spreads the pain around when drop-ins like Memcache or Redis are in use. I’ll report back my findings here, and share them on the relative WordPress core Trac tickets once there’s something worthy of a RFC.
You forgot something
Oh yeah! About the meta-data API, right? Interestingly, when trying to get a piece of meta data about an object in WordPress, the update_meta_cache()
function works similarly to wp_load_alloptions()
in that it queries it’s database table for all rows relative to the object’s ID. One key difference is that all meta is always loaded for the object (there is no autoload
option.) Another is that any changes to meta-data will delete the relevant cache, but options modify the cache directly without deleting it so subsequent calls don’t typically generate additional database queries.
As custom post types start to store more and more things in wp_postmeta
, cache keys may start to reach the same theoretical upper limit that wp_options
has been plagued with because they are not unique. For example, if you store a thousand matched meta keys for a single post, and each of those keys is of any significant size, it could totally hit the same Memcache 1MB default boundary. I don’t know of this happening to anyone yet, but it seems plausible if not inevitable.
If you’ve made it this far in this post, you care about a silly dumb problem as much as I do, so we are instantly friends. If you’re also independently working in this area of WordPress, I’m excited to compare notes and Voltron up and blazing sword our way through this together. 💜