Happy SilkSong Day to all who celebrate
Sep. 4th, 2025 02:47 pm
It currently has a 97% Overwhelmingly Positive rating on Steam based on people playing it for a couple of hours. On the other hand, they're only charging $20 for a game that crashed Steam, Game Pass, the PlayStation store, and the Switch Online store when it came out. I think they deserve a bit of kudos for that.
We'll see if I beat Witch Spring or Silksong first.
Stirring Bars are Superstition?
Sep. 4th, 2025 12:08 pm
So you, working organic chemist, set up a reaction in your fume hood. You’re using some sort of flask or vial, there’s some kind of solvent in there, starting materials, reagents. And of course a stirring bar, right? Because you pretty much always stir your reactions, even when all the components are in solution right from the start - right?
Why exactly do we do that? It just feels wrong not to have a reaction mixture stirring somehow. If there’s a solid involved (a metal powder like a hydrogenation catalyst, for example, or a base like potassium carbonate that’s sitting on the bottom of the vessel), then it makes sense. But otherwise? Have you ever wondered what would happen if you just walked away?
Wonder no more: this paper actually addresses that question. And they do it in a relentlessly complete fashion: hundreds of experimental runs across a large variety of reaction types, on scales ranging from milligrams up to kilos (!). Now on that scale I’d be worried about heat transfer effects, but that’s not going to be as much of a consideration with the smaller reactions. So what happens? I think most of us would answer something like “Well, the reaction goes, but maybe it takes longer or works in somewhat lower yield”.
Well, it looks like for the majority of reactions the yields are basically identical under stirred and not-stirred conditions. When they are different, it sure doesn’t seem to be by all that much, and it appears that the unstirred reactions are as likely to have slightly better yields as the stirred ones. If you were presented with these data without being told that stirring was the variable under consideration, you’d have to conclude that whatever it was, it wasn’t really that important, honestly.
This goes for metal-catalyzed couplings, for oxidations, reductions, Friedel-Crafts reactions with aluminum chloride (!), for free-radical and photochemical reactions, electrochemical reactions (!), you name it. An MCPBA epoxidation of 4-methylstyrene, for example, on a 1.2 kilo scale, gave identical 80% yields with and without stirring. The authors end on a provocative note:
In conclusion, we investigated a total of 329 organic chemical reactions in eight categories and 25 types, including 26 chemical reactions scaled up to a gram level, one at the 100-gram level, and one at the kilogram level, and we compared the yields under stirred or standing conditions with otherwise completely identical conditions. We found that the yield fluctuation range under these two conditions was –13% to +31%. Our results showed that for many organic chemical reactions in solution, the effect of stirring or not stirring on the yield is insignificant. This conclusion is consistent with the chemical reactions that occur constantly in Nature. This research result might harm the interests of mixer manufacturers, but it will benefit humanity. Imagine how much electricity could be saved. . .
No more stir bars? Can it be? Note that all of the reactions investigated do have solvent in them - I would not attempt this for solid-phase synthesis reactions (see here!). And I would also not expect anyone working at truly large scale to ever attempt a no-stir protocol, because there are just too many things that can go wrong. But I'm a 100-mg guy, myself. Next time I’m setting up something in the hood I’ll give it a try. . .
Private Assets Need Public Buyers
Sep. 4th, 2025 05:10 pm
The return of tech specs
Sep. 4th, 2025 03:08 pm
I’m confident that going forward, software engineers will need to relearn how to create detailed tech specs for complex changes. It’s also likely that AI will help write and review these specs before implementing them. It’s time to embrace tech specs again because they can be a key to advancing your career.
Tech (or Design) documents are a good idea no matter what, and it’s interesting that AI tools are what is making them more popular again. But not surprising. Everybody works best with lots of context and a plan.
How can I write a C++/WinRT IAsyncOperation where T is not a Windows Runtime type?, part 1
Sep. 3rd, 2025 02:00 pm
I often get asked about how one could use C++/WinRT’s IAsyncOperation<T> where T is not a Windows Runtime type. The T in IAsyncOperation<T> must be a Windows Runtime type because the algorithm for generating an interface ID for IAsyncOperation<T> works only on Windows Runtime types.
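To make the constraint concrete, here's a hedged illustration (my own example; the function names are made up, not from the article). A Windows Runtime T compiles fine, while a standard-library T does not, because no interface ID can be generated for it:

```cpp
// Fine: winrt::hstring is a Windows Runtime type.
winrt::Windows::Foundation::IAsyncOperation<winrt::hstring> LoadNameAsync();

// Does not compile: std::vector<int> is not a Windows Runtime type, so
// C++/WinRT cannot generate an interface ID for the specialization.
// winrt::Windows::Foundation::IAsyncOperation<std::vector<int>> LoadNumbersAsync();
```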
But all hope is not lost. You just have to decide which rule you want to break.
The best solution is to break the rule “I’m using IAsyncOperation<T>.” Instead, use some other coroutine library that supports arbitrary C++ types. Available options include cppcoro::task<T>, wil::task<T> (based on my simple_task, but expanded to support COM via wil::), and concurrency::task<T>.
Another option is not to use the T to return the result, but rather pass the output parameter as an explicit input.
winrt::IAsyncAction DoSomethingAsync(std::shared_ptr<std::optional<Widget>> result)
{
    ⟦ calculations... ⟧

    // Store the result
    result.get()->emplace(blah);
    co_return;
}
This is the most general case, in which the Widget is not inexpensively default-constructible. If the Widget is cheap to construct, you can avoid the std::optional:
winrt::IAsyncAction DoSomethingAsync(std::shared_ptr<Widget> result)
{
    ⟦ calculations... ⟧

    // Store the result
    *result = blah;
    co_return;
}
It’s important that we use a shared_ptr to ensure that the result lifetime extends to the point we store it. You might be tempted to do something like this:
// Code in italics is wrong
winrt::IAsyncAction DoSomethingAsync(Widget& result)
{
    ⟦ calculations... ⟧

    // Store the result
    result = blah;
    co_return;
}
The problem is that you have no guarantee that the caller will keep the result value through to the completion of the coroutine. If the caller returns before DoSomethingAsync completes (say because it doesn’t await the result, or it encounters an exception before it can await the result), then the “Store the result” will be writing to an already-destroyed object and will corrupt memory.
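For contrast, here is a minimal sketch of the calling side of the shared_ptr version (my own illustration, reusing the article's hypothetical Widget and the std::optional flavor of DoSomethingAsync). The shared_ptr keeps the storage alive even if the caller is abandoned before the coroutine completes:

```cpp
winrt::fire_and_forget Sample()
{
    // Shared ownership: both Sample and DoSomethingAsync hold a reference,
    // so the storage outlives whichever one finishes last.
    auto result = std::make_shared<std::optional<Widget>>();
    co_await DoSomethingAsync(result);
    result->value().Toggle(); // safe: the coroutine has completed
}
```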
Another option is to use IAsyncOperation<T> but break the rule that T is the thing you want to return. You can instead return an IInspectable that is hiding another value inside it.
template<typename T>
struct ValueAsInspectable : winrt::implements<ValueAsInspectable<T>, winrt::Windows::Foundation::IInspectable>
{
    T value;

    template<typename...Args>
    ValueAsInspectable(Args&&... args) : value{ std::forward<Args>(args)... } {}
};
You can then use one of these “values disguised as an IInspectable” as your operation’s result.
winrt::IAsyncOperation<winrt::IInspectable> DoSomethingAsync()
{
    ⟦ calculations... ⟧
    co_return winrt::make<ValueAsInspectable<Widget>>(blah, blah);
}

winrt::fire_and_forget Sample()
{
    auto obj = co_await DoSomethingAsync();
    auto& widget = winrt::get_self<ValueAsInspectable<Widget>>(obj)->value;
    widget.Toggle();
}
We make a ValueAsInspectable<Widget> and put a Widget inside it, constructed from the parameters blah, blah. Upon receiving it, we use the fact that we know that this IInspectable is really a ValueAsInspectable<Widget>, and we use get_self to access the backing C++ class and obtain a reference to the value hiding inside.
Note that this reference’s lifetime is controlled by the returned object, so you don’t want to do this:
// Code in italics is wrong
auto& widget = winrt::get_self<ValueAsInspectable<Widget>>(
    co_await DoSomethingAsync()
)->value;
Because that saves a reference to an object inside an IInspectable that is about to be destroyed.
This wrapper technique has the downside of requiring you to trust that the IInspectable returned by DoSomethingAsync really is wrapping a Widget in the same process. If it’s a remote object, or if it’s wrapping something else, or isn’t a wrapper at all, then you’re grabbing a reference to garbage.
We’ll work on addressing these deficiencies next time, but if you have control over both the producer and consumer of the ValueAsInspectable, then this version may be sufficient.
We can add some helpers to make it a bit more ergonomic.
template<typename T, typename...Args>
winrt::Windows::Foundation::IInspectable MakeValueAsInspectable(Args&&... args)
{
    return winrt::make<ValueAsInspectable<T>>(
        std::forward<Args>(args)...);
}

template<typename T>
winrt::Windows::Foundation::IInspectable MakeValueAsInspectable(T&& arg)
{
    return winrt::make<ValueAsInspectable<
        std::remove_reference_t<T>>>(
        std::forward<T>(arg));
}
The second helper lets you write MakeValueAsInspectable(blah), and the type of the ValueAsInspectable will match the type you passed in (after removing references). This lets you write
co_return MakeValueAsInspectable(42);
instead of
co_return MakeValueAsInspectable<int>(42);
The first overload doesn’t add any value but at least creates consistency, so you can just use MakeValueAsInspectable everywhere.
co_return MakeValueAsInspectable(42);
co_return MakeValueAsInspectable<Widget>(blah, blah);
Okay, next time, we’ll look at those safety improvements.
The post How can I write a C++/WinRT IAsyncOperation<T> where T is not a Windows Runtime type?, part 1 appeared first on The Old New Thing.
all things very
Sep. 3rd, 2025 10:11 pm
- Have achieved More Event Prep: both the arrows catalogue updating (albeit not printing), and Folding All The Potions that printed successfully.
- Friend is watching Orphan Black for the first time. I am getting Yelling. It's DELIGHTFUL.
- Yesterday, leaving the lower limbs class that has been prescribed in an attempt to reduce the risk of reinjuring my ankle again, I... turned my ankle. (This is not the good bit.) In more or less the same way I did in April, which was the motivation for the current round of physio. But whether it was the exercises having actually helped anything at all, or the fact that I was wearing different (and more supportive) boots, or just pure luck, while it's a bit sore it is not e.g. refusing to bear weight any time I don't pay adequately close attention to how I load it, so I'm counting that one as a win.
- We forgot New Elephant Day on Monday (Sheldrick Wildlife Trust calendar) so instead had New Elephant Day today... AND IT WAS AN ADORABLE BABY RHINO. 13/10, etc.
- I am nearly at the point where I think I might be able to read the Wikipedia page on action potentials and derive meaning from it? I'm definitely slightly less confused about the cell biologist's definition of depolarization than I was even yesterday...
Nuclear Fusion in Palladium Metal. No, Really.
Sep. 3rd, 2025 02:28 pm
I’ve written here several times about the spring of 1989, when I was on my post-doc in what was then West Germany, and when the Pons and Fleischmann “cold fusion” story broke in the Financial Times newspaper. I heard about this in a radio news report and immediately went to the Darmstadt train station where I knew I could buy a copy of the newspaper, and I still have it today (the pink FT newsprint has darkened over the years!). I was absolutely elated by the news - there’s no other word for it - and that feeling is in my mind conflated with the similar awe and joy I experienced later that same year as the Berlin Wall fell and a wave of revolutions went through the Warsaw Pact countries. The original cold fusion story was already taking some hits by that time, but no matter: “Bliss was it in that dawn to be alive” and all that. Especially under current conditions, thoughts of that era are capable of inducing almost painful levels of nostalgia for me.
So I have a soft spot for odd fusion mechanisms and always will. That means that this recent paper really made my day, although it is certainly not promising boundless fusion energy itself. I believe that this is an intellectual offshoot of the work that Google funded a few years back into investigating various anomalous fusion claims, but what’s being reported here is not too anomalous, although it’s quite interesting. As some will recall, the whole cold-fusion idea was driven by the hypothesis that since hydrogen (and deuterium) are taken up at high density in metals like palladium, perhaps under some conditions these nuclei could actually be induced to fuse, which would be a far different regime than exists in tokamaks or in stars.
Those “traditional” conditions involve truly extreme temperatures and pressures, because the energy barrier to fusing atomic nuclei is substantial. You can certainly get paid back energetically at the end, though, as witness that big unshielded fusion reactor in the sky: 93 million miles away and you can still feel it on your face while its ultraviolet flux burns your skin! But even though we can’t seem to get deuterium nuclei to fuse inside Pd lattices, that doesn’t mean that the metal’s deuterium-concentrating effect is useless. The new paper linked above sets up an electrochemical cell with a palladium cathode in heavy water (D2O). This loads a great deal of D atoms into the palladium, and then the team fired a beam of high-energy deuterium ions at the Pd electrode.
And these conditions did in fact lead to enhanced fusion reactions, about 15% higher than background (as measured when there had been no current applied to the electrochemical cell and thus no real deuterium loading of the palladium). Figure 3 in the paper shows this effect clearly. The neutrons produced by the fusion to helium-3 were detected directly in the experiment - this isn’t one of those “excess-heat” measurements and there is no doubt that fusion is taking place. The effect repeats if you take the palladium target, completely “empty” it by heating under high vacuum, and run the same conditions again. The rate of increase and its later leveling off are just what you’d expect from fusion taking place inside the metal (as opposed to in the gas phase or in the plasma used to produce the deuterium ion flux).
It’s important to note that both rates of fusion are actually very low - this effect is not enough by itself to lead to any energy breakthroughs, but it really is a significant improvement in a process that has proven very difficult and expensive to improve. There are a lot of avenues here to explore: different metals and alloys, different deuterium loading conditions (and/or loading with helium-3 or tritium as well), alternative ways to blast the metal samples with deuterium, and so on. It would be of great interest to know where the fusion reactions are taking place - up on the surface of the metal or deeper in the lattice? All in all, this makes a number of useful fusion experiments a lot more accessible, and I’m really happy to see it.
By the way, out in the good-old-fashioned-tokamak fusion world, I try to go past Commonwealth Fusion Systems when I drive out to the Devens (MA) area to play disc golf. Just seeing a fusion installation on that scale makes me feel a bit science-fictional! The reactor is set to start up next year, with net-energy-gain demonstrated in 2027. If all goes well, the company plans a commercial 400 MW reactor in Virginia in the 2030s, and Google themselves have already contracted to buy power from it. I wish them luck!
Sports Team Owners Like to Win
Sep. 3rd, 2025 06:32 pm
Advanced PostgreSQL Indexing: Multi-Key Queries and Performance Optimization
Sep. 3rd, 2025 01:19 pm
Welcome to part two of our exploration of Postgres indexes. Be sure to check out part one if you haven’t already. We’ll be picking up exactly where we left off.
We have the same books table as before, containing approximately 90 million records.
Editors’ note: Need to bone up on PostgreSQL all around? Our course Complete Intro to SQL & PostgreSQL from Brian Holt will be perfect for you.
Filtering and sorting
Let’s dive right in. Imagine you work for a book distribution company. You’re responsible for publishers and need to query info on them. There are approximately 250,000 different publishers, with a wide variance in the number of books published by each, which we’ll explore.
Let’s start easy. You want to see the top 10 books, sorted alphabetically, for a single publisher.
explain analyze
select *
from books
where publisher = 157595
order by title
limit 10;
This publisher is relatively small, with only 65 books in its catalog. Nonetheless, the query is slow to run, taking almost four seconds.
If you followed the same steps from part 1 to create this same database, note that your ids will be different.
[execution plan screenshot]
This is hardly surprising; there are a lot of rows in our table, and finding the rows for that publisher takes a while, since Postgres has to scan the entire heap.
So we add an index on, for now, just publisher.
CREATE INDEX idx_publisher ON books(publisher);
We can think of our index in this way. It just helps us identify all the book entries by publisher. To get the rest of the info on the book, we go to the heap.
[diagram: index entries pointing to rows in the heap]
And now our same query is incredibly fast.
[execution plan screenshot]
Nothing surprising or interesting.
But now you need to run the same query, but on a different publisher, number 210537. This is the biggest publisher in the entire database, with over 2 million books. Let’s see how our index fares.
explain analyze
select *
from books
where publisher = 210537
order by title
limit 10;
[execution plan screenshot]
Actually, our index wasn’t used at all. Postgres just scanned the whole table, grabbing our publisher along the way, and then sorted the results to get the top 10. We’ll discuss why a little later, as we did in the prior post, but the short of it is that the random heap accesses from reading so many entries off of an index would be expensive; Postgres decided the scan would be cheaper. These decisions are all about tradeoffs and are governed by statistics and cost estimates.
Previously, we threw the “other” field into the INCLUDE() list, so the engine wouldn’t have to leave the index to get the other field it needed. In this case, we’re selecting everything. I said previously to be diligent in avoiding unnecessary columns in the SELECT clause for just this reason, but here, we assume we actually do need all these columns.
We probably don’t want to dump every single column into the INCLUDE list of the index: we’d basically just be redefining our table into an index.
But why do we need to read so many rows in the first place? We have a limit of 10 on our query. The problem, of course, is that we’re ordering on title. And Postgres needs to see all rows for a publisher (2 million rows in this case) in order to sort them, and grab the first 10.
What if we built an index on publisher, and then title?
CREATE INDEX idx_publisher_title ON books(publisher, title);
That would look like this:
[diagram: the (publisher, title) index]
If Postgres were to search for a specific publisher, it could just seek down to the start of that publisher’s books, and then read however many needed, right off the leaf nodes, couldn’t it? There could be 2 million book entries in the leaf nodes, but Postgres could just read the first 10, and be guaranteed that they’re the first 10, since that’s how the index is ordered.
Let’s try it.
[execution plan screenshot]
We got the top 10 books, sorted, from a list of over two million in less than a fourth of a millisecond. Amazing!
More publishers!
Now your boss comes and tells you to query the top 10 books, sorted alphabetically, as before, but over both publishers combined. To be clear, the requirement is to take all books both publishers have published, combine them, then get the first ten, alphabetically.
Easy, you say assuredly, fresh off the high of seeing Postgres grab you that same data for your enormous publisher in under a millisecond.
You can put both publisher ids into an IN clause. Then, Postgres can search for each, one at a time, save the starting points of both, and then start reading forward on both, and sort of merge them together, taking the smaller title from either, until you have 10 books total.
Let’s try it!
explain analyze
select *
from books
where publisher in (157595, 210537)
order by title
limit 10;
Which produces this
[execution plan screenshot]
[Sad Trombone]
Let’s re-read my completely made-up, assumed chain of events Postgres would take, from above.
Postgres can search for each, one at a time, save the starting points of both, and then start reading forward on both, and sort of merge them together, taking the smaller title from either, until you have 10 books total.
It reads like the Charlie meme from Always Sunny in Philadelphia.
[image: the Charlie meme]
If your description of what the database will do sounds like something that would fit with this meme, you’re probably overthinking things.
Postgres operates on very simple operations that it chains together. Index Scan, Gather Merge, Sort, Sequential Scan, etc.
Searching multiple publishers
To be crystal clear, Postgres absolutely can search multiple keys from an index. Here’s the execution plan for the identical query from a moment ago, but with two small publishers for the publisher ids, which each have just a few hundred books
[execution plan screenshot]
It did indeed do an index scan, on that same index. It just matched two values at once.
Rather than taking one path down the B Tree, it takes multiple paths down the B Tree, based on the multiple key value matches.
Index Cond: (publisher = ANY ('{157595,141129}'::integer[]))
That gives us all rows for either publisher. Then it needs to sort them, which it does next, followed by the limit.
Why does it need to sort them? When we have a single publisher, we know all values under that publisher are ordered.
Look at the index.
[index diagram]
Imagine we searched for publisher 8. Postgres can go directly to the beginning of that publisher, and just read:
"Animal Farm"
"Of Mice and Men"
Look what happens when we search for two publishers, 8 and also, now, 21.
[index diagram]
We can’t just start reading for those matched records. That would give us
"Animal Farm" "Of Mice and Men" "Lord of The Flies" "The Catcher in The Rye"
The books under each publisher are ordered, but the overall list of matches is not. And again, Postgres operates on simple operations. Elaborate meta descriptions like “well it’ll just merge the matches from each publisher taking the less of the next entry from either until the limit is satisfied” won’t show up in your execution plan, at least not directly.
Why did the publisher id change the plan?
Before we make this query fast, let’s briefly consider why our query’s plan changed so radically between searching for two small publishers compared to an enormous publisher and a small one.
As we discussed in part 1, Postgres tracks and uses statistics about your data in order to craft the best execution plan it can. Here, when you searched for the large publisher, it realized that query would yield an enormous number of rows. That led it to decide that simply scanning through the heap directly would be faster than the large number of random i/o that would be incurred from following so many matches in the index’s leaf nodes, over to the corresponding locations on the heap. Random i/o is bad, and Postgres will usually try to avoid it.
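If you're curious what the planner is looking at, you can inspect those statistics directly. Here's a hedged sketch using the standard pg_stats view (the table and column names match this series; the numbers on your machine will differ):

```sql
-- Roughly how many distinct publishers Postgres thinks there are, plus the
-- values it considers most common and their estimated frequencies.
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'books'
  AND attname = 'publisher';
```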
Crafting a better query
You can absolutely have Postgres find the top 10 books in both publishers, and then put them together, sorted, and take the first 10 from there. You just have to be explicit about it.
explain analyze
with pub1 as (
select * from books
where publisher = 157595
order by title limit 10
), pub2 as (
select * from books
where publisher = 210537
order by title limit 10
)
select * from pub1
union all
select * from pub2
order by title
limit 10;
The syntax below is called a common table expression, or a CTE. It’s basically a query that we define, and then query against later.
with pub1 as (
select * from books
where publisher = 157595
order by title limit 10
)
Let’s run it!
The execution plan is beautiful
[execution plan screenshot]
It’s fast! As you can see, it runs in less than a fifth of a millisecond (0.186ms — but who’s counting?).
Always read these from the bottom:
[execution plan screenshot]
It’s the same exact index scan from before, but on a single publisher, with a limit of 10, run twice. Postgres can seek to the right publisher, and just read 10 for the first publisher, and then repeat for the second publisher. Then it puts those lists together.
Remember the silly, contrived Postgres operation I made up before?
… and then start reading forward on both, and sort of merge them together, taking the smaller title from either, until you have 10 books total.
You’re not going to believe this, but that’s exactly what the Merge Append on line 2 does
-> Merge Append (cost=1.40..74.28 rows=20 width=111) (actual time=0.086..0.115 rows=10 loops=1)
You can achieve amazing things with modern databases if you know how to structure your queries just right.
How does this scale?
You won’t want to write queries like this manually. Presumably, you’d have application code taking a list of publisher ids, and constructing something like this. How will it perform as you add more and more publishers?
I’ve explored this very idea on larger production sets of data (much larger than what we’re using here). I found that, somewhere around a thousand ids, the performance does break down. But not because there’s too much data to work with. The execution of those queries, with even a thousand ids, took only a few hundred ms. But the Planning Time started to take many, many seconds. It turns out having Postgres parse through a thousand CTEs and put a plan together takes time.
Version 2
We’re onto something, for sure. But can we take a list of ids, and force them into individual queries that match on that specific id, with a limit, and then select from the overall bucket of results? Exactly like before, but without having to manually cobble together a CTE for each id?
Where there’s a will, there’s a way.
explain analyze
with ids as (
select * from (
values (157595), (210537)
) t(id)
), results as (
select bookInfo.*
from ids
cross join lateral (
select *
from books
where publisher = ids.id
order by title
limit 10
) bookInfo
)
select *
from results
order by title
limit 10;
Let’s walk through this.
Our ids CTE:
with ids as (
select * from (
values (157595), (210537)
) t(id)
)
This defines a pseudo-table that has one column, with two rows. The rows have values of our publisher ids for the sole column: 157595 and 210537.
values (157595), (210537)
But if it’s a table, how do we query against the column? It needs to have a name. That’s what this syntax is.
t(id)
We gave that column a name of id.
The results CTE is where the real work happens.
results as (
select bookInfo.*
from ids
cross join lateral (
select *
from books
where publisher = ids.id
order by title
limit 10
) bookInfo
)
We query against our ids table, and then use the ugly cross join lateral expression as a neat trick to run our normal books query, but with access to the publisher value in the ids CTE. The value in the ids CTE is the publisher id. So we’ve achieved what we want: we’re conceptually looping through those ids, and then running our fast query on each.
The term lateral is the key. Think of (American) football, where a lateral is a sideways pass. Here, the lateral keyword allows us to “laterally” reference the ids.id value from the expression right beside it; the ids CTE laterals each id over to the results CTE.
That coaxes Postgres to run its normal index scan, followed by a read of the next 10 rows. That happens once for each id. That whole meta-list will then contain (up to) 10 rows for each publisher, and then this…
select *
from results
order by title
limit 10;
… re-sorts, and takes the first 10.
In my own experience, this scales fabulously. Even with a few thousand ids I couldn’t get this basic setup to take longer than half a second, even on a much larger table than we’ve been looking at here.
Let’s run it!
Let’s see what this version of our query looks like
[execution plan screenshot]
Still a small fraction of a millisecond, but ever so slightly slower; this now runs in 0.207ms. And the execution plan is a bit longer and more complex.
-> Nested Loop (cost=0.69..81.19 rows=20 width=111) (actual time=0.042..0.087 rows=20 loops=1)
A nested loop join is a pretty simple (and usually pretty slow) join algorithm. It just takes each value in the one list, and then applies it to each value in the second list. In this case, though, it’s taking values from a static list and applying them against an incredibly fast query.
The left side of the join is each id from that static table we built
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4) (actual time=0.001..0.002 rows=2 loops=1)
The right side is our normal (fast) query that we’ve seen a few times now.
-> Limit (cost=0.69..40.48 rows=10 width=111) (actual time=0.024..0.037 rows=10 loops=2)
   -> Index Scan using idx_publisher_title on books (cost=0.69..2288.59 rows=575 width=111) (actual time=0.023..0.034 rows=10 loops=2)
      Index Cond: (publisher = "*VALUES*".column1)
However, our nice Merge Append is gone, replaced with a normal sort. The reason is that we replaced the discrete CTEs, each of which produced a separate, identically sorted output that the planner could recognize and apply a Merge Append to. Merge Append works on multiple, independently sorted streams of data. Instead, this is just a regular join, which produces one stream of data and therefore needs to be sorted.
But this is no tragedy. The query runs in a tiny fraction of a millisecond, and will not suffer planning time degradation like the previous CTE version would, as we add more and more publisher ids. Plus, the sort is over just N*10 records, where N is the number of publishers. It would take a catastrophically large N to wind up with enough rows where Postgres would struggle to sort them quickly, especially since the limit of 10 would allow it to do an efficient top-N heapsort, like we saw in part 1.
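One hedged variation worth knowing about if the id list comes from application code: rather than interpolating a VALUES list, you can bind the ids as a single array parameter and expand it with unnest(). The array is written inline below for illustration, but in practice it would be a bind parameter; the plan should still be the same nested loop over a fast index scan.

```sql
explain analyze
with ids as (
  select unnest(array[157595, 210537]) as id
), results as (
  select bookInfo.*
  from ids
  cross join lateral (
    select *
    from books
    where publisher = ids.id
    order by title
    limit 10
  ) bookInfo
)
select *
from results
order by title
limit 10;
```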
Stepping back
The hardest part of writing this post is knowing when to stop. I could easily write as much content again: we haven’t even gotten into joins, and how indexes can help there, or even materialized views. This is an endless topic, and one that I enjoy, but we’ll stop here for now.
The one theme throughout can be summed up as: understand how your data is stored, and craft your queries to make the best use possible of this knowledge. If you’re not sure exactly how to craft your queries to do this, then use your knowledge of how indexes work, and what you want your queries to accomplish to ask an extremely specific question to your favorite AI model. It’s very likely to at least get you closer to your answer. Oftentimes knowing what to ask is half the battle.
And of course, if your data is not stored as you need, then change how your data is stored. Indexes are the most common way, which we’ve discussed here. Materialized views would be the next power tool to consider when needed. But that’s a topic for another day.
Parting thoughts
Hopefully, these posts have taught you a few things about querying, query tuning, and crafting the right index for the right situation. These are skills that can have a huge payoff in achieving palpable performance gains that your users will notice.
Editor’s note: our The Complete Course for Building Backend Web Apps with Go includes setting up a PostgreSQL database and running it in Docker, all from scratch.
Style your underlines
Sep. 2nd, 2025 10:02 pm
We shouldn’t rely on colour alone to indicate that something is interactive.
It then goes on to show how links should be underlined, but that the default underline can be a little intense, and essentially shows how to chill them out. Exactly like we showed! I still think it’s a great balance.
today's tomatoes (before the spring onion and balsamic vinegar)
Sep. 2nd, 2025 10:13 pm
(By "today's" I mean not "all of those harvested today, nor even yesterday" but rather "the tomato course with dinner".)
I really love the ridiculous stars on the tops of the Blue Fire.
A Condensate Dissolver
Sep. 2nd, 2025 01:52 pm
I’ve written several times over the last few years about biomolecular condensates, also known as “membraneless organelles” and (confusingly) by several other names. Whatever you call them, these are phase-separated droplets that form reversibly under various cellular conditions, and they often consist of mixtures of proteins and RNA species. There are particular protein sequences and charge distributions that are more common among condensate-formers, and disordered proteins are over-represented as well, but there’s a lot of variety in composition, lifetime, and localization.
I was invited to a meeting on these topics back in 2018 and found the field very interesting. It seemed obvious that these things were performing important cellular functions and that they had been largely overlooked. Well, except by microscopists, who had given them names like “speckles”, “granules”, “bodies” and so on. It was quite a while before their common features were recognized. Not long after that meeting several condensate-focused startups launched, not all of which are still with us. I spent a good amount of time digesting the research in the field, as much as it was actually digestible, and trying to come up with a good actionable application to drug discovery.
But I was unable to. No matter how I came at it, the whole field seemed to still be in too early a state for these opportunities. Indeed, my impression of some of the startups was that they set about trying to bring some order to things to see what they had and what might present the best opportunities, and I suspect that process is still continuing. A few things did stand out: everyone was agreed that proteins like TDP43 and FUS formed stress-granule condensates, and that these seemed to somehow go haywire in some neurodegenerative disorders. The simplest way to look at it was that the condensates didn’t seem to dissolve on cue once the cellular stress was removed, but rather hung around, aging from liquid droplets to something more like gelatinous blobs and eventually to outright solid deposits that were associated with pathological states in neurons. That might be too simple a way to look at it - nothing in neurodegeneration is ever as simple as it looks, from what I’ve seen - but it’s not a bad place to start. And a lot of people have indeed been starting right there.
This recent paper makes the case that interfering with that “condensate aging” process might be beneficial, as hoped, and the authors searched for small molecules that could induce this phenotypic outcome. There had been some condensate-dissolving molecules identified over the years, but they tended to be like the prototype of that group, 1,6-hexanediol. That one really does seem to dissolve condensates, but probably in the same way that it might make a useful additive to a spray for cleaning shower doors: in some ways it’s just a solvent. And as you’d figure, it has a lot of other effects on cells, making it a pretty poor tool for assays and outright hopeless as a potential mode of treatment.
This new paper settled on lipoamide. That makes some chemical sense as a hit - the formation and dissolution of stress granules seems to involve redox changes in some key proteins, and the disulfide group in lipoamide might be expected to get into the middle of that chemistry. The paper shows that various cell lines expressing fluorescently tagged versions of stress granule proteins (such as FUS and TDP43) are sensitive to lipoamide treatment, acutely or with pretreatment before the induced formation of the granules as a preventative. Interestingly, an array of other known condensates in both the cytoplasm and nucleus were unaffected. Even the source of stress granule formation mattered: granule formation on exposure to arsenate, oxidative stress, or osmotic shock was inhibited, while formation after heat treatment or inhibition of glycolysis went on apparently as before. There are stress granules and there are stress granules, it seems.
Nitrogen-isotope-labeled versions of lipoamide or a derivative of it with a photoreactive diazirine group both showed that it partitions well into cells, and that it also gets concentrated in the stress granules (as well as other organelles). SAR exploration around the structure showed that the lipoamide stereochemistry didn’t seem to matter, indicating that it’s probably not a specific protein-binding event that’s in play. And the activity around the amide group was rather flat - some a bit better, some a bit worse. In fact, the entire carboxamide could be deleted without completely losing activity. But the disulfide (the dithiolane ring) was crucial. Six- and seven-membered disulfides were actually inactive (the size of such rings is known to affect the redox potential).
Proteomic work showed that disordered protein regions with lots of arginine and tyrosine residues interacted with lipoamide. FUS itself does not, but two other proteins that are known to localize to stress granules do, SFPQ and SRSF1. Their own sequence oddities are found in their names, which are acronyms for “splicing factor proline- and glutamine-rich” and “serine-arginine rich splicing factor 1”. Knocking out the former impaired the ability of lipoamide to keep stress granules from forming, and knocking out the latter stopped it from working at all.
After this came the real test in cells. Lipoamide does actually seem to rescue the functions of FUS and TDP43, presumably either by allowing them to be released from stress granules or by preventing their formation outright, and it actually seems to alleviate the ALS phenotypes in relevant models with both human- and animal-derived cells expressing mutant forms of FUS and TDP43. On top of that, it also showed beneficial effects in small-animal model systems in C. elegans and Drosophila. That’s really interesting, but it also raises a lot of questions about the formation of those stress granules and their role in normal cells. The granules in those mutant cells aren’t really associated with oxidative stress, for example, and you also have to wonder about the consequences for normal cells if they are indeed unable to form stress granules after treatment with lipoamide or something like it. They’re presumably there for a reason! And more work will be needed to show a solid link between stress granule dissolution and these phenotypic improvements. But now it looks like there are some good tools with which to get started on that, which is a real improvement.
The Tariffs Are Still Illegal
Sep. 2nd, 2025 05:31 pm
Dubious security vulnerability: Remembering passwords for recently-opened ZIP files
Sep. 2nd, 2025 02:00 pm
A security vulnerability report arrived that claimed that it could obtain unauthorized access to a password-protected ZIP file by the following means:
- Start with a password-protected ZIP file, call it “secret.zip”.
- Create a dummy ZIP file and give it the same password. Call this file “attack”.
- Open attack.zip in Explorer.
- Enter the password when prompted.
- Close attack.zip.
- Copy the secret.zip file on top of the dummy attack.zip file.
- Open “attack.zip” (which is now a copy of secret.zip).
- Observe that Explorer opens the impostor ZIP file without asking for a password. You have obtained unauthorized access to the secret.zip password-protected ZIP file.
As usual, we have to look at who the attacker is, who the victim is, and what the attacker has gained.
The attacker is, I guess, the user who is creating the attack.zip ZIP file and doing the fancy swap-in.
The victim is, I think, the person who created the original password-protected ZIP file “secret.zip”.
And what the attacker gained is access to a password-protected ZIP.
Wait a second, but in order for this trick to work, the attacker must already know the password to the secret.zip ZIP file, because they need to use that same password for the attack.zip ZIP file.
So what the attacker gained is “access to a password-protected ZIP file that they know the password to”, which is not really much of a gain at all. They could have done this in a much simpler way:
- Open secret.zip.
- Enter the password when prompted.
Explorer caches passwords for ZIP files to avoid having to bug the user for the password each time it goes back to the ZIP file.¹ For example, if the ZIP file is open in an Explorer window, and then you extract a file from the ZIP, then the ZIP file needs to be reopened to find that file and extract it. Before asking you for the password, it uses the password you used to open the ZIP file originally, and if that works, then the operation continues without needing to prompt again. It would be super-annoying if you had to re-enter the password for each file you extracted from a ZIP file.
“But why does it try the password even when it’s a different ZIP file?”
Well, what exactly is “a different ZIP file”?
If you define it as “A file with the same name but a different last-modified timestamp or with different contents is a different ZIP file”, then it means that any time you modify a password-protected ZIP file (say, to delete a file from it), you will have to re-enter the password. That seems wrong.
The finder here seems to mean that there is some metaphysical concept of “identity” that is broader than “files are byte-for-byte identical” (because they presumably want the password to be remembered even if, say, a single file is removed from the ZIP file), yet more strict than “a file is created” (because they want “overwriting the bytes of one file with the bytes of another file” to change the identity).
Now you’re dealing with some sort of Ship of Theseus thought experiment: Suppose the original file is modified one byte at a time until it matches the replacement file. At what point does it stop being the original file and start being the replacement?
Windows isn’t going to try to solve a philosophical conundrum from ancient Greece.
Windows uses the simple rule that if it has the same path, then it’s worth trying the same password.
But it’s just trying the password you already gave it. It did not magically determine the password for the file. If the password you gave it is incorrect, then Windows will prompt for the password. The only way you can gain access to the ZIP file is if you provide the password.
¹ The cache has session lifetime, so all of these cached ZIP passwords are forgotten when you sign out.
The post Dubious security vulnerability: Remembering passwords for recently-opened ZIP files appeared first on The Old New Thing.
Thoughts on creating a tracking pointer class, part 16: Second attempt to use a list
Sep. 1st, 2025 02:00 pm
Commenter LB pointed out that you can use the splice method to move nodes between lists. So let’s try that.
The idea is that each tracking_ptr holds an iterator to a list node that is part of a list controlled by the tracked object. When the tracked object expires its tracking pointers (either due to being moved into or at destruction), the nodes are moved (via splice) to a holding pen which I will call orphans. And when the tracked object is moved from, its tracking pointer nodes are spliced into the tracking pointer list of the destination.
template<typename T> using tracker_list = std::list<T*>;
template<typename T> using tracker_node_ptr = typename tracker_list<T>::iterator;
We make up some convenient aliases: A tracker list is a list of pointers to trackable objects. And a tracker node pointer is an iterator into that list. It is the tracker node pointer for which our tracking_ptr serves as a wrapper.
template<typename T> struct trackable_object;
We forward-declare the trackable object base class so that we can talk about it in the tracking_ptr definition.
template<typename T> tracker_list<T> orphans;
And here is our list of orphans. This is where nodes go when they no longer point to anything.
template<typename T>
tracker_node_ptr<T> create_node(tracker_list<T>& list, T* value)
{
    return list.emplace(list.end(), value);
}
And since we will be doing this a lot, here’s a helper function to create a new node in a tracking list and return an iterator to it, ready to be wrapped in a tracking pointer.
template<typename T>
struct tracking_ptr
{
private:
    friend struct trackable_object<T>;

    tracker_node_ptr<T> m_node;

    explicit tracking_ptr(tracker_node_ptr<T> const& node) noexcept :
        m_node(node) { }
As noted earlier, the tracking pointer itself wraps an iterator into the tracking list. When the tracking pointer is created, the trackable_object will use the private constructor to tell it which node it is tracking. The pointer inside that node points back to whatever object the tracking pointer is tracking; if the tracking pointer has expired, then the node is on the orphan list.
    tracker_list<T>& containing_list() noexcept
    {
        if (*m_node) {
            return (*m_node)->trackable_object<T>::m_trackers;
        } else {
            return orphans<T>;
        }
    }
We can infer the list that contains our list node by seeing if our node has expired. If the pointer inside the list node is non-null, then that pointer points to the object whose list we belong to. If the pointer inside the list node is null, then the tracking pointer has expired, and the node is an orphan and lives in the orphan list.
public:
    tracking_ptr() : m_node(create_node(orphans<T>, nullptr)) { }
Default-constructing a tracking pointer creates an already-orphaned pointer. Make a node for it on the orphans list.
tracking_ptr(tracking_ptr const& other) : m_node(create_node(other.containing_list(), *other.m_node)) { }
Copy-constructing a tracking pointer creates a new node in the same list as the source.
tracking_ptr(tracking_ptr&& other) noexcept : m_node(std::exchange(other.m_node, create_node(orphans<T>, nullptr))) { }
Move-constructing a tracking pointer adopts the node from the source and then orphans the source.
T* get() const noexcept { return *m_node; }
When you ask for the pointed-to object, we get it from the list node. This will return nullptr for orphans.
    ~tracking_ptr() { containing_list().erase(m_node); }
Destructing the tracking pointer removes it from whatever list its node belongs to.
⟦ assignment operators elided for expository purposes ⟧ };
I’m not going to bother with the assignment operators since they aren’t really the focus of the discussion.
template<typename T>
struct trackable_object
{
private:
    friend struct tracking_ptr<T>;

    tracker_list<T> m_trackers;

    T* outer() noexcept { return static_cast<T*>(this); }
The trackable object has a list of all the tracking pointers that are tracking it. The helper function outer gives us a pointer to the outer object.
void update_trackers(T* p) noexcept { for (auto& node : m_trackers) { node = p; } }
When the object wants to change what all its tracking pointers refer to, it can use update_trackers. This doesn’t move the nodes anywhere; you have to do that as a separate step.
public: trackable_object() = default;
Constructing the trackable object merely constructs an empty list.
~trackable_object() { update_trackers(nullptr); orphans<T>.splice(orphans<T>.begin(), m_trackers); }
When the trackable object destructs, all the active tracking pointers now point to nothing, and all of the nodes are moved to the orphans list.
trackable_object(const trackable_object&) : trackable_object() { }
Copy-constructing a trackable object creates a separate tracking list and doesn’t affect the tracking list of the copied-from object.
trackable_object(trackable_object&& other) noexcept : m_trackers(std::move(other.m_trackers)) { update_trackers(outer()); }
Move-constructing a trackable object transfers the tracking pointers to the new object, and we update those tracking pointers to point to the newly-constructed object.
trackable_object& operator=(const trackable_object&) { return *this; }
Copy-assigning a trackable object does nothing to the tracking pointers.
    trackable_object& operator=(trackable_object&& other) noexcept
    {
        update_trackers(nullptr);
        orphans<T>.splice(orphans<T>.begin(), m_trackers);
        m_trackers.splice(m_trackers.begin(), other.m_trackers);
        update_trackers(outer());
        return *this;
    }
Move-assigning a trackable object expires its existing tracking pointers (sets the pointers to null and then moves them to the orphans list), and then steals all the tracking pointers from the source object (and setting the pointers to point to the assigned-to object).
tracking_ptr<T> track() { auto new_node = m_trackers.emplace(m_trackers.end(), outer()); return tracking_ptr<T>(new_node); }
To create a tracking pointer, we create a new node in our tracking list and tell the tracking pointer to wrap an iterator to it.
Okay, that’s a quick implementation of the basic design. What do we see?
One annoying thing is that creating an empty tracking pointer performs an allocation. You sort of hope that empty tracking pointers are noexcept constructible and are basically free, but that’s sadly not the case.
Move construction is also potentially-throwing since we have to leave an orphan node in the source object.
We can avoid this problem by using a sentinel iterator to indicate an empty tracking pointer. An obvious choice for this is orphans<T>.end(), since it is already there and doesn’t become invalidated.
Unfortunately, this doesn’t work because you are not allowed to compare iterators from different containers. The comparison if (m_node == orphans<T>.end()) is illegal if m_node is not an orphan, but the way we set things up, the way to check if a node is an orphan is to dereference it to look at the pointer inside, and we can’t do that either because the end() iterator is not dereferencable.
One solution is to preallocate a single node on the orphans list and use that for all orphans. Now instead of creating a node on the orphan list, we just grab the orphans<T>.begin() node. This preallocated node has a null pointer inside, so we can look at the wrapped pointer to detect that we have an orphan, and only then do we compare it against orphans<T>.begin() to see if we have the special orphan (in which case we never erase it).
Another solution is for the tracking pointer to use a std::optional to hold the iterator. An empty optional would mean that the node is already orphaned.
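As a rough sketch of that idea (my illustration, not code from the article), the member becomes an optional iterator, and the empty state stands in for the orphan node in the default-constructed and moved-from cases:

```cpp
template<typename T>
struct tracking_ptr
{
    tracking_ptr() noexcept = default; // empty optional == already orphaned

    tracking_ptr(tracking_ptr&& other) noexcept
        : m_node(std::exchange(other.m_node, std::nullopt)) {}

    T* get() const noexcept { return m_node ? **m_node : nullptr; }

private:
    std::optional<tracker_node_ptr<T>> m_node; // alias from the top of this post
};
```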
Both of these solutions still have the problem that there is an orphan list at all. In fact, the way we set things up, we have multiple orphan lists, one for each type T. We can consolidate this down to a single orphan list by using type erasure and making the tracking list a std::list<void*>, and then applying a lot of casting when we pull pointers out of the nodes.
Another problem with orphan lists is multithreaded access. We already said that tracking pointers and the objects they track must all belong to the same thread due to the possibility of observing an object in mid-move. But a shared orphan list means that all tracking pointers and tracked objects must belong to the same thread because they can collide on the orphan list. You can fix this by adding a mutex to the orphan list, but this could end up being a multithreaded choke point.
We could avoid unsafe multithreaded access to the orphan lists by making the orphan lists immutable: The only node on the orphan lists is the special sentinel. The entries on the tracking list would now have to be a std::pair<T*, tracking_ptr<T>*>: The T* is used by the tracking pointer to find the trackable object, and the tracking_ptr<T>* is used by the trackable object so it can hunt down all the active tracking pointers, erase their m_node, and set the m_node to the special sentinel orphan.
But wait, creating the orphan list could throw an exception due to low memory, so that’s another thing to worry about. We can get rid of the orphan lists entirely by bringing back the std::optional<tracker_node_ptr<T>> as the m_node. You would combine this with the std::pair so that when the tracked object wants to expire its tracking nodes, it can reset all of the m_nodes.
So yes, you could do this with std::list, but we had to work around multiple weaknesses. The prohibition on comparing iterators from different containers forced us to come up with somewhat clunky workarounds, and the inability to extract list nodes without attaching them to another list forced us to create a “holding pen” for all the orphaned nodes.
The post Thoughts on creating a tracking pointer class, part 16: Second attempt to use a list appeared first on The Old New Thing.
Thoughts on creating a tracking pointer class, part 15: A custom shared pointer
Aug. 29th, 2025 02:00 pm
Last time, we made our trackable object implementation’s constructors and assignment operations non-throwing, although it came at a cost of making the act of creating a tracking pointer potentially-throwing. At the end, I noted that we didn’t use all the features of C++ shared pointers. We never used weak pointers or thread safety, so we can replace shared pointers with a custom version that supports only single-threaded shared references.
The single-threaded simplification is significant because it removes the need for atomic operations and allows the compiler to do reordering and coalescing of reference count manipulations.
template<typename T>
struct tracking_ptr_base
{
    tracking_ptr_base() noexcept = default;
    tracking_ptr_base(tracking_ptr_base const& other) noexcept :
        m_ptr(other.copy_ptr()) {}
    tracking_ptr_base(tracking_ptr_base&& other) noexcept = default;
    ~tracking_ptr_base() = default;

    tracking_ptr_base& operator=(tracking_ptr_base const& other) noexcept
    {
        m_ptr = other.copy_ptr();
        return *this;
    }
    tracking_ptr_base& operator=(tracking_ptr_base&& other) noexcept = default;

    operator bool() const noexcept { return get() != nullptr; }

protected:
    friend struct trackable_object<T>;

    struct data
    {
        data(T* tracked) noexcept : m_tracked(tracked) {}
        unsigned int m_refs = 1;
        T* m_tracked;
    };

    struct deleter
    {
        void operator()(data* p)
        {
            if (--p->m_refs == 0) {
                delete p;
            }
        }
    };

    tracking_ptr_base(T* p) noexcept : m_ptr(new data(p)) { }

    T* get() const noexcept { return m_ptr ? m_ptr->m_tracked : nullptr; }

    void set(T* ptr) noexcept { m_ptr->m_tracked = ptr; }

    std::unique_ptr<data, deleter> copy_ptr() const noexcept
    {
        if (m_ptr) ++m_ptr->m_refs;
        return std::unique_ptr<data, deleter>(m_ptr.get());
    }

    std::unique_ptr<data, deleter> m_ptr;
};
This looks like a lot of code, but it’s really accomplishing very little.
The basic operations on the pointer are incref and decref. The incref operation increments the m_refs, and the decref operation decrements the m_refs, destroying the data if the reference count goes to zero.
Copying the tracking_ptr_base copies the pointer and performs an incref. Destructing the tracking_ptr_base performs a decref, and moving the tracking_ptr_base moves the pointer from the source to the destination, decref’ing any existing pointer in the destination. (The responsibility to decref the pointer moves from the source to the destination.)
By building on top of std::unique_ptr with a custom deleter, we get cleanup and move implementations for free.
Okay, I think this mostly wraps up the development of a tracking pointer implementation. I have no idea if it is any use, but it was a nice exercise trying to figure out how to implement it.
(Okay, there’s a follow-up I had promised to write after the main series is over. So there’s at least one more part to go.)
The post Thoughts on creating a tracking pointer class, part 15: A custom shared pointer appeared first on The Old New Thing.
Pia and the Little Tiny Things - Page 337
Sep. 2nd, 2025 12:00 am![[syndicated profile]](https://www.dreamwidth.org/img/silk/identity/feed.png)

New comic!
He's still feeling a little mad at life.
--
hey if you didn't check out last update's commentary, I shared some announcements, explaining stuff, but here's a summarized version:
- I'm in the process of leaving Hiveworks, so things are moving, for example the online shop will move to another platform (probably Big Cartel) but that's still in process
- I'm working on another big comic project with a publisher but I need more time to work on it (it's the new version of Headless Bliss!) so that means that
- Little Tiny Things updates will slow down to one update per week (every Tuesday)! At least for the moment.
Introduction to Postgres Indexes
Sep. 1st, 2025 07:50 pm![[syndicated profile]](https://www.dreamwidth.org/img/silk/identity/feed.png)
This is Part 1 of a 2-part series offering a practical, hands-on approach to database indexes. We’ll cover what B Trees are, with a focus on deeply understanding and internalizing how they store data on disk and how your database uses them to speed up queries.
This will set us up nicely for part 2, where we’ll explore some interesting, counterintuitive ways to press indexes into service to achieve great querying performance over large amounts of data.
There are other types of database indexes besides B Trees, but B Tree indexes are the most common, which is why they’ll be the exclusive focus of this post.
Everything in these posts will use Postgres, but it’s all directly applicable to other relational databases (like MySQL). All the queries I’ll be running are against a simple books database which I scaffolded and had Cursor populate with about 90 million records. The schema for the database, as well as the code to fill it, is in this repo. If you’d like to follow along on your own: sql/db_create.sql has the DDL, and npx tsx insert-data/fill-database.ts will run the code to fill it.
We’ll be looking at some B Tree visualizations as we go. Those were put together with a web app I had Cursor help me build.
Editors’ note: Need to bone up on PostgreSQL all around? Our course Complete Intro to SQL & PostgreSQL from Brian Holt will be perfect for you.
Setting some baselines
Just for fun, let’s take a look at the first 10 rows in the books table. Don’t look too closely; again, this was all algorithmically generated by AI. The special characters at the beginning were my crude way of forcing the (extremely repetitive) titles to spread out.

That’s the last time we’ll be looking at actual data. From here forward we’ll look at queries, the execution plans they generate, and how indexes might, or might not, be able to help. Rather than the psql terminal utility, I’ll be running everything through DataGrip, which is an IDE for databases. The output is identical, except with nicely numbered lines, which will make things easier to talk about as we go.
Let’s get started. Let’s see what the prior query looks like by putting explain analyze before it. This tells Postgres to execute the query and return the execution plan it used, as well as its performance.
explain analyze
select * from books limit 10;

We asked for 10 rows. The database did a sequential scan on our books table, but with a limit of 10. This couldn’t be a simpler query, and it returned in less than one twentieth of a millisecond. This is hardly surprising (or interesting). Postgres reached in and grabbed the first ten rows.
Let’s grab the first 10 books, but this time sorted alphabetically.

Catastrophically, this took 20 seconds. With 90 million rows in this table, Postgres now has to (kind of) sort the entire table, in order to know what the first 10 books are.
I say kind of since it doesn’t really have to sort the entire table; it just has to scan the entire table and keep track of the 10 rows with the lowest titles. That’s what the top-N heapsort is doing.
And it’s doing that in parallel. We can see two child workers getting spawned (in addition to the main worker already running our query) to each scan about a third of the table. Then the Gather Merge pulls from each worker until it has the top 10. In this case it only needed to pull the top 7 rows from each worker to get its 10; this is reflected in lines 3-9 of the execution plan.
Line 5 makes this especially clear
-> Sort (cost=2839564.34..2934237.14 rows=37869122 width=112) (actual time=20080.358..20080.359 rows=7 loops=3)
Notice the loops=3, and the rows=37 million. Each worker is scanning its share of the table, and keeping the top 7 it sees.
These 3 groups of 7 are then gathered and merged together in the Gather Merge on line 2
-> Gather Merge (cost=2840564.36..11677309.78 rows=75738244 width=112) (actual time=20093.790..20096.619 rows=10 loops=1)
Rather than just slapping an index in, and magically watching the time drop down, let’s take a quick detour and make sure we really understand how indexes work. Failing to do this can result in frustration when your database winds up not picking the index you want it to, for reasons that a good understanding could make clear.
Indexes
The best way to think about a database index is in terms of an index in a book. These list all the major terms in the book, as well as all the pages each term appears on. Imagine you have a 1,000 page book on the American Civil War, and wanted to know what pages Philip Sheridan is mentioned on. It would be excruciatingly slow to just look through all 1,000 pages, searching for that name. But if there’s a 30 or so page index, your task is considerably simpler.
Before we go further, let’s look at a very basic index over a numeric id column

This is a B Tree, which is how (most) indexes in a database are stored.
Start at the very, very bottom. Those blue “leaf” nodes contain the actual data in your index. These are the actual id values. This is a direct analogue to a book’s index.
So what are the gold boxes above them? These help you find the leaf node that holds the value you’re looking for.
Let’s go to the very top, to the root node of our B Tree. Each of these internal nodes will have N key values, and N+1 pointers. If the value you’re looking for is strictly less than the first value, go down that first, left-most arrow and continue your search. If the value you’re looking for is greater than or equal to that first key, but strictly less than the next key, take the second arrow. And so on. In real life the number of keys in each of these nodes will be determined by how many of them can fit into a single page on disk (and will usually be much more than 3).
So with this B Tree, if we want to find id = 33, we start at the root. The id 33 is not < 19, so we don’t take the first arrow. But 33 is >=19 and <37, so we take the middle arrow.
Now we repeat. The id 33 is not < 25, so we don’t take the leftmost path. The id 33 is not >= 25 AND < 31, so we don’t take the middle path. But 33 is greater than 31 (it better be, this is the last path remaining), so we take the rightmost path. And that takes us to the leaf node with our key value.

Notice also that these leaf nodes have pointers forward and backward. This allows us to not only find a specific key value, but also a range of values. If we wanted all ids > 33, we could do as we did, and just keep reading.
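In query terms, that leaf walk is what a range filter plus an order by on the same column turns into. Here’s a sketch against the books table; with a small limit, the planner will typically read straight off the primary key index:
explain analyze
select *
from books
where id > 33
order by id
limit 100;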

But now what? Say we ran a query like SELECT * FROM books WHERE id = 33, and we’ve arrived at a leaf node in our index with our key. How do we get all the data associated with that key? In other words, how do we get the actual row in the database for that value?
The thing I’ve left off so far is that the leaf nodes also contain pointers into the actual table, to where the row for that value lives.

Returning briefly to our analogy with a book’s index, those heap pointers correspond to the page number beside each index entry, telling you where to go in the book to see the actual content.
So the full story to find a single row in our database by an id value, via an index, would actually look more like this:

We’ll talk later about these heap reads (or lack thereof) when we get into covering indexes and the Index Only Scan operation.
Bear with me a little longer. Before we look at what an index on title would look like, and create one in our database to run our query against, let’s take a slightly deeper look at B Trees. Internalizing how they work can be incredibly valuable.
B Trees run in O(LogN) time
You may have been taught a fun little math fact in school: if you were given a penny on January 1st, then had your penny doubled on January 2nd, then had that new amount (2 cents) doubled on January 3rd, and so on, you’d have about $10 million before February. That’s the power of exponential operations. Anytime you’re repeatedly multiplying a value by some constant (which is all doubling is, for the constant 2), it will become enormous, fast.
Now think of a more depressing, reverse scenario. If someone gave you $10 million on January 1st, but with the condition that your remaining money would be halved each day, you’d have a lowly cent remaining on Feb 1st. This is a logarithm; it’s the inverse of exponentiation. Rather than multiplying a value by some constant, we divide it by some constant. No matter how enormous, it will become small, fast.
This is exactly how B Trees work. In our example B Tree above, there were 9 leaf pages. Our internal nodes had up to 3 pointers. Notice that we were able to find out the exact leaf node we wanted by following only 2 of those gold nodes’ arrows (which is also the depth of the tree).
9 divided by 3 is 3
3 divided by 3 is 1
Or, more succinctly, log₃(9) = 2, which reads as “logarithm base 3 of 9 is 2.”
But these small values don’t really do this concept justice. Imagine an index whose leaves spanned 4 billion pages, and whose internal nodes had only 2 pointers each (both of these assumptions are unrealistic). You’d still need only 32 page reads to find any specific value.
2³² = ~4 billion
and also
log₂(~4 billion) = 32.
They’re literally inverse operations of each other.
How deep are real indexes?
Before we move on, let’s briefly look at how deep a real Postgres index is on a somewhat large amount of data. The books table with 90 million entries already has an index defined on the primary key id field, which is a 32-bit integer. Without going into gross detail about what all is stored on a B Tree node (N keys, N+1 offsets to other nodes, some metadata and headers, etc.), ChatGPT estimates that Postgres can store between 400 and 500 keys per node for an index on a 32-bit integer.
Let’s check that. There’s a Postgres extension for just this purpose.
CREATE EXTENSION IF NOT EXISTS pageinspect;
and then
SELECT * FROM bt_metap('books_pkey');
which produces
magic  | version |  root  | level | fastroot | fastlevel | last_cleanup_num_delpages | last_cleanup_num_tuples | allequalimage
-------+---------+--------+-------+----------+-----------+---------------------------+-------------------------+--------------
340322 |       4 | 116816 |     3 |   116816 |         3 |                         0 |                      -1 | t
Note the level of 3, which is our index’s depth. That means it would take just 3 page reads to arrive at the correct B Tree leaf for any value (this excludes reading the root node itself, which is usually just stored in memory).
Checking the math, log₄₅₀(90,000,000) comes out to … 2.998
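(If you want to reproduce that number, Postgres’s log function accepts the base as its first argument, so you can check it right from a SQL prompt.)
select log(450, 90000000);  -- roughly 2.998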
Taking an index for a spin
Let’s run a quick query by id, with the primary key index that already exists, and then look at how we can create one on title, so we can re-run our query to find the first 10 books in order.
explain analyze
select *
from books
where id = 10000;
which produces the following

We’re running an index scan. No surprises there. The Index Cond
Index Cond: (id = 10000)
is the condition Postgres uses to navigate the internal nodes; those were the gold nodes from the visualization before. In this case, it predictably looks for id = 10000
Re-visiting our titles sort
Let’s take a fresh look at this query.
select *
from books
order by title
limit 10;
This time let’s define an index, like so
CREATE INDEX idx_title ON books(title);
This index would look something like this (conceptually at least).

Now our query runs in less than a fifth of a millisecond.

Notice what’s missing from this execution plan that was present on the previous query, when we looked up a single index value.
Did you spot it?
It’s the Index Cond. We’re not actually … looking for anything. We just want the first ten rows, sorted by title. The index stores all books, sorted by title. So the engine just hops right down to the start of the index, and simply reads the first ten rows from the leaf nodes (the blue nodes from the diagrams).
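Since the leaf nodes are linked in both directions, the same index also serves the reverse ordering; Postgres can walk the B Tree backwards, so the descending version of the query should be similarly cheap (worth confirming with explain analyze):
explain analyze
select *
from books
order by title desc
limit 10;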
More fun with indexes
Let’s go deeper. Before we start, I’ll point out that the pages column was filled with random values from 100-700. So there are 600 possible values for pages, each randomly assigned.
Let’s look at a query to read the titles of books with the 3 maximum values for pages. And let’s pull a lot more results this time; we’ll limit it to one hundred thousand entries
explain analyze
select title, pages
from books
where pages > 697
limit 100000;

As before, we see a parallel sequential scan. We read through the table, looking for the first 100,000 rows. Our condition matches very few results, so the database has to discard over 6 million records before it finds the first 100,000 matching our condition
Rows Removed by Filter: 6627303
The whole operation took 833ms.
Let’s define an index on pages
CREATE INDEX idx_pages ON books(pages);
You might notice that the pages column is by no means unique, but that doesn’t affect anything: the leaf pages can easily contain duplicate entries.

Everything else works the same as it always has: the database can quickly jump down to a specific value. This allows us to query a particular value, or even grab a range of values sorted on the column we just looked up. For example, if we want all books with pages > 500, we just seek to that value, and start reading.
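For instance, using the idx_pages index we just created, a range filter combined with an order by on pages is exactly that seek-then-read. Running it through explain analyze shows how the planner handles it:
explain analyze
select pages
from books
where pages > 500
order by pages
limit 10;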
Let’s re-run that query from before
explain analyze
select title, pages
from books
where pages > 697
limit 100000;

There’s a lot going on. Our index is being used, but not like before
-> Bitmap Index Scan on idx_pages (cost=0.00..4911.16 rows=451013 width=0) (actual time=38.057..38.057 rows=453891 loops=1)
Bitmap Index Scan means that the database is scanning our index and noting the heap locations of records matching our filter. It literally builds a bitmap of matching locations, hence the name. It then sorts those locations in order of disk access.
It then pulls those locations from the heap. This is the Bitmap Heap Scan on line 5
-> Parallel Bitmap Heap Scan on books (cost=5023.92..1441887.67 rows=187922 width=73) (actual time=41.383..1339.997 rows=33382 loops=3)
But remember, this is the heap, and it’s not ordered on pages, so those random locations may have other records not matching our filter. This is done in the Index Recheck on line 6
Recheck Cond: (pages > 697)
which removed 1,603,155 results.
Why is Postgres doing this, rather than just walking our index, and following the pointers to the heap, as before?
Postgres keeps track of statistics on which values are contained in its various columns. In this case, it knew that relatively few values would match on this filter, so it chose to use this index.
But that still doesn’t answer why it didn’t use a regular old index scan, following the various pointers to the heap. Here, Postgres decided that, even though the filter would exclude a large percentage of the table, it would still need to read a lot of heap pages, and following all those random pointers from the index to the heap would be expensive. Those pointers point in all manner of random directions, and random I/O is slow. In fact, Postgres also stores just how closely (or how badly) those pointers correspond to the underlying order on the heap for that column, via something called correlation. So if, somehow, the book entries in the heap just happened to be stored (more or less) in increasing values of pages, there would be a high correlation on the pages column, and this index would be more likely to be used for this query.
For these reasons, Postgres thought it would be better to use the index to only keep track of which heap locations had relevant records, and then fetch those heap locations in heap order, after the Bitmap scan sorted them. This results in neighboring chunks of memory in the heap being pulled together, rather than frequently following those random pointers from the index.
Again, Random I/O tends to be expensive and can hurt query performance. This was not faulty reasoning at all.
But in this case Postgres wagered wrong. The same query now runs slower than the plain sequential scan from before: 1.38 seconds instead of 833ms. Adding an index made this query run slower.
Was I forcing the issue with the larger limit of 100,000? Of course. My goal is to show how indexes work, how they can help, and occasionally, how they can lead the query optimizer to make the wrong choice, and hurt performance. But please don’t think an index causing a worse, slower execution plan is an unhinged, unheard of eventuality which I contrived for this post; I’ve seen it happen on very normal queries on production databases.
The road not traveled
Can we force Postgres to do a regular index scan, to see what might have been? It turns out we can; we can (temporarily) turn off bitmap scans, and run the same query.
SET enable_bitmapscan = off;
explain analyze
select title, pages
from books
where pages > 697
limit 100000;
and now our query runs in just 309ms

Clearly Postgres’s statistics led it astray this time. They’re based on heuristics and probabilities, along with estimated costs for things like disk access, and that won’t always work out perfectly.
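One housekeeping note: enable_bitmapscan is a session-level planner setting, so once you’re done experimenting it’s worth putting it back to its default:
RESET enable_bitmapscan;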
When stats get things right
Before we move on, let’s query all the books with an above-average number of pages
explain analyze
select title, pages
from books
where pages > 400
limit 100000;
In this case Postgres was smart enough to not even bother with the index.

Postgres’ statistics told it that this query would match an enormous number of rows, and just walking across the heap would get it the right results more quickly than bothering with the index. And in this case it assumed correctly. The query ran in just 37ms.
Covering Indexes
Let’s go back to this query
explain analyze
select title, pages
from books
where pages > 697
limit 100000;
It took a little over 800ms with no index, and over 1.3 seconds with an index on just pages.
The shortcoming of our index was that it did not include title, which is needed for this query; Postgres had to keep running to the heap to retrieve it.
Your first instinct might be to just add title to the index.
CREATE INDEX idx_pages_title ON books(pages, title);
Which would look like this:

This would work fine. We don’t need to filter on title, only on pages, but having those titles in the gold non-leaf nodes wouldn’t hurt one bit. Postgres would just ignore them, find the starting point for all books with pages > 697, and start reading. There’d be no need for heap access at all, since the titles are right there.
Let’s try it.

Our query now runs in just 32ms! And we have a new operation in our execution plan.
-> Index Only Scan using idx_pages_title on books (cost=0.69..30393.42 rows=451013 width=73) (actual time=0.243..83.911 rows=453891 loops=1)
Index Only Scan means that only the index is being scanned. Again, there’s no need to look anything up in the heap, since the index has all that we need. That makes this a “covering index” for this query, since the index can “cover” it all.
More or less.
Heap Fetches: 0
That’s Line 4 above, and it is not as redundant as it might seem. Postgres does have to consult something called the visibility map to make sure the values in your index are up to date, given how Postgres handles updates through its MVCC system. If those values are not up to date, it will have to hit the heap. But unless your data are changing extremely frequently, this should not be a large burden.
In this case, it turns out Postgres had to go to the heap zero times.
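If you try this on your own data and see nonzero heap fetches, the visibility map is probably stale; a manual vacuum (which can take a while on a table this size) generally brings it back up to date:
VACUUM (ANALYZE) books;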
A variation on the theme
If you’re using Postgres or Microsoft SQL Server you can create an even nicer version of this index. Remember, we’re not actually querying on title here, at all. We just put it in the index so the title values would be in the leaf nodes, so Postgres could read them, without having to visit the heap.
Wouldn’t it be nice if we could put those titles only in the leaf nodes? This would keep our internal nodes smaller, with less content, which, in a real index, would let us cram more key values together, resulting in a smaller, shallower B Tree that would potentially be faster to query.
We do this with the INCLUDE clause when creating our index (in databases that support this feature).
CREATE INDEX idx_pages_include_title ON books(pages) INCLUDE(title);
This tells Postgres to create our index on the pages column, as before, but also include the title field in the leaf nodes. It would look like this, conceptually.

And re-running that same query, we see that it does run a bit faster. From 32ms down to just 21ms.

To be clear, it’s quite fast either way, but a nice 31% speedup isn’t something to turn down if you’re using a database that supports this feature (MySQL does not).
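If you’re curious what the space difference looks like, Postgres can report the size of each index directly (using the two index names we created above; the exact numbers will depend on your data):
select pg_size_pretty(pg_relation_size('idx_pages_title'))         as title_in_keys,
       pg_size_pretty(pg_relation_size('idx_pages_include_title')) as title_included;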
Pay attention to your SELECT clauses
There’s one corollary to the above: don’t request things you don’t need in your queries; don’t default to SELECT *
Requesting only what you need will not only reduce the amount of data that has to travel over the wire, but in extreme cases can mean the difference between an index scan, and an index-only scan. In the above query, if we’d done SELECT * instead of SELECT title, pages, none of the indexes we added would have been able to help; those heap accesses would have continued to hurt us.
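As a concrete illustration (same filter as before, only the select list changes), the first query below can be satisfied entirely from the covering index, while the second has to visit the heap for every match:
-- everything this query needs lives in the index
explain analyze
select title, pages from books where pages > 697 limit 100000;

-- the remaining columns aren't in the index, so every match costs a heap visit
explain analyze
select * from books where pages > 697 limit 100000;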
Wrapping up
To say that this post is only scratching the surface would be an understatement. The topics of indexing and query optimization could fill entire books, and of course they have.
Hopefully, this post has you thinking about indexes the right way. Thinking about how indexes are stored on disk, and how they’re read. And never, ever forgetting about the fact that, when scanning an index, you may still need to visit the heap for every matched entry you find, which can get expensive.
Editor’s note: our The Complete Course for Building Backend Web Apps with Go includes setting up a PostgreSQL database and running it in Docker, all from scratch.
Code deploy happening shortly
Aug. 31st, 2025 07:37 pm![[staff profile]](https://www.dreamwidth.org/img/silk/identity/user_staff.png)
![[site community profile]](https://www.dreamwidth.org/img/comm_staff.png)
Per the dw_news post regarding the MS/TN blocks, we are doing a small code push shortly in order to get the code live. As per usual, please let us know if you see anything wonky.
There is some code cleanup we've been doing that is going out with this push but I don't think there is any new/reworked functionality, so it should be pretty invisible if all goes well.