Scaling financial transactions in a database can be a significant challenge, especially when correctness and consistency are paramount. For one financial system, PostgreSQL’s Serializable Transaction Isolation level was employed to ensure data accuracy. However, this approach led to performance bottlenecks caused by serialization conflicts and the retries they trigger.


The Problem: Serialization Conflicts in Foreign Key Checks

When dealing with financial transactions, PostgreSQL’s Serializable Transaction Isolation ensures correctness by preventing anomalies such as phantom reads and write skew. However, this isolation level comes with a trade-off: transactions can be aborted due to serialization conflicts and must be retried.

Despite spending significant amounts of time optimizing the interactions between transactions, the system continued to experience severe performance degradation due to retries stemming from those serialization conflicts.

At one point, voices were getting louder that at our scale Serializable might not be the right choice anymore, and it became harder and harder for me to push back on those ideas, as I was unable to explain the huge number of conflicts.

Then the engineer who had by then become the lead for that system found something after another round of detailed investigation:

Index page contention due to sequential keys on foreign key constraint checks

Example: A Financial Transaction in Trouble

To better understand the issue, consider a simplified model of a financial transaction in the database.

Imagine two tables:

  1. Posting: Represents a financial transaction, identified by a PostingID (the primary key).
  2. PostingLines: Represents the individual lines of the financial transaction, each referencing the parent Posting via a foreign key on the PostingID column.

For every financial transaction, a new record is created in the Posting table, and two or more corresponding PostingLines are inserted into the PostingLines table. Here’s how it might look:

Posting Table

PostingID   TransactionDate   Amount
9           2024-01-01        300.00
10          2024-01-01        150.00

PostingLines Table

LineID   PostingID   Account   Amount
1        9           A123      300.00
2        9           A456      300.00
3        10          A789      150.00
4        10          A123      150.00

In this setup:

  • The PostingID column in PostingLines is a foreign key referencing the Posting table’s primary key.
  • Every time a new financial transaction is created, inserts happen in both the Posting and PostingLines tables.
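
To make the setup concrete, here is a minimal sketch of what such a schema could look like; the table and column names (posting, posting_lines, posting_id and so on) are illustrative, not the real system’s:

    -- Parent table: one row per financial transaction
    CREATE TABLE posting (
        posting_id       BIGSERIAL PRIMARY KEY,   -- sequential key, backed by a B-tree index
        transaction_date DATE          NOT NULL,
        amount           NUMERIC(12,2) NOT NULL
    );

    -- Child table: two or more lines per posting, each referencing the parent
    CREATE TABLE posting_lines (
        line_id    BIGSERIAL PRIMARY KEY,
        posting_id BIGINT NOT NULL REFERENCES posting (posting_id),  -- checked against the parent's index
        account    TEXT          NOT NULL,
        amount     NUMERIC(12,2) NOT NULL
    );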

Here’s where the problem arises:

  • Because the PostingID values are sequential, the B-tree index clusters nearby IDs (e.g., 9 and 10) on the same index page.
  • When two transactions insert Posting and PostingLines records simultaneously, they both trigger foreign key checks on the same index page.
  • This leads to index page contention, causing serialization conflicts under the Serializable Transaction Isolation level.
B-Tree Index Structure
                                    [8]                          
                                /         \
                            [4]             [12]                 
                           /  \             /   \
                   [1,2,3] [4,5,6,7] [8, ->9,10<- ,11] [12,13,14,15]

This phenomenon, known as index page contention due to sequential keys or Last Page Insert Contention, forced retries and significantly impacted throughput.
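
To illustrate the pattern, here is a sketch of two completely independent postings running concurrently under Serializable; it assumes the illustrative schema above and uses the explicit IDs 9 and 10 from the example tables rather than the application-generated keys of the real workload:

    -- Session A (first connection)
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO posting (posting_id, transaction_date, amount)
    VALUES (9, '2024-01-01', 300.00);
    INSERT INTO posting_lines (posting_id, account, amount)
    VALUES (9, 'A123', 300.00),
           (9, 'A456', 300.00);            -- FK check looks up key 9 in the parent's index

    -- Session B (second connection, running at the same time)
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO posting (posting_id, transaction_date, amount)
    VALUES (10, '2024-01-01', 150.00);
    INSERT INTO posting_lines (posting_id, account, amount)
    VALUES (10, 'A789', 150.00),
           (10, 'A123', 150.00);           -- FK check looks up key 10, on the same index page as key 9

    -- When both sessions COMMIT, one of them may be rolled back with
    --   ERROR: could not serialize access due to read/write dependencies among transactions
    -- (SQLSTATE 40001), even though the two postings never touch each other's data.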

A Conflict That Isn’t Real

What makes this issue especially frustrating is that there is no actual conflict between the two transactions. Neither transaction’s data references or influences the other.

The conflict occurs purely because of how PostgreSQL optimizes the foreign key checks:

  • PostgreSQL uses the index to validate that a referenced key exists.
  • Since the sequential PostingID values land on the same index page, two transactions attempting inserts at the same time end up reading and writing to the same page.
  • This triggers a serialization conflict due to how the Serializable isolation level works, even though the transactions are completely independent.
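
The predicate locks behind this behaviour can be observed from the outside. A sketch (run from a third connection while both transactions above are still open) that lists the SIRead locks Serializable has taken; page-level entries on the parent’s primary-key index show both sessions locking the same index page:

    -- List the SIRead (predicate) locks held by open serializable transactions.
    SELECT locktype,                            -- 'relation', 'page' or 'tuple'
           relation::regclass AS index_or_table,
           page,
           pid
    FROM   pg_locks
    WHERE  mode = 'SIReadLock'
    ORDER  BY index_or_table, page;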

The Breakthrough: Solving the Issue with HASH Indexes

After extensive troubleshooting and optimization, the realization struck: the root cause wasn’t the actual data the transactions were fetching — it was the index page contention due to sequential keys. The foreign key checks, relying on a B-tree index with closely spaced IDs, were causing multiple transactions to simultaneously read and write to the same index page.

The solution was straightforward: adding a HASH index on the primary key column used for the foreign key check. Since foreign key checks only require equality comparisons, a HASH index was an ideal choice. By ensuring that keys were distributed across pages based on their hash value, the HASH index avoided the contention issues entirely.
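
In DDL terms the fix is a single statement. A minimal sketch against the illustrative schema above (the index name is invented) could look like this:

    -- Add a hash index on the referenced key, alongside the B-tree that backs
    -- the primary key (the PK still needs its B-tree to enforce uniqueness).
    -- Foreign key checks are pure equality lookups, which hash indexes support,
    -- and hashing spreads neighbouring keys like 9 and 10 over different buckets.
    CREATE INDEX CONCURRENTLY posting_posting_id_hash_idx
        ON posting USING HASH (posting_id);

CONCURRENTLY avoids blocking writes while the index is built, and since PostgreSQL 10 hash indexes are WAL-logged and crash-safe, making them suitable for a production system like this one.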

Hash Index Structure (Using hash(key) = key MOD 4)
     ------------------------------------------------------------------
    |                           Hash Buckets                           |
     ------------------------------------------------------------------
    | (Page 1)   | (Page 2)         | (Page 3)          | (Page 4)     |
     ------------------------------------------------------------------
    | 4, 8, 12   | 1, 5, ->9<-, 13  | 2, 6, ->10<-, 14  | 3, 7, 11, 15 |
     ------------------------------------------------------------------

What followed was one of the fastest issue resolutions I have ever experienced:

  • The HASH index was implemented in development.
  • Deployed to production within the same afternoon.
  • The result was an immediate and drastic improvement in transaction throughput.

Benchmarking Results: Proof of Improvement

To validate the effectiveness of this approach, performance was benchmarked across three configurations:

Configuration                    Total Records   Exceptions   Total Time (ms)   Avg Time/Record (ms)
HASH index (Serializable)        500,000         47           116,934           0.234 (no retry on conflict)
B-tree index (Serializable)      500,000         43,255       134,466           0.269 (no retry on conflict)
Read Committed (No HASH index)   500,000         0            115,276           0.231

Insights and Lessons Learned

  • For foreign key checks in PostgreSQL, HASH indexes are a powerful alternative to B-tree indexes when high concurrency and serialization conflicts are issues.
  • Performance bottlenecks in databases are not always about the data being queried but can stem from how indexes interact with transaction isolation levels.
  • While random UUIDs offer one solution thanks to their almost random distribution across a B-tree index, their applicability depends on the specific use case, and time-ordered variants such as UUID v7 might reintroduce similar issues.
  • We can happily live with having both a B-tree and a hash index on the same column (see the sketch below).
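
With the illustrative schema above, that simply means the primary key keeps its implicit B-tree while the hash index sits next to it; a quick way to check:

    -- Both indexes coexist on the same column: the primary key's B-tree keeps
    -- enforcing uniqueness, the hash index serves equality lookups such as FK checks.
    SELECT indexname, indexdef
    FROM   pg_indexes
    WHERE  tablename = 'posting';
    -- Expected shape of the result (illustrative):
    --   posting_pkey                  | CREATE UNIQUE INDEX ... USING btree (posting_id)
    --   posting_posting_id_hash_idx   | CREATE INDEX ... USING hash (posting_id)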

Conclusion

By switching from a B-tree to a HASH index for foreign key checks, a significant performance issue was resolved, allowing the system to handle high transaction volumes while maintaining strict consistency. This simple yet effective change underscores the importance of understanding the nuances of database indexing and transaction behavior.

Self-contained Runnable Example

Runnable code that produced the benchmark data: https://github.com/Timmeey/PostgresSerializationConflictIndexTesting

NOTE: This code was produced from my first time playing around with Cursor-AI