As big users of PostgreSQL, we had the opportunity to rethink the idioms common in the world of relational DBs. Today, I'll talk about why we stopped using serial integers for our primary keys, and why we now use Universally Unique IDs (or UUIDs) almost everywhere. They can be freely exposed without disclosing sensitive information, they are not predictable, and they are performant.

Primary keys, technical keys and semantic keys

A relational database is a graph where nodes are called entities and edges are called relations. To be able to express a relation between entities, we need to be able to refer uniquely to any entity. That's the role of a primary key: to provide a stable, indexable reference to an entity. You have two ways to construct your primary key: either as a semantic key, or as a technical key.

A semantic primary key is extracted from the entity's attributes (i.e. you use one or several fields of your entity as its primary key).

A technical primary key is completely unrelated to the fields of its entity. It's constructed when the entity is inserted in the DB.

Ideally, semantic keys would be better, as they allow for a very simple way to understand and compare entities. But relational DBs are built upon a fundamentally mutable model. The current state of the world is stored in the database, and entities are subject to mutation. Relational DBs are very good at storing a relatively small amount of data, and are not well suited for immutable databases. That's why, to be able to have stable primary keys, we commonly resort to serial IDs. Serial IDs are technical keys, as they're not related to the actual contents of their entity. To be fair, you can configure your foreign keys with ON UPDATE CASCADE to have primary key updates automatically propagated everywhere within the database, but that doesn't solve the problem if you've exposed primary key values outside the DB.

What's a serial ID?

Basically, a serial ID is a number that increases every time you insert a row. The database does the bookkeeping for you, so it's transparent for the developer. We can then insert values into the table; Postgres will handle the entity_id field.

> insert into entity (attribute) values ('ohai')

The row has been inserted, and the entity has an automatically attributed ID. In PostgreSQL, serial is implemented through a sequence, which acts a bit like a counter. We can inspect its current state:

> select currval('entity_entity_id_seq'::regclass)

The current value corresponds to the last value used as a primary key.

Why serial IDs can be a problem

Information disclosure

OK, so we have a perfectly good primary key for our entities. What's the problem? Since the sequence starts at 1 and is incremented by 1 each time, currval roughly corresponds in our case to the number of rows in the table (not exactly, because of deletions). So, if you want to count the number of rows in a given table, all you have to do is have the system generate a new entity and inspect its ID.

So, do users have access to primary keys? The answer is often yes: primary keys are very often exposed to the user, most often in URLs, as we want URLs to be stable (see "Cool URIs don't change").
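To make the information-disclosure risk of serial IDs concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for PostgreSQL (its AUTOINCREMENT column plays the role of serial), the `entity` table mirrors the one used in this post, and `create_entity` is a hypothetical stand-in for any public action that creates a row and shows its ID to the user.

```python
import sqlite3
import uuid

# Assumption: SQLite stands in for PostgreSQL; AUTOINCREMENT mimics `serial`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity ("
    "entity_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "attribute TEXT)"
)

# Existing data the outside world is not supposed to know the size of.
for i in range(41):
    conn.execute("INSERT INTO entity (attribute) VALUES (?)", (f"row {i}",))

# One deletion: this is why the ID is only a rough estimate of the row count.
conn.execute("DELETE FROM entity WHERE entity_id = 7")


def create_entity(connection, attribute):
    """Hypothetical public action: create a row and expose its ID (e.g. in a URL)."""
    cursor = connection.execute(
        "INSERT INTO entity (attribute) VALUES (?)", (attribute,)
    )
    return cursor.lastrowid


# An outside observer creates one entity and reads the ID handed back.
leaked_id = create_entity(conn, "ohai")

# The real row count, which only the database owner should be able to see.
actual_rows = conn.execute("SELECT COUNT(*) FROM entity").fetchone()[0]

# A random (version 4) UUID exposes no position in any sequence.
random_id = uuid.uuid4()

print(f"ID seen by the user: {leaked_id}")
print(f"Actual number of rows: {actual_rows}")
print(f"UUID alternative: {random_id}")
```

The ID handed to the observer is an upper bound on the number of rows ever inserted, and it is close to the real count whenever deletions are rare. The UUID carries no such information.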