Sometimes when working with a SQL Server database, you need to update a specific table row or insert a new row if the one you were trying to update isn’t already there.
This logic is known as an ‘upsert’ operation i.e. either update or insert a row depending on whether it currently exists.
In this article, I am going to show you how to perform a SQL upsert operation.
Specifically, I will demonstrate how to update a value based on a condition and insert a new row if required. Additionally, you will see how the updated or inserted value can be returned without needing to query the database again.
The specific scenario that I am covering involves a somewhat basic requirement.
The requirement is this; to maintain a numeric counter value that holds the current order number. This order number is to be assigned to incoming customer orders in our application.
The order number should start at 1 and should not exceed 999. After reaching a value of 999 the order number should be reset to 1 and start counting up again from there.
An order counter value will need to be stored and incremented each time that a new order number is required. By doing so, each customer should receive a unique order number for their order.
The goal is to increment the stored order counter value and return the updated order number to the calling application. We want to do this without performing any in-memory calculations on the client, as doing so would risk the same order number being issued more than once around the same time. All of the logic to update the counter value and retrieve the order number should happen on the server.
Setting things up
In order to demonstrate the conditional upsert operation described in the previous section, let’s create a sample table in a test database.
I recommend using SQL Server Management Studio to create a blank database and to execute the SQL queries that are documented in this article. Alternatively, you can use the cross-platform Azure Data Studio.
The demonstration table can be called ‘Settings’, defined as follows.
CREATE TABLE [dbo].[Settings]( [Name] [nvarchar](50) NOT NULL, [IntegerValue] [int] NOT NULL ) ON [PRIMARY]
The Settings table contains two columns; ‘Name’ which will hold the name of the setting and ‘IntegerValue’ which will hold a numeric value for the setting.
In our case, there will be a single Settings table row that stores ‘Current Order Number’ as the setting name and the order number as the integer value.
Go ahead and run the above query to create the sample table in a testing database if you want to follow along.
Now that we have a table to work with, let’s take a look at the SQL needed to meet the requirements.
DECLARE @OrderInfo TABLE (OrderNumber INT); UPDATE Settings SET IntegerValue = CASE WHEN IntegerValue < 999 THEN IntegerValue + 1 ELSE 1 END OUTPUT Inserted.IntegerValue INTO @OrderInfo WHERE Name = 'Current Order Number'; IF @@ROWCOUNT = 0 INSERT INTO Settings (Name, IntegerValue) OUTPUT Inserted.IntegerValue INTO @OrderInfo VALUES ('Current Order Number', 1); SELECT OrderNumber FROM @OrderInfo;
Let’s break down the above SQL code.
The first line declares a
TABLE variable with a single
INT type column called ‘OrderNumber’. This will be used to hold the value we are going to return.
UPDATE statement sets the value of the ‘IntegerValue’ column in the ‘Settings’ table. Notice that when using the
SET keyword a condition is added via a
CASE statement. As long as the value of the ‘IntegerValue’ column is less than 999 the value will be incremented by 1, otherwise, the value will be set to 1.
OUTPUT keyword is used to output the value that was updated/inserted.
Normally when issuing an
UPDATE statement with SQL Server no output value is sent back to the client, only the number of rows affected is returned. The
OUTPUT keyword allows the value that was set to be returned without having to query the database in a separate statement. This is very useful as it guarantees that the value that was set is the one that will be output and it improves performance.
In the above example, the
OUTPUT keyword is used in conjunction with the
INTO keyword to output the value into the
TABLE variable. This is needed in our case since the
INSERT statement that follows will possibly execute and we only want to return one value to the client; not two.
@@ROWCOUNT returns the number of rows that were affected by the last SQL statement. We use this value to determine if the preceding code updated an existing row or not. If
@@ROWCOUNT is 0 then an existing row does not exist and a new row will need to be inserted instead.
In this case, an
INSERT INTO statement is issued along with the
VALUES to be inserted. The
OUTPUT statement needs to go in between the columns and values in a similar pattern to the
Lastly, since the value that was either updated or inserted was output into the
TABLE variable, we need to select the value of the ‘OrderNumber’ column from it.
The first time the SQL is executed, a row will be inserted into the Settings table with a value of 1. The next execution will result in the value of the existing row being updated to 2, the third execution will result in the value being updated to 3 and so on.
When it comes to writing your own upsert operation and running it in a production environment, there are some other things you’ll need to consider, especially in regards to concurrency.
Depending on the nature of your query and how your database is being accessed by different clients, you may need to think about how concurrent access will affect the row you are upserting.
Transaction isolation levels
In a SQL database, transaction isolation levels are used to control the type of locks that are acquired when performing read operations. There are four main isolation levels that are defined in the ANSI standard, as follows.
- READ UNCOMMITTED
- READ COMMITTED
- REPEATABLE READ
Note that SQL Server defines a further (non-standard) transaction isolation level called SNAPSHOT which in general can offer greater concurrency at the expense of potential update conflicts.
As an example, if you wanted to use the REPEATABLE READ isolation level, to ensure that no other sessions can update a value that you want to update at the same time, you would craft your SQL as follows.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ; BEGIN TRANSACTION; -- UPDATE STATEMENT GOES HERE.
-- ... COMMIT TRANSACTION;
If you wanted to opt for the SERIALIZABLE transaction isolation level to guarantee complete isolation of your query (at the expense of reduced database concurrency due to holding an exclusive lock) you would swap
REPEATABLE READ for
SERIALIZABLE in the above SQL.
Suffice to say there are tradeoffs to make when choosing a transaction isolation level.
I’m not going to go into the finer details of transaction isolation levels in this article, but I will stress to you that it is very important to understand them as a developer who uses SQL Server.
In this article, I have detailed how to perform an ‘upsert’ operation in SQL Server.
A specific scenario involving incrementing an order number was covered to help you understand the concept and to demonstrate how to conditionally update a value and return the precise value that was either updated or inserted.
Lastly, I discussed additional considerations around concurrent access and transaction isolation levels.
I hope you enjoyed this post! Comments are always welcome and I respond to all questions.
If you like my content and it helped you out, please check out the button below 🙂