Order by one column if not null, otherwise by another column, columns are of not matching types

clip art of 
 a double-quote character

Question

I have an application that uses PostgreSQL as RDBMS. There is a table containing orders placed by customers. The table has an id as primary key, an order_number (integer) and a created_at field (timestamp). id and created_at are assigned automatically when the row is created, the order_number is assigned "manually" (when the person performing the order entry process receives the order confirmation, he presses a button and the order_number is assigned as MAX(order_number) + 1, this can happen almost immediately or after a few days, or the order can be cancelled so it never receives a number). In the table there are quite a lot of rows with NULL in the order_number column: the cancelled and pending confirmation orders. This is an excerpt from the table:

id |order_number|created_at             |
---+------------+-----------------------+
201|      210807|2021-06-09 15:24:17.000|
207|            |2021-06-15 13:51:00.000|
203|      210831|2021-06-11 13:34:02.000|
197|            |2021-06-07 13:37:54.000|
196|      210781|2021-06-04 08:33:59.000|
206|      210849|2021-06-15 08:15:42.000|
202|      210817|2021-06-10 10:13:27.000|
199|      210820|2021-06-09 13:21:09.000|
205|      210819|2021-06-11 14:12:20.000|
198|      210785|2021-06-07 14:04:31.000|
200|      210830|2021-06-09 15:03:36.000|
204|      210818|2021-06-11 13:51:06.000|
208|      210878|2021-06-15 14:20:21.000|

I need to order the table by order_number, if it’s not null, otherwise by created_at; that’s what I want to achieve:

id |order_number|created_at             |
---+------------+-----------------------+
196|      210781|2021-06-04 08:33:59.000|
197|            |2021-06-07 13:37:54.000
198|      210785|2021-06-07 14:04:31.000|
201|      210807|2021-06-09 15:24:17.000|
202|      210817|2021-06-10 10:13:27.000|
204|      210818|2021-06-11 13:51:06.000|
205|      210819|2021-06-11 14:12:20.000|
199|      210820|2021-06-09 13:21:09.000|
200|      210830|2021-06-09 15:03:36.000|
203|      210831|2021-06-11 13:34:02.000|
206|      210849|2021-06-15 08:15:42.000|
207|            |2021-06-15 13:51:00.000|
208|      210878|2021-06-15 14:20:21.000|

The clause ORDER BY order_number, created_at will not work. I tried ORDER BY COALESCE(order_number, created_at) as suggested in this stackoverflow question, but that doesn’t work because there is a mismatch between types.

As a last note, I cannot use the id column, because sometime the order number is entered some time after the initial order entry, as you can see in the example ids 199, 200, 203.

This is a fiddle with sample data.

If can help, I implemented in the application (c++) the logic above; data are in a std::vector of Order * objects, then I use sort:

sort(orders.begin(), orders.end(), OrderComparator);

OrderComparator is implemented as follows:

bool OrderComparator(Order * one, Order * two) {
   if(one->getOrderNumber()>0 && two->getOrderNumber()>0) {
       return one->getOrderNumber() > two->getOrderNumber();
   }
   return one->getCreatedAt() > two->getCreatedAt();
}

asked 2021-11-28 by Giorgio R


Answer

There are a few ways to do this, but essentially, you’ll want to calculate a value to be the first sort column. Essentially computing a "sort" value that will figure out the "nearest" order_number for the rows that have a null order_number, then you can sort on that, and use additional columns in your ORDER BY to put the null values in the right order.

In this example, I get the non-null order_number of the row that was created most immediately prior to the null row, and use that as the first sort column. This puts things in approximately the right order–then you can use additional ordering columns to control how the null order numbers show up in relation to the "closest order number" that was used for the initial sorting.

WITH orders_cte AS (
      SELECT *,
             (SELECT o2.order_number 
                FROM orders AS o2
                WHERE o2.created_at <= o.created_at
                AND o2.order_number IS NOT NULL
                ORDER BY created_at DESC
               LIMIT 1) AS primary_sort 
      FROM orders AS o
      )
SELECT id, order_number, created_at
FROM orders_cte
ORDER BY primary_sort, order_number, created_at

answered 2021-11-28 by Andy Mallon