Using OPENJSON, can’t get results to GROUP BY
Question
I’m passing a JSON string to a stored procedure in order to insert warehouse inventory.
Sometimes, there will be multiple line items with the same product, going to the same shelf, on the same pallet (LPN), with the same date. I would like to aggregate those into one line in the table.
The JSON represents individual lines that have been received and put away into a location. Here there are 2 lines for the same item:
[
  {"StockCode": "ABC123", "Qty": "200", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:35:53 UTC"},
  {"StockCode": "ABC123", "Qty": "400", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:36:43 UTC"}
]
So:
ABC | 200 | 2022-01-21 00:00:00.000 | LPN1234
ABC | 400 | 2022-01-21 00:00:00.000 | LPN1234
Should aggregate to:
ABC | 600 | 2022-01-21 00:00:00.000 | LPN1234
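The intended collapse can be sketched outside of SQL. This hypothetical Python snippet (not part of the stored procedure) parses the JSON payload, truncates each PutDate to its date portion, and sums Qty per (StockCode, Bin, date, LPN) key:

```python
import json
from collections import defaultdict

payload = '''[
  {"StockCode": "ABC123", "Qty": "200", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:35:53 UTC"},
  {"StockCode": "ABC123", "Qty": "400", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:36:43 UTC"}
]'''

totals = defaultdict(float)
for line in json.loads(payload):
    # Keep only the date portion of the timestamp, mirroring CAST(PutDate AS DATE)
    put_date = line["PutDate"].split(" ")[0]
    key = (line["StockCode"], line["Bin"], put_date, line["LPN"])
    totals[key] += float(line["Qty"])

# The two lines share one key, so they collapse to a single 600.0 total
print(dict(totals))
```

Because the timestamps differ only in the time portion, truncating to the date is what makes the two grouping keys equal.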
I tried to GROUP BY and SUM on the Qty, but the resulting rows in the table are still 2 separate lines. I realized the PutDate values were not the same due to the timestamp, so I thought for sure casting to DATE would solve it, but it did not.
SQL script:
ALTER PROCEDURE spc_PutAway (@json NVARCHAR(MAX) = '')
AS
BEGIN
    INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
    SELECT Bin,
           StockCode,
           SUM(Qty) AS Qty,
           CAST(PutDate AS DATE) AS PutDate,
           GETDATE() AS VerDate,
           LPN
    FROM OPENJSON(@json)
         WITH (
             Bin       VARCHAR(20)   '$.Bin',
             StockCode VARCHAR(30)   '$.StockCode',
             Qty       DECIMAL(18,6) '$.Qty',
             PutDate   VARCHAR(20)   '$.PutDate',
             LPN       VARCHAR(50)   '$.LPN'
         )
    WHERE Bin <> 'DEFAULT'
    GROUP BY StockCode, Bin, PutDate, LPN
END
Results in table:
Bin     StockCode  Qty         PutDate                  VerDate                  LPN   TransID  VerCol
E4B4_L  ABC123     200.000000  2022-01-21 00:00:00.000  2022-01-21 10:52:43.823  1234  1        0x000000000000275D
E4B4_L  ABC123     400.000000  2022-01-21 00:00:00.000  2022-01-21 10:52:43.823  1234  2        0x000000000000275E
asked 2022-01-24 by hap76
Answer
It’s a coincidence that your query even works. Normally, you cannot use an alias in the GROUP BY clause of a query; you have to use the actual column name or expression. For example, if you run this simple query you’ll get the error Invalid column name 'ObjectName' when you execute it (note that it will parse successfully, and only error at execution):
SELECT ObjectName = OBJECT_NAME(object_id),
       NumberColumns = COUNT(*)
FROM sys.columns
GROUP BY ObjectName;
To fix this, you’d either use the non-aliased column name (or underlying expression), or add a CTE or subquery to introduce a layer of abstraction so that the alias is in scope for the outer query.
In your query, you coincidentally use the same name for the shredded column from the OPENJSON...WITH clause and the computed alias in your SELECT. This obscures the real issue. If you change the name coming out of the JSON shredding (and the reference in your SELECT), but continue to use the alias in your GROUP BY, you’ll get an Invalid column name error:
SELECT Bin,
       StockCode,
       SUM(Qty) AS Qty,
       CAST(PutDateText AS DATE) AS PutDate,
       GETDATE() AS VerDate,
       LPN
FROM OPENJSON(@json)
     WITH (
         Bin         VARCHAR(20)   '$.Bin',
         StockCode   VARCHAR(30)   '$.StockCode',
         Qty         DECIMAL(18,6) '$.Qty',
         PutDateText VARCHAR(20)   '$.PutDate', --changed this (and reference above)
         LPN         VARCHAR(50)   '$.LPN'
     )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;
As an aside, now that we’ve changed some of the labels, we can see that the functional equivalent of your original query is actually this:
SELECT Bin,
       StockCode,
       SUM(Qty) AS Qty,
       CAST(PutDateText AS DATE) AS PutDate,
       GETDATE() AS VerDate,
       LPN
FROM OPENJSON(@json)
     WITH (
         Bin         VARCHAR(20)   '$.Bin',
         StockCode   VARCHAR(30)   '$.StockCode',
         Qty         DECIMAL(18,6) '$.Qty',
         PutDateText VARCHAR(20)   '$.PutDate',
         LPN         VARCHAR(50)   '$.LPN'
     )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDateText, LPN;
The "quick" way to make this work is to skip the alias and group by the underlying expression directly (in this case, the formula using the CAST() function):
SELECT Bin,
       StockCode,
       SUM(Qty) AS Qty,
       CAST(PutDateText AS DATE) AS PutDate,
       GETDATE() AS VerDate,
       LPN
FROM OPENJSON(@json)
     WITH (
         Bin         VARCHAR(20)   '$.Bin',
         StockCode   VARCHAR(30)   '$.StockCode',
         Qty         DECIMAL(18,6) '$.Qty',
         PutDateText VARCHAR(20)   '$.PutDate',
         LPN         VARCHAR(50)   '$.LPN'
     )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, CAST(PutDateText AS DATE), LPN;
Personally, I’d use a CTE to shred the JSON into a tabular format, then construct my query using the CTE to do the aggregates. This feels a bit "cleaner" to me and allows me to use the alias names wherever I want. It has the advantage of not having to repeat function calls and keep them in perfect unison between the SELECT and the GROUP BY:
WITH JsonTable AS (
    SELECT Bin,
           StockCode,
           Qty,
           CAST(PutDate AS DATE) AS PutDate,
           LPN
    FROM OPENJSON(@json)
         WITH (
             Bin       VARCHAR(20)   '$.Bin',
             StockCode VARCHAR(30)   '$.StockCode',
             Qty       DECIMAL(18,6) '$.Qty',
             PutDate   VARCHAR(20)   '$.PutDate',
             LPN       VARCHAR(50)   '$.LPN'
         )
)
INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
SELECT Bin,
       StockCode,
       SUM(Qty) AS Qty,
       PutDate,
       GETDATE() AS VerDate,
       LPN
FROM JsonTable
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;
But you could also use an inline subquery in much the same way:
INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
SELECT Bin,
       StockCode,
       SUM(Qty) AS Qty,
       CAST(PutDate AS DATE) AS PutDate,
       GETDATE() AS VerDate,
       LPN
FROM (
    SELECT Bin,
           StockCode,
           Qty,
           CAST(PutDate AS DATE) AS PutDate,
           LPN
    FROM OPENJSON(@json)
         WITH (
             Bin       VARCHAR(20)   '$.Bin',
             StockCode VARCHAR(30)   '$.StockCode',
             Qty       DECIMAL(18,6) '$.Qty',
             PutDate   VARCHAR(20)   '$.PutDate',
             LPN       VARCHAR(50)   '$.LPN'
         )
) AS x
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;
answered 2022-01-24 by Andy Mallon