Using OPENJSON, can’t get results to GROUP BY

clip art of 
 a double-quote character

Question

I’m passing a JSON string to a stored procedure in order to insert warehouse inventory.

Sometimes, there will be multiple line items with the same product, going to the same shelf, on the same pallet (LPN), with the same date. I would like to aggregate those into one line in the table.

The JSON represents individual lines that have been received in, and put away into a location. Here there are 2 identical items:

[{"StockCode": "ABC123", "Qty": "200", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:35:53 UTC"}, {"StockCode": "ABC123", "Qty": "400", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:36:43 UTC"}]

So:

ABC | 200 |  2022-01-21 00:00:00.000 | LPN1234
ABC | 400 |  2022-01-21 00:00:00.000 | LPN1234

Should aggregate to:

ABC | 600 |  2022-01-21 00:00:00.000 | LPN1234

I tried to GROUP BY and SUM on the qty but the resulting lines in the table are still 2 separate lines. I realized the PutDate were not the same due to the timestamp, so I thought for sure casting it to DATE would solve it, but it did not.

SQL script:

ALTER Procedure spc_PutAway
(@json NVARCHAR(MAX) = '')
AS
BEGIN


INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDate AS DATE) as PutDate, GETDATE() as VerDate, LPN
    FROM OPENJSON(@json)
    WITH (
        Bin VARCHAR(20) '$.Bin',
        StockCode VARCHAR(30) '$.StockCode',
        Qty DECIMAL(18,6) '$.Qty',
        PutDate VARCHAR(20) '$.PutDate',
        LPN VARCHAR(50) '$.LPN'
        )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN

END

Results in table:

Bin StockCode   Qty PutDate VerDate LPN TransID VerCol
E4B4_L  ABC123  200.000000  2022-01-21 00:00:00.000 2022-01-21 10:52:43.823 1234    1   0x000000000000275D
E4B4_L  ABC123  400.000000  2022-01-21 00:00:00.000 2022-01-21 10:52:43.823 1234    2   0x000000000000275E

asked 2022-01-24 by hap76


Answer

It’s a coincidence that your query even works.

Normally, you cannot use an alias in the GROUP BY clause of a query–you have to use the actual column name. For example, If you try to run this simple query you’ll get the error Invalid column name 'ObjectName' when you try to execute the query (note that it will parse successfully, but only error on execution):

SELECT ObjectName = OBJECT_NAME(object_id),
        NumberColumns = COUNT(*)
FROM sys.columns
GROUP BY ObjectName;

To fix this, you’d either use the non-aliased column name, or add in a CTE or subquery to force the layer of abstraction & make it so that the query engine understands the alias.

In your query, you coincidentally use the same name for the shredded column from the OPENJSON...WITH and the computed alias in your select. This obscures the real issue. If you change the name you’re using to come out of the JSON shredding & the reference in your SELECT, but continue to use the Alias in your GROUP BY, you’ll get an Invalid column error message:

SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDateText AS DATE) as PutDate, GETDATE() as VerDate, LPN
    FROM OPENJSON(@json)
    WITH (
        Bin VARCHAR(20) '$.Bin',
        StockCode VARCHAR(30) '$.StockCode',
        Qty DECIMAL(18,6) '$.Qty',
        PutDateText VARCHAR(20) '$.PutDate', --changed this (and reference above)
        LPN VARCHAR(50) '$.LPN'
        )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;

As an aside, now that we’ve changed some of the labels, we can see that the functional equivalent of your original query is actually this:

SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDateText AS DATE) as PutDate, GETDATE() as VerDate, LPN
    FROM OPENJSON(@json)
    WITH (
        Bin VARCHAR(20) '$.Bin',
        StockCode VARCHAR(30) '$.StockCode',
        Qty DECIMAL(18,6) '$.Qty',
        PutDateText VARCHAR(20) '$.PutDate',
        LPN VARCHAR(50) '$.LPN'
        )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDateText, LPN;

The "quick" way to make this work is not using the alias, and just using the direct column name (in this case, a formula using the CAST() function:

SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDateText AS DATE) as PutDate, GETDATE() as VerDate, LPN
    FROM OPENJSON(@json)
    WITH (
        Bin VARCHAR(20) '$.Bin',
        StockCode VARCHAR(30) '$.StockCode',
        Qty DECIMAL(18,6) '$.Qty',
        PutDateText VARCHAR(20) '$.PutDate',
        LPN VARCHAR(50) '$.LPN'
        )
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, CAST(PutDateText AS DATE), LPN

Personally, I’d use a CTE to shred the JSON into a tabular format, then construct my query using the CTE to do the aggregates. This feels a bit "cleaner" to me and allows me to use the alias names wherever I want. This has an advantage of not having to repeat function calls & keep them in perfect unison between SELECT & GROUP BY:

WITH JsonTable AS (
    SELECT Bin, StockCode, Qty, CAST(PutDate AS DATE) as PutDate, LPN
        FROM OPENJSON(@json)
        WITH (
            Bin VARCHAR(20) '$.Bin',
            StockCode VARCHAR(30) '$.StockCode',
            Qty DECIMAL(18,6) '$.Qty',
            PutDate VARCHAR(20) '$.PutDate',
            LPN VARCHAR(50) '$.LPN'
            )
)
INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
SELECT Bin, 
        StockCode, 
        SUM(Qty) as Qty, 
        PutDate, 
        GETDATE() AS VerDate, 
        LPN
FROM JsonTable
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;

But you could also use an inline subquery in much the same way.

INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDate AS DATE) as PutDate, GETDATE() as VerDate, LPN
FROM (
        SELECT Bin, StockCode, Qty, CAST(PutDate AS DATE) as PutDate, LPN
            FROM OPENJSON(@json)
            WITH (
                Bin VARCHAR(20) '$.Bin',
                StockCode VARCHAR(30) '$.StockCode',
                Qty DECIMAL(18,6) '$.Qty',
                PutDate VARCHAR(20) '$.PutDate',
                LPN VARCHAR(50) '$.LPN'
                )
    ) AS x
WHERE Bin <> 'DEFAULT'
GROUP BY StockCode, Bin, PutDate, LPN;

answered 2022-01-24 by Andy Mallon