SSC-FDM-TD0033

ACTIVITY_COUNT transformation might require manual adjustments

Description

The ACTIVITY_COUNT status variable returns the number of rows affected by an SQL DML statement in an embedded SQL or stored procedure application. For more information check here.

As explained in its translation specification, there is a workaround to emulate ACTIVITY_COUNT's behavior through:

SELECT $1 FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));

However, this presents some limitations listed below.

Limitations

First case

If ACTIVITY_COUNT is called twice or more times before executing another DML statement, the transformation might not return the expected values.

IN -> Teradata_01.sql
REPLACE PROCEDURE InsertEmployeeSalaryAndLog_1 ()
BEGIN
    DECLARE row_count1 INT;

    INSERT INTO employees (employee_id, first_name, last_name, department_id, salary)
    VALUES (101, 'Alice', 'Smith', 10, 70000.00);

    -- Get the ACTIVITY_COUNT
    SET row_count1 = ACTIVITY_COUNT;
    SET row_count1 = ACTIVITY_COUNT;

    -- Insert the ACTIVITY_COUNT into the activity_log table
    INSERT INTO activity_log (operation, row_count)
    VALUES ('INSERT PROCEDURE', row_count1);
END;

REPLACE PROCEDURE InsertEmployeeSalaryAndLog_2 ()
BEGIN
    DECLARE row_count1 INT;
    DECLARE message VARCHAR(100);

    INSERT INTO employees (employee_id, first_name, last_name, department_id, salary)
    VALUES (101, 'Alice', 'Smith', 10, 70000.00);

    -- Get the ACTIVITY_COUNT
    SET row_count1 = ACTIVITY_COUNT + 1;
    SET row_count1 = ACTIVITY_COUNT;

    -- Insert the ACTIVITY_COUNT into the activity_log table
    INSERT INTO activity_log (operation, row_count)
    VALUES ('INSERT PROCEDURE', row_count1);
END;

In both procedures, ACTIVITY_COUNT is called twice before another DML statement is called. In Teradata, ACTIVITY_COUNT will return the number of rows in the INSERT statement above them, even when called twice. However, since the Snowflake transformation uses LAST_QUERY_ID(), the result depends on the result set held by LAST_QUERY_ID().

InsertEmployeeSalaryAndLog_1() requires no manual adjustments. Check the Query History (bottom-up):

  1. INSERT statement is executed. LAST_QUERY_ID() will point to this statement.

  2. SELECT (first ACTIVITY_COUNT) is executed, and $1 will be 1. LAST_QUERY_ID() will point to this statement.

  3. SELECT (second ACTIVITY_COUNT) is executed; since the last statement result was 1, $1 will be 1 for this SELECT as well.

  4. Finally, row_count1 holds the value 1, which is inserted in activity_log.

On the other side, InsertEmployeeSalaryAndLog_2() does require manual adjustments. Check the Query History (bottom-up):

  1. INSERT statement is executed. LAST_QUERY_ID() will point to this statement.

  2. SELECT (first ACTIVITY_COUNT) is executed, and $1 will be 1. However, notice how QUERY_TEXT has the + 10; this will affect the result that will be scanned. LAST_QUERY_ID() will point to this statement.

  3. SELECT (second ACTIVITY_COUNT) is executed. The result for the last query is 11; thus $1 will hold 11 instead of the expected 1.

  4. Finally, row_count1 holds the value 11, which is inserted in activity_log.

These are the values inserted in activity_log:

LOG_ID
OPERATION
ROW_COUNT
LOG_TIMESTAMP

1

INSERT PROCEDURE

1

2024-07-15 09:22:21.725

101

INSERT PROCEDURE

11

2024-07-15 09:22:26.248

Adjustments for the first case

As per Snowflake's documentation for LAST_QUERY_ID, you can specify the query to return, based on the position of the query. LAST_QUERY_ID(-1) returns the latest query, (-2) the second last query, and so on.

The fix for the problem in InsertEmployeeSalaryAndLog_2() will be to simply specify LAST_QUERY_ID(-2) in the second use of ACTIVITY_COUNT (second SELECT) so that it gets the results from the INSERT statement instead:

...
INSERT INTO employees (employee_id, first_name, last_name, department_id, salary)
        VALUES (101, 'Alice', 'Smith', 10, 70000.00);

           -- Get the ACTIVITY_COUNT
        row_count1 := (
            SELECT
                $1
            FROM
                TABLE(RESULT_SCAN(LAST_QUERY_ID()))
        ) /*** SSC-FDM-TD0033 - 'ACTIVITY_COUNT' TRANSFORMATION MIGHT REQUIRE MANUAL ADJUSTMENTS ***/ + 1;
        row_count1 := (
            SELECT
                $1
            FROM
                TABLE(RESULT_SCAN(LAST_QUERY_ID(-2)))
        ) /*** SSC-FDM-TD0033 - 'ACTIVITY_COUNT' TRANSFORMATION MIGHT REQUIRE MANUAL ADJUSTMENTS ***/;
...

Second case

If ACTIVITY_COUNT is called after a non DML statement was executed, the transformation will not return the expected values.

IN -> Teradata_02.sql
REPLACE PROCEDURE InsertEmployeeSalaryAndLog_3 ()
BEGIN
    DECLARE row_count1 INT;
    DECLARE emp_id INT;
    DECLARE message VARCHAR(100);

    INSERT INTO employees (employee_id, first_name, last_name, department_id, salary)
    VALUES (101, 'Alice', 'Smith', 10, 70000.00);

    SELECT employee_id INTO emp_id FROM employees;
    -- Get the ACTIVITY_COUNT
    SET row_count1 = ACTIVITY_COUNT;
    SET message = 'EMPLOYEE INSERTED - ID: ' || emp_id;

    -- Insert the ACTIVITY_COUNT into the activity_log table
    INSERT INTO activity_log (operation, row_count)
    VALUES (message, row_count1);
END;

Similar to the previous, LAST_QUERY_ID does not point to the correct query and thus returns an incorrect value, which is assigned to row_count1. Check the Query History (bottom-up):

  1. INSERT statement is executed. LAST_QUERY_ID() will point to this statement.

  2. SELECT INTO is executed, and $1 will be 101. LAST_QUERY_ID() will point to this statement.

  3. SELECT (ACTIVITY_COUNT) is executed. The result for the last query is 101; thus $1 will hold 101 instead of the expected 1.

  4. Finally, row_count1 holds the value 101, which is inserted in activity_log.

These are the values inserted in activity_log:

LOG_ID
OPERATION
ROW_COUNT
LOG_TIMESTAMP

1

EMPLOYEE INSERTED - ID: 101

101

2024-07-15 11:00:38.000

Adjustments for the second case

  1. One possible fix is to specify the correct query to return by LAST_QUERY_ID. For example, here LAST_QUERY_ID(-2) will be the correct query to point to.

...
row_count1 := (
            SELECT
                $1
            FROM
                TABLE(RESULT_SCAN(LAST_QUERY_ID(-2)))
               ) /*** SSC-FDM-TD0033 - 'ACTIVITY_COUNT' TRANSFORMATION MIGHT REQUIRE MANUAL ADJUSTMENTS ***/;
               ...
  1. Another possible fix is to use ACTIVITY_COUNT (SELECT) immediately after executing the INSERT statement.

...
INSERT INTO employees (employee_id, first_name, last_name, department_id, salary)
VALUES (101, 'Alice', 'Smith', 10, 70000.00);
-- Get the ACTIVITY_COUNT
       row_count1 := (
    SELECT
        $1
    FROM
        TABLE(RESULT_SCAN(LAST_QUERY_ID()))
       ) /*** SSC-FDM-TD0033 - 'ACTIVITY_COUNT' TRANSFORMATION MIGHT REQUIRE MANUAL ADJUSTMENTS ***/;
SELECT
    employee_id INTO
    :emp_id
FROM
    employees;
       message := 'EMPLOYEE INSERTED - ID: ' || emp_id;
...

Recommendations

  • Make sure to point to the correct query when using LAST_QUERY_ID.

  • Make sure ACTIVITY_COUNT is used immediately after the DML statement to evaluate.

  • If you need more support, you can email us at snowconvert-support@snowflake.com

Last updated