MySQL Input Parameters Add Flexibility to Crosstab Stored Procedures

Wednesday Apr 7th 2010 by Rob Gravelle
Share:

When generating a result set where the query contains an unknown number of column and/or row values we can use a combination of Prepared Statements, which allows us to tailor the output based on the number of data values. We can also add input parameters to a procedure to assign the field names, aliases, and even the aggregate function!

When generating a result set where the query contains an unknown number of column and/or row values we can use a combination of Prepared Statements, which allows us to tailor the output based on the number of data values. We can also add input parameters to a procedure to assign the field names, aliases, and even the aggregate function!

In the previous article of the MySQL crosstab series, entitled MySQL Prepared Statements to Generate Crosstab SQL, we worked on generating the result set where the query contains an unknown number of column and/or row values. Our solution was based on a combination of Prepared Statements, as well as the concat() and group_concat() functions. This approach allowed us to tailor the output based on the number of data values, so that we could free ourselves from having to anticipate changes down the road. In today’s installment, we are going to take things one step further by adding input parameters to our proc, so that we can assign the field names, aliases, and even the aggregate function!

Recap of the Crosstab Stored Procedure

Let’s quickly revisit the stored proc that we created to dynamically generate the SQL code and result set. Inside the procedure, we generated the SQL for the query and saved it to a variable using the SELECT INTO syntax. A Prepared Statement was then utilized to execute the generated code:

CREATE PROCEDURE `p_case_counts_per_region_by_month`() 
LANGUAGE SQL 
NOT DETERMINISTIC 
CONTAINS SQL 
SQL SECURITY DEFINER 
BEGIN  
  SELECT concat(
    "SELECT CASE WHEN Month_Num IS NULL", "\n", 
    "            THEN 'TOTAL'", "\n", 
    "            ELSE Month", "\n", 
    "       END        AS 'Month',", "\n",
    group_concat( DISTINCT concat("       REGION_", REGION_CODE, 
                                  "  AS 'REGION ", REGION_CODE, "',", "\n"
                           )
                  order by REGION_CODE
                  separator '' 
                ),
    "       TOTAL", "\n",
    "FROM  (	SELECT	MONTH(CREATION_DATE)\t\t\t\t\t\t\t\tAS Month_Num,", "\n",
    "\t\tCONVERT(MONTHNAME(CREATION_DATE) USING latin1)\t\t\t\t\t\t\t\t\tAS 'Month',", "\n",  
    group_concat( 
        DISTINCT concat("\t\t\t\tCOUNT(CASE WHEN REGION_CODE ='", REGION_CODE, 
                        "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", 
                        REGION_CODE, "',", "\n"
                 )
        order by REGION_CODE
        separator '' 
    ),
    "      	COUNT(*)\t\t\t\t\t\t\t\t\t\t\t\t\tAS 'TOTAL'", "\n",
    "		FROM  TA_CASES", "\n",
    "		WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
    "		GROUP BY Month_Num WITH ROLLUP) AS CA;"
  ) statement
  into @case_counts_per_region_by_month
  FROM TA_CASES
  WHERE YEAR(CREATION_DATE)=1998;

  prepare case_counts_per_region_by_month   
  from @case_counts_per_region_by_month;    
  execute case_counts_per_region_by_month;   
  deallocate prepare case_counts_per_region_by_month; 
END

Running the proc confirmed that it produced the desired results:

mysql> call p_case_counts_per_region_by_month;

Month

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

April

13

33

76

2

47

171

May

17

55

209

1

143

425

June

8

63

221

1

127

420

July

13

104

240

6

123

486

August

18

121

274

9

111

533

September

25

160

239

2

88

514

October

9

88

295

2

127

521

November

2

86

292

2

120

502

December

1

128

232

6

155

522

TOTAL

106

838

2078

31

1041

4094

Defining the Input Parameters

All of our input parameters will be of type VARCHAR so that they can accommodate strings of variable length. For the sake of simplicity, we’ll assign all of our parameters a size of sixty four, which happens to be the maximum field name length in MySQL, as the following chart affirms:

Identifier

Maximum Length (characters)

Database

64

Table

64

Column

64

Index

64

Constraint

64

Stored Function or Procedure

64

Trigger

64

View

64

Alias

256

Compound Statement Label

16

The AggregateFunction Input Parameter

The AggregateFunction parameter is an especially interesting one in that it will allow us to apply different kinds of calculations on our data. MySQL supports ten such functions:

Name

Description

AVG()

Return the average value of the argument

BIT_AND()

Return bitwise and

BIT_OR()

Return bitwise or

BIT_XOR()(v4.1.1)

Return bitwise xor

COUNT(DISTINCT)

Return the count of a number of different values

COUNT()

Return a count of the number of rows returned

GROUP_CONCAT()(v4.1)

Return a concatenated string

MAX()

Return the maximum value

MIN()

Return the minimum value

STD()

Return the population standard deviation

Here is what the modified proc looks like with the input parameter. The new code is highlighted in Red:

CREATE PROCEDURE `p_case_counts_per_region_by_month` (IN AggregateFunction VARCHAR(64)) 
LANGUAGE SQL 
NOT DETERMINISTIC 
CONTAINS SQL 
SQL SECURITY DEFINER 
BEGIN  
  SELECT concat(
    "SELECT CASE WHEN Month_Num IS NULL", "\n", 
    "            THEN 'TOTAL'", "\n", 
    "            ELSE Month", "\n", 
    "       END        AS 'Month',", "\n",
    group_concat( DISTINCT concat("       REGION_", REGION_CODE, 
                                  "  AS 'REGION ", REGION_CODE, "',", "\n"
                           )
                  order by REGION_CODE
                  separator '' 
                ),
    "       TOTAL", "\n",
    "FROM  (	SELECT	MONTH(CREATION_DATE)\t\t\t\t\t\t\t\tAS Month_Num,", "\n",
    "\t\tCONVERT(MONTHNAME(CREATION_DATE) USING latin1)\t\t\t\t\t\t\t\t\tAS 'Month',", "\n",  
    group_concat( 
        DISTINCT concat("\t\t\t\t", AggregateFunction, "(CASE WHEN REGION_CODE ='", REGION_CODE, 
                        "' THEN FEE_NUMBER ELSE NULL END) AS 'REGION_", 
                        REGION_CODE, "',", "\n"
                 )
        order by REGION_CODE
        separator '' 
    ),
    "      	", AggregateFunction, "(FEE_NUMBER)\t\t\t\t\t\t\t\t\t\t\t\t\tAS 'TOTAL'", "\n",
    "		FROM  TA_CASES", "\n",
    "		WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
    "		GROUP BY Month_Num WITH ROLLUP) AS CA;"
  ) statement
  into @case_counts_per_region_by_month
  FROM TA_CASES
  WHERE YEAR(CREATION_DATE)=1998;

  prepare case_counts_per_region_by_month   
  from @case_counts_per_region_by_month;    
  execute case_counts_per_region_by_month;   
  deallocate prepare case_counts_per_region_by_month; 
END

We can now easily manipulate the data using different aggregate functions, such as SUM():

mysql> call p_case_counts_per_region_by_month("SUM");

Month

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

April

8114

13638

44372

721

17453

84298

May

5869

25239

107456

228

49511

188303

June

2901

28411

99510

206

42998

174026

July

4949

51261

90859

1498

38913

187480

August

6891

55694

122490

2949

38642

226666

September

7342

73118

106173

442

30324

217399

October

2736

30602

140146

1144

48313

222941

November

1072

35625

121150

544

46920

205311

December

690

50493

89500

1710

58051

200444

TOTAL

40564

364081

921656

9442

371125

1706868

…the STDDEV() function, which calculates the standard deviation:

mysql> call p_case_counts_per_region_by_month("STDDEV");

Month

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

April

137.7672765

257.2432256

326.3496693

132.5

175.1580777

284.01053

May

128.1031429

278.5697022

324.1868716

0

159.5317314

277.68481

June

168.0364079

242.0570553

289.1989069

0

134.5805142

248.15875

July

128.7976258

280.643283

227.2403142

37.06600719

189.0798114

236.2054

August

131.3816535

269.194428

287.5951348

238.28088

149.3653913

259.11602

September

78.33299177

264.1167589

263.4439228

15

159.1064992

247.41223

October

83.39064696

187.6058641

291.5041152

21

208.0032344

260.70436

November

154

213.7065439

252.6518458

0

216.7769207

237.74581

December

0

228.2511625

254.8578402

29.06888371

208.4755833

234.31828

TOTAL

160.2388869

252.5301334

280.7877818

157.2860091

183.110026

252.83464

…or the MAX() function, to tell us which case had the largest FEE_NUMBER:

mysql> call p_case_counts_per_region_by_month("MAX");

Month

REGION 01

REGION 02

REGION 03

TOTAL

April

691-1663-10524

999-5135-56238

999-3742-984677

999-5135-56238

May

594-1504-47250

999-5135-56379

999-3742-981285

999-5135-56379

June

690-1697-9864

999-5135-56654

999-3742-981607

999-5135-56654

July

594-1504-47295

Z98-5139-36502064

999-3742-981983

Z98-5139-36502064

August

594-1504-47308

999-5135-57282

999-3742-982322

999-5135-57282

September

477-1129-36952777

Z98-5139-36993245

999-3742-982652

Z98-5139-36993245

October

463-1697-10149

999-5135-57882

999-3742-982999

999-5135-57882

November

690-1504-473606

999-5135-58093

999-3742-983152

999-5135-58093

December

690-1697-10161

999-5135-58323

999-3742-983722

999-5135-58323

TOTAL

691-1663-10524

Z98-5139-36993245

999-3742-984677

Z98-5139-36993245

Obviously, it would be beneficial to be able to apply the functions to other fields, so let’s add parameters for them as well.

Going through the code, we can identify six table fields that would benefit from an assossicated input parameter. They are:

  • AggregatedField: Replaces FEE_NUMBER. It is the field on which the aggregate function will be applied.
  • ColumnField: Replaces REGION_CODE. It is the field that is makes up the column data in the result set.
  • RowField: Replaces the Month names field. It is the field that is makes up the row data in the result set.
  • ColumnFieldAlias: Replaces “REGION_0X”. Sets the column headers.
  • RowFieldAlias: Replaces “Month”. Sets the Rows headers in the leftmost column.
  • SortRowField: Replaces the Month Numbers. Sets a field that will be used for sorting the row data.

The AggregatedField parameter poses a particular challenge, which makes it more difficult to implement than the others. For that reason, we will focus on the other ones today, and leave it for the next article.

Adding the remaining input paramters to the proc yields the following code. Again, the new fields appear in Red for easy identification:

CREATE PROCEDURE `p_case_counts_per_region_by_month` (IN AggregateFunction VARCHAR(64),
                                                      IN ColumnField       VARCHAR(64),
                                                      IN RowField          VARCHAR(64),
                                                      IN ColumnFieldAlias  VARCHAR(64),
                                                      IN RowFieldAlias     VARCHAR(64), 
                                                      IN SortRowField      VARCHAR(64))
LANGUAGE SQL 
NOT DETERMINISTIC 
CONTAINS SQL 
SQL SECURITY DEFINER 
BEGIN  
SELECT concat(   
  "SELECT CASE WHEN sort_row_field IS NULL", "\n", 
  "            THEN 'TOTAL'", "\n", 
  "            ELSE row_field", "\n", 
  "       END        AS '", RowFieldAlias, "',", "\n",
          group_concat(DISTINCT concat("       agg_fld_", ColumnField, 
                                       "  AS '", ColumnFieldAlias, ColumnField, "',", "\n"
                                )
                       order by ColumnField
                       separator '' 
          ),
  "       TOTAL", "\n",
  "FROM  (	SELECT	", SortRowField, "\t\t\t\t\t\t\t\tAS sort_row_field," , "\n",
  "\t\t", RowField, " \t\t\t\t\t\tAS row_field,", "\n",  
  group_concat( 
  DISTINCT concat("\t\t\t\t", AggregateFunction, "(CASE WHEN ", "REGION_CODE", " ='", REGION_CODE, 
                  "' THEN FEE_NUMBER ELSE NULL END) AS 'agg_fld_", 
                  REGION_CODE, "',", "\n"
           )
    order by REGION_CODE
    separator '' 
  ),
  "      	", AggregateFunction, "(FEE_NUMBER)\t\t\t\t\tAS 'TOTAL'", "\n",
  "		FROM  TA_CASES", "\n",
  "		WHERE YEAR(CREATION_DATE)=", YEAR(CREATION_DATE), "\n",
  "		GROUP BY SortRowField WITH ROLLUP) AS CA;"
) statement
into @sql_stmt
FROM TA_CASES
WHERE YEAR(CREATION_DATE)=1998;
  prepare case_counts_per_region_by_month   
  from @case_counts_per_region_by_month;    
  execute case_counts_per_region_by_month;   
  deallocate prepare case_counts_per_region_by_month; 
END

Another enhancement to the proc was to change the field aliases of the inner SQL to utilize more generic naming. Hence, “REGION_0X” becomes “agg_fld_0X”, “Month_Num” has been updated to “sort_row_field”, and “Month” is now “row_field”. The outer query will always refer to these fields by these names, regardless of what alias names are passed in.

One of the perks of using dynamic parameters is that you are not limited to field names. Functions such as “MONTH(CREATION_DATE)” and expressions such as “Days / 4” are equally valid. We can now supply both the MONTH(CREATION_DATE)” and "MONTHNAME(CREATION_DATE)" functions to our proc:

mysql>call p_case_counts_per_region_by_month("COUNT", 
-->   "FEE_NUMBER ", 
-->   "MONTHNAME(CREATION_DATE)", 
-->   "REGION ", 
-->   "Month", 
-->   "MONTH(CREATION_DATE)");

Calling the proc with the above paramters reproduces what we generated previously:

Month

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

April

13

33

76

2

47

171

May

17

55

209

1

143

425

June

8

63

221

1

127

420

July

13

104

240

6

123

486

August

18

121

274

9

111

533

September

25

160

239

2

88

514

October

9

88

295

2

127

521

November

2

86

292

2

120

502

December

1

128

232

6

155

522

TOTAL

106

838

2078

31

1041

4094

Thanks to the changes that we made here today, we can now produce different reports by playing with the parameters. Here’s a report of Citizenship Codes by Region:

mysql>call p_case_counts_per_region_by_month("COUNT", 
-->   "FEE_NUMBER", 
-->   "CITIZENSHIP_CODE", 
-->   "REGION ", 
-->   "CITIZENSHIP CD", 
-->   "CREATION_DATE");

CITIZENSHIP CD

REGION 01

REGION 02

REGION 03

REGION 04

REGION 05

TOTAL

(NULL)

7

13

28

4

60

112

26

0

0

12

0

0

12

33

0

0

0

0

2

2

44

0

0

11

0

1

12

45

0

1

5

0

2

8

48

0

0

1

0

0

1

52

0

0

0

3

0

3

56

0

0

1

0

3

4

59

2

0

0

0

0

2

60

0

0

0

0

2

2

81

0

0

22

0

7

29

83

0

0

2

0

2

4

5139

0

45

0

0

0

45

5205

0

5

0

0

0

5

602

0

0

2

0

0

2

650

0

0

0

0

2

2

651

0

0

1

0

0

1

654

0

0

0

0

15

15

703

0

1

0

0

0

1

711

0

0

10

0

0

10

722

0

0

1

0

0

1

979

0

0

1

0

1

2

999

0

0

1

0

0

1

TOTAL

104

838

2078

31

1041

4092

Note that the totals don’t add up because some rows were ommitted.

Next time, we’ll add a parameter to replace the “FEE_NUMBER” as well as include parameters to choose different tables and criteria.

Additional Resources

MySQL Stored Procedures in MySQL 5.0

» See All Articles by Columnist Rob Gravelle

Share:
Home
Mobile Site | Full Site
Copyright 2017 © QuinStreet Inc. All Rights Reserved