Amazon Redshift is a well-known data warehouse solution built on a PostgreSQL foundation. It may even be your first choice for Big Data management in the cloud. But do you know all the possibilities it brings? In this article, I describe how to improve the way your database is queried in order to take care of its health, performance and user experience.
When managing the database backend, one of the things to take care of is access management. Database access is based on users and user groups, which can be used to differentiate users’ privileges for the so-called CRUD operations (create, read, update, delete).
User groups are important in the context of Query Monitoring Rules (QMR) that are part of Redshift’s WorkLoad Management (WLM).
WLM JSON configuration is part of the Redshift cluster parameter group. Every cluster has a default configuration that cannot be modified. To introduce changes you need to create a new, custom parameter group and assign it to the cluster. You can find the full list of the parameters here. The most powerful parameter, though, is the WLM JSON configuration, which defines query queues and their associated processing rules. The default configuration contains one default queue that can run up to 5 queries concurrently, with no other restrictions applied. With a custom WLM JSON configuration you can modify the processing rules of the default queue or add more queues with restrictions applied. These restrictions are called Query Monitoring Rules (QMR), and this part of Redshift configuration, in my opinion, deserves some attention.
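To make the shape of that configuration concrete before diving into the rules themselves, here is a minimal sketch in Python that builds a custom WLM JSON configuration programmatically. The queue name, user group and rule here are illustrative examples, not values the article prescribes:

```python
import json

# Sketch of a custom WLM configuration: one restricted queue plus the
# default queue. Names ("reporting_users", "ReportingQueue", the rule)
# are hypothetical; the JSON shape follows the WLM configuration format.
wlm_config = [
    {
        "user_group": ["reporting_users"],  # hypothetical user group
        "query_group": [],
        "auto_wlm": True,
        "queue_type": "auto",
        "name": "ReportingQueue",
        "rules": [
            {
                "rule_name": "log_heavy_scans",
                "predicate": [
                    # Log any query that scans more than a billion rows.
                    {"metric_name": "scan_row_count", "operator": ">", "value": 1000000000}
                ],
                "action": "log",
            }
        ],
    },
    # The default queue must still be present in the configuration.
    {"auto_wlm": True, "user_group": [], "query_group": [], "name": "Default queue"},
]

# The parameter value handed to Redshift is the compact JSON string.
wlm_json = json.dumps(wlm_config, separators=(",", ":"))
print(wlm_json)
```

Serializing the dict with `json.dumps` is convenient because, as shown later with CloudFormation, the configuration ultimately travels as a single JSON string parameter.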
With QMR you can match queries by user group or query group (label) and control them with rules that you specify. These rules can be related to:
- Query execution time (seconds)
- Query queue time (seconds)
- Query CPU time (seconds)
- Blocks read (1 MB blocks)
- Scan row count (rows)
- CPU usage (percent)
- Memory to disk (1 MB blocks)
- CPU skew (ratio)
- I/O skew (ratio)
- Rows joined (rows)
- Nested loop join row count (rows)
- Return row count (rows)
- Segment execution time (seconds)
- Spectrum scan row count (rows)
- Spectrum scan (MB)
- Query priority
The metrics listed above are used in predicates; you can find a table with detailed descriptions here. Using predicates you can build custom rules with basic comparison operators (‘<’, ‘>’ and ‘=’), or take advantage of one of the predefined templates. For every query monitoring rule you can specify one of the following actions: log, abort or change query priority.
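As a rough illustration of how a single rule is composed, the sketch below builds one rule in the predicate format shown later in this article and evaluates it against sample query metrics. The helper functions are hypothetical conveniences, not part of any Redshift API; the evaluation itself happens inside Redshift, not in your code:

```python
# Sketch: compose a Query Monitoring Rule as a Python dict and check a
# query's observed metrics against its predicates. Only the dict shape
# mirrors Redshift's WLM JSON; the helpers are illustrative.
VALID_OPERATORS = {"<", ">", "="}
VALID_ACTIONS = {"log", "abort", "change_query_priority"}

def make_rule(name, metric, operator, value, action):
    if operator not in VALID_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {
        "rule_name": name,
        "predicate": [{"metric_name": metric, "operator": operator, "value": value}],
        "action": action,
    }

def predicate_matches(rule, metrics):
    """Return True when all of the rule's predicates hold for the metrics."""
    ops = {"<": lambda a, b: a < b, ">": lambda a, b: a > b, "=": lambda a, b: a == b}
    return all(
        ops[p["operator"]](metrics.get(p["metric_name"], 0), p["value"])
        for p in rule["predicate"]
    )

rule = make_rule("abort_longer_than_50s", "query_execution_time", ">", 50, "abort")
print(predicate_matches(rule, {"query_execution_time": 72}))  # 72s > 50s, so True
```

A rule may carry several predicates; they are combined with AND, which is why the sketch requires all of them to hold.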
Workload management is a piece of configuration with big potential, and there is further proof of it in a feature enabled by default: Short Query Acceleration (SQA). SQA employs machine learning to predict query duration and prioritizes short queries over long-running ones, saving them time in the queue, stuck behind complex statements. It can be tuned to improve the performance of your system, or disabled, using a custom WLM JSON configuration.
So, how to get things done? You can set up WLM using any of the following: the Amazon Redshift console, the AWS CLI, the Amazon Redshift API, AWS CloudFormation, or one of the AWS SDKs.
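For instance, with one of the SDKs (boto3 here), pushing a WLM JSON configuration into an existing custom parameter group can be sketched roughly as follows. The parameter group name is a hypothetical example, and error handling is left out:

```python
import json

def apply_wlm_config(redshift_client, parameter_group_name, wlm_config):
    """Push a WLM JSON configuration into an existing cluster parameter group.

    `redshift_client` is a boto3 Redshift client, e.g. boto3.client("redshift").
    A sketch only: it omits error handling and does not cover associating the
    parameter group with the cluster (which triggers a reboot).
    """
    return redshift_client.modify_cluster_parameter_group(
        ParameterGroupName=parameter_group_name,
        Parameters=[
            {
                "ParameterName": "wlm_json_configuration",
                "ParameterValue": json.dumps(wlm_config, separators=(",", ":")),
            }
        ],
    )

# Usage (assumes AWS credentials are configured):
# import boto3
# apply_wlm_config(boto3.client("redshift"),
#                  "my-custom-parameter-group",   # hypothetical group name
#                  [{"short_query_queue": True}])
```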
How did my journey with WLM start? I came across parameter group configuration while researching the statement_timeout parameter, which can be applied to the whole cluster or set as part of a Redshift transaction. My team wanted to time out user queries on Redshift and handle the resulting error on the Django backend by returning a custom message and HTTP error code. During my research I ran into a chain of deprecated parameters, and then I found query_execution_time, which is part of Query Monitoring Rules. First, I had to create a user group to match in my query monitoring rule. Then I prepared a custom configuration in the Amazon Redshift console, which helped me produce the final JSON representation of the queues and associated rules. To create a custom queue and QMR, open the Amazon Redshift service page, choose CONFIG > Workload management in the left side panel, hit the `Create` button to create your custom parameter group, then select your parameter group, click the `Edit workload queues` button, and there you are:
You can add your custom queues and assign them query monitoring rules, created manually or from templates. When you go back to workload management, the JSON section should show the configuration that has been produced:
```json
[
  {
    "user_group": ["wlm_limited_readonly_group"],
    "query_group": [],
    "auto_wlm": true,
    "queue_type": "auto",
    "name": "ReadOnlyQueue",
    "rules": [
      {
        "rule_name": "abort_longer_than_50s",
        "predicate": [
          {
            "metric_name": "query_execution_time",
            "operator": ">",
            "value": 50
          }
        ],
        "action": "abort"
      }
    ]
  },
  {
    "auto_wlm": true,
    "user_group": [],
    "query_group": [],
    "name": "Default queue"
  },
  {
    "short_query_queue": true
  }
]
```
I prepared a rule that aborts queries running longer than 50 seconds for the user group wlm_limited_readonly_group, and assigned it to the newly created queue ReadOnlyQueue. I highly recommend using the Amazon Redshift console while making your first steps in WLM, at least to create your first JSON with a custom configuration. The new configuration has been created, but it is not yet associated with the cluster. To do that, go to your cluster view and click the `Edit` button; in the `Database configurations` section, choose your parameter group from the dropdown list. Only one parameter group can be applied at a time. One important note: this configuration change triggers a Redshift reboot. It shouldn’t take long; in my case it took a few minutes to have Redshift operational again.
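On the application side, a query aborted by such a rule surfaces as a database error. Below is a minimal sketch of the kind of mapping we wanted on the Django backend. It assumes the driver exposes a PostgreSQL-style SQLSTATE code; 57014 (query_canceled) is what statement_timeout raises in PostgreSQL, but whether a QMR abort reports the same code is an assumption worth verifying against your driver:

```python
# Sketch: translate a cancelled-query database error into an HTTP response.
# SQLSTATE 57014 (query_canceled) is what statement_timeout raises in
# PostgreSQL; treating a QMR abort the same way is an assumption.
QUERY_CANCELED = "57014"

def to_http_error(sqlstate, detail=""):
    """Map a database SQLSTATE to an (HTTP status, message) pair."""
    if sqlstate == QUERY_CANCELED:
        # 504 Gateway Timeout is one reasonable choice for "query took too long".
        return 504, "The query exceeded the allowed execution time."
    return 500, f"Database error: {detail or sqlstate}"

print(to_http_error("57014"))
```

In a Django view this would sit in the exception handler around the database call, turning the driver's exception into the custom message and status code mentioned above.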
It is good practice to keep your configuration as code; in our project we use AWS CloudFormation for that. As you will see, providing the WLM JSON configuration as code is not so gentle, but all in all it lets you keep track of your changes.
```yaml
Resources:
  Redshift:
    Type: "AWS::Redshift::Cluster"
    Properties:
      # [..]
      ClusterParameterGroupName: !Ref RedshiftClusterParameterGroup
  RedshiftClusterParameterGroup:
    Type: "AWS::Redshift::ClusterParameterGroup"
    Properties:
      Description: "Custom cluster parameter group"
      ParameterGroupFamily: "redshift-1.0"
      Parameters:
        - ParameterName: "wlm_json_configuration"
          ParameterValue: "[{\"user_group\":[\"wlm_limited_readonly_group\"],\"query_group\":[],\"auto_wlm\":true,\"queue_type\":\"auto\",\"name\":\"ReadOnlyQueue\",\"rules\":[{\"rule_name\":\"abort_longer_than_50s\",\"predicate\":[{\"metric_name\":\"query_execution_time\",\"operator\":\">\",\"value\":50}],\"action\":\"abort\"}]},{\"auto_wlm\":true,\"user_group\":[],\"query_group\":[],\"name\":\"Default queue\"},{\"short_query_queue\":true}]"
```
To sum up, Workload Management seems to be a powerful part of AWS Redshift configuration. Together with access management, it enables data warehouse administrators to determine and restrict how users interact with the database.
It can be really beneficial for any application backend processing. So, have you already opened this tab in the AWS Redshift service page? If not, I highly recommend doing so in the near future.
Sources:
https://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html
https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-parameter-groups.html
https://docs.amazonaws.cn/en_us/redshift/latest/dg/t_user_group_examples.html
https://docs.aws.amazon.com/redshift/latest/dg/r_query_group.html
https://docs.aws.amazon.com/redshift/latest/dg/wlm-short-query-acceleration.html
https://docs.aws.amazon.com/redshift/latest/dg/r_statement_timeout.html
Author: Mateusz Ziarko, DevOps Engineer @ Appliscale
LinkedIn: https://www.linkedin.com/in/mateusz-ziarko/