Commit 97e7f34

[PostgreSQL] update Pr 4311
1 parent e9464ce commit 97e7f34

2 files changed

+87
-91
lines changed

Lines changed: 86 additions & 85 deletions
@@ -1,40 +1,40 @@
11
---
2-
title: High CPU Utilization Across Azure Database for PostgreSQL Elastic Clusters
3-
description: Troubleshoot high CPU utilization across Azure Database for PostgreSQL elastic clusters.
2+
title: Troubleshoot High CPU Utilization in Elastic Clusters
3+
description: How to troubleshoot high CPU utilization across Azure Database for PostgreSQL Elastic Clusters.
44
author: GayathriPaderla
55
ms.author: gapaderla
6-
ms.reviewer: jaredmeade
7-
ms.date: 01/28/2026
6+
ms.reviewer: jaredmeade, maghan
7+
ms.date: 02/17/2026
88
ms.service: azure-database-postgresql
99
ms.subservice: performance
1010
ms.topic: troubleshooting-general
1111
---
1212

13-
# Troubleshoot High CPU Utilization in Azure Database for PostgreSQL Elastic Clusters
13+
# Troubleshoot high CPU utilization in Azure Database for PostgreSQL Elastic Clusters
1414

1515
This article describes how to identify the root cause of high CPU utilization. It also provides possible remedial actions to control CPU utilization when using [Elastic clusters in Azure Database for PostgreSQL](../elastic-clusters/concepts-elastic-clusters.md).
1616

1717
In this article, you learn about:
1818

19-
- How to use tools like Azure Metrics, pg_stat_statements, citus_stat_activity, and pg_stat_activity to identify high CPU utilization.
20-
- How to identify root causes, such as long running queries and total connections
21-
- How to resolve high CPU utilization by using EXPLAIN ANALYZE and vacuuming tables.
19+
- How to use tools like Azure Metrics, `pg_stat_statements`, `citus_stat_activity`, and `pg_stat_activity` to identify high CPU utilization.
20+
- How to identify root causes, such as long running queries and total connections.
21+
- How to resolve high CPU utilization by using `EXPLAIN ANALYZE` and vacuuming tables.
2222

23-
## Tools to Identify High CPU Utilization
23+
## Tools to identify high CPU utilization
2424

25-
Consider the use of the following list of tools to identify high CPU utilization:
25+
Use the following tools to identify high CPU utilization:
2626

2727
### Azure Metrics
2828

29-
Azure Metrics is a good starting point to check the CPU utilization for a specific period. Metrics provide information about the resources utilized during the period in which you are monitoring. You can use the **Apply splitting** option and **Split by Server Name** to view the details of each individual node in your elastic cluster. You can then compare the performance of **Write IOPs, Read IOPs, Read Throughput Bytes/Sec**, and **Write Throughput Bytes/Sec** with **CPU percent**, to view the performance of individual nodes when you observe your workload consuming high CPU.
29+
Azure Metrics is a good starting point to check the CPU utilization for a specific period. Metrics provide information about the resources utilized during the period in which you're monitoring. You can use the **Apply splitting** option and **Split by Server Name** to view the details of each individual node in your elastic cluster. You can then compare the performance of **Write IOPs, Read IOPs, Read Throughput Bytes/Sec**, and **Write Throughput Bytes/Sec** with **CPU percent**, to view the performance of individual nodes when you observe your workload consuming high CPU.
3030

31-
Once you have identified a particular node (or nodes) with higher than expected CPU utilization, you can connect directly to one more nodes in question and perform a more in-depth analysis using the following Postgres tools:
31+
After you identify a particular node (or nodes) with higher than expected CPU utilization, you can connect directly to one or more nodes in question and perform a more in-depth analysis by using the following Postgres tools:
3232

3333
### pg_stat_statements
3434

3535
The `pg_stat_statements` extension helps identify queries that consume time on the server. For more information about this extension, see the detailed [documentation](https://www.postgresql.org/docs/current/pgstatstatements.html).
3636

37-
#### Calls/Mean & Total Execution Time
37+
#### Calls/Mean and total execution time
3838

3939
The following query returns the top five SQL statements by highest total execution time:
4040

@@ -47,7 +47,7 @@ DESC LIMIT 5;
4747

4848
### pg_stat_activity
4949

50-
The `pg_stat_activity` view shows the queries that are currently being executed on the specific node. Monitor active queries, sessions, and states on that node.
50+
The `pg_stat_activity` view shows the queries that are currently running on the specific node. Use it to monitor active queries, sessions, and states on that node.
5151

5252
```sql
5353
SELECT *, now() - xact_start AS duration
@@ -58,7 +58,7 @@ ORDER BY duration DESC;
5858

5959
### citus_stat_activity
6060

61-
The `citus_stat_activity` view shows the distributed queries that are executing on all nodes, and is a superset of `pg_stat_activity`. This view also shows tasks specific to subqueries dispatched to workers, task state, and worker nodes.
61+
The `citus_stat_activity` view is a superset of `pg_stat_activity`. It shows the distributed queries that are running on all nodes. It also shows tasks specific to subqueries dispatched to workers, task state, and worker nodes.
6262

6363
```sql
6464
SELECT *, now() - xact_start AS duration
@@ -67,18 +67,18 @@ WHERE state IN ('idle in transaction', 'active') AND pid <> pg_backend_pid()
6767
ORDER BY duration DESC;
6868
```
6969

70-
## Identify Root Causes
70+
## Identify root causes
7171

72-
If CPU consumption levels are high in general, the following scenarios could be possible root causes:
72+
If CPU consumption levels are high, the following scenarios might be the root causes:
7373

7474
### Long-running transactions on specific node
7575

76-
Long-running transactions can consume CPU resources that lead to high CPU utilization.
76+
Long-running transactions consume CPU resources and lead to high CPU utilization.
7777

7878
The following query provides information on long-running transactions:
7979

8080
```sql
81-
SELECT
81+
SELECT
8282
pid,
8383
datname,
8484
usename,
@@ -91,19 +91,19 @@ SELECT
9191
wait_event,
9292
wait_event_type,
9393
query
94-
FROM pg_stat_activity
95-
WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
94+
FROM pg_stat_activity
95+
WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
9696
ORDER BY now() - query_start DESC;
9797
```
9898

9999
### Long-running transactions on all nodes
100100

101-
Long-running transactions can consume CPU resources that lead to high CPU utilization.
101+
Long-running transactions consume CPU resources and lead to high CPU utilization.
102102

103103
The following query provides information on long-running transactions across all nodes:
104104

105105
```sql
106-
SELECT
106+
SELECT
107107
global_pid, pid,
108108
nodeid,
109109
datname,
@@ -117,19 +117,19 @@ SELECT
117117
wait_event,
118118
wait_event_type,
119119
query
120-
FROM citus_stat_activity
121-
WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
120+
FROM citus_stat_activity
121+
WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
122122
ORDER BY now() - query_start DESC;
123123
```
124124

125125
### Slow query
126126

127-
Slow queries can consume CPU resources that lead to high CPU utilization.
127+
Slow queries consume CPU resources and cause high CPU utilization.
128128

129-
The following query helps identify queries taking longer run times:
129+
The following query helps you identify queries that take longer to run:
130130

131131
```sql
132-
SELECT
132+
SELECT
133133
query,
134134
calls,
135135
mean_exec_time,
@@ -151,94 +151,95 @@ ORDER BY total_exec_time DESC;
151151

152152
### Total number of connections and number of connections by state on a node
153153

154-
Many connections to the database might also lead to increased CPU utilization.
154+
Many connections to the database lead to increased CPU utilization.
155155

156156
The following query provides information about the number of connections by state on a single node:
157157

158158
```sql
159-
SELECT state, COUNT(*)
160-
FROM pg_stat_activity
161-
WHERE pid <> pg_backend_pid()
162-
GROUP BY state
159+
SELECT state, COUNT(*)
160+
FROM pg_stat_activity
161+
WHERE pid <> pg_backend_pid()
162+
GROUP BY state
163163
ORDER BY state ASC;
164164
```
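
To put these per-state counts in context, it can help to compare the number of client connections with the configured limit. The following is a minimal sketch, assuming a PostgreSQL version in which `pg_stat_activity` exposes the `backend_type` column (version 10 or later). The column aliases are illustrative.

```sql
-- Compare current client connections with the configured connection limit.
-- Background processes are excluded by filtering on backend_type.
SELECT count(*) AS client_connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity
WHERE backend_type = 'client backend';
```

A count that stays close to `max_connections` suggests that connection pooling is worth evaluating.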
165165

166166
### Total number of connections and number of connections by state on all nodes
167167

168-
Many connections to the database might also lead to increased CPU utilization.
168+
Many connections to the database lead to increased CPU utilization.
169169

170170
The following query gives information about the number of connections by state across all nodes:
171171

172172
```sql
173-
SELECT state, COUNT(*)
174-
FROM citus_stat_activity
175-
WHERE pid <> pg_backend_pid()
176-
GROUP BY state
173+
SELECT state, COUNT(*)
174+
FROM citus_stat_activity
175+
WHERE pid <> pg_backend_pid()
176+
GROUP BY state
177177
ORDER BY state ASC;
178178
```
179179

180-
### Vacuum and Table Stats
180+
### Vacuum and table stats
181+
182+
Keeping table statistics up to date helps improve query performance. Monitor whether regular autovacuuming is happening.
181183

182-
Keeping table statistics up to date helps improve query performance. Monitor whether regular auto vacuuming is being carried out.
184+
The following query helps you identify the tables that need vacuuming:
183185

184-
The following query helps to identify the tables that need vacuuming:
185186
```sql
186-
SELECT *
187-
FROM run_command_on_all_nodes($$
188-
SELECT json_agg(t)
189-
FROM (
187+
SELECT *
188+
FROM run_command_on_all_nodes($$
189+
SELECT json_agg(t)
190+
FROM (
190191
SELECT schemaname, relname
191192
,n_live_tup, n_dead_tup
192193
,n_dead_tup / (n_live_tup) AS bloat
193194
,last_autovacuum, last_autoanalyze
194-
,last_vacuum, last_analyze
195-
FROM pg_stat_user_tables
196-
WHERE n_live_tup > 0 AND relname LIKE '%orders%'
197-
ORDER BY n_dead_tup DESC
195+
,last_vacuum, last_analyze
196+
FROM pg_stat_user_tables
197+
WHERE n_live_tup > 0 AND relname LIKE '%orders%'
198+
ORDER BY n_dead_tup DESC
198199
) t
199200
$$);
200201
```
201202

202-
The following image highlights the output resulting from the above query. The "result" column is a json datatype containing information on the stats.
203+
The following image highlights the output from the preceding query. The `result` column is a JSON data type containing information on the stats.
203204

204205
:::image type="content" source="./media/how-to-high-cpu-utilization-elastic-clusters/elastic-clusters-cpu-utilization-result.png" alt-text="Results returned from query response - including `result` column as a json datatype " lightbox="./media/how-to-high-cpu-utilization-elastic-clusters/elastic-clusters-cpu-utilization-result.png":::
205206

206-
The last_autovacuum and last_autoanalyze columns provide the date and time when the table was last auto vacuumed or analyzed. If the tables aren't being vacuumed regularly, take steps to tune autovacuum.
207+
The `last_autovacuum` and `last_autoanalyze` columns provide the date and time when the table was last autovacuumed or analyzed. If the tables aren't autovacuumed regularly, take steps to tune autovacuum.
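
The `bloat` expression in the earlier query divides two `bigint` columns, so PostgreSQL performs integer division and the result is truncated (it's `0` whenever a table has fewer dead tuples than live tuples). The following single-node sketch shows one way to get a fractional ratio. The `dead_tuple_ratio` alias and the `LIMIT 10` cutoff are illustrative assumptions, not part of the original query.

```sql
-- Cast to numeric so the division keeps its fractional part.
-- NULLIF avoids a division-by-zero error for tables with no live tuples.
SELECT schemaname, relname, n_live_tup, n_dead_tup,
       round(n_dead_tup::numeric / NULLIF(n_live_tup, 0), 4) AS dead_tuple_ratio,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```

Tables with a high ratio and an old or empty `last_autovacuum` value are reasonable first candidates for a manual vacuum.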
207208

208-
The following query provides information regarding the amount of bloat at the schema level:
209+
The following query provides information about the amount of bloat at the schema level:
209210

210211
```sql
211-
SELECT *
212-
FROM run_command_on_all_nodes($$
213-
SELECT json_agg(t) FROM (
212+
SELECT *
213+
FROM run_command_on_all_nodes($$
214+
SELECT json_agg(t) FROM (
214215
SELECT schemaname, sum(n_live_tup) AS live_tuples
215216
, sum(n_dead_tup) AS dead_tuples
216-
FROM pg_stat_user_tables
217-
WHERE n_live_tup > 0
218-
GROUP BY schemaname
217+
FROM pg_stat_user_tables
218+
WHERE n_live_tup > 0
219+
GROUP BY schemaname
219220
ORDER BY sum(n_dead_tup) DESC
220-
) t
221+
) t
221222
$$);
222223
```
223224

224-
## Resolve High CPU Utilization
225+
## Resolve high CPU utilization
225226

226227
Use EXPLAIN ANALYZE to examine any slow queries and terminate any improperly long running transactions. Consider using the built-in PgBouncer connection pooler and clear up excessive bloat to resolve high CPU utilization.
227228

228229
### Use EXPLAIN ANALYZE
229230

230-
Once you know the queries that are consuming more CPU, use **EXPLAIN ANALYZE** to further investigate and tune them.
231+
After you identify the queries that consume the most CPU, use **EXPLAIN ANALYZE** to further investigate and tune them.
231232

232-
For more information about the **EXPLAIN ANALYZE** command, review its [documentation](https://www.postgresql.org/docs/current/sql-explain.html).
233+
For more information about the **EXPLAIN ANALYZE** command, see its [documentation](https://www.postgresql.org/docs/current/sql-explain.html).
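
As a minimal sketch of how this might look in practice, the following example runs `EXPLAIN` with the `ANALYZE` and `BUFFERS` options. The `orders` table and its `status` and `order_id` columns are hypothetical; substitute a statement identified by the earlier queries.

```sql
-- EXPLAIN ANALYZE actually runs the statement, so wrap data-modifying
-- statements in a transaction and roll it back to discard the changes.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE orders SET status = 'shipped' WHERE order_id = 42;  -- hypothetical table and columns
ROLLBACK;
```

In the output, compare the estimated and actual row counts. A large mismatch often points to stale statistics, and sequential scans on large tables can indicate a missing index.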
233234

234235
### Terminate long running transactions on a node
235236

236-
You can consider terminating a long running transaction as an option if the transaction is running longer than expected.
237+
Consider terminating a long running transaction if the transaction runs longer than expected.
237238

238-
To terminate a session's PID, you need to find its PID by using the following query:
239+
To terminate a session, first find its PID by using the following query:
239240

240241
```sql
241-
SELECT
242+
SELECT
242243
pid,
243244
datname,
244245
usename,
@@ -251,29 +252,29 @@ SELECT
251252
wait_event,
252253
wait_event_type,
253254
query
254-
FROM pg_stat_activity WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
255+
FROM pg_stat_activity WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
255256
ORDER BY now() - query_start DESC;
256257
```
257258

258-
You can also filter by other properties like usename (user name), datname (database name), etc.
259+
You can also filter by other properties like `usename` (user name), `datname` (database name), and more.
259260

260-
Once you have the session's PID, you can terminate it using the following query:
261+
After you get the session's PID, terminate it by using the following query:
261262

262263
```sql
263264
SELECT pg_terminate_backend(pid);
264265
```
265266

266-
Terminating the pid ends the specific sessions related to a node.
267+
Terminating the PID ends the specific sessions related to a node.
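
When several sessions are affected, you can combine the lookup and the termination in one statement. The following is a sketch only. The 15-minute threshold is an arbitrary assumption, and you should review the matching sessions before you terminate them.

```sql
-- Terminate sessions on this node that have been idle in a transaction
-- for more than 15 minutes (the threshold is illustrative).
SELECT pid, pg_terminate_backend(pid) AS terminated
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND now() - state_change > interval '15 minutes'
  AND pid <> pg_backend_pid();
```

Running the same `WHERE` clause without the `pg_terminate_backend` call first shows which sessions would be affected.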
267268

268269
### Terminate long running transactions on all nodes
269270

270-
You could consider ending a long running transaction as an option.
271+
Consider ending a long running transaction.
271272

272-
To terminate a session's PID, you need to find its PID, global_pid by using the following query:
273+
To terminate a session, find its PID and `global_pid` by using the following query:
273274

274275
```sql
275-
SELECT
276-
global_pid,
276+
SELECT
277+
global_pid,
277278
pid,
278279
nodeid,
279280
datname,
@@ -287,22 +288,22 @@ SELECT
287288
wait_event,
288289
wait_event_type,
289290
query
290-
FROM citus_stat_activity WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
291+
FROM citus_stat_activity WHERE state != 'idle' AND pid <> pg_backend_pid() AND state IN ('idle in transaction', 'active')
291292
ORDER BY now() - query_start DESC;
292293
```
293294

294-
You can also filter by other properties like usename (user name), datname (database name), etc.
295+
You can also filter by other properties like `usename` (user name), `datname` (database name), and more.
295296

296-
Once you have the session's PID, you can terminate it using the following query:
297+
After you get the session's PID, terminate it by using the following query:
297298

298299
```sql
299300
SELECT pg_terminate_backend(pid);
300301
```
301302
Terminating the pid ends the specific sessions related to a worker node.
302303

303-
The same query running on different worker nodes might have same global_pids. In that case, you can end long running transaction on all worker nodes use global_pid.
304+
The same query running on different worker nodes might have the same `global_pid`. In that case, you can end the long-running transaction on all worker nodes by using the `global_pid`.
304305

305-
The following screenshot shows the relativity of the global_pids to session pids.
306+
The following screenshot shows how `global_pid` values relate to session PIDs.
306307

307308
:::image type="content" source="./media/how-to-high-cpu-utilization-elastic-clusters/global-pid-to-session-pid-example.png" alt-text="global pid to session pid reference example" lightbox="./media/how-to-high-cpu-utilization-elastic-clusters/global-pid-to-session-pid-example.png":::
308309

@@ -311,27 +312,27 @@ SELECT pg_terminate_backend(global_pid);
311312
```
312313

313314
> [!NOTE]
314-
> To terminate long running transactions, it is advised to set server parameters `statement_timeout` or `idle_in_transaction_session_timeout`.
315+
> To terminate long-running transactions automatically, set the server parameter `statement_timeout` or `idle_in_transaction_session_timeout`.
315316
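
As a hedged illustration of the two timeout parameters, the following sketch sets them for the current session only. The values are arbitrary examples. Server-wide values are typically configured through the server parameters settings instead of a `SET` statement.

```sql
-- Session-level settings; they apply only to the current connection.
SET statement_timeout = '60s';                     -- cancel any statement that runs longer than 60 seconds
SET idle_in_transaction_session_timeout = '5min';  -- end sessions that sit idle in a transaction for 5 minutes
SHOW statement_timeout;
```

Trying the values at the session level first makes it easier to check that legitimate long-running jobs aren't cut off before you apply a timeout more broadly.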
316317
## Clearing bloat
317318

318-
A short-term solution would be to manually vacuum and then analyze the tables where slow queries are seen:
319+
A short-term solution is to manually vacuum and then analyze the tables where slow queries appear:
319320

320321
```sql
321322
VACUUM ANALYZE <table>;
322323
```
323324

324-
## Managing Connections
325+
## Managing connections
325326

326-
In situations where there are many short-lived connections, or many connections that remain idle for most of their life, consider using a connection pooler like PgBouncer.
327+
If your application uses many short-lived connections or many connections that stay idle for most of their life, consider using a connection pooler like PgBouncer.
327328

328329
## PgBouncer, a built-in connection pooler
329330

330-
For more information about PgBouncer, see [connection pooler](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/not-all-postgres-connection-pooling-is-equal/ba-p/825717) and [connection handling best practices with PostgreSQL](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/connection-handling-best-practice-with-postgresql/ba-p/790883)
331+
For more information about PgBouncer, see [connection pooler](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/not-all-postgres-connection-pooling-is-equal/ba-p/825717) and [connection handling best practices with PostgreSQL](https://techcommunity.microsoft.com/t5/azure-database-for-postgresql/connection-handling-best-practice-with-postgresql/ba-p/790883).
331332

332333
Azure Database for PostgreSQL Elastic Clusters offer PgBouncer as a built-in connection pooling solution. For more information, see [PgBouncer](../connectivity/concepts-pgbouncer.md).
333334

334335
## Related content
335336

336-
- [Server parameters in Azure Database for PostgreSQL](../server-parameters/concepts-server-parameters.md).
337-
- [Autovacuum tuning in Azure Database for PostgreSQL](how-to-autovacuum-tuning.md).
337+
- [Server parameters in Azure Database for PostgreSQL](../server-parameters/concepts-server-parameters.md)
338+
- [Autovacuum tuning in Azure Database for PostgreSQL](how-to-autovacuum-tuning.md)

docfx.json

Lines changed: 1 addition & 6 deletions
@@ -269,14 +269,9 @@
269269
]
270270
},
271271
"titleSuffix": {
272+
"articles/postgresql/**/*": "Azure Database for PostgreSQL",
272273
"articles/mysql/flexible-server/**/*.md": "Azure Database for MySQL",
273-
"articles/postgresql/scripts/**/*.md": "Azure Database for PostgreSQL",
274-
"articles/postgresql/flexible-server/**/*.md": "Azure Database for PostgreSQL",
275-
"articles/postgresql/migrate/**/*.md": "Azure Database for PostgreSQL",
276274
"articles/mysql/flexible-server/**/*.yml": "Azure Database for MySQL",
277-
"articles/postgresql/scripts/**/*.yml": "Azure Database for PostgreSQL",
278-
"articles/postgresql/flexible-server/**/*.yml": "Azure Database for PostgreSQL",
279-
"articles/postgresql/migrate/**/*.yml": "Azure Database for PostgreSQL",
280275
"articles/cosmos-db/**/*": "Azure Cosmos DB",
281276
"articles/cosmos-db/mongodb/**/*": "Azure Cosmos DB for MongoDB",
282277
"articles/cosmos-db/postgresql/**/*": "Azure Cosmos DB for PostgreSQL",
