forked from laurenz/pgreplay
-
Notifications
You must be signed in to change notification settings - Fork 0
/
pgreplay.1
203 lines (203 loc) · 7.32 KB
/
pgreplay.1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
.TH pgreplay 1 "" "Jun 2011" "PostgreSQL Utilities"
.SH NAME
pgreplay \- PostgreSQL log file replayer for performance tests
.SH SYNOPSIS
\fBpgreplay\fP [\fIparse options\fR] [\fIreplay options\fR]
[\fB-d\fR \fIlevel\fR] [\fIinfile\fR]
.br
\fBpgreplay\fP \fB-f\fP [\fIparse options\fR] [\fB-o\fP \fIoutfile\fR]
[\fB-d\fR \fIlevel\fR] [\fIinfile\fR]
.br
\fBpgreplay\fP \fB-r\fP [\fIreplay options\fR] [\fB-d\fR \fIlevel\fR]
[\fIinfile\fR]
.SH DESCRIPTION
\fBpgreplay\fR reads a PostgreSQL log file (\fInot\fR a WAL file), extracts
the SQL statements and executes them in the same order and relative time
against a PostgreSQL database cluster.
A final report gives you a useful statistical analysis of your workload
and its execution.
.P
In the first form, the log file \fIinfile\fR is replayed at the time it is
read.
.P
With the \fB-f\fR option, \fBpgreplay\fR will not execute the statements, but
write them to a \(oqreplay file\(cq \fIoutfile\fR that can be replayed with
the third form.
.P
With the \fB-r\fP option, \fBpgreplay\fR will execute the statements in the
replay file \fIinfile\fR that was created by the second form.
.P
If the execution of statements gets behind schedule, warning messages
are issued that indicate that the server cannot handle the load in a
timely fashion.
The idea is to replay a real-world database workload as exactly as possible.
.P
To create a log file that can be parsed by \fBpgreplay\fR, you need to set the
following parameters in \fBpostgresql.conf\fR:
.IP
\fBlog_min_messages=error\fR (or more)
.br
\fBlog_min_error_statement=log\fR (or more)
.br
\fBlog_connections=on\fR
.br
\fBlog_disconnections=on\fR
.br
\fBlog_line_prefix=\(aq%m|%u|%d|%c|\(aq\fR (if you don\(aqt use CSV logging)
.br
\fBlog_statement=\(aqall\(aq\fR
.br
\fBlc_messages\fR must be set to English (encoding does not matter)
.br
\fBbytea_output=escape\fR
(from version 9.0 on, only if you want to replay the log on 8.4 or earlier)
.P
The database cluster against which you replay the SQL statements must be
a clone of the database cluster that generated the logs from the time
\fIimmediately before\fR the logs were generated.
.P
\fBpgreplay\fR is useful for performance tests, particularly in the following
situations:
.TP 4
*
You want to compare the performance of your PostgreSQL application
on different hardware or different operating systems.
.TP 4
*
You want to upgrade your database and want to make sure that the new
database version does not suffer from performance regressions that
affect you.
.P
Moreover, \fBpgreplay\fR can give you some feeling as to how your application
\fImight\fR scale by allowing you to try to replay the workload at a higher
speed. Be warned, though, that 500 users working at double speed is not really
the same as 1000 users working at normal speed.
.SH OPTIONS
.SS Parse options:
.TP
\fB-c\fR
Specifies that the log file is in \(aqcsvlog\(aq format (highly recommended)
and not in \(aqstderr\(aq format.
.TP
\fB-b\fR \fItimestamp\fR
Only log entries greater or equal to that timestamp will be parsed.
The format is \fBYYYY-MM-DD HH:MM:SS.FFF\fR like in the log file.
An optional time zone part will be ignored.
.TP
\fB-e\fR \fItimestamp\fR
Only log entries less or equal to that timestamp will be parsed.
The format is \fBYYYY-MM-DD HH:MM:SS.FFF\fR like in the log file.
An optional time zone part will be ignored.
.TP
\fB-q\fR
Specifies that a backslash in a simple string literal will escape
the following single quote.
This depends on configuration options like
\fBstandard_conforming_strings\fR and is the default for server
version 9.0 and less.
.TP
\fB-D\fR \fIdatabase\fR
Only log entries related to the specified database will be parsed
(this option can be specified multiple times for more than one database).
.TP
\fB-U\fR \fIusername\fR
Only log entries related to the specified username will be parsed
(this option can be specified multiple times for more than one user).
.SS Replay options:
.TP
\fB-h\fR \fIhostname\fR
Host name where the target database cluster is running (or directory where
the UNIX socket can be found). Defaults to local connections.
.br
This works just like the \fB-h\fR option of \fBpsql\fR.
.TP
\fB-p\fR \fIport\fR
TCP port where the target database cluster can be reached.
.TP
\fB-W\fR \fIpassword\fR
By default, \fBpgreplay\fR assumes that the target database cluster
is configured for \fItrust\fR authentication. With the \fB-W\fR option
you can specify a password that will be used for all users in the cluster.
.TP
\fB-s\fR \fIfactor\fR
Speed factor for replay, by default 1. This can be any valid positive
floating point number. A \fIfactor\fR less than 1 will replay the workload
in \(oqslow motion\(cq, while a \fIfactor\fR greater than 1 means
\(oqfast forward\(cq.
.TP
\fB-E\fR \fIencoding\fR
Specifies the encoding of the log file, which will be used as client
encoding during replay. If it is omitted, your default client encoding will
be used.
.TP
\fB-j\fR
If all connections are idle, jump ahead to the next request instead of
sleeping. This will speed up replay. Execution delays will still be reported
correctly, but replay statistics will not contain the idle time.
.TP
\fB-n\fR
Dry run mode. No connections to the server are made.
Useful for checking if the replay file is corrupt or to get statistics
about the replay file (number of statements, original duration, ...)
.TP
\fB-X\fR \fIoptions\fR
Extra connection options for replay connections. These must be libpq
connection options specified in the format \(oqoption=value [...]\(cq.
.SS Output options:
.TP
\fB-o\fP \fIoutfile\fR
specifies the replay file where the statements will be written
for later replay.
.SS Debug options:
.TP
\fB-d\fR \fIlevel\fR
Specifies the trace level (between 1 and 3). Increasing levels will produce
more detailed information about what \fBpgreplay\fR is doing.
.TP
\fB-v\fR
Prints the program version and exits.
.SH ENVIRONMENT
.TP
\fBPGHOST\fR
Specifies the default value for the \fB-h\fR option.
.TP
\fBPGPORT\fR
Specifies the default value for the \fB-p\fR option.
.TP
\fBPGCLIENTENCODING\fR
Specifies the default value for the \fB-E\fR option.
.SH LIMITATIONS
\fBpgreplay\fR can only replay what is logged by PostgreSQL.
This leads to some limitations:
.TP 4
*
\fBCOPY\fR statements will not be replayed, because the copy data are not
logged.
.TP 4
*
Fast-path API function calls are not logged and will not be replayed.
Unfortunately, this includes the Large Object API.
.TP 4
*
Since the log file is always in the server encoding (which you can specify
with the \fB-E\fR switch of \fBpgreplay\fR), all
\fBSET client_encoding\fR statements will be ignored.
.TP 4
*
Since the preparation time of prepared statements is not logged (unless
\fBlog_min_messages\fR is \fBdebug2\fR or more), these statements will be
prepared immediately before they are first executed during replay.
.TP 4
*
Because the log file contains only text, query parameters and return values
will always be in text and never in binary format. If you use binary mode to,
say, transfer large binary data, \fBpgreplay\fR can cause significantly more
network traffic than the original run.
.TP 4
*
Sometimes, if a connection takes longer to complete, the session ID
unexpectedly changes in the PostgreSQL log file. This causes \fBpgreplay\fR
to treat the session as two different ones, resulting in an additional
connection. This is arguably a bug in PostgreSQL.
.SH AUTHOR
Written by Laurenz Albe \fB<[email protected]>\fR.