Man page for mk-slave-prefetch
August 24, 2007 – 5:38 pmMK-SLAVE-PREFETCH
Section: User Contributed Perl Documentation (1)
Updated: 2008-06-01
Index
Return to Main Contents
NAME
mk-slave-prefetch - Pipeline relay logs on a MySQL slave to pre-warm caches.
SYNOPSIS
mk-slave-prefetch
mk-slave-prefetch –statistics > /path/to/saved/statistics
mk-slave-prefetch /path/to/saved/statistics
DESCRIPTION
mk-slave-prefetch reads the slave’s relay log slightly ahead of where the
slave’s SQL thread is reading, converts statements into "SELECT", and
executes them. In theory, this should help alleviate the effects of the
slave’s single-threaded SQL execution. It will help take advantage of
multiple CPUs and disks by pre-reading the data from disk, so the data is
already in the cache when the slave SQL thread executes the un-modified
version of the statement.
Statements that can’t be converted into "SELECT" are ignored. However, there
is always a chance of bugs. It would be a very good idea to connect as a
read-only user. Here is an example of how to grant the necessary privileges:
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.*
TO ‘prefetch’@'%’ IDENTIFIED BY ’sp33dmeup!’;
"mk-slave-prefetch" learns how long it takes statements to execute, and doesn’t
try to execute those that take a very long time. You can ask it to print what
it has learned after it executes. You can also specify a filename on the
command line. The file should contain the statistics printed by a previous
run. These will be used to pre-populate the statistics so it doesn’t have to
re-learn.
This program is based on concepts I heard Paul Tuckfield explain at the November
2006 MySQL Camp un-conference. However, the code is my own work. I have not
seen any other implementation of Paul’s idea.
DOES IT WORK?
Does it work? Does it actually speed up the slave?
That depends on your workload, hardware, and other factors. It might work when
the following are true:
- *
The slave’s data is much larger than memory, and the workload is mostly randomly
scattered small (single-row is ideal) changes.
- *
There are lots of high-concurrency "UPDATE" and "DELETE" statements on the
master.
- *
The slave SQL thread is I/O-bound, but the slave overall has plenty of spare I/O
capacity (definitely more than one disk spindle).
- *
The slave uses InnoDB or another storage engine with row-level locking.
It does not speed up replication on my slaves, which mostly have large
queries like "INSERT .. SELECT .. GROUP BY". In my benchmarks it seemed to
make no difference at all, positive or negative.
On the wrong workload or slave configuration, this technique might actually make
the slaves slower. Your mileage will vary.
User-contributed benchmarks are welcome.
DOWNLOADING
You can download Maatkit from the Sourceforge website at
<http://sourceforge.net/projects/maatkit>, or you can get any of the tools
easily with a command like the following:
wget http://www.maatkit.org/get/toolname
or
wget http://www.maatkit.org/trunk/toolname
Where "toolname" can be replaced with the name (or fragment of a name) of any
of the Maatkit tools. Once downloaded, they’re ready to run; no installation is
needed. The first URL gets the latest released version of the tool, and the
second gets the latest trunk code from Subversion.
OPTIONS
–print and –daemonize are mutually exclusive.
Specify at least one of –print or –execute.
- –askpass
Prompt for a password when connecting to MySQL.
- –charset
short form: -A; type: stringDefault character set.
Enables character set settings in Perl and MySQL. If the value is "utf8", sets
Perl’s binmode on STDOUT to utf8, passes the "mysql_enable_utf8" option to
DBD::mysql, and runs "SET NAMES UTF8" after connecting to MySQL. Any other
value sets binmode on STDOUT without the utf8 layer, and runs "SET NAMES" after
connecting to MySQL.
- –checkint
short form: -i; type: Array; default: 16,1,1024How often to check the slave: init,min,max.
How many relay log events should pass before checking the output of "SHOW
SLAVE STATUS". The syntax is a three-number range: initial, minimum, and
maximum. You should be able to leave this at the defaults."mk-slave-prefetch" varies the check interval in powers of two, depending on
whether it decides the check was necessary.
- –daemonize
Fork and run in the background; POSIX OSes only.
- –database
short form: -D; type: stringThe database to use for the connection.
Connect to this database. "mk-slave-prefetch" will issue "USE" statements
as required by the binary log events.
- –defaults-file
short form: -F; type: stringOnly read mysql options from the given file. You must give an absolute
pathname.
- –errors
culumative: yes; default: 0; type: intPrint queries that caused errors. If specified once, at exit; if twice, in
realtime.If you specify this option once, you will see a report at the end of the script
execution, showing the normalized queries and the number of times they were
seen. If you specify this option twice, you will see the errors printed out as
they occur, but no normalized report at the end of execution.
- –execute
short form: -x; negatable: yes; default: yesExecute the transformed queries to warm the caches.
- –host
short form: -h; type: stringHost to connect to.
- –iolag
short form: -l; type: size; default: 1kHow many bytes to lag the slave I/O thread.
This helps avoid "mysqlbinlog" reading right off the end of the relay log file.
- –maxquerytime
short form: -q; type: float; default: 1Do not run queries longer than this many seconds; fractions allowed.
If "mk-slave-prefetch" predicts the query will take longer to execute, it will
skip the query. This is based on the theory that pre-warming the cache is most
beneficial for short queries."mk-slave-prefetch" learns how long queries require to execute. It keeps an
average over the last “–querysampsize” samples of each query. The averages
are based on an abstracted version of the query, with specific parameters
replaced by placeholders. The result is a sort of “fingerprint” for the query,
not executable SQL. You can see the learned statistics with the
“–statistics” option.You can pre-load query fingerprints, and average execution times, from a file.
This way you don’t have to wait for "mk-slave-prefetch" to learn all over
every time you start it. Just specify the file on the command line. The
format should be the same as the output from “–statistics”.You might also want to filter out some statements completely, or let only some
statements through. See the “–rejectregexp” and “–permitregexp”
options.If "mk-slave-prefetch" hasn’t seen a query’s fingerprint before, and thus
doesn’t know how long it will take to execute, it wraps it in a subuery, like
this:
SELECT 1 FROM ( <query> ) AS X LIMIT 1;This helps avoid fetching a lot of data back to the client when a query is
very large. It requires a version of MySQL that supports subqueries (version
4.1 and newer). If yours doesn’t, the subquery trick can’t be used, so the
query might fetch a lot of data back to the client.Once a query’s fingerprint has been seen, so it’s known that the query isn’t
enormously slow, "mk-slave-prefetch" just rewrites the "SELECT" list for
efficiency. (Avoiding the subquery reduces the query’s overhead for short
queries). The rewritten query will then look like the following;
SELECT ISNULL(COALESCE(<columns>)) FROM …- –offset
short form: -o; type: size; default: 128How many bytes "mk-slave-prefetch" will try to stay in front of the slave
SQL thread.It will not execute log events it doesn’t think are at least this
far ahead of the SQL thread. See also “–window”.
- –password
short form: -p; type: stringThe password to use when connecting.
- –permitregexp
type: stringPermit queries matching this Perl regexp.
This is a filter for log events. The regular expression is matched against the
raw log event, before any transformations are applied. If specified, this
option will permit only log events matching the regular expression.
- –port
short form: -P; type: intPort number to use for connection.
Print the transformed relay log events to standard output.
- –printnonrewritten
Print queries that could not be transformed into "SELECT".
- –progress
type: intPrint progress information every X events.
The information is the current log file and position, plus a summary of the
statistics gathered.
- –querysampsize
type: int; default: 4Average query exec time over this many queries.
The last "N" queries with a given fingerprint are averaged together to get the
average query execution time (see “–maxquerytime”).
- –rejectregexp
type: stringReject queries matching this Perl regexp.
Similar to “–permitregexp”, but has the opposite effect: log events must
not match the regular expression.
- –sentinel
type: string; default: /tmp/mk-slave-prefetch-sentinelExit if this file exists.
- –setvars
type: string; default: wait_timeout=10000Set these MySQL variables.
Specify any variables you want to be set immediately after connecting to MySQL.
These will be included in a "SET" command.
- –socket
short form: -S; type: stringSocket file to use for connection.
- –statistics
Print execution statistics after exiting.The statistics are in two sections: counters, and queries. The counters simply
count the number of times events occur. You may see the following counters:
NAME MEANING
====================== =======================================
mysqlbinlog Executed mysqlbinlog to read log events.
events The total number of relay log events.
not_far_enough_ahead An event was not at least –offset
bytes ahead of the SQL thread.
too_far_ahead An event was more than –offset
+ –window bytes ahead of the SQL thread.
too_close_to_io_thread An event was less than –iolag bytes
away from the I/O thread’s position.
event_not_allowed An event wasn’t a SET, USE, INSERT,
UPDATE, DELETE or REPLACE query.
event_filtered_out An event was filtered out because of
–permitregexp or –rejectregexp.
same_timestamp A SET TIMESTAMP event was ignored because
it had the same timestamp as the last one.
do_query A transformed event was executed
or printed.
query_error An executed query had an error.
query_too_long An event was not executed because its
average query length exceeded
–maxquerytime.
query_not_rewritten A query could not be rewritten to a
SELECT.
master_pos_wait The tool waited for the SQL thread to
catch up.
show_slave_status The tool queried SHOW SLAVE STATUS.
load_data_infile The tool found a LOAD DATA INFILE query
and unlinked (deleted) the temp file.
could_not_unlink The tool failed to unlink a temp file.
sleep The tool slept for a second because the
slave’s SQL thread was not running, or
because it read past the end of the log.After the counters, "mk-slave-prefetch" prints information about each query
fingerprint it has seen, two lines per fingerprint. The first line contains
the query’s fingerprint. The second line contains the number of times the
fingerprint was seen, number of times executed, the sum of the execution
times, and the average execution time over the last “–querysampsize”
samples.
- –stop
Stop running instances by creating the “–sentinel” file.
- –time
short form: -t; type: timeHow long "mk-slave-prefetch" should run before exiting.
The default is to run forever.
- –tmpdir
type: string; default: /dev/nullWhere to create temp files for "LOAD DATA INFILE" queries.
The default will cause "mysqlbinlog" to skip the file and the associated "LOAD
DATA INFILE" command entirely.If "mk-slave-prefetch" sees a "LOAD DATA INFILE" command (which it won’t, if
this is left at the default), it will try to remove the temporary file, then
skip the event.
- –user
short form: -u; type: stringUser for login if not current user.
- –window
short form: -w; type: size; default: 4kThe max bytes ahead of the slave "mk-slave-prefetch" should get.
Defines the window within which "mk-slave-prefetch" considers a query OK to
execute. The window begins at the slave SQL thread’s last known position plus
“–offset” bytes, and extends for the specified number of bytes.If "mk-slave-prefetch" sees a log event that is too far in the future, it will
increment the "too_far_ahead" counter and wait for the slave SQL thread to
catch up (which increments the "master_pos_wait" counter). If an event isn’t
far enough ahead of the SQL thread, it will be discarded and the
"not_far_enough_ahead" counter increments.Watching the mentioned statistics can help you understand how to tune the
window. You want "mk-slave-prefetch" to run just ahead of the SQL thread, not
throwing out a lot of events for being too far ahead or not far enough ahead.
SYSTEM REQUIREMENTS
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
ENVIRONMENT
The environment variable "MKDEBUG" enables verbose debugging output in all of
the Maatkit tools:
MKDEBUG=1 mk-….
BUGS
Please use the Sourceforge bug tracker, forums, and mailing lists to request
support or report bugs: <http://sourceforge.net/projects/maatkit/>.
Please include the complete command-line used to reproduce the problem you are
seeing, the version of all MySQL servers involved, the complete output of the
tool when run with “–version”, and if possible, debugging output produced by
running with the "MKDEBUG=1" environment variable.
COPYRIGHT, LICENSE AND WARRANTY
This program is copyright (c) 2007 Baron Schwartz.
Feedback and improvements are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl’ or `man perlartistic’ to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
AUTHOR
VERSION
This manual page documents Ver 1.0.2 Distrib 1972 $Revision: 1970 $.
Index
- NAME
- SYNOPSIS
- DESCRIPTION
- DOES IT WORK?
- DOWNLOADING
- OPTIONS
- SYSTEM REQUIREMENTS
- ENVIRONMENT
- BUGS
- COPYRIGHT, LICENSE AND WARRANTY
- AUTHOR
- VERSION