Man page for mk-find
August 24, 2007 – 5:35 pm
MK-FIND
Section: User Contributed Perl Documentation (1)
Updated: 2008-06-01
NAME
mk-find - Find MySQL tables and execute actions, like GNU find.
DESCRIPTION
mk-find looks for MySQL tables that pass the tests you specify, and executes
the actions you specify. The default action is to print the database and table
name to STDOUT.
mk-find is simpler than GNU find. It doesn’t allow you to specify
complicated expressions on the command line.
mk-find only looks for and processes tables. If you need it to do
other things, like triggers or columns, file a bug report and I’ll add the
features.
mk-find uses SHOW TABLES when possible, and SHOW TABLE STATUS when needed.
DOWNLOADING
You can download Maatkit from the Sourceforge website at
<http://sourceforge.net/projects/maatkit>, or you can get any of the tools
easily with a command like the following:
wget http://www.maatkit.org/get/toolname
or
wget http://www.maatkit.org/trunk/toolname
Where "toolname" can be replaced with the name (or fragment of a name) of any
of the Maatkit tools. Once downloaded, they’re ready to run; no installation is
needed. The first URL gets the latest released version of the tool, and the
second gets the latest trunk code from Subversion.
OPTIONS
There are three kinds of options: normal options, which determine some behavior
or setting; tests, which determine whether a table should be included in the
list of tables found; and actions, which do something to the tables mk-find
finds.
mk-find uses standard Getopt::Long option parsing, so you should use double
dashes in front of long option names, unlike GNU find.
OPTIONS
- --askpass
Prompt for a password when connecting to MySQL.
- --case-insensitive
Specifies that all regular expression searches are case-insensitive.
- --charset
Enables character set settings in Perl and MySQL. If the value is "utf8", sets
Perl’s binmode on STDOUT to utf8, passes the "mysql_enable_utf8" option to
DBD::mysql, and runs "SET NAMES UTF8" after connecting to MySQL. Any other
value sets binmode on STDOUT without the utf8 layer, and runs "SET NAMES" after
connecting to MySQL.
- --daystart
Measure times (for "--mmin", etc.) from the beginning of today rather than from the
current time.
- --defaults-file
If you specify this option, only this file is read for MySQL default options;
otherwise all the default files will be read.
- --help
Displays a help message.
- --host
Connect to host.
- --or
By default, tests are evaluated as though there were an AND between them. This
option switches it to OR.
Option parsing is not implemented by mk-find itself, so you cannot specify
complicated expressions with parentheses and mixtures of OR and AND.
- --password
The password to use when connecting.
- --port
The port number to use for the connection.
- --quote
This option is enabled by default. It quotes MySQL identifier names with
MySQL’s standard backtick character. Quoting happens after tests are run, and
before actions are run.
- --setvars
Specify any variables you want to be set immediately after connecting to MySQL.
These will be included in a "SET" command.
- --socket
The socket file to use for the connection.
- --user
The user for login if not the current user.
- --version
Output version information and exit.
TESTS
Most tests check some criterion against a column of SHOW TABLE STATUS output.
Numeric arguments can be specified as +n for greater than n, -n for less than n,
and n for exactly n. All numeric options can take an optional suffix multiplier
of k, M or G (1_024, 1_048_576, and 1_073_741_824 respectively). All patterns
are Perl regular expressions (see ‘man perlre’) unless specified as SQL LIKE
patterns.
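As an illustration of the numeric argument rules above, here is a minimal Python sketch (mk-find itself is written in Perl, and this is not its code). The function names parse_numeric_test and passes are hypothetical; only the +n / -n / n convention and the k, M, G multipliers come from this page.

```python
import re

# Multipliers for the optional k, M and G suffixes described above.
SUFFIXES = {"": 1, "k": 1_024, "M": 1_048_576, "G": 1_073_741_824}


def parse_numeric_test(arg):
    """Split an argument such as '+5G' into (comparison, number).

    Hypothetical helper for illustration; not part of mk-find.
    """
    m = re.fullmatch(r"([+-]?)(\d+)([kMG]?)", arg)
    if m is None:
        raise ValueError(f"not a numeric test argument: {arg!r}")
    sign, digits, suffix = m.groups()
    comparison = {"+": ">", "-": "<", "": "=="}[sign]
    return comparison, int(digits) * SUFFIXES[suffix]


def passes(value, arg):
    """Return True if value satisfies the numeric test argument."""
    comparison, n = parse_numeric_test(arg)
    if comparison == ">":
        return value > n
    if comparison == "<":
        return value < n
    return value == n
```

Under these assumptions, passes(table_bytes, "+5G") is true only for tables strictly larger than 5 * 1_073_741_824 bytes.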
Dates and times are all measured relative to the same instant, when mk-find
first asks the database server what time it is. All date and time manipulation
is done in SQL, so if you say to find tables modified 5 days ago, that
translates to SELECT DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 5 DAY). If you
specify "--daystart", of course it’s relative to CURRENT_DATE instead.
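The following Python sketch shows how such a cutoff query could be assembled. It is only a guess at the shape of the SQL, based on the DATE_SUB example above; the function name cutoff_sql and its parameters are hypothetical, and mk-find's real query text may differ.

```python
def cutoff_sql(n, unit="DAY", daystart=False):
    """Build a query asking the server for the instant n units ago.

    Hypothetical helper: with daystart the base is CURRENT_DATE,
    otherwise CURRENT_TIMESTAMP, as described in the text above.
    """
    if unit not in ("DAY", "MINUTE"):
        raise ValueError(f"unsupported unit: {unit!r}")
    base = "CURRENT_DATE" if daystart else "CURRENT_TIMESTAMP"
    # int() guards against anything but a whole number reaching the SQL.
    return f"SELECT DATE_SUB({base}, INTERVAL {int(n)} {unit})"
```

Because the server evaluates the expression, the cutoff follows the server's clock and time zone rather than those of the machine running the tool.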
However, table sizes and other metrics are not consistent at an instant in
time. It can take some time for MySQL to process all the SHOW queries, and
mk-find can’t do anything about that. These measurements are as of the
time they’re taken.
If you need some test that’s not in this list, file a bug report and I’ll
enhance mk-find for you. It’s really easy.
- --autoinc
Table’s next AUTO_INCREMENT is n. This tests the Auto_increment column.
- --avgrowlen
Table average row length is n bytes. This tests the Avg_row_length column.
- --checksum
Table checksum is n. This tests the Checksum column.
- --cmin
Table was created n minutes ago. This tests the Create_time column.
- --collation
Table collation matches pattern. This tests the Collation column.
- --comment
Table comment matches pattern. This tests the Comment column.
- --createopts
Table create option matches pattern. This tests the Create_options column.
- --ctime
Table was created n days ago. This tests the Create_time column.
- --datasize
Table data uses n bytes of space. This tests the Data_length column.
- --datafree
Table has n bytes of free space. This tests the Data_free column.
- --dblike
Database name matches SQL LIKE pattern.
- --dbregex
Database name matches this pattern.
- --empty
Table has no rows. This tests the Rows column.
- --engine
Table storage engine matches this pattern. This tests the Engine column, or in
earlier versions of MySQL, the Type column.
- --indexsize
Table indexes use n bytes of space. This tests the Index_length column.
- --kmin
Table was checked n minutes ago. This tests the Check_time column.
- --ktime
Table was checked n days ago. This tests the Check_time column.
- --mmin
Table was last modified n minutes ago. This tests the Update_time column.
- --mtime
Table was last modified n days ago. This tests the Update_time column.
- --pid
Table name has nonexistent MySQL connection ID. This tests the table name for
a pattern. The argument to this test must be a Perl regular expression that
captures digits like this: (\d+). If the table name matches the pattern,
these captured digits are taken to be the MySQL connection ID of some process.
If the connection doesn’t exist according to SHOW FULL PROCESSLIST, the test
returns true. If the connection ID is greater than mk-find’s own
connection ID, the test returns false for safety.
Why would you want to do this? If you use MySQL statement-based replication,
you probably know the trouble temporary tables can cause. You might choose to
work around this by creating real tables with unique names, instead of
temporary tables. One way to do this is to append your connection ID to the
end of the table name, thusly: scratch_table_12345. This assures the table name
is unique and lets you have a way to find which connection it was associated
with. And perhaps most importantly, if the connection no longer exists, you
can assume the connection died without cleaning up its tables, and this table
is a candidate for removal.
This is how I manage scratch tables, and that’s why I included this test in
mk-find.
The argument I use to "--pid" is "\D_(\d+)$". That finds tables with a series of
numbers at the end, preceded by an underscore and some non-number character (the
latter criterion prevents me from examining tables with a date at the end, which
people tend to do: baron_scratch_2007_05_07 for example). It’s better to keep
the scratch tables separate of course.
If you do this, make sure the user mk-find runs as has the PROCESS privilege!
Otherwise it will only see connections from the same user, and might think some
tables are ready to remove when they’re still in use. For safety, mk-find
checks this for you.
See also "--sid".
- --rows
Table has n rows. This tests the Rows column.
- --rowformat
Table row format matches pattern. This tests the Row_format column.
- --sid
Table name contains the server ID. If you create temporary tables with the
naming convention explained in "--pid", but also add the server ID of the
server on which the tables are created, then you can use this pattern match to
ensure tables are dropped only on the server they’re created on. This prevents
a table from being accidentally dropped on a slave while it’s in use (provided
that your server IDs are all unique, which they should be for replication to
work).
For example, on the master (server ID 22) you create a table called
scratch_table_22_12345. If you see this table on the slave (server ID 23), you
might think it can be dropped safely if there’s no such connection 12345. But
if you also force the name to match the server ID with "--sid '\D_(\d+)_\d+$'",
the table won’t be dropped on the slave.
- --tablesize
Table uses n bytes of space. This tests the sum of the Data_length and
Index_length columns.
- --tbllike
Table name matches SQL LIKE pattern.
- --tblregex
Table name matches this pattern.
- --tblversion
Table version is n. This tests the Version column.
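To make the "--pid" and "--sid" rules concrete, here is a rough Python sketch of the decision they describe when used together with the name_sid_pid convention. It is not mk-find's Perl code: the function is_orphan and its parameters are hypothetical, and the set of live connection IDs stands in for the output of SHOW FULL PROCESSLIST.

```python
import re

# Patterns for the name_sid_pid convention, e.g. scratch_table_22_12345.
PID_PATTERN = re.compile(r"\D_\d+_(\d+)$")
SID_PATTERN = re.compile(r"\D_(\d+)_\d+$")


def is_orphan(table, live_connection_ids, own_connection_id, server_id):
    """Return True if the table looks safe to remove on this server.

    Hypothetical helper: both tests must pass (the default AND).
    """
    sid = SID_PATTERN.search(table)
    pid = PID_PATTERN.search(table)
    if sid is None or pid is None:
        return False  # name does not follow the convention
    if int(sid.group(1)) != server_id:
        return False  # created on another server, e.g. the master
    connection_id = int(pid.group(1))
    if connection_id > own_connection_id:
        return False  # newer than our own connection: refuse, for safety
    return connection_id not in live_connection_ids


# The simpler pattern from the text skips names that end in a date.
SIMPLE_PID = re.compile(r"\D_(\d+)$")
```

With these definitions, SIMPLE_PID matches scratch_table_12345 and captures 12345, but does not match baron_scratch_2007_05_07, because the final underscore is preceded by a digit.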
ACTIONS
The exec_plus action happens after everything else, but otherwise actions
happen in an indeterminate order. If you need determinism, file a bug report
and I’ll add this feature.
- --exec
Execute this SQL with each item found. The SQL can contain escapes and
formatting directives (see "--printf").
- --exec_plus
Execute this SQL with all items at once. This option is unlike "--exec". There
are no escaping or formatting directives; there is only one special placeholder
for the list of database and table names, %s. The list of tables found will be
joined together with commas and substituted wherever you place %s.
You might use this, for example, to drop all the tables you found:
DROP TABLE %s
This is sort of like GNU find’s "-exec command {} +" syntax. Only it’s not
totally cryptic. And it doesn’t require me to write a command-line parser.
- --print
Print the database and table name, followed by a newline. This is the default
action if no other action is specified.
- --printf
Print format on the standard output, interpreting '\' escapes and '%'
directives. Escapes are backslashed characters, like \n and \t. Perl
interprets these, so you can use any escapes Perl knows about. Directives are
replaced by %s, and as of this writing, you can’t add any special formatting
instructions, like field widths or alignment (though I’m musing over ways to do
that).
Here is a list of the directives. Note that most of them simply come from
columns of SHOW TABLE STATUS. If the column is NULL or doesn’t exist, you get
an empty string in the output. A % character followed by any character not in
the following list is discarded (but the other character is printed).
CHAR  DATA SOURCE      NOTES
----  ---------------  ------------------------------------------
a     Auto_increment
A     Avg_row_length
c     Checksum
C     Create_time
D     Database         The database name in which the table lives
d     Data_length
E     Engine           In older versions of MySQL, this is Type
F     Data_free
f     Innodb_free      Parsed from the Comment field
I     Index_length
K     Check_time
L     Collation
M     Max_data_length
N     Name
O     Comment
P     Create_options
R     Row_format
S     Rows
T     Table_length     Data_length+Index_length
U     Update_time
V     Version
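The directive rules for "--printf" can be sketched in Python as follows. This is an illustration only, not mk-find's implementation: the function expand is hypothetical, the row is a plain dictionary standing in for one SHOW TABLE STATUS row plus a Database key, the f (Innodb_free) directive is left out because its parsing is not described here, and treating a missing length as 0 when computing T is an assumption.

```python
import re

# Directive character -> source column, from the table above.
# 'f' (Innodb_free) is omitted: its parsing is not shown on this page.
DIRECTIVES = {
    "a": "Auto_increment", "A": "Avg_row_length", "c": "Checksum",
    "C": "Create_time", "D": "Database", "d": "Data_length",
    "E": "Engine", "F": "Data_free", "I": "Index_length",
    "K": "Check_time", "L": "Collation", "M": "Max_data_length",
    "N": "Name", "O": "Comment", "P": "Create_options",
    "R": "Row_format", "S": "Rows", "U": "Update_time", "V": "Version",
}


def expand(fmt, row):
    """Replace %X directives in fmt with values from row."""
    def replace(match):
        char = match.group(1)
        if char == "T":
            # Table_length is Data_length + Index_length.
            # Assumption: a missing or NULL length counts as 0.
            total = (row.get("Data_length") or 0) + (row.get("Index_length") or 0)
            return str(total)
        column = DIRECTIVES.get(char)
        if column is None:
            return char  # unknown directive: drop the %, keep the character
        value = row.get(column)
        return "" if value is None else str(value)  # NULL or absent -> ""

    return re.sub(r"%(.)", replace, fmt, flags=re.DOTALL)
```

The caller is expected to pass a format whose backslash escapes have already been interpreted, which in mk-find is Perl's job.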
EXAMPLES
Find all tables created more than a day ago, which use the MyISAM engine, and
print their names:
mk-find --ctime +1 --engine MyISAM
Find InnoDB tables that haven’t been updated in a month, and convert them to
MyISAM storage engine (data warehousing, anyone?):
mk-find --mtime +30 --engine InnoDB --exec "ALTER TABLE %D.%N ENGINE=MyISAM"
Find tables created by a process that no longer exists, following the name_sid_pid
naming convention, and remove them.
mk-find --pid '\D_\d+_(\d+)$' --sid '\D_(\d+)_\d+$' --exec_plus "DROP TABLE %s"
Find empty tables in the test and junk databases, and delete them:
mk-find --empty junk test --exec_plus "DROP TABLE %s"
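For a sense of the statement that "--exec_plus" ends up running, here is a small Python sketch of the %s substitution. The function exec_plus_sql is hypothetical, the exact spacing after each comma is an assumption, and the backtick quoting mirrors the default "--quote" behavior without handling names that themselves contain backticks.

```python
def exec_plus_sql(template, tables, quote=True):
    """Substitute the comma-joined list of db.table names for %s.

    tables is a list of (database, table) pairs. Hypothetical helper.
    """
    def q(name):
        return f"`{name}`" if quote else name

    names = ", ".join(f"{q(db)}.{q(tbl)}" for db, tbl in tables)
    return template.replace("%s", names)


if __name__ == "__main__":
    found = [("junk", "a"), ("test", "b")]
    print(exec_plus_sql("DROP TABLE %s", found))
```

A single statement naming every table is what distinguishes this action from "--exec", which runs once per table.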
Find tables more than five gigabytes in total size:
mk-find --tablesize +5G
Find all tables and print their total data and index size, and sort largest
tables first (sort is a different program, by the way).
mk-find --printf "%T\t%D.%N\n" | sort -rn
As above, but this time, insert the data back into the database for posterity:
mk-find --noquote --exec "INSERT INTO sysdata.tblsize(db, tbl, size) VALUES('%D', '%N', %T)"
ENVIRONMENT
The environment variable "MKDEBUG" enables verbose debugging output in all of
the Maatkit tools:
MKDEBUG=1 mk-….
BUGS
Please use the Sourceforge bug tracker, forums, and mailing lists to request
support or report bugs: <http://sourceforge.net/projects/maatkit/>.
Please include the complete command-line used to reproduce the problem you are
seeing, the version of all MySQL servers involved, the complete output of the
tool when run with "--version", and if possible, debugging output produced by
running with the "MKDEBUG=1" environment variable.
SYSTEM REQUIREMENTS
You need the following Perl modules: DBI and DBD::mysql.
LICENSE
This program is copyright (c) 2007 Baron Schwartz.
Feedback and improvements are welcome (see “BUGS”).
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue 'man perlgpl' or 'man perlartistic' to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
AUTHOR
VERSION
This manual page documents Ver 0.9.11 Distrib 1972 $Revision: 1970 $.