Man page for mk-parallel-restore
August 24, 2007 – 5:36 pmMK-PARALLEL-RESTORE
Section: User Contributed Perl Documentation (1)
Updated: 2008-06-01
Index
Return to Main Contents
NAME
mk-parallel-restore – Load files into MySQL in parallel.
SYNOPSIS
mk-parallel-restore /path/to/files
mk-parallel-restore –tab /path/to/files
DESCRIPTION
mk-parallel-restore is a way to load SQL or delimited-file dumps into MySQL
in parallel at high speed. It is especially designed for restoring files
dumped by mk-parallel-dump. It automatically
detects whether a file contains SQL or delimited data from the filename
extension, and either shells out to "mysql" or executes "LOAD DATA INFILE"
with the file. On UNIX-like systems, it will even make a FIFO to decompress
gzipped files for "LOAD DATA INFILE".
By default it discovers all files in the directory you specify on the command
line. It uses the file’s parent directory as the database name and the file’s
name (up to the first dot) as the table name. It can deal with files named
like the following:
dir/tbl.sql
dir/tbl.txt
dir/tbl.csv
dir/tbl.trg
It is also happy with files that look like this, where "EXT" is one of the
extensions just listed.
dir/tbl.EXT.000
dir/tbl.EXT.000.gz
By default, it loads "SQL" files first, if they exist, then loads "CSV" or
"TXT" files next, in order of the numbers in the filename extension as just
shown. This makes it easy for you to reload a table’s definition followed by
its data, in case you dumped them into separate files (as happens with
"mysqldump"‘s "–tab" option). It loads "TRG" files, which create
triggers, last. Creating the triggers before loading data might keep the data
from being restored correctly. Files that are named 00_views.sql are loaded
even later, after all the parallel restores are finished; dependencies among
views and tables make them hard to restore one at a time. See
mk-parallel-dump for details on how data is dumped.
Exit status is 0 if everything went well, 1 if any files failed, and any
other value indicates an internal error.
OUTPUT
Output depends on verbosity. When “–test” is given, output includes
commands that would be executed.
When “–verbose” is 0, there is normally no output unless there’s an error.
When “–verbose” is 1, there is one line of output for the entire job,
showing how many tables were processed, how many files were loaded with what
status, how much time elapsed, and how much time the parallel load jobs added
up to. If any files were skipped, the filenames are printed to the output.
When “–verbose” is 2, there’s one line of output per table, showing extra
data such as how many threads were running when each table finished loading:
DATABASE TABLE FILES TIME STATUS THREADS
sakila language 2 0.07 0 2
sakila film_actor 2 0.07 0 2
sakila actor 2 0.06 0 2
sakila payment 2 0.07 0 2
sakila transport_backup 2 0.05 0 2
sakila country 2 0.08 0 2
sakila film 2 0.05 0 2
sakila rental 2 0.07 0 2
SPEED OF PARALLEL LOADING
User-contributed benchmarks are welcome. See
<http://www.paragon-cs.com/wordpress/?p=52> for one user’s experiences.
DOWNLOADING
You can download Maatkit from the Sourceforge website at
<http://sourceforge.net/projects/maatkit>, or you can get any of the tools
easily with a command like the following:
wget http://www.maatkit.org/get/toolname
or
wget http://www.maatkit.org/trunk/toolname
Where "toolname" can be replaced with the name (or fragment of a name) of any
of the Maatkit tools. Once downloaded, they’re ready to run; no installation is
needed. The first URL gets the latest released version of the tool, and the
second gets the latest trunk code from Subversion.
OPTIONS
- –askpass
Prompt for a password when connecting to MySQL.
- –basedir
type: stringDirectory where FIFO files will be created.
- –bulkinsbufsize
type: intSet bulk_insert_buffer_size before each "LOAD DATA INFILE".
Has no effect without “–tab”.
- –charset
short form: -A; type: string; default: BINARYSets the connection, database, and "LOAD DATA INFILE" character set.
The default is "BINARY", which is the safest value to use for "LOAD DATA
INFILE". Has no effect without “–tab”.
- –commit
Commit after each load via "LOAD DATA INFILE".
- –createdb
Create databases if they don’t exist.
- –csv
Files are in CSV format (implies “–tab”).Changes “–tab” options so the following "LOAD DATA INFILE" statement is used:
LOAD DATA INFILE <filename> INTO TABLE <table>
FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘\"’
LINES TERMINATED BY ‘\n’;- –database
short form: -D; type: stringLoad all files into this database.
Overrides the database which is normally specified by the directory in which the
files live. Does not specify a default database for the connection.
- –databases
short form: -d; type: hashRestore only this comma-separated list of databases.
- –dbregex
type: stringRestore only databases whose names match this regex.
- –defaults-file
short form: -F; type: stringOnly read mysql options from the given file. You must give an absolute
pathname.
- –disablekeys
negatable: yesExecute "ALTER TABLE DISABLE KEYS" before each table.
- –fifo
negatable: yes; default: yesStream files into a FIFO for –tab.
Load compressed tab-separated files by piping them into a FIFO and using the
FIFO with "LOAD DATA INFILE", instead of by decompressing the files on disk.
Sets “–umask” to 0.
- –host
short form: -h; type: stringConnect to host.
- –ignore
short form: -iAdds the "IGNORE" modifier to "LOAD DATA INFILE".
- –ignoredb
short form: -g; type: HashIgnore this comma-separated list of databases.
- –ignoretbl
short form: -n; type: HashIgnore this comma-separated list of table names.
Table names may be qualified with the database name.
- –local
short form: -LUses the "LOCAL" option to "LOAD DATA INFILE".
If you enable this option, the files are read locally by the client library, not
by the server.
- –locktables
negatable: yesLock tables before "LOAD DATA INFILE".
- –noautovalon0
negatable: yesSet SQL "NO_AUTO_VALUE_ON_ZERO" before "LOAD DATA INFILE".
- –nobinlog
negatable: yesSet "SQL_LOG_BIN=0" before "LOAD DATA INFILE".
This prevents large loads from being logged to the server’s binary log.
- –noforeignkeys
negatable: yesSet "FOREIGN_KEY_CHECKS=0" before "LOAD DATA INFILE".
- –nouniquechecks
negatable: yesSet "UNIQUE_CHECKS=0" before "LOAD DATA INFILE".
- –numthread
short form: -m; type: intSpecifies the number of parallel processes to run.
The default is 2 (this is mk-parallel-restore after all — 1 is not parallel).
On GNU/Linux machines, the default is the number of times ‘processor’ appears in
/proc/cpuinfo. On Windows, the default is read from the environment. In any
case, the default is at least 2, even when there’s only a single processor.
- –password
short form: -p; type: stringPassword to use when connecting.
- –port
short form: -P; type: intPort number to use for connection.
- –quiet
short form: -qSets “–verbose” to 0.
- –replace
short form: -rAdds the "REPLACE" modifier to "LOAD DATA INFILE".
- –setvars
type: string; default: wait_timeout=10000Set these MySQL variables.
Specify any variables you want to be set immediately after connecting to MySQL.
These will be included in a "SET" command.
- –socket
short form: -S; type: stringSocket file to use for connection.
- –tab
short form: -TLoad tab-separated files with "LOAD DATA INFILE".
This is similar to what "mysqlimport" does, but more flexible.
Enables the following options, unless they are specifically disabled:
“–commit”, “–disablekeys”, “–noautovalon0”, “–nobinlog”,
“–nouniquechecks”, “–noforeignkeys”.
- –tables
short form: -t; type: hashRestore only this comma-separated list of table names.
Table names may be qualified with the database name.
- –tblregex
type: stringRestore only tables whose names match this regex.
- –test
Print commands instead of executing them.
- –truncate
Run "TRUNCATE TABLE" before "LOAD DATA INFILE".This will delete all rows from a table before loading the first tab-delimited
file into it.
- –umask
type: stringSet the program’s "umask" to this octal value.
This is useful when you want created files (such as FIFO files) to be readable
or writable by other users (for example, the MySQL server itself).
- –user
short form: -u; type: stringUser for login if not current user.
- –verbose
short form: -v; cumulative: yes; default: 1Verbosity; can specify multiple times.
Repeatedly specifying it increments the verbosity. Default is 1 if not
specified. See “OUTPUT”.
- –wait
short form: -w; type: time; default: 5mWait limit when server is down.
If the MySQL server crashes during loading, waits until the server comes back
and then continues with the rest of the files. "mk-parallel-restore" will
check the server every second until this time is exhausted, at which point it
will give up and exit.
ENVIRONMENT
The environment variable "MKDEBUG" enables verbose debugging output in all of
the Maatkit tools:
MKDEBUG=1 mk-….
SYSTEM REQUIREMENTS
You need Perl, DBI, DBD::mysql, and some core packages that ought to be
installed in any reasonably new version of Perl.
BUGS
Please use the Sourceforge bug tracker, forums, and mailing lists to request
support or report bugs: <http://sourceforge.net/projects/maatkit/>.
Please include the complete command-line used to reproduce the problem you are
seeing, the version of all MySQL servers involved, the complete output of the
tool when run with “–version”, and if possible, debugging output produced by
running with the "MKDEBUG=1" environment variable.
COPYRIGHT, LICENSE AND WARRANTY
This program is copyright (c) 2007 Baron Schwartz. Feedback and improvements
are welcome.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED
WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue `man perlgpl’ or `man perlartistic’ to read these
licenses.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place, Suite 330, Boston, MA 02111-1307 USA.
AUTHOR
SEE ALSO
VERSION
This manual page documents Ver 1.0.7 Distrib 1972 $Revision: 1970 $.
Index
- NAME
- SYNOPSIS
- DESCRIPTION
- OUTPUT
- SPEED OF PARALLEL LOADING
- DOWNLOADING
- OPTIONS
- ENVIRONMENT
- SYSTEM REQUIREMENTS
- BUGS
- COPYRIGHT, LICENSE AND WARRANTY
- AUTHOR
- SEE ALSO
- VERSION
Related posts:
- man iptables-restore – Man page for iptables-restore
- man ipvsadm-restore – Man page for ipvsadm-restore
- man restore – Man page for restore
- man mysqlimport – Man page for mysqlimport
- man mysqlbinlog – Man page for mysqlbinlog
- man myisam_ftdump – Man page for myisam_ftdump
- man mysqlshow – Man page for mysqlshow
- man mysqlman – Man page for mysqlman
- man lket-b2a – Man page for lket-b2a
- man loadkeys – Man page for loadkeys