Sqoop case studies - yingziaiai/SetupEnv GitHub Wiki
For an overview of Sqoop's overall architecture, see http://www.jianshu.com/p/dd723351b39e

Importing a MySQL table into HDFS with the default settings (Sqoop uses four mappers unless told otherwise):
fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 17:17:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:17:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:17:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:17:39 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:17:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:17:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:17:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/530781bef1d9cc803f908539ffd7d98b/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:17:43 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/530781bef1d9cc803f908539ffd7d98b/my_user.jar
16/11/04 17:17:43 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 17:17:43 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 17:17:43 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 17:17:43 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 17:17:43 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 17:17:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:17:43 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:17:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:17:45 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:17:47 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 17:17:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(id), MAX(id) FROM my_user
16/11/04 17:17:47 INFO mapreduce.JobSubmitter: number of splits:4
16/11/04 17:17:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0016
16/11/04 17:17:48 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0016
16/11/04 17:17:48 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0016/
16/11/04 17:17:48 INFO mapreduce.Job: Running job: job_1477705386971_0016
16/11/04 17:17:59 INFO mapreduce.Job: Job job_1477705386971_0016 running in uber mode : false
16/11/04 17:17:59 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:18:11 INFO mapreduce.Job: map 50% reduce 0%
16/11/04 17:18:22 INFO mapreduce.Job: map 75% reduce 0%
16/11/04 17:18:23 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:18:23 INFO mapreduce.Job: Job job_1477705386971_0016 completed successfully
16/11/04 17:18:23 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=524000
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=393
HDFS: Number of bytes written=61
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Other local map tasks=4
Total time spent by all maps in occupied slots (ms)=37170
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=37170
Total vcore-seconds taken by all map tasks=37170
Total megabyte-seconds taken by all map tasks=38062080
Map-Reduce Framework
Map input records=6
Map output records=6
Input split bytes=393
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=675
CPU time spent (ms)=6040
Physical memory (bytes) snapshot=361738240
Virtual memory (bytes) snapshot=2659934208
Total committed heap usage (bytes)=95420416
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=61
16/11/04 17:18:23 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 38.9556 seconds (1.5659 bytes/sec)
16/11/04 17:18:23 INFO mapreduce.ImportJobBase: Retrieved 6 records.
The four generated part files are written to /user/fuying/my_user/ on HDFS, one per map task.
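Those four files correspond one-to-one to the splits Sqoop derived from the BoundingValsQuery above. A rough sketch of that range arithmetic in plain shell (simplified for illustration; Sqoop's actual IntegerSplitter rounds the boundaries to whole ids):

```shell
# Divide [MIN(id), MAX(id)] = [1, 6] into 4 ranges, one per mapper
awk 'BEGIN {
  lo = 1; hi = 6; n = 4;      # from: SELECT MIN(id), MAX(id) FROM my_user
  step = (hi - lo) / n;       # width of each split
  for (i = 0; i < n; i++)
    printf "mapper %d: id >= %.2f AND id < %.2f\n", i, lo + i * step, lo + (i + 1) * step;
}'
```

With only 6 rows, the splits are uneven (one mapper can get two rows while another gets one), which is why the progress log above jumps 0% → 50% → 75% → 100% rather than in equal steps.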
The same import with a single mapper (--num-mappers 1) and an explicit --target-dir; the target directory did not need to be created by hand beforehand:
fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user --num-mappers 1 --target-dir /user/fuying/sqoop/
16/11/04 17:27:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:27:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:27:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:27:17 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:27:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:27:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:27:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/2b075a56d632ec44d642a8dfe4b39daa/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:27:20 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/2b075a56d632ec44d642a8dfe4b39daa/my_user.jar
16/11/04 17:27:20 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 17:27:20 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 17:27:20 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 17:27:20 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 17:27:20 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 17:27:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:27:21 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:27:22 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:27:22 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:27:25 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 17:27:25 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 17:27:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0017
16/11/04 17:27:26 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0017
16/11/04 17:27:26 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0017/
16/11/04 17:27:26 INFO mapreduce.Job: Running job: job_1477705386971_0017
16/11/04 17:27:36 INFO mapreduce.Job: Job job_1477705386971_0017 running in uber mode : false
16/11/04 17:27:36 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:27:42 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:27:43 INFO mapreduce.Job: Job job_1477705386971_0017 completed successfully
16/11/04 17:27:43 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=131152
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=61
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=4282
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=4282
Total vcore-seconds taken by all map tasks=4282
Total megabyte-seconds taken by all map tasks=4384768
Map-Reduce Framework
Map input records=6
Map output records=6
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=124
CPU time spent (ms)=1170
Physical memory (bytes) snapshot=108969984
Virtual memory (bytes) snapshot=664985600
Total committed heap usage (bytes)=23855104
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=61
16/11/04 17:27:43 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 20.9563 seconds (2.9108 bytes/sec)
16/11/04 17:27:43 INFO mapreduce.ImportJobBase: Retrieved 6 records.
The result is saved under /user/fuying/sqoop/ as a single part file.
Exporting back to MySQL. Pay special attention to the field delimiter: sqoop export parses the HDFS files as comma-delimited by default, which here matches what the import above wrote.
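The delimiter point is worth seeing concretely before running the export. By default Sqoop writes comma-separated lines on import, and sqoop export also assumes commas unless told otherwise with --input-fields-terminated-by. A local simulation in plain shell (no Sqoop needed; the file name and rows are made up for illustration):

```shell
# A part file as the import above would write it: comma-delimited rows
printf '1,zhangsan\n2,lisi\n' > part-m-00000

# Parsing with the matching delimiter recovers both columns per row:
awk -F',' '{print NF}' part-m-00000     # 2 fields on every line

# Parsing the same file as tab-delimited sees one big field per line,
# which is how a delimiter mismatch breaks an export:
awk -F'\t' '{print NF}' part-m-00000    # 1 field on every line
```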
fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop export --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user2 --num-mappers 1 --export-dir /user/fuying/sqoop/
16/11/04 17:53:15 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:53:15 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:53:15 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:53:15 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:53:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user2 AS t LIMIT 1
16/11/04 17:53:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user2 AS t LIMIT 1
16/11/04 17:53:16 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/2ed96820d5282927a1d33d2bc8b94f96/my_user2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:53:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/2ed96820d5282927a1d33d2bc8b94f96/my_user2.jar
16/11/04 17:53:19 INFO mapreduce.ExportJobBase: Beginning export of my_user2
16/11/04 17:53:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:53:19 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:53:20 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:53:23 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 17:53:23 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 17:53:23 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 17:53:23 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 17:53:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0020
16/11/04 17:53:24 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0020
16/11/04 17:53:24 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0020/
16/11/04 17:53:24 INFO mapreduce.Job: Running job: job_1477705386971_0020
16/11/04 17:53:35 INFO mapreduce.Job: Job job_1477705386971_0020 running in uber mode : false
16/11/04 17:53:35 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:54:05 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:54:05 INFO mapreduce.Job: Job job_1477705386971_0020 completed successfully
16/11/04 17:54:05 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=130808
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=195
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=27336
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=27336
Total vcore-seconds taken by all map tasks=27336
Total megabyte-seconds taken by all map tasks=27992064
Map-Reduce Framework
Map input records=6
Map output records=6
Input split bytes=131
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=837
CPU time spent (ms)=1950
Physical memory (bytes) snapshot=88842240
Virtual memory (bytes) snapshot=662880256
Total committed heap usage (bytes)=23724032
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
16/11/04 17:54:05 INFO mapreduce.ExportJobBase: Transferred 195 bytes in 44.7889 seconds (4.3538 bytes/sec)
16/11/04 17:54:05 INFO mapreduce.ExportJobBase: Exported 6 records.
Importing from MySQL into Hive:
fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user --num-mappers 1 --fields-terminated-by '\t' --target-dir /user/fuying/sqoop/input/ --delete-target-dir --hive-database hivetest --hive-import --hive-table h_myuser
16/11/04 18:11:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 18:11:52 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 18:11:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 18:11:52 INFO tool.CodeGenTool: Beginning code generation
16/11/04 18:11:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:11:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:11:52 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/42543b6bb7687365b773417a2677f5c5/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 18:11:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/42543b6bb7687365b773417a2677f5c5/my_user.jar
16/11/04 18:11:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 18:11:56 INFO tool.ImportTool: Destination directory /user/fuying/sqoop/input deleted.
16/11/04 18:11:56 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 18:11:56 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 18:11:56 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 18:11:56 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 18:11:56 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 18:11:56 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 18:11:56 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 18:11:56 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 18:11:58 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 18:11:58 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 18:11:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0022
16/11/04 18:11:59 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0022
16/11/04 18:11:59 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0022/
16/11/04 18:11:59 INFO mapreduce.Job: Running job: job_1477705386971_0022
16/11/04 18:12:10 INFO mapreduce.Job: Job job_1477705386971_0022 running in uber mode : false
16/11/04 18:12:10 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 18:12:19 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 18:12:20 INFO mapreduce.Job: Job job_1477705386971_0022 completed successfully
16/11/04 18:12:20 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=131442
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=61
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=7307
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=7307
Total vcore-seconds taken by all map tasks=7307
Total megabyte-seconds taken by all map tasks=7482368
Map-Reduce Framework
Map input records=6
Map output records=6
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=129
CPU time spent (ms)=1450
Physical memory (bytes) snapshot=96002048
Virtual memory (bytes) snapshot=678191104
Total committed heap usage (bytes)=23855104
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=61
16/11/04 18:12:20 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 24.2529 seconds (2.5152 bytes/sec)
16/11/04 18:12:20 INFO mapreduce.ImportJobBase: Retrieved 6 records.
16/11/04 18:12:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:12:20 INFO hive.HiveImport: Loading uploaded data into Hive
16/11/04 18:12:24 INFO hive.HiveImport:
16/11/04 18:12:24 INFO hive.HiveImport: Logging initialized using configuration in file:/opt/BIGDATA/hive-0.13.1-cdh5.3.6/conf/hive-log4j.properties
16/11/04 18:12:31 INFO hive.HiveImport: OK
16/11/04 18:12:31 INFO hive.HiveImport: Time taken: 1.069 seconds
16/11/04 18:12:32 INFO hive.HiveImport: Loading data to table hivetest.h_myuser
16/11/04 18:12:32 INFO hive.HiveImport: Table hivetest.h_myuser stats: [numFiles=1, numRows=0, totalSize=61, rawDataSize=0]
16/11/04 18:12:32 INFO hive.HiveImport: OK
16/11/04 18:12:32 INFO hive.HiveImport: Time taken: 1.127 seconds
16/11/04 18:12:33 INFO hive.HiveImport: Hive import complete.
16/11/04 18:12:33 INFO hive.HiveImport: Export directory is not empty, keeping it.
/user/hive/warehouse/hivetest.db/h_myuser
This shows the underlying mechanism: the data is first imported into the directory given by --target-dir, and the resulting file is then moved into the table's directory under the Hive warehouse.
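That two-phase behavior (stage in --target-dir, then move into the warehouse directory) can be mimicked with ordinary local directories; the paths below are stand-ins for the HDFS ones:

```shell
# Phase 1: the MapReduce import lands in the staging --target-dir
mkdir -p sqoop/input warehouse/hivetest.db/h_myuser
printf '1\tzhangsan\n' > sqoop/input/part-m-00000

# Phase 2: Hive LOAD DATA INPATH moves the data file into the table
# directory (marker files such as _SUCCESS can stay behind, hence the
# "Export directory is not empty, keeping it" message in the log)
mv sqoop/input/part-m-00000 warehouse/hivetest.db/h_myuser/

ls warehouse/hivetest.db/h_myuser       # part-m-00000 now lives here
```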
Exporting data from Hive to MySQL (the Hive table's files are tab-delimited, so the export must specify --input-fields-terminated-by '\t'):
fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop export --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user3 --num-mappers 1 --input-fields-terminated-by '\t' --export-dir /user/hive/warehouse/hivetest.db/h_myuser
16/11/04 18:25:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 18:25:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 18:25:12 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 18:25:12 INFO tool.CodeGenTool: Beginning code generation
16/11/04 18:25:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user3 AS t LIMIT 1
16/11/04 18:25:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user3 AS t LIMIT 1
16/11/04 18:25:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/5109a46cf4abc968e119aef5bf033b85/my_user3.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 18:25:16 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/5109a46cf4abc968e119aef5bf033b85/my_user3.jar
16/11/04 18:25:16 INFO mapreduce.ExportJobBase: Beginning export of my_user3
16/11/04 18:25:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 18:25:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 18:25:19 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 18:25:21 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 18:25:21 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 18:25:22 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 18:25:22 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 18:25:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0023
16/11/04 18:25:23 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0023
16/11/04 18:25:23 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0023/
16/11/04 18:25:23 INFO mapreduce.Job: Running job: job_1477705386971_0023
16/11/04 18:25:33 INFO mapreduce.Job: Job job_1477705386971_0023 running in uber mode : false
16/11/04 18:25:33 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 18:25:40 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 18:25:41 INFO mapreduce.Job: Job job_1477705386971_0023 completed successfully
16/11/04 18:25:41 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=130853
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=218
HDFS: Number of bytes written=0
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=5340
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=5340
Total vcore-seconds taken by all map tasks=5340
Total megabyte-seconds taken by all map tasks=5468160
Map-Reduce Framework
Map input records=6
Map output records=6
Input split bytes=154
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=69
CPU time spent (ms)=1020
Physical memory (bytes) snapshot=107024384
Virtual memory (bytes) snapshot=662876160
Total committed heap usage (bytes)=23724032
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
16/11/04 18:25:41 INFO mapreduce.ExportJobBase: Transferred 218 bytes in 22.6979 seconds (9.6044 bytes/sec)
16/11/04 18:25:41 INFO mapreduce.ExportJobBase: Exported 6 records.
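This last export succeeds because --input-fields-terminated-by '\t' matches the tab delimiter the Hive import wrote. The same check in plain shell (hypothetical row data and file name):

```shell
# A row as the Hive import stored it: tab-delimited
printf '6\twangwu\n' > h_myuser_part

# Splitting on tab recovers both columns, so each line can be mapped
# back onto the target table's (id, name) columns during the export
awk -F'\t' '{print $1 "|" $2}' h_myuser_part    # 6|wangwu

rm h_myuser_part
```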