Sqoop Case Studies - yingziaiai/SetupEnv GitHub Wiki

Sqoop overview and architecture. For Sqoop's overall architecture, see http://www.jianshu.com/p/dd723351b39e

fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 17:17:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:17:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:17:39 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:17:39 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:17:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:17:39 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:17:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/530781bef1d9cc803f908539ffd7d98b/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:17:43 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/530781bef1d9cc803f908539ffd7d98b/my_user.jar
16/11/04 17:17:43 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 17:17:43 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 17:17:43 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 17:17:43 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 17:17:43 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 17:17:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:17:43 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:17:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:17:45 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:17:47 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 17:17:47 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(id), MAX(id) FROM my_user
16/11/04 17:17:47 INFO mapreduce.JobSubmitter: number of splits:4
16/11/04 17:17:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0016
16/11/04 17:17:48 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0016
16/11/04 17:17:48 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0016/
16/11/04 17:17:48 INFO mapreduce.Job: Running job: job_1477705386971_0016
16/11/04 17:17:59 INFO mapreduce.Job: Job job_1477705386971_0016 running in uber mode : false
16/11/04 17:17:59 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:18:11 INFO mapreduce.Job: map 50% reduce 0%
16/11/04 17:18:22 INFO mapreduce.Job: map 75% reduce 0%
16/11/04 17:18:23 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:18:23 INFO mapreduce.Job: Job job_1477705386971_0016 completed successfully
16/11/04 17:18:23 INFO mapreduce.Job: Counters: 30
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=524000
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=393
    HDFS: Number of bytes written=61
    HDFS: Number of read operations=16
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=8
  Job Counters
    Launched map tasks=4
    Other local map tasks=4
    Total time spent by all maps in occupied slots (ms)=37170
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=37170
    Total vcore-seconds taken by all map tasks=37170
    Total megabyte-seconds taken by all map tasks=38062080
  Map-Reduce Framework
    Map input records=6
    Map output records=6
    Input split bytes=393
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=675
    CPU time spent (ms)=6040
    Physical memory (bytes) snapshot=361738240
    Virtual memory (bytes) snapshot=2659934208
    Total committed heap usage (bytes)=95420416
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=61
16/11/04 17:18:23 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 38.9556 seconds (1.5659 bytes/sec)
16/11/04 17:18:23 INFO mapreduce.ImportJobBase: Retrieved 6 records.

The four generated files (one per map task) are stored on HDFS under /user/fuying/my_user/
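The log reports `BoundingValsQuery: SELECT MIN(id), MAX(id) FROM my_user` followed by `number of splits:4`: the id range is divided among the map tasks, and each task writes its own output file. The following is a minimal sketch of that idea, not Sqoop's actual splitter code. The function name `split_ranges` and the id bounds 1 and 6 are assumptions made for illustration; the real MIN(id) and MAX(id) values do not appear in the log.

```shell
#!/bin/sh
# Simplified illustration only: divide the id range [min, max] into n
# contiguous ranges, one per map task. This is NOT Sqoop's real algorithm.
split_ranges() {
  min="$1"; max="$2"; n="$3"
  total=$((max - min + 1))   # number of id values in the range
  base=$((total / n))        # minimum number of ids per range
  extra=$((total % n))       # the first "extra" ranges get one more id
  lo="$min"
  i=0
  while [ "$i" -lt "$n" ]; do
    size="$base"
    if [ "$i" -lt "$extra" ]; then size=$((base + 1)); fi
    hi=$((lo + size - 1))
    printf 'mapper %d: id %d..%d\n' "$i" "$lo" "$hi"
    lo=$((hi + 1))
    i=$((i + 1))
  done
}

# Hypothetical bounds (1 and 6) with the 4 splits reported in the log.
split_ranges 1 6 4
```

With these assumed bounds the six ids are spread over four ranges of sizes 2, 2, 1 and 1, which matches the picture of four small output files for six records.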

Next, use only one map task and specify --target-dir, without having created that directory manually beforehand:

fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user --num-mappers 1 --target-dir /user/fuying/sqoop/
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 17:27:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:27:16 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:27:17 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:27:17 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:27:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:27:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 17:27:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/2b075a56d632ec44d642a8dfe4b39daa/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:27:20 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/2b075a56d632ec44d642a8dfe4b39daa/my_user.jar
16/11/04 17:27:20 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 17:27:20 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 17:27:20 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 17:27:20 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 17:27:20 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 17:27:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:27:21 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:27:22 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:27:22 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:27:25 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 17:27:25 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 17:27:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0017
16/11/04 17:27:26 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0017
16/11/04 17:27:26 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0017/
16/11/04 17:27:26 INFO mapreduce.Job: Running job: job_1477705386971_0017
16/11/04 17:27:36 INFO mapreduce.Job: Job job_1477705386971_0017 running in uber mode : false
16/11/04 17:27:36 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:27:42 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:27:43 INFO mapreduce.Job: Job job_1477705386971_0017 completed successfully
16/11/04 17:27:43 INFO mapreduce.Job: Counters: 30
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=131152
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=87
    HDFS: Number of bytes written=61
    HDFS: Number of read operations=4
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=1
    Other local map tasks=1
    Total time spent by all maps in occupied slots (ms)=4282
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=4282
    Total vcore-seconds taken by all map tasks=4282
    Total megabyte-seconds taken by all map tasks=4384768
  Map-Reduce Framework
    Map input records=6
    Map output records=6
    Input split bytes=87
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=124
    CPU time spent (ms)=1170
    Physical memory (bytes) snapshot=108969984
    Virtual memory (bytes) snapshot=664985600
    Total committed heap usage (bytes)=23855104
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=61
16/11/04 17:27:43 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 20.9563 seconds (2.9108 bytes/sec)
16/11/04 17:27:43 INFO mapreduce.ImportJobBase: Retrieved 6 records.

The result is saved in /user/fuying/sqoop/
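Sqoop creates the target directory itself, so it does not need to exist before the import. Rerunning the import against a directory that already exists fails unless `--delete-target-dir` is added. The log also warns against passing the password on the command line and suggests `-P`, which prompts for it instead. The sketch below only prints such a command rather than running it; the helper name `build_import_cmd` is hypothetical, and the connection string and user name are copied from the session above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not execute) a single-mapper import
# command that prompts for the password (-P) and clears an existing
# target directory first (--delete-target-dir).
build_import_cmd() {
  table="$1"
  target_dir="$2"
  printf '%s\n' "bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root -P --table $table --num-mappers 1 --target-dir $target_dir --delete-target-dir"
}

build_import_cmd my_user /user/fuying/sqoop/
```

Printing the command first makes it easy to review before running it by hand from the Sqoop installation directory.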

Exporting to MySQL (pay special attention to the field delimiter):

fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop export --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user2 --num-mappers 1 --export-dir /user/fuying/sqoop/
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 17:53:15 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 17:53:15 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 17:53:15 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 17:53:15 INFO tool.CodeGenTool: Beginning code generation
16/11/04 17:53:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user2 AS t LIMIT 1
16/11/04 17:53:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user2 AS t LIMIT 1
16/11/04 17:53:16 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/2ed96820d5282927a1d33d2bc8b94f96/my_user2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 17:53:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/2ed96820d5282927a1d33d2bc8b94f96/my_user2.jar
16/11/04 17:53:19 INFO mapreduce.ExportJobBase: Beginning export of my_user2
16/11/04 17:53:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 17:53:19 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 17:53:20 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 17:53:20 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 17:53:23 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 17:53:23 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 17:53:23 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 17:53:23 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 17:53:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0020
16/11/04 17:53:24 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0020
16/11/04 17:53:24 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0020/
16/11/04 17:53:24 INFO mapreduce.Job: Running job: job_1477705386971_0020
16/11/04 17:53:35 INFO mapreduce.Job: Job job_1477705386971_0020 running in uber mode : false
16/11/04 17:53:35 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 17:54:05 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 17:54:05 INFO mapreduce.Job: Job job_1477705386971_0020 completed successfully
16/11/04 17:54:05 INFO mapreduce.Job: Counters: 30
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=130808
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=195
    HDFS: Number of bytes written=0
    HDFS: Number of read operations=4
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=0
  Job Counters
    Launched map tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=27336
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=27336
    Total vcore-seconds taken by all map tasks=27336
    Total megabyte-seconds taken by all map tasks=27992064
  Map-Reduce Framework
    Map input records=6
    Map output records=6
    Input split bytes=131
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=837
    CPU time spent (ms)=1950
    Physical memory (bytes) snapshot=88842240
    Virtual memory (bytes) snapshot=662880256
    Total committed heap usage (bytes)=23724032
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=0
16/11/04 17:54:05 INFO mapreduce.ExportJobBase: Transferred 195 bytes in 44.7889 seconds (4.3538 bytes/sec)
16/11/04 17:54:05 INFO mapreduce.ExportJobBase: Exported 6 records.
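This export needs no delimiter option because the files in /user/fuying/sqoop/ were written with Sqoop's default field delimiter, a comma, which is also what the export expects by default. When the files use a different delimiter, the export has to be told through `--input-fields-terminated-by`, otherwise each line is read as a single field. The sketch below shows the effect with plain `awk`, without Sqoop. The helper `count_fields` and the sample row are hypothetical, since the actual columns of my_user are not shown in the logs.

```shell
#!/bin/sh
# Count the fields in one line of text when it is split on a given delimiter.
count_fields() {
  printf '%s\n' "$1" | awk -F"$2" '{ print NF }'
}

tab=$(printf '\t')
# Hypothetical tab-delimited row; the real columns of my_user are unknown.
row="1${tab}alice${tab}secret"

count_fields "$row" "$tab"   # split on the delimiter the file was written with: prints 3
count_fields "$row" ","      # split on the default comma: prints 1
```

Seeing one field where three columns are expected is the same mismatch that makes an export fail when the delimiter is not specified correctly.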

Importing from MySQL into Hive:

fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop import --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user --num-mappers 1 --fields-terminated-by '\t' --target-dir /user/fuying/sqoop/input/ --delete-target-dir --hive-database hivetest --hive-import --hive-table h_myuser
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 18:11:52 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 18:11:52 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 18:11:52 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 18:11:52 INFO tool.CodeGenTool: Beginning code generation
16/11/04 18:11:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:11:52 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:11:52 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/42543b6bb7687365b773417a2677f5c5/my_user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 18:11:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/42543b6bb7687365b773417a2677f5c5/my_user.jar
16/11/04 18:11:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 18:11:56 INFO tool.ImportTool: Destination directory /user/fuying/sqoop/input deleted.
16/11/04 18:11:56 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/11/04 18:11:56 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/11/04 18:11:56 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/11/04 18:11:56 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/11/04 18:11:56 INFO mapreduce.ImportJobBase: Beginning import of my_user
16/11/04 18:11:56 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 18:11:56 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 18:11:56 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 18:11:58 INFO db.DBInputFormat: Using read commited transaction isolation
16/11/04 18:11:58 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 18:11:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0022
16/11/04 18:11:59 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0022
16/11/04 18:11:59 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0022/
16/11/04 18:11:59 INFO mapreduce.Job: Running job: job_1477705386971_0022
16/11/04 18:12:10 INFO mapreduce.Job: Job job_1477705386971_0022 running in uber mode : false
16/11/04 18:12:10 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 18:12:19 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 18:12:20 INFO mapreduce.Job: Job job_1477705386971_0022 completed successfully
16/11/04 18:12:20 INFO mapreduce.Job: Counters: 30
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=131442
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=87
    HDFS: Number of bytes written=61
    HDFS: Number of read operations=4
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=1
    Other local map tasks=1
    Total time spent by all maps in occupied slots (ms)=7307
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=7307
    Total vcore-seconds taken by all map tasks=7307
    Total megabyte-seconds taken by all map tasks=7482368
  Map-Reduce Framework
    Map input records=6
    Map output records=6
    Input split bytes=87
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=129
    CPU time spent (ms)=1450
    Physical memory (bytes) snapshot=96002048
    Virtual memory (bytes) snapshot=678191104
    Total committed heap usage (bytes)=23855104
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=61
16/11/04 18:12:20 INFO mapreduce.ImportJobBase: Transferred 61 bytes in 24.2529 seconds (2.5152 bytes/sec)
16/11/04 18:12:20 INFO mapreduce.ImportJobBase: Retrieved 6 records.
16/11/04 18:12:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user AS t LIMIT 1
16/11/04 18:12:20 INFO hive.HiveImport: Loading uploaded data into Hive
16/11/04 18:12:24 INFO hive.HiveImport:
16/11/04 18:12:24 INFO hive.HiveImport: Logging initialized using configuration in file:/opt/BIGDATA/hive-0.13.1-cdh5.3.6/conf/hive-log4j.properties
16/11/04 18:12:31 INFO hive.HiveImport: OK
16/11/04 18:12:31 INFO hive.HiveImport: Time taken: 1.069 seconds
16/11/04 18:12:32 INFO hive.HiveImport: Loading data to table hivetest.h_myuser
16/11/04 18:12:32 INFO hive.HiveImport: Table hivetest.h_myuser stats: [numFiles=1, numRows=0, totalSize=61, rawDataSize=0]
16/11/04 18:12:32 INFO hive.HiveImport: OK
16/11/04 18:12:32 INFO hive.HiveImport: Time taken: 1.127 seconds
16/11/04 18:12:33 INFO hive.HiveImport: Hive import complete.
16/11/04 18:12:33 INFO hive.HiveImport: Export directory is not empty, keeping it.

The Hive table's data ends up in /user/hive/warehouse/hivetest.db/h_myuser

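As the log shows, the import works in two steps: the data is first imported into the specified target directory, and the data files are then moved into the directory of the corresponding Hive table.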
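The two steps can be imitated on a local file system to make the data flow concrete. This is only a simulation under stated assumptions: it uses a temporary local directory instead of HDFS, the row content is made up, and the file name part-m-00000 is the usual name of the first map task's output rather than something shown in the log above.

```shell
#!/bin/sh
# Local simulation of the two-step Hive import; no Hadoop or Hive involved.
demo_root=$(mktemp -d)
staging="$demo_root/user/fuying/sqoop/input"                      # plays the role of --target-dir
table_dir="$demo_root/user/hive/warehouse/hivetest.db/h_myuser"   # plays the role of the Hive table directory
mkdir -p "$staging" "$table_dir"

# Step 1: the map task writes its output into the target directory.
# The row below is hypothetical, tab-delimited sample data.
printf '1\talice\n' > "$staging/part-m-00000"

# Step 2: the data file is moved (not copied) into the table directory.
mv "$staging/part-m-00000" "$table_dir/"

ls "$table_dir"
```

After step 2 the data file exists only under the table directory, which is why a later export reads from /user/hive/warehouse/hivetest.db/h_myuser rather than from the original target directory.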

Exporting data from Hive to MySQL:

fuying@ubuntu2:/opt/BIGDATA/sqoop-1.4.5-cdh5.3.6$ bin/sqoop export --connect jdbc:mysql://ubuntu2:3306/test --username root --password woshiwo --table my_user3 --num-mappers 1 --input-fields-terminated-by '\t' --export-dir /user/hive/warehouse/hivetest.db/h_myuser
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/BIGDATA/sqoop-1.4.5-cdh5.3.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/11/04 18:25:12 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.6
16/11/04 18:25:12 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/11/04 18:25:12 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/11/04 18:25:12 INFO tool.CodeGenTool: Beginning code generation
16/11/04 18:25:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user3 AS t LIMIT 1
16/11/04 18:25:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM my_user3 AS t LIMIT 1
16/11/04 18:25:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/BIGDATA/hadoop-2.5.0-cdh5.3.6
Note: /tmp/sqoop-fuying/compile/5109a46cf4abc968e119aef5bf033b85/my_user3.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/11/04 18:25:16 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-fuying/compile/5109a46cf4abc968e119aef5bf033b85/my_user3.jar
16/11/04 18:25:16 INFO mapreduce.ExportJobBase: Beginning export of my_user3
16/11/04 18:25:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/04 18:25:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 18:25:19 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/04 18:25:19 INFO client.RMProxy: Connecting to ResourceManager at ubuntu2/10.211.55.12:8032
16/11/04 18:25:21 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 18:25:21 INFO input.FileInputFormat: Total input paths to process : 1
16/11/04 18:25:22 INFO mapreduce.JobSubmitter: number of splits:1
16/11/04 18:25:22 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
16/11/04 18:25:22 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477705386971_0023
16/11/04 18:25:23 INFO impl.YarnClientImpl: Submitted application application_1477705386971_0023
16/11/04 18:25:23 INFO mapreduce.Job: The url to track the job: http://ubuntu2:8088/proxy/application_1477705386971_0023/
16/11/04 18:25:23 INFO mapreduce.Job: Running job: job_1477705386971_0023
16/11/04 18:25:33 INFO mapreduce.Job: Job job_1477705386971_0023 running in uber mode : false
16/11/04 18:25:33 INFO mapreduce.Job: map 0% reduce 0%
16/11/04 18:25:40 INFO mapreduce.Job: map 100% reduce 0%
16/11/04 18:25:41 INFO mapreduce.Job: Job job_1477705386971_0023 completed successfully
16/11/04 18:25:41 INFO mapreduce.Job: Counters: 30
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=130853
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=218
    HDFS: Number of bytes written=0
    HDFS: Number of read operations=4
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=0
  Job Counters
    Launched map tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=5340
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=5340
    Total vcore-seconds taken by all map tasks=5340
    Total megabyte-seconds taken by all map tasks=5468160
  Map-Reduce Framework
    Map input records=6
    Map output records=6
    Input split bytes=154
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=69
    CPU time spent (ms)=1020
    Physical memory (bytes) snapshot=107024384
    Virtual memory (bytes) snapshot=662876160
    Total committed heap usage (bytes)=23724032
  File Input Format Counters
    Bytes Read=0
  File Output Format Counters
    Bytes Written=0
16/11/04 18:25:41 INFO mapreduce.ExportJobBase: Transferred 218 bytes in 22.6979 seconds (9.6044 bytes/sec)
16/11/04 18:25:41 INFO mapreduce.ExportJobBase: Exported 6 records.