昨天接到一个任务,将一个表从mysql迁移到oracle,表总共有300多万条记录。
解决思路;先将数据从mysql导出一个文本文件,然后传送的目标服务器,进而用sqlldr将其加载至oracle
第一步: 导出文件
Mysql 导出文件
在此省去表结构
mysql> select * from justin into outfile '/data/mysqldata/justin.txt' fields terminated by ',' enclosed by '"';
第二步
将其传输到目标服务器
[rkun@localhost mysqldata]$ scp justin.txt oracle@192.168.0.1:
oracle@192.168.0.1's password:
justin.txt
第三步
目标服务器上建立control.ctl
load data
characterset UTF8
infile '/home/oracle/justin
new.txt'
append into table justin
fields terminated by ',' OPTIONALLY ENCLOSED BY '"'
(
"ID" sequence(MAX,1),
**********
**********
)
注:我们在此采用sqlldr直接路径加载,必须在控制文件指定加载字符集,使之与目标数据库保持一致,否则会默认采用OS端的字符集,中文字符会产生乱码
另外,mysql中id对应值全为0,而对应的oracle表该字段为主键,为防止主键冲突,使用sqlldr的sequence功能生成主键,sequence(max,1)插入前找出表中id最大值并自动将当前值加1;
同时此功能要求在文本文件中去掉对应的id列
该表总共有22列 使用cut命令 去除第一列
[oracle@justin ~]$ more justin.txt | cut -d ',' -f 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 > justinnew2.txt
另外mysql导出文本文件的时候,当字段为空值的时候,其对应的字符为\N, 当使用sqlldr加载至Oracle,会报错
Record 2: Rejected - Error on table justin, column "justin11".
ORA-01722: invalid number
采用tr命令来消除,将\N用空格来代替
[oracle@justin ~]$ more justinnew2.txt | tr '\\N' ' ' > justinnew.txt
接下来调用sqlldr命令
[oracle@justin ~]$ $ORACLE_HOME/bin/./sqlldr userid=justin/****** control=control.ctl readsize=500000 rows=2000 direct=true log=justin.log
采用sqlldr加载, 报错如下
Record 5957: Rejected - Error on table justin, column "LOG_LEVEL".
Column not found before end of logical record (use TRAILING NULLCOLS)
日志显示从第5957行开始,找不到最后一列"LOG_LEVEL"的值了
查看一下发现 经过os命令转变后 原文件的相应行的最后一列“-1”被切掉了
[oracle@justin ~]$ head -5960 justin.txt | tail -10
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,"805688",\N," 1","-1"\N,\N,"5579643",\N,"1","-1"
\N,\N,"6092314",\N,"1","-1"
\N,\N,"788776",\N,"1","-1"
[oracle@justin ~]$ head -5960 justinnew.txt | tail -10
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, ,"805688", ,"1" —-只有了1,后面的”-1”没有了
, ,"5579643", ,"1"
, ,"6092314", ,"1"
, ,"788776", ,"1"
经确认,问题出在cut命令上,我本意是想把第一列去掉,没想到从5957行开始,把最后一列也给切了,导致报错Column not found before end of logical record (use TRAILING NULLCOLS)
反复验证后发现了问题,当调用cut命令时候,采用','作为分隔符,但是中间有个字段值里面已经有了',',类似"asdgasg,asdf",故os把它当作了两个不同的列来处理。
解决方法: 调用cut时候后面多加一个数字23即可
注: 我们的sqlldr的控制文件也是采用','作为分隔符,但是由于加上了fields terminated by ',' OPTIONALLY ENCLOSED BY '"',故不会像cut命令一样把此字段当成两个来处理
重新加载 发现又报错 但是这次已经加载了50多万行 比上次有进步了
先把能够加载的全都加载进去 设置errors参数为10000
[oracle@justin ~]$ $ORACLE_HOME/bin/./sqlldr userid=justin/******* control=control.ctl readsize=500000 rows=2000 direct=true log=justin.log errors=10000
结果显示才95条遗漏的
Table justin
:
3006277 Rows successfully loaded.
95 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
考虑到是日志跟踪表 这个错误率还是可以接受的
[ 本帖最后由 myownstars 于 2011-2-25 17:51 编辑 ]
解决思路;先将数据从mysql导出一个文本文件,然后传送的目标服务器,进而用sqlldr将其加载至oracle
第一步: 导出文件
Mysql 导出文件
在此省去表结构
mysql> select * from justin into outfile '/data/mysqldata/justin.txt' fields terminated by ',' enclosed by '"';
第二步
将其传输到目标服务器
[rkun@localhost mysqldata]$ scp justin.txt oracle@192.168.0.1:
oracle@192.168.0.1's password:
justin.txt
第三步
目标服务器上建立control.ctl
load data
characterset UTF8
infile '/home/oracle/justin
new.txt'
append into table justin
fields terminated by ',' OPTIONALLY ENCLOSED BY '"'
(
"ID" sequence(MAX,1),
**********
**********
)
注:我们在此采用sqlldr直接路径加载,必须在控制文件指定加载字符集,使之与目标数据库保持一致,否则会默认采用OS端的字符集,中文字符会产生乱码
另外,mysql中id对应值全为0,而对应的oracle表该字段为主键,为防止主键冲突,使用sqlldr的sequence功能生成主键,sequence(max,1)插入前找出表中id最大值并自动将当前值加1;
同时此功能要求在文本文件中去掉对应的id列
该表总共有22列 使用cut命令 去除第一列
[oracle@justin ~]$ more justin.txt | cut -d ',' -f 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22 > justinnew2.txt
另外mysql导出文本文件的时候,当字段为空值的时候,其对应的字符为\N, 当使用sqlldr加载至Oracle,会报错
Record 2: Rejected - Error on table justin, column "justin11".
ORA-01722: invalid number
采用tr命令来消除,将\N用空格来代替
[oracle@justin ~]$ more justinnew2.txt | tr '\\N' ' ' > justinnew.txt
接下来调用sqlldr命令
[oracle@justin ~]$ $ORACLE_HOME/bin/./sqlldr userid=justin/****** control=control.ctl readsize=500000 rows=2000 direct=true log=justin.log
采用sqlldr加载, 报错如下
Record 5957: Rejected - Error on table justin, column "LOG_LEVEL".
Column not found before end of logical record (use TRAILING NULLCOLS)
日志显示从第5957行开始,找不到最后一列"LOG_LEVEL"的值了
查看一下发现 经过os命令转变后 原文件的相应行的最后一列“-1”被切掉了
[oracle@justin ~]$ head -5960 justin.txt | tail -10
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,\N,\N,"1","0"
\N,\N,"805688",\N," 1","-1"\N,\N,"5579643",\N,"1","-1"
\N,\N,"6092314",\N,"1","-1"
\N,\N,"788776",\N,"1","-1"
[oracle@justin ~]$ head -5960 justinnew.txt | tail -10
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, , ,"1","0"
, ,"805688", ,"1" —-只有了1,后面的”-1”没有了
, ,"5579643", ,"1"
, ,"6092314", ,"1"
, ,"788776", ,"1"
经确认,问题出在cut命令上,我本意是想把第一列去掉,没想到从5957行开始,把最后一列也给切了,导致报错Column not found before end of logical record (use TRAILING NULLCOLS)
反复验证后发现了问题,当调用cut命令时候,采用','作为分隔符,但是中间有个字段值里面已经有了',',类似"asdgasg,asdf",故os把它当作了两个不同的列来处理。
解决方法: 调用cut时候后面多加一个数字23即可
注: 我们的sqlldr的控制文件也是采用','作为分隔符,但是由于加上了fields terminated by ',' OPTIONALLY ENCLOSED BY '"',故不会像cut命令一样把此字段当成两个来处理
重新加载 发现又报错 但是这次已经加载了50多万行 比上次有进步了
先把能够加载的全都加载进去 设置errors参数为10000
[oracle@justin ~]$ $ORACLE_HOME/bin/./sqlldr userid=justin/******* control=control.ctl readsize=500000 rows=2000 direct=true log=justin.log errors=10000
结果显示才95条遗漏的
Table justin
:
3006277 Rows successfully loaded.
95 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
考虑到是日志跟踪表 这个错误率还是可以接受的
[ 本帖最后由 myownstars 于 2011-2-25 17:51 编辑 ]
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/15480802/viewspace-688359/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/15480802/viewspace-688359/

754

被折叠的 条评论
为什么被折叠?



