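A few days ago a project required converting file encodings. The conversion tools I found online produced garbled Chinese characters, so I wrote a small program to do the conversion myself.
Since this project only ever contains ANSI and UTF-8 files, it is enough to check whether a file is UTF-8.
The encoding-detection part is adapted from:
http://blog.163.com/wf_shunqiziran/blog/static/176307209201258102217810/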
package test;
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
public class AnsiToUtf8 {
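    /**
     * Changes a file's encoding by reading it as ANSI and writing it back as UTF-8.
     * Line endings are normalized to CRLF and the output always ends with a line break.
     * @param path path of the file to convert in place
     * @throws IOException if the file cannot be read or written
     */
    public static void ansiToUTF8(String path) throws IOException{
        // GBK is a superset of GB2312, so it also decodes characters that GB2312 lacks.
        StringBuilder all = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(path), "GBK"))) {
            String line;
            while((line = br.readLine()) != null){
                all.append(line).append("\r\n");
            }
        }
        // The reader is closed before the same path is reopened for writing.
        try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(path), "UTF-8"))) {
            out.write(all.toString());
        }
    }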
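    /**
     * Checks heuristically whether a file is UTF-8 encoded.
     * A pure ASCII file, or one whose first non-ASCII character is a
     * four-byte sequence, is reported as not UTF-8.
     * @param path path of the file to check
     * @return true if the file starts with a UTF-8 BOM or its first non-ASCII
     *         bytes form valid UTF-8 up to and including a three-byte sequence
     * @throws IOException if the file cannot be read
     */
    public static boolean isUTF8(String path) throws IOException{
        try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(path))) {
            byte[] b = new byte[3];
            // The read limit covers the 3 bytes read below, so reset() is guaranteed to work.
            bis.mark(3);
            bis.read(b, 0, 3);
            if(b[0] == (byte)0xEF && b[1] == (byte)0xBB && b[2] == (byte)0xBF){
                return true; // UTF-8 byte order mark
            }
            bis.reset();
            int read;
            while((read = bis.read()) != -1){
                if (read >= 0xF0)
                    break;
                if (0x80 <= read && read <= 0xBF) // a lone byte in 0x80-0xBF is also treated as GBK
                    break;
                if (0xC0 <= read && read <= 0xDF) {
                    read = bis.read();
                    if (0x80 <= read && read <= 0xBF) // two-byte sequence (lead byte 0xC0-0xDF)
                        continue;
                    else
                        break;
                } else if (0xE0 <= read && read <= 0xEF) { // can still be wrong, but the odds are low
                    read = bis.read();
                    if (0x80 <= read && read <= 0xBF) {
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF) {
                            return true; // one valid three-byte sequence is taken as proof
                        } else
                            break;
                    } else
                        break;
                }
            }
        }
        return false;
    }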
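    /**
     * Recursively walks a folder and converts every .java file that is not already UTF-8.
     * @param dir the folder to traverse
     * @throws IOException if a file cannot be read or written
     */
    public static void traverseFolder(File dir) throws IOException{
        File[] files = dir.listFiles();
        if (files == null) { // dir is not a directory, or it could not be listed
            return;
        }
        for (File file : files) {
            // System.out.println(file.getName());
            if(file.isDirectory()){
                traverseFolder(file);
            }
            else if(file.getName().endsWith(".java") && !isUTF8(file.getAbsolutePath())){
                ansiToUTF8(file.getAbsolutePath());
            }
        }
    }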
    public static void main(String[] args){
        File dir = new File("d:\\test");
        try {
            traverseFolder(dir);
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
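The isUTF8 check above is a heuristic that stops at the first three-byte sequence it finds. As a possible stricter alternative, the sketch below validates the whole content with the JDK's own UTF-8 decoder. The class name StrictUtf8Check and the method isValidUtf8 are hypothetical names chosen for illustration and are not part of the original program. Unlike isUTF8, this version also reports pure ASCII content as valid UTF-8, and a short GBK text can still pass by coincidence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original program.
public class StrictUtf8Check {

    /** Returns true only if the whole byte array decodes as well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes the decoder throw instead of silently
                    // substituting the replacement character.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+4E2D U+6587) encoded as UTF-8.
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // The same two characters as GBK bytes.
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};
        System.out.println(isValidUtf8(utf8)); // true
        System.out.println(isValidUtf8(gbk));  // false
    }
}
```

To use it on a file, the bytes could be loaded with java.nio.file.Files.readAllBytes, which is reasonable for source files but reads the whole file into memory.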
This article presents a small program that converts ANSI-encoded files to UTF-8. It detects whether a file is already UTF-8 and converts only .java files.
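The conversion in ansiToUTF8 works line by line, so every line ending is rewritten as CRLF and a trailing line break is always appended. If the original line endings should be kept, one option is to decode and re-encode the raw bytes in a single step. The following is a minimal sketch under that assumption. The class ReencodeSketch and the method reencode are hypothetical names, and the sketch loads the whole file into memory.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original program.
public class ReencodeSketch {

    /** Rewrites a file in place, decoding it with 'from' and encoding it with 'to'. */
    public static void reencode(Path file, Charset from, Charset to) throws IOException {
        byte[] raw = Files.readAllBytes(file);   // whole file in memory
        String text = new String(raw, from);     // line endings are left untouched
        Files.write(file, text.getBytes(to));    // overwrite with the new encoding
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("gbk-sample", ".txt");
        // GBK bytes for U+4E2D followed by a single LF.
        Files.write(tmp, new byte[] {(byte) 0xD6, (byte) 0xD0, 0x0A});
        reencode(tmp, Charset.forName("GBK"), StandardCharsets.UTF_8);
        // Three UTF-8 bytes for the character plus the unchanged LF.
        System.out.println(Files.readAllBytes(tmp).length); // 4
        Files.delete(tmp);
    }
}
```

Bytes that are invalid in the source charset are silently replaced during decoding, so running an encoding check first is still advisable.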
