Java——随机读写文件RandomAccessFile

原创于 2026-05-15 11:01:44 发布 · 358 阅读

6 ·

本内容遵循CC 4.0 BY-SA版权协议

GEO检测

标签

#java #开发语言 #RandomAccess

Java 专栏收录该内容

77 篇文章

订阅专栏

随机读写文件RandomAccessFile

1、用法

RandomAccessFile有如下构造方法：

public RandomAccessFile(String name, String mode)
   throws FileNotFoundException
public RandomAccessFile(File file, String mode)
   throws FileNotFoundException

参数name和file容易理解，表示文件路径和File对象，mode是什么意思呢？它表示打开模式，可以有4个取ç

“r”：只用于读。
“rw”：用于读和写。
“rws”：和"rw"一样，用于读和写，另外，它要求文件内容和元数据的任何更新都同步到设备上。
“rwd”：和"rw"一样，用于读和写，另外，它要求文件内容的任何更新都同步到设备上，和"rws"的区别是，元数据的更新不要求同步。

RandomAccessFile虽然不是InputStream/OutputStream的子类，但它也有类似于读写字节流的方法。另外，它还实现了DataInput/DataOutput接口。这些方法我们之前基本都介绍过，这里列举部分方法，以增强直观感受：

//读一个字节，取最低8位，0～255
public int read() throws IOException
public int read(byte b[]) throws IOException
public final int readInt() throws IOException
public final void writeInt(int v) throws IOException
public void write(byte b[]) throws IOException

RandomAccessFile还有另外两个read方法：

public final void readFully(byte b[]) throws IOException
public final void readFully(byte b[], int off, int len) throws IOException

与对应的read方法的区别是，它们可以确保读够期望的长度，如果到了文件结尾也没读够，它们会抛出EOFException异常。

RandomAccessFile内部有一个文件指针，指向当前读写的位置，各种read/write操作都会自动更新该指针。与流不同的是，RandomAccessFile可以获取该指针，也可以更改该指针，相关方法是：

//获取当前文件指针
public native long getFilePointer() throws IOException
//更改当前文件指针到pos
public native void seek(long pos) throws IOException

RandomAccessFile是通过本地方法，最终调用操作系统的API来实现文件指针调整的。

InputStream有一个skip方法，可以跳过输入流中n个字节，默认情况下，它是通过实际读取n个字节实现的。

RandomAccessFile有一个类似方法，不过它是通过更改文件指针实现的：

public int skipBytes(int n) throws IOException

RandomAccessFile可以直接获取文件长度，返回文件字节数，方法为：

public native long length() throws IOException

它还可以直接修改文件长度，方法为：

public native void setLength(long newLength) throws IOException

如果当前文件的长度小于newLength，则文件会扩展，扩展部分的内容未定义。如果当前文件的长度大于newLength，则文件会收缩，多出的部分会截取，如果当前文件指针比newLength大，则调用后会变为newLength。

RandomAccessFile中有如下方法，需要注意一下：

public final void writeBytes(String s) throws IOException
public final String readLine() throws IOException

看上去，writeBytes方法可以直接写入字符串，而readLine方法可以按行读入字符串，实际上，这两个方法都是有问题的，它们都没有编码的概念，都假定一个字节就代表一个字符，这对于中文显然是不成立的，所以，应避免使用这两个方法。

2、设计一个键值数据库BasicDB

在日常的一般文件读写中，使用流就可以了，但在一些系统程序中，流是不适合的， RandomAccessFile因为更接近操作系统，更为方便和高效。

2.1、功能

BasicDB提供的接口类似于Map接口，可以按键保存、查找、删除，但数据可以持久化保存到文件上。此外，不像HashMap/TreeMap，它们将所有数据保存在内存，BasicDB只把元数据如索引信息保存在内存，值的数据保存在文件上。相比HashMap/TreeMap, BasicDB的内存消耗可以大大降低，存储的键值对个数大大提高，尤其当值数据比较大的时候。BasicDB通过索引，以及RandomAccessFile的随机读写功能保证效率。

2.2、接口

对外，BasicDB提供的构造方法是：

public BasicDB(String path, String name) throws IOException

path表示数据库文件所在的目录，该目录必须已存在。name表示数据库的名称， BasicDB会使用以name开头的两个文件，一个存储元数据，扩展名是．meta，一个存储键值对中的值数据，扩展名是．data。比如，如果name为student，则两个文件为student.meta和student.data，这两个文件不一定存在，如果不存在，则创建新的数据库，如果已存在，则加载已有的数据库。

BasicDB提供的公开方法有：

//保存键值对，键为String类型，值为byte数组
public void put(String key, byte[] value) throws IOException
//根据键获取值，如果键不存在，返回null
public byte[] get(String key) throws IOException
public void remove(String key) //根据键删除
public void flush() throws IOException //确保将所有数据保存到文件
public void close() throws IOException //关闭数据库

为便于实现，我们假定值即byte数组的长度不超过1020，如果超过，会抛出异常，当然，这个长度在代码中可以调整。在调用put和remove后，修改不会马上反映到文件中，如果需要确保保存到文件中，需要调用flush。

2.3、使用

在BasicDB中，我们设计的值为byte数组，这看上去是一个限制，不便使用，我们主要是为了简化，而且任何数据都可以转化为byte数组保存。对于字符串，可以使用getBytes()方法，对于对象，可以使用之前介绍的流转换为byte数组。

比如，保存一些学生信息到数据库，代码可以为：

private static byte[] toBytes(Student student) throws IOException {
    ByteArrayOutputStream bout = new ByteArrayOutputStream();
    DataOutputStream dout = new DataOutputStream(bout);
    dout.writeUTF(student.getName());
    dout.writeInt(student.getAge());
    dout.writeDouble(student.getScore());
    return bout.toByteArray();
}
public static void saveStudents(Map<String, Student> students)
        throws IOException {
    BasicDB db = new BasicDB("./", "students");
    for(Map.Entry<String, Student> kv : students.entrySet()) {
        db.put(kv.getKey(), toBytes(kv.getValue()));
    }
    db.close();
}

保存学生信息到当前目录下的students数据库，toBytes方法将Student转换为了字节。

2.4、设计

将键值对分为两部分，值保存在单独的．data文件中，值在．data文件中的位置和键称为索引，索引保存在．meta文件中。
在．data文件中，每个值占用的空间固定，固定长度为1024，前4个字节表示实际长度，然后是实际内容，实际长度不够1020的，后面是补白字节0。
索引信息既保存在．meta文件中，也保存在内存中，在初始化时，全部读入内存，对索引的更新不立即更新文件，调用flush方法才更新。
删除键值对不修改．data文件，但会从索引中删除并记录空白空间，下次添加键值对的时候会重用空白空间，所有的空白空间也记录到．meta文件中。

3、BasicDB的实现

BasicDB定义了如下静态变量：

private static final int MAX_DATA_LENGTH = 1020;
//补白字节
private static final byte[] ZERO_BYTES = new byte[MAX_DATA_LENGTH];
//数据文件扩展名
private static final String DATA_SUFFIX = ".data";
//元数据文件扩展名，包括索引和空白空间数据
private static final String META_SUFFIX = ".meta";

内存中表示索引和空白空间的数据结构是：

Map<String, Long> indexMap; //索引信息，键->值在．data文件中的位置
Queue<Long> gaps; //空白空间，值为在．data文件中的位置

表示文件的数据结构是：

RandomAccessFile db; //值数据文件
File metaFile;  //元数据文件

构造方法的代码为：

public BasicDB(String path, String name) throws IOException{
    File dataFile = new File(path + name + DATA_SUFFIX);
    metaFile = new File(path + name + META_SUFFIX);
    db = new RandomAccessFile(dataFile, "rw");
    if(metaFile.exists()){
        loadMeta();
    }else{
        indexMap = new HashMap<>();
        gaps = new ArrayDeque<>();
    }
}

元数据文件存在时，会调用loadMeta将元数据加载到内存，我们先假定不存在，先来看其他代码。保存键值对的方法是put，其代码为：

public void put(String key, byte[] value) throws IOException{
    Long index = indexMap.get(key);
    if(index==null){
        index = nextAvailablePos();
        indexMap.put(key, index);
    }
    writeData(index, value);
}

先通过索引查找键是否存在，如果不存在，调用nextAvailablePos方法为值找一个存储位置，并将键和存储位置保存到索引中，最后，调用writeData方法将值写到数据文件中。

nextAvailablePos的代码是：

private long nextAvailablePos() throws IOException{
  if(! gaps.isEmpty()){
      return gaps.poll();
  }else{
      return db.length();
    }
}

它首先查找空白空间，如果有，则重用，否则定位到文件末尾。

writeData方法实际写值数据，它的代码是：

private void writeData(long pos, byte[] data) throws IOException {
    if(data.length > MAX_DATA_LENGTH) {
        throw new IllegalArgumentException("maximum allowed length is "
                + MAX_DATA_LENGTH + ", data length is " + data.length);
    }
    db.seek(pos);
    db.writeInt(data.length);
    db.write(data);
    db.write(ZERO_BYTES, 0, MAX_DATA_LENGTH - data.length);
}

它先检查长度，长度满足的情况下，定位到指定位置，写实际数据的长度、写内容、最后补白。

可以看出，在这个实现中，索引信息和空白空间信息并没有实时保存到文件中，要保存，需要调用flush方法，待会我们再看这个方法。

根据键获取值的方法是get，其代码为：

public byte[] get(String key) throws IOException{
    Long index = indexMap.get(key);
    if(index! =null){
        return getData(index);
    }
    return null;
}

如果键存在，就调用getData方法获取数据。getData方法的代码为：

private byte[] getData(long pos) throws IOException{
    db.seek(pos);
    int length = db.readInt();
    byte[] data = new byte[length];
    db.readFully(data);
    return data;
}

代码也很简单，定位到指定位置，读取实际长度，然后调用readFully方法读够内容。

删除键值对的方法是remove，其代码为：

public void remove(String key){
    Long index = indexMap.remove(key);
    if(index! =null){
        gaps.offer(index);
    }
}

从索引结构中删除，并添加到空白空间队列中。

同步元数据的方法是flush()，其代码为：

public void flush() throws IOException{
    saveMeta();
    db.getFD().sync();
}

回顾一下，getFD方法会返回文件描述符，其sync方法会确保文件内容保存到设备上， saveMeta方法的代码为：

private void saveMeta() throws IOException{
    DataOutputStream out = new DataOutputStream(
            new BufferedOutputStream(new FileOutputStream(metaFile)));
    try{
        saveIndex(out);
        saveGaps(out);
    }finally{
        out.close();
    }
}

索引信息和空白空间保存在一个文件中，saveIndex保存索引信息，代码为：

private void saveIndex(DataOutputStream out) throws IOException{
    out.writeInt(indexMap.size());
    for(Map.Entry<String, Long> entry : indexMap.entrySet()){
        out.writeUTF(entry.getKey());
        out.writeLong(entry.getValue());
    }
}

先保存键值对个数，然后针对每条索引信息，保存键及值在．data文件中的位置。

saveGaps方法保存空白空间信息，代码为：

private void saveGaps(DataOutputStream out) throws IOException{
    out.writeInt(gaps.size());
    for(Long pos : gaps){
        out.writeLong(pos);
    }
}

也是先保存长度，然后保存每条空白空间信息。

在构造方法中，我们提到了loadMeta方法，它是saveMeta的逆操作，代码为：

private void loadMeta() throws IOException{
DataInputStream in = new DataInputStream(
            new BufferedInputStream(new FileInputStream(metaFile)));
    try{
        loadIndex(in);
        loadGaps(in);
    }finally{
        in.close();
    }
}

loadIndex加载索引，代码为：

private void loadIndex(DataInputStream in) throws IOException{
    int size = in.readInt();
    indexMap = new HashMap<String, Long>((int) (size / 0.75f) + 1, 0.75f);
    for(int i=0; i<size; i++){
        String key = in.readUTF();
        long index = in.readLong();
        indexMap.put(key, index);
    }
}

loadGaps加载空白空间，代码为：

private void loadGaps(DataInputStream in) throws IOException{
    int size = in.readInt();
    gaps = new ArrayDeque<>(size);
    for(int i=0; i<size; i++){
        long index = in.readLong();
        gaps.add(index);
    }
}

数据库关闭的代码为：

public void close() throws IOException{
    flush();
    db.close();
}