Comparing two texts in Java: finding the content differences between two text files

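File path comparison and difference analysis
This post describes a way to determine which file paths were added or removed between two text files. The contents of a.txt and b.txt are read into two lists and compared: first the paths found only in a.txt are identified, then the paths found only in b.txt, which shows which paths were added and which were removed. In the example, g.properties and h.properties were added and b.properties was removed.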

I have two text files,

a.txt

b.txt

Each text file contains some file paths. b.txt contains some more file paths than a.txt. I would like to determine which paths were added to a.txt and which were removed from it so that it matches the paths in b.txt.

For example,

a.txt contains

E:\Users\Documents\hello\a.properties

E:\Users\Documents\hello\b.properties

E:\Users\Documents\hello\c.properties

and b.txt contains

E:\Users\Documents\hello\a.properties

E:\Users\Documents\hello\c.properties

E:\Users\Documents\hello\g.properties

E:\Users\Documents\hello\h.properties

How can I find out that g.properties and h.properties were added and b.properties was removed?

Could anyone explain how it is done? I could only find how to check for identical contents.

Solution

The code below will serve your purpose irrespective of the content of the files.

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;

public class Test {

    public static void main(String[] args) throws Exception {
        String sCurrentLine;
        List<String> list1 = new ArrayList<>();
        List<String> list2 = new ArrayList<>();

        // try-with-resources closes both readers even if reading fails
        try (BufferedReader br1 = new BufferedReader(new FileReader("test.txt"));
             BufferedReader br2 = new BufferedReader(new FileReader("test2.txt"))) {
            while ((sCurrentLine = br1.readLine()) != null) {
                list1.add(sCurrentLine);
            }
            while ((sCurrentLine = br2.readLine()) != null) {
                list2.add(sCurrentLine);
            }
        }

        // Work on a copy so that list1 itself stays unchanged
        List<String> tmpList = new ArrayList<>(list1);
        tmpList.removeAll(list2);
        System.out.println("content from test.txt which is not there in test2.txt");
        for (int i = 0; i < tmpList.size(); i++) {
            System.out.println(tmpList.get(i)); // content from test.txt which is not there in test2.txt
        }

        System.out.println("content from test2.txt which is not there in test.txt");
        tmpList = new ArrayList<>(list2);
        tmpList.removeAll(list1);
        for (int i = 0; i < tmpList.size(); i++) {
            System.out.println(tmpList.get(i)); // content from test2.txt which is not there in test.txt
        }
    }
}
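
The same comparison can also be sketched with sets, which names the "added" and "removed" directions explicitly and avoids scanning a whole list for every line. This is only an illustration, assuming Java 8 or later and one path per line; the class name `PathDiff` and the methods `added` and `removed` are hypothetical names, not part of the original answer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PathDiff {

    // Lines present in newLines but not in oldLines, kept in file order.
    public static Set<String> added(List<String> oldLines, List<String> newLines) {
        Set<String> result = new LinkedHashSet<>(newLines);
        result.removeAll(new HashSet<>(oldLines));
        return result;
    }

    // Lines present in oldLines but not in newLines, kept in file order.
    public static Set<String> removed(List<String> oldLines, List<String> newLines) {
        return added(newLines, oldLines);
    }

    public static void main(String[] args) throws IOException {
        // File names follow the question (a.txt is the old list, b.txt the new one).
        List<String> a = Files.readAllLines(Paths.get("a.txt"));
        List<String> b = Files.readAllLines(Paths.get("b.txt"));
        added(a, b).forEach(p -> System.out.println("added:   " + p));
        removed(a, b).forEach(p -> System.out.println("removed: " + p));
    }
}
```

With the two files from the question, `added` yields the g.properties and h.properties paths and `removed` yields the b.properties path. Note that lines are compared as exact strings, so differences in case or trailing whitespace count as different paths.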

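Differences between two texts are usually found with a diff library or algorithm, such as the java-diff-utils library (for Java applications), or with lower-level string processing. A simple outline follows:

1. **Using java-diff-utils**:
   - First add the java-diff-utils dependency to your project. If you use Maven, add this to pom.xml:
   ```xml
   <dependency>
       <groupId>io.github.java-diff-utils</groupId>
       <artifactId>java-diff-utils</artifactId>
       <version>4.12</version>
   </dependency>
   ```
   - Then pass the two texts, each as a list of lines, to `DiffUtils.diff`:
   ```java
   import com.github.difflib.DiffUtils;
   import com.github.difflib.patch.AbstractDelta;
   import com.github.difflib.patch.Patch;
   import java.util.Arrays;
   import java.util.List;

   List<String> text1 = Arrays.asList("This is the original text");
   List<String> text2 = Arrays.asList("This is the updated text");
   // diff() compares the two line lists and returns the changes as a Patch
   Patch<String> patch = DiffUtils.diff(text1, text2);
   List<AbstractDelta<String>> differences = patch.getDeltas();
   ```
   - The `differences` list contains one delta for each changed part of the text.

2. **Manual comparison** (character by character or line by line):
   If you would rather not depend on an external library, you can compare the texts character by character or line by line. Measures such as the Levenshtein distance or the Hamming distance quantify how far apart two strings are; the example here does the simpler position-by-position (Hamming-style) comparison and reports every position at which the two strings differ.
   ```java
   import java.util.ArrayList;
   import java.util.List;

   public class TextComparator {

       // Position-by-position (Hamming-style) comparison: reports every index
       // at which the strings differ, including the tail of the longer one.
       public List<String> compare(String text1, String text2) {
           List<String> changes = new ArrayList<>();
           int max = Math.max(text1.length(), text2.length());
           for (int i = 0; i < max; i++) {
               String c1 = i < text1.length() ? text1.substring(i, i + 1) : "";
               String c2 = i < text2.length() ? text2.substring(i, i + 1) : "";
               if (!c1.equals(c2)) {
                   changes.add("position " + (i + 1) + ": \"" + c1 + "\" -> \"" + c2 + "\"");
               }
           }
           return changes;
       }
   }
   ```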
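
The Levenshtein distance mentioned above can be computed with the standard dynamic-programming recurrence. The following is a minimal sketch rather than a tuned implementation; the class name `Levenshtein` and the method `distance` are hypothetical names chosen for this example, and it compares `char` values, so characters outside the Basic Multilingual Plane count as two units.

```java
public class Levenshtein {

    // Returns the minimum number of single-character insertions, deletions
    // and substitutions needed to turn a into b.
    public static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(prev[j - 1] + cost,              // match or substitute
                        Math.min(prev[j], curr[j - 1]) + 1);        // delete or insert
            }
            int[] tmp = prev; // reuse the two rows instead of allocating a full table
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

A distance of 0 means the two strings are identical. For whole files, applying this per line pair gives a rough measure of how much each line changed, whereas the list- or set-based comparison only tells whether a line is present.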