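The following example shows how to use a regular expression to find and print every hyperlink tag of the form <a href=...> in the content fetched from a URL.
First we take the URL from the command line, open an input stream on it, read the content, and store it as a string in htmlString. We then build a regular expression from the Java string literal "(<a\\s*href=[^>]*>)" and search htmlString for every substring that matches it.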
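To see what this pattern does and does not match without touching the network, here is a minimal sketch that runs it against a hard-coded HTML fragment. The class name HrefPatternDemo, the helper findTags, and the sample markup are made up for illustration; only the pattern itself comes from the post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefPatternDemo {
    // Same pattern as in the post: "<a", optional whitespace, "href=",
    // then everything up to and including the closing ">".
    static final Pattern A_TAG = Pattern.compile("(<a\\s*href=[^>]*>)");

    // Hypothetical helper: collect every opening tag the pattern matches.
    static List<String> findTags(String html) {
        List<String> tags = new ArrayList<>();
        Matcher m = A_TAG.matcher(html);
        while (m.find()) {
            tags.add(m.group(1)); // group 1 is the whole opening tag
        }
        return tags;
    }

    public static void main(String[] args) {
        // Illustrative sample markup, not taken from a real page.
        String html = "<p>See <a href=\"learn.jsp\">Learn</a> and "
                    + "<a href=view.jsp?id=89>item 89</a>, "
                    + "but not <a class=\"bb\" href=\"#\">this</a>.</p>";
        for (String tag : findTags(html)) {
            System.out.println(tag);
        }
    }
}
```

This prints the first two opening tags only. The third is skipped because the pattern requires href to be the first attribute after <a; it is also case-sensitive, so <A HREF=...> would not match either. The complete program follows.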
import java.io.*;
import java.net.*;
import java.util.regex.*;

public class GetHref {
    public static void main(String[] args) {
        InputStream in = null;
        PrintWriter out = null;
        String htmlString = null;
        try {
            // Check the arguments: a URL, optionally followed by an output file name
            if ((args.length != 1) && (args.length != 2))
                throw new IllegalArgumentException("Wrong number of args");

            // Set up the streams
            URL url = new URL(args[0]);   // Create the URL
            in = url.openStream();        // Open a stream to it
            if (args.length == 2)         // Optionally save a copy of the page to a file
                out = new PrintWriter(new FileWriter(args[1]));
            BufferedReader bin = new BufferedReader(new InputStreamReader(in));
            String line;
            StringBuilder sb = new StringBuilder();
            while ((line = bin.readLine()) != null) {
                if (out != null) out.println(line);
                // Keep the line break so text on adjacent lines does not run together
                sb.append(line).append('\n');
            }
            htmlString = sb.toString();
        }
        // On exceptions, print error message and usage message.
        catch (Exception e) {
            System.err.println(e);
            System.err.println("Usage: java GetHref <URL> [<filename>]");
        }
        finally { // Always close the streams, no matter what.
            // Close each stream separately and only if it was actually opened
            try { if (in != null) in.close(); } catch (IOException e) {}
            if (out != null) out.close();
        }
        // If reading failed there is nothing to search
        if (htmlString == null) return;

        // Group 1 captures the whole opening <a href=...> tag
        Pattern p = Pattern.compile("(<a\\s*href=[^>]*>)");
        Matcher m = p.matcher(htmlString);
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}
Program output:
C:\java>java GetHref http://127.0.0.1:8080/zz3zcwbwebhome/index.jsp
<a href=mailto:hi@javaweb.cc>
<a href="javascript:" class="bb">
<a href="javascript:" class="bb">
<a href="learn.jsp">
<a href="download.jsp">
<a href="article.jsp">
<a href="#">
<a href="#">
<a href="#">
<a href="#" class="FrameTeitle">
<a href="#" class="FrameTeitle">
<a href=view.jsp?id=89>
<a href=view.jsp?id=88>

1 reply
Nice.