Help wanted: a C# program to query Baidu's indexed-page count

宋天琪

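Please help me write a program that queries how many pages Baidu has indexed for a site:
Open Baidu at www.baidu.com
Enter site:www.google.com
The page reports 找到相关结果数4,200,000个 (4,200,000 related results found)
Extract the 4,200,000 and display it.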

千年 · 千年

Baidu indexes really fast. Within a minute of posting, searching Baidu for "c#百度收录" (C# Baidu indexing) shows this page ranked first.

pm324 · pm324

Last edited by pm324 on 2013-8-19 18:28

This uses the following three namespaces:
using System.Text.RegularExpressions;
using System.Net;
using System.IO;

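First, Baidu's search URL has the form http://www.baidu.com/s?wd=
The wd parameter is followed by the keyword to search for. (Anyone interested in the details of Baidu's search URL syntax can message me.)

That gives us the URL of the page we need to request.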
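As a rough sketch of this step, the query URL can be assembled from the base URL above plus the site: keyword from the question. The helper name BuildSiteQuery and the use of Uri.EscapeDataString are my own additions for illustration; the original code simply concatenates the raw input.

```csharp
using System;

static class BaiduUrl
{
    // Base URL as given in the post; "site:" is the operator used in the question.
    const string SearchBase = "http://www.baidu.com/s?wd=";

    // BuildSiteQuery is a hypothetical helper name, not part of the original code.
    public static string BuildSiteQuery(string domain)
    {
        if (string.IsNullOrWhiteSpace(domain))
            throw new ArgumentException("domain must not be empty", nameof(domain));

        // Escape the whole keyword so ':' and any other reserved characters
        // cannot break the query string.
        return SearchBase + Uri.EscapeDataString("site:" + domain.Trim());
    }
}
```

With this, BaiduUrl.BuildSiteQuery("www.google.com") yields http://www.baidu.com/s?wd=site%3Awww.google.com, where %3A is the escaped colon.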

Next, use WebRequest to obtain the WebResponse for that URL, then read the response's data stream with a StreamReader. That gives us the page's HTML source.
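The reading half of that step can be tried without any network access. The sketch below assumes the page is UTF-8, as the post's code does; ReadAllText is a hypothetical helper name, and it accepts any Stream, so the one returned by WebResponse.GetResponseStream() would work the same way.

```csharp
using System.IO;
using System.Text;

static class PageReader
{
    // ReadAllText is a hypothetical helper, not part of the original code.
    public static string ReadAllText(Stream stream, Encoding encoding)
    {
        // The using block disposes the reader (and the underlying stream)
        // even if ReadToEnd throws.
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For a quick check, wrap some UTF-8 bytes in a MemoryStream and pass it in with Encoding.UTF8. Decoding with the wrong encoding would garble the Chinese label text that the later matching step relies on.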

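Then we match with a regular expression. The pattern is easy to spot by looking at the page source.

The result we want is wrapped in an opening tag and its matching closing tag.

So the regular expression is the opening tag, then (.|\n)*?, then the closing tag, which lazily captures everything between the two.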
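To show how the lazy group (.|\n)*? behaves between two delimiters, here is a small sketch. The tags in it are made-up placeholders chosen only for this demonstration; they are not the tags Baidu's page uses, and InnerText is a hypothetical helper name.

```csharp
using System.Text.RegularExpressions;

static class TagExtractor
{
    // Placeholder delimiters for illustration only. These are NOT Baidu's
    // real tags, which the post does not show.
    const string OpenTag = "<em class=\"demo\">";
    const string CloseTag = "</em>";

    public static string InnerText(string html)
    {
        // Regex.Escape keeps the literal tags from being read as pattern syntax.
        // (.|\n)*? is the lazy "any character, including newlines" group from the post.
        string pattern = Regex.Escape(OpenTag) + @"((.|\n)*?)" + Regex.Escape(CloseTag);
        Match m = Regex.Match(html, pattern);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Because the quantifier is lazy, the match stops at the first closing tag rather than the last one on the page. The explicit \n alternative is needed because . does not match a newline in .NET unless RegexOptions.Singleline is set.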

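The extracted text still carries HTML tags. There are many ways to strip them, so I won't go through them one by one.

My approach is simply to pull the digits straight out of that text.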
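A minimal sketch of that digit extraction, using the sample text from the question. ExtractCount is a hypothetical helper name; the comma removal followed by \d+ mirrors what the post's code does.

```csharp
using System.Text.RegularExpressions;

static class CountParser
{
    // ExtractCount is a hypothetical helper name, not part of the original code.
    public static string ExtractCount(string text)
    {
        if (string.IsNullOrEmpty(text))
            return string.Empty;

        // Drop thousands separators so "4,200,000" becomes a single digit run,
        // then take the first run of digits.
        Match m = Regex.Match(text.Replace(",", ""), @"\d+");
        return m.Success ? m.Value : string.Empty;
    }
}
```

CountParser.ExtractCount("找到相关结果数4,200,000个") returns "4200000". Note that this takes the first digit run, so it assumes the count is the first number in the extracted text; a number inside a tag attribute that comes earlier would be picked up instead.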

The full code is below. Anyone who doesn't follow it can get in touch with me to discuss.

using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

namespace 查询收录
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Enter the domain to query:");
            string domain = (Console.ReadLine() ?? "").Trim();
            // Escape the keyword so reserved characters cannot break the query string.
            string url = "http://www.baidu.com/s?wd=" + Uri.EscapeDataString("site:" + domain);
            Console.WriteLine("Related results found: " + collect(url));
            Console.ReadKey();
        }

        // Downloads the result page and returns the count as a digit string,
        // or an empty string if the page could not be fetched or parsed.
        public static string collect(string url)
        {
            string html = GetHttpData(url);
            if (string.IsNullOrEmpty(html))
                return string.Empty;

            // The tags that delimited the original pattern are missing from the post.
            // As a stand-in (an assumption), the same lazy group is anchored on the
            // label text quoted in the question: 相关结果 ... 个.
            Match ma = Regex.Match(html, @"相关结果(.|\n)*?个");
            if (!ma.Success)
                return string.Empty;

            // Strip thousands separators, then take the first run of digits.
            ma = Regex.Match(ma.Value.Replace(",", ""), @"\d+");
            return ma.Success ? ma.Value : string.Empty;
        }

        // Returns the page source, or null if the request fails.
        public static string GetHttpData(string Url)
        {
            try
            {
                WebRequest oWebRqst = WebRequest.Create(Url);
                oWebRqst.Timeout = 50000; // milliseconds

                // The using blocks close the response and reader even if reading throws.
                using (WebResponse oWebRps = oWebRqst.GetResponse())
                using (StreamReader oStreamRd = new StreamReader(oWebRps.GetResponseStream(), Encoding.UTF8))
                {
                    return oStreamRd.ReadToEnd();
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                return null;
            }
        }
    }
}
