
Download source code of a website in C#

With the help of the following short C# snippet, you can retrieve the HTML source of a web page as a string. Such a function is useful, for example, if you want to parse some information from a website for later use.

For the snippet to work, you have to add the following two using directives at the top of your source file.

using System.Net;
using System.IO;

The function itself looks like this:

public string getHTML(string url)
{
    //Create request for given url
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

    //Get the response and wrap its stream in a reader;
    //the using blocks close both, even if reading throws an exception
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader sr = new StreamReader(response.GetResponseStream()))
    {
        //Read response stream (html code) and return the source
        return sr.ReadToEnd();
    }
}

A function call could look like this:

getHTML("http://www.code-bude.net");
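
The call above discards the returned string and does not deal with failed requests. As a rough sketch, assuming you want to keep the result and pull one piece of information out of it, the function could be used like this. The class name HtmlDemo, the helpers ExtractTitle and PrintTitle, and the regular expression are assumptions made for illustration and are not part of the original snippet.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlDemo
{
    // Same request and read steps as the getHTML function from the article,
    // written as a static method so this block compiles on its own
    public static string getHTML(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            return sr.ReadToEnd();
        }
    }

    // Hypothetical helper: returns the text of the first <title> element,
    // or null if the html contains none
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    // Hypothetical helper: downloads a page and prints its title.
    // GetResponse throws a WebException for DNS failures, timeouts and HTTP error codes.
    public static void PrintTitle(string url)
    {
        try
        {
            string html = getHTML(url);
            Console.WriteLine(ExtractTitle(html) ?? "(no title found)");
        }
        catch (WebException ex)
        {
            Console.WriteLine("Request failed: " + ex.Message);
        }
    }
}
```

Calling HtmlDemo.PrintTitle("http://www.code-bude.net"); would then print the page title, or an error message if the request fails. A regular expression is only a quick way to grab a single tag; for anything more involved, a real HTML parser is probably the safer choice.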

If you have suggestions or run into problems with the snippet, just leave a comment.

About the author: This article, as well as 140 more on en.code-bude.net, was published by Raffi. Since 2011 I've been blogging here and on code-bude.net, the German counterpart, about programming and my software. I write tutorials and try to share my knowledge with my readers as well as I can. I also write at derwirtschaftsinformatiker.de about subjects related to my studies.


  1. Jerry says:

    Hello Raffi, this works well for a simple URL (e.g. http://www.code-bude.net).

    But for a URL that contains parameters (e.g. http://www.code-bude.net/search?query=abc), it seems to return only the HTML code of http://www.code-bude.net.

    Is there a way to read the HTML source code from a URL that contains parameters?

    Regards,

    Jerry

  2. AV Awesome says:

    Man!!!
    You are awesome……….
    Legen….. wait for it…….
    Dary….

  3. Aerox says:

    Hey, I need help. I have a URL that works when I open it in the browser, but when I use it in the C# code it shows a different site. There is a problem with the date when I open it in my C# program.
    This is the link:
    http://ch.tilllate.com/de/events/applyfilter/2013-11-08?ref=calendar-horizontal

  4. Hardik says:

    Raffi , dude I love you ! :*

  5. Paritosh says:

    How can I get all the pages of a website and find a keyword across the whole website?

    • Raffi says:

      You have to start with the root domain. Then parse the received html-code with Regular Expressions. That way you can get all urls/links from html-code. Then save the links in a List for example and get the html code for all of them. Then also parse them for links. So you can get the html of all pages. Save the html-code from every page and then search with string.IndexOf or another Regular Expression for the keyword.

      Greets
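
      The steps described in this reply could be sketched roughly as follows. This is only an illustration under several assumptions: the names CrawlSketch, ExtractLinks, FindKeyword and Download are made up, the href regular expression is a simplification, the crawl is limited to links on the same host and to a maximum number of pages, and each page is searched right after it is downloaded instead of being saved first.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class CrawlSketch
{
    // Hypothetical helper: collects absolute http/https links on the same host
    // from href="..." attributes (fragments after '#' are dropped)
    public static List<string> ExtractLinks(string html, Uri pageUrl)
    {
        var links = new List<string>();
        foreach (Match m in Regex.Matches(html, "href\\s*=\\s*[\"']([^\"'#]+)",
                                          RegexOptions.IgnoreCase))
        {
            Uri absolute;
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps)
                && absolute.Host == pageUrl.Host)
            {
                links.Add(absolute.AbsoluteUri);
            }
        }
        return links;
    }

    // Starts at rootUrl, follows links page by page and returns the URLs
    // of all visited pages whose html contains the keyword
    public static List<string> FindKeyword(string rootUrl, string keyword, int maxPages)
    {
        var hits = new List<string>();
        var seen = new HashSet<string> { rootUrl };
        var queue = new Queue<string>();
        queue.Enqueue(rootUrl);
        int fetched = 0;

        while (queue.Count > 0 && fetched < maxPages)
        {
            string url = queue.Dequeue();
            string html;
            try
            {
                html = Download(url);
                fetched++;
            }
            catch (WebException)
            {
                continue; // skip pages that cannot be loaded
            }

            // Search the page for the keyword with string.IndexOf
            if (html.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
                hits.Add(url);

            // Queue every link that has not been seen before
            foreach (string link in ExtractLinks(html, new Uri(url)))
                if (seen.Add(link))
                    queue.Enqueue(link);
        }
        return hits;
    }

    // Same request and read steps as the getHTML function from the article
    static string Download(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader sr = new StreamReader(response.GetResponseStream()))
        {
            return sr.ReadToEnd();
        }
    }
}
```

      A call such as CrawlSketch.FindKeyword("http://www.code-bude.net", "snippet", 50) would then return the URLs of up to 50 crawled pages that contain the word. The page limit and the same-host check are there so that the sketch does not wander off across the whole web.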

  6. krish says:

    Hello sir,
    The code you posted is very useful for extracting the entire source code. Now I need C#/.NET code to extract specific tags from given URLs. For example, how can I extract the code of the “script” tags?