Marty Hermsen

Talks around the ICT and Financial Coffee Corners

Add on for Firefox and the Webmaster

November 23
by Marty Hermsen 23. November 2009 17:02

Viewing the CSS of a page with the Firefox browser: a 'must have' download for webmasters.

Install this add-on in Firefox:

https://addons.mozilla.org/en-US/firefox/addon/60

and then install this add-on:

https://addons.mozilla.org/firefox/addon/6622

Restart Firefox, open a web page, click the CSS button and choose "View Style Information", or press Ctrl-Shift-Y.

All the info is there.

 


BlogEngine.NET | Web IIS 6 - IIS 7

Detecting Browsers, Crawlers, and Web Bots in C# ASP .NET

August 02
by Marty Hermsen 2. August 2009 17:52

The .NET framework, used to create C# ASP .NET web applications, actually comes with a built-in web browser detector, called the BrowserCaps feature. .NET 2.0 adds an additional detector, called the .Browser feature. Regardless of the .NET version, determining the difference between a user's web browser and an automated web crawler can make a big difference in a web application, and it's easy to do.

In this article, we'll discuss three methods for determining the web browser type. We'll also describe how to tell the difference between a user's web browser and an automated crawler.

What's Inside the User-Agent String

It really all starts with the web browser user-agent string. The user-agent is a string of text, sent in the HTTP header by the web browser, for each request made when accessing a page in the C# ASP .NET web application. The user-agent typically describes the web browser client type, name, version, and other information.

Some example User-Agent strings:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727)
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; Yahoo! Slurp; +http://help.yahoo.com/help/us/ysearch/slurp)

As you can tell from the above examples, quite a bit of information can be parsed out of the user-agent string. We can tell that the first user-agent is a Microsoft Internet Explorer web browser, and thus a regular user. The other two user-agents are web bots. By looking at the details of the user-agent string, you can probably determine the most direct method of detecting the user's web browser is by simply looking for sub-strings.

Looking for Keywords in a User-Agent

The most direct and simple method for detecting web browsers accessing your C# ASP .NET web application is to simply search for a sub-string within the user-agent and classify the web browser accordingly.

if (Request.UserAgent != null && Request.UserAgent.IndexOf("Googlebot") > -1)
{
   // We have a GoogleBot web crawler.
}
else
{
   // We do not have a GoogleBot web crawler.
}

By parsing a simple sub-string from the UserAgent property of the HttpRequest, we can determine the type of web client accessing the site. While this method is simple and direct, it suffers from the problem of being unable to classify the many different types of user-agent strings out there. You could certainly obtain a list of user-agent strings and add keywords to parse for each, but this could take a long time. It would also be difficult to maintain the list and keep it updated as new web bots and browsers emerge. There must be an easier way and this is exactly where Microsoft is one step ahead.
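To make the maintenance problem concrete, here is a minimal sketch of the keyword approach pulled into a helper. The class name, the method name, and the three-entry keyword list are assumptions made for illustration; a real list would be far longer, which is exactly the drawback described above.

```csharp
using System;

public static class UserAgentClassifier
{
    // Hypothetical, deliberately short keyword list (illustration only).
    private static readonly string[] CrawlerKeywords = { "Googlebot", "Slurp", "msnbot" };

    // Returns the first keyword found in the user-agent, or null when
    // nothing matches (including a missing User-Agent header).
    public static string FindCrawlerKeyword(string userAgent)
    {
        if (string.IsNullOrEmpty(userAgent))
        {
            return null;
        }

        foreach (string keyword in CrawlerKeywords)
        {
            // Case-insensitive search, since bots do not always use consistent casing.
            if (userAgent.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return keyword;
            }
        }

        return null;
    }
}
```

In a page this could be called as UserAgentClassifier.FindCrawlerKeyword(Request.UserAgent). Every new bot still means another entry in the array.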

Digging Deeper Into Request.Browser

In the above code sample, we pulled the user-agent string from the HttpRequest object. Rather than parse a sub-string from the Request.UserAgent property, the Request object provides us with an additional object for accessing information about the web browser client via Request.Browser. One of the properties of interest for telling the difference between a user and a web bot is Request.Browser.Crawler. This property is a boolean and will indicate true if the web browser is actually a web bot.

if (Request.Browser.Crawler)
{
   // We have a web crawler.
}
else
{
   // We do not have a web crawler.
}

Request.Browser.Crawler Always Returns False

If you try the above code sample with various user-agent strings that simulate web bots (e.g., with the Firefox User-Agent Switcher plug-in), you'll notice that Request.Browser.Crawler always returns false. This is due to missing information in one of .NET's configuration sections, called BrowserCaps. We'll need to populate the list of BrowserCaps (the list of available user-agents that we have information about) in order to use this feature.

Using the BrowserCaps To Detect Web Browsers From Web Bots

BrowserCaps is a section in the web.config file, within the system.web section. BrowserCaps allows you to specify a list of web browser user-agent strings, via regular expressions, to match against. Each item in the list indicates the capabilities of the web browser, version, whether it's a crawler, and much more.

Inside the web.config (or machine.config) file:

<configuration>
   <system.web>
      <browserCaps>
         <result type="class"/>
         <use var="HTTP_USER_AGENT"/>
         browser=Unknown
         version=0.0
         majorver=0
         minorver=0
         frames=false
         tables=false
         <filter>
            <case match="Windows 98|Win98">
               platform=Win98
            </case>
            <case match="Windows NT|WinNT">
               platform=WinNT
            </case>
         </filter>
         <filter match="Unknown" with="%(browser)">
            <filter match="Win95" with="%(platform)">
            </filter>
         </filter>
      </browserCaps>
   </system.web>
</configuration>

The above is a sample entry for detecting Windows 98 and Windows NT operating systems in the user-agent string from the web browser. While you can proceed to add entries by hand to match each web browser and crawler of interest, you can actually download a complete and updated list of user-agent BrowserCaps to add to your C# ASP .NET web application.

To add the list of BrowserCaps to your development machine or server, follow these steps:

1. Open the following file for editing:
C:\windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\machine.config

2. Download the BrowserCaps list from http://owenbrady.net/browsercaps (direct download list).

3. Paste the entire contents of the XML file into the machine.config, just before the line </system.web>.

If you only want the BrowserCaps list available to a single web application, paste the BrowserCaps section into your local web.config. If you want all web applications to have access to the information, use the machine.config as noted above.

After saving the changes and refreshing the C# ASP .NET web application, you will now have proper values displaying for Request.Browser.Crawler. The regularly updated list helps you detect the majority of web crawlers, bots, scripts, and web browsers.

Using the Newer .BROWSER

BrowserCaps was introduced in the .NET 1.0 Framework. While it is still active and supported by Microsoft, it has been deprecated with .NET 2.0. The current standard is to use the .BROWSER feature to indicate the list of user-agent strings. It's important to note that entries specified in the .BROWSER feature are merged with the contents of the BrowserCaps, so that both methods may be used.

.BROWSER provides a way of specifying the web browser user-agents via XML in separate files in C:\windows\Microsoft.NET\Framework\v2.0.50727\CONFIG\Browsers. After creating a .browser file, you can execute aspnet_regbrowsers.exe to compile the browser files into an assembly that is installed in the global assembly cache, giving all web applications access to the list. This allows you to add new entries to the list without restarting the web application process. The actual command line to use is: C:\WINDOWS\Microsoft.NET\Framework\<versionNumber>\aspnet_regbrowsers.exe -i
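As a rough sketch of what such a file can look like, the fragment below marks one bot as a crawler. The bot name "ExampleBot" and the id are made up for illustration, and the match attribute is a regular expression tested against the user-agent string.

```xml
<browsers>
   <!-- "ExampleBot" is a hypothetical crawler name used only for illustration. -->
   <browser id="ExampleBot" parentID="Default">
      <identification>
         <userAgent match="ExampleBot" />
      </identification>
      <capabilities>
         <capability name="browser" value="ExampleBot" />
         <capability name="crawler" value="true" />
      </capabilities>
   </browser>
</browsers>
```

With a definition like this in place, Request.Browser.Crawler should report true for requests whose user-agent contains "ExampleBot".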

The .browser feature provides a more seamless way of incorporating web browser detection into an ASP .NET application. However, at this time, a greater number of entries are available for the BrowserCaps method, which provides a more accurate detection method of web bots in the wild. Since both methods can be used together, there is no harm in combining them.

Perfecting Traffic Statistics with Web Bot Detection

One of the primary reasons to determine a web bot from a regular user's web browser is to allow for accurate recording of statistics. For example, when counting the hits to a particular page in an ASP .NET web application, the numbers would become skewed if you included hits from GoogleBot, Yahoo Slurp, and the many other web bots. By using the Request.Browser.Crawler value, we can easily detect a web bot from a user and provide a more accurate figure.
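A minimal sketch of such a counter is shown below. The class and method names are hypothetical, the counts are kept in memory only, and the class is not thread-safe, so treat it as an illustration rather than a production statistics module.

```csharp
using System;
using System.Collections.Generic;

public class PageHitCounter
{
    // Hit counts per page path, kept in memory for the sake of the example.
    private readonly Dictionary<string, int> hits = new Dictionary<string, int>();

    // Records a hit unless the request was flagged as coming from a crawler.
    // In a page, isCrawler would typically be Request.Browser.Crawler.
    public void Record(string pagePath, bool isCrawler)
    {
        if (isCrawler)
        {
            return; // Skip web bots so they do not skew the numbers.
        }

        int current;
        hits.TryGetValue(pagePath, out current); // current stays 0 for a new page
        hits[pagePath] = current + 1;
    }

    // Returns the number of recorded (non-crawler) hits for a page.
    public int GetHits(string pagePath)
    {
        int current;
        hits.TryGetValue(pagePath, out current);
        return current;
    }
}
```

Passing the crawler flag in as a plain boolean keeps the counter independent of System.Web, so it can be exercised outside a running web application.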

Cloaking Isn't Just in Star Trek

The discussion about web bot detection in C# ASP .NET web applications wouldn't be complete without briefly cautioning against displaying different content to web bots and regular user web browsers, also called cloaking. More specifically, cloaking is when your web application detects a web bot and shows a different page or content, with the goal of affecting search engine ranking. It's generally a rule of thumb to display the same content to web bots as you would to normal users and only use the web bot detection methods shown above for traffic statistical means or other behind-the-scenes activities.

Conclusion

The .NET Framework provides two powerful features for detecting the web browser client and determining web spiders from users' web browsers. .NET 1.0 provides the BrowserCaps feature, which can be updated regularly with new user-agent strings as they become available. .NET 2.0 provides the .BROWSER feature, in addition to the BrowserCaps feature, for incorporating new user-agent matches more seamlessly in web applications. By using web browser and web bot detection responsibly, you can help enhance web application traffic statistics and features, creating a more powerful and resilient C# ASP .NET web application.


BlogEngine.NET | DotNetNuke | Security | Web IIS 6 - IIS 7

Increasing Google Page Rank with Blogengine.NET

August 02
by Marty Hermsen 2. August 2009 12:39

Every BlogEngine.NET user should know this, and even if you are not using BlogEngine.NET the following explanation is interesting and useful. Did you read some of the comments on my website?

Google Page Rank SEO

On my BlogEngine.NET website the comments are NOT moderated and everyone is free to add one. When you look at these comments and follow the commenter's link, you mostly arrive at a store website. Strange. I asked myself whether these comments are real, because most of them say little more than "like your blog", "will come back soon" or "very informative".

On the CodePlex BlogEngine.NET website I found an article about these "strange comments", so I was not the only one receiving them. What is going on? Is there a way to stop these comments and, moreover, WHY are they added? After a small investigation I found that SEORebel (a nickname) was the cause of these strange comments, and he is a smart guy.

How can we stop these 'automatically' added comments?

It is simple to stop these comments by disabling comments or by moderating them, but that is not what I wanted. Just at that moment the BlogEngine.NET developers came up with a small patch. With four lines of code the 'automatically' added comments are stopped, but not the comments that people add by hand for the sake of Google PageRank (GPR).

These days I see more and more comments on my website that are added by hand and talk about the article, but the commenter's link still goes to a store website. For an example, see this article with comments.

Why do they take the time to add comments by hand? The answer is Google PageRank.

Increasing Google PageRank

First of all, what I have to say to the GPR commenters: Google PageRank no longer works for comments on THIS BlogEngine.NET website!

To the users of BlogEngine.NET I want to say: there is a solution, and I will share it here.

Google PageRank is the heart of Google, and for Google's users it is all about money.

If you sell products, it is a huge challenge to be first in the Google search engine. Google PageRank is used by the Google search engine: the higher your reach and rank in Google, the more hits and the more products sold. P1, P2, P3, P4 or P8: every page on your website has a Google PageRank.

Why is the combination of BlogEngine.NET and Google PageRank interesting for commenters who add their link? When Google indexes (crawls) your BlogEngine.NET website, by default all documents are indexed, simply because there is no robots.txt. BlogEngine.NET users could add a robots.txt in the root with rules about what to follow and what not to follow. That could solve the problem, but it takes a lot of effort to maintain.

The question (and the solution) is rather HOW Google visits your website to crawl and index its pages. Yes, with Googlebot. Once you know the name of Google's indexing bot, you can easily redirect the search engine (without robots.txt). Do you already see the solution for BlogEngine.NET? We have to redirect Google's search engine to a new page, created on the fly, without comments. It is that easy. I am sure the GPR commenters will no longer add comments!

Try something like this:

if (Request.UserAgent != null && Request.UserAgent.IndexOf("Googlebot") > -1)
{
   // We have a GoogleBot web crawler.
}
else
{
   // We do not have a GoogleBot web crawler.
}

// Or try this code, which goes a little deeper...

System.Web.HttpBrowserCapabilities clientBrowserCaps = Request.Browser;
if (clientBrowserCaps.Crawler)
{
   Response.Write("Browser is a search engine. I am creating a page without comments");
}
else
{
   Response.Write("Browser is not a search engine. Viewing this page with comments");
}

// For the example above, also read my other article about detecting web crawlers.

// Here is another approach, based on the user agent.

// The header can be missing, so fall back to an empty string.
string userAgent = Request.ServerVariables["HTTP_USER_AGENT"] ?? string.Empty;
if (userAgent.Contains("Googlebot"))
{
   // Log Google bot visit.
}
else if (userAgent.Contains("msnbot"))
{
   // Log MSN bot visit.
}
else if (userAgent.Contains("Yahoo"))
{
   // Log Yahoo bot visit.
}
else
{
   // Similarly, you can check for other search engines.
}

I believe the BlogEngine.NET developers will adopt this advice as the default in the next release.

 

Other Web Links

See an example web page with these comments linked to a web store...

More about detecting Google Bot within the C# .NET Framework on this website

If you want to know more about Google Page Rank then read this great article about

Open Source BlogEngine.NET information is here


BlogEngine.NET | Web IIS 6 - IIS 7

How to change BlogEngine Icon to your own icon...

July 20
by Marty Hermsen 20. July 2009 01:13

Have you ever wanted your own icon in the browser bars of IE 7/8, Netscape or Google Chrome?

Within BlogEngine.NET you only have to change the reference to the icon in the site master page (site.master) of your theme, located in the themes directory.

I changed my icon to a picture of me riding my scooter, after first converting it with the trial version of the Any2Icon program.

Also put the icon file in the theme directory you are using.


BlogEngine.NET | Web IIS 6 - IIS 7

BlogEngine.NET provisioning for Linkedin.com and Twitter.com

July 19
by Marty Hermsen 19. July 2009 18:47

For some time now I have been thinking about an RSS feed with FIX-related news for the FIX Protocol group on LinkedIn.

In my search for RSS feeds about the FIX Protocol I did not find a single one, not even on the official FIX Protocol website or in Google Reader.

So I decided to use the new BlogEngine.NET blog to add an RSS feed to the FIX group. I simply add news to my blog in the category FIX Protocol and the feed is provisioned.

LinkedIn comes back every two hours for updates in the feed.

BlogEngine makes life easier with all the widgets created by developers, and it is still open source.

With BlogEngine.NET it is now also possible to integrate published blog articles with Twitter and tweets.

Why not click the green tweet button? Try it out.

Connecting the world together in a Single Sign On Enterprise environment! Who is thinking about that?


BlogEngine.NET | FIX Protocol | Security

BlogEngine Comment Poster is stopped by update

July 19
by Marty Hermsen 19. July 2009 12:53

Sorry SEORebel, your spam tool has been stopped by a simple update.

You can see how this spam tool works in the video below.

To see how to stop this spam tool, click here.

What is BlogEngine?
BlogEngine is a Microsoft .NET based blogging system used by a large number of people and companies all over the world. The blog system is very vulnerable to link spam!

What is BlogEngine Comment Poster?
BlogEngine Comment Poster is a program that automatically posts comments to blog posts on BlogEngine blogs, these comments can contain your link, and give you link-juice to boost your sites in Google, Yahoo and Bing.

How many blog posts are there to post my link on?
According to Google there are currently 77.800 blog posts indexed:

Are these do-follow links?
Around 75% of the blogs are do-follow, meaning that the search engines will recognize these links as valid and counting. (click here to solve)

Do I have to wait for my links to get approved?
Around 50% of the blogs will have your link on instantly! The rest are moderated. The program will automatically check to see if your link is live after posting a comment, and you have the option to save all blog posts with links on to a text file. That gives the possibility to ping these blog posts, so the search engine will find your links faster.

What success rate can I expect?
Around 40% of your attempted comment posts, should convert into live do-follow links.

What about captchas?
BlogEngine uses a .net hidden-field based captcha system. We've used webbrowser based posting to bypass this, making it appear as human as possible.

What are the requirements for running this program?
Windows and Microsoft .net 3.5

What is the comment posting speed?
The posting speed can be defined in settings. Expect 2-4 posts per minute, including the time it takes to verify if your link is live.

How do I find the blog posts?
You can do a simple google search for:

+"Notify me when new comments are added" +"Powered by blogengine"

Do I get some blog posts together with the program?
Yes, to get you started we've made a text file with 950 blog posts, all with PageRank 2-5!
To make sure that this doesn't get too saturated, we encourage people to create their own collections of blog posts. Use your favorite Search Engine Scraper. Download data-files - The blog posts are in the blogtargets.txt file.


BlogEngine.NET | Security

The evolution and simplicity from Google Chrome

July 05
by Marty Hermsen 5. July 2009 19:58


BlogEngine.NET

osCommerce - Open Source E-Commerce Solutions

June 14
by Marty Hermsen 14. June 2009 23:25

osCommerce has attracted a large growing e-commerce community that consists of over 212,700 store owners and developers who support each other and extend osCommerce Online Merchant with add-ons being contributed on a daily basis. To date there are over 5,500 add-ons that are available for free to customize osCommerce Online Merchant online stores and to help increase sales.

osCommerce Online Merchant is an Open Source online shop e-commerce solution that is available for free under the GNU General Public License. It features a rich set of out-of-the-box online shopping cart functionality that allows store owners to setup, run, and maintain online stores with minimum effort and with no costs, fees, or limitations involved.

With over 8 years of operation, osCommerce has built a showcase of over 14,100 online shops that have been voluntarily added to the live shops section, and powers many thousands of more online shops worldwide.

osCommerce Philosophy

Open Source software provides an opportunity for people to work on software with others that share the same interest, exchanging ideas, knowledge, and work with one another, to expand and improve the solution.

The motivation for working on Open Source software originates at different sources, which include working on the software for fun as a hobby, to make the software meet own requirements, and to bring commercial interest into the software.

It is this combination of motivations that has brought together a team of developers to successfully make what osCommerce is today - and what it will be in the future - and an active and growing community, with each person having their own unique requirements but ultimately sharing the same goal: to use the software and to make it a better solution.

Open Source software always remains open providing the opportunity for anyone that is interested to work on it, at any time.

Because Open Source software is open, it provides a choice. The choice to use the software, the choice to learn the software, and the choice to join, share, and participate in a community - a community full of enthusiastic supporters that want to see the software grow and succeed.

It is this very reason why Open Source software is successful, and most importantly, why it works.

osCommerce (click here) also works very well on Windows 2008 and IIS 7.


ASP.NET | BlogEngine.NET | DotNetNuke | Web IIS 6 - IIS 7

About Me

My name is Marty Hermsen, 45 years young, living in the Netherlands and married to Denise for almost 16 years now. We have no children, but we live with our 'child' dogs in the small village of Kamerik near Woerden, between the cows and sheep, in the middle of nature: a paradise in one of the most densely populated areas in the world.

I work at Fortis Bank Netherland and ABN Amro as an IT Architect. My current activities are the separation of Fortis Netherlands and Fortis Belgium and the integration of Fortis Bank Netherland with ABN Amro: creating a new Enterprise Microsoft Windows Platform based on Windows 2008 and integrating web applications, SharePoint, etc.

Creating a new bank...

click here for more about me
