Eve For Enterprise    Eve Support Community    Forums  Hop To Forum Categories  Resources  Hop To Forums  Community Management    Will search engine robots crawl UBB6 dynamic pages?
RTM
Groupee Member
Greetings,

In general, I love most of the new features in UBB6. There is one nagging issue that is of major concern to me; the new architecture whereby forums and topics are now generated dynamically using CGI.

Up until now, UBB has created static HTML pages which were superb for creating fodder for the search engine crawlers, AKA robots. These HTML pages became extremely valuable assets for seeding the indexes with excellent content which served as natural doorway pages for a site. It was one of the primary benefits of using a discussion forum on a site; not only did it foster a community spirit and stickiness, but it provided the invaluable added benefit of content for the engines.

Now that we have all our content being served on a dynamic basis, I wonder how well the content will be indexed (if at all) by the main search engines. I have read a few posts here that say that some crawlers now follow links with dynamic (CGI and ? links), but I have yet to see results.

Has anyone actually had their CGI UBB6 pages successfully added to the indexes of the Tier 1 search engines (Google/AV/Lycos/Hotbot etc.)? If so, I would be interested in hearing your experiences and seeing the links.

This is a key issue for us. The web site we operate serves a niche community (blues music fans and musicians) and we have succeeded in creating a vibrant, rapidly growing community, with very high quality postings from our registered members.

My concern is that the rest of the world will not find our great content on the search engines because it is all generated on the fly...

Peace,

Rob

 
Posts: 55 | Location: Americas | Registered: February 25, 2001
Groupee Member
I was going to post a message with this exact same question. It's very important for some of us to know. Does anybody have an answer or explanation?

[ 03-05-2001: Message edited by: jcaron ]

 
Posts: 74 | Location: Saratoga Springs, NY | Registered: January 28, 2001
Groupee Newbie
Same question.....

Anybody from Ultimate Bulletin Board Support ?

Thanks in advance...


Laszlo

 
Posts: 17 | Registered: March 02, 2001
Groupee Newbie
I too am perplexed by this...

Brains, I need brains, particularly of search engines...

 
Posts: 23 | Location: 76092 | Registered: February 02, 2001
Groupee Member
Could you guys do this? Like make a little program that converts the main board, and maybe a lot of other pages, into pure HTML, like every day. Then store it in some place so that the search engines can read it. Then when they click on something, it should lead back to your forums with the CGI pages....?
 
Posts: 141 | Registered: January 03, 2000
RTM
Groupee Member
Well, here is a followup: We have submitted our site to all of the Tier 1 search engines and indexes. We have yet to see any of their robots crawl into our UBB dynamic content. The only way we managed to get a few of our topics indexed (on Altavista) was by submitting the individual links directly.

Lycos clearly states that their bot will not crawl dynamic content (such as CGI/Perl stuff). In a sense it's understandable, since they are worried about their bot getting stuck in a trap and looping endlessly...


It is however a major issue as far as getting eyeballs to our site. We have hundreds of content-rich topics that are invisible to the majority of netizens...

Will be looking into creating html pages (on the fly) that can be used as content for the spiders.

Cheers,

Rob

The Temple of Blues

 
Posts: 55 | Location: Americas | Registered: February 25, 2001
Some of the newer technology search engines will. Google, for example, will be able to; older ones will not. If they want to keep up with the Googles of the world they will update as well.
 
Posts: 12039 | Registered: March 12, 1999
RTM
Groupee Member
quote:
Originally posted by David Dreezer:
Some of the newer technology search engines will. Google, for example, will be able to; older ones will not. If they want to keep up with the Googles of the world they will update as well.

Greetings Navaho - big fan, your posts have been of significant assistance, and we can understand why you are now an official UBB dude.

Your point about the engines adapting their technology is certainly valid. I used to be a hardcore AV fan, as I reminisce back in the day when DEC bought the Altavista.com domain name for $3M. I was a hardcore addict and AV evangelist, in part because they were running Digital Unix on their Alphaservers, and bottom line, AV rocked.

Then, they ... umm, portalized, added 1001 lead generation links to their main page, and quickly the site became annoying. Along came Google, with its simple interface and equally hardcore backend. AV responded with RagingSearch.com (too little, too late).

This being said, I agree with you on the main point - the search engines/indexes will have to adapt to dynamic content indexing issues. But the fact remains that the vast majority of search engines will not crawl dynamic content - this includes Google. Even if the spiders do not explicitly exclude CGI/ASP content, they certainly do not visit the site quickly to crawl it.

There is very little deep content of UBB6 sites on any engine, including Google. OTOH, there is plenty of HTML content that is happily crawled, indexed and served on 90%+ of the SE world.

Perhaps what is most needed is an HTML doorway generation tool for the dynamic content of UBB6x just for the engines. I think I have some of this worked out, but nothing automated yet. (I use an offline website utility to gen the html, then create an index page... the genned html pages contain links to the CGI of UBB, so eventually people are pulled back into the live site.) It's definitely not the most elegant solution.

Peace,

The Temple of Blues Crew

 
Posts: 55 | Location: Americas | Registered: February 25, 2001
Groupee Member
I would like to know if this is being addressed at all, as I am still in need of the answer.
 
Posts: 425 | Location: Largo, FL | Registered: January 27, 2001
Google will search the UBB. There isn't as much to find there yet because, though the UBB has been around 5 years now, UBB6 has been around only a few months. And certainly not all owners have made the switch.

You're going to find that almost all of the search engines will be able to follow these queries, because almost all of the larger message boards are now using dynamic queries to display content. I can't think of one of the larger or more popular message boards that does not.

 
Posts: 12039 | Registered: March 12, 1999
Whoops Please excuse my "Still Sunday morning first cup of coffee not quite with it yet manners"

Thank you for that compliment Temple of Blues

 
Posts: 12039 | Registered: March 12, 1999
Groupee Newbie
Temple of Blues is correct: the engines need to catch up. Along with HTML doorway pages, submit your site to the Open Directory Project; their downliners are fairly deep and it's free. You can always use GoTo and pay per click, but that could be anywhere from a nickel to 2.00 a hit, though you can be plastered all over the Internet. Looksmart is a 199.00 one-time charge and their downliners include MSN, so lots of hits. Down side: it costs money.

You can do the best ya can till they catch up. That's technology for ya.

[ 06-29-2001: Message edited by: sundancerz ]

 
Posts: 14 | Location: Salt Lake City | Registered: June 28, 2001
J.C.
Has anyone tried a robots.txt file, designating which pages to index and which to exclude? This might help with the indexing dilemma.
 
Posts: 8527 | Location: Earth | Registered: September 18, 1998
Groupee Member
This is all great, but we still do not have a solution. We have no ability to get the search engines to upgrade, and for now we need our pages indexed. I realize the change was done to improve the service, but we need to get our pages listed today! I agree with the previous post: I think InfoPop should create an optional script that will allow us to create the "old style" html pages until the dynamic content is indexed. This could be an addition to the cp or simply an "unofficial/unsupported" script that we would run from the shell prompt. Either way, I think this is something InfoPop should help us with.
 
Posts: 45 | Location: Miami, Florida, 33178 | Registered: September 09, 1999
Groupee Newbie
Hello Everyone,

I think I have a solution for all of you on 6.0 versions. We have launched a search engine directly indexing message board threads. We search at the thread level and display at the thread level. We have been developing the tech to do this quickly and efficiently. Check out our search: http://www.boardreader.com
Please let me know if you like this search engine. I need the feedback, as we just launched on 7.18.01. Anyway, if you like it, submit your site and we will index it for you. Thanks.

Scott
www.boardreader.com

 
Posts: 15 | Location: Ann Arbor, MI | Registered: July 20, 2001
Groupee Member
(in a small, terrified voice) the UBB pages are indexed? What people post there will turn up in SEARCH ENGINES! Good God! I didn't know that. I thought you had to add something to the pages to make them turn up in search engines. We have to coax some of our people to speak because they're afraid if they say something bad about another dancer or choreographer, it will hurt them professionally. They would not want, say, in a search for Baryshnikov to have what they write on our board turn up in a search engine string!

So I have the opposite question/problem: how do I stop the search engines at the gate? I hope private forums, at least, are un-Googleable.

Alexandra

 
Posts: 241 | Location: Washington, DC | Registered: November 10, 1998
Groupee Newbie
Private forums with passwords to view are protected for the most part.
 
Posts: 15 | Location: Ann Arbor, MI | Registered: July 20, 2001
RTM
Groupee Member
quote:
Originally posted by David Dreezer:
Whoops Please excuse my "Still Sunday morning first cup of coffee not quite with it yet manners"

No problem, Navaho, that is why I couldn't even come near the 'puter before having my first (2-3) cups of coffee

quote:
Originally posted by JC:
Has anyone tried a robots.txt file, designating which pages to index and which to exclude?

We created a robots.txt file that includes the following parameters:

user-agent: *
disallow: /cgi-bin/
allow: /cgi-bin/ultimatebb.cgi
allow: /cgi-bin/ultimatebb.cgi?ubb=forum*
allow: /cgi-bin/ultimatebb.cgi?ubb=get_topic*
allow: /cgi-bin/ultimatebb.cgi?ubb=getbio*
allow: /cgi-bin/ultimatebb.cgi?ubb=get_profile*

This is meant to deny robots access to all UBB CGI pages except those that contain forum indexes, topic pages, and user profiles. Of course, this only applies to search engine crawlers that respect the robots.txt standard, and the "allow" lines are an extension to the original standard, so not every crawler will honor them.
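
A quick way to sanity-check rules like these is to run them through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser, which applies rules in file order (the first matching line wins), so with "disallow: /cgi-bin/" listed first the topic pages come back as blocked. The example.com host, the other.cgi script, and the ExampleBot name are made up for illustration. Real crawlers may resolve allow/disallow conflicts differently (some use the most specific match instead), so treat this as one parser's reading and not a prediction of what any given engine does.

```python
from urllib.robotparser import RobotFileParser

# The rules as posted above: the disallow line comes first.
POSTED_RULES = """\
user-agent: *
disallow: /cgi-bin/
allow: /cgi-bin/ultimatebb.cgi
allow: /cgi-bin/ultimatebb.cgi?ubb=forum*
allow: /cgi-bin/ultimatebb.cgi?ubb=get_topic*
allow: /cgi-bin/ultimatebb.cgi?ubb=getbio*
allow: /cgi-bin/ultimatebb.cgi?ubb=get_profile*
"""

# Same idea, but with the allow line ahead of the disallow line.
REORDERED_RULES = """\
user-agent: *
allow: /cgi-bin/ultimatebb.cgi
disallow: /cgi-bin/
"""

# Hypothetical URLs, used only for this check.
TOPIC_URL = "http://www.example.com/cgi-bin/ultimatebb.cgi?ubb=get_topic"
OTHER_URL = "http://www.example.com/cgi-bin/other.cgi"


def allowed(rules: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if this parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("posted", POSTED_RULES), ("reordered", REORDERED_RULES)):
        print(name, "topic page allowed:", allowed(rules, TOPIC_URL))
        print(name, "other script allowed:", allowed(rules, OTHER_URL))
```

Listing the allow lines before the disallow line is a reasonable hedge against first-match parsers. Note also that "allow: /cgi-bin/ultimatebb.cgi" is a prefix of every ultimatebb.cgi URL, so under plain prefix matching it already covers the more specific allow lines.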

quote:
Originally posted by Alex Rodriguez:
allow us to create the "old style" html pages until the dynamic content is indexed.

Alex - we were thinking the same thing. Unfortunately, there doesn't seem to be any script available to convert the dynamic 6.x content into static HTML files. We used an offline website downloader to generate html files from our UBB, and then we created a doorway page with links to the individual topic pages in HTML format. The pages themselves then contain links to the actual UBB cgi pages.
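
The doorway approach described above could be automated along these lines. This is only a sketch, not an InfoPop script: it assumes you already have each topic's title, some of its text, and its live CGI URL (for example, pulled out of the files an offline downloader saved). The Topic fields, the file names, and the example.com URL are all hypothetical.

```python
import tempfile
from html import escape
from pathlib import Path
from typing import Iterable, List, NamedTuple


class Topic(NamedTuple):
    slug: str      # hypothetical file-name stem, e.g. "topic-000001"
    title: str
    excerpt: str   # plain text taken from the topic
    live_url: str  # link back to the dynamic UBB page


def render_topic(topic: Topic) -> str:
    """Build a static HTML page that links back to the live CGI topic."""
    return (
        "<html><head><title>{title}</title></head><body>\n"
        "<h1>{title}</h1>\n"
        "<p>{excerpt}</p>\n"
        '<p><a href="{url}">Join this discussion on the live forum</a></p>\n'
        "</body></html>\n"
    ).format(
        title=escape(topic.title),
        excerpt=escape(topic.excerpt),
        url=escape(topic.live_url),
    )


def render_index(topics: Iterable[Topic]) -> str:
    """Build the doorway page: one link per static topic page."""
    items = "\n".join(
        '<li><a href="{0}.html">{1}</a></li>'.format(escape(t.slug), escape(t.title))
        for t in topics
    )
    return (
        "<html><head><title>Forum topics</title></head><body>\n"
        "<ul>\n" + items + "\n</ul>\n</body></html>\n"
    )


def write_site(topics: List[Topic], out_dir: Path) -> List[Path]:
    """Write one page per topic plus index.html; return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for topic in topics:
        page = out_dir / (topic.slug + ".html")
        page.write_text(render_topic(topic), encoding="utf-8")
        written.append(page)
    index = out_dir / "index.html"
    index.write_text(render_index(topics), encoding="utf-8")
    written.append(index)
    return written


if __name__ == "__main__":
    demo = [
        Topic(
            "topic-000001",
            "Favorite slide guitar players",
            "Who are yours?",
            "http://www.example.com/cgi-bin/ultimatebb.cgi?ubb=get_topic",
        )
    ]
    # Write into a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_site(demo, Path(tmp)):
            print(path.name)
```

Run from a daily scheduled job, something like this would keep the static copies reasonably fresh, while every page still points visitors back to the live board.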

quote:
Originally posted by boardreader:
We have launched a search engine directly indexing message board threads. We search at the thread level and display at the thread level.

Boardreader (scott) : Your site seems to have interesting technology - we have submitted one of our discussion forum sites (www.templeofblues.com) and we hope that your project takes off! It would be good to see you license your search results to other major search engines.

quote:
Originally posted by Alexandra:
So I have the opposite question/problem: how do I stop the search engines at the gate? I hope private forums, at least, are un-Googleable

Alexandra, well, you and your users should be aware that anything they post in any public Internet forum (be it a UBB, USENET, or a chat room) is available for all participants and lurkers to see. This includes search engines. One solution, as boardreader mentioned, is to make all your forums private. Even then, any registered member can read all the posts!

To prevent search engines from indexing your UBB content, see my robots.txt code further up in this post, but keep only the "disallow: /cgi-bin/" line and leave out the "allow:" lines.

 
Posts: 55 | Location: Americas | Registered: February 25, 2001
RTM
Groupee Member
I was checking our server logs, and noted that Altavista's robot/spider, Scooter, pounded our website indexing dynamic content within our UBB. It also attempted to hit a lot of circular referential links that seemed to draw it into the dreaded spider trap. It has not revisited our site, and we have not yet seen the dynamic content pages added to their index.

Is there anyone else, with access to their raw server logs, that noticed a visit from Altavista?
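
For anyone who wants to check their own logs for this, here is a rough sketch. It assumes the server writes the Apache "combined" log format (which includes the user-agent field) and that the crawler identifies itself with "Scooter" somewhere in its user-agent string. The sample lines, IP addresses, and the "Scooter/3.2" string below are invented for the demo and are not taken from real logs.

```python
import re
from collections import Counter

# Assumed layout: Apache "combined" log format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawler_hits(lines, agent_substring="Scooter"):
    """Count a crawler's requests, split into dynamic and static URLs."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if agent_substring.lower() not in match.group("agent").lower():
            continue
        path = match.group("path")
        # Treat CGI paths and query-string URLs as dynamic content.
        kind = "dynamic" if ("?" in path or "/cgi-bin/" in path) else "static"
        hits[kind] += 1
    return hits


# Invented sample lines for illustration only.
SAMPLE = [
    '192.0.2.10 - - [20/Jul/2001:04:12:01 -0400] "GET /cgi-bin/ultimatebb.cgi?ubb=forum HTTP/1.0" 200 5120 "-" "Scooter/3.2"',
    '192.0.2.10 - - [20/Jul/2001:04:12:09 -0400] "GET /index.html HTTP/1.0" 200 2048 "-" "Scooter/3.2"',
    '198.51.100.7 - - [20/Jul/2001:04:13:00 -0400] "GET /cgi-bin/ultimatebb.cgi?ubb=forum HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
]

if __name__ == "__main__":
    print(crawler_hits(SAMPLE))
```

To run it on a real log, pass an open file object instead of SAMPLE. A large "dynamic" count with no later appearance in the index would match what we saw.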

This is interesting - they are obviously struggling fiercely to compete - perhaps they are testing out the indexing of dynamic content. Their FAQ pages clearly state that they do not index dynamic content. Hopefully, they are adapting their spider technology.

 
Posts: 55 | Location: Americas | Registered: February 25, 2001
RTM
Groupee Member
Note - Altavista's Scooter robot dropped by our site around July 20th. We host other sites with dynamic content (non-UBB) and it also saw visits from Altavista around this date.

[ 07-31-2001: Message edited by: Temple of Blues ]

 
Posts: 55 | Location: Americas | Registered: February 25, 2001