Facebook Robots

Discussion in 'Facebook' started by Converse, Nov 4, 2014.

  1. Converse Active Member


    The new forum software differentiates between human guests and robots (Members / Current Visitors / Robots), and one interesting thing that I have noticed more than once is a robot that is identified as Facebook. Is Facebook planning something new, or do its current operations make use of a spider?

  2. Brox New Member


    That new software is very interesting, and this forum is the first place I have ever seen it. It surprises me every time I open the home page. Really funny. :)

    I think that Facebook uses robots for web scraping, and when a robot presents itself as Facebook, it has come to collect some data. Maybe it uses robots for some other purposes too, but I think that data collection is the primary goal.

    This means that Facebook is very active on the Internet. It is using robots to follow the developments of other similar social networks and forums. Facebook is always planning something new, and it wants to see what competitors are doing.

    I once had a job offer, and after that I looked a little into robots and their purpose on the Internet. I found some useful new things, so afterwards I wanted to start using robots on Facebook, but in the end I never did.

  3. SimplySidy Member


    Oh, so they are still doing this. In fact, I read about the crawler (or "scraper", as Fb called it) around 2010. When I had my site up and we integrated the Fb Plugin and set up the page (now defunct), we saw this on our servers too. We contacted Facebook but there was no response from their end (as happens with the biggies - they don't care about their users :D )

    So we did a search and found that Fb sends out "scrapers" to sites that get themselves a page or put up a Facebook Plugin. This is done to authenticate the site and also to keep track of what activity is going on on the connected site. According to the Fb documentation as of today:

    Our system retrieves this information only after a user provides us with a link.

    You can find out about this here - facebook.com/externalhit_uatext.php (it is just a small paragraph with the above as its gist).

    So, yeah, maybe your inclusion of the Login Via Fb or the Recommend on FB has got your site listed with Fb. :)
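
    For anyone who wants to spot these visits in their own server logs, here is a minimal sketch. It assumes the crawler announces itself with a User-Agent containing "facebookexternalhit" (the name used on the page above) or "facebot"; the function name and the pattern are only illustrative, and a User-Agent can be faked, so treat a match as a hint rather than proof.

```python
import re

# Assumption: the link-preview crawler sends a User-Agent containing
# "facebookexternalhit" (some requests may use "facebot" instead).
FACEBOOK_UA_PATTERN = re.compile(r"facebookexternalhit|facebot", re.IGNORECASE)


def is_facebook_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string looks like Facebook's crawler."""
    return bool(FACEBOOK_UA_PATTERN.search(user_agent or ""))


if __name__ == "__main__":
    samples = [
        "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/120.0",
    ]
    for ua in samples:
        print(is_facebook_crawler(ua), ua)
```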

    ps: Cannot post a link as I do not have 25+ posts (have 20 at the moment). Hence, I removed the http from the link above.

  4. Converse Active Member


    I adjusted that down from 25 to 20, so you should be able to on your next post.

  5. toradrake Member


    Probably a stupid question, but I am going to ask anyway... what impact is this going to have on websites? Do you think it might help people's websites in the search engines, or will it have no effect at all?

  6. SimplySidy Member


    As far as my understanding goes:
    1. FB is not a search engine (as of now).
    2. Google and other search engines (to my knowledge) do not take into consideration who crawls the site in question.
    3. FB only crawls to collect info from your site - maybe to check legitimacy (that is, whether the content shared via Facebook is actually there on your site) or for something else, but definitely not to judge your site on SEO aspects (again, my best guess).

    I do know there are too many guesses here. Still, I don't think FB crawling your website (or any other) has anything to do with the results from Google, especially on the SEO side.

  7. Scorp Member


    One reason why Facebook crawls a site is to get information about it for when stuff is shared from that site onto Facebook.

    And that's why the debug tool exists, here it is: https://developers.facebook.com/tools/debug/

    Over a year ago I spent a whole day trying to figure out why my blog's homepage URL, when I shared it to Facebook, didn't show the picture I wanted it to show, even though I had implemented all the og: meta tags and things properly.

    Seriously, the above link was a godsend. What happened was, they had already crawled and "indexed" my site before I implemented the meta tags and put the picture I wanted as the header, and so Facebook kept showing the same old random picture it had picked up as my blog's homepage photo - because they had that in their cache - and it was actually an RSS feed button I had on there, doh...

    So after running my blog's homepage through the debug tool, they re-crawled my site, saw the new meta tags, and then started doing what I wanted them to do.

    I have no idea if they have any other reason for sending robots to a site, but I'm sure that this is one of them.
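
    To see roughly what a crawler would pick up from the og: meta tags mentioned above, here is a minimal sketch using only Python's standard library. The sample page, the example.com URLs and the names OpenGraphParser and extract_open_graph are made up for illustration; this is not Facebook's actual logic, just a way to list the og: properties a page declares.

```python
from html.parser import HTMLParser

# Hypothetical page used only for illustration.
SAMPLE_HTML = """
<html>
  <head>
    <title>My blog</title>
    <meta property="og:title" content="My blog">
    <meta property="og:image" content="https://example.com/header.jpg">
    <meta property="og:description" content="Posts about things.">
  </head>
  <body><p>Hello</p></body>
</html>
"""


class OpenGraphParser(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html: str) -> dict:
    """Return a dict of og: properties declared in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


if __name__ == "__main__":
    for name, value in extract_open_graph(SAMPLE_HTML).items():
        print(name, "=", value)
```

    If og:image is missing from the output, or points at something unexpected, that would be the first thing to fix before asking the debug tool to re-crawl.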

  8. toradrake Member


    Awesome little tool. Thank you for sharing. I just used it and it says that my meta tags are in the body, not the head... however, when I went and checked, the meta tags were in the head, not the body. So now I am wondering if there is a problem in my HTML. I will have to figure out how to correct it.
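
    One possible explanation, offered only as a guess since the page itself isn't shown here: HTML parsers close the head as soon as they meet an element that isn't allowed there (a stray div, img or similar), so any meta tags after that point count as being in the body even though they sit before the closing head tag in the source. The sketch below roughly imitates that rule with Python's standard library. The names MetaPlacementChecker and check_og_placement and the list of allowed tags are my own simplification, not how Facebook's tool works.

```python
from html.parser import HTMLParser

# Simplified assumption: only these elements may appear inside <head>;
# anything else is treated as implicitly ending the head.
HEAD_ALLOWED = {"title", "meta", "link", "style", "script", "base", "noscript", "template"}


class MetaPlacementChecker(HTMLParser):
    """Records whether each og: meta tag is effectively inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.placement = {}

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
            return
        if self.in_head and tag not in HEAD_ALLOWED:
            # A disallowed element ends the head early.
            self.in_head = False
        if tag == "meta":
            prop = dict(attrs).get("property") or ""
            if prop.startswith("og:"):
                self.placement[prop] = "head" if self.in_head else "outside head"

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_og_placement(html: str) -> dict:
    """Map each og: property to 'head' or 'outside head'."""
    checker = MetaPlacementChecker()
    checker.feed(html)
    return checker.placement


if __name__ == "__main__":
    # Hypothetical page: the <div> ends the head before the meta tag.
    page = (
        '<html><head><div>banner</div>'
        '<meta property="og:image" content="https://example.com/a.jpg">'
        '</head><body></body></html>'
    )
    print(check_og_placement(page))
```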

  9. Aree Wongwanlee New Member


    I vaguely remember reading somewhere that Facebook might be considering becoming a search engine, too. So those Facebook bots may just be the beginning. Myself, I have no problem with search bots from Facebook crawling all over my site. The more the merrier, I say.

  10. Scorp Member


    One thing though, their debug tool is oftentimes buggy as well.

    I mean, I've read of people who had issues reported by the tool even though - as they claimed - they were confident that everything on their website was in order, as it should be.

    Generally, I've had more issues with Facebook than with any other social network out there. Of course, that does make sense, since they are the biggest social network out there by far, but still...

    Just take any errors they give you with a grain of salt. And if it doesn't affect you any (thumbnail pictures, descriptions, etc.), you can just let it go.

    Read up on it. The debug tool is an interesting thing, and it has helped me at times... It still does: if I make a change to a page and want to share it on Facebook with the new photo I posted as the thumbnail, the debug tool will make Facebook re-crawl my site and show the proper thumbnail photo...

    Funny thing, recently I started a Blogger blog at an address that was previously owned by someone else and then deleted. It was non-existent for some time. Then I registered that domain, posted an article, and when I shared that article on FB I got some thumbnail pictures I had never seen before, which was confusing...

    Then I realized that Facebook had simply cached the photos from the previous blog and still remembered them. Then I debugged the site and things went back to normal :)


