
Server Logs with RK=0/RS=2 .... I now know what these are

Discussion in 'General PHP and MySQL Discussions' started by tenants, May 10, 2014.

  1. tenants

    tenants Well-Known Member

    RK=0/RS entries in server logs will predominantly be seen on CMS forums/blogs

    So, I was looking through my server logs ... as you do, and noticed quite a lot of 404s containing RK=0/RS=

    Examples:

    /threads/bla.122//RK=0/RS=vPK0kuDSeb3CaVdcjJW.N298Nh0-
    /threads/bla.1685//RK=0/RS=.qGGYV08XMiLnmLDOHYZP215QDg-
    /threads/bla.1484//RK=0/RS=hdjCK92yAOrAf65E0Krp.kmnx1s-
    /forums/bla/page-51/RK=0/RS=EEfWB5KiTot_hgPI1uel0WxcOeU-
    /threads/bla.1685//RK=0/RS=_81PdaR2Bj.0OrpPmDAm7BxJYSc-

    (bla, being forum/thread title)
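
    To see how many of the 404s in a log are of this kind, something like the following could be used. It is only a rough sketch: the regular expression is inferred from the five examples above, and the function names are made up for illustration.

```python
import re

# Matches the tail seen in the examples above,
# e.g. "/RK=0/RS=vPK0kuDSeb3CaVdcjJW.N298Nh0-" at the end of the path.
# The exact shape of the tail is an assumption based on those samples.
YAHOO_TAIL = re.compile(r"/RK=\d+/RS=[^/\s]*$")


def is_yahoo_scrape_path(path):
    """Return True if the request path ends in an RK=/RS= tail."""
    return YAHOO_TAIL.search(path) is not None


def split_404s(paths):
    """Split an iterable of 404 request paths into (scraped, other)."""
    scraped, other = [], []
    for p in paths:
        (scraped if is_yahoo_scrape_path(p) else other).append(p)
    return scraped, other


sample = [
    "/threads/bla.122//RK=0/RS=vPK0kuDSeb3CaVdcjJW.N298Nh0-",
    "/forums/bla/page-51/RK=0/RS=EEfWB5KiTot_hgPI1uel0WxcOeU-",
    "/threads/missing.999/",  # an ordinary 404, for contrast
]
scraped, other = split_404s(sample)
print(len(scraped), "scrape 404s;", len(other), "other 404s")
```

    Whatever ends up in the "other" list is what is worth a closer look.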

    I've searched around for these, and some people have incorrectly suggested they are hack attempts or botnets looking for exploits. They are not!

    I have figured out what these are... (since I look into quite a lot of spam-related things)

    When people use XRumer/ScrapeBox or other spam-bot software, they also use scrapers (some of these tools come with scrapers built in)

    Scrapers are effectively tools that look through sites / search engines for links / emails / content. (Email scrapers such as Tarantula exist for leeching email addresses from sites / search engines)

    If you want to post on forums (automated), you'll often want to do it where the site content is related. In order to post on related forums, it is useful to build up a list of forums and forum threads containing related content. To get the list of related links, you will need to search and scrape something (either a site or a search engine)

    For instance, if I wanted to build a list of XenForo sites, I would scrape every page on xenforo.com for links, and search Google, Yahoo and Bing for the footer "Forum software by XenForo". This will find some of them.

    These link lists can then be used for automated submission

    However, there will be a lot of inexperienced scrapers, and some of these will use Yahoo to scrape their links (some won't do this correctly)

    If you just search Yahoo for xenforo:
    https://uk.search.yahoo.com/search?p=xenforo

    On right-clicking a result link and opening it in a new window, you will notice the link contains RK=0/RS=
    XenForo: http://r.search.yahoo.com/_ylt=A9mSs2bLBG5TljkArRpLBQx.;_ylu=X3oDMTE0b3A0cGFyBHNlYwNzcgRwb3MDMQRjb2xvA2lyMgR2dGlkA1FJVUswMV85Mg--/RV=2/RE=1399747915/RO=10/RU=http://xenforo.com//RK=0/RS=GTm15Mn4bwTI1PDUAp5MlwKMQNs-

    So, these are the links that are actually being scraped from Yahoo:
    http://xenforo.com/RK=0/RS=GTm15Mn4bwTI1PDUAp5MlwKMQNs-

    It's not a good link to scrape, since it leads to a page that doesn't exist (botters often won't care, as long as some of the links in their list work), and so it produces the 404s I have seen in my logs
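
    The difference between a careless and a careful scrape of that Yahoo link can be sketched like this. It is a guess at the mistake, not taken from any real scraper: both function names are hypothetical, and the percent-decoding step is an assumption in case Yahoo encodes the RU value.

```python
import re
from urllib.parse import unquote

# The Yahoo result link quoted above.
YAHOO_LINK = (
    "http://r.search.yahoo.com/_ylt=A9mSs2bLBG5TljkArRpLBQx.;"
    "_ylu=X3oDMTE0b3A0cGFyBHNlYwNzcgRwb3MDMQRjb2xvA2lyMgR2dGlkA1FJVUswMV85Mg--"
    "/RV=2/RE=1399747915/RO=10/RU=http://xenforo.com//RK=0/RS=GTm15Mn4bwTI1PDUAp5MlwKMQNs-"
)


def naive_target(yahoo_link):
    """Careless version: keep everything after '/RU=', tracking tail included."""
    return unquote(yahoo_link.split("/RU=", 1)[1])


def clean_target(yahoo_link):
    """Careful version: keep only the RU value, stopping at the '/RK=' segment."""
    m = re.search(r"/RU=(.*?)/RK=", yahoo_link)
    return unquote(m.group(1)) if m else None


print(naive_target(YAHOO_LINK))  # still carries the RK=0/RS=... tail
print(clean_target(YAHOO_LINK))  # just the destination URL
```

    A scraper doing the first thing ends up requesting URLs with the RK=0/RS= tail still attached, which is the pattern showing up in the logs.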

    I'm now avoiding these 404s by just redirecting them via .htaccess:
    Code:
        RewriteEngine On
    
        # oddly behaving bots: these are URLs scraped from Yahoo (botters scraping for links; Yahoo search links contain RK and RS). tenants modification:
        RewriteRule ^(.*)RK=0/RS= /$1 [L,NC,R=301]
        # fallback for any other path that still contains RS=
        RewriteRule ^(.*)RS= /$1 [L,NC,R=301]
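
    To check what the RK=0/RS= rule will do before putting it live, its pattern can be tried out in Python. This is only an approximation: it assumes the .htaccess sits in the document root, it uses re.IGNORECASE in place of [NC], it does not model any slash clean-up Apache may do, and redirect_target is a made-up helper name.

```python
import re

# Same pattern as the RK=0/RS= RewriteRule; IGNORECASE stands in for [NC].
RULE = re.compile(r"^(.*)RK=0/RS=", re.IGNORECASE)


def redirect_target(request_path):
    """Return the 301 target for a request path, or None if the rule does not apply.

    In .htaccess context mod_rewrite matches the path without its leading
    slash, so the slash is stripped before matching and put back afterwards.
    """
    m = RULE.match(request_path.lstrip("/"))
    if m is None:
        return None
    return "/" + m.group(1)


print(redirect_target("/forums/bla/page-51/RK=0/RS=EEfWB5KiTot_hgPI1uel0WxcOeU-"))
print(redirect_target("/forums/bla/page-51/"))
```

    The first call gives the clean forum URL and the second gives None, so normal requests are left alone.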
    
     
    Last edited: May 11, 2014
  2. tenants

    tenants Well-Known Member

    In a way, by 301 redirecting them, I'm actually helping the bot find the correct page... but then, I know they can't do anything on my sites. I just want to avoid the build-up of 404s so that I can look for real 404s in my logs

    You don't have to do anything about these; it's just a bit more reassuring to know why they exist
     
