A while ago I wrote a post about using command-line tools to get stats for this blog. I recently wrote another version to find the most popular posts here, sorted with the most popular at the top. I love that this can all be done in one command.
Here’s the command:
tail -n 1000000 access_log | grep 'GET /blog' | cut -d" " -f 7 | grep -E -v '\.png|\.jpg|wp-includes|\.css|/page/|/category/|xmlrpc|wp-trackback|/feed/|wp-login|/wp-content/|/trackback/|wp-comments|wp-app\.php|wp-admin|comment-page|index\.php|\?p=|page_id|comments|feed' | sort | cut -d"/" -f 3 | uniq -c | grep -v ' 1 ' | sort -nr > plip.blog.tops.txt
This breaks down into the following:
- get the last 1,000,000 lines of the blog access log
- look for requests to “/blog”
- split by space, and get the 7th field, the URL being requested
- exclude a ton of items
- sort the results
- split by the “/” slash and get the 3rd field, the post name (slug) in the URL
- collapse repeated post names into a unique list with a count for each
- remove the singletons (posts requested only once)
- reverse numeric sort so the most popular is at the top
- write it all to a file called plip.blog.tops.txt
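For anyone who wants to reuse this, here is a minimal sketch of the same steps wrapped in a shell function. It is not the exact command above: the function name top_posts is made up for illustration, the exclusion list is shortened, and it assumes the Apache combined log format (request path in the 7th space-separated field). It also makes two changes on purpose: the slug is cut out before sorting, so requests with and without a trailing slash are counted together, and awk drops the singletons instead of grep -v ' 1 '.

```shell
#!/bin/sh
# top_posts (hypothetical helper): read access-log lines on stdin and
# print "count slug" pairs, most requested first.
# Assumes the Apache combined log format, where field 7 is the request path.
top_posts() {
    grep 'GET /blog/' |
        cut -d ' ' -f 7 |
        # Shortened exclusion list; dots and the question mark are escaped
        # so they match literally.
        grep -E -v '\.png|\.jpg|\.css|/feed/|/page/|/category/|\?p=' |
        # Field 3 of "/blog/slug/..." is the slug.
        cut -d '/' -f 3 |
        # Drop requests for "/blog/" itself, which leave an empty slug.
        grep -v '^$' |
        sort |
        uniq -c |
        sort -nr |
        # Keep only slugs requested more than once, without the padding
        # that uniq -c adds.
        awk '$1 > 1 { print $1, $2 }'
}
```

It could then be run much like the one-liner, for example: tail -n 1000000 access_log | top_posts > plip.blog.tops.txt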
The results are in! The winner is currently chocolate-crinkle-cookie-photos! W00T
137 chocolate-crinkle-cookie-photos
119 two-loves-css-recaptcha
109 24-hours-in-photos
104 our-pet-venus-fly-trap
103 ruby-less-way-to-add-key-frames-to-flv-videos-for-the-likes-of-jwplayer
94 toss-your-salad-code
91 update-firefox-does-have-reset-more
91 firefox-reset-is-really-launch-in-safe-mode
84 keep-those-passwords-safe
81 photos-food-bikes-sunsets-and-stars
79 thoughts-on-very-large-monitors
78 when-the-cat-is-away-the-worms-will-play
76 photos-from-around-the-bay
76 our-tree
75 one-foggy-morning-in-my-commute
74 wordpress-exploit-fog-fruit-plants-and-plates
72 recaptcha-now-google-recaptcha-will-help-google-books
72 from-burning-man-town-to-oaktown
67 gmaps-pedometer-google-calc-8-94607843-minutes-per-mile
66 the-massive-compost-tower
65 on-theft-privacy-and-data-loss
64 pizza-and-dough-from-scratch
60 this-is-not-an-ipad
60 go-faster-encoding
57 fixed-theme-wp-updated-more-wp-hacks
44 every-vehicle-is-a-prius
42 photorec-to-the-rescue
41 the-very-very-poor-mans-google-analytics-tail-cut-sort-uniq-wc
41 on-comcast-internet
38 taking-the-plunge-safari-4-full-time
35 secret-jumps-of-tunnel
35 i-got-four-cores-but-a-distributed-load-aint-on-one
34 stir-fry-dinner
33 tasty-comfort-food
32 fancy-diff
26 how-to-fix-zend-studio-5-5-zde-in-os-10-6-snow-leopard
24 ping-traceroute-and-quotes
22 wordpress-rich-mans-blog-poor-mans-cms
21 new-news-old-open-source
20 old-broken-usb-hub-ipod-charger
19 gmail-contest
19 alternate-way-to-have-google-analytics-track-pdfs
17 this-is-what-makes-a-happy-saturday
17 macchiato
16 american-born-chinese
15 rogue-mysql-queries
15 fixed-gear-slipped-chain-thankful-for-brake
13 simple-wp
13 plip-is-no-longer-a-cobblers-child
11 plix-plixing-better
11 itunes-imovie-on-lenovos-new-media-center-pc
10 wonderful-bike-lane-signs
10 this-is-what-makes-a-happy-sunday
10 plip-ts-on-your-back
9 plipgo-01-released
9 bart-speaks
8 yet-another-redesign
7 update-plip-content
7 plixing-for-pleasure
7 plip-for-peace
7 long-be-gone
7 kodiak-11-released
7 dot-com-casualty
7 dont-just-commit-commit-intelligently
6 verge-works-solves-all-your-woes
6 simpsons-for-ever
6 simple-is-better
6 plip-gets-its-own-dictionary
5 aids-ride-completed
d00d – this blog has been up since 2001? how did i miss that? anyway, thanks for the sitemap.
Hah – no, the blog has only been up for…um…9 months? No! Goodness! It’s been a year almost to the day! I imported all the old “posts” from the news system I wrote into wordpress. My first post was Feb 24th, 2009. How apropos that on the 27th I wrote about the top posts which likely span the last year. Hah!