Tag Archives: HOWTO

Displaying what you read from Google Reader

I’ve been wanting to share what I subscribe to in Google Reader, and using the functions I wrote I was able to do just that. Check out the article for the full rundown on the unofficial Google Reader API. This is written in Python but should be easily portable to PHP. If I get around to it, I want to make a WordPress plugin so bloggers can share what they read with their readers. This will be followed (or accompanied, depending on my mood) by a JavaScript version so Blogspot users can do the same in the sidebar. On to the code!

To start off we’ll just copy the functions we need from last time: the login (SID) and token functions, as well as the feed list function.

from django.shortcuts import render_to_response
from django.template import Library
from elementtree import ElementTree
import urllib
import urllib2
import re

login = 'timothy.broder@gmail.com'
password = '***'
source = 'gPowered'

google_url = 'http://www.google.com'
reader_url = google_url + '/reader'
login_url = 'https://www.google.com/accounts/ClientLogin'
token_url = reader_url + '/api/0/token'
subscription_list_url = reader_url + '/api/0/subscription/list'
get_feed_url = reader_url + '/atom/feed/'

#login / get SID
def get_SID():
    header = {'User-agent' : source}
    post_data = urllib.urlencode({ 'Email': login, 'Passwd': password, 'service': 'reader', 'source': source, 'continue': google_url, })
    request = urllib2.Request(login_url, post_data, header)

    try :
        f = urllib2.urlopen( request )
        result = f.read()

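    except urllib2.URLError:
        #result was never set, so stop here instead of raising a NameError below
        print 'Error logging in'
        return None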

    return re.search('SID=(\S*)', result).group(1)

#get results from url
def get_results(SID, url):
    header = {'User-agent' : source}
    header['Cookie']='Name=SID;SID=%s;Domain=.google.com;Path=/;Expires=160000000000' % SID
    print url
    request = urllib2.Request(url, None, header)

    try :
        f = urllib2.urlopen( request )
        result = f.read()

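    except urllib2.URLError:
        #result was never set, so stop here instead of raising a NameError below
        print 'Error getting data from %s' % url
        return None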

    return result

#get a specific feed.  It works for any feed, subscribed or not
def get_feed(SID, url):
    return get_results(SID, get_feed_url + url.encode('utf-8'))

#get a token, this is needed for modifying to reader
def get_token(SID):
    return get_results(SID, token_url)

#get a list of the users subscribed feeds
def get_subscription_list(SID):
    return get_results(SID, subscription_list_url)

Then we’ll want to get rid of all the information in the feed that we don’t need and load the rest into a dictionary. Once it’s in the dictionary, the feed names and links (and the folders they are in) are ready to be displayed. As usual, I use Django to display my pages, but everything is the same up to the final return in the Feeds function. Below is an example of what each subscription looks like in the Google Reader feed, and below that is the code that processes it.


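<object>
  <string name="id">feed/http://www.ubuntu.com/rss.xml</string>
  <string name="title">Ubuntu</string>
  <list name="categories">
    <object>
      <string name="id">user/16162999404522159936/label/dev</string>
      <string name="label">dev</string>
    </object>
  </list>
  <number name="firstitemmsec">1186137757794</number>
</object>

class myFeed: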
    def __init__(self, name, link):
        self.name = name
        self.link = link

def Feeds(request):
    SID = get_SID()
    feeds = get_subscription_list(SID)
    tree = ElementTree.fromstring(feeds)
    d = dict()   

    #loop through each feed
    for object in tree.findall('list')[0].findall('object'):
        strings = object.findall('string')
        key = object.findall('list')[0].findall('object')[0].findall('string')[1].text

        #tag already exists, add to the list
        try:
            d[key].append(myFeed(strings[1].text, strings[0].text.replace('feed/', '')))
        #tag doesn't exist, create list
        except KeyError:
            d[key] = [myFeed(strings[1].text, strings[0].text.replace('feed/', ''))]

    return render_to_response('pages/feeds.html', {
    'feeds': d,
    })
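The grouping step can be tried without logging in. Here is a minimal sketch in Python 3, using the standard-library xml.etree.ElementTree instead of the older elementtree package. It parses a hand-written sample shaped like a subscription entry (the name attributes and the second, unlabeled feed are assumptions for illustration) and swaps the try/except KeyError for collections.defaultdict.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

# Hand-written sample; attribute names and the second feed are assumptions.
SAMPLE = """
<object>
  <list name="subscriptions">
    <object>
      <string name="id">feed/http://www.ubuntu.com/rss.xml</string>
      <string name="title">Ubuntu</string>
      <list name="categories">
        <object>
          <string name="id">user/16162999404522159936/label/dev</string>
          <string name="label">dev</string>
        </object>
      </list>
      <number name="firstitemmsec">1186137757794</number>
    </object>
    <object>
      <string name="id">feed/http://example.com/atom.xml</string>
      <string name="title">Example</string>
      <list name="categories"/>
    </object>
  </list>
</object>
"""

def group_feeds(xml_text):
    """Return {label: [(title, url), ...]}; unlabeled feeds go under ''."""
    groups = defaultdict(list)
    root = ET.fromstring(xml_text)
    for sub in root.find('list').findall('object'):
        strings = sub.findall('string')  # direct children only: id, title
        url = strings[0].text.replace('feed/', '', 1)
        title = strings[1].text
        labels = [obj.findall('string')[1].text
                  for obj in sub.find('list').findall('object')]
        for label in labels or ['']:
            groups[label].append((title, url))
    return dict(groups)

feeds_by_label = group_feeds(SAMPLE)
```

Unlike the loop in Feeds, which only looks at the first folder, this version files a feed under every label it carries and does not raise an IndexError for a feed that has no folder at all.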

For those of you who use Django or are just curious how I end up displaying the feeds, this is what I have in my template:


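<h2>My Reading</h2>
<ul>
{% for item in feeds.items %}
  <li>{{ item.0 }}
    <ul>
    {% for feed in item.1 %}
      <li><a href="{{ feed.link }}">{{ feed.name }}</a></li>
    {% endfor %}
    </ul>
  </li>
{% endfor %}
</ul>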

Again, to see what I subscribe to, click here.

HOWTO: YUI Tabview

A few days ago I added the Digg counts to the bottom of the Posts page on gPowered.net. The more posts I add, though, the further down the page that section gets pushed, so I decided to play around with YUI’s TabView control and put the post list in one tab and the Diggs in another. It turned out to be really easy:

First we need a few dependencies

<!-- Dependencies -->
<!-- core CSS -->
<link rel="stylesheet" type="text/css" href="http://yui.yahooapis.com/2.3.0/build/tabview/assets/tabview.css">   

<!-- optional skin for border tabs -->
<link rel="stylesheet" type="text/css" href="http://yui.yahooapis.com/2.3.0/build/tabview/assets/border_tabs.css">   

<script type="text/javascript" src="http://yui.yahooapis.com/2.3.0/build/yahoo-dom-event/yahoo-dom-event.js"></script>
<script type="text/javascript" src="http://yui.yahooapis.com/2.3.0/build/element/element-beta-min.js"></script>  

<!-- Source file -->
<script type="text/javascript" src="http://yui.yahooapis.com/2.3.0/build/tabview/tabview-min.js"></script>

And then we just need to organize some DIVs

<div id="demo" class="yui-navset">
 <ul class="yui-nav">
  <li class="selected"><a href="#posts"><em>Posts</em></a></li>
  <li><a href="#diggs"><em>Diggs</em></a></li>
  <li><a href="#HOWTOs"><em>HOWTO's</em></a></li>
 </ul>
 <div class="yui-content">
  <div id="posts" style="margin: 10px">
   <h3>Posts</h3>
   content
  </div>
  <div id="diggs" style="margin: 10px">
   <h3>Diggs</h3>
   content
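  </div>
  <div id="HOWTOs" style="margin: 10px">
   <h3>HOWTO's</h3>
   content
  </div>
 </div>
</div>

Finally, one line of script turns that markup into working tabs:

<script type="text/javascript">
 var tabView = new YAHOO.widget.TabView('demo');
</script>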

HOWTO: Getting the Number of Diggs from Digg (Python)

After two of my posts were on the Digg front page this morning (thank you very much to everyone who dugg them), I took my first look into the Digg API. I wanted a quick way to see how many Diggs certain stories were getting. In some ways it is similar to GData: make a call to a URL, get some XML back, parse it, etc. It does, however, feel lighter, probably due to its streamlined nature. It has one purpose: getting information off of Digg. Using this, I’ve added a section to the Post List on gPowered.net that shows the Diggs for a few of the articles that I have submitted to Digg.

The API is broken into 5 main sections or endpoints. Each of these will return related types of data:
- Stories
- Events
- Users
- Topics
- Errors

In this quick HOWTO I’m going to look at the Stories endpoint so I can display the number of Diggs specific stories have. We’ll start off by making a small class to hold the returned data. This is useful for sending to a template or just for working with later on, since we don’t want to keep hitting the ElementTree to get data out. All of the calls will be sent to ‘http://services.digg.com/’. In this example I will only be querying ‘http://services.digg.com/story/{story clean title}’.

import httplib2
from elementtree import ElementTree  

#for storing
class MyDigg:
 def __init__(self, title, link, digg, diggs):
  self.title = title
  self.link = link
  self.digg = digg
  self.diggs = diggs

 def __str__(self):
  return self.title + ' ' + self.diggs

#stories to get diggs of
posts = [
 'Google_NOT_releasing_it_s_Goobuntu_Desktop_OS_STOP_DIGGING_IT',
 'New_Digg_Home_Page_breaks_the_Linux_section_on_IE',
 'Google_Reader_API_Functions'
 ]

#hold returned info
my_diggs = []

#all calls go through this
digg_service = 'http://services.digg.com/'

#just looking at stories
service_endpoint = digg_service + 'story/%s'

#only need 1 result back
trailer = '?count=1&appkey=http%3A%2F%2Fgpowered.blogspot.com'

#keep track of total diggs
total_diggs = 0

After we are set up, we will want to loop through each story we want to get Digg data for. Add the well formed title into the query string, and send it to the Digg service. Then, parse the response, and get the information we need.

for story in posts:
 curr_story = service_endpoint % story
 url = curr_story + trailer

 h = httplib2.Http()
 resp, content = h.request(url, "GET", body="nt", headers={'content-type':'text/plain'} )

 story = ElementTree.fromstring(content).findall('story')[0]

 d = MyDigg(story.findall('title')[0].text, story.get('link'), story.get('href'), story.get('diggs'))
 total_diggs = total_diggs + int(d.diggs)
 my_diggs.append(d)
 print d

print 'Total: ' + str(total_diggs)

And that’s that. my_diggs now has all the information we need!
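The two moving parts here, building the query URL and reading the story element, can be checked offline. The following is a rough Python 3 sketch under a few assumptions: the sample response is hand-written, its root element name and the digg count are made up, and only the attributes the loop above reads (link, href, diggs) are included.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

DIGG_SERVICE = 'http://services.digg.com/'

def story_url(clean_title, appkey, count=1):
    """Build the story query URL; the appkey has to be URL-encoded."""
    return '%sstory/%s?count=%d&appkey=%s' % (
        DIGG_SERVICE, clean_title, count, quote(appkey, safe=''))

def parse_story(xml_text):
    """Pull (title, diggs) out of a response holding one <story> element."""
    story = ET.fromstring(xml_text).find('story')
    return story.findtext('title'), int(story.get('diggs'))

# Hypothetical response: the root element name and the numbers are made up.
SAMPLE = """
<stories count="1">
  <story link="http://example.com/post" href="http://digg.com/example" diggs="42">
    <title>Google Reader API Functions</title>
  </story>
</stories>
"""

url = story_url('Google_Reader_API_Functions', 'http://gpowered.blogspot.com')
title, diggs = parse_story(SAMPLE)
```

Encoding the appkey with quote is what produces the http%3A%2F%2F form hard-coded in the trailer string, and converting diggs to an int once, when parsing, keeps the running total simple.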

HOWTO: Google Reader API Functions

I’ve been wanting an API for Google Reader since I started using it, and especially since I started gPowered, so I could display a list of the feeds I read on the site. The official word on an API for Reader is “It’s coming in a few weeks,” but that was back in late 2005. The reason given at the time was that the URLs the API would use were going to change a lot. So, after a bit of research and coding, I came up with some Python functions to do the job.


The first step was authenticating against Google accounts without using the client library. The Python GData library makes login very easy, but Reader isn’t part of the client library yet (maybe I’ll try to add it, we’ll see…). This is the method I was using for GData and Python before the client library existed, and the principles still hold true for working with Reader. Thankfully, most of the research for working with the ‘Reader API’ was done for me already by Niall Kennedy. This is an unofficial, unsupported API, and the URLs for some of the queries have changed since that article was written. Here we go…

We’re going to use urllib and urllib2 to handle the communication for this one. I would rather have used httplib, but I was having trouble with the authentication cookie. Each retrieval has its own URL to query against:

import urllib
import urllib2
import re

login = 'timothy.broder@gmail.com'
password = '****'
source = 'gPowered'

google_url = 'http://www.google.com'
reader_url = google_url + '/reader'
login_url = 'https://www.google.com/accounts/ClientLogin'
token_url = reader_url + '/api/0/token'
subscription_list_url = reader_url + '/api/0/subscription/list'
reading_url = reader_url + '/atom/user/-/state/com.google/reading-list'
read_items_url = reader_url + '/atom/user/-/state/com.google/read'
reading_tag_url = reader_url + '/atom/user/-/label/%s'
starred_url = reader_url + '/atom/user/-/state/com.google/starred'
subscription_url = reader_url + '/api/0/subscription/edit'
get_feed_url = reader_url + '/atom/feed/'

When we authenticate against Google Reader with a Gmail account and password in the browser, a cookie is stored. We’ll have to recreate the values in this cookie. The static values are the Domain (.google.com), the Path (/), and Expires (we’ll use 160000000000). The unique value, based on the current login session, is the SID (Session ID?), which we will need to retrieve. We’ll do the login and retrieval in the same function:

#login / get SID
def get_SID():
    header = {'User-agent' : source}
    post_data = urllib.urlencode({ 'Email': login, 'Passwd': password, 'service': 'reader', 'source': source, 'continue': google_url, })
    request = urllib2.Request(login_url, post_data, header)

    try :
        f = urllib2.urlopen( request )
        result = f.read()

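    except urllib2.URLError:
        #result was never set, so stop here instead of raising a NameError below
        print 'Error logging in'
        return None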

    return re.search('SID=(\S*)', result).group(1)
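The regular expression grabs the first "SID=" it finds, which would also match inside "LSID=" if that line ever came first. As a minimal sketch (Python 3, and assuming the login response is a plain body of KEY=value lines), the response can instead be split into a dictionary and the SID looked up by exact name. The token values below are made up.

```python
def parse_client_login(body):
    """Turn a response body of KEY=value lines into a dict."""
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition('=')
        if sep:  # skip blank or malformed lines
            fields[key.strip()] = value.strip()
    return fields

# Made-up tokens for illustration only.
sample_body = "SID=abc123\nLSID=def456\nAuth=ghi789\n"
fields = parse_client_login(sample_body)
sid = fields.get('SID')  # None if the response carried no SID line
```

Using fields.get means a failed login shows up as None rather than as an AttributeError on a missing regex match.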

We’ll also need a function that can handle any of those URLs, create the header, attach a cookie to it, and retrieve the data from Google. I left the return as a raw data string so you could use whatever XML parsing library you want. I personally like using ElementTree.

#get results from url
def get_results(SID, url):
    header = {'User-agent' : source}
    header['Cookie']='Name=SID;SID=%s;Domain=.google.com;Path=/;Expires=160000000000' % SID

    request = urllib2.Request(url, None, header)

    try :
        f = urllib2.urlopen( request )
        result = f.read()

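    except urllib2.URLError:
        #result was never set, so stop here instead of raising a NameError below
        print 'Error getting data from %s' % url
        return None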

    return result
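For anyone on Python 3, where urllib2 became urllib.request, the header-building half of get_results might look like the sketch below. It only constructs the request so the headers can be inspected offline; build_request is a name chosen for this example, and the SID value is a placeholder.

```python
import urllib.request

SOURCE = 'gPowered'

def build_request(sid, url):
    """Create (but do not send) a request carrying the SID cookie."""
    headers = {
        'User-agent': SOURCE,
        'Cookie': 'Name=SID;SID=%s;Domain=.google.com;Path=/;Expires=160000000000' % sid,
    }
    return urllib.request.Request(url, None, headers)

req = build_request('abc123', 'http://www.google.com/reader/api/0/token')
cookie = req.get_header('Cookie')
```

Passing the result to urllib.request.urlopen would perform the actual fetch; keeping construction separate makes the cookie logic easy to test on its own.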

The following functions are the calls that I’ve gotten working so far; I’m going to keep working on the ‘edit’ functions, like adding and removing feeds, changing tags, etc. See the comments for what they do. Note: any edit against the API needs to send a token, which changes over time, as part of the call.

#get a token, this is needed for modifying to reader
def get_token(SID):
    return get_results(SID, token_url)

#get a specific feed.  It works for any feed, subscribed or not
def get_feed(SID, url):
    return get_results(SID, get_feed_url + url.encode('utf-8'))

#get a list of the users subscribed feeds
def get_subscription_list(SID):
    return get_results(SID, subscription_list_url)

#get a feed of the users unread items
def get_reading_list(SID):
    return get_results(SID, reading_url)

#get a feed of the users read items
def get_read_items(SID):
    return get_results(SID, read_items_url)

#get a feed of the users unread items of a given tag
def get_reading_tag_list(SID, tag):
    tagged_url = reading_tag_url % tag
    return get_results(SID, tagged_url.encode('utf-8'))

#get a feed of a users starred items/feeds
def get_starred(SID):
    return get_results(SID, starred_url)

#subscribe or unsubscribe to a feed
def modify_subscription(SID, what, do):
    url = subscription_url + '?client=client:%s&ac=%s&s=%s&token=%s' % ( login, do.encode('utf-8'), 'feed%2F' + what.encode('utf-8'), get_token(SID) )
    print url
    return get_results(SID, url)

#subscribe to a feed
def subscribe_to(SID, url):
    return modify_subscription(SID, url, 'subscribe')

#unsubscribe from a feed
def unsubscribe_from(SID, url):
    return modify_subscription(SID, url, 'unsubscribe')
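modify_subscription pastes the feed address into the query string by hand, so any "?" or "&" inside the feed URL ends up being read as part of the edit URL itself. Here is a sketch of the same URL built with urlencode (Python 3). The edit_url helper, the client address, and the token are placeholders, and whether the unofficial endpoint accepts the encoded "client:" prefix is an assumption.

```python
from urllib.parse import urlencode

SUBSCRIPTION_URL = 'http://www.google.com/reader/api/0/subscription/edit'

def edit_url(client, action, feed_url, token):
    """Build the subscribe/unsubscribe URL with every value percent-encoded."""
    query = urlencode({
        'client': 'client:' + client,
        'ac': action,
        's': 'feed/' + feed_url,
        'token': token,
    })
    return SUBSCRIPTION_URL + '?' + query

url = edit_url('me@example.com', 'subscribe',
               'http://example.com/rss?kind=photo&alt=rss', 'TOKEN')
```

With this, a feed address that carries its own query parameters stays inside the s value. That might be worth checking when a subscribe call returns OK but the feed never shows up.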

Example usage:

SID = get_SID()
print get_subscription_list(SID)
#print get_reading_list(SID)
#print get_read_items(SID)
#print get_reading_tag_list(SID, 'me')
#print get_reading_tag_list(SID, 'nada-mas')
#print get_starred(SID)
#print get_token(SID)

#test_feed = 'http://picasaweb.google.com/data/feed/base/user/timothy.broder/albumid/5101347429735335089?kind=photo&alt=rss&hl=en_US'

#print subscribe_to(SID, test_feed)
#returns ok but I don't see the feed in reader?

#print get_feed(SID, test_feed)

Like I said, I’d like to keep going with this and get the edit functionality to work better. I’m also going to take a look into the Client Library and see if I could set this up as a patch that people could use if they wanted to use the API.

View Google Groups posts in Reader

I don’t know why I never noticed this before, but you can subscribe to the mail sent to a Google Group through the RSS feed linked at the bottom of each group’s page. I’m trying it out for a few groups; I might like it better than reading through the mail. I wonder if there is a way to subscribe to the rolled-up versions of the posts…

HOWTO: Pulling Google Bookmarks with Python

I love using Google Bookmarks (usually with the Google Toolbar) because it lets me get to my bookmarks at home on my laptop or desktop, at work, or anywhere. It’s great. Now I’m using those bookmarks to power the links section of gPowered.net.

First we’re going to need the httplib2 library so we can authenticate against Google and grab the bookmark feed, and then ElementTree to help process the RSS feed.

import httplib2
from elementtree import ElementTree

Then we’ll set up the link to pull the RSS from, add our credentials to the request, and pull back the feed:

login = "timothy.broder@gmail.com"
password = "*****"
url = 'https://www.google.com/bookmarks/?output=rss&num=1000'  

h = httplib2.Http()
h.add_credentials(login, password)  

resp, content = h.request(url, "POST", body="nt", headers={'content-type':'text/plain'} )

I figured a hashmap (or dictionary) would work well for this, using the tags on the bookmarks as keys that point to lists of bookmarks. Then when we display them, we just iterate through the keys. I also kept a list of the keys to make sorting faster later on. So we define our objects and then loop through the RSS items, pulling out the tags for keys, the names of the links, and the URLs. I define a small Bookmark class, which holds a name and URL, to make storing the bookmarks in the hashmap easier. To add a bookmark, I try to append it to the list for its tag; if the key (tag) doesn’t exist, I know I have to start a new list.

class Bookmark:
 def __init__(self, name, link):
  self.name = name
  self.link = link

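#parse the response; search the whole tree since RSS items usually sit under <channel>
tree = ElementTree.fromstring(content)

d = dict()
sort_keys = []
#tags that should not show up on the links page
skip = ('gpowered', 'BP', 'Quick Searches', 'Me')
for item in tree.findall('.//item'):
 key = item.findtext('{http://www.google.com/searchhistory}bkmk_label')
 if key is not None and key not in skip: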
  title = item.findtext('title')
  link = item.findtext('link')
  try:
   d[key].append(Bookmark(title.encode('utf-8'), link))
  except KeyError:
   d[key] = [Bookmark(title.encode('utf-8'), link)]
   sort_keys.append(key)

Then we’ll sort the key list and the list under each key. To do this we need a small function that defines how to compare two bookmarks:

def bookmark_compare(a, b):
 return cmp(a.name, b.name)

sort_keys.sort()
for key in sort_keys:
 d[key].sort(bookmark_compare)
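Python 3 dropped cmp and comparison-function sorting, so here is a sketch of the same group-then-sort idea using a key function instead. The sample feed is hand-written (titles and links are placeholders), and the standard-library xml.etree.ElementTree stands in for the elementtree package.

```python
from collections import defaultdict
from operator import attrgetter
import xml.etree.ElementTree as ET

LABEL = '{http://www.google.com/searchhistory}bkmk_label'
SKIP = {'gpowered', 'BP', 'Quick Searches', 'Me'}

class Bookmark:
    def __init__(self, name, link):
        self.name = name
        self.link = link

# Hand-written sample feed; titles and links are placeholders.
SAMPLE = """
<rss version="2.0" xmlns:smh="http://www.google.com/searchhistory">
  <channel>
    <item><title>YUI</title><link>http://example.com/yui</link>
      <smh:bkmk_label>dev</smh:bkmk_label></item>
    <item><title>Django</title><link>http://example.com/django</link>
      <smh:bkmk_label>dev</smh:bkmk_label></item>
    <item><title>Private</title><link>http://example.com/me</link>
      <smh:bkmk_label>Me</smh:bkmk_label></item>
  </channel>
</rss>
"""

def group_bookmarks(rss_text):
    """Return {tag: [Bookmark, ...]} with tags and bookmarks sorted by name."""
    groups = defaultdict(list)
    for item in ET.fromstring(rss_text).iter('item'):
        key = item.findtext(LABEL)
        if key is None or key in SKIP:
            continue
        groups[key].append(Bookmark(item.findtext('title'), item.findtext('link')))
    return {key: sorted(groups[key], key=attrgetter('name'))
            for key in sorted(groups)}

bookmarks = group_bookmarks(SAMPLE)
names = [b.name for b in bookmarks['dev']]
```

Because dictionaries keep insertion order in current Python versions, building the result from sorted(groups) removes the need for a separate sort_keys list.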

Check out the static HTML version here. I also made a fancier version using YUI’s TreeView.

HOWTO: Getting a list of post titles from blogger (Python)

This will be a quick one on how to pull the titles from your blog. I’m using it to list the posts I have available on gPowered.net. First we’ll set up our imports and the call to the Blogger service.

from elementtree import ElementTree
from gdata import service
import gdata
import atom
import getopt
import sys

blog_id = 413573351281770670
blogger_service = service.GDataService('timothy.broder@gmail.com', '*****')
blogger_service.source = 'Blogger_Python_Sample-1.0'
blogger_service.service = 'blogger'
blogger_service.server = 'www.blogger.com'
blogger_service.ProgrammaticLogin()

For this query we’re going to use the summary feed because all we really need for this is the titles, not the full posts:

query = service.Query()
query.feed = '/feeds/' + str(blog_id) + '/posts/summary'
feed = blogger_service.Get(query.ToUri())

Then I just do a little counting so I can use the links on my site. All the information we need is in feed.entry.

curr_id = int(feed.total_results.text)
for entry in feed.entry:
 entry.my_id = curr_id
 curr_id -= 1
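The counting trick can be shown on plain data. This small Python 3 sketch assumes, as the loop above does, that the feed lists posts newest first; the titles are placeholders and number_posts is a name invented for the example.

```python
def number_posts(titles, total):
    """Pair newest-first titles with ids counting down from total."""
    return list(zip(range(total, 0, -1), titles))

numbered = number_posts(['Newest post', 'Middle post', 'Oldest post'], 3)
```

The oldest post ends up with id 1, so existing links keep pointing at the same post as new ones are published.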

Quick Docs Api Example (python)

To use the gdata docs python client you need to upgrade to 1.0.7 or higher. First thing is to import the modules you’ll need.

import gdata.docs.service
import gdata.docs

Then, set up the usual authentication parameters for the client.

gd_client = gdata.docs.service.DocsService()
gd_client.email = 'timothy.broder'
gd_client.password = '*****'
gd_client.source = 'gpowered-docs-list-ex'
gd_client.ProgrammaticLogin()

The most basic query will just return all of your documents

feed = gd_client.GetDocumentListFeed()

However, if we want to display just the spreadsheets, we build the query like this:

q = gdata.docs.service.DocumentQuery(categories=['spreadsheet'])
feed = gd_client.Query(q.ToUri())

Finally, we output the titles

if len(feed.entry) == 0:
    print 'No entries in feed.\n'
for i, entry in enumerate(feed.entry):
    #two placeholders for two values: the position and the title
    print '%s %s' % (i+1, entry.title.text.encode('UTF-8'))

If we wanted to, we could also import the mx.DateTime library and show when each document was last updated:

import gdata.docs.service
import gdata.docs

from mx import DateTime

gd_client = gdata.docs.service.DocsService()
gd_client.email = 'timothy.broder'
gd_client.password = '*****'
gd_client.source = 'gpowered-docs-list-ex'
gd_client.ProgrammaticLogin()

q = gdata.docs.service.DocumentQuery(categories=['spreadsheet'])
#feed = gd_client.GetDocumentListFeed()
feed = gd_client.Query(q.ToUri())

if(len(feed.entry) == 0):
        print 'No entries in feed.\n'
for i, entry in enumerate(feed.entry):
        dt = DateTime.ISO.ParseDateTimeUTC(entry.updated.text)
        print '%s %s (%s)' % (i+1, entry.title.text.encode('UTF-8'), dt.strftime('%m/%d/%Y %I:%M %p'))

For me this outputs:

1 TDP2006 Contact Info (11/18/2006 05:41 AM)
2 contact info (07/23/2006 08:15 PM)
3 Tim and Rob (08/09/2007 10:18 PM)
4 nyc happy hour spreadsheet (07/04/2007 08:25 PM)
5 public_spring_2006_roster (10/16/2006 12:40 AM)
6 dax2006 (11/12/2006 11:23 PM)
7 project dream (07/13/2007 03:54 AM)
8 Stuff Tim should get (06/13/2007 01:53 AM)
9 Erg Test Results - 9/26 (10/15/2006 01:02 AM)
10 Head of the Charles Regatta Itineary (10/17/2006 04:54 PM)
11 tvshows (11/02/2006 11:44 PM)
12 HF (10/01/2006 03:36 PM)
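If installing mx.DateTime is not an option, the standard library can produce the same display format. This is a minimal Python 3 sketch, assuming the updated field looks like 2006-11-18T05:41:00.000Z (UTC, with milliseconds); the sample timestamp is invented to match the first row above.

```python
from datetime import datetime

def format_updated(stamp):
    """Convert a timestamp like 2006-11-18T05:41:00.000Z for display."""
    parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%S.%fZ')
    return parsed.strftime('%m/%d/%Y %I:%M %p')

label = format_updated('2006-11-18T05:41:00.000Z')
```

A timestamp in a different shape (no fractional seconds, or a numeric offset instead of Z) would need a different format string.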