bridgy package

Reference documentation.

app

blog_webmention

blogger

cron

Cron jobs. Currently just minor cleanup tasks.

class cron.ReplacePollTasks(request=None, response=None)[source]

Bases: webapp2.RequestHandler

Finds sources missing their poll tasks and adds new ones.

class cron.UpdateTwitterPictures(request=None, response=None)[source]

Bases: webapp2.RequestHandler

Finds Twitter sources with new profile pictures and updates them.

https://github.com/snarfed/granary/commit/dfc3d406a20965a5ed14c9705e3d3c2223c8c3ff http://indiewebcamp.com/Twitter#Profile_Image_URLs

class cron.UpdatePictures(request=None, response=None)[source]

Bases: webapp2.RequestHandler

Finds sources with new profile pictures and updates them.

class cron.UpdateInstagramPictures(request=None, response=None)[source]

Bases: cron.UpdatePictures

Finds Instagram sources with new profile pictures and updates them.

Splits the accounts into batches to avoid hitting Instagram’s rate limit, and tries to hit every account once a week.

Testing on 2017-07-05 hit the rate limit after ~170 profile page requests, with ~270 total Instagram accounts on Bridgy.
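
A minimal sketch of that weekly batching idea, assuming accounts are assigned to one of seven daily batches by a stable hash (the assignment scheme and helper are illustrative, not Bridgy’s actual code):

# Spread usernames across `batches` daily buckets so each account is
# refreshed roughly once a week. CRC32 keeps the assignment stable across runs.
# This batching scheme is an illustration, not Bridgy's implementation.
import zlib
from datetime import date

def todays_batch(usernames, batches=7):
    today = date.today().toordinal() % batches
    return [u for u in usernames
            if zlib.crc32(u.encode('utf-8')) % batches == today]

for name in todays_batch(['alice', 'bob', 'carol', 'dave']):
    print('would refresh profile picture for', name)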

SOURCE_CLS

alias of Instagram

class cron.UpdateFlickrPictures(request=None, response=None)[source]

Bases: cron.UpdatePictures

Finds Flickr sources with new profile pictures and updates them.

SOURCE_CLS

alias of Flickr

facebook

Facebook API code and datastore model classes.

TODO: use third_party_id if we ever need to store an FB user id anywhere else.

Example post ID and links

Example comment ID and links

class facebook.FacebookPage(*args, **kwds)[source]

Bases: models.Source

A Facebook profile or page.

The key name is the Facebook id.

GR_CLASS

alias of Facebook

static new(handler, auth_entity=None, **kwargs)[source]

Creates and returns a FacebookPage for the logged in user.

Parameters:
classmethod lookup(id)[source]

Returns the entity with the given id or username.

silo_url()[source]

Returns the Facebook account URL, e.g. https://facebook.com/foo.

canonicalize_url(url, activity=None, **kwargs)[source]

Facebook-specific standardization of syndicated URLs.

Canonical form is https://www.facebook.com/USERID/posts/POSTID

Parameters:
  • url – a string, the url of the syndicated content
  • activity – the activity this URL came from. If it has an fb_object_id, we’ll use that instead of fetching the post from Facebook
  • kwargs – unused
Returns:

a string, the canonical form of the syndication url
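
An illustrative sketch of that normalization, not Bridgy’s implementation (the real version also consults the activity’s fb_object_id and wraps a shared UrlCanonicalizer); the regexes and helper name here are assumptions:

import re

def canonicalize_facebook_url(url, user_id):
    # illustrative regexes only, not Bridgy's actual canonicalizer
    url = re.sub(r'^http://', 'https://', url)               # force https
    url = url.replace('https://facebook.com/', 'https://www.facebook.com/')
    # permalink/photo style URLs carry the post id in a query parameter
    match = re.search(r'[?&](?:story_fbid|fbid)=(\d+)', url)
    if match:
        return 'https://www.facebook.com/%s/posts/%s' % (user_id, match.group(1))
    return url

print(canonicalize_facebook_url(
    'http://facebook.com/permalink.php?story_fbid=123&id=456', '456'))
# https://www.facebook.com/456/posts/123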

cached_resolve_object_id(post_id, activity=None)[source]

Resolve a post id to its Facebook object id, if any.

Wraps granary.facebook.Facebook.resolve_object_id() and uses self.resolved_object_ids_json as a cache.

Parameters:
  • post_id – string Facebook post id
  • activity – optional AS activity representation of Facebook post
Returns:

string Facebook object id or None
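
A sketch of that caching pattern: load the JSON cache, resolve on a miss, and write the updated cache back to the property. The resolver stub and class are placeholders; only the resolved_object_ids_json property name comes from the docs above:

import json

class FacebookPageSketch(object):
    resolved_object_ids_json = '{}'

    def resolve_object_id_remotely(self, post_id):
        # stand-in for granary.facebook.Facebook.resolve_object_id()
        return '1234567890_' + post_id

    def cached_resolve_object_id(self, post_id):
        cache = json.loads(self.resolved_object_ids_json or '{}')
        if post_id not in cache:
            cache[post_id] = self.resolve_object_id_remotely(post_id)
            self.resolved_object_ids_json = json.dumps(cache)
        return cache[post_id]

print(FacebookPageSketch().cached_resolve_object_id('10100823411094363'))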

is_activity_public(activity)[source]

Returns True if the given activity is public, False otherwise.

Uses the post_publics_json cache if we can’t tell otherwise.

infer_profile_url(url)[source]

Find a Facebook profile URL (ideally the one with the user’s numeric ID).

Looks up existing sources by username, inferred username, and domain.

Parameters:url – string, a person’s URL
Returns:a string URL for their Facebook profile (or None)
on_new_syndicated_post(*args, **kwds)[source]

If this source has no username, try to infer one from a syndication URL.

Parameters:syndpost – models.SyndicatedPost
class facebook.AuthHandler(*args, **kwargs)[source]

Bases: util.Handler

Base OAuth handler class.

finish_oauth_flow(auth_entity, state)[source]

Adds or deletes a FacebookPage, or restarts OAuth to get publish permissions.

Parameters:
class facebook.OAuthCallback(*args, **kwargs)[source]

Bases: oauth_dropins.facebook.CallbackHandler, facebook.AuthHandler

OAuth callback handler.

class facebook.StartHandler(*args, **kwargs)[source]

Bases: util.Handler

Custom handler that sets OAuth scopes based on the requested feature(s).

flickr

Flickr source and data model storage class.

class flickr.Flickr(*args, **kwds)[source]

Bases: models.Source

A Flickr account.

The key name is the nsid.

GR_CLASS

alias of Flickr

static new(handler, auth_entity=None, **kwargs)[source]

Creates and returns a Flickr for the logged in user.

Parameters:
silo_url()[source]

Returns the Flickr account URL, e.g. https://www.flickr.com/people/foo/.

user_tag_id()[source]

Returns the tag URI for this source, e.g. ‘tag:flickr.com:123456’.

get_activities_response(*args, **kwargs)[source]

Discard min_id because we still want new comments/likes on old photos.

class flickr.AuthHandler(*args, **kwargs)[source]

Bases: util.Handler

Base OAuth handler for Flickr.

class flickr.StartHandler(*args, **kwargs)[source]

Bases: flickr.AuthHandler

Custom handler to start Flickr auth process.

class flickr.AddFlickr(*args, **kwargs)[source]

Bases: oauth_dropins.flickr.CallbackHandler, flickr.AuthHandler

Custom handler to add Flickr source when auth completes.

If this account was previously authorized with greater permissions, this will trigger another round of auth with elevated permissions.

googleplus

Google+ source code and datastore model classes.

class googleplus.GooglePlusPage(*args, **kwds)[source]

Bases: models.Source

A Google+ profile or page.

The key name is the user id.

GR_CLASS

alias of GooglePlus

static new(handler, auth_entity=None, **kwargs)[source]

Creates and returns a GooglePlusPage for the logged in user.

Parameters:
silo_url()[source]

Returns the Google+ account URL, e.g. https://plus.google.com/+Foo.

__getattr__(name)[source]

Overridden to pass auth_entity to granary.googleplus.GooglePlus.

Searches for activities with links to any of this source’s web sites.

Only searches for root domain web site URLs! Skips URLs with paths; they tend to generate false positive results in G+’s search. Not sure why yet.

G+ search supports OR: https://developers.google.com/+/api/latest/activities/search

Returns:sequence of ActivityStreams activity dicts
class googleplus.OAuthCallback(*args, **kwargs)[source]

Bases: util.Handler

OAuth callback handler.

Both the add and delete flows have to share this because Google+’s oauth-dropin doesn’t yet allow multiple callback handlers. :/

handlers

Common handlers, e.g. post and comment permalinks.

URL paths are:

/post/SITE/USER_ID/POST_ID
e.g. /post/facebook/212038/10100823411094363
/comment/SITE/USER_ID/POST_ID/COMMENT_ID
e.g. /comment/twitter/snarfed_org/10100823411094363/999999
/like/SITE/USER_ID/POST_ID/LIKED_BY_USER_ID
e.g. /like/twitter/snarfed_org/10100823411094363/999999
/repost/SITE/USER_ID/POST_ID/REPOSTED_BY_USER_ID
e.g. /repost/twitter/snarfed_org/10100823411094363/999999
/rsvp/SITE/USER_ID/EVENT_ID/RSVP_USER_ID
e.g. /rsvp/facebook/212038/12345/67890
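
These paths are plain templates, so a permalink can be assembled by simple substitution; for example, the comment URL from the list above:

print('/comment/%s/%s/%s/%s' % (
    'twitter', 'snarfed_org', '10100823411094363', '999999'))
# /comment/twitter/snarfed_org/10100823411094363/999999
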
class handlers.ItemHandler(*args, **kwargs)[source]

Bases: util.Handler

Fetches a post, repost, like, or comment and serves it as mf2 HTML or JSON.

handle_exception(e, debug)

A webapp2 exception handler that propagates HTTP exceptions into the response.

Use this as a webapp2.RequestHandler.handle_exception() method by adding this line to your handler class definition:

handle_exception = handlers.handle_exception
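
For example, a hypothetical handler class (the name and body are made up) would wire it in like this:

import webapp2
import handlers

class MyItemHandler(webapp2.RequestHandler):
    # propagate HTTPExceptions into the response instead of a generic 500
    handle_exception = handlers.handle_exception

    def get(self):
        self.response.write('ok')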

I originally tried to put this in a webapp2.RequestHandler subclass, but it gave me this exception:

File ".../webapp2-2.5.1/webapp2_extras/local.py", line 136, in _get_current_object
  raise RuntimeError('no object bound to %s' % self.__name__) RuntimeError: no object bound to app

These are probably related:

head(*args)[source]

Return an empty 200 with no caching directives.

get_item(id)[source]

Fetches and returns an object from the given source.

To be implemented by subclasses.

Parameters:
Returns:

ActivityStreams object dict

get_title(obj)[source]

Returns the string to be used in the <title> tag.

Parameters:obj – ActivityStreams object
get_post(id, **kwargs)[source]

Fetch a post.

Parameters:
  • id – string, site-specific post id
  • is_event – bool
  • kwargs – passed through to get_activities()
Returns:

ActivityStreams object dict

merge_urls(obj, property, urls, object_type='article')[source]

Updates an object’s ActivityStreams URL objects in place.

Adds all URLs in urls that don’t already exist in obj[property].

ActivityStreams schema details: http://activitystrea.ms/specs/json/1.0/#id-comparison

Parameters:
  • obj – ActivityStreams object to merge URLs into
  • property – string property to merge URLs into
  • urls – sequence of string URLs to add
  • object_type – stored as the objectType alongside each URL
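
A standalone sketch of that merge behavior, assuming the documented semantics (existing URLs in obj[property] are kept, new ones are appended with the given objectType):

def merge_urls_sketch(obj, property, urls, object_type='article'):
    existing = {u.get('url') for u in obj.setdefault(property, [])}
    obj[property].extend({'url': url, 'objectType': object_type}
                         for url in urls if url not in existing)

obj = {'upstreamDuplicates': [
    {'url': 'http://example.com/a', 'objectType': 'article'}]}
merge_urls_sketch(obj, 'upstreamDuplicates',
                  ['http://example.com/a', 'http://example.com/b'])
print(obj['upstreamDuplicates'])
# only http://example.com/b is added; the existing entry is untouched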

instagram

Instagram API code and datastore model classes.

Example post ID and links:

Example comment ID and links:

class instagram.Instagram(*args, **kwds)[source]

Bases: models.Source

An Instagram account.

The key name is the username. Instagram usernames may have ASCII letters (case insensitive), numbers, periods, and underscores: https://stackoverflow.com/questions/15470180

GR_CLASS

alias of Instagram

static new(handler, auth_entity=None, actor=None, **kwargs)[source]

Creates and returns an Instagram for the logged in user.

Parameters:
silo_url()[source]

Returns the Instagram account URL, e.g. https://instagram.com/foo.

user_tag_id()[source]

Returns the tag URI for this source, e.g. ‘tag:instagram.com:123456’.

label_name()[source]

Returns the username.

get_activities_response(*args, **kwargs)[source]

Set user_id because scraping requires it.

class instagram.StartHandler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.TemplateHandler

Serves the “Enter your username” form page.

logs

Handler that exposes app logs to users.

medium

Medium hosted blog implementation.

Only supports outbound webmentions right now, not inbound, since Medium’s API doesn’t support creating responses or recommendations yet. https://github.com/Medium/medium-api-docs/issues/71 https://github.com/Medium/medium-api-docs/issues/72

API docs: https://github.com/Medium/medium-api-docs#contents https://medium.com/developers/welcome-to-the-medium-api-3418f956552

class medium.Medium(*args, **kwds)[source]

Bases: models.Source

A Medium publication or user blog.

The key name is the username (with @ prefix) or publication name.

static new(handler, auth_entity=None, id=None, **kwargs)[source]

Creates and returns a Medium for the logged in user.

Parameters:
verify(force=False)[source]

No incoming webmention support yet.

models

Datastore model classes.

models.get_type(obj)[source]

Returns the Response or Publish type for an AS object.

exception models.DisableSource[source]

Bases: exceptions.Exception

Raised when a user has deauthorized our app inside a given platform.

__weakref__

list of weak references to the object (if defined)

class models.SourceMeta(name, bases, classdict)[source]

Bases: google.appengine.ext.ndb.model.MetaModel

Source metaclass. Registers all subclasses in the sources global.

class models.Source(*args, **kwds)[source]

Bases: oauth_dropins.webutil.models.StringIdModel

A silo account, e.g. a Facebook or Google+ account.

Each concrete silo class should subclass this class.

__metaclass__

alias of SourceMeta

classmethod new(handler, **kwargs)[source]

Factory method. Creates and returns a new instance for the current user.

To be implemented by subclasses.

__getattr__(name)[source]

Lazily load the auth entity and instantiate self.gr_source.

Once self.gr_source is set, this method will not be called; gr_source will be returned normally.

classmethod lookup(id)[source]

Returns the entity with the given id.

By default, interprets id as just the key id. Subclasses may extend this to support usernames, etc.

user_tag_id()[source]

Returns the tag URI for this source, e.g. ‘tag:plus.google.com:123456’.

bridgy_path()[source]

Returns the Bridgy page URL path for this source.

bridgy_url(handler)[source]

Returns the Bridgy page URL for this source.

silo_url(handler)[source]

Returns the silo account URL, e.g. https://twitter.com/foo.

label()[source]

Human-readable label for this source.

label_name()[source]

Human-readable name or username for this source, whichever is preferred.

classmethod put_updates(*args, **kwds)[source]

Writes source.updates to the datastore transactionally.

Returns:the updated Source
Return type:Source
poll_period()[source]

Returns the poll frequency for this source, as a datetime.timedelta.

Defaults to ~15m, depending on silo. If we’ve never sent a webmention for this source, or the last one we sent was over a month ago, we drop the poll frequency down to ~1d after a week-long grace period.
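
A hedged sketch of that schedule; the exact thresholds and argument names are illustrative, not Bridgy’s constants:

from datetime import datetime, timedelta

# illustrative values, not Bridgy's actual constants
FAST_POLL = timedelta(minutes=15)
SLOW_POLL = timedelta(days=1)
GRACE_PERIOD = timedelta(days=7)
MAX_WEBMENTION_AGE = timedelta(days=30)

def poll_period(created, last_webmention_sent, now=None):
    now = now or datetime.utcnow()
    if now - created < GRACE_PERIOD:
        return FAST_POLL
    stale = (last_webmention_sent is None or
             now - last_webmention_sent > MAX_WEBMENTION_AGE)
    return SLOW_POLL if stale else FAST_POLL

# a source that never sent a webmention and is past its grace period polls daily
print(poll_period(created=datetime(2017, 1, 1), last_webmention_sent=None))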

should_refetch()[source]

Returns True if we should run OPD (original post discovery) refetch on this source now.

classmethod bridgy_webmention_endpoint(domain='brid.gy')[source]

Returns the Bridgy webmention endpoint for this source type.

has_bridgy_webmention_endpoint()[source]

Returns True if this source uses Bridgy’s webmention endpoint.

get_author_urls()[source]

Determine the author URLs for a particular source.

In debug mode, replace test domains with localhost.

Returns:a list of string URLs, possibly empty

Searches for activities with links to any of this source’s web sites.

https://github.com/snarfed/bridgy/issues/456 https://github.com/snarfed/bridgy/issues/565

Returns:sequence of ActivityStreams activity dicts
get_activities_response(**kwargs)[source]

Returns recent posts and embedded comments for this source.

May be overridden by subclasses.

get_comment(comment_id, **kwargs)[source]

Returns a comment from this source.

Passes through to granary by default. May be overridden by subclasses.

Parameters:
Returns:

dict, decoded ActivityStreams comment object, or None

get_like(activity_user_id, activity_id, like_user_id, **kwargs)[source]

Returns an ActivityStreams ‘like’ activity object.

Passes through to granary by default. May be overridden by subclasses.

Parameters:
  • activity_user_id – string id of the user who posted the original activity
  • activity_id – string activity id
  • like_user_id – string id of the user who liked the activity
  • kwargs – passed to granary.Source.get_comment
create_comment(post_url, author_name, author_url, content)[source]

Creates a new comment in the source silo.

Must be implemented by subclasses.

Parameters:
  • post_url – string
  • author_name – string
  • author_url – string
  • content – string
Returns:

response dict with at least ‘id’ field

feed_url()[source]

Returns the RSS or Atom (or similar) feed URL for this source.

Must be implemented by subclasses. Currently only implemented by blogger, medium, tumblr, and wordpress_rest.

Returns:string URL
edit_template_url()[source]

Returns the URL for editing this blog’s template HTML.

Must be implemented by subclasses. Currently only implemented by blogger, medium, tumblr, and wordpress_rest.

Returns:string URL
classmethod create_new(handler, user_url=None, **kwargs)[source]

Creates and saves a new Source and adds a poll task for it.

Parameters:
  • handler – the current webapp2.RequestHandler
  • user_url – a string, optional. if provided, supersedes other urls when determining the author_url
  • **kwargs – passed to new()
verified()[source]

Returns True if this source is ready to be used, False otherwise.

See verify() for details. May be overridden by subclasses, e.g. tumblr.Tumblr.

verify(force=False)[source]

Checks that this source is ready to be used.

For blog and listen sources, this fetches their front page HTML and discovers their webmention endpoint. For publish sources, this checks that they have a domain.

May be overridden by subclasses, e.g. tumblr.Tumblr.

Parameters:force – if True, fully verifies (e.g. re-fetches the blog’s HTML and performs webmention discovery) even if we already think this source is verified.
canonicalize_url(url, activity=None, **kwargs)[source]

Canonicalizes a post or object URL.

Wraps oauth_dropins.webutil.util.UrlCanonicalizer.

infer_profile_url(url)[source]

Given an arbitrary URL representing a person, try to find their profile URL for this service.

Queries Bridgy’s registered accounts for users with a particular domain in their silo profile.

Parameters:url – string, a person’s URL
Returns:a string URL for their profile on this service (or None)
preprocess_for_publish(obj)[source]

Preprocess an object before trying to publish it.

By default this tries to massage person tags so that the tag’s “url” points to the person’s profile on this service (as opposed to a person’s homepage).

The object is modified in place.

Parameters:obj – ActivityStreams activity or object dict
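
A simplified sketch of that person-tag massaging, with infer_profile_url stubbed out as a lambda; the tag structure is a minimal ActivityStreams shape, not the full object:

def preprocess_for_publish_sketch(obj, infer_profile_url):
    for tag in obj.get('tags', []):
        if tag.get('objectType') == 'person':
            profile = infer_profile_url(tag.get('url', ''))
            if profile:
                tag['url'] = profile  # modified in place

activity = {'tags': [{'objectType': 'person', 'url': 'https://alice.example/'}]}
preprocess_for_publish_sketch(
    activity, lambda url: 'https://twitter.com/alice' if 'alice' in url else None)
print(activity['tags'][0]['url'])  # https://twitter.com/alice
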
on_new_syndicated_post(syndpost)[source]

Called when a new SyndicatedPost is stored for this source.

Parameters:syndpost – SyndicatedPost
is_private()[source]

Returns True if this source is private aka protected.

…ie their posts are not public.

is_activity_public(activity)[source]

Returns True if the given activity is public, False otherwise.

Just wraps granary.source.Source.is_public(). Subclasses may override.

is_beta_user()[source]

Returns True if this is a “beta” user opted into new features.

Beta users come from beta_users.txt.

is_blocked(obj)[source]

Returns True if an object’s author is being blocked.

…ie they’re in this user’s block list.

class models.Webmentions(*args, **kwds)[source]

Bases: oauth_dropins.webutil.models.StringIdModel

A bundle of links to send webmentions for.

Use the Response and BlogPost concrete subclasses below.

label()[source]

Returns a human-readable string description for use in log messages.

To be implemented by subclasses.

add_task(**kwargs)[source]

Adds a propagate task for this entity.

To be implemented by subclasses.

restart()[source]

Moves status and targets to ‘new’ and adds a propagate task.

class models.Response(*args, **kwds)[source]

Bases: models.Webmentions

A comment, like, or repost to be propagated.

The key name is the comment object id as a tag URI.

restart(source=None)[source]

Moves status and targets to ‘new’ and adds a propagate task.

class models.BlogPost(*args, **kwds)[source]

Bases: models.Webmentions

A blog post to be processed for links to send webmentions to.

The key name is the URL.

class models.PublishedPage(*args, **kwds)[source]

Bases: oauth_dropins.webutil.models.StringIdModel

Minimal root entity for Publish children with the same source URL.

Key id is the string source URL.

class models.Publish(*args, **kwds)[source]

Bases: google.appengine.ext.ndb.model.Model

A comment, like, repost, or RSVP published into a silo.

Child of a PublishedPage entity.

class models.BlogWebmention(*args, **kwds)[source]

Bases: models.Publish, oauth_dropins.webutil.models.StringIdModel

Datastore entity for webmentions for hosted blog providers.

Key id is the source URL and target URL concatenated with a space, ie ‘SOURCE TARGET’. The source URL is always the URL given in the webmention HTTP request. If the source page has a u-url, that’s stored in the u_url property. The target URL is always the final URL, after any redirects.

Reuses Publish’s fields, but otherwise unrelated.

class models.SyndicatedPost(*args, **kwds)[source]

Bases: google.appengine.ext.ndb.model.Model

Represents a syndicated post and its discovered original (or the lack of one, if we found no original post). We discover the relationship by following rel=syndication links on the author’s h-feed.

See original_post_discovery.

Before a SyndicatedPost entity is stored, models.Source.on_new_syndicated_post() is called.

classmethod insert_original_blank(*args, **kwds)[source]

Insert a new original -> None relationship. Does a check-and-set to make sure no previous relationship exists for this original. If there is, nothing will be added.

Parameters:
  • source – Source subclass
  • original – string
classmethod insert_syndication_blank(*args, **kwds)[source]

Insert a new syndication -> None relationship. Does a check-and-set to make sure no previous relationship exists for this syndication. If there is, nothing will be added.

Parameters:
  • source – Source subclass
  • syndication – string
classmethod insert(*args, **kwds)[source]

Insert a new (non-blank) syndication -> original relationship.

This method does a check-and-set within transaction to avoid including duplicate relationships.

If blank entries exist for the syndication or original URL (i.e. syndication -> None or original -> None), they will first be removed. If non-blank relationships exist, they will be retained.

Parameters:
  • source – Source subclass
  • syndication – string (not None)
  • original – string (not None)
Returns:

newly created or preexisting entity

Return type:

SyndicatedPost

original_post_discovery

Augments the standard original_post_discovery algorithm with a reverse lookup that supports posts without a backlink or citation.

Performs a reverse lookup that scans the activity’s author’s h-feed for posts with rel=syndication links. As we find syndicated copies, we save the relationship. If we find the original post for the activity in question, we return the original’s URL.

See http://indiewebcamp.com/posse-post-discovery for more detail.
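
A toy sketch of just the matching step on already-parsed mf2 items: find the original whose u-syndication value matches the silo post. Real discovery also fetches and parses each permalink; the data shape here is simplified:

def find_original(hfeed_items, syndication_url):
    for item in hfeed_items:
        props = item.get('properties', {})
        if syndication_url in props.get('syndication', []):
            return props.get('url', [None])[0]
    return None

hfeed = [{'properties': {
    'url': ['https://author.example/post/1'],
    'syndication': ['https://twitter.com/author/status/123']}}]
print(find_original(hfeed, 'https://twitter.com/author/status/123'))
# https://author.example/post/1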

This feature adds costs in terms of HTTP requests and database lookups in the following primary cases:

  • If the author’s domain is known to be invalid or blacklisted, there will be 0 requests and 0 DB lookups.
  • If a syndicated post has been seen previously (regardless of whether discovery was successful), there will be 0 requests and 1 DB lookup.
  • The first time a syndicated post has been seen:
    • 1 to 2 HTTP requests to get and parse the h-feed plus 1 additional request for each post permalink that has not been seen before.
    • 1 DB query for the initial check plus 1 additional DB query for each post permalink.
original_post_discovery.discover(source, activity, fetch_hfeed=True, include_redirect_sources=True, already_fetched_hfeeds=None)[source]

Augments the standard original_post_discovery algorithm with a reverse lookup that supports posts without a backlink or citation.

If fetch_hfeed is False, then we will check the db for previously found models.SyndicatedPosts but will not do posse-post-discovery to find new ones.

Parameters:
  • source – models.Source subclass. Changes to property values (e.g. domains, domain_urls, last_syndication_url) are stored in source.updates; they should be updated transactionally later.
  • activity – activity dict
  • fetch_hfeed – boolean
  • include_redirect_sources – boolean, whether to include URLs that redirect as well as their final destination URLs
  • already_fetched_hfeeds – set, URLs that we have already fetched and run posse-post-discovery on, so we can avoid running it multiple times
Returns:

(set(string original post URLs), set(string mention URLs)) tuple

original_post_discovery.refetch(source)[source]

Refetch the author’s URLs and look for new or updated syndication links that might not have been there the first time we looked.

Parameters:source – models.Source subclass. Changes to property values (e.g. domains, domain_urls, last_syndication_url) are stored in source.updates; they should be updated transactionally later.
Returns:mapping syndicated_url to a list of new models.SyndicatedPosts
Return type:dict
original_post_discovery.targets_for_response(resp, originals, mentions)[source]

Returns the URLs that we should send webmentions to for a given response.

…specifically, all responses except posts get sent to original post URLs, but only posts and comments get sent to mentioned URLs.

Parameters:
  • resp – ActivityStreams response object
  • originals, mentions – sequences of string URLs
Returns:

set of string URLs
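
A simplified sketch of that routing rule, treating notes and articles as “posts” (the real check uses richer ActivityStreams type handling):

def targets_for_response_sketch(resp, originals, mentions):
    obj_type = resp.get('objectType')
    targets = set()
    if obj_type not in ('note', 'article'):        # everything except posts
        targets |= set(originals)
    if obj_type in ('note', 'article', 'comment'):  # posts and comments
        targets |= set(mentions)
    return targets

print(targets_for_response_sketch(
    {'objectType': 'comment'},
    originals={'https://author.example/post/1'},
    mentions={'https://other.example/mentioned'}))
# a comment goes to both the original post and the mentioned URL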

original_post_discovery.process_entry(source, permalink, feed_entry, refetch, preexisting, store_blanks=True)[source]

Fetch and process an h-entry and save a new models.SyndicatedPost.

Parameters:
  • source
  • permalink – url of the unprocessed post
  • feed_entry – the h-feed version of the h-entry dict, often contains a partial version of the h-entry at the permalink
  • refetch – boolean, whether to refetch and process entries we’ve seen before
  • preexisting – list of previously discovered models.SyndicatedPosts for this permalink
  • store_blanks – boolean, whether we should store blank models.SyndicatedPosts when we don’t find a relationship
Returns:

a dict from syndicated url to a list of new models.SyndicatedPosts

publish

Publishes webmentions into the silos.

Webmention spec: http://webmention.org/

Bridgy request and response details: https://brid.gy/about#response

Example request:

POST /webmention HTTP/1.1
Host: brid.gy
Content-Type: application/x-www-form-urlencoded

source=http://bob.host/post-by-bob&
target=http://facebook.com/123

Example response:

HTTP/1.1 201 Created
Location: http://facebook.com/456_789

{
  "url": "http://facebook.com/456_789",
  "type": "post",
  "id": "456_789"
}
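
The same example request, sent from Python with the requests library; the endpoint path and URLs are taken verbatim from the example above:

import requests

resp = requests.post('https://brid.gy/webmention', data={
    'source': 'http://bob.host/post-by-bob',
    'target': 'http://facebook.com/123',
})
print(resp.status_code)  # 201 on success
print(resp.json())       # e.g. {"url": "...", "type": "post", "id": "..."}
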
class publish.Handler(*args, **kwargs)[source]

Bases: webmention.WebmentionHandler

Base handler for both previews and publishes.

Subclasses must set the PREVIEW attribute to True or False. They may also override other methods.

fetched

requests.Response from fetching source_url

rel-shortlink found in the original post, if any

authorize()[source]

Returns True if the current user is authorized for this request.

Otherwise, should call self.error() to provide an appropriate error message.

attempt_single_item(item)[source]

Attempts to preview or publish a single mf2 item.

Parameters:item – mf2 item dict from mf2py
Returns:CreationResult
preprocess(activity)[source]

Preprocesses an item before trying to publish it.

Specifically, expands inReplyTo/object URLs with rel=syndication URLs.

Parameters:activity – an ActivityStreams activity or object being published
expand_target_urls(activity)[source]

Expand the inReplyTo or object fields of an ActivityStreams object by fetching the original and looking for rel=syndication URLs.

This method modifies the dict in place.

Parameters:activity – an ActivityStreams dict of the activity being published
get_or_add_publish_entity(*args, **kwds)[source]

Creates and stores models.Publish entity.

…and if necessary, models.PublishedPage entity.

Parameters:source_url – string
class publish.PreviewHandler(*args, **kwargs)[source]

Bases: publish.Handler

Renders a preview HTML snippet of how a webmention would be handled.

class publish.SendHandler(*args, **kwargs)[source]

Bases: publish.Handler

Interactive publish handler. Redirected to after each silo’s OAuth dance.

Note that this is GET, not POST, since HTTP redirects always GET.

class publish.WebmentionHandler(*args, **kwargs)[source]

Bases: publish.Handler

Accepts webmentions and translates them to publish requests.

authorize()[source]

Check for a backlink to brid.gy/publish/SILO.

superfeedr

Superfeedr.

https://superfeedr.com/users/snarfed http://documentation.superfeedr.com/subscribers.html http://documentation.superfeedr.com/schema.html

If/when I add support for arbitrary RSS/Atom feeds, I should use http://feediscovery.appspot.com/ for feed discovery based on front page URL.

superfeedr.subscribe(source, handler)[source]

Subscribes to a source.

Also receives some past posts and adds propagate tasks for them.

http://documentation.superfeedr.com/subscribers.html#addingfeedswithpubsubhubbub

Parameters:
superfeedr.handle_feed(feed, source)[source]

Handles a Superfeedr JSON feed.

Creates models.BlogPost entities and adds propagate-blogpost tasks for new items.

http://documentation.superfeedr.com/schema.html#json http://documentation.superfeedr.com/subscribers.html#pubsubhubbubnotifications

Parameters:
  • feed – string, Superfeedr JSON feed
  • source – Blogger, Tumblr, or WordPress
class superfeedr.NotifyHandler(*args, **kwargs)[source]

Bases: util.Handler

Handles a Superfeedr notification.

Abstract; subclasses must set the SOURCE_CLS attr.

http://documentation.superfeedr.com/subscribers.html#pubsubhubbubnotifications

tasks

tumblr

Tumblr + Disqus blog webmention implementation.

To use, go to your Tumblr dashboard, click Customize, Edit HTML, then put this in the head section:

<link rel="webmention" href="https://brid.gy/webmention/tumblr">

http://disqus.com/api/docs/ http://disqus.com/api/docs/posts/create/ https://github.com/disqus/DISQUS-API-Recipes/blob/master/snippets/php/create-guest-comment.php http://help.disqus.com/customer/portal/articles/466253-what-html-tags-are-allowed-within-comments-

create returns id, can lookup by id w/getContext?

guest post (w/arbitrary author, url): http://spirytoos.blogspot.com/2013/12/not-so-easy-posting-as-guest-via-disqus.html http://stackoverflow.com/questions/15416688/disqus-api-create-comment-as-guest http://jonathonhill.net/2013-07-11/disqus-guest-posting-via-api/

can send url and not look up disqus thread id! http://stackoverflow.com/questions/4549282/disqus-api-adding-comment https://disqus.com/api/docs/forums/listThreads/

test command line: curl localhost:8080/webmention/tumblr -d 'source=http://localhost/response.html&target=http://snarfed.tumblr.com/post/60428995188/glen-canyon-http-t-co-fzc4ehiydp?foo=bar#baz'

class tumblr.Tumblr(*args, **kwds)[source]

Bases: models.Source

A Tumblr blog.

The key name is the blog domain.

static new(handler, auth_entity=None, blog_name=None, **kwargs)[source]

Creates and returns a Tumblr for the logged in user.

Parameters:
verified()[source]

Returns True if we’ve found the webmention endpoint and Disqus.

verify()[source]

Checks that Disqus is installed as well as the webmention endpoint.

Stores the result in webmention_endpoint.

create_comment(post_url, author_name, author_url, content)[source]

Creates a new comment in the source silo.

Must be implemented by subclasses.

Parameters:
  • post_url – string
  • author_name – string
  • author_url – string
  • content – string
Returns:

JSON response dict with ‘id’ and other fields

static disqus_call(method, url, params, **kwargs)[source]

Makes a Disqus API call.

Parameters:
  • method – requests function to use, e.g. requests.get
  • url – string
  • params – query parameters
  • kwargs – passed through to method
Returns:

dict, JSON response

twitter

Twitter source code and datastore model classes.

Twitter’s rate limiting window is currently 15m. A normal poll with nothing new hits /statuses/user_timeline and /search/tweets once each. Both allow 180 calls per window before they’re rate limited. https://dev.twitter.com/docs/rate-limiting/1.1/limits

class twitter.Twitter(*args, **kwds)[source]

Bases: models.Source

A Twitter account.

The key name is the username.

GR_CLASS

alias of Twitter

static new(handler, auth_entity=None, **kwargs)[source]

Creates and returns a Twitter entity.

Parameters:
silo_url()[source]

Returns the Twitter account URL, e.g. https://twitter.com/foo.

label_name()[source]

Returns the username.

Searches for activities with links to any of this source’s web sites.

Twitter search supports OR: https://dev.twitter.com/rest/public/search

…but it only returns complete(ish) results if we strip the scheme from URLs, ie search for example.com instead of http://example.com/, and that also returns false positives, so we check that the returned tweets actually have matching links. https://github.com/snarfed/bridgy/issues/565

Returns:sequence of ActivityStreams activity dicts
get_like(activity_user_id, activity_id, like_user_id, **kwargs)[source]

Returns an ActivityStreams ‘like’ activity object for a favorite.

We get Twitter favorites by scraping HTML, and we only get the first page, which only has 25. So, use a models.Response in the datastore first, if we have one, and only re-scrape HTML as a fallback.

Parameters:
  • activity_user_id – string id of the user who posted the original activity
  • activity_id – string activity id
  • like_user_id – string id of the user who liked the activity
  • kwargs – passed to granary.source.Source.get_comment()
is_private()[source]

Returns True if this Twitter account is protected.

https://dev.twitter.com/rest/reference/get/users/show#highlighter_25173 https://support.twitter.com/articles/14016 https://support.twitter.com/articles/20169886

canonicalize_url(url, activity=None, **kwargs)[source]

Normalize /statuses/ to /status/.

https://github.com/snarfed/bridgy/issues/618

is_blocked(obj)[source]

Returns True if an object’s author is being blocked.

…ie they’re in this user’s block list.

class twitter.AuthHandler(*args, **kwargs)[source]

Bases: util.Handler

Base OAuth handler class.

start_oauth_flow(feature)[source]

Redirects to Twitter’s OAuth endpoint to start the OAuth flow.

Parameters:feature – ‘listen’ or ‘publish’
class twitter.StartHandler(*args, **kwargs)[source]

Bases: twitter.AuthHandler

Custom OAuth start handler so we can use access_type=read for state=listen.

Tweepy converts access_type to x_auth_access_type for Twitter’s oauth/request_token endpoint. Details: https://dev.twitter.com/docs/api/1/post/oauth/request_token

util

Misc utility constants and classes.

util.now_fn()

[tz] -> new datetime with tz’s local day and time.

class util.Login(site, name, path)

Bases: tuple

name

Alias for field number 1

path

Alias for field number 2

site

Alias for field number 0
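
Login is a namedtuple with the fields (site, name, path) listed above; an equivalent sketch with made-up example values:

from collections import namedtuple

Login = namedtuple('Login', ['site', 'name', 'path'])

# the field values below are invented for illustration
login = Login(site='twitter', name='snarfed_org', path='/twitter/snarfed_org')
print(login.site, login.name, login.path)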

util.add_poll_task(source, now=False, **kwargs)[source]

Adds a poll task for the given source entity.

Pass now=True to insert a poll-now task.

util.add_propagate_task(entity, **kwargs)[source]

Adds a propagate task for the given response entity.

util.add_propagate_blogpost_task(entity, **kwargs)[source]

Adds a propagate-blogpost task for the given response entity.

util.add_discover_task(source, post_id, type=None, **kwargs)[source]

Adds a discover task for the given source and silo post id.

util.webmention_endpoint_cache_key(url)[source]

Returns memcache key for a cached webmention endpoint for a given URL.

Example: ‘W https snarfed.org’
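
A sketch matching the documented ‘W <scheme> <domain>’ format; the domain extraction here is simplified (nothing beyond urlparse), so treat it as an illustration:

from urllib.parse import urlparse

def webmention_endpoint_cache_key_sketch(url):
    # simplified: uses the raw netloc as the domain
    parsed = urlparse(url)
    return 'W %s %s' % (parsed.scheme, parsed.netloc)

print(webmention_endpoint_cache_key_sketch('https://snarfed.org/2017/foo'))
# W https snarfed.org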

util.email_me(**kwargs)[source]

Thin wrapper around mail.send_mail() that handles errors.

util.requests_get(url, **kwargs)[source]

Wraps requests.get() with extra semantics and our user agent.

If a server tells us a response will be too big (based on Content-Length), we hijack the response and return 599 and an error response body instead. We pass stream=True to requests.get() so that it doesn’t fetch the response body until we access requests.Response.content (or requests.Response.text).

http://docs.python-requests.org/en/latest/user/advanced/#body-content-workflow
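
A hedged sketch of that guard; the size limit and the 599 stand-in are illustrative, and the real wrapper also sets Bridgy’s user agent:

import requests

MAX_RESPONSE_BYTES = 1024 * 1024  # illustrative limit, not Bridgy's

def requests_get_sketch(url, **kwargs):
    kwargs.setdefault('stream', True)  # defer fetching the body
    resp = requests.get(url, **kwargs)
    length = resp.headers.get('Content-Length')
    if length and int(length) > MAX_RESPONSE_BYTES:
        resp.close()
        resp.status_code = 599                      # hijacked error response
        resp._content = b'Response body too large'  # sketch shortcut: overwrite the body
    return resp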

util.follow_redirects(url, cache=True)[source]

Wraps oauth_dropins.webutil.util.follow_redirects() with our settings.

…specifically memcache and REQUEST_HEADERS.

util.get_webmention_target(url, resolve=True, replace_test_domains=True)[source]

Resolves a URL and decides whether we should try to send it a webmention.

Note that this ignores failed HTTP requests, ie the boolean in the returned tuple will be true! TODO: check callers and reconsider this.

Parameters:
  • url – string
  • resolve – whether to follow redirects
  • replace_test_domains – whether to replace test user domains with localhost
Returns:

(string url, string pretty domain, boolean) tuple. The boolean is True if we should send a webmention, False otherwise, e.g. if it’s a bad URL, not text/html, or in the blacklist.

util.in_webmention_blacklist(domain)[source]

Returns True if the domain or its root domain is in BLACKLIST.

util.prune_activity(activity, source)[source]

Prunes an activity down to just id, url, content, to, and object, in place.

If the object field exists, it’s pruned down to the same fields. Any fields duplicated in both the activity and the object are removed from the object.

Note that this only prunes the to field if it says the activity is public, since granary.source.Source.is_public() defaults to saying an activity is public if the to field is missing. If that ever changes, we’ll need to start preserving the to field here.

Parameters:activity – ActivityStreams activity dict
Returns:pruned activity dict
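
A simplified sketch of that pruning; unlike the real function it returns a pruned copy instead of modifying in place, and it skips the is_public() check on the to field:

KEEP = ('id', 'url', 'content', 'to', 'object')

def prune_activity_sketch(activity):
    pruned = {k: v for k, v in activity.items() if k in KEEP and v}
    obj = pruned.get('object')
    if isinstance(obj, dict):
        # drop object fields that just duplicate the activity's
        pruned['object'] = {k: v for k, v in obj.items()
                            if k in KEEP and v and v != activity.get(k)}
    return pruned

print(prune_activity_sketch({
    'id': 'tag:twitter.com:123',
    'url': 'https://twitter.com/foo/status/123',
    'actor': {'id': 'tag:twitter.com:foo'},
    'object': {'url': 'https://twitter.com/foo/status/123',
               'content': 'hello world'},
}))
# actor is dropped; the object keeps content but loses its duplicate url
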
util.prune_response(response)[source]

Returns a response object dict with a few fields removed.

Parameters:response – ActivityStreams response object
Returns:pruned response object
util.replace_test_domains_with_localhost(url)[source]

Replace domains in LOCALHOST_TEST_DOMAINS with localhost for local testing when in DEBUG mode.

Parameters:url – a string
Returns:a string with certain well-known domains replaced by localhost
class util.Handler(*args, **kwargs)[source]

Bases: oauth_dropins.webutil.handlers.ModernHandler

Includes misc request handler utilities.

messages

list of notification messages to be rendered in this page or wherever it redirects

redirect(uri, **kwargs)[source]

Adds self.messages to the fragment, separated by newlines.

maybe_add_or_delete_source(source_cls, auth_entity, state, **kwargs)[source]

Adds or deletes a source if auth_entity is not None.

Used in each source’s oauth-dropins CallbackHandler.finish() and CallbackHandler.get() methods, respectively.

Parameters:
  • source_cls – source class, e.g. instagram.Instagram
  • auth_entity – oauth-dropins auth entity
  • state – string, OAuth callback state parameter. a JSON serialized dict with operation, feature, and an optional callback URL. For deletes, it will also include the source key
  • kwargs – passed through to the source_cls constructor
Returns:

source entity if it was created or updated, otherwise None

construct_state_param_for_add(state=None, **kwargs)[source]

Construct the state parameter if one isn’t explicitly passed in.

The following keys are common:

  • operation: ‘add’ or ‘delete’
  • feature: ‘listen’, ‘publish’, or ‘webmention’
  • callback: an optional external callback that we will redirect to at the end of the authorization handshake
  • source: the source key, only applicable to deletes
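
A sketch of an ‘add’ state value with those keys, serialized as the JSON dict the docs describe (the exact key order and URL-encoding details are assumptions):

import json
from urllib.parse import quote

state = json.dumps({
    'operation': 'add',
    'feature': 'listen',
    'callback': 'https://example.com/callback',  # invented example callback
}, sort_keys=True)

print(quote(state))  # suitable for passing through as the OAuth state parameter
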
get_logins()[source]

Extracts the current user page paths from the logins cookie.

Returns:list of Login objects
set_logins(logins)[source]

Sets a logins cookie.

Parameters:logins – sequence of Login objects
preprocess_source(source)[source]

Prepares a source entity for rendering in the source.html template.

  • use id as name if name isn’t provided
  • convert image URLs to https if we’re serving over SSL
  • set ‘website_links’ attr to list of pretty HTML links to domain_urls
Parameters:source – models.Source entity
util.oauth_starter(oauth_start_handler, **kwargs)[source]

Returns an oauth-dropins start handler that injects the state param.

Parameters:
class util.CachedPage(*args, **kwds)[source]

Bases: oauth_dropins.webutil.models.StringIdModel

Cached HTML for pages that change rarely. Key id is path.

Stored in the datastore since other datastore entities in memcache (mostly models.Response) are requested far more often, so this would get evicted from memcache easily.

Keys, useful for deleting from memcache:

/: aglzfmJyaWQtZ3lyEQsSCkNhY2hlZFBhZ2UiAS8M
/users: aglzfmJyaWQtZ3lyFgsSCkNhY2hlZFBhZ2UiBi91c2Vycww

classmethod store(path, html, expires=None)[source]

Stores new page contents.

Parameters:
util.unwrap_t_umblr_com(url)[source]

If url is a t.umblr.com short link, extract its destination URL.

Otherwise, return url unchanged.

Not in tumblr.py since models imports superfeedr, so it would be a circular import.

Background: https://github.com/snarfed/bridgy/issues/609
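
A hedged sketch, assuming the destination lives in t.umblr.com’s z query parameter (that parameter name is an assumption); any other URL passes through unchanged:

from urllib.parse import urlparse, parse_qs

def unwrap_t_umblr_com_sketch(url):
    parsed = urlparse(url)
    if parsed.netloc == 't.umblr.com':
        dest = parse_qs(parsed.query).get('z')  # 'z' parameter name is an assumption
        if dest:
            return dest[0]
    return url

print(unwrap_t_umblr_com_sketch(
    'https://t.umblr.com/redirect?z=https%3A%2F%2Fexample.com%2Fpost&t=abc'))
# https://example.com/post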

util.cache_time(*args, **kwds)[source]

Times a block of code, logs the time, and aggregates it in memcache.

util.beautifulsoup_parse(html)[source]

Parses an HTML string with BeautifulSoup. Centralizes our parsing config.

We currently let BeautifulSoup default to lxml, which is the fastest option. http://www.crummy.com/software/BeautifulSoup/bs4/doc/#specifying-the-parser-to-use

We use App Engine’s lxml by declaring it in app.yaml.

util.mf2py_parse(input, url)[source]

Uses mf2py to parse input, which may be an HTML string or a BeautifulSoup document.

webmention

Base handler class and common utilities for handling webmentions.

Used in publish.py and blog_webmention.py.

Webmention spec: http://webmention.org/

class webmention.WebmentionGetHandler(*args, **kwargs)[source]

Bases: util.Handler

Renders a simple placeholder HTTP page for GETs to webmention endpoints.

class webmention.WebmentionHandler(*args, **kwargs)[source]

Bases: webmention.WebmentionGetHandler

Webmention handler.

Attributes:

fetch_mf2(url)[source]

Fetches a URL and extracts its mf2 data.

Side effects: sets entity.html on success, calls error() on errors.

Parameters:url – string
Returns:(requests.Response, mf2 data dict) on success, None on failure
error(error, html=None, status=400, data=None, log_exception=True, mail=False)[source]

Handle an error. May be overridden by subclasses.

Parameters:
  • error – string human-readable error message
  • html – string HTML human-readable error message
  • status – int HTTP response status code
  • data – mf2 data dict parsed from source page
  • log_exception – boolean, whether to include a stack trace in the log msg
  • mail – boolean, whether to email me

wordpress_rest

WordPress REST API (including WordPress.com) hosted blog implementation.

To use, go to your WordPress.com blog’s admin console, then go to Appearance, Widgets, add a Text widget, and put this in its text section:

<a href="https://brid.gy/webmention/wordpress" rel="webmention"></a>

(not this, it breaks :/) <link rel="webmention" href="https://brid.gy/webmention/wordpress">

https://developer.wordpress.com/docs/api/

create returns id, can lookup by id

test command line: curl localhost:8080/webmention/wordpress -d 'source=http://localhost/response.html&target=http://ryandc.wordpress.com/2013/03/24/mac-os-x/'

making an API call with an access token from the command line: curl -H 'Authorization: Bearer [TOKEN]' URL…

class wordpress_rest.WordPress(*args, **kwds)[source]

Bases: models.Source

A WordPress blog.

The key name is the blog hostname.

static new(handler, auth_entity=None, **kwargs)[source]

Creates and returns a WordPress for the logged in user.

Parameters:
create_comment(post_url, author_name, author_url, content)[source]

Creates a new comment in the source silo.

If the last part of the post URL is numeric, e.g. http://site/post/123999, it’s used as the post id. Otherwise, we extract the last part of the path as the slug, e.g. http://site/post/the-slug, and look up the post id via the API.

Parameters:
  • post_url – string
  • author_name – string
  • author_url – string
  • content – string
Returns:

JSON response dict with ‘id’ and other fields
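
A sketch of just that post id vs. slug decision; the API lookup for slugs is left as a note rather than implemented:

def post_id_or_slug(post_url):
    last = post_url.rstrip('/').rsplit('/', 1)[-1]
    if last.isdigit():
        return 'id', last
    return 'slug', last  # would be resolved to an id via the WordPress REST API

print(post_id_or_slug('http://site/post/123999'))    # ('id', '123999')
print(post_id_or_slug('http://site/post/the-slug'))  # ('slug', 'the-slug')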

classmethod get_site_info(handler, auth_entity)[source]

Fetches the site info from the API.

Parameters:
Returns:

site info dict, or None if API calls are disabled for this blog