sharkey_crawler package

Module contents

This package provides an accessor for Sharkey servers.

Classes:

SharkeyServer(base_url)

Local representation of a Sharkey server; exposes server API endpoints and parses the returned data.

class sharkey_crawler.main.SharkeyServer(base_url)[source]

Bases: object

Local representation of a Sharkey server; exposes server API endpoints and parses the returned data.

If you require more endpoints, feel free to open a pull request or discussion.

Methods:

__init__(base_url)

user_notes(user_id[, with_channel_notes, ...])

Returns the latest posts from a user.

Parameters:

base_url (str)

__init__(base_url)[source]
Parameters:

base_url (str) – base URL of the Sharkey server. If no scheme is passed, https is assumed.

Returns:

new sharkey proxy instance
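
The scheme rule described for base_url can be illustrated with a small stand-alone sketch. The helper name normalize_base_url and the trailing-slash handling are assumptions made for this illustration; they are not taken from the sharkey_crawler source, which may normalize the URL differently.

```python
def normalize_base_url(base_url: str) -> str:
    """Hypothetical helper: default to https when no scheme is given."""
    cleaned = base_url.strip().rstrip("/")  # trailing-slash removal is an assumption
    if "://" not in cleaned:
        # No scheme present, so assume https as the parameter description states.
        cleaned = "https://" + cleaned
    return cleaned
```

Under this sketch, a bare host name gains an https:// prefix, while a URL that already carries a scheme (for example a local http test instance) keeps it.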

user_notes(user_id, with_channel_notes=False, with_renotes=True, with_files=False, with_replies=False, limit=10, allow_partial=False, since_date=None, until_date=None, since_id=None, until_id=None, timeout=300)[source]

Returns the latest posts from a user.

WARNING: Because this functionality is not documented upstream, the meaning of the arguments is an educated guess. I can only spend so much time reading other people's code. Please open an issue if I got something wrong. If you want to contribute, have a look at the code yourself at https://activitypub.software/TransFem-org/Sharkey

Parameters:
  • user_id (SharkeyId) – ID of the user whose posts you want to crawl

  • with_channel_notes (bool)

  • with_renotes (bool) – include boosts (boosts that quote something are always included)

  • with_files (bool) – include posts with files

  • with_replies (bool) – include replies to other users

  • limit (Annotated[int, Interval(ge=0, le=100)]) – maximum number of posts, between 1 and 100

  • allow_partial (bool) – read only from Redis; do not fall back to the database to fill the limit

  • since_date (int | None) – get posts from this date onward (inclusive), expressed as milliseconds since the epoch. Do not combine with any other since_ or until_ argument

  • until_date (int | None) – get posts up to this date (inclusive), expressed as milliseconds since the epoch. Do not combine with any other since_ or until_ argument

  • since_id (SharkeyId | None) – get posts after this id (inclusive). Do not combine with any other since_ or until_ argument

  • until_id (SharkeyId | None) – get posts before this id (inclusive). Do not combine with any other since_ or until_ argument

  • timeout (int | float | None) – timeout of the request
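
The date arguments above expect milliseconds since the epoch, and only one since_ or until_ argument should be given per call. A minimal sketch of both points follows. The helpers to_epoch_ms and window_kwargs are hypothetical names introduced for this illustration and are not part of sharkey_crawler; treating naive datetimes as UTC is also an assumption.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Hypothetical helper: convert a datetime to milliseconds since the epoch."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def window_kwargs(**bounds):
    """Hypothetical helper: allow at most one since_/until_ argument."""
    allowed = {"since_date", "until_date", "since_id", "until_id"}
    unknown = set(bounds) - allowed
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    given = {name: value for name, value in bounds.items() if value is not None}
    if len(given) > 1:
        raise ValueError("use only one since_ or until_ argument per call")
    return given
```

The dictionary returned by window_kwargs could then be unpacked into a user_notes call, for example window_kwargs(since_date=to_epoch_ms(some_datetime)).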

Return type:

list[Post]

Returns:

list of posts
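
As a hedged usage sketch, the wrapper below calls user_notes with the keyword arguments documented above. It accepts any object that offers a compatible user_notes method, so it can be exercised without network access. The names fetch_original_posts and FakeServer, and the user id in the usage note, are hypothetical and exist only for this illustration.

```python
def fetch_original_posts(server, user_id, limit=10):
    """Hypothetical wrapper: fetch a user's posts without plain boosts or replies."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Per the parameter description, boosts that quote something are still
    # included even when with_renotes is False.
    return server.user_notes(
        user_id,
        with_renotes=False,
        with_replies=False,
        limit=limit,
    )


class FakeServer:
    """Stand-in for SharkeyServer that records calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def user_notes(self, user_id, **kwargs):
        self.calls.append((user_id, kwargs))
        return []
```

With the real package, an instance of sharkey_crawler.main.SharkeyServer, constructed with the base URL of the target server, would take the place of FakeServer.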