Developers often struggle to get a user's data (tweets and their engagement information) out of the Twitter platform. Twitter provides this data as a downloadable zip archive on the user's request, but the main challenge is parsing that archive into a data structure an application can consume.
This library parses the user's Twitter data and exposes it as data structures that any application or library can consume directly. The user downloads their data from the Twitter platform and passes the archive to this library, which loads the tweets and engagement information into memory so they can be used to build any business use case.
- Request your Twitter archive (Settings > Your account > Download an archive of your data).
- Twitter may take around 24 hours to prepare the zip file.
- Once the archive is ready, download the zip file from the same location (Download an archive of your data).
- Save the zip file to your local system.
- Import the twitter-parser library into your local project.
- To import the library, follow these steps.
- Install pip and pipenv.
- There are two options: import the library directly from GitHub, or clone it to your local system and install it from the clone.
- Import directly from GitHub (recommended)
    # If you are using pipenv, use this
    pipenv install git+ssh://git@github.com/getphyllo/twitter-parser@<branch_name>#egg=twitter_parser
    # If you are using pip, use this
    pip install git+ssh://git@github.com/getphyllo/twitter-parser@<branch_name>#egg=twitter_parser
- Clone the twitter-parser repository to your local system and install it from there.
    # If you are using pipenv, use this
    pipenv install <path_to_git_clone_directory>
    # If you are using pip, use this
    pip install <path_to_git_clone_directory>
- The library will be installed in your local project.
- Provide the path_to_zip_file of the archived data as a parameter to the TwitterDataParser class.

    from twitter_parser.core.parser import TwitterDataParser

    if __name__ == "__main__":
        # Here, the zip file is saved on the desktop, so the path is /Users/my_user_name/Desktop/twitter_data.zip
        twitter_info = TwitterDataParser(path_to_zip_file='/Users/my_user_name/Desktop/twitter_data.zip')
        var = twitter_info.retrieve_information()
        print(var)

- To extract information about other users, you need to provide a bearer token. (This is mandatory: the token is what allows the library to fetch other users' info.)
- Once you have the bearer token, open the twitter_parser/utils/constants.py file, find the BEARER_TOKEN variable, and paste the token there.
- The code snippet above calls the retrieve_information method, but you can also call the tweets, following, followers, direct_messages, and group_direct_messages methods. The retrieve_information method returns the following information:

    user_name
    following
    followers
    tweets
    DMs
    media
    groups_dms
    following_count
    follower_count

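Before handing path_to_zip_file to the parser in the steps above, it can help to confirm that the path really points to a zip archive. The following is a minimal sketch using only the Python standard library; check_archive is a hypothetical helper name and is not part of twitter-parser.

```python
from pathlib import Path
import zipfile


def check_archive(path_to_zip_file: str) -> Path:
    """Return the archive path if it points to a readable zip file, else raise.

    Hypothetical pre-flight helper; it is not provided by twitter-parser.
    """
    path = Path(path_to_zip_file).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No file found at {path}")
    # is_zipfile inspects the file's contents rather than trusting the extension.
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a valid zip archive")
    return path
```

If the check passes, the returned path can be converted with str() and passed as path_to_zip_file. Failing early here gives a clearer error than a failure partway through parsing.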
- Extracts the zip file's contents to the current working directory.
- Converts the tweets to a readable format and stores them as List[Tweet].

    Tweet(BaseModel):
        tweet_year
        tweet_type
        retweeted_from
        replied_to_names
        replying_to_tweet
        tweet_data
        tweeted_at
        tweet_url
        tweets_attached_media

- Replaces URLs with their original versions (the ones found in the archive).
- Copies the images that are used to an output folder so they can be moved to a new home.
- Queries Twitter for any missing user handles (check with yours first).
- Converts DMs to List[DirectMessage] format.

    DirectMessage(BaseModel):
        dm_from
        dm_to
        dm_data
        dm_at

- Converts group DMs to List[GroupDirectMessages] format.

    GroupDirectMessages(BaseModel):
        group_name
        group_dms
        group_participant

    GroupDirectMessage(BaseModel):
        group_dm_from
        group_dm_data
        group_dm_at

- Converts the following and followers lists into List[User] format.

    User(BaseModel):
        user_handle
        user_profile_url

- Converts all the data into TwitterUserInfo format.

    TwitterUserInfo(BaseModel):
        user_name
        following
        followers
        tweets
        DMs
        media
        groups_dms
        following_count
        follower_count

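To illustrate how the parsed tweets could be consumed, here is a minimal sketch that counts tweets per year and type. It is self-contained, so it uses a hypothetical TweetStub dataclass in place of the library's Tweet model. Only the field names tweet_year and tweet_type come from the model above; their types and the sample values are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TweetStub:
    """Hypothetical stand-in for the library's Tweet model (two fields only)."""
    tweet_year: int  # assumed to be an integer year
    tweet_type: str  # assumed to be a short label; the values below are invented


def count_by_year_and_type(tweets: List[TweetStub]) -> Dict[Tuple[int, str], int]:
    """Count tweets for each (year, type) pair."""
    return dict(Counter((t.tweet_year, t.tweet_type) for t in tweets))


# Invented sample data, used only to exercise the function.
sample = [
    TweetStub(2021, "tweet"),
    TweetStub(2021, "retweet"),
    TweetStub(2021, "tweet"),
    TweetStub(2022, "reply"),
]
summary = count_by_year_and_type(sample)
for (year, kind), total in sorted(summary.items()):
    print(year, kind, total)
```

The same function should work on the real tweet list as long as each item exposes tweet_year and tweet_type as attributes, since it relies only on those two names.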