Data Collection
Write a script, or use a tool from GitHub, to collect public tweets containing the hashtag #TigrayGenocide on Twitter for the period 13 April 2021 to 15 April 2021, and share the dataset as a zipped CSV file.
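Whichever collection tool is used, the last two steps (restricting the tweets to the requested window and packaging them as a zipped CSV) can be sketched with the Python standard library. This is only an illustration: the field names `created_at`, `author_handle` and `text`, the ISO 8601 timestamp format, and the file name `tweets.csv` are assumptions, not something the task prescribes.

```python
import csv
import io
import zipfile
from datetime import date, datetime

# Collection window from the task: 13 April 2021 to 15 April 2021, inclusive.
START = date(2021, 4, 13)
END = date(2021, 4, 15)
HASHTAG = "#tigraygenocide"


def filter_tweets(tweets):
    """Keep tweets that mention the hashtag and fall inside the window.

    Assumes each tweet is a dict with an ISO 8601 `created_at` string
    and a `text` string (hypothetical field names).
    """
    kept = []
    for tweet in tweets:
        posted = datetime.fromisoformat(tweet["created_at"]).date()
        if START <= posted <= END and HASHTAG in tweet["text"].lower():
            kept.append(tweet)
    return kept


def write_zipped_csv(tweets, zip_target, csv_name="tweets.csv"):
    """Write the tweets to a single CSV file inside a zip archive.

    `zip_target` may be a file path or a binary file-like object.
    """
    fields = ["created_at", "author_handle", "text"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(tweets)
    with zipfile.ZipFile(zip_target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(csv_name, buffer.getvalue())
```

For example, `write_zipped_csv(filter_tweets(collected), "tigray_tweets.zip")` would produce the archive to share, where `collected` stands for whatever list of tweet dictionaries the collection step returned.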
Data Analysis
You can find the data in the data folder above. The data is a sample of tweets collected during a previous investigation.
For a forensic investigation focusing on social network analysis, defining clusters or groups in a dataset is essential in understanding the formation of communities involved in shaping a particular conversation. An example would be identifying a group of accounts on Twitter that work together to push a particular topic or narrative.
From the data provided above:
- What subgroups can you identify based on the volume of tweets in each category and the number of unique accounts for each?
- Who is the most prominent author for each category type?
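One way to approach both questions is to count tweets and distinct accounts per category in a single pass. The sketch below assumes the rows come from `csv.DictReader` and that the columns are called `category` and `author_handle`; those names are guesses, so adjust them to the actual header. It also reads "most prominent" as "posted the most tweets", which is only one possible interpretation.

```python
from collections import Counter, defaultdict


def summarise_categories(rows):
    """Return per-category tweet volume, unique accounts, and top author.

    `rows` is an iterable of dicts with `category` and `author_handle`
    keys (hypothetical column names).
    """
    authors_by_category = defaultdict(Counter)
    for row in rows:
        authors_by_category[row["category"]][row["author_handle"]] += 1

    summary = {}
    for category, authors in authors_by_category.items():
        # most_common(1) returns the author with the highest tweet count.
        top_author, top_count = authors.most_common(1)[0]
        summary[category] = {
            "tweets": sum(authors.values()),
            "unique_accounts": len(authors),
            "top_author": top_author,
            "top_author_tweets": top_count,
        }
    return summary
```

Comparing `tweets` with `unique_accounts` per category gives a rough sense of whether a category is driven by many accounts or by a few very active ones.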
Establishing who started the conversation on social media, the amplification point, and how the conversation evolved over time is key to identifying the main perpetrators spearheading a particular disinformation narrative.
Using the data provided, identify:
- The account name and account handle of the author who posted the first tweet.
- The date that had the highest volume of tweets (the amplification point).
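Both items reduce to simple operations on the timestamp column: the earliest row, and the calendar day with the most rows. A minimal sketch, assuming a `created_at` column holding ISO 8601 timestamps in one consistent format (the column name and format are assumptions):

```python
from collections import Counter
from datetime import datetime


def first_tweet(rows):
    """Return the row with the earliest `created_at` timestamp."""
    return min(rows, key=lambda row: datetime.fromisoformat(row["created_at"]))


def peak_date(rows):
    """Return (date, tweet count) for the day with the highest volume."""
    per_day = Counter(
        datetime.fromisoformat(row["created_at"]).date() for row in rows
    )
    return max(per_day.items(), key=lambda item: item[1])
```

The account name and handle can then be read from whichever columns of the row returned by `first_tweet` hold them.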
"I am here for a purpose and that purpose is to grow into a mountain, not to shrink to a grain of sand. - Mandino #quote via @roxanamjones"
This is a tweet found within the dataset. For this exercise:
- Identify the author of the tweet.
- Create a subset with all tweets from the same author.
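The two steps above can be sketched as a text search followed by a filter. The column names `text` and `author_handle` are assumptions, and matching on a distinctive fragment rather than the full tweet is a deliberate choice, since quoting and whitespace often differ between the brief and the file.

```python
def find_authors(rows, snippet):
    """Return the handles of all accounts whose tweet text contains `snippet`."""
    needle = snippet.casefold()
    return sorted(
        {row["author_handle"] for row in rows if needle in row["text"].casefold()}
    )


def tweets_by_author(rows, handle):
    """Return the subset of rows posted by `handle`."""
    return [row for row in rows if row["author_handle"] == handle]
```

For example, `find_authors(rows, "grow into a mountain")` should return the handle or handles that posted the quoted tweet, and `tweets_by_author(rows, handle)` then yields the subset. More than one handle may come back if the text was retweeted or copied.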
Using a plotting library or tool, display the daily number of tweets from:
- The top 2 categories within the dataset, ranked by volume of tweets.
- The 3rd and 4th categories, ranked by volume of tweets.
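A possible way to prepare and draw these charts is sketched below. The counting uses only the standard library; the plotting helper assumes matplotlib is installed, which is one choice among many plotting tools. The column names `category` and `created_at` are assumptions.

```python
from collections import Counter, defaultdict
from datetime import datetime


def rank_categories(rows):
    """Return categories ordered from highest to lowest tweet volume."""
    volumes = Counter(row["category"] for row in rows)
    return [category for category, _ in volumes.most_common()]


def daily_counts(rows, categories):
    """Map each requested category to a {date: tweet count} dictionary."""
    counts = defaultdict(Counter)
    for row in rows:
        if row["category"] in categories:
            day = datetime.fromisoformat(row["created_at"]).date()
            counts[row["category"]][day] += 1
    return {category: dict(counts[category]) for category in categories}


def plot_daily_counts(counts, title):
    """Draw one line per category; matplotlib is imported only when called."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for category, per_day in counts.items():
        days = sorted(per_day)
        ax.plot(days, [per_day[day] for day in days], marker="o", label=category)
    ax.set_xlabel("Date")
    ax.set_ylabel("Tweets per day")
    ax.set_title(title)
    ax.legend()
    fig.autofmt_xdate()
    return fig
```

With `ranked = rank_categories(rows)`, the first chart would use `daily_counts(rows, ranked[:2])` and the second `daily_counts(rows, ranked[2:4])`.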
- What are some of the characteristics you would consider when identifying fake or suspicious accounts on Twitter?
- Using the data provided, identify one account that you suspect might be automated and flag some of the bot-like traits you identified.
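Two traits that can be measured directly from a tweet export are an unusually high posting rate and a large share of repeated text. The sketch below computes both per account as a starting point for manual review. The column names and the thresholds `max_daily` and `max_duplicate_share` are illustrative assumptions, not established cut-offs, and a flagged account is a candidate for closer inspection rather than a confirmed bot.

```python
from collections import Counter, defaultdict
from datetime import datetime


def account_activity(rows):
    """Compute simple activity statistics for every account."""
    by_account = defaultdict(list)
    for row in rows:
        by_account[row["author_handle"]].append(row)

    profile = {}
    for handle, tweets in by_account.items():
        days = {datetime.fromisoformat(t["created_at"]).date() for t in tweets}
        texts = Counter(t["text"].strip().casefold() for t in tweets)
        # Tweets whose exact text appears more than once for this account.
        repeated = sum(n for n in texts.values() if n > 1)
        profile[handle] = {
            "tweets": len(tweets),
            "tweets_per_active_day": len(tweets) / len(days),
            "duplicate_share": repeated / len(tweets),
        }
    return profile


def flag_suspicious(profile, max_daily=50, max_duplicate_share=0.5):
    """Return handles exceeding either (arbitrary, illustrative) threshold."""
    return sorted(
        handle
        for handle, stats in profile.items()
        if stats["tweets_per_active_day"] > max_daily
        or stats["duplicate_share"] > max_duplicate_share
    )
```

Other traits, such as account creation date, default profile images or follower ratios, would need columns that may or may not be present in the dataset.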