Twitter Scraper using Streamlit and SNScrape

Data Scraping, Data Visualisation, Exploring Streamlit UI

Service

Code UI in Streamlit

Client

Staff at my University

Year

2023

Problem Statement

Today, data is scattered all over the world. Social media in particular holds large quantities of it, on Facebook, Instagram, YouTube, Twitter, and other platforms. Pictures and videos are concentrated on YouTube and Instagram more than on Facebook and Twitter. To work with what is actually being posted on Twitter, the data has to be scraped from Twitter itself. The fields to scrape include date, id, url, tweet content, user, reply count, retweet count, language, source, and like count, among others.

Approach

⦾ Use the “snscrape” library to scrape tweet data from Twitter.

⦾ Create a dataframe with the columns date, id, url, tweet content, user, reply count, retweet count, language, source, and like count.

⦾ Store each scraped collection as a document in MongoDB, together with the hashtag or keyword used for the search, e.g. {“Scraped Word”: “Elon Musk”, “Scraped Date”: “15-02-2023”, “Scraped Data”: [{1000 scraped tweets from the past 100 days}]}

⦾ Create a GUI using Streamlit that lets the user enter the keyword or hashtag to search for, select the date range, and limit the number of tweets to scrape. After scraping, the data is displayed on the page, with buttons to upload it to the database and to download it in CSV and JSON formats.
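The document shape described in the steps above can be sketched in plain Python. This is only a minimal illustration, not the project's actual code: the helper name build_document, the TWEET_FIELDS key names, and the sample tweet are assumptions made for the example, and the snscrape, Streamlit, and MongoDB calls are left out so the snippet runs on its own.

```python
from datetime import date

# Hypothetical key names for the fields listed in the approach;
# the real project may name them differently.
TWEET_FIELDS = [
    "date", "id", "url", "content", "user",
    "reply_count", "retweet_count", "language", "source", "like_count",
]


def build_document(keyword, tweets, scraped_on):
    """Wrap scraped tweet rows in one document, keyed by the search term."""
    # Keep only the expected fields; missing ones become None.
    rows = [{field: tweet.get(field) for field in TWEET_FIELDS} for tweet in tweets]
    return {
        "Scraped Word": keyword,
        "Scraped Date": scraped_on.strftime("%d-%m-%Y"),
        "Scraped Data": rows,
    }


# Made-up sample row standing in for real scraper output.
sample_tweets = [
    {
        "date": "2023-02-14", "id": 1, "url": "https://example.com/status/1",
        "content": "sample tweet", "user": "someone", "reply_count": 0,
        "retweet_count": 2, "language": "en", "source": "web", "like_count": 5,
    },
]

document = build_document("Elon Musk", sample_tweets, date(2023, 2, 15))
```

A dictionary of this shape is what would be handed to the database layer; keeping the keyword and scrape date at the top level makes it easy to find all results for one search later.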

Results

The requirement was to build a solution capable of scraping Twitter data and storing it in a database. The solution should also provide users with the ability to download the data in multiple formats.
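As a rough sketch of the multi-format download, the same rows can be serialized to CSV and JSON text with the standard library. The function names to_csv and to_json and the sample rows are assumptions for illustration, not the project's code; in a Streamlit app, strings like these could be passed as the data for a download button.

```python
import csv
import io
import json


def to_csv(rows):
    """Serialize a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize a list of dicts to pretty-printed JSON text."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


# Made-up rows standing in for scraped tweets.
rows = [
    {"id": 1, "content": "hello, world", "like_count": 3},
    {"id": 2, "content": "second tweet", "like_count": 0},
]

csv_text = to_csv(rows)
json_text = to_json(rows)
```

Producing plain strings keeps the export step independent of the UI, so the same helpers can be tested without starting the app.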

Project Evaluation Metrics

✅ The code must be written in a modular fashion, organized in functional blocks to ensure maintainability even as the codebase grows.

✅ It should also be designed to be portable, working seamlessly on any operating system.

✅ The code must be hosted on GitHub, with a public repository that allows anyone to check the code.

✅ A proper README file must be maintained for each project on GitHub, outlining the basic workflow and how to run the entire project.

✅ The coding standards outlined in PEP 8 must be followed.

✅ Additionally, it is mandatory to create a demo video of the working model and post it on LinkedIn.

Conclusion

It was a great learning experience with Streamlit and snscrape, and it showed me how Streamlit helps to create GUI apps for the web using Python.