{"id":252,"date":"2015-01-21T01:50:04","date_gmt":"2015-01-21T06:50:04","guid":{"rendered":"http:\/\/jkthinks.synology.me\/?p=1765"},"modified":"2020-09-04T22:58:35","modified_gmt":"2020-09-04T22:58:35","slug":"how-to-collect-tweets-for-analysis","status":"publish","type":"post","link":"https:\/\/www.jkthinks.com\/?p=252","title":{"rendered":"How to Collect Tweets for Analysis"},"content":{"rendered":"<p>To analyze how a service or product is received in a market, many people rely on traditional methods such as market surveys and focus group interviews (FGI). However, these methods cost money and are limited by the space and time needed to design the research, from laying out the questionnaire to recruiting survey respondents. There is a simpler way to conduct this kind of analysis by using Python and R.<\/p>\n<p>This analysis consists of two parts: gathering the required data, and analyzing sentiment and preferences based on that data.<\/p>\n<p><strong>1. Overview<\/strong><\/p>\n<p>Before starting, I need to clarify what I want to know and how to get the right information to make a decision. The primary objective of this work is to quantify how positively or negatively customers think about the product or service. To that end, we need to obtain data from social media at no cost and free of the bias that a survey coordinator or other participants can introduce.<\/p>\n<p>Twitter is preferable to Facebook here because its data can be gathered more easily with Python libraries. In particular, Tweepy lets analysts gather relevant tweets through Twitter\u2019s open API.<\/p>\n<p><strong>2. Creating an app<\/strong><\/p>\n<p>First and foremost, you have to create an app on Twitter\u2019s developer site in order to gather tweets. 
Log into Twitter (dev.twitter.com) and click \u201cmanage your apps\u201d under \u201cTools\u201d at the bottom of the page.<\/p>\n<p>Click \u201cCreate New App\u201d and fill in the blanks as shown in the following example. You can enter a non-working address in the website field unless you need to connect the app to your public website or blog.<\/p>\n<p><a href=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/01\/Twitter.png\"><img loading=\"lazy\" class=\" wp-image-1766 aligncenter\" src=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/01\/Twitter-1024x588.png\" alt=\"Twitter\" width=\"648\" height=\"372\" \/><\/a><\/p>\n<p>After you click the app you created and go to the \u201cKeys and Access Tokens\u201d tab (or the page reloads to it automatically), you will see the \u201cCreate my access token\u201d button under \u201cToken actions\u201d at the bottom of the page. Click the button; you may see a warning message while you wait for the authorization to complete.<\/p>\n<p>Browse the details of the app by clicking through its tabs.<\/p>\n<p><strong>3. Tweepy streaming<\/strong><\/p>\n<p>Install Tweepy by typing <code>pip install tweepy<\/code>. Then go to the <a href=\"https:\/\/github.com\/tweepy\/tweepy\">Tweepy GitHub site<\/a>, where you will find \u201cstreaming.py\u201d in the examples folder.<\/p>\n<p>You do not need to know much Python, since you can use most of the code in the file as it is. Copy the code into a new file and name it as you like. The important step is to put the keys and access tokens from your Twitter app into the following four blanks in streaming.py.<\/p>\n<pre class=\"lang:python decode:true\">consumer_key=\"UF12XXXXXXXX\"\r\nconsumer_secret=\"    \"\r\naccess_token=\"     \"\r\naccess_token_secret=\"    \"<\/pre>\n<p><strong>4. Filtering keywords<\/strong><\/p>\n<p>The final line of the code sets the keyword used to filter the data. 
The default keyword is \u201cbasketball\u201d, and you can change it according to the objective of your analysis. If you want to track several keywords at the same time, for example to compare rivals or competitors, replace the original line with the following lines.<\/p>\n<pre class=\"lang:python decode:true\"># Define the keywords before passing them to filter()\r\nkeywords = [\"Samsung\", \"Apple\"]\r\nstream = Stream(auth, l)\r\nstream.filter(track=keywords)<\/pre>\n<p>If you run the Python file, it starts gathering tweets. This is really awesome. However, you may want to organize the data in a more structured way by using JSON (<a href=\"http:\/\/en.wikipedia.org\/wiki\/JSON\">http:\/\/en.wikipedia.org\/wiki\/JSON<\/a>). You can look into ways to save the gathered data as JSON in a text file. I may cover that later.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>To analyze how a service or product is received in a market, many people rely on traditional methods such as market surveys and focus group interviews (FGI). However, these methods cost money and are limited by the space and time needed to design the research, from laying out the questionnaire to recruiting survey respondents. 
There is a simpler [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/252"}],"collection":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=252"}],"version-history":[{"count":1,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/252\/revisions"}],"predecessor-version":[{"id":278,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/252\/revisions\/278"}],"wp:attachment":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=252"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=252"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=252"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}