{"id":253,"date":"2015-02-16T18:28:10","date_gmt":"2015-02-16T23:28:10","guid":{"rendered":"http:\/\/jkthinks.synology.me\/?p=1794"},"modified":"2020-09-04T22:58:35","modified_gmt":"2020-09-04T22:58:35","slug":"rfm-simple-efficient-way-to-focus-on-highly-responded-customers","status":"publish","type":"post","link":"https:\/\/www.jkthinks.com\/?p=253","title":{"rendered":"RFM: Simple &#038; Efficient Way to Focus on Highly-responded Customers"},"content":{"rendered":"<p>When you want to focus on the target segments with a high response rate, RFM is one of the most useful methods. Above all, RFM is intuitive and quick to produce results because it is a heuristic technique rather than a statistical model such as regression. RFM is an acronym for \u201cRecency\u201d, \u201cFrequency\u201d, and \u201cMonetary\u201d, which answer the questions \u201chow recently did the customer purchase?\u201d, \u201chow often do they purchase?\u201d, and \u201chow much do they spend on average?\u201d respectively, according to Wikipedia.<\/p>\n<p>The aim of RFM is simply to identify, from empirical data, the groups with higher response rates and to spend more money or effort on those groups, which leads to higher profitability in sales promotions. The analysis is similar to segmentation methods, but more streamlined. Each RFM variable contributes one digit of a code that differentiates one group from the others. For example, customers who share the code \u201c231\u201d are assumed to have the same purchasing pattern, while they differ from customers with the code \u201c233\u201d.<\/p>\n<p>How are customers coded? This is simple. RFM analysis divides customers into 5 (quintile) or 10 (decile) groups on each variable. The grouping follows the rank order of each variable, meaning that, in terms of frequency, a customer who purchased 10 times ranks ahead of one who purchased fewer than 10 times. Let\u2019s say we follow the decile rule. 
The total number of groups is then 1,000 (=10*10*10). With quintiles, the total number of groups is 125 (=5*5*5).<\/p>\n<p>Now, let\u2019s see how RFM works in R.<\/p>\n<p><a href=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/02\/sample.jpg\"><img loading=\"lazy\" class=\" wp-image-1796 aligncenter\" src=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/02\/sample-1024x742.jpg\" alt=\"sample\" width=\"627\" height=\"454\" \/><\/a><\/p>\n<p>I will use sample data consisting of several variables that represent recency, frequency, and monetary value. After reading and arranging the data, we classify customers by quintile. There are actually three approaches to classifying customers: sequential, independent, and intuitive. I will use the independent approach, which scores each variable separately, with no consideration of the relation between the variables or their relative importance.<\/p>\n<pre class=\"lang:r decode:true\"># set a working directory\r\nsetwd(\"D:\/Rstudio\/R files\/Project\/RFM\")\r\n\r\n# read a data file from the working directory\r\nd1 &lt;- read.csv(\"sample.csv\")\r\nhead(d1)\r\n\r\n# make a data frame with the relevant columns, reordered\r\nd2 &lt;- d1[, c(1, 4, 2, 3, 5, 6)]\r\n\r\n# change the column names\r\nnames(d2) &lt;- c(\"ID\", \"Recency\", \"Frequency\", \"Monetary\", \"Buy\", \"avgExpense\")\r\nhead(d2)\r\ndim(d2)\r\n\r\n# quintile scores for recency: smaller Recency values get higher scores\r\nRq &lt;- quantile(d2$Recency, probs = seq(0, 1, 0.2), na.rm = FALSE, names = TRUE)\r\nd2$R_Score[d2$Recency &gt;= Rq[5]] &lt;- \"1\"\r\nd2$R_Score[d2$Recency &lt; Rq[5] &amp; d2$Recency &gt;= Rq[4]] &lt;- \"2\"\r\nd2$R_Score[d2$Recency &lt; Rq[4] &amp; d2$Recency &gt;= Rq[3]] &lt;- \"3\"\r\nd2$R_Score[d2$Recency &lt; Rq[3] &amp; d2$Recency &gt;= Rq[2]] &lt;- \"4\"\r\nd2$R_Score[d2$Recency &lt; Rq[2]] &lt;- \"5\"\r\n\r\n# quintile scores for frequency: larger Frequency values get higher scores\r\nFq &lt;- quantile(d2$Frequency, probs = seq(0, 1, 0.2), na.rm = FALSE, names = 
TRUE)\r\nd2$F_Score[d2$Frequency &gt;= Fq[5]] &lt;- \"5\"\r\nd2$F_Score[d2$Frequency &lt; Fq[5] &amp; d2$Frequency &gt;= Fq[4]] &lt;- \"4\"\r\nd2$F_Score[d2$Frequency &lt; Fq[4] &amp; d2$Frequency &gt;= Fq[3]] &lt;- \"3\"\r\nd2$F_Score[d2$Frequency &lt; Fq[3] &amp; d2$Frequency &gt;= Fq[2]] &lt;- \"2\"\r\nd2$F_Score[d2$Frequency &lt; Fq[2]] &lt;- \"1\"\r\n\r\n# quintile scores for monetary: larger Monetary values get higher scores\r\nMq &lt;- quantile(d2$Monetary, probs = seq(0, 1, 0.2), na.rm = FALSE, names = TRUE)\r\nd2$M_Score[d2$Monetary &gt;= Mq[5]] &lt;- \"5\"\r\nd2$M_Score[d2$Monetary &lt; Mq[5] &amp; d2$Monetary &gt;= Mq[4]] &lt;- \"4\"\r\nd2$M_Score[d2$Monetary &lt; Mq[4] &amp; d2$Monetary &gt;= Mq[3]] &lt;- \"3\"\r\nd2$M_Score[d2$Monetary &lt; Mq[3] &amp; d2$Monetary &gt;= Mq[2]] &lt;- \"2\"\r\nd2$M_Score[d2$Monetary &lt; Mq[2]] &lt;- \"1\"\r\n\r\n# convert the character scores to numeric\r\nd2$R_Score &lt;- as.numeric(d2$R_Score)\r\nd2$F_Score &lt;- as.numeric(d2$F_Score)\r\nd2$M_Score &lt;- as.numeric(d2$M_Score)\r\n\r\n# combine the three scores into one three-digit code (R = hundreds, F = tens, M = units)\r\nTotal_Score &lt;- 100*d2$R_Score + 10*d2$F_Score + d2$M_Score\r\nd3 &lt;- cbind(d2,Total_Score)\r\nhead(d3)\r\n<\/pre>\n<p>After coding the segments, you need to identify the highly responsive groups that will deliver better profitability. To that end, you need to find a breakeven point, which divides the sample groups into two segments: a target group and a non-target group. The breakeven point is the response rate at which the profit from marketing activities equals the cost of those activities.<\/p>\n<p><strong>Number of target customers * Breakeven point * Profit &#8211; Number of target customers * Marketing cost = 0<\/strong><\/p>\n<p>From the above formula, the number of target customers cancels out, so the breakeven point is obtained by dividing the marketing cost per customer by the profit per responding customer. We then apply this response rate in R to select the target groups whose response rate is higher than the breakeven rate. Before doing so, you may want to check how the response rate varies across customer groups. 
Groups with higher codes show a higher response rate to marketing activities in the bar chart; for example, group 555 is more likely to buy than group 111.<\/p>\n<p><a href=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/02\/Rplot.png\"><img loading=\"lazy\" class=\" wp-image-1795 aligncenter\" src=\"http:\/\/jkthinks.synology.me\/wp-content\/uploads\/2015\/02\/Rplot.png\" alt=\"Rplot\" width=\"415\" height=\"402\" \/><\/a><\/p>\n<pre class=\"lang:r decode:true\"># draw a bar chart to check the relation between group and response rate\r\ny1 &lt;- aggregate(d3$Buy, by=list(d3$Total_Score), FUN=mean, na.rm=TRUE)\r\nhead(y1)\r\nbarplot(y1$x, names.arg=y1$Group.1, ylim=c(1,1.10), col=rainbow(25), ylab=\"Average response rate\", xlab=\"Groups\", xpd=FALSE)\r\n\r\n# the group means plotted above lie between 1 and 1.10; to express them on a 0-1 scale,\r\n# subtract 1 (a vectorized operation, so no sapply() is needed)\r\nhead(y1$x - 1)\r\n\r\n# find the highly responsive groups above the break-even point\r\n# break-even is the cost divided by the profit, let's say 2%\r\n# (that is, 1.02 on the unshifted scale of y1$x)\r\ny2 &lt;- y1$x &gt; 1.02\r\nhead(y2)\r\ny2 &lt;- cbind(y1,y2)\r\nnames(y2) &lt;- c(\"Group\", \"R_Rate\", \"Target\")\r\nhead(y2)<\/pre>\n<p>Now you know who the target customers are, and you can save money on marketing activities such as mailing or distributing coupons.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When you want to focus on the target segments with a high response rate, RFM is one of the most useful methods. Above all, RFM is intuitive and quick to produce results because it is a heuristic technique rather than a statistical model such as regression. 
RFM is an acronym [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[9],"tags":[],"_links":{"self":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/253"}],"collection":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=253"}],"version-history":[{"count":1,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/253\/revisions"}],"predecessor-version":[{"id":276,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=\/wp\/v2\/posts\/253\/revisions\/276"}],"wp:attachment":[{"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=253"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=253"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.jkthinks.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=253"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}