As you can imagine, Reddit (whose name is a play on the words “read it”) was meant to be a place for discussion, and although it still is for the most part, some communities have become marketplaces. This could be the result of cases where a genuine user was wrongly marked as a scammer. Finally, I wanted to detect each transaction’s direction: that is, whether it was described as buying or selling. Then, the function checks the body of the submission for a price listed by the seller; I did this using a regular expression that looks for numbers following a $ symbol. ● Adds (using a Python dictionary) the author of the comment as the key and a list of lists as the value, where it stores the body of the comment and information on the user who is transacted with. Here is a screenshot of December 2019’s Confirmation Thread, to give you an idea of its format. I first collected all the monthly confirmation thread URLs from the time period I wanted to investigate (72 months). Stay smart. The result I got was quite impressive. It is apparent from the chart that scammers have a higher proportion of accounts that are 1 year old or less, whereas Redditors have much older accounts on average. I just realized who this was. The subreddit r/HardwareSwap has a confirmation thread where users confirm their successful transactions every month. ● Looks at all the parent comments and collects the. Using the chart, I discovered that there have been at least 134,013 transactions on this Subreddit over the past 6 years. That’s almost 650,000 people in just 8 communities (Subreddits)! Model, Version 2 (after removing the predictor variable Karma). The model’s accuracy (shown in the yellow rectangle) was 83%. I will start by collecting data on these groups. For example, in row 0 the user orevilo sold to veigs. Just issuing a possible scam alert for the Tulare, CA 93274 area.
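The price check mentioned above (numbers following a $ symbol) can be sketched roughly as follows. This is a minimal illustration, not the original function: the pattern, the name `extract_prices`, and the sample text are all assumptions made for the example.

```python
import re

# Assumed pattern: a "$", an optional space, then digits with optional
# thousands separators and an optional one- or two-digit cents part.
PRICE_PATTERN = re.compile(r"\$\s?(\d{1,3}(?:,\d{3})+|\d+)(?:\.(\d{1,2}))?")


def extract_prices(body):
    """Return every dollar amount found in a submission body as a float."""
    prices = []
    for match in PRICE_PATTERN.finditer(body):
        whole = match.group(1).replace(",", "")  # "1,200" -> "1200"
        cents = match.group(2) or "0"            # missing cents -> "0"
        prices.append(float(f"{whole}.{cents}"))
    return prices


# Made-up sample text, for illustration only.
sample = "Selling a GTX 1080 for $250 shipped, or $1,200.50 for the whole build"
print(extract_prices(sample))
```

A pattern like this will also pick up amounts that are not asking prices (shipping costs, for instance), so any value derived from it is an estimate at best.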
Redditors, on the other hand, have a much more normal distribution. This is referred to as classification, since it involves assigning each observation to a specific category, which in this case is either Redditor or Scammer. It is also possible that some scammers got caught, but just didn’t care and continued to use their accounts. Please do not try to get me for fraud. Do you want to trade your unused computer part for something you will use? However, there is no metric for determining if this user has any transaction history within this Subreddit, nor is there any way to determine if the user is legitimate. Where TP = true positives and FP = false positives. Each resulting data point looked like the following: This is how the keys in the Python dictionary, as described above, were created: by appending the username (child.group()[2:]) and the body_text, as shown in the above code. A rating system is a vital tool to facilitate decision-making for buyers, to allow them to share their feelings and, most importantly, to assure them of the transaction’s legitimacy. Request a photo with the items in question showing both the user's name and date. I suspect this is the result of cases in which some of the Redditors marked as scammers were in a dispute that the moderators were not able to resolve, so both buyer and seller eventually got marked as scammers. Do you want to trade in your old video card for a new hard drive? During my first run-through of the code, however, I discovered that there was a third category: “traded,” which occurred in instances where two users traded items with each other and exchanged no money.
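The code that the paragraph above refers to is not reproduced here, so the following is only a sketch of how such a dictionary could be built. It assumes that `child` is a regular-expression match on a `u/username` mention, which would explain why `child.group()[2:]` yields the bare username. The function name, the input shape, and the sample comment are hypothetical.

```python
import re
from collections import defaultdict

# Assumed mention format: "u/" followed by typical Reddit username characters.
MENTION_PATTERN = re.compile(r"u/[A-Za-z0-9_-]+")


def record_confirmations(comments):
    """Map each comment author to a list of [username, body_text] entries.

    `comments` is a hypothetical iterable of (author, body_text) pairs.
    """
    transactions = defaultdict(list)
    for author, body_text in comments:
        for child in MENTION_PATTERN.finditer(body_text):
            username = child.group()[2:]  # drop the leading "u/"
            transactions[author].append([username, body_text])
    return dict(transactions)


# Made-up comment text; the two usernames come from the dataframe example.
print(record_confirmations([("orevilo", "Sold a GPU to u/veigs")]))
```

Storing a list of lists per author keeps every confirmation a user posted, so the same author can accumulate many entries across monthly threads.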
The goal of this project is to create a metric to determine the level of risk in transacting with a given Reddit user. When it comes to a real-world application for this project, it would be extremely difficult to determine prior to this procedure whether or not a Reddit user is a scammer. Without Karma (a greater precision level is highlighted in the result below, which shows the improvements): This refers to the proportion of positive identifications that were actually correct. I realize I fucked up and I've been here shaking and a bit of crying. If you need to buy used PC parts online, the r/hardwareswap subreddit is a hotbed of activity. To collect these data, I wrote another function that does the following: ● Takes the list of URLs from the confirm_urls function (written in bright blue text in the above screenshot). This is what I was expecting, as scammers likely wouldn’t put effort into moderating Subreddits using the same account with which they scammed someone. Find a submission from a user with no confirmed trades. Typical submission, but the user has 0 confirmed trades. If everyone involved would PM me to get this resolved. This is one variable that I would like to revisit in the future. However, as Reddit was not initially designed for this purpose, there are many issues with its interface compared to those of traditional marketplaces such as eBay. For the first model, the relevant features that I will look into are as follows: 1. The goal for this project was to see if I could collect enough data to determine how much risk there is in transacting with any given Reddit user. If you're doing an item-to-item trade, you may want to verify their address by requesting an item with their name and address on it (a bill or something similar). In the picture below, you can see the different submission flairs (SELLING, BUYING, and CLOSED). As long as you paid through PayPal (please tell me you didn't use gift), you will be able to dispute the claim.
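Precision, described above as the proportion of positive identifications that were actually correct, is conventionally computed as TP / (TP + FP), where TP is the number of true positives and FP the number of false positives. A minimal sketch follows; the function name and the counts are illustrative and are not results from the model.

```python
def precision(true_positives, false_positives):
    """Share of predicted positives that were actually positive."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        # No positive predictions at all; returning 0.0 is a convention
        # chosen for this sketch, not something the source specifies.
        return 0.0
    return true_positives / predicted_positives


# Illustrative counts only: 8 correctly flagged scammers, 2 wrongly flagged users.
print(precision(8, 2))
```

In this setting a false positive is a legitimate Redditor flagged as a scammer, which is why precision matters for a risk metric like this one.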
HardwareSwap is a community of over 170,000 members who exchange computer-related hardware with one another. In this project, I am interested in predicting a qualitative outcome using the variables I discussed above. The direction indicates whether the user bought or sold the product. To begin, I will identify the two groups being studied in the project. Reddit.com, ranked as the sixth most visited website in the United States, is a collection of forums where people post and comment on links, pictures, and discussion topics. I was surprised to see that there are scammers who have Reddit Gold; it seems odd that some scammers paid for accounts they use for fraud when Reddit is mostly free. For example: Reddit. The user's package was successfully stopped mid-shipment and anyone who has traded with Uhmcopyright recently might want to also … The above chart compares scammers (orange) and Redditors (blue). The street name of the address you sent to MysteriousX would suffice. Now that I have my Reddit API keys, I can start collecting the data I need: details of transactions, users, directions, and estimated transaction prices. The above screenshot shows the first 5 rows of the transactions dataframe. If we multiplied that number by the 134,000 recorded transactions, the total value of products traded on r/HardwareSwap over the last 6 years would be estimated at around $34,500,000. Here’s the real secret: some of the best sites are not the most popular ones. To create a type of user rating, the function outputs previous transactions. Although the resulting value is an approximation, it is an interesting metric. Ask for trading feedback on other forums that you can get them to verify. Ridge Regression argument highlighted in yellow. Now I will use the Reddit API to collect data on the relevant features of both Reddit users and Reddit scammers.
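The total-value estimate above is a single multiplication: an average price per transaction times the number of recorded transactions. The average itself is not stated in this passage, so the sketch below only back-calculates the figure implied by the two rounded numbers that are given. Treat the result as approximate; the variable names are illustrative.

```python
# Both figures are the rounded values quoted in the text.
recorded_transactions = 134_000
estimated_total_value = 34_500_000  # US dollars

# Average price per transaction implied by those two rounded figures.
implied_average_price = estimated_total_value / recorded_transactions
print(round(implied_average_price, 2))
```

Because both inputs are rounded and the per-transaction prices come from text matching, the implied average is a rough consistency check rather than a measured value.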
Also, due to the strange nature of the internet, I'd like you to verify that you are in fact uhmcopyright. Given this problem, it will be quite interesting to know if accurate predictions can be made using machine learning and the information that Reddit allows users to pull from it. You can see that there is a spike in the number of scammers with fewer than 10 comments. Please block out any personal identifying information that is more precise than city/zip code. Reddit Gold signifies that a user has a paid Reddit account (Premium). And, lastly, I added an “ambiguous” category for the transactions in which the direction could not be determined by my function. Reddit Scammers: These are scammers on Reddit who have been involved in various fraudulent activities. These are: 1. Edit: I would also like to point out sellers can potentially be victims to. My primary goal for this project is to compare two groups of Redditors: legitimate users and scammers (and how to identify them). Usually, after scammers are caught, their accounts sit dormant. The probabilities are calculated based on the features email, mod, gold, comments, and age. Reddit in general is not a secure platform, but subreddits each have different rules and this one is fairly reliable for many consumers. To identify each transaction’s direction, I looked for common words in the body of comments (for example, “bought,” “buying,” “sold,” or “traded”), put them into a list, and then compared them with a string of text. Welcome to r/Hardwareswap, a community and marketplace for buying, selling, and trading all sorts of PC Hardware. I believe he will be asked to prove that he sent them. This search yielded a list of the URLs for the past 72 confirmed trade threads, which represent about 6 years of data.
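The direction check described above (comparing a list of common words against the comment text, with an “ambiguous” fallback) might look something like the sketch below. The keyword lists contain only the four words named in the text. The function name, the grouping into categories, and the rule that conflicting keywords count as ambiguous are assumptions made for illustration.

```python
import re

# Keywords taken from the text; grouping them by category is an assumption.
DIRECTION_KEYWORDS = {
    "bought": ("bought", "buying"),
    "sold": ("sold",),
    "traded": ("traded",),
}


def detect_direction(body_text):
    """Classify a confirmation comment as bought, sold, traded, or ambiguous."""
    words = set(re.findall(r"[a-z]+", body_text.lower()))
    matches = [
        direction
        for direction, keywords in DIRECTION_KEYWORDS.items()
        if words.intersection(keywords)
    ]
    # Assumed rule: no keyword, or keywords from more than one category,
    # means the direction cannot be determined.
    return matches[0] if len(matches) == 1 else "ambiguous"


# Made-up comments, for illustration only.
for comment in ("Bought a GPU from u/veigs", "Traded my CPU for RAM", "Confirmed"):
    print(comment, "->", detect_direction(comment))
```

Matching on whole words rather than substrings avoids false hits inside longer words, at the cost of missing variants such as “selling” that are not in the list.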