Introducing Steemfilter: a new tool to find quality posts and promising authors on Steemit

It’s no secret that new authors struggle to get noticed and build a following on Steemit. At the same time, Steem is being flooded with spam and low-value posts. That’s why I am developing Steemfilter, a website that can help solve this problem. First, Steemfilter can help new Steemit users find more meaningful posts. It could also prove useful for established users who want to support new community members, and for whales looking for interesting new authors worth voting for.

Currently it’s just a proof of concept. Pages load slowly (more on this later), so please be patient. What matters now is whether the project’s central idea holds up: that a set of simple criteria can filter out posts that are likely to be less relevant.

How it works

Steemfilter loads new posts once they are about 15 minutes old and runs them through the following filters:

1. Language detection

Steemfilter uses Google’s language detection algorithm to filter out non-English posts. No offence intended to non-English authors: later I’ll add custom filters, including one for language.

2. Short post detection

It’s rarely possible to convey something of value in a Twitter-like fashion. I believe a good post should be at least 1000 characters long. I know this can filter out some good photo and video posts, so I’ll think about a separate filter that uses early votes and comments to pick out the more interesting short posts containing photos or videos.
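
As a rough illustration of the length rule, here is a minimal Python sketch (Steemfilter itself is written in PHP, and its actual code is not shown in this post). The name `is_long_enough` and the decision to ignore surrounding whitespace are assumptions made for the example; only the 1000-character threshold comes from the description above.

```python
MIN_POST_LENGTH = 1000  # threshold taken from the post; adjust as needed


def is_long_enough(body: str, min_length: int = MIN_POST_LENGTH) -> bool:
    """Return True if the post body has at least `min_length` characters.

    Assumption: leading and trailing whitespace is ignored, and markup
    (Markdown or HTML) counts toward the length like any other text.
    """
    return len(body.strip()) >= min_length
```

Whether markup should count toward the limit is an open design choice; stripping it first would make the filter stricter for image-heavy posts.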

3. Plagiarism check

Currently I’m using a workaround: instead of checking for plagiarism directly, Steemfilter only checks that a post hasn’t been flagged by the @cheetah bot. Not every 15-minute-old post may have received the bot’s attention yet, so some copied posts can slip through. Later I’ll build a plagiarism check into the code directly.
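
The workaround could look roughly like the following Python sketch. It assumes the replies to a post have already been fetched (how that is done through the Steem API is not shown here) and that each reply is a mapping with an `author` key. The function name `flagged_by_bot` and that data shape are hypothetical, chosen for illustration only.

```python
from typing import Iterable, Mapping

PLAGIARISM_BOT = "cheetah"  # account name of the bot mentioned above


def flagged_by_bot(replies: Iterable[Mapping[str, object]],
                   bot: str = PLAGIARISM_BOT) -> bool:
    """Return True if any reply to the post was written by `bot`.

    Assumption: each reply is a mapping with an "author" key; replies
    without that key are simply ignored.
    """
    return any(reply.get("author") == bot for reply in replies)


# A post passes this filter only when the bot has NOT replied to it.
replies = [{"author": "alice"}, {"author": "cheetah"}]
post_passes = not flagged_by_bot(replies)
```

Note that a post the bot has not visited yet also passes, which is exactly the limitation described above.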

4. Posts with no images

The last check so far makes sure the author took the time to add at least one image to her/his post. Again, I know some posts can be cool even without images, so later I’ll tweak this filter in the same manner as the short post filter.
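
One possible way to test for an image, sketched in Python rather than the site’s PHP. The three patterns below (Markdown image syntax, an HTML `<img>` tag, and a bare URL ending in a common image extension) are my assumptions about how images might appear in a post body; the real filter may look for something different.

```python
import re

# Assumed patterns, for illustration only.
IMAGE_PATTERNS = [
    # Markdown image: ![alt text](url)
    re.compile(r"!\[[^\]]*\]\([^)\s]+"),
    # HTML image tag with a src attribute
    re.compile(r"<img\b[^>]*\bsrc\s*=", re.IGNORECASE),
    # Bare URL ending in a common image extension
    re.compile(r"https?://\S+\.(?:png|jpe?g|gif|webp)\b", re.IGNORECASE),
]


def has_image(body: str) -> bool:
    """Return True if the post body appears to contain at least one image."""
    return any(pattern.search(body) for pattern in IMAGE_PATTERNS)
```

A regex check like this is cheap but approximate: it will miss images served from URLs without a file extension and does not verify that a link actually resolves to an image.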

Technical details

The code is written in PHP and uses the official Steem API. The website runs on a WordPress install and can use all the rich functionality of this engine. I can replace the site design easily using ready-made WordPress themes. Once things get more stable, I’ll make a custom theme.

This is my first Steemit-related coding project, and I still have to solve the site load time problem: API requests take quite a lot of time even though the site is hosted on WP Engine, one of the fastest WP hosting platforms out there. Any advice on this would be greatly appreciated.

Future plans

I’m already using Steemfilter on my own to support new authors and find interesting posts to vote for. It works.

The next steps I’m considering are customizable filters, especially for tags and languages. The very first of these would be an option to load fresh #introduceyourself posts.

If the project proves useful, I’m thinking about adding more complex text analysis tools such as readability tests, topic detection, or an English grammar check to filter out posts translated with Google.

How you can help

Try using Steemfilter for a while and share your experiences to help me polish the algorithm as well as the look and feel. If you’re an experienced Steemit coder, I would be happy if you could answer a few questions I can’t solve on my own yet. Just leave a comment here if you’re willing to help and I’ll connect with you on Discord. Resteem this post so that it can reach more people and get more feedback.


Thanks to @voorash for helpful answers.

First published on Steemit
