Alright, the title might sound a bit sensational, but „how do I get more visitors“ is a perennial question. This article will hopefully show how to work towards that goal and what WDF*IDF has to do with it.
If you move in SEO circles, „WDF*IDF“ should be a household name to you. But not every webmaster or blogger has extensive SEO knowledge. I would even venture that most of you know WDF*IDF only from hearsay, if you have heard of it at all. Therefore, I would first like to provide a basic understanding of the subject and then present a useful tool.
If you know about WDF*IDF already, just skip the next two paragraphs and go straight to the
WDF*IDF is not a song by „Die Fantastischen Vier“
Although WDF*IDF might sound like part of the lyrics to the song „MFG“ by Die Fantastischen Vier, it is, simply put, a formula for describing the quality of a text (by calculating the WDF*IDF for all terms in the text).
Aside from many other factors, such as the domain, the content (quality) of the page and the social signals, the term weight of a document plays a crucial role in achieving a good ranking in the search results. The term weight describes how often a term (a word or word combination) occurs in a document.
Previously, the term weight was based on keyword density. This has since changed, and term weight is now based on the WDF*IDF formula, which originates from the field of information retrieval and has been established there for years. (If you want to know exactly why keyword density is nonsense, keep reading here. Note: the linked content is in German.)
How is WDF*IDF calculated?
I neither want to nor can offer the complete derivation and implementation of the WDF*IDF formula at this point. However, I would like to convey the basic idea behind WDF*IDF. Once you understand the formula to some degree, it becomes easier to work with.
The green part of the image shown above is the WDF, the within-document frequency. To the right, next to the multiplication symbol, you see the IDF, the inverse document frequency. To the left of the equals sign is the result of WDF*IDF, the term weight.
Looking at the WDF part and ignoring the logarithm for now, you will see the „frequency of term i in document j“, divided by the sum of all terms in the document, from which the occurrences of the term itself are subtracted again.
So WDF relates the number of occurrences of a term in a document to all other terms (those that are not the term being evaluated). On top of that comes the logarithm to base 2, which is due to the nature of our language. (I’ll trust Mr Kratz’s statement on this one.)
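The WDF part can be sketched in a few lines of Python. This is only a minimal sketch under an assumption: it uses the form of WDF that is commonly cited in information retrieval, log2(freq + 1) / log2(L), where L is the total number of terms in the document. The formula in the image may differ in detail. The function name `wdf` and the simple whitespace tokenization are illustrative choices, not part of any tool.

```python
import math


def wdf(term: str, document: str) -> float:
    """Within-document frequency of `term` in `document`.

    Assumption: the commonly cited form log2(freq + 1) / log2(L),
    with L = total number of terms in the document.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokens = document.lower().split()
    freq = tokens.count(term.lower())
    length = len(tokens)
    if length < 2:
        # log2(1) == 0 would cause a division by zero.
        raise ValueError("document needs at least two terms")
    return math.log2(freq + 1) / math.log2(length)


# The example document used further below in this article.
print(wdf("job", "I need a job"))  # 0.5
```

For the four-term document „I need a job“, the term „job“ occurs once, so this sketch yields log2(2) / log2(4) = 0.5.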
The IDF is calculated by taking the set union of all results in an information retrieval system (e.g. a database), dividing it by the number of results in which the term i occurs, adding 1 and taking the logarithm to base 10.
Here’s a practical example for better understanding. Our document contains the text: „I need a job“. The term i that we want to evaluate is „job“. The information retrieval system is Google. Now we search Google for the term „job“ and get 8 results. Therefore, N(i)=8.
To calculate the numerator in the fraction of the IDF part, we will search Google for all terms in the document. For „I“, we get 5 results, for „need“, we get 4 and for „a“, we get 6 results. Now we add all search results and remove duplicate entries. This would be the set union.
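The set union from this example can be sketched in Python as well. The document IDs below are purely hypothetical and only serve to illustrate the de-duplication. Only the set sizes (8, 5, 4 and 6) come from the example. The sketch also assumes the commonly cited form IDF = log10(1 + N / N(i)), with N being the size of the set union. The names `results`, `all_results` and `idf` are made up for illustration.

```python
import math

# Hypothetical result sets (document IDs) per search term.
# Only the sizes match the example: 8, 5, 4 and 6 results.
results = {
    "job": set(range(8)),          # IDs 0..7
    "i": {0, 1, 2, 8, 9},
    "need": {1, 2, 3, 10},
    "a": {0, 4, 5, 9, 10, 11},
}

# Add up all search results and remove duplicates: the set union.
all_results = set().union(*results.values())


def idf(term: str) -> float:
    """Inverse document frequency, assuming log10(1 + N / N(i))."""
    n_total = len(all_results)   # size of the set union
    n_term = len(results[term])  # N(i): results containing the term
    return math.log10(1 + n_total / n_term)


print(len(all_results))  # 12 with these hypothetical IDs
print(idf("job"))
```

With these made-up IDs, the 23 individual results shrink to 12 unique ones. With real search results, the overlap and therefore the IDF value would of course be different.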
Enough theory. In practice, WDF*IDF isn’t calculated manually, but with the help of tools. I would like to present one that is also free.
Free WDF*IDF tool
To work with WDF*IDF in practice, you don’t need a complete tool suite for a couple of hundred euros a year right away. A good starting point is, for example, the set of tools provided by wdfidf-tool.com. They offer two free tools at the moment. One is a pure WDF*IDF analysis tool, the other a WDF*IDF text editor.
The analysis tool expects a keyword and, optionally, a comparison URL. For this article, I chose „Cloud Downloader“ (my own tool) as the keyword and my article about the Cloud Downloader as the comparison URL.
As a result, you get the overview tab and the competition tab. The overview tab shows the relevant keywords with the highest weighting in both graphic and tabular form. The table also provides the exact WDF*IDF value and, if applicable, the calculated value of the comparison link.
Looking at the competition tab, you can quickly see which terms your competitors rely on and which keywords they use as proof keywords.
The WDF*IDF text editor takes the reverse approach. You provide the keyword you want to optimize for and then write your text. In between, you can have the editor analyze the text as often as you like. After each analysis, the editor shows you which keywords should be used in the text and how frequently.
In this way, the editor actively supports you in writing WDF*IDF-optimized texts.