The frequency list is mostly accurate, but some words are ranked lower than they should be because lemmatization
often yields more than one possible base form for them. When it was not certain which form was correct, those
cases were excluded from the count. I am aware of this; it is not really a bug, just a limitation of the
approach. I am hoping to come up with a better solution in the future :)
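
To illustrate the effect described above, here is a minimal sketch in Python. It is not the actual code used to build the list: the `CANDIDATE_LEMMAS` table, the `count_lemmas` function, and the sample tokens are all hypothetical, and a real lemmatizer would supply the candidate forms. The sketch counts a token only when it has exactly one candidate lemma and skips it otherwise, which is why a lemma with many ambiguous surface forms can end up ranked lower than expected.

```python
from collections import Counter

# Hypothetical lookup table: surface form -> set of candidate lemmas.
# In practice these candidates would come from a lemmatizer.
CANDIDATE_LEMMAS = {
    "saw": {"see", "saw"},  # ambiguous: past tense of "see" or the tool
    "sees": {"see"},
    "seen": {"see"},
    "cats": {"cat"},
    "cat": {"cat"},
}


def count_lemmas(tokens, candidates):
    """Count lemmas, skipping tokens without exactly one candidate lemma."""
    counts = Counter()
    skipped = Counter()
    for token in tokens:
        word = token.lower()
        lemmas = candidates.get(word, set())
        if len(lemmas) == 1:
            # Unambiguous: credit the single candidate lemma.
            counts[next(iter(lemmas))] += 1
        else:
            # Ambiguous (or unknown) form: leave it out of the count.
            skipped[word] += 1
    return counts, skipped


tokens = ["saw", "sees", "seen", "saw", "cats", "cat", "cats"]
counts, skipped = count_lemmas(tokens, CANDIDATE_LEMMAS)
print(counts.most_common())
print(skipped.most_common())
```

In this toy input, both occurrences of "saw" are skipped, so "see" is counted twice instead of four times and falls below "cat". Keeping a `skipped` counter at least makes it possible to see which forms were dropped and how often.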