3 Packages to Build a Spell Checker in Python

Posted February 15, 2023
Andrew Treadway
TheAutomatic.net

This post covers three different packages for coding a spell checker in Python: pyspellchecker, TextBlob, and autocorrect.

pyspellchecker

The pyspellchecker package allows you to perform spelling corrections, as well as see candidate spellings for a misspelled word. To install the package, you can use pip:

pip install pyspellchecker

Once installed, pyspellchecker is straightforward to use. Note that although we use “pyspellchecker” when installing via pip, we type “spellchecker” in the package import statement. The first step is to create a SpellChecker object, which we’ll call “spell”.

from spellchecker import SpellChecker
 
spell = SpellChecker()

Now, we’re ready to test this out with a few misspellings. We’ll use a few words from this list of commonly misspelled words.

To attempt a correction, you can use the correction method:

spell.correction("adress") # address
spell.correction("becuase") # because

pyspellchecker also has a method to split the words in a sentence.

spell.split_words("this sentnce has misspelled werds")
 
#['this', 'sentnce', 'has', 'misspelled', 'werds']

Once we have a list of the words in the sentence, we can just loop over each word (via a list comprehension) using our SpellChecker object.

words = spell.split_words("this sentnce has misspelled werds")
 
[spell.correction(word) for word in words]
 
#['this', 'sentence', 'has', 'misspelled', 'words']
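One thing to watch for: in recent pyspellchecker releases, correction can return None when no candidate is found, so the list above may contain None values depending on the installed version. A minimal sketch of a fallback that keeps the original word in that case is below. The helper correct_words and the stand-in toy_correct are hypothetical names used for illustration so the example runs without the package; with pyspellchecker installed you would pass spell.correction instead.

```python
from typing import Callable, Iterable, List, Optional


def correct_words(words: Iterable[str],
                  correct: Callable[[str], Optional[str]]) -> List[str]:
    """Apply `correct` to each word, keeping the original word on None."""
    fixed = []
    for word in words:
        suggestion = correct(word)
        fixed.append(suggestion if suggestion is not None else word)
    return fixed


# Stand-in corrector (an assumption for illustration, not pyspellchecker):
# known words map to themselves, two misspellings map to fixes,
# and anything else yields None.
KNOWN_WORDS = {"this", "has", "misspelled"}
TOY_FIXES = {"sentnce": "sentence", "werds": "words"}


def toy_correct(word: str) -> Optional[str]:
    if word in KNOWN_WORDS:
        return word
    return TOY_FIXES.get(word)


result = correct_words(
    ["this", "sentnce", "has", "misspelled", "werds", "xqzv"], toy_correct
)
print(result)
```

The last word, "xqzv", has no suggestion in the stand-in corrector, so it is passed through unchanged rather than replaced with None.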

If you just want to flag which words in a sentence are misspelled, you can use the unknown method. This method returns a Python set of the potentially misspelled words.

spell.unknown(["dilema", "column", "aquire"])
 
#{'aquire', 'dilema'}

We can also see the candidate spellings for a misspelled word using the candidates method, which returns a set of possible corrections.
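To make the idea of candidate spellings concrete, here is a minimal sketch of one common way to generate them: produce every string one edit away from the input (a deletion, transposition, replacement, or insertion) and keep those found in a vocabulary. This is a simplified illustration, not pyspellchecker's actual implementation; the function names and the tiny TOY_VOCABULARY are assumptions made up for the example.

```python
import string


def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def toy_candidates(word, vocabulary):
    """Known words are their own candidate; otherwise keep the known edits."""
    if word in vocabulary:
        return {word}
    return edits1(word) & vocabulary


# Hypothetical vocabulary, chosen only for illustration.
TOY_VOCABULARY = {"acquire", "aquifer", "require", "squire", "address"}

found = toy_candidates("aquire", TOY_VOCABULARY)
print(found)
```

With this toy vocabulary, "acquire" (insert a "c") and "squire" (replace "a" with "s") are one edit away, while "require" and "aquifer" would need two edits and are left out.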

TextBlob

The powerful TextBlob can also do spelling corrections. To install TextBlob we can use pip (note all lowercase):

pip install textblob

To use TextBlob’s spellchecking functionality, we just need to import the Word class. Then we can input a word and check its spelling using the spellcheck method, like below.

from textblob import Word
 
word = Word('percieve')
 
word.spellcheck()
 
# [('perceive', 1.0)]

As can be seen above, TextBlob returns two pieces – a recommended correction for this word, and a confidence score associated with the correction. In this case, we just get one word back with a confidence of 1.0, or 100%.

Let’s try another word that returns multiple possibilities. If we input the string “personell”, we get a list of possible corrections with confidence scores because this string is fairly similar in spelling to a few different words.

word = Word('personell')
word.spellcheck()
 
#[('personal', 0.65),
# ('personally', 0.2642857142857143),
# ('peroneal', 0.06428571428571428),
# ('personnel', 0.014285714285714285),
# ('personen', 0.007142857142857143)]
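The scores above sum to 1, which is consistent with each score being a candidate's share of some word-frequency total. The sketch below works through that arithmetic under this assumption. The counts in ASSUMED_COUNTS are hypothetical, back-computed so that the ratios match the output above; the post does not give the real corpus counts, and this is not presented as TextBlob's actual code.

```python
def relative_scores(counts):
    """Rank words by count and express each count as a share of the total."""
    total = sum(counts.values()) or 1  # avoid dividing by zero on empty input
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(word, count / total) for word, count in ranked]


# Hypothetical counts (assumption): chosen so the shares reproduce the
# scores shown for "personell"; they total 140.
ASSUMED_COUNTS = {
    "personal": 91,
    "personally": 37,
    "peroneal": 9,
    "personnel": 2,
    "personen": 1,
}

scores = relative_scores(ASSUMED_COUNTS)
for word, score in scores:
    print(word, score)
```

Read this way, a low score for “personnel” says only that the word is rare in the reference counts, not that it is a poor match for the misspelling.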

According to its documentation, TextBlob’s spelling correction feature is about 70% accurate.

autocorrect

The last package we’ll examine is called autocorrect. Again, we can install this package with pip:

pip install autocorrect

Once installed, we’ll import the Speller class from autocorrect. Then we’ll create an object that uses the English language (lang = ‘en’). We’ll use this object to do spelling corrections.

from autocorrect import Speller
 
check = Speller(lang='en')

Next, we can input a sentence to our object, and it will attempt to correct any misspellings.

check("does this sentece have misspelled wordz?")
 
# 'does this sentence have misspelled words?'

A few caveats

It’s important to keep in mind that no programmatic spell checker is perfect. Python has several pre-made options available, as described above, and you could also build your own using fuzzy matching. Words taken out of context also make it harder to determine the correct spelling when the misspelled string is similar to multiple words. For example, take the string “liberry”. This is a known misspelling of “library”. However, it is also just one letter off from “liberty”.
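As a rough idea of what a fuzzy-matching approach could look like, here is a minimal sketch using difflib from Python's standard library. The vocabulary and the helper name fuzzy_candidates are assumptions for illustration; a real checker would use a much larger word list and likely a tuned cutoff.

```python
import difflib

# Hypothetical vocabulary, kept tiny for illustration.
TOY_VOCABULARY = ["library", "liberty", "address", "because"]


def fuzzy_candidates(word, vocabulary=TOY_VOCABULARY, n=2, cutoff=0.6):
    """Return up to `n` vocabulary words whose similarity to `word` >= cutoff."""
    return difflib.get_close_matches(word, vocabulary, n=n, cutoff=cutoff)


matches = fuzzy_candidates("liberry")
print(matches)
```

In this toy vocabulary both “library” and “liberty” clear the cutoff for “liberry”, which shows the ambiguity: string similarity alone cannot tell which word was intended.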

If we use one of the packages above, we get the word “liberty” returned, which is not illogical, as the string is very close in spelling, but context could help reveal which word makes the most sense. For building a contextual spell checker in Python, you might want to check out recurrent neural networks or Markov models.

spell.correction("liberry") # liberty
 
word = Word("liberry")
word.spellcheck() # top suggestion: liberty
 
check("liberry") # liberty
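To illustrate how context could break the tie, here is a minimal sketch of a bigram lookup, about the simplest Markov-style model: among the candidate corrections, pick the one that most often follows the previous word. The counts in TOY_BIGRAMS are invented for illustration, and pick_by_context is a hypothetical helper, not part of any of the packages above.

```python
from collections import Counter

# Hypothetical bigram counts (assumption): (previous word, word) -> count.
TOY_BIGRAMS = Counter({
    ("public", "library"): 12,
    ("public", "liberty"): 1,
    ("religious", "liberty"): 9,
    ("religious", "library"): 1,
})


def pick_by_context(previous_word, candidates, bigrams=TOY_BIGRAMS):
    """Return the candidate seen most often after `previous_word`.

    Unseen pairs count as 0; on a tie the earlier candidate wins.
    """
    return max(candidates, key=lambda word: bigrams[(previous_word, word)])


choices = ["library", "liberty"]
after_public = pick_by_context("public", choices)
after_religious = pick_by_context("religious", choices)
print(after_public, after_religious)
```

With these made-up counts, “public liberry” resolves to “library” and “religious liberry” resolves to “liberty”. A practical model would estimate the counts from a large corpus and smooth them to handle unseen word pairs.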

Originally posted on TheAutomatic.net Blog.

