3/7/16

This is a slightly modified web version of a presentation I gave at work in 2011 on using the keys functionality in SAS, as well as several different ways to search variables that are strings (i.e., characters). I include all of the slides below, notes for some of them, and a link to the .pdf file of the slides here.

This key submits all of the code, or, if you highlight some code first, only the highlighted code.

Distance methods are ways of measuring how bad your mistake is. If you have Justin, and type in Dustin or Justine, it is not that bad. But if you have Justin and type in Mary, that is much worse.
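As a rough illustration of the idea, here is a minimal sketch using the COMPLEV function (Levenshtein edit distance), which counts the insertions, deletions, and replacements needed to turn one string into the other. The variable names are made up for this example.

```sas
/* Sketch: compare the typed name against a few candidates.        */
/* COMPLEV counts single-character insertions, deletions, and      */
/* replacements, so a smaller value means a smaller mistake.       */
data _null_;
    d_dustin  = complev('Justin', 'Dustin');   /* one replacement: J -> D */
    d_justine = complev('Justin', 'Justine');  /* one insertion: trailing e */
    d_mary    = complev('Justin', 'Mary');     /* no letters in common, so much larger */
    put d_dustin= d_justine= d_mary=;
run;
```

The values are written to the log, where the first two should come out small and the third noticeably larger.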

This is good for anticipating possible typos and variations.

The general algorithm for this (Soundex) is from ~1920! Basically, you retain the first letter, drop the vowels, and assign numbers to the remaining consonants in a certain way. You can also use the sounds-like operator, =*, in a WHERE expression. For example, Jobtosearch=*jobfromlist.
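Here is a small sketch of both approaches, assuming a made-up data set called names. The SOUNDEX function returns the code for each value, and the =* operator does the sounds-like comparison directly in a WHERE clause.

```sas
/* Hypothetical list of names to search */
data names;
    input name $;
    code = soundex(name);   /* sounds-like code for each name */
    datalines;
Justin
Justine
Dustin
Mary
;
run;

/* Look at the codes: names that sound alike should share a code */
proc print data=names noobs;
run;

/* The sounds-like operator =* is valid in WHERE expressions */
proc sql;
    select name
    from names
    where name =* 'Justin';
quit;
```

Because the first letter is kept, Justin and Justine should share a code while Dustin should not, so sounds-like matching will not catch a typo in the first letter.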

Generalized edit distance assigns a cost to each kind of edit. For example:

- Double a letter, cost 50
- Add a letter to the end, cost 35
- Insert a letter at the beginning, cost 200
- Etc.

COMPGED is more general than COMPLEV. It is not symmetric, so the order of the arguments matters (SAS 9). Also, with the CALL COMPCOST routine, you can set your own costs for the operations.
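A minimal sketch of both points follows. The first step compares the two argument orders under the default costs, and the second step sets custom costs before calling COMPGED. The cost values simply mirror the list above and are for illustration only, not recommendations.

```sas
/* Default costs: swapping the arguments can change the result,   */
/* because going from Justin to Justine appends a letter while    */
/* going from Justine to Justin truncates one.                    */
data _null_;
    d_ab = compged('Justin', 'Justine');
    d_ba = compged('Justine', 'Justin');
    put d_ab= d_ba=;
run;

/* Illustrative custom costs, set before COMPGED is called */
data _null_;
    call compcost('double=', 50, 'append=', 35, 'finsert=', 200);
    d_custom = compged('Justin', 'Justine');
    put d_custom=;
run;
```

The custom costs are set in their own DATA step here so that they cannot affect the default-cost comparison in the first step.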

I hope you found the presentation informative.
