Basically, a “word” as seen by the hyphenation algorithm is a sequence of characters with non-zero \lccode that is not immediately preceded or followed by “bad stuff” (e.g. characters with zero \lccode, a \discretionary, an explicit \kern, or vertical mode material). This is the simplified version; see Appendix H of The TeXbook for the exact details. It also implies that words joined by an explicit hyphen are not hyphenated, because TeX automatically inserts an empty \discretionary after each explicit hyphen (strictly speaking, after each character that matches the font's \hyphenchar).
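
To see the effect in practice, here is a minimal plain TeX sketch. It assumes the standard plain format with its default US English patterns, and the sample words are arbitrary choices for illustration. \showhyphens writes the hyphenation points TeX finds to the terminal and log file.

```tex
% Plain TeX; run with `tex` and look at the log output.
% Sample words are illustrative only; default US English patterns assumed.

% A lone word: TeX should report its usual hyphenation points.
\showhyphens{documentation}

% The same word after an explicit hyphen: the automatically inserted
% \discretionary follows "user-", so no further hyphenation points
% should be reported inside either part of the compound.
\showhyphens{user-documentation}

\bye
```

Comparing the two log entries should show break points in the first case and none beyond the explicit hyphen in the second.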

