The solution to problem 10 (find unique words) considers a "word" to be anything that matches the pattern r"[0-9a-zA-Z-']+".
Alas, there are many words in shakespeare.txt that contain characters outside of this pattern. The first few are:
- Personæ
- Phœbus
- dæmon
- Cæsar
- Æneas
These are mis-detected as:
- 'Person'
- ['Ph', 'bus']
- ['d', 'mon']
- ['C', 'sar']
- 'neas'
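
The mis-detection can be reproduced with a short sketch. The helper names ascii_words and unicode_words are hypothetical and not part of the original solution. The second pattern is only one possible fix: it assumes Python 3, where \w in a str pattern matches Unicode letters and digits, and it also admits underscores, which the original pattern does not.

```python
import re

# Pattern quoted from the problem 10 solution: ASCII letters, digits, hyphen, apostrophe.
ASCII_WORD = re.compile(r"[0-9a-zA-Z-']+")

# Assumed alternative (not from the original solution): \w covers Unicode
# letters and digits in Python 3 str patterns. Note that \w also matches "_".
UNICODE_WORD = re.compile(r"[\w'-]+")


def ascii_words(text):
    """Return the pieces the original ASCII-only pattern finds in text."""
    return ASCII_WORD.findall(text)


def unicode_words(text):
    """Return the pieces the Unicode-aware pattern finds in text."""
    return UNICODE_WORD.findall(text)


if __name__ == "__main__":
    for word in ["Personæ", "Phœbus", "dæmon", "Cæsar", "Æneas"]:
        print(word, ascii_words(word), unicode_words(word))
```

With the ASCII-only pattern each ligature acts as a word boundary, so "Phœbus" splits into two pieces and "Æneas" loses its first letter. With the Unicode-aware pattern each sample comes back as a single word.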