So my free thought on Shannon :
Data words are generally incoherent to humans.
In humans Redundancy reduction based on probability is with creating sets called *categories* which contain attributes of objects.
But also concepts which are theories of categories can be expressed in phrases not just single words.
Ok. It’s clicking a little.