"He has...the habit of speaking in complete paragraphs, as though he's lecturing a psychology class instead of having a conversation." This is from a Mother Jones article, "What If Everything You Knew About Disciplining Kids Was Wrong?" I don't want to comment on the content of that article, though, only on what this small observation can tell us about discourse and dialog.
The discourse structure of text paragraphs (written or spoken) is nothing like that of natural dialog. In my view, this stems from the utterance situation of the author, who needs to plan for meaning discovery with a reader (of some targeted ideal type), knowing that the shared context with the reader will not be constrained by any particular physical situation, only by the expressions on the page or screen. In this departure from the evolutionary context of language as turn-taking interaction, an author needs to plan for a "half-silent dialog" in which they trust the reader to take the cognitive steps of construing what each sentence adds to (this reader's version of) the shared information state (SIS) constructed from the text. (The author also scaffolds the text with clues that tell the reader they are on the right track.) The text is successful in its speech acts if the reader's SIS is substantially similar to the author's targeted SIS.
If that account could be made mathematically precise, we could train computers to become competent readers and maybe even writers. It clearly needs a large component of pragmatics (see Korta and Perry's Critical Pragmatics, or even their SEP article "Pragmatics") to get at the author's meaning intentions (a Gricean term). Reaching certain semantic levels of content is just a step along the cognitive path to constructing an "intended" SIS. I do think a precise account of semantics and content is needed, and I think that will involve "construal" of word senses and their contribution to a dynamically evolving SIS.
That in turn requires constructing "enriched content" that incorporates additional information from lexical knowledge, and being able to use that enriched content to resolve lexical and constructional ambiguities. Once enriched content has been construed from among multiple alternatives, it is then possible to focus (in consciousness-accessible working memory) on a more "austere" content that corresponds more directly to the tracked phonological forms of the utterance. This austere content is closer to what philosophers have studied as "what is said".