Philadelphia Reflections

The musings of a physician who has served the community for over six decades


George: 10 Randomly numbered Blog ID numbers into 20 Chapters in Two Steps

The blogs and blog IDs are presently generated by the machine as "New Blogs" and later converted to Topics. In the process, every item of text, whether a blog, a topic name, or a chapter name, acquires a blog ID number. Much of the following discussion is conducted by manipulating blog IDs and then mentally converting them to Blog names, Topic names, and Chapter names, without excessive description. Chapter Headings are composed manually by the author and supplied as a list (see below), to be manually replaced as needed; this process is continued wherever the material justifies it. Alternative pathways are deletions of extra steps, occasionally required by the material. To be entirely comprehensive, it is contemplated that the machine will make the selection when feasible. For the most part, however, the material will not cost-justify a completely automated operation. The first step, conversion of English terms to Topic Headings, is the exception: it is so burdensome that it justifies the programming effort.

The Mechanical Process Alternating with the Manual Process. The problem to be solved is to present the operator with ten blog IDs, listed in the modified table of contents by the TF-IDF process, from which approximately 2-5 blogs are manually chosen for relevance; the remaining blogs are made invisible, erased, or ignored. (In this way, a set of up to ten randomly numbered blog IDs is selected by the machine as relevant to each Chapter heading, and a grouping of 2-5 is then chosen manually, at random times. {Out of these, about two blog numbers are eventually selected manually and printed to accompany one Chapter heading, occasionally more, occasionally fewer.}) This seemingly impossible connection is accomplished by TF-IDF mechanically selecting ten keywords as an intermediary step, followed by manual selection of the 2-5 finalists. At the moment, the intermediary TF-IDF output need not be displayed, but it may have other uses.
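The mechanical half of this process can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the site's actual program: the function names, the sample blog IDs and texts, and the smoothed IDF formula are all hypothetical choices. It scores every blog against one Chapter heading by TF-IDF cosine similarity and hands back up to ten blog IDs for the operator to cut down by hand.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(blogs):
    """blogs: dict of blog ID -> text. Returns (vectors, idf)."""
    tokens = {blog_id: tokenize(text) for blog_id, text in blogs.items()}
    n = len(blogs)
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    # Smoothed IDF (an assumption; other variants would also serve).
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = {}
    for blog_id, words in tokens.items():
        counts = Counter(words)
        vectors[blog_id] = {t: c * idf[t] for t, c in counts.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_candidates(heading, blogs, k=10):
    """Return up to k blog IDs ranked by similarity to a Chapter heading."""
    vectors, idf = tfidf_vectors(blogs)
    counts = Counter(tokenize(heading))
    query = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = [(cosine(query, vec), blog_id) for blog_id, vec in vectors.items()]
    scored = [(s, b) for s, b in scored if s > 0]   # drop unrelated blogs
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [blog_id for _, blog_id in scored[:k]]


if __name__ == "__main__":
    # Hypothetical blog IDs and texts, for illustration only.
    sample = {
        101: "William Penn founded the Quaker colony of Pennsylvania",
        102: "Bretton Woods and the postwar monetary order",
        103: "Marshall and the third branch of government",
    }
    print(top_candidates("William Penn", sample))
```

The operator would then keep 2-5 of the returned IDs and discard the rest; nothing in the sketch attempts that final, manual judgment.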

At this First Step, the Chapter headings are entered manually in English prose, and the blog IDs are produced by the machine. The connecting step between them is produced by TF-IDF; the intermediate connecting number need not be printed out, at least at present. {The blog IDs are later converted to a list of Chapter Headings with related blog IDs, but this last step is invisible, merely printed out as the random blog IDs within Chapter headings.} The "Chapter Heading," or manually produced Topic Grouping (they are the same, because all text is first entered as a Blog), is entered manually; it is reproduced below as a guide, in "Table of Contents" language. In a day or two, we will have constructed two lists, one for each of two sets of outputs, each produced both manually and by machine. The assumption to be tested on two widely different subjects (a collection of Japanese Haiku 14-line poems, and a history of American Constitutional sovereignty arguments) is that the TF-IDF product and the randomly assigned blog IDs are substantially interchangeable, at least for this purpose.

Early "Chapter Headings": Introduction; The English Settlements 1619-1776; William Penn; Quakers Feel Their Oats; The Era of French and Indian War 1763-1776; Redirecting the Revolution Toward Independence: The British Prohibitionary Act of 1775; Subjugating the Mid-Atlantic States 1776-1778; Constitutions: What's So Good about Ours; Why Does Europe's Fail Them?; The Federalist Founders; The National Perspective; Articles of Confederation, Would They Suffice?; Architecture of a National Governance; Afterthought Amendments 1793-97; Marshall and the Third Branch of Government; The Small-Government Rebels; Ratification and Balance; Washington's Two Terms; Chaotic Rebalancing; Pre-Civil War; The Guano Approach; The Lincoln Approach; Reconstruction; The Gilded Age; That Damned Teddy Roosevelt Cowboy; Woodrow Wilson; The First World War; The Greatest Generation; Bretton Woods; The Woodstock Phenomenon; Opening Up Asia; The Third World War; American Dominance; etc.

Second Step, assuming tests of the First Step show substantially identical results between manual and automated:

There is one weak step. Without trying it, it is not possible to judge which of two branches to take; at worst, both are required. The First Step assumes the Chapter heading is the same as the Blog heading. In fact, it is only one-tenth the size, and the blog quantity must be reduced by 90% to accomplish the desired outcome. (There are about 10-20 Chapters and about a thousand Topics, so the Topics are in need of a 10:1 reduction.) In one example (the U.S. Constitution), TF-IDF is perhaps completely unnecessary, because the linkage between blogs and chapters is almost entirely chronological. This issue should be tested.
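For the chronological case, a test could be as simple as the following sketch. It assumes something the text does not state, namely that each blog can be tagged with a year; the function names are hypothetical. Year ranges are read straight out of Chapter headings such as "Subjugating the Mid-Atlantic States 1776-1778", and a blog's year is matched against them with no TF-IDF at all.

```python
import re


def year_range(heading):
    """Extract (start, end) from a heading like '... 1776-1778' or '... 1793-97'.

    Returns None when the heading carries no year range.
    """
    match = re.search(r"(\d{4})-(\d{2,4})\b", heading)
    if not match:
        return None
    start_text, end_text = match.group(1), match.group(2)
    # Expand an abbreviated end year ('97') using the start year's leading digits.
    if len(end_text) < 4:
        end_text = start_text[: 4 - len(end_text)] + end_text
    return int(start_text), int(end_text)


def chapters_for_year(year, headings):
    """Return every heading whose year range contains the given year."""
    matches = []
    for heading in headings:
        span = year_range(heading)
        if span and span[0] <= year <= span[1]:
            matches.append(heading)
    return matches


if __name__ == "__main__":
    headings = [
        "The English Settlements 1619-1776",
        "The Era of French and Indian War 1763-1776",
        "Subjugating the Mid-Atlantic States 1776-1778",
        "Afterthought Amendments 1793-97",
    ]
    # A blog tagged (hypothetically) with the year 1777.
    print(chapters_for_year(1777, headings))
```

Headings without a year range, and overlapping ranges such as the two ending in 1776, would still fall back to manual choice or to TF-IDF.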

However, where the linkage between blog numbers and chapter headings is chronologically impossible (as in Terse Verse), some other method of linkage must be provided. One approach is to envision the process as linking three stages (blog heading, topic heading, and Chapter heading), but only printing the Chapter heading. Where there is only one blog per chapter or the topic chapter heading is blank, the goal is satisfied by producing the Chapter headings from the Blog headings. Where there is more than one blog per chapter, an additional step is required. It should be remembered that the purpose of automation is the reduction of manual input, and sometimes the manual phase does not justify the cost of automation. The provision of twenty Chapter summaries might be the price of adopting this approach, although the combined blog titles might suffice for a re-run of TF-IDF if the Chapter titles are supplied. If we are lucky, the TF-IDF processing might be unnecessary.

The submission of a one-paragraph summary at this point would surely suffice, or else the multiple blog titles from TF-IDF, perhaps in the Description block. The changed goal of this approach is to reduce the eligible contestants to the point where manual completion is feasible, a ratio of at most 3 or 4 to one. As an act of desperation, completely manual Chapter Heading composition is probably feasible.

 
