

This thread is a place to recommend such sources. I am referring to thinkers who made a very large (if indirect) impact, or who should have made such an impact, yet nevertheless disappeared down the memory hole. Topics should relate to machine learning, physics, applied math, or other fields relevant to trading.
One of them was Walter Pitts, a pioneer of neural networks (the film “Good Will Hunting” is said to be loosely reminiscent of his life). Pitts was the real-life janitor who intellectually ‘humiliated’ professors:
Walter Pitts was the real-life MIT janitor with no formal education who, by the age of 13, could intellectually humiliate and dominate the greatest mathematicians, physicists, biologists, chemists, engineers, philosophers, economists, etc. in the world: he could simply walk right into their offices and point out deep errors or technical improvements in their papers or in the work they were doing.
Sources: http://cognet.mit.edu/book/talkingnets
Key quotes: "For three days he remained in the library until he had read each volume [of Principia Mathematica] cover to cover — nearly 2,000 pages in all — and had identified several mistakes. Deciding that Bertrand Russell himself needed to know about these, the boy drafted a letter to Russell detailing the errors. Not only did Russell write back, he was so impressed that he invited Pitts to study with him as a graduate student at Cambridge University in England. Pitts couldn’t oblige him, though — he was only 12 years old."
"During that year at the University of Chicago, Walter had gotten hold of Carnap's new book on logic. This was in 1938. He walks into Carnap's office with his own annotated version of the book, pointing out some flaws. And he gives it to Carnap, talks to him a while, then goes out, but doesn't introduce himself. Carnap spends the next couple of months hunting high and low for that "newsboy who knew logic." In the end, he did find Walter and persuaded the University of Chicago to give him some menial job. Walter had no funds, had separated himself from his family, so that was good."
"In a letter to the philosopher Rudolf Carnap, McCulloch catalogued Pitts’ achievements. “He is the most omniverous of scientists and scholars. He has become an excellent dye chemist, a good mammalogist, he knows the sedges, mushrooms and the birds of New England. He knows neuroanatomy and neurophysiology from their original sources in Greek, Latin, Italian, Spanish, Portuguese, and German for he learns any language he needs as soon as he needs it. Things like electrical circuit theory and the practical soldering in of power, lighting, and radio circuits he does himself. In my long life, I have never seen a man so erudite or so really practical.”"
"Walter said he felt like a “mutant” who was virtually incapable of asking someone to dance at a party—he could lecture, but not converse. This aspect of his character had an unfortunate backlash, for some people could not accept that they could never top him at anything. Once, at supper with the literary critic Edmund Wilson, Pitts told him that he had wrongly interpreted a particular historical point; after 45 minutes of lecturing, Pitts was finally thrown out of the house."
"In 1943, Lettvin brought Pitts into Wiener’s office at the Massachusetts Institute of Technology (MIT). Wiener didn’t introduce himself or make small talk. He simply walked Pitts over to a blackboard where he was working out a mathematical proof. As Wiener worked, Pitts chimed in with questions and suggestions. According to Lettvin, by the time they reached the second blackboard, it was clear that Wiener had found his new righthand man."
"Walter was sick. He was hard to talk to or find. I managed to talk with him quite a bit, and always he kept driving home the idea that what one should really do is to look at continuous approaches to neural networks rather than the discrete approaches. There was far more mathematical machinery available, and it was more natural to try to look at the statistical mechanics of large populations than to look at just small network problems."
From p. 44 of "The Cybernetics Group" by Steve Heims:
"Pitt's talents were of a particular kind. He was primarily selfeducated. He was known to master the contents of a textbook in a field new to him in a few days. When he was only twenty, his detailed, precise, and comprehensive knowledge and understanding of mathematical logic, mathematics, physiology, and physiological psychology were already on par with those of leading practitioners in each of these fields. He was also studying and digesting the thinking of major Western philosophers, appreciating subtle features of their thought, and carrying on "conversations" with them in which McCulloch and friends could serve as third or fourth parties."
Pitts originated the first artificial neural network model, the use of Boolean logic in neural networks, the proof relating such network automata to Turing-machine computation, and the use of statistical methods in neuroscience. Von Neumann repeatedly cited his paper with McCulloch as the founding result of automata theory. He left academia early, however, because of personal issues (see the sources above).
"A Logical Calculus of the Ideas Immanent in Nervous Activity" by McCulloch and Pitts (1943) https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCulloch.and.Pitts.pdf
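As a concrete illustration of the 1943 model: a McCulloch-Pitts unit is a binary threshold neuron with "absolute" inhibition (any active inhibitory input silences it), and single units already realize Boolean gates. A minimal sketch in Python (the function names are mine, not from the paper):

```python
# Sketch of a McCulloch-Pitts unit: binary inputs, a fixed threshold,
# and absolute inhibition, roughly as in McCulloch & Pitts (1943).

def mp_neuron(excitatory, inhibitory, threshold):
    """Fire (1) iff no inhibitory input is active and the number of
    active excitatory inputs reaches the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

# Boolean gates as single units:
AND = lambda a, b: mp_neuron([a, b], [], threshold=2)
OR  = lambda a, b: mp_neuron([a, b], [], threshold=1)
NOT = lambda a:    mp_neuron([1], [a], threshold=1)
```

Since any Boolean function can be built from these gates, loop-free nets of such units compute exactly the Boolean functions; the paper's deeper results concern nets with cycles.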
The Macy Conference transcripts on cybernetics/early neural networks are now fully published (edited by Claus Pias), and in them you can see Pitts intellectually dominating every single conference: leading discussions with the likes of von Neumann, Wiener, L. Savage, and Shannon (see e.g. p. 61), correcting their mistakes, and producing complex analyses on the spot: https://press.uchicago.edu/ucp/books/book/distributed/C/bo23348570.html

There is some very interesting early history on neural networks (Pitts vs. Householder):
From "Walter Pitts and A Logical Calculus" by Schlatter and Aizawa: https://link.springer.com/article/10.1007/s1122900791829
"What is clear, however, is that when [Pitts's] first papers on neural networks came out in 1942 (Pitts 1942a,b, 1943), when he was about 19 years old, Pitts’s work showed greater mathematical sophistication than did Householder’s. Where Householder had used brute force to work only a portion of the way through the general problem of a network’s response to external stimulation using special case after special case, Pitts achieved solutions of a broader scope via simpler means. This is all the more impressive, since Householder had already received his PhD in mathematics prior to his joining Rashevsky’s group. In these achievements, we see part of what constituted Pitts’s often noted brilliance.
In his first paper on the topic, Pitts reconceptualized Householder’s problems and thereby condensed the results of Householder’s first three papers into one elegant proof. Pitts’s first change involved thinking of stimulation as a function of time. That is, Pitts decided to find excitation patterns when stimulation values yi(t) were a function of time t, not merely when stimulation values were fixed constant values. (Here Pitts made the assumption that it takes one unit of time for a stimulation to traverse a fiber.) At first glance, finding the stimulation values for all time values instead of their eventual constant value appears to be a much harder problem. However, Pitts was able to write difference equations for these functions based on the traversal of signals around the loop. Then Pitts could find the steady states yi by taking the limit of yi(t) as t went to infinity. In other words, instead of taking Householder’s approach that a steady state was a solution to a set of equations, Pitts thought of a steady state as the limit values of a set of functions. As an added benefit of this dynamical approach, Pitts gained much more information about the behavior of the network over time."
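The reconceptualization described above can be sketched in code. The two-neuron loop below is a hypothetical linear example of mine, not one of Pitts's actual networks: iterating the difference equations y(t+1) = A·y(t) + s (one time unit per fiber traversal) and taking the limit recovers the same steady state that Householder's approach would obtain by solving y* = A·y* + s directly.

```python
# Hypothetical illustration (not Pitts's actual equations): a linear
# two-neuron loop with dynamics y(t+1) = A @ y(t) + s.
# Householder's view: the steady state solves y* = A y* + s.
# Pitts's view: y* is the limit of y(t) as t -> infinity.

def iterate(A, s, y, steps):
    """Run the difference equations forward for `steps` time units."""
    for _ in range(steps):
        y = [sum(A[i][j] * y[j] for j in range(len(y))) + s[i]
             for i in range(len(y))]
    return y

A = [[0.0, 0.5],   # each neuron excites the other with weight 0.5
     [0.5, 0.0]]
s = [1.0, 1.0]     # constant external stimulation

y_limit = iterate(A, s, [0.0, 0.0], steps=60)
# For this symmetric case the fixed point of y* = A y* + s is y* = (2, 2),
# and the iterates converge to it geometrically.
```

As the quote notes, the dynamical route also yields the whole trajectory y(t), not just its limit.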
Pitts, W. (1942a). Some observations on the simple neuron circuit. Bulletin for Mathematical Biophysics, 4, 121–129. Pitts, W. (1942b). The linear theory of neuron networks: The static problem. Bulletin for Mathematical Biophysics, 4, 169–175. Pitts, W. (1943). The linear theory of neuron networks: The dynamic problem. Bulletin for Mathematical Biophysics, 5, 23–31.
Further on in the Pias volume cited above, there are key conversations among von Neumann, Wiener, Pitts, Shannon, MacKay, L. Savage, and others that attempt to make the formalism robust against biological assumptions:
On p. 500, Pitts corrects Claude Shannon's understanding of Church's theorem:
"Pitts: Certainly both must be considered, but the question as to how you want to calculate the information numerically depends upon the exigencies of the particular situation, of course. I am sure we are all agreed on the necessity for considering both factors. With respect to Church’s theorem which Shannon mentioned earlier, just for the record I should like to say one thing about that. It is not impossible to make a machine that will prove provable theorems; but what Church’s theorem asserts is that it is impossible, given the theorem, to set any upper boundary to the time it may take. It is very easy to show that you can make a machine to print all theorems because you can write out the axioms in a finite list as they are generally constructed, and you can reduce the rules of procedure to be applied to those to a small finite number. Then you can simply classify all the theorems as those which result from one application of the rule, those which result from two applications of the rule, and so forth; that is, the machine can print all theorems in order, starting from the axiom. The only point is, if you are given a theorem, you don’t know how long it will be before that particular theorem shows up. A random process doesn’t help because there, again, although you may be able to be sure with Probability 1 that every theorem may occur sooner or later, still you can place no upper limit to the bounds which it may take for a given theorem, so it doesn’t help you there. But that wasn’t one of your important points.
Shannon: Perhaps I misunderstood the theorem, but I didn’t have that impression of it.
Pitts: Well, you see, in the case which I mentioned – in the sense that the common systems to which Church’s theorems apply can be listed that way – since the theorem is defined as the end result, and since the single steps are each of a mechanical character, of course, all can be obtained."
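Pitts's argument can be sketched as a program. The rewrite system below is invented for the example (one axiom, two rules) and is of course decidable; it only illustrates the enumeration step of his argument: a machine can print all theorems in order of how many rule applications they need, so any derivable string eventually appears, but given a target string you have no bound in advance on when. Church's theorem says that in the relevant systems no such bound is computable.

```python
# Toy illustration of Pitts's enumeration argument. The formal system
# (one axiom "a", two rewrite rules) is invented for this example.

from itertools import count

AXIOMS = {"a"}

def apply_rules(s):
    # Rule 1: append "b".  Rule 2: double the string.
    return {s + "b", s + s}

def theorems_by_depth():
    """Yield (depth, theorems derivable in at most `depth` rule
    applications), level by level."""
    level, seen = set(AXIOMS), set(AXIOMS)
    for depth in count(0):
        yield depth, set(seen)
        level = {t for s in level for t in apply_rules(s)} - seen
        seen |= level

def search(target, max_depth=8):
    """Print-all-theorems machine: return the depth at which `target`
    first appears. The cutoff `max_depth` is only to keep this demo
    finite; Pitts's point is that no such bound is computable in general."""
    for depth, seen in theorems_by_depth():
        if target in seen:
            return depth
        if depth >= max_depth:
            return None
```

For instance, "ab" appears after one rule application ("a" → "ab") and "aab" after two ("a" → "aa" → "aab"), while a non-theorem like "c" never shows up at any depth.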
On p. 3161, von Neumann estimates that 10^10 relays would still be insufficient for expressing cognition (extrapolating from the army ant, which he takes to have about 300 neurons yet whose performance exceeds what 300 relays can compute), and this is discussed by Pitts. (Quote omitted.)
