### Apple’s first AI paper focuses on creating ‘superrealistic’ image recognition

##### December 28, 2016

Apple’s first paper on artificial intelligence, published Dec. 22 on *arXiv* (open access), describes a method for improving the ability of a deep neural network to recognize images.

To train neural networks to recognize images, AI researchers have typically labeled (identified or described) each image in a dataset. For example, last year, Georgia Institute of Technology researchers developed a deep-learning method to recognize images taken at regular intervals on a person’s wearable smartphone camera.

The idea was to demonstrate that deep-learning can “understand” human behavior and the habits of a specific person, and based on that, the AI system could offer suggestions to the user.

The problem with that method is the huge amount of time required to manually label the images (40,000 in this case). So AI researchers have turned to using synthetic images (such as from a video) that are pre-labeled (in captions, for example).

**Creating superrealistic image recognition**

But that, in turn, also has limitations. “Synthetic data is often not realistic enough, leading the network to learn details only present in synthetic images and fail to generalize well on real images,” the authors explain.

So instead, the researchers have developed a new approach called “Simulated+Unsupervised (S+U) learning.”

The idea is to still use pre-labeled synthetic images (like the “Synthetic” image in the above illustration), but refine their realism by matching synthetic images to unlabeled real images (in this case, eyes) — thus creating a “superrealistic” image (the “Refined” image above), allowing for more accurate, faster image recognition, while preserving the labeling.

To do that, the researchers used a relatively new method (created in 2014) called Generative Adversarial Networks (GANs), which uses two neural networks that sort of compete with each other to create a series of superrealistic images.*
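The adversarial game between the two networks can be shown in miniature. The toy model below is illustrative only (a one-coefficient generator and a logistic discriminator on 1-D data, nothing from Apple’s actual system); it demonstrates the alternating updates through which the two networks compete:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# "Real" data: samples from N(4, 0.5), the distribution to imitate.
def sample_real(n):
    return rng.normal(4.0, 0.5, n)

# Generator: g(z) = a*z + b maps noise z ~ N(0, 1) to fake samples.
a, b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(w*x + c) estimates P(x is real).
w, c = 0.1, 0.0

lr = 0.01
for step in range(2000):
    z = rng.normal(0.0, 1.0, 32)
    x_real = sample_real(32)
    x_fake = a * z + b

    # Discriminator step: raise D on real samples, lower it on fakes
    # (gradient ascent on log D(real) + log(1 - D(fake))).
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w += lr * (np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator step: nudge (a, b) so the fakes fool the discriminator
    # (gradient ascent on log D(g(z))).
    d_fake = sigmoid(w * (a * z + b) + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

# After training, the generator's output distribution has drifted toward
# the real data (its offset b has moved up from 0 toward the real mean).
fake_mean = float(np.mean(a * rng.normal(0.0, 1.0, 10000) + b))
```

The "competition" is visible in the two update steps: each player ascends its own objective, and the discriminator's objective is exactly what the generator tries to defeat.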

**A visual Turing test**

“To quantitatively evaluate the visual quality of the refined images, we designed a simple user study where subjects were asked to classify images as real or refined synthetic. Each subject was shown a random selection of 50 real images and 50 refined images in a random order, and was asked to label the images as either real or refined. The subjects found it very hard to tell the difference between the real images and the refined images.” — Ashish Shrivastava et al./*arXiv*

So will Siri develop the ability to identify that person whose name you forgot and whisper it to you in your AirPods, or automatically bring up that person’s Facebook page and latest tweet? Or is that getting too creepy?

\* *Simulated+Unsupervised (S+U) learning is “an interesting variant of adversarial gradient-based methods,” Jürgen Schmidhuber, Scientific Director of IDSIA (Swiss AI Lab), told *KurzweilAI*.*

*“An image synthesizer’s output is piped into a refiner net whose output is classified by an adversarial net trained to distinguish real from fake images. The refiner net tries to convince the classifier that its output is real, while being punished for deviating too much from the synthesizer output. Very nice and rather convincing applications!”*
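The refiner objective described here can be written down as a toy loss function: an adversarial term that rewards fooling the real/fake classifier, plus a self-regularization term that penalizes drifting from the labeled synthetic input. This is a hedged sketch; the weight `lam` and the array inputs are illustrative, not values from the paper.

```python
import numpy as np

def refiner_loss(refined, synthetic, disc_prob_real, lam=0.5):
    """Toy version of a SimGAN-style refiner objective.

    refined        -- refiner output (array of pixel values)
    synthetic      -- the labeled synthetic input it was refined from
    disc_prob_real -- discriminator's probability that `refined` is real
    lam            -- illustrative weight on the self-regularization term
    """
    # Adversarial term: small when the discriminator is fooled (prob -> 1).
    adversarial = -np.mean(np.log(disc_prob_real + 1e-8))
    # Self-regularization: punish deviating from the synthesizer output,
    # so the original labels remain valid for the refined image.
    self_reg = lam * np.mean(np.abs(refined - synthetic))
    return adversarial + self_reg
```

The second term is what "preserves the labeling" mentioned earlier: a refined image that fools the classifier but no longer matches its synthetic label is penalized.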

*Schmidhuber also briefly explained his 1991 paper [1] that introduced gradient-based adversarial networks for unsupervised learning “when computers were about 100,000 times slower than today. The method was called Predictability Minimization (PM).”*

*“An encoder network receives real vector-valued data samples (such as images) and produces codes thereof across so-called code units. A predictor network is trained to predict each code component from the remaining components. The encoder, however, is trained to maximize the same objective function minimized by the predictor.*

*“That is, predictor and encoder fight each other, to motivate the encoder to achieve a ‘holy grail’ of unsupervised learning, namely, a factorial code of the data, where the code components are statistically independent, which makes subsequent classification easy. One can attach an optional autoencoder to the code to reconstruct data from its code. After perfect training, one can randomly activate the code units in proportion to their mean values, to read out patterns distributed like the original training patterns, assuming the code has become factorial indeed.*

*“PM and Generative Adversarial Networks (GANs) may be viewed as symmetric approaches. PM is directly trying to map data to its factorial code, from which patterns can be sampled that are distributed like the original data. While GANs start with a random (usually factorial) distribution of codes, and directly learn to decode the codes into ‘good’ patterns. Both PM and GANs employ gradient-based adversarial nets that play a minimax game to achieve their goals.”*
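The predictor/encoder tug-of-war in PM can be illustrated through its shared objective. This is a minimal sketch with no actual training loop; the data and the "predict each code unit from the others" setup are illustrative only.

```python
import numpy as np

def pm_objective(codes, predictions):
    """Mean squared error of a predictor guessing each code unit from
    the other units. In PM the predictor MINIMIZES this quantity while
    the encoder is trained to MAXIMIZE the very same quantity, pushing
    the code units toward statistical independence (a factorial code)."""
    return float(np.mean((codes - predictions) ** 2))

rng = np.random.default_rng(1)

# A redundant code: unit 2 simply copies unit 1. A predictor that reads
# the *other* unit predicts perfectly, so the objective is zero -- the
# worst case for the encoder.
u = rng.normal(size=1000)
correlated = np.stack([u, u], axis=1)
err_correlated = pm_objective(correlated, correlated[:, ::-1])

# An independent (factorial) code: the other unit carries no information,
# so the same predictor does badly -- exactly what the encoder wants.
independent = rng.normal(size=(1000, 2))
err_independent = pm_objective(independent, independent[:, ::-1])
```

The contrast between the two errors is the whole game: an encoder that maximizes the predictor's error is driven toward codes whose units tell you nothing about each other.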

[1] J. Schmidhuber. Learning factorial codes by predictability minimization. Technical Report CU-CS-565-91, Dept. of Comp. Sci., University of Colorado at Boulder, December 1991. Later also published in *Neural Computation*, 4(6):863–879, 1992. More: http://people.idsia.ch/~juergen/ica.html

## Comments (28)

January 14, 2017 by kewac01

A major problem with machine translation of languages, with making short synopses of longer articles, and even with the theory of grammar and the understanding of word concepts, is that we do not understand how meaning is learned, not only visually but for each sense individually. The theories are crude, blocked by our not understanding how we understand.

A theory of grammar has to be based on understanding to be comprehensive. Otherwise you get a crude shell, which poorly expresses what is happening to produce composite understanding.

I have discussed visual objects with some of their transformations, and motion, in a preliminary manner. Words, which are at the heart of language, begin with visual objects. Mama and papa are visual concepts from which a child expands to other concepts, which in the simplest sense are forms of nouns. Some words are verbal sounds for some types of objects, as motions, as typed and generalized, are forms of transitive verbs. As we add time to the mix we get tenses. How we can teach a machine to understand intransitive verbs (essentially forms of the verb to be) I am unsure, since it does not have consciousness and so does not have any experience of being as humans do.

The present systems of machine language translation involve using machine learning on huge reams of data to statistically determine the positions of words relative to one another without real understanding of meaning. So the results are crude.

Our understanding of how the ear hears and determines what is being said is hampered by being crude, so that we cannot recognize different accents by machine, and we have great difficulty generalizing words since we can only poorly localize and clearly distinguish individual voices. We have difficulty nullifying for fast speech as against slow speech. As a consequence machine understanding is much poorer than human understanding. Given that these days machine calculation is so dazzling with its incredible speed, if we could understand more completely what is going on in our subconscious then we would at least begin to approach Kurzweil’s singularity.

January 16, 2017 by AlphaAlcott

How can we expect any AI network to completely flourish within its developing stages?

January 2, 2017 by kewac01

Freud died in London in Sept. 1939. Although there were some attempts at calculation by steam, and IBM was creating electromechanical calculators, the electronic vacuum-tube computer was yet to be invented in the 1940s. The first machines were the size of football fields and probably could do less than a good handheld machine today. He, I believe, was the first to suggest that there was a subconscious, albeit of the emotions and sexuality. His theories, including ego, superego and id, do not touch on the way we recognize objects, with their very extensive transformations. I do not know who suggested that there might be such a thing, but there have been numerous attempts to develop programs that enable machines to control and recognize objects, going back to the 1950s and ’60s.

In the case of objects and pattern recognition (which I have considered to be only for 2D objects), they never hit on the math which is needed to understand this stuff. The reason why it is so hard to comprehend is that virtually all of the learning was “learned” by evolution, over many eons of time, and is hard-wired into our brains and, to a lesser extent, into the brains of many higher animals. My explanations are a theoretical attempt to explain how we accomplish some of these feats. You open your eyes and you see the effects, the end results, with no clues as to how it happened. I am attempting to explain it, with the end goal of creating humanoid robots that will do ALL work that humans now do, and probably much more efficiently and faster. It will create terrible problems and could easily result in the demise of all life on earth. For it to become the potential panacea that it can be, humans MUST confront the problems that must happen as the old society is shredded to its very foundations.

A major problem is the combinatorial explosion, which will quickly overwhelm even the fastest and largest computers unless you hit on the correct form of invariance in the transformation.

Seamlessly, I believe, from the comprehension of simple objects and transformations into ever more complex transforms, and then to abstractions and understanding, there is a buildup to the development of human language and thought. With our understanding will come the ability to construct truly humanoid robots rather than toys.

I can still remember the beauty of Walt Disney’s Pinocchio, which my grandmother brought me to see as a treat. I was awed by Geppetto’s ability to breathe life into a piece of wood. I guess it stimulated a desire to emulate him. Probably I’ll die trying.

January 4, 2017 by kewac01

I have so far not spoken about the treatment of 3D surfaces. Even in the simple algebra of 2D, an expression which is moved WITHOUT change of size will have its algebraic expression change with little predictability. With size change it becomes a monstrosity. So, there has been virtually no study of size change since Euclid spoke of similarity, that I know about. Since our main coordinate systems are located in each eye in our head, which moves around every time we move, it becomes virtually impossible to recognize objects by machine, since you will have combinatorial explosions all over the place if trial and error is used. Once you have size invariance under movement transformations, the problems are drastically simplified, as they are now for lines in 2D and 3D.

Without spending time on it: we use well-understood systems of stereopsis to convert the 2D images in each eye into 3D. The best programming code uses the least amount of space with the greatest accuracy.
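The stereopsis step mentioned here rests on the standard pinhole-stereo relation Z = f·B/d, a textbook formula rather than anything specific to this thread. The focal length, baseline, and disparity values below are made up for illustration:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic pinhole-stereo triangulation: a point's depth Z is the
    focal length times the camera (or eye) baseline, divided by the
    disparity -- the horizontal shift of the point between the two
    images. Farther points shift less, so small disparity = large depth."""
    return focal_px * baseline_m / disparity_px

# Illustrative numbers: 700 px focal length, 6 cm baseline.
z_near = depth_from_disparity(700.0, 0.06, 42.0)  # large shift, close point
z_far = depth_from_disparity(700.0, 0.06, 21.0)   # half the shift, twice as far
```

This is the simplest possible model; real stereo systems must first solve the matching problem of deciding which pixel in one image corresponds to which pixel in the other.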

I have been using the discrete calculus, where there is no infinite divisibility and all results end in numbers rather than algebra. Differential Geometry (DG) has some excellent potential theory for use in a robot, but we have to convert this theory to numbers and not end with algebra, ideally at every step. Where the algebra is written in the geometry, it has been carefully vetted by mathematicians and is very usable if translated as much as possible to numbers.

Surfaces in DG are divided up into very small contiguous surfaces called patches (Monge patches). These patches are so small that only one type out of four possibilities is in each one (ideally). The whole surface has to be expressed as points which are surrounded by patches, with no overlap but with no spaces uncovered by a patch.

As I understand it (and I may be wrong, having been totally self-taught in DG), geodesics can neatly divide most surfaces nicely into patches. The determination of geodesics involves very abstruse algebra, which of course is not well explained anywhere. It does involve taking a plane tangent to our point (on the surface) and projections on that plane. Suffice to say somebody can program it, hopefully simply. Assuming we form our patches from geodesics, then the rest is simple.

Now take a second plane perpendicular to the tangent plane through the point, and rotate it. As we do so, we are cutting out lines on the surface of the patch. Each of these lines has a curvature. If the curvatures vary as we rotate the plane, then there is a maximum curvature and a minimum curvature. Sometimes all the curvatures are equal (a sphere); others may be zero.

The Gaussian curvature for a surface is the max curvature times the min curvature. There are simple plug in formulas using the curvatures of geodesics to find the Gaussian and Mean curvatures.

The Mean curvature of a surface is the sum of both curvatures divided by two; it is the average of the max and min curvatures.

I stated the theorem previously that if the size is multiplied by L, then the curvature is multiplied by 1/L. I proved it on my own, since there was nothing about size in DG theory.

If it holds on surfaces, then the Gaussian curvature is multiplied by 1/L² and the Mean curvature by 1/L. Since the Gaussian curvature can be zero, whereas the Mean curvature will be zero if all curvatures on the surface are zero, the size-invariant curvature for surfaces is best stated as:

Size curvature = Gaussian curvature / (Mean curvature)²

With this simple formula that achieves size invariance we should easily be able to detect other cars or people crossing the street or walking their dogs on a leash etc.
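The scale invariance argued for here is easy to check numerically. The sketch below is illustrative only; "Size curvature" is the commenter's proposed ratio, not established terminology.

```python
def size_invariant_curvature(k_max, k_min):
    """The commenter's proposed ratio: Gaussian curvature (k_max * k_min)
    divided by the square of the Mean curvature ((k_max + k_min) / 2).
    Undefined wherever the Mean curvature is zero (e.g. flat patches)."""
    gaussian = k_max * k_min
    mean = 0.5 * (k_max + k_min)
    return gaussian / mean ** 2

# Scaling a surface by L multiplies every principal curvature by 1/L,
# so the ratio should come out the same at every size:
spheres = [size_invariant_curvature(1.0 / L, 1.0 / L) for L in (0.5, 2.0, 10.0)]
cylinders = [size_invariant_curvature(1.0 / L, 0.0) for L in (0.5, 2.0, 10.0)]
```

Every sphere gives the same ratio regardless of radius, and every cylinder gives the same (different) ratio regardless of radius, which is exactly the size invariance claimed. (Minimal surfaces, where k_max = -k_min, have zero Mean curvature and fall outside the formula.)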

Likewise, most factory work can, with tweaks, be automated.

Since the torsion parallels the curvature, especially under size change, you can have a Gaussian torsion and a Mean torsion, with a Size torsion as well. In the case where we have straight lines or planes in our patch, we will be dividing by zero, which is impossible.

Although I am not knowledgeable about Riemann manifolds in 4 or more dimensions, I presume analogies to the Size curvature and torsion do exist. Possibly there will be many other analog curvatures (anacurvatures) in higher dimensions?

January 10, 2017 by kewac01

For some reason whatever I write is not going in as a comment and I am losing the results?

January 10, 2017 by kewac01

I will try again.

Artists have often led in the study of pattern recognition as they evolved different techniques. The first realistic images of human faces were produced by the Old Masters. Pointillists produced the first tri-color pixels to paint different colors; this process is now used in TV cameras. Children’s coloring books often contain linear images of people and objects devoid of shading and color.

The first attempts at AI delineation involved taking photographs and delineating color fronts. Because of shading this often resulted in useless lines. If we have color photographs, this can be easily rectified by using only solid colors, eliminating shading. The fronts should then produce most lines properly. If not, then with AI training the machine should do well. In the case of black-and-white photos it is much more difficult; however, using differing shades of gray to determine where fronts exist may be helpful. The key is to get rid of shading due to lighting from the sun or artificial lamps, possibly by using AI training, and then properly delineate.

January 10, 2017 by kewac01

Continuing in trying to say my piece.

In programming computers to recognize a face from millions of choices, they almost certainly had to use aeration (the spatial measurement of parts of the face within the face: nose, mouth, eyes, hair, teeth, cheeks, etc.). They also probably took size into account in some way or other. The concept and its implementation in programming were not carried over generally to other programs. It probably stands in isolation, in part because of the extreme difficulty for anyone but an expert to figure out from the programming what is going on, what is to be achieved, and how. Since most programs are copyrighted or patented, this stifles other innovation. It can be argued that copyrights and patents drastically obstruct the improvement of programs for machines.

I believe the programming of humanoid robots should be open source, and development should be accomplished using methods similar to Wikipedia’s, for use by ALL mankind.

January 10, 2017 by kewac01

We have the capacity to recognize analogous shapes even though they are distorted. Indeed it is at the heart of our capacity to understand words with their generality, to recognize what an object is although we never saw it before.

Take some simple 2D object like a square. Place its sides so they are parallel to one of the axes. Now multiply one axis by 2X, or some other multiple; then the square will become a rectangle. This is an analogous shape to a square (an “annate”). (What its shape is I am unsure.) If instead of multiplying an axis we linearly increase the size between numbers on the axis, we also get another type of annate. We can use sinusoidal shapes as the determination for our number positions on the axis, creating still other annates. We can use irregular distances between numbers to create further distortion. If we started with a circle rather than a square we would obtain ellipses, etc. We can do the same with distortions in 3D on each of the axes. Take a cube or sphere to start with and then distort.
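The first distortion described, multiplying one axis of a square by 2, is an ordinary anisotropic scaling and can be written as a matrix transform (an illustrative sketch of that one step only):

```python
import numpy as np

# Unit square with sides parallel to the axes, as described above.
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

# Multiply the x-axis by 2: a diagonal scaling matrix.
stretch = np.diag([2.0, 1.0])

# Applying it to every corner turns the square into a 2-by-1 rectangle.
rectangle = square @ stretch
```

The other distortions mentioned (nonlinear or sinusoidal respacings of the axis coordinates) are not linear maps, so they cannot be expressed as a single matrix, which is one reason they are harder to analyze.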

In 3D, if we study a few cars to learn their general features, then we can recognize almost any car type we have never seen before.

January 10, 2017 by kewac01

In the above post there are two parentheses together. Part of a sentence was left out due to the problems which I mentioned. The sentence should be: If we rotate the square so its sides are not parallel to the axes and multiply an axis by 2X, then we get an annate (what its shape is I am unsure).

I also left out, in the chaos, a statement about style and its importance in art and pattern recognition.

We often can tell an artist’s work simply by its style. Many are world famous: Van Gogh, Gauguin, Picasso, etc. If a machine were programmed to discover the essence of a particular style, hypothetically we could use this to convert any picture or movie to that style.

In the recognition of some shapeless objects it is their style which we recognize. For example, trees of a particular species without leaves: the branching and the sizes of the branches are unique, and no two trees are alike, but we still can classify them.

Clouds, cracks on the sidewalk, chewing gum that has been used, rocks, stones, pebbles, water waves, sand, dirt: the list is endless. Comparing shape is not enough, since no two examples are alike in shape. The style is crucial to recognition.

January 10, 2017 by kewac01

What is a 3D object, as presented to us by our subjective sense of sight? Remember that a great deal of the analysis is hard-wired into our subconscious, and we are presented with a fait accompli, a finished product whose construction we have to guess at.

Let’s go back to the 3D surface, which we first translated on a plane and then rotated at each point; we were able to recognize the transformations by using angles that remain invariant under the transformations, and line-length approximations which are nullified using careful selection of adjacent lengths so that whole and part are preserved in recognition.

We can rotate the 3D object in any direction, but as we do so new parts of the object appear and some old parts disappear from view. If we have “learned” the surface from 3D image parts which are connected in our subconscious as a 4D enclosed surface where all sides are “known,” then we can recognize the total object from only the part of it that is seen at any one time. On the 4D object in our subconscious, all possible viewpoints, which are presented as 3D subsets, become invariant on the 4D object. Aeration of the smaller parts of the object in relation to other nearby parts is measured with size-invariant constancy, so that invariance is understood under most transformations. A 4D object is produced in much the same way that pictures of planets are sewn together from many strips into a whole picture.

If you watch a child learning a toy, it turns it around so that it sees all sides of the object, which are (if my theory is correct) sewn together into a 4D object through which it “understands” all 3D surfaces (with 3D aeration of parts nullified for size) on each 3D surface of the 4D surface. Annate distortion is also hard-wired in and nullified for size. That’s what it means to “understand” an object, according to my theory.

January 13, 2017 by kewac01

The 4D object I’m hypothesizing is different from the usual mathematical concept of the 4th dimension. 4D is normally understood as an algebraic expression with 4 variables; ditto for higher dimensions, where new variables are added for each added dimension. What I am using is geometric, basically Analytic Geometry with numbered points. Geometrically, the 4th dimension cannot actually be constructed as the other three dimensions are, since the hard wiring of our brains does not allow us to picture that dimension as really perpendicular to the other three. I suppose time is the 4th dimension, since we have to turn the 4D object to see the hidden side, which involves motion, and time. You might use clock time as the t dimension for a Cartesian point in 4D, but clock time is not numbered like linear dimensions. Also, what is the t of the hidden side when you are looking at the other side? I haven’t worked that out to my satisfaction, but it opens up questions. It does suggest that time is different from the distance dimensions. It can only be viewed through motion, never alone and isolated, as each distance dimension can be isolated into an axis.

We always view solids as 3D surfaces from a specific viewpoint that can be changed with movement. A good example of a 4D object is the Moon. We see only one side, but even before the hidden side was seen by satellite, most people assumed it was a rough sphere, which was confirmed by satellites. We can combine all the different viewpoints around the Moon to form the 4D rough sphere, with pockmarks, etc. Because in most objects (although not the sphere) the outline of the 3D image can be very variable for each position of rotation, the outline is only useful in determining where the viewpoint (eyes) is located relative to the object observed. Recognition of the object is determined from the positions of the distinguishable parts of the object, which are related by aeration (which is the key to detection). Once we have “sewed” together the various views of the object in 3D such that overlaps are only shown once, then with the 4D composite recognized we “know” the object, and can pick it out from crowds of different objects. The 4D object is an imaginary construct in our brains that we know but NEVER see.

The same is true for solid objects. We only see surfaces, but through cuts in the surface we infer the nature of solids, which we can only see in parts.

January 13, 2017 by kewac01

One transform we have barely touched on is 3D objects projected onto a 2D surface (usually, but not always, a plane). In Differential Geometry there are some simple formula conversions using vectors which can easily project the 3D object onto a 2D plane (or another surface).

The reverse concept, taking a 2D image of an object (on one plane of the eye, without using stereopsis) and converting it back to a 3D object, is extremely difficult. Although I cannot do the math, possibly the inverse of the 3D-object-to-2D-image transformation will allow a 2D image to be transformed into a 3D object. This would be analogous to taking the derivative of a function and then taking the integral of the derivative function to get a close approximation of the original function, with only some added constants.
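The forward direction, projecting 3D onto a 2D plane, is easy to sketch with the standard pinhole model (a textbook construction, not the commenter's math). The fact that whole rays of 3D points collapse to a single image point is exactly why the reverse problem is so hard:

```python
def project_to_plane(point3d, focal=1.0):
    """Pinhole projection of a 3D point onto the image plane z = focal.
    (`focal` is an illustrative parameter, not from the comment.)"""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

# Two different 3D points on the same ray through the eye land on the
# SAME image point, so a single 2D image cannot determine depth:
near = project_to_plane((2.0, 4.0, 2.0))
far = project_to_plane((4.0, 8.0, 4.0))
```

Because projection maps many 3D points to one 2D point, no exact inverse exists; recovering 3D from one image always needs extra assumptions (or a second view, as with stereopsis).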

The problem which bothered me for a long time is that under these spatial transformations the usual measures such as angles and lengths change unpredictably and it is hard to see any unchangeable factors under these changes.

From simple observation, especially with some black-and-white photos where there is confusion between lighting, shade, and the object, it can take a few minutes to figure out what the object is. I believe the mind subconsciously tries a number of possibilities until it connects the 2D image with some 4D objects, and then all other objects snap into place.

If my guess about the inverse projection works, then it can be argued that many higher abstractions which we use and understand may at times not use transformation constancy in their usage.

The style objects I discussed previously, are one possible example, as is the 2D to 3D conversion which we just discussed.

January 14, 2017 by kewac01

Motion of an object, whether it is the actual 3D surface or a 2D projection as occurs in a movie, in fact requires the 4D “understanding” in many if not most cases, since an object or its projection on a plane screen can involve rotation of the object in any way. For example, as a human is dancing he/she can turn, which requires 4D understanding to know it is part of the original starting object. If the 3D person is photographed (projected onto a 2D screen) as there is movement, the changes can only be understood from the 4D concept. How, from the 4D concept, the 2D projection is easily recognized, probably without a combinatorial explosion, is a question I cannot now answer. However, some technique has to be used which allows us to pretty easily recognize the projection as part of the 4D concept. How else could we so easily recognize famous personalities?

When limbs are moved in whatever manner we easily know the motion that has taken place in 2D or 3D.

Motion in particular has been studied in the making of cartoons. However, it involves a lot of artistic feeling and measurements which are hard to completely automate, since many of these drawings are in part subconsciously constructed by the artist. A machine has to be programmed with all the steps known to produce these images in motion.

Cartoons such as Mickey Mouse or Donald Duck etc. often involve imaginary combinations of animal and human features, both of which are recognizable by children. They never occur in real life, but are a form of abstraction. Mickey’s hands and arms and feet are not mouse paws, and his face is not that of a mouse. Children in particular easily identify with these creations and do not have to learn anything to understand them.

If the motion theory of objects were much more completely known, then with the accuracy of machine measurements and their potential speed, extraordinary accuracy of motion at much greater than human speeds becomes theoretically possible.

January 2, 2017 by un11imig

Apple FIRST PAPER… lol.

December 31, 2016 by almostvoid

At uni I was lucky to have done a semester of History & Philosophy of Science, and our esteemed professor warned and advised us that when so-called experts talk in a convoluted, redefined verbal morass pretending to intelligent content so obviously missing in the context, whilst overvalued in the abstract, revealing a paucity of real-time information of any worth apart from the known obvious, then I am glad of his sage advice. The verbal antics of this article have proven my late great professor’s warning. Apple is trying to be smart and intelligent and mixing it with the best, but what have they really come up with? A lot of digital dollops.

December 30, 2016 by polemus

Interesting, but do you know whether the authors will try submitting this work to a peer-reviewed scientific journal?

December 30, 2016 by Editor

I’ve requested that information from the senior author and will post.

December 30, 2016 by DevilDocNowCiv

Having read for decades (just once in a while, mind; most of my reading is simply sci-fi novels) that “peer review is broken,” my reaction to this was first to google that phrase and see what I get. I got plenty, and given this submission to an open-access e-journal with established respect, it may certainly be better for all that this journal was used. Open, and thus open to analysis and criticism from many, quickly.

December 31, 2016 by polemus

Well, I won’t disagree with you, but your comment on peer review would be true if Apple already had a large collection of published articles. I am aware of the limits of peer review, having published and reviewed articles in different health-related/neuroscience journals. In the absence of a better, more accurate system, it is the best way to determine the relevance of a work within the context of scientific knowledge. I am afraid that a “literature” that was not criticized by experts would only rely on authors’ notoriety. And it appears this is the case here: the “noise” about Apple’s “paper” is, at least partly, due to the origin of the paper and not about the content. Am I wrong?

December 31, 2016 by Editor

Papers are often posted to *arXiv* prior to peer-reviewed publication in these areas: Physics, Mathematics, Computer Science, Quantitative Biology, and Quantitative Finance and Statistics.

December 31, 2016 by almostvoid

Bravo, spot on.

December 29, 2016 by kewac01

There are certain capacities which humans have for recognizing even solid objects that do not change their internal shape or coloration. These transformations start out very simply and work up to become very complex geometrically.

The simplest change involves movement in 2D without rotation at any point. What the brain uses to produce an invariant concept are those features that are unchanged by this motion. The angles between adjacent vectors that are very close together are the most important one. Providing there is no size change, the lengths of simple lines in the outline, and internally, will also be invariant. If the object is enclosed, then there should be lines or points inside the outline; their positions relative to one another have to be measured, and the quirky differences (if they exist) in the outline must also be measured. I like to call this aeration, to distinguish it. If angles can be used to determine relative positions in aeration, then it is likely to be much more invariant, particularly under size change, than linear measurements.

If we rotate the object about any point in the plane, then these invariants will be sufficient to clearly identify the visual object, assuming the object is enclosed by an outline (although 2D shapes do not have to be so enclosed). A 3D object moved on a 2D screen will usually be invariant only if it has an outline and is enclosed.

Invariance is the key to object recognition. It prevents the combinatorial explosion which otherwise would prevent even very powerful computers from being able to recognize the object. Invariance is very important, although with different variations and central inputs, for ALL the senses. In the case of sound, spoken words are (subject only to different accents) invariant, independent of the voice used and its amplitude within very wide ranges. In the cases of smell and taste, there are certain basic smells and tastes; by measuring the amounts of each of them, a pattern is formed which defines the substance. In the case of touch, pressure, pain, hot and cold are the key parameters. In a way, the human brain uses, by analogy, the concept of invariance (different for each sense) to enable recognition of analogous objects.

The next step in determining size transformations is size change on a 2D screen (we build up the crescendo). Size-change invariance is really problematic. Especially for curved, irregular lines, it is really impossible to get the exact points that correspond to a smaller or larger version of the shape. There is one form of geometry and the calculus which offers some help. Differential Geometry attempts crudely to find invariances in lines and 3D algebraic shapes. It does not deal with size change at all, and its study of 3D shapes is particularly crude and poor. Algebra in the study of shapes causes more problems than it can ever solve. With algebra, the enormously complex objects our eyes work with would be impossible to express and work with in any way. However, the curvature (Greek κ) and torsion (Greek τ) are a little useful. We have to set up ratios which eliminate the size-change distortion.

I will continue in my next post.

December 30, 2016 by smb12321

My question: Wouldn’t you think those who work with this subject would have already considered these ideas? What kind of project would it be if experts missed something so obvious?

December 30, 2016 by DevilDocNowCiv

smb,

Yes, that stuff would be the spitballing conversation before they dug in and started working with existing programs to refine this idea. The article was far more general, much as Ray describes our frontal-lobe brain-mind module as identical to a module of neurons in the cortex, but handling more general and sophisticated concepts.

December 30, 2016 by kewac01

What I said, especially at the beginning of my post, is deceptively simple. Actually, when you think about it, what makes something seem simple is almost impossible to clearly define. We know it when we see it, but we can't say why. The measurements, whatever they may be, are I would think subconscious (almost totally, except for the finished end judgment), hidden from us. Occam's razor is a good example of the use of simplicity to determine the best theory. Why? Who knows, but it does work. Mathematical elegance is also an example of simplicity of ideas, often with powerful and not obvious results (at least before one is shown the way).

In Euclidean geometry, two shapes are similar if they have the same corresponding angles and proportional corresponding lines. The point is that under size change, ONLY the angles are totally unchanged. The proportionality of the straight lines is more problematic. If you choose one line as the standard and divide all other lines by its length, then you have a problem. We and many higher animals have the capacity to detect whole and part. If your standard line is not part of your image part, then you can't see the similarity; the numbers will be different in each object. (We recognize a horse's head as coming from a horse because the numbers we derive are the same whether the head is attached or not. It depends on each small section of the head, and not on the tail or body, etc.) With lines in 2D objects, we can, with three adjacent lines, obtain one measure which is invariant under size change. Take a central line length with its two ends attached to left and right lines. Use the central line as the divisor for each line (right and left), and the numbers obtained will be invariant under size change and whole and part. Simple, isn't it?
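This three-adjacent-lines measure can be sketched in a few lines of Python. This is a minimal illustration of my own (the polyline and function names are not from the comment): for each run of three consecutive segments, divide the left and right segment lengths by the central one; the resulting numbers do not change when the whole shape is scaled.

```python
import math

def segment_lengths(points):
    """Lengths of consecutive segments of a polyline of (x, y) points."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def local_ratios(points):
    """For each run of three adjacent segments, divide the left and
    right segment lengths by the central one. The ratios depend only
    on local shape, not on overall size."""
    lengths = segment_lengths(points)
    return [(left / mid, right / mid)
            for left, mid, right in zip(lengths, lengths[1:], lengths[2:])]

shape = [(0, 0), (2, 1), (3, 3), (5, 4), (6, 6)]
tripled = [(3 * x, 3 * y) for x, y in shape]

print(local_ratios(shape))
print(local_ratios(tripled))  # same ratios (up to rounding): invariant under size change
```

Because each ratio involves only three neighboring segments, the measure is also local, which is what lets it respect "whole and part": the numbers computed on a detached piece are the same as on that piece within the full shape.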

I mentioned differential geometry and the curvature and torsion in my statement. In this geometry, they are invariant for all linear continuous curves in 2D and 3D. To understand the geometric definition of the curvature and torsion (κ and τ) you have to think of an xyz axis moving along the line. However, they use the tangent, normal and binormal vectors to determine the size and direction of the trihedron from point to point. There is a tiny distance between each adjacent point. The angle between the two adjacent tangents (x axis) divided by the distance between the points is the curvature. The angle between the binormals (z axis) divided by the same distance is the torsion.

It can be shown that if a line is scaled by some constant L, then both the curvature and torsion are multiplied by 1/L. This is general for all lines. The torsion is greater than zero ONLY if a line curves in 3D. Since the multiplier is the same for both, if we form a ratio we get a constant under size change. However, in 2D the torsion is zero, so to avoid infinities we take the τ/κ ratio. It will be invariant under size change for ALL lines, but it canNOT distinguish between 2D curves. What ratio will? Tune in next time.
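The 1/L scaling of the curvature can be checked numerically using the discrete, geometric definition above (turning angle between adjacent tangents divided by the distance between points). This is a sketch of my own, staying in 2D where the torsion vanishes; the polyline is arbitrary.

```python
import math

def discrete_curvature(points):
    """Discrete curvature of a 2D polyline: the angle between adjacent
    tangent directions divided by the distance between the points."""
    ks = []
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])  # tangent angle, first segment
        a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])  # tangent angle, second segment
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # wrapped turning angle
        ks.append(turn / math.dist(p1, p2))
    return ks

shape = [(0, 0), (1, 0.1), (2, 0.4), (3, 1.0), (4, 2.0)]
doubled = [(2 * x, 2 * y) for x, y in shape]

k1 = discrete_curvature(shape)
k2 = discrete_curvature(doubled)
# each curvature of the doubled shape is half the original (multiplied by 1/L with L = 2):
# the turning angles are unchanged by scaling, while the distances double
```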

December 30, 2016 by ivaray

Thanks for this insightful post.

January 1, 2017 by kewac01

As I tried to indicate before, algebraic expression of 2D and 3D functions is a very poor way to easily, simply and concisely express any object abstractly. While proofs using general functions can have very broad scope, they do not represent the way the eye works, or the way the natural world is built according to quantum theory. In both cases there is a point of division beyond which you cannot get any smaller, whereas the calculus assumes that most functions are infinitely divisible. Take the Earth as a model object. If it were a perfect sphere, then an algebraic expression for it would be very concise. But it is an oblate spheroid with all sorts of irregular undulations on its surface, from mountains and plains, and rivers and oceans with waves, etc. To express the Earth algebraically with even some accuracy, we would probably have to use Fourier series, which are usually infinite sums and are very messy to work with, even on computers. It is much easier to simply use a collection of discrete points to represent any object fairly. Instead of the algebra of the calculus, it is much more realistic to use number representations and average rates of change, where the step size never goes to zero. In this way most curves and surfaces become "differentiable". I used the geometric definitions of the curvature and torsion because they are much clearer and more understandable than their algebraic expressions.
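The "average rate of change on discrete points" idea can be shown in a couple of lines of Python (the sample data here are my own invention): the curve is just a list of points, and the "derivative" is a finite difference over a step that never shrinks to zero.

```python
def average_rates(points):
    """Average rate of change between consecutive (x, y) samples —
    a discrete stand-in for the derivative when a curve is stored as
    points rather than as an algebraic formula."""
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

# an irregular "terrain profile" that has no concise algebraic form
profile = [(0, 0.0), (1, 2.0), (2, 1.5), (3, 4.0), (4, 3.0)]
print(average_rates(profile))  # [2.0, -0.5, 2.5, -1.0]
```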

One way to solve the problem of 2D curves not having a torsion greater than zero, and still nullify size change in 2D, is to use only the curvature, which is a form of derivative (and which can be approximated using arithmetic expressions, especially geometrically). There is a general theorem in the calculus which states that the derivative of a constant times a function is the constant times the derivative. From this it follows that under size change the derivative of the curvature picks up the same constant multiplier as the curvature itself — provided we differentiate with respect to a parameter that does not itself scale, such as the point index (if we differentiate with respect to arc length, which also scales, the invariant ratio becomes κ′/κ² instead). If we divide the derivative of the curvature by the curvature at each point of the curve, it will then be invariant in 2D: the constant size multiplier cancels out.
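Here is a numerical check of that ratio, again as a sketch of my own with the same caveat: the change in curvature is taken from point to point (a parameter that does not scale), so the size multiplier cancels exactly.

```python
import math

def discrete_curvature(points):
    """Discrete 2D curvature: turning angle at each interior point
    divided by the distance to the next point."""
    ks = []
    for p0, p1, p2 in zip(points, points[1:], points[2:]):
        a1 = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
        a2 = math.atan2(p2[1] - p1[1], p2[0] - p1[0])
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi
        ks.append(turn / math.dist(p1, p2))
    return ks

def curvature_signature(points):
    """Point-to-point change in curvature divided by curvature. Under a
    size change both pick up the same 1/L factor, so the ratio is
    scale-invariant."""
    ks = discrete_curvature(points)
    return [(k2 - k1) / k1 for k1, k2 in zip(ks, ks[1:])]

bend = [(0, 0), (1, 0.1), (2, 0.4), (3, 1.0), (4, 2.0), (5, 3.5)]
scaled = [(4 * x, 4 * y) for x, y in bend]
# curvature_signature(bend) and curvature_signature(scaled) agree
```

Note the bending polyline is chosen so every interior curvature is nonzero; at a straight stretch (κ = 0) the ratio would blow up and some other measure would be needed.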

December 28, 2016 by neilvanrijn

Sounds kind of Freudian to me, like ID, EGO, and SUPEREGO. OMG! Freud was an AI engineer!