Friday, August 11, 2023

Artificial Neural Networks

It was 1990, my second year in Sydney. I was working for a fintech startup buzzing in a small, stylish office on Little Buckingham Street, Surry Hills. Little Buckingham was a charming, leafy street of cottages and warehouses, occupied by fashion ateliers, art galleries and the like. 

The Internet had not yet reached the public, mobile phones looked like bricks, and cheap Taiwanese-made IBM personal computer clones running the DOS operating system were emerging.

Windows 2.0 had been released three years earlier, in 1987, and Lotus 1-2-3 was the most amazing spreadsheet application we had ever known. 

Our company had a joint venture with the PC manufacturer Olivetti; we were using several Olivetti machines in the office. Olivetti had a modern office on William Street, where we carried out a demo integration of our financial hardware with a banking application. 

OCR (Optical Character Recognition) technology had been available since the 1960s, and I was integrating OCR cheque readers into our products. These motorised machines captured cheques and read the text printed on them. However, they were incapable of reading handwriting. 

At lunch break I used to walk down to Broadway through a long pedestrian underpass under Central Station. At the time Broadway was bustling with a cosmopolitan crowd. Long past its heyday, it had Sydney University and UTS students, bookstores, Chinese takeout shops, sex shops, antique shops selling world war memorabilia (helmets, medals and bayonets), and shops selling tents, hunting equipment, knives and guns. A bookstore called the Coop sold discounted books for uni students. 

On a sunny afternoon I was strolling inside the Coop bookshop, looking for an interesting read, when I picked up a hardcover on neural networks. 

I was curious whether computers could recognise handwriting using Artificial Neural Networks (ANNs). 

ANNs imitate the biological neural networks of the human brain. The book had a variety of examples on the subject. 

Initially ANNs were built on Digital Signal Processing (DSP) hardware. Later on, Graphics Processing Unit (GPU) hardware took over. 

This was my first glimpse into the world of AI, though I did not yet know that ANNs were the foundation of AI. 

You could develop, I thought, an ANN computational model that would accurately recognise handwritten characters. It should be able to differentiate similar characters, say ‘1’ from ‘7’, handling the nuances of handwriting. This way you could feed a handwritten number on a bank cheque to ANN hardware and resolve its value. 
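
To make the idea concrete, here is a toy sketch in R, the language I would later use for ML work. Each character is a grid of pixels flattened into a numeric vector, and a single artificial neuron scores it. The 5x5 ‘bitmaps’ below are invented purely for illustration.

    # Toy 5x5 bitmaps of '1' and '7', flattened row by row (invented for illustration)
    one <- c(0,0,1,0,0,
             0,1,1,0,0,
             0,0,1,0,0,
             0,0,1,0,0,
             0,1,1,1,0)
    seven <- c(1,1,1,1,1,
               0,0,0,1,0,
               0,0,1,0,0,
               0,1,0,0,0,
               0,1,0,0,0)

    # A single artificial neuron: a weighted sum of the inputs plus a bias,
    # squashed by a sigmoid into a score between 0 and 1
    neuron <- function(x, w, b) 1 / (1 + exp(-(sum(w * x) + b)))

    # With random (untrained) weights the scores are meaningless;
    # training is what makes the neuron tell '1' from '7'
    set.seed(42)
    w <- runif(25, min = -1, max = 1)
    b <- 0
    neuron(one, w, b)
    neuron(seven, w, b)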

Suddenly it occurred to me: an ANN on a chip is no different from its biological counterpart in the human brain. 

Just as you teach a child how to write, you need to teach an ANN how to read handwriting. This is called “supervised learning”. 

Teaching ANN hardware involves scanning the handwriting and feeding the information to an ANN processor. The ANN outputs what it recognises. A human agent verifies (supervises) whether the ANN’s recognition is correct or not, and the verdict is fed back to the ANN. The feedback makes the ANN re-adjust its internal “weights” for the patterns it “saw”. The cycle continues until the ANN gets sufficiently good at recognising all characters, in all varieties of handwriting. 
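
Continuing the toy sketch above, that supervise-and-adjust cycle can be written as a training loop: show the neuron a character, compare its output with the human-verified label, and nudge the weights in proportion to the error.

    # Training data: the two bitmaps, labelled by a human supervisor
    # (label 1 means "this is a '1'", label 0 means "this is a '7'")
    inputs <- list(one, seven)
    labels <- c(1, 0)

    lr <- 0.5  # learning rate: how strongly each correction nudges the weights
    for (epoch in 1:100) {
      for (i in seq_along(inputs)) {
        out   <- neuron(inputs[[i]], w, b)  # what the ANN "recognises"
        error <- labels[i] - out            # the supervisor's feedback
        w <- w + lr * error * inputs[[i]]   # re-adjust the internal weights
        b <- b + lr * error
      }
    }

    neuron(one, w, b)    # now close to 1: recognised as a '1'
    neuron(seven, w, b)  # now close to 0: recognised as a '7'

A real system would have many layers, many neurons and thousands of labelled samples, but the cycle is the same.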

The more you expose your subjects (child or ANN) to diverse sets of handwriting, the better they become. 

ANNs, like humans, initially need supervision, but at some point they can keep learning unsupervised, working out not only the handwriting they were trained on but others’ too. This is called self-learning. 

Some handwriting may be illegible, with a digit or two impossible to work out. In those circumstances an ANN would be no better than a human: it will fail. 

An ANN could be vastly more accurate in its reading than a human, if it were trained hard with much larger and more diverse datasets. Nevertheless, it may still produce inaccurate results, no matter how hard it was trained and however small the chances are. On a bad day even Einstein would have failed. 

As I read the neural networks book I wanted to run experiments and verify my understanding, but I couldn’t go past thought experiments. Computers with GPUs did not exist then. 

Twenty years later, in 2010, in a different company, we kickstarted an anomaly detection project that would pinpoint fraudulent transactions recorded by our Payments product. I worked with a data scientist who developed supervised ML algorithms, and I implemented those algorithms in a statistical programming language called R.
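
None of that code is with me any more, but in spirit it looked something like the sketch below: a supervised model (here a plain logistic regression, standing in for the data scientist’s actual algorithms) is trained on historical transactions labelled fraud or normal, then used to score new ones. The features and the data are invented for illustration.

    # Synthetic transaction history with labels from past investigations
    # (the features and the fraud pattern are invented for illustration)
    set.seed(1)
    n <- 1000
    txns <- data.frame(
      amount = rexp(n, rate = 1 / 100),          # transaction amount
      hour   = sample(0:23, n, replace = TRUE)   # hour of the day
    )
    # Toy ground truth: large small-hours transactions are more often fraudulent
    p_fraud <- plogis(-6 + 0.02 * txns$amount + 1.5 * (txns$hour < 5))
    txns$fraud <- rbinom(n, 1, p_fraud)

    # Supervised learning: fit the model on the labelled history
    model <- glm(fraud ~ amount + I(hour < 5), data = txns, family = binomial)

    # Score transactions and flag anything above a chosen threshold
    txns$score   <- predict(model, type = "response")
    txns$flagged <- txns$score > 0.5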

The problem of inaccuracy remained. An AI model may fail in peculiar ways no matter how hard it is trained and fine-tuned. Sometimes we had false positives (a normal transaction was flagged as an anomaly), and sometimes false negatives (a fraudulent transaction was treated as normal). With fine-tuning, the accuracy rate could be improved, but it could never reach 100%. 
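
The trade-off is easy to see with a confusion matrix. Carrying on with the sketch above:

    # Confusion matrix: the model's flags against the true labels
    table(flagged = txns$flagged, actual = txns$fraud)

    # False positives: normal transactions flagged as anomalies
    fp <- sum(txns$flagged & txns$fraud == 0)
    # False negatives: fraudulent transactions treated as normal
    fn <- sum(!txns$flagged & txns$fraud == 1)
    # Accuracy: high, but never quite 100%
    accuracy <- mean(txns$flagged == (txns$fraud == 1))
    c(fp = fp, fn = fn, accuracy = accuracy)

Lowering the 0.5 threshold catches more fraud at the price of more false positives; raising it does the opposite. Tuning moves the errors around more easily than it removes them.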

ChatGPT is a chatbot, based on an AI-powered language model developed by OpenAI, capable of generating human-like text based on context and past conversations [1].

Agent 007 battles an eccentric scientist, Dr. No, who is determined to ruin the US space programme.

On a beautiful day in the mid-1960s, my older brother decided, for no reason at all, to elevate me from useless little brat to more respectable brother status. We went to a stylish movie theatre on the main boulevard and watched Dr. No. 

I was mesmerised by Dr. No. I wondered how Dr. No, the coolest villain ever, could be utterly ruthless and in control at the same time. In the end Dr. No’s world domination plans were spoiled by Agent 007; Dr. No attempted to stop him, but fell into the reactor pool and boiled to death.

In Asimov’s Foundation and Empire (1952) there is a villain called the Mule, my all-time favourite villain. This guy could telepathically read and manipulate anybody’s mind in the entire galaxy and make them scared of him. This way he could take over planets at a rapid pace. Next to him, Dr. No looked like a jester. 

It may be possible for an AI bot to behave like a super-powerful mind-reading bully. Welcome, MuleGPT. 

When OpenAI released the large language model GPT-4 in March 2023, it was good at identifying prime numbers. When the AI was given a series of 500 prime numbers and asked whether they were primes, it correctly labeled them 97.6 percent of the time. But a few months later, in June, the same test yielded very different results: GPT-4 correctly labeled only 2.4 percent of the prime numbers it was prompted with, a complete reversal in apparent accuracy [2].
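
The test protocol itself is simple to picture. Here is a sketch in R, where ask_model() is a hypothetical stand-in for a call to the GPT-4 API (stubbed here to always answer “yes”):

    # Hypothetical stand-in for querying the language model. A real version
    # would send "Is this number prime? Answer yes or no." to the GPT-4 API
    # and parse the reply; this stub just always answers "yes".
    ask_model <- function(n) TRUE

    # Ground truth by trial division (fine for small numbers)
    is_prime <- function(n) {
      if (n < 2) return(FALSE)
      if (n < 4) return(TRUE)           # 2 and 3 are prime
      all(n %% 2:floor(sqrt(n)) != 0)
    }

    # Accuracy over a batch of known primes, as in the study (it used 500)
    primes  <- c(2, 3, 5, 7, 11, 13)
    answers <- vapply(primes, ask_model, logical(1))
    mean(answers == vapply(primes, is_prime, logical(1)))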

AI enthusiasts speculated that the OpenAI developers had been trying to make the tool less prone to offering answers that might be deemed offensive or dangerous. Such changes require fine-tuning, i.e. further training the engine on a different set of datasets. 

Fine-tuning could have induced side effects in prime number detection, akin to random mutations causing undesirable effects in biology. 

As a consequence of a mutation, a gene may produce an altered protein, no protein at all, or the usual protein. Most mutations are not harmful, but some can be: a harmful mutation can result in a genetic disorder or even cancer.

Hence, it is possible that MuleGPT, an AI-powered language model chatbot, may unexpectedly change its behaviour and become dumber as a result of fine-tuning. 

If we are lucky, MuleGPT may decide to leave the planets it conquered and jump into a black hole for a swim. 
